Computer Vision News

DreamBlend

Shwetha found the solution once she worked out that a catastrophic attention collapse was causing the problem, and she had the idea that using this guidance would help. The image editing community has published quite a few works that use similar cross-attention manipulation for different image editing techniques, so she took some inspiration from there.

We are curious to ask Shwetha whether this solution is specific to this work or whether it also opens new directions for her or for others to follow. "Right now, this is focused on this particular challenge," Shwetha tells us, "but I wouldn't say that it is completely solved: there are still things that can be done to improve it, and there are of course similar tradeoffs in other fine-tuning problems."

DreamBlend merges the prompt fidelity and diversity of underfit checkpoints with the subject fidelity of overfit checkpoints during image generation. Early checkpoints have higher prompt fidelity and diversity but lower subject fidelity, while later checkpoints have higher subject fidelity but lower prompt fidelity and diversity. Prompt: a backpack* on a cobblestone street.