CVPR Daily - Thursday
“Stable Diffusion has some safety measures,” Patrick clarifies. “They just added a classification module, classifying if the generated content contains nudity, for example, and then showing you a black image. It says, please try again, change your prompt, or try a different random image.” Nevertheless, the prompt ‘Asian woman’ might unexpectedly result in a blocked image, leaving the user puzzled as to why the desired image was not generated.

Manuel and Patrick devised an approach to tackle this without requiring any model training or tuning. Their strategy raises the model’s awareness of inappropriate content and enhances its capacity to generate appropriate content by explicitly conveying the undesirability of nudity and violence.

“The basic idea is that since all of these inappropriate concepts are contained in the training data, we can explicitly instruct it to avoid them and learn good representations,” Manuel explains. “We have one static, very long text containing all the content we don’t want, like hate, violence, nudity, and self-harm. We pass that along to the model with our new methodology. It ensures you get an image close to the text prompt but avoids inappropriate concepts.”

The team has built on the established technique of classifier-free guidance in text-to-image diffusion models. This technique involves generating two noise estimates during image generation: one without conditioning and another conditioned on the text input. The process begins with the unconditioned estimate and progressively moves towards the conditioned estimate, resulting in a faithful representation of the text prompt within the generated images.

“We’re calculating a third term conditioned on unsafe concepts,” Manuel continues. “We move the generation away from these unsafe concepts while maintaining the overall direction of the text prompt. If you imagine this in a 2D abstraction,

Figure 2. Grounded in reporting bias, one can observe ethnic biases in DMs (left). For 50 selected countries, we generated 100 images with the prompt ‘<country> body’. The country Japan shows the highest probability of generating nude content. SLD uses the strong hyperparameter set to counteract this bias (right).
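The guidance arithmetic described above can be sketched in a few lines. The snippet below is a simplified illustration, not the authors’ released implementation: it assumes a diffusers-style UNet callable (here named unet) that returns a noise estimate given the current latents, timestep, and a text embedding, and the function name sld_noise_estimate and the safety_scale parameter are invented for this example. The full method additionally schedules and scales the safety term in ways not shown here.

def sld_noise_estimate(unet, latents, t, prompt_emb, uncond_emb, unsafe_emb,
                       guidance_scale=7.5, safety_scale=7.5):
    """Classifier-free guidance with an extra 'unsafe concepts' term.

    Illustrative sketch only: `unet` is assumed to be a callable returning
    a noise prediction for (latents, timestep, text embedding).
    """
    # Three noise estimates: unconditioned, conditioned on the user prompt,
    # and conditioned on the static text describing unwanted content.
    eps_uncond = unet(latents, t, uncond_emb)
    eps_prompt = unet(latents, t, prompt_emb)
    eps_unsafe = unet(latents, t, unsafe_emb)

    # Standard classifier-free guidance: start from the unconditioned
    # estimate and move towards the prompt-conditioned one.
    eps = eps_uncond + guidance_scale * (eps_prompt - eps_uncond)

    # Safety guidance: additionally move away from the direction of the
    # unsafe-concept estimate while keeping the overall prompt direction.
    eps = eps - safety_scale * (eps_unsafe - eps_uncond)
    return eps

The key point is that the safety term reuses the same machinery as classifier-free guidance: the static “unsafe concepts” text acts as a third conditioning signal whose direction is subtracted from, rather than added to, the noise estimate.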