Weird behavior with StableDiffusionSafetyChecker

I set up the Dreambooth training notebook on my desktop, and the StableDiffusionSafetyChecker seems to have a weird definition of NSFW. I switched the sample prompt for the toy cat sample dataset from “a photo of sks toy riding a bicycle” to “a photo of sks toy flying a plane”, and it flagged one of the outputs as NSFW.

I’m not sure what it’s seeing, but I disabled the safety checker by making the following changes to this file. The above image is from after disabling it.

# safety_checker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
# image, has_nsfw_concept = self.safety_checker(
#     images=image, clip_input=safety_checker_input.pixel_values
# )

return StableDiffusionPipelineOutput(images=image, nsfw_content_detected=False)
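
If you'd rather not edit the library source, a less invasive option is to swap in a pass-through checker on the loaded pipeline object. This is a sketch, assuming a diffusers version where the pipeline invokes the checker as `self.safety_checker(images=..., clip_input=...)` and expects an `(images, has_nsfw_concepts)` return value:

```python
def disabled_safety_checker(images, clip_input):
    # Pass every image through unchanged and report nothing as NSFW,
    # matching the (images, has_nsfw_concepts) return shape the
    # pipeline expects from the real checker.
    return images, [False] * len(images)

# Usage sketch (model id is just an example):
# pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
# pipe.safety_checker = disabled_safety_checker
```

Newer diffusers releases also accept `safety_checker=None` in `from_pretrained`, which may be simpler still, though older versions warn or error on it.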

Here is the result using “a photo of sks toy riding a plane” instead.

Skating very close to the vector space of some adult industry content with those words.

The peril of large source datasets scraped from the internet is that they pair images with alt-text labels pulled from the HTML; the model will reflect the diversity of that, warts & all.

(And if you’ve seen some of the training set for the NSFW flagging, you may never want to look humanity in the eye again.)


That’s insightful, for varying degrees of “riding”.