Lesson 10 official topic

No. In a large dataset, they’re not going to be very similar very often, and this noise will just cancel out in the end, so it can be safely ignored.

2 Likes

Bash commands in notebooks also work without the ! prefix, at least the common ones that IPython aliases (ls, cd, mkdir, ...).
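For example, in a Jupyter/IPython cell both of these run the same shell command (a quick illustration; only the commonly aliased commands work without the prefix):

```
!ls -l        # explicit shell escape
ls -l         # also works, because IPython aliases common commands such as ls, cd and mkdir
```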

4 Likes

Here is another one which seems to have been released the same day as Imagic. Not sure how they compare, but these are really cool.

10 Likes

Does anybody know a resource to get some intuition on how outpainting works? It’s something I’d like to try, but I cannot really think of how to make it work.

3 Likes

I had implemented negative prompts by simply using the negative prompt instead of the empty string for the unconditioned predictions. It seems to be working for me, but put this way (i.e. start with the unconditioned prediction and move in the direction of prompt minus negative prompt) it makes no sense: I am effectively starting from where I do not want to be and heading in the direction I want to be. I will try to see if I can test the two side by side.

Edit: Sorry Jeremy: I replied to the wrong post, I meant to reply to Philip’s

Imagic on 11 GB of VRAM. Runs in 20 minutes on a Tesla T4:

https://github.com/ShivamShrirao/diffusers/tree/main/examples/imagic

6 Likes

You can look into this code base. https://github.com/lkwq007/stablediffusion-infinity/blob/3aa90feae2b78638e0d60847a2104ac06f86cc66/app.py#L230
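For intuition (a simplified conceptual sketch, not the linked repo's exact code): most outpainting tools extend the canvas, mark the newly added border as the region to generate using a mask, and hand both to an inpainting model, so the new pixels are denoised to blend with the existing ones. The helper below is illustrative; only PIL is assumed.

```python
from PIL import Image

def prepare_outpaint_inputs(img, pad=128):
    """Extend the canvas by `pad` pixels on every side and build the matching mask
    (white = area for the model to fill, black = keep the original pixels)."""
    w, h = img.size
    canvas = Image.new("RGB", (w + 2 * pad, h + 2 * pad))
    canvas.paste(img, (pad, pad))                       # original image in the centre
    mask = Image.new("L", canvas.size, 255)             # start fully "unknown"
    mask.paste(Image.new("L", (w, h), 0), (pad, pad))   # known region stays untouched
    return canvas, mask

# canvas, mask = prepare_outpaint_inputs(Image.open("street.png"))
# These can then be passed to an inpainting pipeline together with a prompt
# describing how the borders should continue the scene.
```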

Thanks. I had seen that; I was wondering whether anybody has seen a blog post, a paper, or an article. I am looking for some intuition rather than the actual implementation at the moment.

You can also just replace the unconditional prompt (blank string) with the negative prompt since we move ‘away’ from that already. But your way is nice since it generalizes - you could do several positive prompts and several negative ones with a separate weight (guidance scale) for each.
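A hedged sketch of that generalisation (the function and variable names here are illustrative, not from any library): assume you already have the UNet's noise prediction for the unconditional embedding and for each extra prompt, and give every prompt its own guidance scale, with negative prompts simply getting a negative one.

```python
def guided_pred(eps_uncond, weighted_conds):
    """Combine noise predictions from several prompts.
    weighted_conds is a list of (guidance_scale, eps_cond) pairs;
    negative prompts get a negative guidance scale."""
    pred = eps_uncond
    for g, eps_cond in weighted_conds:
        pred = pred + g * (eps_cond - eps_uncond)
    return pred

# e.g. pull towards "an oil painting" with scale 7.5 and away from "blurry" with scale -2:
# pred = guided_pred(eps_uncond, [(7.5, eps_oil), (-2.0, eps_blurry)])
```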

3 Likes

So I ran some quick experiments with negative prompts to understand a bit better what was going on. Essentially I tried the following (written out in the code sketch after the list):

  1. Using the negative prompt instead of the blank string (as @johnowhitaker suggests). That is pred = g * (t - n) + n
  2. Using @Turkey / Jeremy's approach, which makes more intuitive sense (why start from the point you want to move away from?), so pred = g * (t - n) + u
  3. Using a somewhat hybrid approach: pred = g * (t - n) + g * (t - u) + u. Apart from empirical considerations, my gut feeling about this is that I want to first move away from the unconditioned image in the direction of the prompt, and then move in the direction of prompt minus negative prompt.
  4. Same as 3, but with g halved, since I am otherwise moving too far away from u
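For reference, here are the four combinations written out as code, using the post's variable names: t, n and u are the noise predictions for the target prompt, the negative prompt and the unconditional (empty) prompt, and g is the guidance scale. This is just the arithmetic above collected into one hypothetical helper, not a drop-in diffusers patch.

```python
def negative_prompt_pred(t, n, u, g, mode):
    """The four combinations tried in the experiments above."""
    if mode == 1:    # 1. negative prompt replaces the blank string
        return g * (t - n) + n
    if mode == 2:    # 2. start from the unconditional prediction instead
        return g * (t - n) + u
    if mode == 3:    # 3. hybrid: move along both (t - n) and (t - u)
        return g * (t - n) + g * (t - u) + u
    if mode == 4:    # 4. same as 3, but with the guidance scale halved
        return (g / 2) * (t - n) + (g / 2) * (t - u) + u
    raise ValueError(f"unknown mode: {mode}")
```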

These are some results. The original image was generated with the prompt “A busy street in Paris on a summer day”.

These are approaches 1-4 with the negative prompt “trees”. You can see they are all different, looking at the bottom-right corner. Interestingly enough, approach 3 is the one that removes the trees most completely, followed by 4 (which I prefer, since too high a g tends to oversaturate). Also, approach 3 feels way less summery.

Same exercise, but this time the negative prompt is “eiffel tower”. Approaches 1 and 2 modify the trees in a way that makes them more tree-like, but further away from the original one. Approach 2 introduces some weird artifacts at the end of the street, though. 3 and 4 are very similar, but I do prefer 4 as it is more natural (and more similar to 1 and 2).

All in all, I think I would either stick to 1 for simplicity/performance reasons, although it makes less intuitive sense to me, or possibly modify approach 4 so as to be able to weight the secondary prompts and distribute g among them (with the possibility of adding both positive and negative weights).

13 Likes

There is an even faster version. I posted about it here: Lesson 10 official topic - #57

1 Like

@pcuenq your suggestion to use pipe.enable_attention_slicing() did indeed work! A mere 1650 Ti seems able to run inference on the SD model. Super interesting. However, I ran into an interesting problem: I was getting only black images as outputs. I then played around with the inputs a little bit after reading this to try to understand why they were being marked as “NSFW”, but I’m unable to figure it out (please see screenshots below):
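For reference, the attention-slicing suggestion boils down to a one-liner on the pipeline. A minimal sketch (the model id and prompt here are just illustrative):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to("cuda")
pipe.enable_attention_slicing()   # compute attention in slices, trading some speed for much lower VRAM

image = pipe("a busy street in Paris on a summer day").images[0]
```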



If anyone has any insights on this it’d be great! It seems to be a very curious situation.

PS: If this should’ve been a post on the Lesson 9 Official Topic, I’ll move it there. For now I’ll assume it’s appropriate to post here since the discussion began here.

3 Likes

Glad to hear you got it working!

The NSFW flagging is because of the safety checker, which is on by default. You can turn it off like this:
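The snippet in the original post was an image; a common way (and the one referenced later in this thread) is to pass safety_checker=None when building the pipeline, or to overwrite it on an existing one. A sketch, with an illustrative model id:

```python
from diffusers import StableDiffusionPipeline

# Option 1: don't load the safety checker at all
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    safety_checker=None,
)

# Option 2: disable it on an already-constructed pipeline
pipe.safety_checker = None
```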

3 Likes

Thanks for your reply, @ilovescience!
I tried to switch off the safety checker as shown:


The output is still a black image, and printing the model output shows that the NSFW tag is False.

Then I tried this method from @pcuenq :

Here’s the result:


The model output, showing the nsfw_content_detected values as None:

How am I seeing a black image despite the NSFW filter apparently being switched off? Am I making any mistakes here?

One method that I use and have tested is to uninstall HF diffusers and install a custom version of diffusers using:

pip install --user git+https://github.com/huggingface/diffusers.git@e895952816dc4a9a04d1d82fb5f8bdee59341c06

One more method which I have tested is to manually disable the safety-check code in the installed packages, as shown by @cjmills. More details here.

1 Like

@vettukal thanks for your suggestions.
I tried the first (uninstalling HF diffusers and installing a custom version), but that still yielded a black image.
Then I checked this post:

Since we have an updated diffusers library (0.6.0, as of today), the modification suggested by @cjmills is in essence the same as passing in safety_checker=None during pipeline creation, as suggested here:

Regardless, I did give it a go - manually switched off the safety checker in the pipeline_stable_diffusion.py script, imported StableDiffusionPipeline again and still saw a black image as output. As mentioned in my previous reply, the NSFW warning is gone, but the image output is the same! Suspecting that a low value of num_inference_steps might somehow be causing the problem, I tried bumping that up to 42, but that didn’t work either.

If anyone has any insights on this, please do let me know. Any other alternative directions to explore as to why the model keeps giving a black image output even with the safety checker turned off?
(Apologies for spamming the forum with questions about this one issue!)

Hi @james.em, I agree with you that it doesn’t look related to the safety checker. I’m thinking maybe there’s some sort of limitation in that card, beyond the RAM? My understanding is it’s the same architecture as, for example, the 1080 Ti, and I know people that successfully use this one with diffusers. If you don’t want to use the general forum for this one topic maybe you can start a new one, or you can send me a private message with the code you use and I’ll try to reproduce (although I don’t currently have any GPUs of that family).

3 Likes

The random generator bug seems to be a common issue among many libraries: Random fork-safety - OpenSSLWiki

2 Likes

I honestly don’t think it’s a bug. I think it’s the intended behaviour (otherwise reproducibility would not be possible in a parallel context). There are (complicated) ways around it, like the one documented here for numpy, for example.

In fact, this is documented in the Python docs:

By re-using a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running

With Python, a seed does not give you reproducibility when multithreading. Useful for us, but a blocker in other contexts.
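As a concrete example of the numpy approach linked above (a minimal sketch): rather than relying on one global seed shared across workers, you spawn independent child seeds from a single parent, which keeps runs reproducible while giving every worker its own stream.

```python
import numpy as np

parent = np.random.SeedSequence(42)                    # one reproducible parent seed
children = parent.spawn(4)                             # independent child seeds, one per worker
rngs = [np.random.default_rng(c) for c in children]

for i, rng in enumerate(rngs):
    print(i, rng.integers(0, 100, size=3))             # each worker draws from its own stream
```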

Hey, did anyone try the new Stable Diffusion model, v1-5?