From what I understand of their docs here, this is mainly for places where you have memory constraints.
“Note: If you are limited by GPU memory and have less than 10GB of GPU RAM available, please make sure to load the StableDiffusionPipeline in float16 precision instead of the default float32 precision as done above.”
I imagine that most of us should be fine using float32, especially if using cloud instances, but I wonder whether it's also slower to generate the images. The way it is right now probably allows for faster iteration.
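For a rough sense of why the docs recommend float16 under memory pressure: each parameter takes 2 bytes in fp16 versus 4 in fp32, so the model weights take half the GPU RAM. A quick back-of-the-envelope sketch (the parameter count below is an assumption, roughly the size of the Stable Diffusion UNet, not an exact figure):

```python
import numpy as np

# fp32 stores each value in 4 bytes, fp16 in 2 bytes,
# so switching precision roughly halves the memory for the weights.
n_params = 860_000_000  # assumed approximate UNet parameter count

fp32_gb = n_params * np.dtype(np.float32).itemsize / 1e9
fp16_gb = n_params * np.dtype(np.float16).itemsize / 1e9
print(f"fp32: {fp32_gb:.2f} GB, fp16: {fp16_gb:.2f} GB")
```

Activations and other buffers add to this, which is why the docs draw the line around 10GB rather than at the raw weight size.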
In any case, I used the requirements.txt (recently merged in) for my pip install, then restarted the kernel (while using JarvisLabs) and it worked for me with the notebook as is.
@jeremy Seems like there are a lot of issues related to the versions of packages. Do you want to keep this in the back of your mind and maybe release a stable configuration later on?
A requirements.txt file was merged into the repo (yesterday, I think), which should be what you're looking for in terms of 'stable' versions of packages. Hopefully that can be updated as needed.
There seems to be a similar issue with the inpainting pipeline. Currently you have to use autocast to get around it. If anyone wants to play around with it, here is the code:
import torch
from torch import autocast
from diffusers import StableDiffusionInpaintPipeline
from fastdownload import FastDownload
from PIL import Image, ImageDraw

pipe_inpaint = StableDiffusionInpaintPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")

# credit: https://www.reddit.com/r/funny/comments/blw0hq/a_true_photobomb/
p = FastDownload().download("https://i.redd.it/hdxi9adiquw21.jpg")
init_image = Image.open(p).convert("RGB").resize((485, 323))

# mask is black and white (white = region to be inpainted)
msk = init_image.resize((485, 323))
drawMsk = ImageDraw.Draw(msk)
drawMsk.rectangle([(0, 0), (485, 323)], outline=(0, 0, 0), fill=(0, 0, 0))  # black background
drawMsk.rectangle([(350, 170), (480, 250)], outline=(255, 255, 255), fill=(255, 255, 255))  # white inpaint region
image_grid([init_image, msk], 1, 2)  # image_grid helper from the course notebook

torch.manual_seed(10)
prompt = "rocks in the water"
with autocast("cuda"):
    images = pipe_inpaint(
        prompt=prompt,
        init_image=init_image,
        mask_image=msk,
        strength=0.7,
        guidance_scale=8,
        num_inference_steps=70,
    ).images
images[0]
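The mask-building part can be checked in isolation with PIL alone, without a GPU or the pipeline. A minimal sketch (using `Image.new` to create the black background directly, rather than copying the init image as above):

```python
from PIL import Image, ImageDraw

# Black background; white rectangle marks the region to be inpainted.
msk = Image.new("RGB", (485, 323), (0, 0, 0))
draw = ImageDraw.Draw(msk)
draw.rectangle([(350, 170), (480, 250)], fill=(255, 255, 255))

print(msk.getpixel((10, 10)))    # outside the region → (0, 0, 0)
print(msk.getpixel((400, 200)))  # inside the region → (255, 255, 255)
```

This makes it easy to eyeball the mask before spending inference time on the full pipeline.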
Thanks for reporting it! This issue with the image2image and inpainting pipelines has been fixed in v0.4.2. Please upgrade your version of diffusers, and then this should run in fp16 without any issues.
This course will cover a lot of cutting edge stuff - the idea of having a stable configuration isn’t really compatible with what we’re doing here… Expect sharp edges!
Also, a goal of this course is for students to get comfortable with cutting edge research - which includes dealing with all the little software issues which come up in research code.