Height and width of images must be divisible by 8

bipin · October 12, 2022, 5:11pm

I was playing around with StableDiffusionPipeline and tried changing the height and width of the output image using the code below, but it throws an error saying that both values must be divisible by 8. Does this apply to all Stable Diffusion implementations, or is it specific to the diffusers library? Also, is there a reason for this restriction?

Code:

pipe(prompt, width=28, height=128).images[0]

Exception raised:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17/476363274.py in <module>
----> 1 pipe(prompt, width=28, height=128).images[0]

/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     25         def decorate_context(*args, **kwargs):
     26             with self.clone():
---> 27                 return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29 

/opt/conda/lib/python3.7/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py in __call__(self, prompt, height, width, num_inference_steps, guidance_scale, negative_prompt, num_images_per_prompt, eta, generator, latents, output_type, return_dict, callback, callback_steps, **kwargs)
    191 
    192         if height % 8 != 0 or width % 8 != 0:
--> 193             raise ValueError(f"`height` and `width` have to be divisible by 8 but are {height} and {width}.")
    194 
    195         if (callback_steps is None) or (

ValueError: `height` and `width` have to be divisible by 8 but are 128 and 28.

barnacl · October 12, 2022, 5:40pm

@bipin The autoencoder takes each 8x8x3 volume (height, width, channels) of the image and compresses it down to a 1x1x4 vector, which is why both dimensions have to be divisible by 8. Check out Jono’s video (Lesson 2 A) and the notebook, section "The Autoencoder (AE)".
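To make the arithmetic concrete, here is a minimal sketch of how the latent shape follows from the image size, assuming the 8x downsampling and 4 latent channels described above. The names `latent_shape` and `round_down_to_multiple` are hypothetical helpers written for illustration; they are not part of diffusers.

```python
# Assumption: the autoencoder maps each 8x8x3 image patch to a 1x1x4 latent,
# as described in the post above. These constants are illustrative only.
VAE_SCALE_FACTOR = 8
LATENT_CHANNELS = 4


def latent_shape(height, width):
    """Return the (channels, height, width) of the latent for a given image size."""
    if height % VAE_SCALE_FACTOR != 0 or width % VAE_SCALE_FACTOR != 0:
        # Same kind of check the pipeline performs before running.
        raise ValueError(
            f"`height` and `width` have to be divisible by {VAE_SCALE_FACTOR} "
            f"but are {height} and {width}."
        )
    return (
        LATENT_CHANNELS,
        height // VAE_SCALE_FACTOR,
        width // VAE_SCALE_FACTOR,
    )


def round_down_to_multiple(value, factor=VAE_SCALE_FACTOR):
    """Snap a requested size down to the nearest valid multiple (at least one patch)."""
    return max(factor, (value // factor) * factor)


# width=28 is rejected, so snap it to a valid size first.
width = round_down_to_multiple(28)
print(width)                     # 24
print(latent_shape(128, width))  # (4, 16, 3)
```

With a valid size, the call from the original question would become something like `pipe(prompt, width=24, height=128).images[0]`. Very small sizes may still produce poor images even when they pass the check.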

bipin · October 12, 2022, 5:51pm

Oh okay, thanks for pointing this out. I’ll check the notebook.