Questions on Finetuning Stable Diffusion

VigneshBaskaran · November 29, 2022, 8:55am

I would like to understand how to build datasets for finetuning Stable Diffusion. I have the following questions and would be grateful if someone could help me with some answers:

1. If I understand correctly, we need pairs of images and text captions to finetune a Stable Diffusion model. Let us say I would like to finetune an SD model to generate only high-quality faces. I can collect a dataset of 1000+ faces, but wouldn’t I also need captions for them to finetune the model? If so, how can I get captions? Or can I simply use a standard caption like ‘A photo of a man’ or ‘A photo of a woman’?
2. In general, how do people finetune Stable Diffusion? Are there any repositories that guide people through it?
3. Could you please share any resources on efficiently finetuning Stable Diffusion?

Thank you very very much for your help!

bencoman · November 29, 2022, 1:50pm

I saw this yesterday…

VigneshBaskaran · November 29, 2022, 4:51pm

Thank you Ben. Let me check this out

jamesrequa · November 30, 2022, 2:08am

I recommend the Hugging Face diffusers repo, as it has many well-documented example scripts for finetuning Stable Diffusion, among other tasks.

github.com

huggingface/diffusers/blob/main/examples/text_to_image/README.md

# Stable Diffusion text-to-image fine-tuning

The `train_text_to_image.py` script shows how to fine-tune a Stable Diffusion model on your own dataset.

___Note___:

___This script is experimental. The script fine-tunes the whole model, and the model often overfits and runs into issues like catastrophic forgetting. It's recommended to try different hyperparameters to get the best result on your dataset.___


## Running locally with PyTorch
### Installing the dependencies

Before running the scripts, make sure to install the library's training dependencies:

```bash
pip install git+https://github.com/huggingface/diffusers.git
pip install -U -r requirements.txt
```

And initialize an [🤗Accelerate](https://github.com/huggingface/accelerate/) environment with:

This file has been truncated.
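
On the caption question from the original post: one way to try the "standard caption" idea is to pair every image in a folder with the same fixed caption. The sketch below is only an illustration, not something taken from the README. It assumes the training data is loaded through the Hugging Face `datasets` image-folder convention, where a `metadata.jsonl` file next to the images has a `file_name` field plus extra columns, and that the caption column is named `text`. The function name `write_caption_metadata`, the default caption, and the list of image suffixes are all made up for this example.

```python
import json
from pathlib import Path

# Assumption: these are the image types in the folder; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def write_caption_metadata(image_dir, caption="a photo of a person"):
    """Write metadata.jsonl pairing every image in image_dir with one fixed caption.

    Hypothetical helper: the "file_name" key follows the image-folder
    convention of the `datasets` library, and "text" is an assumed name
    for the caption column.
    """
    image_dir = Path(image_dir)
    # Sort so the output file is reproducible between runs.
    images = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    metadata_path = image_dir / "metadata.jsonl"
    with metadata_path.open("w", encoding="utf-8") as f:
        for path in images:
            record = {"file_name": path.name, "text": caption}
            f.write(json.dumps(record) + "\n")
    # Return the number of image/caption pairs written.
    return len(images)
```

For example, `write_caption_metadata("faces/men", "a photo of a man")` would write one line per image in a hypothetical `faces/men` folder. A single fixed caption gives the model very little text signal to learn from, so per-image captions (written by hand or produced by a captioning model) may work better; that trade-off is worth testing on your own data.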