Lesson 9 official topic

If anyone has run into CUDA out-of-memory issues in the stable_diffusion.ipynb notebook, I’ve put up a quick and dirty fix as a pull request here: https://github.com/fastai/diffusion-nbs/pull/4 . This at least helped my 12GB 3080 Ti GPU. I’ll be looking at other ways to improve on this next (mostly based on this optimization recommendations page: https://huggingface.co/docs/diffusers/optimization/fp16 )

Update: It now runs on an 11GB 1080 Ti with some further small changes for memory efficiency.
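
For anyone curious, here’s a minimal sketch of the kind of memory-saving changes that optimization page recommends (assuming a recent diffusers version; the actual PR may differ):

import torch
from diffusers import StableDiffusionPipeline

# Load the weights in fp16, roughly halving the model's memory footprint
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")

# Compute attention in slices: slightly slower, but much lower peak memory
pipe.enable_attention_slicing()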

5 Likes

Hey @jeremy, you mentioned at the beginning of the lesson that we would be able to view the YouTube live chat. It doesn’t look like the chat replay is available for the first lesson’s live stream.

Is anyone else able to see the chat on YouTube?

I couldn’t see the live chat either.

Yeah I discovered afterwards that that feature was turned off on my account! Sorry about that. I’ve enabled it now, so hopefully it works next time.

9 Likes

Just watched lesson 9A and have a question about combining the text embedding and the positional encoding. It obviously works, but it seems unintuitive to me, as the addition would seem to push the text embedding away from its native state. I can intuit a concatenation, where both remain available in their raw form. Is there a good way to think about this that I am missing?

2 Likes

This worked great for me on a JarvisLabs fastai instance. Just be sure to restart the kernel and then it’ll all work.

2 Likes

I was also wondering the same thing. There is an obvious drawback to concatenation: it makes everything larger and hence slower. I was reading an explanation which says the dimensions of the text embedding are large enough to capture both the text and the positional information. That is all I have :smiley:
Here is the video by Coffee Bean that talks about it
Here is the other link
tldr:
“It is intuitively possible that, in high dimensions, the word vectors form a smaller dimensional subspace within the full embedding space, and the positional vectors form a different smaller dimensional subspace approximately orthogonal to the one spanned by word vectors. Thus despite vector addition, the two subspaces can be manipulated essentially independently of each other by some single learned transformation. Thus, concatenation doesn’t add much, but greatly increases cost in terms of parameters to learn.”
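
To make that tl;dr concrete, here’s a toy sketch (purely illustrative; real learned subspaces wouldn’t be axis-aligned like this) showing that when the two components occupy orthogonal subspaces, a single linear map can recover either one from the sum:

import torch

d = 768
n = 5
# Pretend word vectors occupy the first d//2 dims and positional
# vectors the last d//2, so the two subspaces are orthogonal
word = torch.cat([torch.randn(n, d // 2), torch.zeros(n, d // 2)], dim=1)
pos = torch.cat([torch.zeros(n, d // 2), torch.randn(n, d // 2)], dim=1)
summed = word + pos

# A single fixed projection pulls the word component back out exactly
proj = torch.diag(torch.cat([torch.ones(d // 2), torch.zeros(d // 2)]))
print(torch.allclose(summed @ proj, word))  # True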

7 Likes

The link seems to be down.

I’ll try the repo and update.

2 Likes

I think that it is the same way we find it hard to visualise >3 dimensions. I accept that premise and I like the summary you have. Thanks

2 Likes

Building on @barnacl 's answer:
Neural networks want to work. Since everything here (the different embeddings, the many transformer layers) is learned, my intuition is that the model ‘figures out’ how to do the best with what it has. We could separately create a token embedding and a position embedding and concatenate them, but here instead we’re saying “here are 768 dimensions total to work with” and the network can find the best way to make use of that. Maybe that’s having a much lower-dimensional subspace for the position components, maybe it’s something a little weirder that a human programmer wouldn’t have come up with.
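
In code, that “just add them” choice looks something like the sketch below (a hypothetical layer on its own; the sizes match the CLIP text encoder used in SD, which likewise sums learned position embeddings):

import torch

vocab_size, max_len, d_model = 49408, 77, 768
tok_emb = torch.nn.Embedding(vocab_size, d_model)
pos_emb = torch.nn.Embedding(max_len, d_model)

ids = torch.randint(0, vocab_size, (1, max_len))  # stand-in for a tokenised prompt
positions = torch.arange(max_len).unsqueeze(0)

# Token and position embeddings share the same 768 dims and are simply summed
x = tok_emb(ids) + pos_emb(positions)             # shape (1, 77, 768)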

8 Likes

Thanks. I really enjoyed the lesson.

1 Like

Anybody else having a problem viewing the lesson? It’s barely loading on any quality setting and buffering every few seconds. Preloading only solves the problem for a few minutes. I really don’t know why. My internet is totally fine at 100 Mbit; I can watch anything else, including 4K content on YouTube, just fine. Heck, I even have YouTube Premium :sweat_smile:
I am watching from Germany, so maybe I could try switching the region via VPN. It works if I download the video via the YouTube app on my phone and iPad, but that’s not the best viewing experience. Happy for any pointers here :slight_smile:

2 Likes

I am having exactly the same problem. I’m also in Europe and on a super fast internet connection. I use a VPN always, but changing region doesn’t seem to make much of a difference. I’m probably going to download the raw video to get round this problem.

Thanks for the quick answer. It doesn’t seem to be only me, then. I just tried downloading via pytube, but it doesn’t work; maybe that’s because the video is unlisted. Not sure. I might try some browser plugins.

I don’t seem to have an issue while using Chrome, for some reason. Firefox on Windows didn’t work, and Safari on Mac didn’t work. Now viewing via Chrome on Mac with no issues. Maybe give it a try.

1 Like

Hi guys,

I’m trying to run stable_diffusion.ipynb on Paperspace. After logging into Hugging Face, when I run

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16).to("cuda")

I received the following error message:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
File ~/mambaforge/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py:213, in hf_raise_for_status(response, endpoint_name)
    212 try:
--> 213     response.raise_for_status()
    214 except HTTPError as e:

File ~/mambaforge/lib/python3.9/site-packages/requests/models.py:960, in Response.raise_for_status(self)
    959 if http_error_msg:
--> 960     raise HTTPError(http_error_msg, response=self)

HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/fp16/model_index.json

The above exception was the direct cause of the following exception:

HfHubHTTPError                            Traceback (most recent call last)
File ~/mambaforge/lib/python3.9/site-packages/diffusers/configuration_utils.py:223, in ConfigMixin.get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    221 try:
    222     # Load from URL or cache if already cached
--> 223     config_file = hf_hub_download(
    224         pretrained_model_name_or_path,
    225         filename=cls.config_name,
    226         cache_dir=cache_dir,
    227         force_download=force_download,
    228         proxies=proxies,
    229         resume_download=resume_download,
    230         local_files_only=local_files_only,
    231         use_auth_token=use_auth_token,
    232         user_agent=user_agent,
    233         subfolder=subfolder,
    234         revision=revision,
    235     )
    237 except RepositoryNotFoundError:

File ~/mambaforge/lib/python3.9/site-packages/huggingface_hub/file_download.py:1053, in hf_hub_download(repo_id, filename, subfolder, repo_type, revision, library_name, library_version, cache_dir, user_agent, force_download, force_filename, proxies, etag_timeout, resume_download, use_auth_token, local_files_only, legacy_cache_layout)
   1052 try:
-> 1053     metadata = get_hf_file_metadata(
   1054         url=url,
   1055         use_auth_token=use_auth_token,
   1056         proxies=proxies,
   1057         timeout=etag_timeout,
   1058     )
   1059 except EntryNotFoundError as http_error:
   1060     # Cache the non-existence of the file and raise

File ~/mambaforge/lib/python3.9/site-packages/huggingface_hub/file_download.py:1359, in get_hf_file_metadata(url, use_auth_token, proxies, timeout)
   1350 r = _request_wrapper(
   1351     method="HEAD",
   1352     url=url,
   (...)
   1357     timeout=timeout,
   1358 )
-> 1359 hf_raise_for_status(r)
   1361 # Return

File ~/mambaforge/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py:254, in hf_raise_for_status(response, endpoint_name)
    252 # Convert `HTTPError` into a `HfHubHTTPError` to display request information
    253 # as well (request id and/or server error message)
--> 254 raise HfHubHTTPError(str(HTTPError), response=response) from e

HfHubHTTPError: <class 'requests.exceptions.HTTPError'> (Request ID: D43zpesQMDw2FCdgEwYc6)

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16).to("cuda")

File ~/mambaforge/lib/python3.9/site-packages/diffusers/pipeline_utils.py:345, in DiffusionPipeline.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    342 # 1. Download the checkpoints and configs
    343 # use snapshot download here to get it working from from_pretrained
    344 if not os.path.isdir(pretrained_model_name_or_path):
--> 345     config_dict = cls.get_config_dict(
    346         pretrained_model_name_or_path,
    347         cache_dir=cache_dir,
    348         resume_download=resume_download,
    349         proxies=proxies,
    350         local_files_only=local_files_only,
    351         use_auth_token=use_auth_token,
    352         revision=revision,
    353     )
    354     # make sure we only download sub-folders and `diffusers` filenames
    355     folder_names = [k for k in config_dict.keys() if not k.startswith("_")]

File ~/mambaforge/lib/python3.9/site-packages/diffusers/configuration_utils.py:255, in ConfigMixin.get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    251     raise EnvironmentError(
    252         f"{pretrained_model_name_or_path} does not appear to have a file named {cls.config_name}."
    253     )
    254 except HTTPError as err:
--> 255     raise EnvironmentError(
    256         "There was a specific connection error when trying to load"
    257         f" {pretrained_model_name_or_path}:\n{err}"
    258     )
    259 except ValueError:
    260     raise EnvironmentError(
    261         f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load this model, couldn't find it"
    262         f" in the cached files and it looks like {pretrained_model_name_or_path} is not the path to a"
   (...)
    265         " 'https://huggingface.co/docs/diffusers/installation#offline-mode'."
    266     )

OSError: There was a specific connection error when trying to load CompVis/stable-diffusion-v1-4:
<class 'requests.exceptions.HTTPError'> (Request ID: D43zpesQMDw2FCdgEwYc6)

I have no experience dealing with this kind of problem. Could anyone take a look? Thanks!

OK, let me rephrase that to “fewer issues”: the buffering appeared again after some time. Sigh.

I found a working YouTube downloader and will upload the video to my Google Drive, for sharing within this thread only, once it’s done. I hope that is fine with @jeremy?

Did you accept the license on the Hugging Face website?
There is a model card link in the Using Stable Diffusion section; click that and it should take you to the website, where you can accept the terms. You also have to create a token.
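
Once you’ve accepted the license and created a token, you also need to authenticate inside the notebook before loading the pipeline, e.g. (this mirrors what the course notebook does with huggingface_hub):

from huggingface_hub import notebook_login

# Paste the access token from huggingface.co/settings/tokens when prompted
notebook_login()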

4 Likes

Small thought that helped me visualise what’s going on (I understand this is only a slight rephrasing of what was said in the lecture; it just helped me to phrase it this way). An incredibly high-level view of what goes on with SD is the following:

A model M is a function of the input, its parameters and a loss function. While training, we minimize the loss by changing the parameters while keeping the inputs fixed. With SD inference, we minimize a different loss by changing the input and keeping the parameters fixed.

In a way, during inference the input becomes the parameters of our model.

Possibly trivial observation, but maybe somebody finds it useful too.
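
In code, the contrast looks roughly like this toy sketch (a made-up stand-in model, not the actual SD sampler; f and the target value are just for illustration):

import torch

theta = torch.tensor([2.0, -1.0])      # "frozen" parameters
f = lambda x: (theta * x).sum()        # stand-in for the model

# Inference-as-optimization: update the *input* against a loss,
# leaving the parameters untouched
x = torch.randn(2, requires_grad=True)
opt = torch.optim.SGD([x], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    loss = (f(x) - 5.0) ** 2           # drive the output toward 5
    loss.backward()
    opt.step()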

4 Likes