RuntimeError when exporting a model

When trying to export a model in VSCode on my M1 Mac with

learn.export('model.pkl')

I get the following error:
RuntimeError: don't know how to determine data location of torch.storage._UntypedStorage

Does anybody know how to solve this problem?


I found your error message mentioned here

I saw that too, but that doesn't really help me :slightly_frowning_face:

Oh. That's disappointing.

It really, really helps to describe what you've already tried and learnt, particularly for the next person who encounters the same problem searching for an answer. For example, I'm sure you've checked how your current python version compares to the environment reported for that bug. Let us know. That is useful information for those in the community who would like to help you.

Also for example, I'm sure that you've tried reproducing the YOLO5 error using the given Minimal Reproducible Example. Their Quickstart example seems pretty simple, so please report the exact command/response you got from that. Since Google results for your error message are currently very scarce, and the YOLO5 bug was reported only a month ago, maybe very few people have encountered your issue.

Please read How To Ask Questions The Smart Way paying particular attention to “display the fact that you have done these things.” Hopefully we can help you find an answer to your issue.


Sorry for my ignorance, but I don’t really get how the YOLO5 error is related to mine.
My problem occurs when I use the export() function from fastai.
I use the same python version as in the reported bug, but my pytorch version is 1.12.0 instead of the nightly build, simply because fastai overrides the pytorch version for compatibility reasons I guess.

Hi @HoangLong, Have you managed to solve this yet?

I was curious to see what change was made in the PR that fixed that YOLO issue I linked above, and it's interesting that the fix seems to be forcing CPU usage.

You don’t mention that you’ve tried any simple test programs like the following, so can you report if they work for you?

  1. “Verifying the installation” on this page

  2. “Step 8” on this page

  3. “run a simple program” half way down this page

  4. “Step 4 Testing” on this page

If they do all work, then sorry I’m lacking any more useful tips. HTH.

I can’t reproduce this issue on M1 Mac with fastai 2.7.7/torch 1.12.0. This runs successfully.

learn = vision_learner(dls, resnet18, metrics=error_rate, lr=0.15)
learn.export('model.pkl')

Are you using a custom model?


All of the steps work fine.
I'm guessing it has something to do with missing "mps" support in serialization.py in pytorch. I have pytorch 1.12 installed, but when comparing with the GitHub code, somehow my copy is missing the code for "mps" support.


Also using fastai 2.7.7/torch 1.12.0. No custom model. Just trying out this notebook locally on VSCode:


I also have this issue.

This is a pytorch library problem: pytorch is what's throwing the error, not yolov5.

(@bencoman in the issue you linked, they closed it at the end by identifying the issue as a pytorch issue and asked the user to file a bug with pytorch (Torch MPS (gpu) acceleration not working M1 Mac. · Issue #8102 · ultralytics/yolov5 · GitHub))

Calling the fastai learn.export() function is very simple code, and that is the minimal reproduction here: just calling learn.export(). It works for me on Kaggle but not locally on my computer.

@manojmohan Interesting you don’t get the same issue since you do have seemingly the same Mac setup we do. We’re running the same pytorch versions. What’s your Mac OS version?
@HoangLong What’s your Mac OS version?
Mine: Intel Mac 2019. Mac version 12.5.1

Also to have a record of the logs, this is my stack trace:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [16], in <cell line: 1>()
----> 1 learn.export('resnet18.pkl')

File ~/.pyenv/versions/mambaforge/lib/python3.10/site-packages/fastai/learner.py:418, in export(self, fname, pickle_module, pickle_protocol)
    415 with warnings.catch_warnings():
    416     #To avoid the warning that come from PyTorch about model not being checked
    417     warnings.simplefilter("ignore")
--> 418     torch.save(self, self.path/fname, pickle_module=pickle_module, pickle_protocol=pickle_protocol)
    419 self.create_opt()
    420 if state is not None: self.opt.load_state_dict(state)

File ~/.pyenv/versions/mambaforge/lib/python3.10/site-packages/torch/serialization.py:379, in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization)
    377 if _use_new_zipfile_serialization:
    378     with _open_zipfile_writer(opened_file) as opened_zipfile:
--> 379         _save(obj, opened_zipfile, pickle_module, pickle_protocol)
    380         return
    381 _legacy_save(obj, opened_file, pickle_module, pickle_protocol)

File ~/.pyenv/versions/mambaforge/lib/python3.10/site-packages/torch/serialization.py:589, in _save(obj, zip_file, pickle_module, pickle_protocol)
    587 pickler = pickle_module.Pickler(data_buf, protocol=pickle_protocol)
    588 pickler.persistent_id = persistent_id
--> 589 pickler.dump(obj)
    590 data_value = data_buf.getvalue()
    591 zip_file.write_record('data.pkl', data_value, len(data_value))

File ~/.pyenv/versions/mambaforge/lib/python3.10/site-packages/torch/serialization.py:574, in _save.<locals>.persistent_id(obj)
    571         storage_dtypes[storage.data_ptr()] = storage_dtype
    573 storage_key = id_map.setdefault(storage._cdata, str(len(id_map)))
--> 574 location = location_tag(storage)
    575 serialized_storages[storage_key] = storage
    577 return ('storage',
    578         storage_type,
    579         storage_key,
    580         location,
    581         storage_numel)

File ~/.pyenv/versions/mambaforge/lib/python3.10/site-packages/torch/serialization.py:169, in location_tag(storage)
    167     if location:
    168         return location
--> 169 raise RuntimeError("don't know how to determine data location of "
    170                    + torch.typename(storage))

RuntimeError: don't know how to determine data location of torch.storage._UntypedStorage
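For anyone reading the trace: torch.save drives a standard pickle Pickler whose persistent_id hook must map every storage it encounters to a device-location string via location_tag; if no registered tagger claims the device (as with "mps" here), it raises. A toy stdlib-only sketch of that mechanism (the FakeStorage-style classes and the single-entry tagger list are made up for illustration, not torch code):

```python
import io
import pickle

# Toy stand-ins for torch storages. torch.save must map each storage
# to a device "location" string via torch.serialization.location_tag.
class CpuStorage: device = "cpu"
class MpsStorage: device = "mps"

# Taggers are tried in order, like torch's internal registry.
# Note there is no entry that recognises "mps" -- that is the bug.
def _cpu_tag(storage):
    if storage.device == "cpu":
        return "cpu"

_taggers = [_cpu_tag]

def location_tag(storage):
    for tagger in _taggers:
        location = tagger(storage)
        if location:
            return location
    raise RuntimeError("don't know how to determine data location of "
                       + type(storage).__name__)

class TaggingPickler(pickle.Pickler):
    # torch.save hooks persistent_id to pull storages out of the pickle
    # stream; any storage whose device has no tagger blows up here.
    def persistent_id(self, obj):
        if isinstance(obj, (CpuStorage, MpsStorage)):
            return ("storage", location_tag(obj))
        return None

def save(obj):
    buf = io.BytesIO()
    TaggingPickler(buf).dump(obj)
    return buf.getvalue()

save(CpuStorage())      # fine: tagged as "cpu"
try:
    save(MpsStorage())  # raises, just like exporting an mps-resident model
except RuntimeError as exc:
    print(exc)
```

That is why every workaround below amounts to making sure no mps-resident storage ever reaches the pickler.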

@malwin Mine is an M1 Pro 2021 running Monterey.

Mine is also M1 Pro 2021 with Monterey.

@HoangLong @malwin Hello~ Have any of you resolved or found more information on this issue?

I’m working through the 02_production notebook and encountering the exact same error…

When I run learn.export() to pickle+save my model, it spits out "RuntimeError: don't know how to determine data location of torch.storage._UntypedStorage"

Still curious what the issue may be, but for now, this worked for me to at least save the model as a pkl file:

torch.save(learn.state_dict(), './model.pkl')

At the moment there still seem to be a lot of issues when trying to use the "mps" backend on Apple silicon Macs.
So the only way for me to make learn.export() work was to disable "mps" with the following code:

fastai.torch_core.default_device(False)

Downside is that training takes much longer now…
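A middle ground that should keep the "mps" speedup for training is to move the model to CPU only for the save, then move it back. This is a sketch I have not verified against fastai itself (with a Learner you would move learn.model, and probably learn.dls, the same way around learn.export()); in plain torch it looks like:

```python
import torch
import torch.nn as nn

# Stand-in for learn.model; with fastai you would move learn.model instead.
model = nn.Linear(4, 2)
mps_ok = getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available()
device = "mps" if mps_ok else "cpu"
model = model.to(device)

# Save from CPU so the pickler only ever sees cpu-resident storages...
model = model.cpu()
torch.save(model.state_dict(), "model.pkl")

# ...then move back so training stays accelerated.
model = model.to(device)

# The checkpoint then loads anywhere, with or without mps support.
state = torch.load("model.pkl", map_location="cpu")
```

Saving the state_dict this way only keeps the weights, so on load you have to rebuild the model/learner first, unlike learn.export() which pickles the whole Learner.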
