@ady_anr yes, you can. As the tutorial suggests, you can access root_dir using the standard Python file system commands.
Example:
To view ALL your GDrive files, just execute:
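A minimal sketch of what that looks like. The `/content/gdrive/My Drive/` path is the usual Colab mount point (after `google.colab.drive.mount('/content/gdrive')`), but check your own mount call; the helper itself is just standard `os` calls:

```python
import os

# After drive.mount('/content/gdrive') in a Colab cell, your Drive appears
# as an ordinary directory, so standard file-system calls work on it.
root_dir = '/content/gdrive/My Drive/'

def list_tree(root, max_entries=50):
    """Return relative paths of up to max_entries files found under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, fname), root))
            if len(found) >= max_entries:
                return found
    return found

# Prints your Drive files in Colab; an empty list anywhere else.
print(list_tree(root_dir))
```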
Has anybody found a solution to running ImageCleaner on Colab?
Each time I try running that line in the lesson 2 download ipynb, the runtime gets restarted.
Stuck.
Does anyone have a solution to my problem?
When using Colab, rather than filling up my gdrive with lots of standard datasets, I want to use gdrive just for my model data.
So I want my setup to look like this:
/content/gdrive/My Drive/my_project/models/…
    model_files
/root/.fastai/data/…
    mnist_sample
        train
        valid
        labels.csv
I am guessing running the learner on data stored on the Colab file server is much faster, but I want my model saves to persist and be able to be loaded again.
It doesn't seem a big issue to re-download the dataset when you need it,
but it seems a Learner object or the save/load methods won't allow me to specify a path.
Does anyone have a solution to this?
Either in how you set up your project for persistence, or in how you save out your learner models?
If you have a solution, can you post the setup code and a short example workflow of a learner save/load?
NB: I am guessing I can write something like the following,
but is there a cleaner way of doing it with the built-in functions? E.g. the ability to set a custom save location via the learner.save method:
I did some research and overloaded the save and load methods with this code.
Anyone care to comment? @jeremy
def custom_path_save(self, name:PathOrStr, path='', return_path:bool=False, with_opt:bool=True):
    "Save model and optimizer state (if `with_opt`) with `name` to `self.model_dir`."
    # deleted: path = self.path/self.model_dir/f'{name}.pth'
    # my addition: start
    if path == '': path = self.path/self.model_dir/f'{name}.pth'
    else: path = f'{path}/{name}.pth'
    # my addition: end
    if not with_opt: state = get_model(self.model).state_dict()
    else: state = {'model': get_model(self.model).state_dict(), 'opt': self.opt.state_dict()}
    torch.save(state, path)
    if return_path: return path
def custom_path_load(self, name:PathOrStr, path='', device:torch.device=None, strict:bool=True, with_opt:bool=None):
    "Load model and optimizer state (if `with_opt`) `name` from `self.model_dir` using `device`."
    if device is None: device = self.data.device
    # deleted: state = torch.load(self.path/self.model_dir/f'{name}.pth', map_location=device)
    # my addition: start
    if path == '': path = self.path/self.model_dir/f'{name}.pth'
    else: path = f'{path}/{name}.pth'
    state = torch.load(path, map_location=device)
    # my addition: end
    if set(state.keys()) == {'model', 'opt'}:
        get_model(self.model).load_state_dict(state['model'], strict=strict)
        if ifnone(with_opt, True):
            if not hasattr(self, 'opt'): self.create_opt(defaults.lr, self.wd)
            try: self.opt.load_state_dict(state['opt'])
            except: pass
    else:
        if with_opt: warn("Saved file doesn't contain an optimizer state.")
        get_model(self.model).load_state_dict(state, strict=strict)
    return self
learn.save = custom_path_save.__get__(learn)
learn.load = custom_path_load.__get__(learn)
# if you don't want to overload
#learn.custom_path_save = custom_path_save.__get__(learn)
#learn.custom_path_load = custom_path_load.__get__(learn)
model_path = '/content/gdrive/My Drive/fastai-v3/data/'
learn.save('new-model-name', path=model_path)
learn.load('new-model-name', path=model_path)
and received this error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-8-41891b907d8d> in <module>()
1 path = Path(base_dir + 'data/pets')
----> 2 dest = path/folder
3 dest.mkdir(parents=True, exist_ok=True)
4 path = untar_data(URLs.PETS); path
NameError: name 'folder' is not defined
I made a folder in Drive named fastai-v3. I don't quite understand where untar_data is saving the files.
How do I make it save to Google Drive,
or to a folder on my PC?
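For what it's worth, the NameError in the traceback above just means `folder` was never assigned before the `dest = path/folder` line. A minimal sketch of that cell with the missing assignment added (the folder name is a placeholder, and a temp dir stands in for the Drive path so the sketch runs anywhere):

```python
from pathlib import Path
import tempfile

# In Colab this would be e.g. the mounted Drive path; a temp dir is used here
# so the sketch is runnable outside Colab.
base_dir = tempfile.mkdtemp() + '/'
folder = 'dogs'   # <- the missing assignment that caused the NameError
path = Path(base_dir + 'data/pets')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)   # now succeeds
print(dest)
```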
@salvatore.r @vikbehal hope @gamo's fix worked for you. I found that the overloaded learn.save/load code I posted a few posts above helped me split up my model saves, whilst retaining the free fastai Colab functionality.
If you are using standard datasets, the notebooks and saved weights are usually enough to learn and progress, and you can keep a gdrive copy.
Saved weights can be in the 250 MB range for image recognition.
If you are creating your own datasets, then having the whole folder (images, weights, notebooks) on your gdrive is a good option, though I am not sure whether you need to shift your image sets onto the Colab instance for performance. Anyone care to comment?
As you get more sophisticated you will see that practitioners begin using paid cloud services like AWS EC2/S3 for storage of models, or build their own DL computer and keep the datasets on it.
I am not sure of your tech experience (high | low), so I hope this advice hits the right level for you.
I did try, but the above will just create the directory structure. How do I tell fastai that the mentioned path is where data and models will be saved?
I followed this post to change the configuration. It works if I run the notebook as-is, but as soon as I change the runtime to GPU, it goes back to the default fastai path.
Gabriel, thank you. I did that. Now how do I tell fastai to download data to that path? Also, what changes should I make so that model data is saved in Google Drive, i.e. at that path?
Having fastai work directly with gdrive is probably not a good idea; you have to treat data on gdrive as being on a NAS or other remote storage. If you try to run data directly off gdrive, it will have to move that data over the network, and that will be slow.
Keep all data and models local on Colab while you are working, and then use !cp or a Python library to copy data and/or models to gdrive. It is best to create a function (def) for this; that way you can include it in your learner and have it save the model to gdrive during learning if training has to run for a long time, giving you running backups of your model.
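A sketch of such a helper, assuming the usual Colab mount point (the function name and all paths are placeholders, not part of the fastai API):

```python
import shutil
from pathlib import Path

def backup_to_gdrive(local_path, gdrive_dir='/content/gdrive/My Drive/my_project/models'):
    """Copy a locally saved file (e.g. a .pth model) to a Drive folder,
    creating the folder if needed. Returns the destination path."""
    gdrive_dir = Path(gdrive_dir)
    gdrive_dir.mkdir(parents=True, exist_ok=True)
    return shutil.copy(str(local_path), str(gdrive_dir/Path(local_path).name))

# Typical flow: save locally (fast), then back up to Drive:
# learn.save('stage-1')
# backup_to_gdrive(learn.path/learn.model_dir/'stage-1.pth')
```

You could also call this from a callback after each epoch to get the running backups mentioned above.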
However, what is actually happening when I do this is that it creates a folder in my Google Drive but, for some reason I am missing, does not download anything inside it. So it is creating an empty folder inside the fastai folder on my Drive.
In my opinion, it is creating a new path without downloading the data into it.
As I said in a previous post, if you are just saving the data to gdrive then that is OK; but if you will then use the data on Colab, Colab will have to get that data back from gdrive over the network before it can be used, and that is slow.
Instead, download all data and models locally to your Colab instance and use them there; then, when you want to save/back up your work, copy the whole project folder to gdrive.
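That whole-folder copy can be sketched like this (paths and the function name are placeholders; `dirs_exist_ok` needs Python 3.8+):

```python
import shutil
from pathlib import Path

def backup_project(src, dst):
    """Mirror a local project folder onto the Drive mount."""
    # dirs_exist_ok=True lets repeated backups overwrite in place (Python 3.8+)
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return Path(dst)

# In Colab, after mounting Drive:
# backup_project('/content/my_project', '/content/gdrive/My Drive/my_project')
```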
As of 30 minutes ago, I have not been able to use the fastai library. Despite following the setup guide, I get the error below when I run: from fastai.vision import *