Does anyone already have a script set up for benchmarking fastai on Colab that I could use? (I made a little dummy task, but I don’t trust the results - it’s not a realistic test. An actual notebook would be a better test.)
Otherwise, does anyone have a one-click run-all notebook set up for lesson 1 they can share? i.e. all the bash and setup commands already done for Colab, with no manual twiddling around with paths and setup?
I want to quickly benchmark my box against Colab, just to see if my box is close enough to use instead of Colab.
Also, I’m having some Docker issues with shm and --ipc=host that I need a benchmark test to diagnose further.
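In case it helps as a stopgap, here is a minimal timing sketch of my own (not fastai code, and CPU-only, so it says nothing about the GPU path that actually matters for training): running the same cell on Colab and on your box at least gives one directly comparable number.

```python
import time
import numpy as np

# Crude sanity check, not a realistic fastai benchmark: time an n x n
# float32 matrix multiply and keep the best of `repeats` runs.
def matmul_seconds(n=1024, repeats=3):
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    best = float('inf')
    for _ in range(repeats):
        t0 = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - t0)
    return best

# print(f'best of 3: {matmul_seconds():.4f}s')
```

A real lesson-1 run is still the better test, since it also exercises the data pipeline and the GPU.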
@ady_anr yes, you can. As the tutorial suggests, you can access root_dir using the standard Python file-system commands.
Example:
To view ALL your GDrive files, just execute:
Has anybody found a solution to running ImageCleaner on Colab?
Each time I try running that line in the lesson 2 download .ipynb, the runtime gets restarted.
Stuck.
Does anyone have a solution to my problem:
When using Colab, rather than filling up my GDrive with lots of standard datasets, I want to use GDrive just for my model data.
So I want my setup to look like this:

/content/gdrive/My Drive/my_project/models/…
    model_files
/root/.fastai/data/…
    mnist_sample
        train
        valid
        labels.csv
I am guessing running the learner on data stored on the Colab file server is much faster, but I want my model saves to persist and be able to be loaded again.
It doesn't seem a big issue to redownload the dataset when you need it.
But it seems the Learner object, or the save/load methods, won't allow me to specify a path.
Does anyone have a solution to this?
Either in how you set up your project for persistence, or in how you save out your learner models?
If you have a solution to this, can you post the setup code, and a short example workflow of a learner save/load?
NB: I am guessing I can write something like the following,
but is there a cleaner way of doing it with the inbuilt functions? E.g. create the ability to set a custom save location using the learner.save method, e.g.
I did some research and overloaded the save and load methods with this code.
Anyone care to comment? @jeremy
def custom_path_save(self, name:PathOrStr, path='', return_path:bool=False, with_opt:bool=True):
    "Save model and optimizer state (if `with_opt`) with `name` to `self.model_dir`."
    # delete # path = self.path/self.model_dir/f'{name}.pth'
    # my addition: start
    if path == '': path = self.path/self.model_dir/f'{name}.pth'
    else: path = f'{path}/{name}.pth'
    # end
    if not with_opt: state = get_model(self.model).state_dict()
    else: state = {'model': get_model(self.model).state_dict(), 'opt': self.opt.state_dict()}
    torch.save(state, path)
    if return_path: return path

def custom_path_load(self, name:PathOrStr, path='', device:torch.device=None, strict:bool=True, with_opt:bool=None):
    "Load model and optimizer state (if `with_opt`) `name` from `self.model_dir` using `device`."
    if device is None: device = self.data.device
    # delete # state = torch.load(self.path/self.model_dir/f'{name}.pth', map_location=device)
    # my addition: start
    if path == '': path = self.path/self.model_dir/f'{name}.pth'
    else: path = f'{path}/{name}.pth'
    state = torch.load(path, map_location=device)
    # end
    if set(state.keys()) == {'model', 'opt'}:
        get_model(self.model).load_state_dict(state['model'], strict=strict)
        if ifnone(with_opt, True):
            if not hasattr(self, 'opt'): self.create_opt(defaults.lr, self.wd)
            try: self.opt.load_state_dict(state['opt'])
            except: pass
    else:
        if with_opt: warn("Saved file doesn't contain an optimizer state.")
        get_model(self.model).load_state_dict(state, strict=strict)
    return self

# bind the overloads onto the learner instance
learn.save = custom_path_save.__get__(learn)
learn.load = custom_path_load.__get__(learn)

# if you don't want to overload:
# learn.custom_path_save = custom_path_save.__get__(learn)
# learn.custom_path_load = custom_path_load.__get__(learn)

model_path = '/content/gdrive/My Drive/fastai-v3/data/'
learn.save('new-model-name', path=model_path)
learn.load('new-model-name', path=model_path)
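For comparison, a lighter-weight route may also work (an assumption on my part, based on fastai v1's Learner.save building self.path/self.model_dir/f'{name}.pth'): since pathlib's / operator discards the left-hand side when the right-hand side is an absolute path, pointing learn.model_dir at an absolute Drive path should make the stock save/load write there with no overloading at all.

```python
from pathlib import Path

# Hypothetical Drive location; adjust to your own project folder.
drive_models = Path('/content/gdrive/My Drive/my_project/models')

# drive_models.mkdir(parents=True, exist_ok=True)
# learn.model_dir = drive_models   # assumption: fastai v1 Learner attribute
# learn.save('stage-1')            # would write .../my_project/models/stage-1.pth

# The pathlib behaviour the trick relies on: an absolute right-hand side wins.
print(Path('/root/.fastai/data') / drive_models)
```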
and received this error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-8-41891b907d8d> in <module>()
1 path = Path(base_dir + 'data/pets')
----> 2 dest = path/folder
3 dest.mkdir(parents=True, exist_ok=True)
4 path = untar_data(URLs.PETS); path
NameError: name 'folder' is not defined
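The NameError itself just says that no cell defining `folder` ran in this session: after a runtime restart, every earlier cell has to be re-executed. A sketch of the missing cell's shape (the concrete names are whatever your notebook uses; 'images' is a placeholder of mine):

```python
from pathlib import Path

# `folder` must be defined before `dest = path/folder` runs.
base_dir = '/content/gdrive/My Drive/fastai-v3/'  # from the post above
folder = 'images'                                 # placeholder name

path = Path(base_dir + 'data/pets')
dest = path/folder
# dest.mkdir(parents=True, exist_ok=True)
print(dest)
```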
I made a folder in Drive named fastai-v3. I don't really understand where untar_data is saving the files.
How do I make it save to Google Drive,
or to a folder on my PC?
@salvatore.r @vikbehal hope @gamo's fix worked for you. I found that the learn.save/load overload code I posted a few posts above helped me split up my model saves, whilst retaining the free fastai Colab functionality.
If you are using standard datasets, the notebooks and saved weights are usually enough to learn and progress, and you can keep a GDrive copy.
Saved weights can be in the 250 MB range for image recognition.
If you are creating your own datasets, then having all of the folders (images, weights, notebooks) on your GDrive is a good option. Though I am not sure whether you need to shift your image sets to the Colab instance for performance. Anyone care to comment?
As you get more sophisticated you will see that practitioners begin using paid cloud services like AWS EC2/S3 for storage of models, or you build your own DL computer and keep the datasets on it.
I am not sure of your tech experience (high | low), so I hope this advice hits the right level for you.
I did try, but the above will just create the directory structure. How do I tell fast.ai that the mentioned path is where data and models will be saved?
I followed this post to change the configuration. It works if I run the notebook as-is, but as soon as I change the runtime to GPU, it goes back to the default fast.ai path.
Gabriel, thank you. I did that. Now how do I tell fast.ai to download the data to that path? Also, what changes should I make so that model data is saved in Google Drive, i.e. at that path?
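Two places to look, hedged because the exact key names can differ by fastai version: untar_data in fastai v1 accepts a dest argument, so you can download per call straight into Drive; and ~/.fastai/config.yml is where the default locations live, so rewriting it should redirect both data and model paths (note a GPU runtime is a fresh VM, so the file has to be written again there). A sketch:

```python
# Sketch only: key names (data_path, model_path) assumed from fastai v1's
# ~/.fastai/config.yml; verify against your installed version.
config_yml = """\
data_path: /content/gdrive/My Drive/fastai-v3/data
model_path: /content/gdrive/My Drive/fastai-v3/models
"""
# from pathlib import Path
# Path('/root/.fastai/config.yml').write_text(config_yml)

# Per-call alternative (fastai v1):
# path = untar_data(URLs.PETS, dest='/content/gdrive/My Drive/fastai-v3/data')
print(config_yml)
```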