The problem seems to be that targ’s size is [64, 1, 2] while pred’s is [64, 2], so your custom MSE is triggering unwanted broadcasting. Try return ((targ.squeeze() - pred.squeeze())**2).mean().
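In case it helps to see that broadcasting in action, here is a minimal sketch using NumPy (which follows the same broadcasting rules as PyTorch), with the shapes from the post above:

```python
import numpy as np

# Shapes from the post: targ is [64, 1, 2], pred is [64, 2].
targ = np.zeros((64, 1, 2))
pred = np.zeros((64, 2))

# Without squeezing, broadcasting aligns trailing dims:
# (64, 1, 2) - (64, 2) -> (64, 64, 2), pairing every target with
# every prediction, so the mean is taken over the wrong elements.
bad = (targ - pred) ** 2
assert bad.shape == (64, 64, 2)

# Squeezing both first gives the intended elementwise difference.
good = (targ.squeeze() - pred.squeeze()) ** 2
assert good.shape == (64, 2)
```

Note that the bad version still runs without an error, which is why this kind of bug is easy to miss: the loss just silently averages over 64x more terms than it should.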
While running the lesson3-imdb notebook I am getting RuntimeError: CUDA error: out of memory at this line:
learn.fit_one_cycle(1, 1e-2, moms=(0.8,0.7))
I am stuck at this point. I have two 1080 Tis. Since this uses only one GPU, can anyone suggest how to overcome this, either by using both GPUs or some other way? @sgugger
Thanks. I am also having a CUDA out-of-memory issue; I reduced the batch size, but that didn’t solve it. I’m also working on tweaking bptt, per thread 1 and thread 2 from prior courses, but haven’t cracked it yet, in case those threads help. @karan
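For intuition about why bs and bptt are the usual knobs here, a back-of-envelope sketch of how the activation memory of an LSTM language model scales with them (the hidden size and layer count below are hypothetical stand-ins, not the notebook’s actual values):

```python
# Rough activation-memory estimate for an LSTM language model.
# hidden=1150 and layers=3 are hypothetical AWD-LSTM-like numbers,
# used only to illustrate the scaling, not measured from the notebook.
def act_bytes(bs, bptt, hidden=1150, layers=3, bytes_per=4):
    # backprop through time keeps roughly bs * bptt hidden states per layer
    return bs * bptt * hidden * layers * bytes_per

# Halving both bs and bptt cuts this term to a quarter.
assert act_bytes(32, 35) * 4 == act_bytes(64, 70)
```

So if cutting bs alone was not enough, cutting bptt as well compounds the savings, since the two multiply.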
Question re: the Data Block API (which btw is super cool and flexible)
In the docs, we have the following example:
data = (ImageFileList.from_folder(planet)
#Where to find the data? -> in planet and its subfolders
.label_from_csv('labels.csv', sep=' ', folder='train', suffix='.jpg')
#How to label? -> use the csv file labels.csv in path,
#add .jpg to the names and take them in the folder train
.random_split_by_pct()
#How to split in train/valid? -> randomly with the default 20% in valid
.datasets()
#How to convert to datasets? -> use ImageMultiDataset
.transform(planet_tfms, size=128)
#Data augmentation? -> use tfms with a size of 128
.databunch())
#Finally? -> use the defaults for conversion to databunch
My understanding is that the first line recursively goes through all the subfolders of path. So what happens when the set of files it finds (which I’d imagine could include files under the valid or test folders) doesn’t match the files listed in labels.csv, or the files in the train folder? I guess this is for the scenario where there are either multiple csv label files for the train and valid images, or a single label file covering both the train and valid subfolders…
My apologies because I am a huge newb, so this could be entirely in relation to that.
Also, I’m using salamander and, in Jeremy’s words, “Depending on your platform you may need sudo, you may need /something else/pip, you may need source activate.”
@joshfp is probably right - your .json file isn’t in your active directory. Importantly, the “root” shown by jupyter is jupyter’s root, not the root of the operating system. So, the fully-qualified path surely includes a couple of directory levels above it before reaching jupyter’s root directory.
It might help if you examine your active directory with the following (and might be useful in general for others who stumble across this post):
Use !pwd to display the active directory;
Use !ls -a to list all of the files and directories in the active directory, including the hidden files or folders;
Use something like !find . -name "*.json" or !find / -name "*.json" to search the active directory and its subdirectories, or the root directory and its subdirectories, respectively. The second one may take a long time and throw a slew of ‘permission denied’ responses.
Use !echo $HOME to display the home path of the OS, which is usually something like /home/<username>;
Essentially, joshfp is saying that your active directory is something other than your home directory (where your home directory is /home/<username>, aka ~/). As a result, you need to use the fully-qualified path to the kaggle.json file in order to move it to the hidden folder ~/.kaggle.
Or, you could change to your home directory using os.chdir('/home/<username>') or os.chdir(os.path.expanduser('~')); note that os.chdir('~/') will not work, because Python does not expand ~. Similarly, !cd ~/ or !cd /home/<username>/ won’t stick: each ! command runs in its own subshell, which is why cd seems to flake out in a Jupyter notebook.
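A runnable sketch of why the ~ forms behave differently in Python:

```python
import os

# os.chdir does not expand '~'; that expansion is a shell feature,
# so the literal string '~/' names a directory that doesn't exist.
home = os.path.expanduser("~")
try:
    os.chdir("~/")            # raises FileNotFoundError on a normal system
except FileNotFoundError:
    pass

os.chdir(home)                # the expanded path works fine
assert os.getcwd() == os.path.realpath(home)
```

The same logic is why os.chdir(os.path.expanduser('~')) is the portable spelling: it works regardless of what the username is.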
Apparently, n==1 because self.layer_groups==1. If your model has only one layer group, try passing a single number (float) as the learning rate instead of a slice.
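For intuition, here is a hedged sketch of how a slice of learning rates is commonly spread across layer groups; the geometric spacing and the degenerate single-group case are assumptions of mine, so check your fastai version’s lr_range for the exact rule:

```python
import numpy as np

# Hypothetical helper mimicking discriminative learning rates:
# a slice(lo, hi) is spread geometrically across n layer groups,
# and a plain float is repeated for every group.
def lr_range(lr, n_groups):
    if isinstance(lr, slice):
        if n_groups == 1:
            return np.array([lr.stop])   # one group: the slice collapses
        return np.geomspace(lr.start, lr.stop, n_groups)
    return np.full(n_groups, lr)

assert len(lr_range(slice(1e-5, 1e-3), 1)) == 1
assert len(lr_range(1e-3, 1)) == 1
```

The point of the advice above is the n_groups == 1 branch: with a single layer group there is nothing to discriminate between, so a single float expresses your intent directly.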
Hey @joshfp, I was wondering if you might be able to provide some intuition for why the squeeze is needed? In the camvid notebook, we have the following code:
My understanding is that torch_tensor.squeeze() removes all dimensions of size 1, and torch_tensor.squeeze(dim) removes that particular dimension if it is of size 1? It makes sense that pred and targ should have the same dimensions, but do you have any tips on how to think about what the shapes of pred and targ are during training, so I know how to squeeze them into the same shape? Thanks!!
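To make the squeeze semantics concrete, a small sketch (NumPy shown; torch.Tensor.squeeze behaves the same for these cases, except that torch’s squeeze(dim) silently does nothing when that dim has size greater than 1, where NumPy raises):

```python
import numpy as np

# A tensor with one size-1 axis, like the targ in the earlier post.
t = np.zeros((64, 1, 2))
assert t.squeeze().shape == (64, 2)    # every size-1 axis removed
assert t.squeeze(1).shape == (64, 2)   # only axis 1 removed
# t.squeeze(0) would raise in NumPy, since axis 0 has size 64
```

As for discovering the shapes during training: one quick trick is to print pred.shape and targ.shape inside the custom loss function the first time it is called, since the loss is where the two tensors actually meet.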