Possible to have data sets in a different drive/directory than Python notebooks?

Hi everyone,

My Ubuntu home directory is running out of space and I have a 2TB secondary drive. What is the most efficient way to keep my training data (for Kaggle competitions, for example) on the secondary drive while running the IPython notebooks from my SSD?

Sorry, I'm new-ish to Python and Ubuntu (6 months in and still learning!). I was going to post this in the ‘Part1 v2 Beginner’ category but realized that it doesn't really fit the topic/guidelines there.

Any advice?

Ian

Hi Ian,

It's absolutely possible to store data anywhere on your system (including secondary disks etc.). There are a few options for referencing that remote data:

  1. You can hardcode the remote path in your notebook and be done with it
    PATH = '/mnt/mybigdisk/planet-data'

  2. You can use a soft link (also called a symbolic link or symlink). This creates a convenient alias to your remote folder within your working directory (the one containing the notebook).

E.g., at first glance data appears to be a regular directory sitting in the same folder as my notebook. However, if you look closely (for instance at the output of ls -l), you will see something like
data -> /mnt/data/truck/

This says that data is an alias for the original /mnt/data/truck folder.
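
If you would rather create the link from inside the notebook than from a terminal, here is a minimal sketch in Python. It assumes a Linux system such as Ubuntu. The helper name link_data_dir and the throwaway directories are made up for illustration; in practice the target would be your folder on the big disk (e.g. /mnt/data/truck) and the link would be data next to the notebook.

```python
import os
import tempfile
from pathlib import Path


def link_data_dir(target, link):
    """Create a symbolic link at `link` pointing to the directory `target`.

    Hypothetical helper for illustration only. If `link` is already a
    symlink it is left untouched, so the cell can be re-run safely.
    """
    target, link = Path(target), Path(link)
    if link.is_symlink():
        return link
    # Raises FileExistsError if a real file or folder already sits at `link`.
    link.symlink_to(target, target_is_directory=True)
    return link


# Demo with temporary directories standing in for the secondary drive
# and the notebook folder, so nothing on the real system is changed.
with tempfile.TemporaryDirectory() as tmp:
    big_disk = Path(tmp) / "bigdisk" / "truck"
    notebooks = Path(tmp) / "notebooks"
    big_disk.mkdir(parents=True)
    notebooks.mkdir()
    (big_disk / "train.csv").write_text("id,label\n")

    data = link_data_dir(big_disk, notebooks / "data")
    print(data.is_symlink())                    # True
    print(os.readlink(data) == str(big_disk))   # True
    print((data / "train.csv").exists())        # True: files are reachable through the link
```

Once the link exists, the notebook can keep using a relative path such as data/ while the files themselves stay on the secondary drive.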

Now, there may be a performance difference between keeping data on an SSD and keeping it on a mechanical drive. But for our purposes here, it will be negligible.

Cheers,
A

What worked for me (on Windows) was creating a link in the fastai\courses\dl1 folder.

Using the command prompt:
cd fastai\courses\dl1
mklink /d data2 E:\data2