Part 2 lesson 11 wiki

Can @jeremy guide me on whether or not it's possible to install fastText in Windows (running Anaconda as admin)?

Trying to install under Windows 10. I have VS 2015 installed (I put carriage returns into the error message pasted below to make it easier to read):

Command "d:\anaconda3\python.exe -u -c "import setuptools, tokenize;
__file__='C:\\Users\\User\\AppData\\Local\\Temp\\pip-zsgb88jk-build\\setup.py';
f=getattr(tokenize, 'open', open)(__file__);
code=f.read().replace('\r\n', '\n');
f.close();exec(compile(code, __file__, 'exec'))
" install --record C:\Users\User\AppData\Local\Temp\pip-fjpzmszv-record\install-record.txt --single-version-externally-managed --compile" 
failed with error code 1 in C:\Users\User\AppData\Local\Temp\pip-zsgb88jk-build\

SOLVED - pathlib.py was not installed correctly. whew!

Hey, does anyone have a translate notebook running? My os module does not seem to work with PosixPath even though I’m running Python 3.6.4, and I’d like to know if anyone else is hitting similar problems.

I have consistent errors using PosixPath (see question in Beginner topic). The error is

TypeError: expected str, bytes or os.PathLike object, not PosixPath

So I have been converting all paths to strings using str() to open files, which is kind of a gross workaround, but it usually works.

But I can’t run a command like

learn = RNN_Learner(md, SingleModel(to_gpu(rnn)), opt_fn=opt_fn)

because the PosixPath problem comes up from code inside text.py.
I just want to make sure it’s just me having this problem (so probably a configuration error) before I go further in debugging wtf is up!

Thanks!

ps here’s my error trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-112-d0041fa690c8> in <module>()
      2 # triggered by something inside fastai.core.SingleModel.
      3 
----> 4 learn = RNN_Learner(md, SingleModel(to_gpu(rnn)), opt_fn=opt_fn)

~/_ker-notebooks/dl2/fastai/text.py in __init__(self, data, models, **kwargs)
    180 class RNN_Learner(Learner):
    181     def __init__(self, data, models, **kwargs):
--> 182         super().__init__(data, models, **kwargs)
    183         self.crit = F.cross_entropy
    184 

~/_ker-notebooks/dl2/fastai/learner.py in __init__(self, data, models, opt_fn, tmp_name, models_name, metrics, clip)
     32         self.clip = None
     33         self.opt_fn = opt_fn or SGD_Momentum(0.9)
---> 34         self.tmp_path = os.path.join(self.data.path, tmp_name)
     35         self.models_path = os.path.join(self.data.path, models_name)
     36         os.makedirs(self.tmp_path, exist_ok=True)

~/src/anaconda3/envs/fastai/lib/python3.6/posixpath.py in join(a, *p)
     76     will be discarded.  An empty last part will result in a path that
     77     ends with a separator."""
---> 78     a = os.fspath(a)
     79     sep = _get_sep(a)
     80     path = a

TypeError: expected str, bytes or os.PathLike object, not PosixPath

Try passing the string version of the Path object when you’re using the os library. Just wrap it in str() and you should be good to go.
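For anyone following along, here is a minimal sketch of that workaround. The directory names (`data/translate`, `tmp`) are made up for illustration and are not taken from fastai:

```python
import os
from pathlib import Path

# Hypothetical data directory, for illustration only.
data_path = Path("data") / "translate"

# Converting explicitly works whether or not os.path.join accepts Path objects.
tmp_path = os.path.join(str(data_path), "tmp")

# On a healthy Python 3.6+ install, os.fspath() returns the same string,
# because pathlib paths implement __fspath__.
same = os.fspath(data_path) == str(data_path)
print(tmp_path, same)
```

If `os.fspath(data_path)` raises a TypeError here, the Path class being imported probably lacks `__fspath__`, which points at the environment rather than at the calling code.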

No, the problem is that I want things to run without changing text.py. The problem is that text.py is setting up the PosixPath afaict.

Your suggestion is simply to continue my workaround. I’ve run into a situation where the workaround doesn’t work.

In this line of the stacktrace,

---> 34         self.tmp_path = os.path.join(self.data.path, tmp_name)

os.path.join is expecting a string. Did you use Path to point to your data? Maybe use the Python debugger and check the types just in case.

Thanks for trying to help read the stack trace.

However the problem is that it seems like for everyone except me, os.path.join expects PosixPath.

My specific question is what other people’s experience with os and PosixPath is.

further honing in on what my specific problem is:

According to the Python docs for os.PathLike, PathLike is an abstract base class that defines the __fspath__ method.

Maybe my question is: why doesn’t PosixPath have the __fspath__ attribute?

ah HA

ok in case this helps anyone else I think the answer was that I had two copies of pathlib:

fastai/lib/python3.6/pathlib.py
and

fastai/lib/python3.6/site-packages/pathlib.py
I think uninstalling and reinstalling pathlib should work, but I manually removed the site-packages file (which didn’t have __fspath__ defined in it; I checked with $cat pathlib.py | grep fspath ) and replaced it with the copy that seemed more up to date.
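A quick way to check for this situation from inside the notebook is sketched below. The helper name `describe_pathlib` is made up for this post; it only reports which pathlib file Python actually imported and whether its path classes define `__fspath__`:

```python
import pathlib

def describe_pathlib():
    """Report which pathlib module is imported and whether it supports os.fspath()."""
    return {
        # The file the interpreter actually loaded pathlib from.
        "file": pathlib.__file__,
        # False here would explain the "not PosixPath" TypeError above.
        "has_fspath": hasattr(pathlib.PurePath, "__fspath__"),
        # A copy under site-packages is likely a separately installed pathlib,
        # not the standard library one.
        "in_site_packages": "site-packages" in pathlib.__file__,
    }

info = describe_pathlib()
print(info)
```

If `in_site_packages` comes back True and `has_fspath` False, an older pathlib is probably shadowing the standard library one, which matches what I found.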

Ah, it’s in translate.ipynb but I had jumped straight in to devise.ipynb.

DeVISE is keeping me up at night - this is zero-shot learning on images…mind blowing stuff :exploding_head:

In “translate” I found that bs=470 still fits into memory and makes training an epoch almost 2 times faster.

You should be able to double the learning rate then too!
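As a rough sketch of that rule of thumb (scale the learning rate in proportion to the batch size), with made-up numbers since the thread doesn't state the original batch size or learning rate. The helper name `scale_lr` is hypothetical, and this is a heuristic, so it's still worth re-running the learning rate finder:

```python
def scale_lr(base_lr, base_bs, new_bs):
    """Scale the learning rate in proportion to the batch size (a heuristic, not a guarantee)."""
    return base_lr * new_bs / base_bs

# Hypothetical values: doubling the batch size from 235 to 470 doubles the learning rate.
new_lr = scale_lr(base_lr=3e-3, base_bs=235, new_bs=470)
print(new_lr)
```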

@username_not_found This should do.
!pip install --upgrade git+https://github.com/fastai/fastai.git

Just out of curiosity, is there a faster way to decompress a huge file (like full imagenet 156GB) other than tar -xzf and wait?

Not really - although people shouldn’t be using gzip with image files, so if you have the ability to choose how it’s saved, don’t use the z flag to tar.

Also be sure to decompress it to nvme or similar if you can.

Yeah I realized decompressing it on my hard drive was kind of a mistake. Maybe it’ll be finished by tomorrow :sweat_smile:
More seriously I’ll switch to Paperspace for this.

I had to buy a brand new hard drive (SSD) because the imageset.tar.gz file was 166GB compressed and over 170GB uncompressed. (The 400GB I stated previously was incorrect.)

Leave it as jpeg - it should stay at ~170GB.

I don’t understand this comment. It seems to imply that you decompress the JPEG?

Oh, I guess you mean “only download the JPEGs”?

FYI, the command to only extract the JPEGs is

    ! tar xzvf {DOWNLOADS_PATH}/'imagenet_object_localization.tar.gz' -C {IMAGENET_PATH} --wildcards '*.JPEG' > /tmp/tar.out
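If you'd rather do the same thing from Python, here is a minimal sketch using the standard library's tarfile module. The function name `extract_jpegs` is made up for this post, and it won't be faster than tar itself; it just makes the filtering explicit:

```python
import tarfile

def extract_jpegs(archive_path, dest_dir):
    """Extract only regular files whose names end in .JPEG; return their member names."""
    extracted = []
    with tarfile.open(archive_path, "r:gz") as tar:
        # Iterating over the TarFile yields one TarInfo per archive member.
        for member in tar:
            if member.isfile() and member.name.endswith(".JPEG"):
                tar.extract(member, path=dest_dir)
                extracted.append(member.name)
    return extracted
```

Usage would be something like `extract_jpegs(archive, IMAGENET_PATH)`, where `archive` is the path to the downloaded .tar.gz.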

Oops, I looked at the wrong column. You are correct: uncompressed it's ~170GB.