A Code-First Introduction to Natural Language Processing 2019

Preparing to run the course notebooks

I’ll assume that you have a working installation of fastai v1.0 under the latest Anaconda (version 2019.03, build py37_0), and that you have activated the conda environment you created for fastai.

Follow these steps to prepare your environment to run the first course notebook:

(1) Clone the course materials from GitHub
git clone https://github.com/fastai/course-nlp.git

(2) Install scikit-learn, a Python machine learning library
conda install scikit-learn

(3) Install nltk, the Natural Language Toolkit, a library for Natural Language Processing that is widely used for teaching and research.
conda install -c anaconda nltk

(4) Install spaCy, a library for “Industrial-Strength Natural Language Processing”
conda install -c conda-forge spacy

(5) Download an English language model for spaCy
python -m spacy download en_core_web_sm

(6) Install fbpca, a library for “Fast computations of PCA/SVD/eigendecompositions via randomized methods”
pip install fbpca

After these steps, you should be able to run the first notebook that contains code, 2-svd-nmf-topic-modeling.ipynb.
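
Before opening the notebook, it may help to confirm that the packages from steps (2) to (6) are visible in the active environment. The following is only a minimal sketch, not part of the course materials: the helper name find_missing and the REQUIRED_MODULES list are my own, and I am assuming the spaCy model from step (5) is installed as an importable package named en_core_web_sm. Run it with the fastai environment activated.

```python
import importlib.util

# Import names for the packages installed in steps (2)-(6).
# Note that scikit-learn is imported as "sklearn", and the spaCy
# English model is assumed to be importable as "en_core_web_sm".
REQUIRED_MODULES = ["sklearn", "nltk", "spacy", "en_core_web_sm", "fbpca"]


def find_missing(modules):
    """Return the names in `modules` that cannot be found in the active environment."""
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All course dependencies found.")
```

If anything is reported as missing, repeat the corresponding install step above and check that the right conda environment is active.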

I will continue to update this post in case the infrastructure needs to be extended in order to run subsequent notebooks in the course.
