New: support for one-hot encoded labels in multi-classification. The most basic way is to use
one_hot=True but if you label from a dataframe with multiple columns, the data block API will guess it’s a multiclass problem and put that flag for you.
New: support for one-hot encoded labels in multi-classification. The most basic way is to use
Hey all ! It may be a very basic issue but I’ve looked on the forum, didn’t find anything and I’m really stuck.
I’m trying to get to get the fastai dev environment working in order to start to contribute to fastai (beginning by the documentation Sylvain asked for) but I’m having trouble setting it up. I followed the tutorial of the documentation and I’m at step 4 “write the code” (so I forked the repo, cloned it on my local computer, did
pip install -e . and then
However when I try to run the first
import cell of any notebook I get the following error :
File “/Users/bocra/anaconda3/lib/python3.5/site-packages/fastai/init.py”, line 1, in
from .basic_train import *
File “/Users/bocra/anaconda3/lib/python3.5/site-packages/fastai/basic_train.py”, line 97
SyntaxError: invalid syntax
I think (but I’m not sure) it’s because fastai requires python >= 3.6 (pip install -e . told me so when I ran it). On the other hand, I installed fastai with the classic conda install and pytorch locked python at version 3.5 :
pytorch -> python[version=’>=3.5,<3.6.0a0’]
What am I missing ?
EDIT : And when I try to open an example notebook I get the following error :
Here’s the output of the terminal :
Adapting to protocol v5.1 for kernel 60c63f91-6415-4726-9b75-369756acfca7
Hi seeing that the error refers to the conda enviroment i think you forgot to uninstall fastai.
First, follow the instructions above for either
Conda . Then uninstall the
fastai package using the same package manager you used to install it, i.e.
pip uninstall fastai or
conda uninstall fastai , and then, replace it with a pip editable install.
git clone https://github.com/fastai/fastai cd fastai tools/run-after-git-clone pip install -e .[dev]
Oops that’s right, a bad case of RTFM I guess, sorry about that.
Though to be fair I was following the instructions here and I think there’s an omission as there should be the uninstall step. I guess that’s going to be my first PR
Thank you very much !
I’m doing fp16 training. After training completes I call
learn.show_results(figsize=(12,15), rows = 10)
and I got and error
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-21-cc1e87ad044f> in <module>() ----> 1 learn.show_results(figsize=(12,15), rows = 10) ~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in show_results(self, ds_type, rows, **kwargs) 278 ds = self.dl(ds_type).dataset 279 self.callbacks.append(RecordOnCPU()) --> 280 preds = self.pred_batch(ds_type) 281 *self.callbacks,rec_cpu = self.callbacks 282 x,y = rec_cpu.input,rec_cpu.target ~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in pred_batch(self, ds_type, batch) 236 cb_handler = CallbackHandler(self.callbacks) 237 cb_handler.on_batch_begin(xb,yb, train=False) --> 238 preds = loss_batch(self.model.eval(), xb, yb, cb_handler=cb_handler) 239 return _loss_func2activ(self.loss_func)(preds) 240 ~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler) 16 if not is_listy(xb): xb = [xb] 17 if not is_listy(yb): yb = [yb] ---> 18 out = model(*xb) 19 out = cb_handler.on_loss_begin(out) 20 RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
from my understanding class MixedPrecision(Callback) sets the transformation on the start of training and then removes the transformation at the training end
class MixedPrecision(Callback): ... def on_train_begin(self, **kwargs:Any)->None: "Ensure everything is in half precision mode." self.learn.data.train_dl.add_tfm(to_half) if hasattr(self.learn.data, 'valid_dl') and self.learn.data.valid_dl is not None: self.learn.data.valid_dl.add_tfm(to_half) if hasattr(self.learn.data, 'test_dl') and self.learn.data.test_dl is not None: self.learn.data.test_dl.add_tfm(to_half) def on_train_end(self, **kwargs:Any)->None: "Remove half precision transforms added at `on_train_begin`." self.learn.data.train_dl.remove_tfm(to_half) if hasattr(self.learn.data, 'valid_dl') and self.learn.data.valid_dl is not None: self.learn.data.valid_dl.remove_tfm(to_half)
So after the training there is no default transformation to fp16. I’d like to fix that if somebody will point towards where is the best place to put a fix.
Actually the transforms probably shouldn’t be removed at the end of training since that’s why you have an error: your model is still in half precision but the input you feed it isn’t. @jeremy let me know if you feel differently but we should still be using half precision for inference, no?
Hi all and @jeremy, @sgugger, @stas TextLMDataBunch uses a lot of memory. I think it is a big issue because training on 400 mio word for 4 epochs must generalize better than 16 epoch on 100 mio words.
I believe that what we see is mostly the memory overhead of multiprocessing. Especially in the tokenizer.process_all
I have a 32 GB-RAM machine. The maximum csv file that i can pass to TextLMDataBunch.from.csv is 1.3 GB with about 170 mio words in the trainingset. at 1.4 gb with 180 mio words in the trainingset i get a “broken process” error, because there is not enough memory to pickle the results back to the calling process in tokenizer.process_all.
The memory is at 2.2 just before calling TextLMDataBunch.from.csv: os+jupyter+browser
The tokenizer.process_all creates 12 processes with default chunksize=10000
The process breaks around 80% into the tokenization process with the 1.4 GB wiki (the 1.3 GB wiki pass)
What could be done
disclaimer: i do not know the use cases in language-processing well enough to pinpoint the weeknesses in the following but: Having looked through the code i believe that most of the processes could be row-based. If the function _jointext was moved to the tokenization step then the input file could be split in n=n_cpus processes. Each process could read row-wise from a file and write to another file to return the result to the calling process. In this way the multiprocessing memory overhead would be almost zero, because we only pass a filename to each process. Despite the extra serialization i think it would be faster - ie ThreadPool also seriales/pickels the data in order to pass them to the parallel processes and back
What do you think ?
I’ve got more adventures with fp16 on inference time. This time with norm/denorm.
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-42-f62f0b041cab> in <module>() 13 data = data.type(torch.HalfTensor) 14 data = data.cuda() ---> 15 cat, t1, t2 = learn.predict(data) ~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in predict(self, item, **kwargs) 249 "Return prect class, label and probabilities for `item`." 250 self.callbacks.append(RecordOnCPU()) --> 251 batch = self.data.one_item(item) 252 res = self.pred_batch(batch=batch) 253 pred = res ~/anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in one_item(self, item, detach, denorm) 147 ds = self.single_ds 148 with ds.set_item(item): --> 149 return self.one_batch(ds_type=DatasetType.Single, detach=detach, denorm=denorm) 150 151 def show_batch(self, rows:int=5, ds_type:DatasetType=DatasetType.Train, **kwargs)->None: ~/anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in one_batch(self, ds_type, detach, denorm) 134 w = self.num_workers 135 self.num_workers = 0 --> 136 try: x,y = next(iter(dl)) 137 finally: self.num_workers = w 138 if detach: x,y = to_detach(x),to_detach(y) ~/anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self) 70 for b in self.dl: 71 #y = b if is_listy(b) else b # XXX: Why is this line here? ---> 72 yield self.proc_batch(b) 73 74 @classmethod ~/anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in proc_batch(self, b) 63 "Proces batch `b` of `TensorImage`." 64 b = to_device(b, self.device) ---> 65 for f in listify(self.tfms): b = f(b) 66 return b 67 ~/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in _normalize_batch(b, mean, std, do_x, do_y) 74 x,y = b 75 mean,std = mean.to(x.device),std.to(x.device) ---> 76 if do_x: x = normalize(x,mean,std) 77 if do_y and len(y.shape) == 4: y = normalize(y,mean,std) 78 return x,y ~/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in normalize(x, mean, std) 64 def normalize(x:TensorImage, mean:FloatTensor,std:FloatTensor)->TensorImage: 65 "Normalize `x` with `mean` and `std`." ---> 66 return (x-mean[...,None,None]) / std[...,None,None] 67 68 def denormalize(x:TensorImage, mean:FloatTensor,std:FloatTensor, do_x:bool=True)->TensorImage: RuntimeError: expected type torch.cuda.FloatTensor but got torch.cuda.HalfTensor
so I converted the mean and std to the type of input data:
def normalize(x:TensorImage, mean:FloatTensor,std:FloatTensor)->TensorImage: "Normalize `x` with `mean` and `std`." mean,std = mean.type(x.type()),std.type(x.type()) return (x-mean[...,None,None]) / std[...,None,None] def denormalize(x:TensorImage, mean:FloatTensor,std:FloatTensor, do_x:bool=True)->TensorImage: "Denormalize `x` with `mean` and `std`." mean,std = mean.type(x.type()),std.type(x.type()) return x.cpu()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu()
and got this:
~/anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in denormalize(x, mean, std, do_x) 70 "Denormalize `x` with `mean` and `std`." 71 mean,std = mean.type(x.type()),std.type(x.type()) ---> 72 return x.cpu()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu() 73 74 def _normalize_batch(b:Tuple[Tensor,Tensor], mean:FloatTensor, std:FloatTensor, do_x:bool=True, do_y:bool=False)->Tuple[Tensor,Tensor]: RuntimeError: "mul" not implemented for 'torch.HalfTensor'
After that I changed denormalize and converted x to the type of mean:
def denormalize(x:TensorImage, mean:FloatTensor,std:FloatTensor, do_x:bool=True)->TensorImage: "Denormalize `x` with `mean` and `std`." x = x.type(mean.type()) return x.cpu()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu()
It worked. I’d like to fix it properly, this looks hacky. Any idea how?
I found a way to speed up our test suite, using parallel testing.
pip install pytest-xdist
$ time pytest real 0m51.069s $ time pytest -n 6 real 0m26.940s
half the time - not bad!
We just need to fix the temp files creation to use a unique string (pid?), otherwise at times some tests collide in a race condition over the same temp file path.
And if you have a powerful machine, you might be able to crank it up even more. I run out of 8GB CUDA memory when using more than 8 workers.
Dev Projects General Discussion
For those of you who wanted to understand the
libjpeg-turbo story with all of its complexities head to https://docs.fast.ai/performance.html#faster-image-processing, where it’s unraveled for you including installation details with the complex nuances of it.
There is now an automated script that does the performance analysis for you.
It’s discussed in this new thread:
Also note a new dependency:
nvidia-ml-py3, so make sure you either install it (pip/conda) or run 'pip install -e .[dev]
nvidia-ml-py3 you can start tapping into nvidia status from fastai code w/o needing the awkward parsing of
nvidia-ml-py3 was already on pypi and we added a conda version in the
And it’s super fast!
i am new to the course an am having issues with gpu management or capacity (which idk).
I outlined the issue here: https://github.com/fastai/fastai/issues/1330
Is it a problem of management? i sure hope so
I have been working on a simple Language Model generator in fast.ai v1
All works quite smoothly and then I hit the
learner.predict() and get different text generation each time I call it. Digging into the code, I see how
torch.multinomial() is being called to sample from the output probabilities. I also (think) I understand the
min_p parameters which allow you to control how much variation you want in your selection. (for example, if you set
temp = 0 , you get a uniform selection of words and a random sample on the output.)
What I don’t see is a configuration that will take the single-best word on each pass through the loop. This could lead to “loops” where you flip-flop predictions but I think that would be good to know about your network. This would also generate the same words in the same order given the same state.
Am I missing something important here about how the
text.learner.predict() works? Or a set of parameters that would meet this criteria?
Where would be a good place to discuss test coverage, e.g. as a subtopic in: Dev Projects Index ?
Made a simple test coverage for basic_data.show_batch in test_text_data.py. Motivation initially was I could try to reproduce this bug here: https://forums.fast.ai/t/getting-cuda-oom-on-lesson3-imdb-notebook-with-a-bs-8/30080/14 . But I could already make a PR of the test scripts, even though, there are no asserts: its simply coverage for this method as a regression test that it doesn’t throw exceptions.
After running make coverage, I can see other areas, where test scripts could be useful. Am aware of the general guidelines (https://docs.fast.ai/dev/test.html), but maybe good to have one place to coordinate, set prios and have guidelines when asserts matter (or not)? Might help preventing non - accepted PRs.
Coverage for the sake of coverage is mostly meaningless unless it verifies the workings, but tests for the functionality that hasn’t been tested yet are very welcome.
In general you can just submit such test contributions directly via github, but if you’d like to make a study and perhaps compile a list of functionality that’s untested yet, then yes, making a dedicated to testing thread under https://forums.fast.ai/c/fastai-users/dev-projects and linking to it from Dev Projects would be a very useful contribution, @Benudek. There is one for Documentation improvements so that would be one for Testing Improvements or something like that.
Feel free to adjust task & process
I was going through lesson5-sgd-mnist.ipynb and noticed that pathlib’s Path does not get imported:
I did not think much of it and did
from fastai.vision import *
And got past the error. However, a little later in the same notebook, I am getting this error:
data = DataBunch.create(train_ds, valid_ds, bs=bs)
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-12-16801c71829f> in <module> 1 ----> 2 data = DataBunch.create(train_ds, valid_ds, bs=bs) ~/git/fastai/fastai/basic_data.py in create(cls, train_ds, valid_ds, test_ds, path, bs, num_workers, tfms, device, collate_fn, no_check) 112 collate_fn:Callable=data_collate, no_check:bool=False)->'DataBunch': 113 "Create a `DataBunch` from `train_ds`, `valid_ds` and maybe `test_ds` with a batch size of `bs`." --> 114 datasets = cls._init_ds(train_ds, valid_ds, test_ds) 115 val_bs = bs 116 dls = [DataLoader(d, b, shuffle=s, drop_last=(s and b>1), num_workers=num_workers) for d,b,s in ~/git/fastai/fastai/basic_data.py in _init_ds(train_ds, valid_ds, test_ds) 102 @staticmethod 103 def _init_ds(train_ds:Dataset, valid_ds:Dataset, test_ds:Optional[Dataset]=None): --> 104 fix_ds = valid_ds.new(train_ds.x, train_ds.y) # train_ds, but without training tfms 105 datasets = [train_ds,valid_ds,fix_ds] 106 if test_ds is not None: datasets.append(test_ds) AttributeError: 'TensorDataset' object has no attribute 'new'
I am on commit ce85b4718865f25d0243042c0e4d051929ca3b52 and not seeing anything obvious.
Does this ring a bell to anybody?
we now have a pillow conda package built against a custom build of libjpeg-turbo (and libtiff w/ libjpeg-turbo) in the fastai test channel
I made 3.6 and 3.7 linux builds:
conda uninstall -y pillow libjpeg-turbo conda install -c fastai/label/test pillow
This will make your jpeg decompressing much faster (See Performance Improvement Through Faster Software Components).
Note that this is pillow-5.4.0.dev0 build - i.e. don’t use in production w/o testing.
I also uploaded pillow-simd-5.3.0.post0 w/ libjpeg-turbo and built w/ avx2 - only py36.
conda uninstall -y pillow libjpeg-turbo conda install -c fastai/label/test pillow-simd
note that it’d get overwritten by
pillow through update/install of any other package depending on