General course chat

larcat · December 3, 2018, 7:31pm

Hi all –

I’m a bit behind on the lectures, so if this was covered, my apologies.

Is there built-in functionality for up/down sampling, or if we want to do it, do we need to script the repetition/deletion of the offending images, rows, etc. ourselves?

Thanks!

ArchieIndian · December 4, 2018, 6:55am

I don’t think this has been covered for tabular data. In his answer to one of the questions, Jeremy says plain upsampling is the best possible approach for structured data.
However, data augmentation has been covered for image data.

larcat · December 4, 2018, 7:03am

That was my eventual solution: running images through transforms, saving them, and adding those permanently to the training set. In this case there are plenty of classes with n < 5, so I needed to do that to avoid errors when the data gets split.

amitkayal · December 4, 2018, 7:39am

Hello…
I am getting the error “TypeError: an integer is required (got type NoneType)” when executing whale_50.fit_one_cycle(5, slice(lr)).

Can anyone please help me with this?

TypeError Traceback (most recent call last)
in ()
----> 1 whale_50.fit_one_cycle(5, slice(lr))

~/.anaconda3/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
18 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
19 pct_start=pct_start, **kwargs))
---> 20 learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
21
22 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
160 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
161 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162 callbacks=self.callbacks+callbacks)
163
164 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
92 except Exception as e:
93 exception = e
---> 94 raise e
95 finally: cb_handler.on_train_end(exception)
96

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
87 if hasattr(data,'valid_dl') and data.valid_dl is not None and data.valid_ds is not None:
88 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89 cb_handler=cb_handler, pbar=pbar)
90 else: val_loss=None
91 if cb_handler.on_epoch_end(val_loss): break

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
47 with torch.no_grad():
48 val_losses,nums = [],[]
---> 49 for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
50 if cb_handler: xb, yb = cb_handler.on_batch_begin(xb, yb, train=False)
51 val_losses.append(loss_batch(model, xb, yb, loss_func, cb_handler=cb_handler))

~/.anaconda3/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
63 self.update(0)
64 try:
---> 65 for i,o in enumerate(self._gen):
66 yield o
67 if self.auto_update: self.update(i+1)

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
67 "Process and returns items from DataLoader."
68 assert not self.skip_size1 or self.batch_size > 1, "Batch size cannot be one if skip_size1 is set to True"
---> 69 for b in self.dl:
70 y = b[1][0] if is_listy(b[1]) else b[1]
71 if not self.skip_size1 or y.size(0) != 1: yield self.proc_batch(b)

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
635 self.reorder_dict[idx] = batch
636 continue
--> 637 return self._process_next_batch(batch)
638
639 next = __next__ # Python 2 compatibility

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
656 self._put_indices()
657 if isinstance(batch, ExceptionWrapper):
--> 658 raise batch.exc_type(batch.exc_msg)
659 return batch
660

TypeError: Traceback (most recent call last):
File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/fastai/torch_core.py", line 97, in data_collate
return torch.utils.data.dataloader.default_collate(to_data(batch))
File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 223, in default_collate
return torch.LongTensor(batch)
TypeError: an integer is required (got type NoneType)

larcat · December 4, 2018, 1:08pm

Have you done anything to massage the classes that only have one or two examples? Until you handle that, the train/valid split will cause this with the whale data.

See my post directly above yours.
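
One plausible reading of the traceback, sketched in plain Python rather than with fastai itself: if class indices are built from the training labels only, a class whose only image lands in the validation split has no index, so its label maps to None, and None cannot be packed into an integer tensor. This is an assumption about how the label mapping behaves, not something confirmed in the thread. The helper names and whale IDs below are made up for illustration.

```python
def build_class_index(train_labels):
    """Map each class seen in training to an integer index."""
    return {c: i for i, c in enumerate(sorted(set(train_labels)))}

def encode(labels, class_index):
    """Look up each label; labels never seen in training come back as None."""
    return [class_index.get(label) for label in labels]

# Hypothetical labels: "w_c" has a single image, and it landed in validation.
train_labels = ["w_a", "w_b", "w_a"]
valid_labels = ["w_b", "w_c"]

class_index = build_class_index(train_labels)
encoded_valid = encode(valid_labels, class_index)

# Any None here is a label that would break integer-tensor collation.
missing = [l for l, idx in zip(valid_labels, encoded_valid) if idx is None]
print(encoded_valid)
print(missing)
```

Running a check like this on the split before training shows which classes need extra examples or need to be kept out of validation.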

amitkayal · December 4, 2018, 1:37pm

Hello… Thanks a lot for your response. Could you please guide me on how to handle such minority classes? Do I need to ensure that the validation set does not contain the minority classes, or should I generate more images for the minority classes first and then split the data into train and validation?

Thanks
Amit

larcat · December 4, 2018, 1:41pm

There is a Humpback Whale thread.

I strongly suspect that my approach is terrible/naive, but all I did was take a data frame of the classes that have fewer than 5 entries, loop through them, apply a transform to each image, and save the result under a new image name. I then appended those image names / class entries to a copy of train.csv.

Probably bad practice, but it did let me complete model training.
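
A simpler variant of this idea, sketched under assumptions: instead of writing transformed copies to disk, repeat the rows of the rare classes in the label table until every class has at least `min_count` entries, and leave it to training-time augmentation to vary the repeated images. `oversample_rows` and the file names are hypothetical; the rows stand in for (image name, class) pairs like those in train.csv.

```python
from collections import defaultdict
from itertools import cycle, islice

def oversample_rows(rows, min_count=5):
    """Return rows plus repeats so every class has at least min_count rows."""
    by_class = defaultdict(list)
    for image, label in rows:
        by_class[label].append((image, label))

    result = list(rows)
    for label, items in by_class.items():
        shortfall = min_count - len(items)
        if shortfall > 0:
            # Cycle through the existing examples of this class to fill the gap.
            result.extend(islice(cycle(items), shortfall))
    return result

# Hypothetical label table with two under-represented classes.
rows = [("a1.jpg", "w_a"), ("b1.jpg", "w_b"), ("b2.jpg", "w_b")]
balanced = oversample_rows(rows, min_count=3)

counts = defaultdict(int)
for _, label in balanced:
    counts[label] += 1
print(dict(counts))
```

One caveat with any duplication done before the train/validation split: copies of the same image can end up on both sides, so validation scores may look better than they should.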

amitkayal · December 4, 2018, 1:43pm

Thanks a lot, this is quite helpful, and I expect it will come up in a lot of practical use cases. BTW: I don’t understand why the code throws this error. I tried this with Keras and did not get any such error during model training… Is this an issue with PyTorch?

Thanks
Amit

bhollan · December 4, 2018, 3:01pm

Use AI to build an endless world to drive through, or y’know, make your buddy dance Gangnam Style.

NVIDIA’s AI-graphics engine

gshashank84 · December 4, 2018, 9:57pm

How can we build a TextDataBunch to classify text with more than one label per item, i.e. tags?

hellobharadwaj · December 5, 2018, 3:54am

I am trying to train a new model (pretty much along the lines of lesson 1, pets). I trained on top of a pre-trained model and then saved it. Following this, I unfroze and trained again. After this, I tried to load my pre-unfreeze model and got the following error:

loaded state dict contains a parameter group that doesn’t match the size of optimizer’s group

Am I missing something here?

karan · December 6, 2018, 12:19pm

Do we need a GPU for prediction?
I have built a text classification model from lesson 4. Now I am running prediction on another Google Cloud instance which doesn’t have a GPU. I tried predicting without a GPU and it gave me a runtime error.
Can anyone clarify this?

brismith · December 6, 2018, 2:06pm

You can fix this by adding with_opt=False after your model name in the load command. A fix to avoid this is coming, but in fastai 1.0.31 save also stores the optimizer state, so if the model has since been changed and frozen you will see this message on load.
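
The suggestion above can be sketched as a small wrapper. It assumes `learn` is a fastai v1 Learner whose `load` method accepts the `with_opt` keyword described in the post; the wrapper name and the checkpoint name "stage-1" in the usage comment are hypothetical.

```python
def load_weights_only(learn, name):
    """Reload saved model weights and skip the saved optimizer state."""
    # with_opt=False is the keyword described in the post above; skipping the
    # optimizer state avoids comparing it against the current parameter groups.
    return learn.load(name, with_opt=False)

# Usage, assuming an existing Learner and a checkpoint saved as "stage-1":
#   load_weights_only(learn, "stage-1")
```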

larcat · December 6, 2018, 2:55pm

I really feel like an idiot here, but can someone give me a minimal code snippet for returning a data frame with the appropriate class labels from a …get_preds call or similar?
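
A minimal sketch of the label lookup, in plain Python so it runs without a trained model. It assumes `probs` is the per-class probability output of a get_preds call converted to nested lists (for a tensor, `.tolist()`), that `classes` is the class list in the same order the model was trained with, and that `names` holds the item names in prediction order. `preds_to_rows` and the sample values are hypothetical.

```python
def preds_to_rows(probs, classes, names):
    """Pair each item name with its most probable class label."""
    rows = []
    for name, p in zip(names, probs):
        best = max(range(len(p)), key=p.__getitem__)  # argmax over class scores
        rows.append({"name": name, "label": classes[best], "prob": p[best]})
    return rows

# Hypothetical predictions for two images and two classes.
classes = ["cat", "dog"]
probs = [[0.9, 0.1], [0.2, 0.8]]
names = ["img_0.jpg", "img_1.jpg"]

rows = preds_to_rows(probs, classes, names)
for row in rows:
    print(row)
```

Passing the resulting list of dicts to `pandas.DataFrame(rows)` gives a data frame with one row per item.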

Thanks in advance.

gshashank84 · December 6, 2018, 4:25pm

What do ‘train_ds’ and ‘train_dl’ stand for?

nisham · December 6, 2018, 4:43pm

In a couple of his lectures Jeremy mentioned solving non-image problems, for instance audio, by converting the data into images. Can someone please share a popular paper in this area?

nisham · December 6, 2018, 4:43pm

I think “ds” stands for dataset and “dl” for dataloader.
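
To make the distinction concrete, here is a toy sketch in plain Python, not the PyTorch or fastai classes themselves: a dataset is an indexable collection of (input, label) pairs, and a dataloader iterates over a dataset in batches. `ToyDataset` and `ToyDataLoader` are hypothetical names.

```python
class ToyDataset:
    """A dataset: an indexable collection of (input, label) pairs."""
    def __init__(self, xs, ys):
        self.xs, self.ys = xs, ys

    def __len__(self):
        return len(self.xs)

    def __getitem__(self, i):
        return self.xs[i], self.ys[i]


class ToyDataLoader:
    """A dataloader: walks a dataset and yields it in batches."""
    def __init__(self, dataset, batch_size):
        self.dataset, self.batch_size = dataset, batch_size

    def __iter__(self):
        for start in range(0, len(self.dataset), self.batch_size):
            stop = min(start + self.batch_size, len(self.dataset))
            yield [self.dataset[i] for i in range(start, stop)]


train_ds = ToyDataset([10, 20, 30], ["a", "b", "c"])
train_dl = ToyDataLoader(train_ds, batch_size=2)
batches = list(train_dl)
print(batches)
```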

gshashank84 · December 6, 2018, 4:43pm

thanks

amitkayal · December 6, 2018, 6:40pm

Hello,

Is there any way I can resize all the images to a common size? My input images all vary in size, so I want to resize them to 224x224. Will passing size to the following call do that?

data = (src.transform(tfms, size=224)
        .databunch(bs=32).normalize(resent_101_stats))

Thanks
Amit

larcat · December 6, 2018, 7:58pm

I believe the size=224 argument to transform is what does the resizing; the …normalize(resnet) bit only rescales the pixel values using the given statistics.
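
A minimal sketch of what normalization does, assuming the usual per-channel definition (pixel - mean) / std: the values change, but the image keeps its height and width, so any resizing has to come from the `size` argument rather than from `normalize`. The functions below are plain-Python stand-ins, not fastai code, and the tiny image is made up.

```python
def normalize(image, mean, std):
    """Standardize each channel as (pixel - mean) / std; the shape is unchanged."""
    return [
        [[(px - m) / s for px in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]

def shape(image):
    """Return (channels, height, width) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

# Hypothetical image: 1 channel, 2x2 pixels.
image = [[[0.0, 1.0], [0.5, 0.5]]]
out = normalize(image, mean=[0.5], std=[0.5])

print(shape(image), shape(out))
print(out)
```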