Beginner: Beginner questions that don't fit elsewhere ✅

I’m starting to get a feel for training models: basically, you try different variations and see what works best, e.g. different datasets, different architectures, or different types of cropping. Keeping track of all that can become quite messy.

I’m sure this has been solved somehow?

Is there a framework (or method/workflow) we can use to make sure we never lose experiments, and always know what the “parameters” of each experiment were?

Weights & Biases. It’s easy to use, well featured, and has lots of documentation and tutorials.

And moving beyond simple experiment recording, you can use more of its features to support MLOps.

W&B Effective MLOps Model Development Course
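
If it helps to see what the basic logging looks like, here is a minimal sketch (the project name, config values, and metric numbers are made up for illustration, not from this thread):

import wandb

# Minimal sketch: project name, config, and metrics are illustrative only.
run = wandb.init(project="cropping-experiments",
                 config={"arch": "resnet18", "item_tfms": "Resize(224)", "dataset": "v2"})

for epoch in range(3):
    # In a real run these numbers would come from your training loop.
    wandb.log({"epoch": epoch, "train_loss": 1.0 / (epoch + 1)})

run.finish()

fastai also ships a WandbCallback (in fastai.callback.wandb) that can log training runs automatically, so you don’t have to write the logging calls yourself.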


Thanks!

I am unsure of how to call my trained model with new data.

The training and saving of my model:
procs = [Categorify, FillMissing]
cont,cat = cont_cat_split(btrain)
splits = RandomSplitter(valid_pct=0.5)(range_of(btrain))
btrain = TabularPandas(btrain, procs, cat, cont, y_names='15R', splits=splits)
dls = btrain.dataloaders(1024)
learn = tabular_learner(dls, layers=[500,250])
learn.fit_one_cycle(5, 1e-2)
pickle.dump(learn, open(filename, "wb"))

Attempting to use my model on test data:
learn = pickle.load(open(filename, 'rb'))
procs = [Categorify, FillMissing]
cont,cat = cont_cat_split(btest)
btest = TabularPandas(btest, procs, cat, cont)
dls = learn.dls.test_dl(btest)
preds = learn.get_preds(dl=dls, with_targs=False)

In the line dls = ... I get the following error:
KeyError: "['15R'] not in index"

Upon deployment, I will not have the 15R feature. I believe I am making a simple error in my understanding of either the dataloader or learner classes.
The docs (fastai - Tabular training) provide the following: " To get prediction on a new dataframe, you can use the test_dl method of the DataLoaders. That dataframe does not need to have the dependent variable in its column."
Batch prediction is helpful for testing the model, but during deployment the number of predictions needed will be low (approx. 1-10), so .predict will likely be more cost-effective; I am deploying with GCP and would prefer not to use a GPU.

This problem has been resolved: calling cont_cat_split without specifying the dependent variable (dep_var) resulted in the y variable also being passed as an x variable.
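
For reference, a sketch of the corrected version (same variable names as in the post above; the key change is passing dep_var to cont_cat_split, plus a single-row predict call for the low-volume deployment case):

# Training: pass the dependent variable so '15R' is excluded
# from the continuous/categorical feature lists.
cont, cat = cont_cat_split(btrain, dep_var='15R')
to = TabularPandas(btrain, procs, cat, cont, y_names='15R', splits=splits)
dls = to.dataloaders(1024)
learn = tabular_learner(dls, layers=[500, 250])
learn.fit_one_cycle(5, 1e-2)

# Inference: test_dl takes the raw dataframe directly, and btest
# does not need a '15R' column.
test_dl = learn.dls.test_dl(btest)
preds, _ = learn.get_preds(dl=test_dl)

# For the 1-10 predictions per call expected at deployment,
# predict on a single row (a pandas Series) instead of a batch:
row, _, _ = learn.predict(btest.iloc[0])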

Hi, I am very new to fastai, and I have a question: how is the expected target batch size calculated? I am currently using TextDataLoaders for my dataset.
The codes are shown below:


tweet = TextDataLoaders.from_df(df_train, bs=128, path='.', valid_pct=0.1, text_cols=['text'])
learn = language_model_learner(tweet, AWD_LSTM, metrics=[accuracy, Perplexity()], wd=0.1).to_fp16()
learn.fine_tune(10)


However, it shows the following error:


0.00% [0/1 00:00<?]
epoch  train_loss  valid_loss  accuracy  perplexity  time

0.00% [0/92 00:00<?]

ValueError                                Traceback (most recent call last)
Cell In[79], line 1
----> 1 learn.fine_tune(10)

File ~/miniconda3/lib/python3.10/site-packages/fastai/callback/schedule.py:165, in fine_tune(self, epochs, base_lr, freeze_epochs, lr_mult, pct_start, div, **kwargs)
    163 "Fine tune with `Learner.freeze` for `freeze_epochs`, then with `Learner.unfreeze` for `epochs`, using discriminative LR."
    164 self.freeze()
--> 165 self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    166 base_lr /= 2
    167 self.unfreeze()

File ~/miniconda3/lib/python3.10/site-packages/fastai/callback/schedule.py:119, in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt, start_epoch)
    116 lr_max = np.array([h['lr'] for h in self.opt.hypers])
    117 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    118           'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 119 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd, start_epoch=start_epoch)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:264, in Learner.fit(self, n_epoch, lr, wd, cbs, reset_opt, start_epoch)
    262 self.opt.set_hypers(lr=self.lr if lr is None else lr)
    263 self.n_epoch = n_epoch
--> 264 self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}'); f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}'); final()

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:253, in Learner._do_fit(self)
    251 for epoch in range(self.n_epoch):
    252     self.epoch=epoch
--> 253     self._with_events(self._do_epoch, 'epoch', CancelEpochException)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}'); f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}'); final()

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:247, in Learner._do_epoch(self)
    246 def _do_epoch(self):
--> 247     self._do_epoch_train()
    248     self._do_epoch_validate()

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:239, in Learner._do_epoch_train(self)
    237 def _do_epoch_train(self):
    238     self.dl = self.dls.train
--> 239     self._with_events(self.all_batches, 'train', CancelTrainException)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}'); f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}'); final()

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:205, in Learner.all_batches(self)
    203 def all_batches(self):
    204     self.n_iter = len(self.dl)
--> 205     for o in enumerate(self.dl): self.one_batch(*o)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:235, in Learner.one_batch(self, i, b)
    233 b = self._set_device(b)
    234 self._split(b)
--> 235 self._with_events(self._do_one_batch, 'batch', CancelBatchException)

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}'); f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}'); final()

File ~/miniconda3/lib/python3.10/site-packages/fastai/learner.py:219, in Learner._do_one_batch(self)
    217 self('after_pred')
    218 if len(self.yb):
--> 219     self.loss_grad = self.loss_func(self.pred, *self.yb)
    220 self.loss = self.loss_grad.clone()
    221 self('after_loss')

File ~/miniconda3/lib/python3.10/site-packages/fastai/losses.py:54, in BaseLoss.__call__(self, inp, targ, **kwargs)
     52 if targ.dtype in [torch.int8, torch.int16, torch.int32]: targ = targ.long()
     53 if self.flatten: inp = inp.view(-1,inp.shape[-1]) if self.is_2d else inp.view(-1)
---> 54 return self.func.__call__(inp, targ.view(-1) if self.flatten else targ, **kwargs)

File ~/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/lib/python3.10/site-packages/torch/nn/modules/loss.py:1174, in CrossEntropyLoss.forward(self, input, target)
   1173 def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1174     return F.cross_entropy(input, target, weight=self.weight,
   1175                            ignore_index=self.ignore_index, reduction=self.reduction,
   1176                            label_smoothing=self.label_smoothing)

File ~/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py:3015, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2949 r"""This criterion computes the cross entropy loss between input logits and target.
   2950
   2951 See :class:`~torch.nn.CrossEntropyLoss` for details.
   (...)
   3012     >>> loss.backward()
   3013 """
   3014 if has_torch_function_variadic(input, target, weight):
-> 3015     return handle_torch_function(
   3016         cross_entropy,
   3017         (input, target, weight),
   3018         input,
   3019         target,
   3020         weight=weight,
   3021         size_average=size_average,
   3022         ignore_index=ignore_index,
   3023         reduce=reduce,
   3024         reduction=reduction,
   3025         label_smoothing=label_smoothing,
   3026     )
   3027 if size_average is not None or reduce is not None:
   3028     reduction = _Reduction.legacy_get_string(size_average, reduce)

File ~/miniconda3/lib/python3.10/site-packages/torch/overrides.py:1551, in handle_torch_function(public_api, relevant_args, *args, **kwargs)
   1545     warnings.warn("Defining your `__torch_function__` as a plain method is deprecated and "
   1546                   "will be an error in future, please define it as a classmethod.",
   1547                   DeprecationWarning)
   1549 # Use `public_api` instead of `implementation` so `__torch_function__`
   1550 # implementations can do equality/identity comparisons.
-> 1551 result = torch_func_method(public_api, types, args, kwargs)
   1553 if result is not NotImplemented:
   1554     return result

File ~/miniconda3/lib/python3.10/site-packages/fastai/torch_core.py:382, in TensorBase.__torch_function__(cls, func, types, args, kwargs)
    380 if cls.debug and func.__name__ not in ('__str__','__repr__'): print(func, types, args, kwargs)
    381 if _torch_handled(args, cls._opt, func): types = (torch.Tensor,)
--> 382 res = super().__torch_function__(func, types, args, ifnone(kwargs, {}))
    383 dict_objs = _find_args(args) if args else _find_args(list(kwargs.values()))
    384 if issubclass(type(res),TensorBase) and dict_objs: res.set_meta(dict_objs[0],as_copy=True)

File ~/miniconda3/lib/python3.10/site-packages/torch/_tensor.py:1295, in Tensor.__torch_function__(cls, func, types, args, kwargs)
   1292     return NotImplemented
   1294 with _C.DisableTorchFunctionSubclass():
-> 1295     ret = func(*args, **kwargs)
   1296 if func in get_default_nowrap_functions():
   1297     return ret

File ~/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py:3029, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3027 if size_average is not None or reduce is not None:
   3028     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

ValueError: Expected input batch_size (6912) to match target batch_size (128).


I tried installing an older version of fastai (because I wanted to use the TextDataBunch function), but that didn’t succeed. I also checked the batch size.


Could you please help me with how to solve the problem? Thank you!
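
A minimal sketch of a likely fix, assuming the goal is to fine-tune AWD_LSTM as a language model: language_model_learner expects DataLoaders built with is_lm=True, so the targets are per-token (matching the model’s per-token predictions) rather than one label per item, which is consistent with the mismatch above (6912 = 128 × 54 per-token predictions vs. 128 labels). Note also that the TextDataLoaders.from_df parameter is text_col, not text_cols.

from fastai.text.all import *

# Sketch, assuming df_train is as in the post above.
# Key change: is_lm=True makes the targets per-token, so input and
# target batch sizes line up for the language-model loss.
tweet = TextDataLoaders.from_df(df_train, path='.', valid_pct=0.1,
                                text_col='text', is_lm=True, bs=128)
learn = language_model_learner(tweet, AWD_LSTM,
                               metrics=[accuracy, Perplexity()], wd=0.1).to_fp16()
learn.fine_tune(10)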

Hi,
I started learning Lesson 1 (2022) through YouTube and tried to reproduce the code in Jupyter. However, when I get to line 37, it won’t run; there is an AttributeError:

AttributeError                            Traceback (most recent call last)
Cell In[10], line 4
      1 learn = vision_learner(dls, resnet18, metrics=error_rate)
      2 learn.fine_tune(3)
----> 4 is_bird,_,probs = learn.predict(PILImage.create('bird.jpg'))
      5 print(f"This is a: {is_bird}.")
      6 print(f"Probability it's a bird: {probs[0]:.4f}")

Yeah, this is a known issue with the latest version of fastai. You can remove the PILImage.create call and use the predict method by passing the path directly, as follows:

is_bird,_,probs = learn.predict('bird.jpg')

That issue was in version 2.7.11.

It should be fixed in version 2.7.12, which was released a few days ago. Is it still an issue?

Wow, thanks very much. It worked!

Should I just download version 2.7.12 directly into the Python install folder, and run the upgrade from the command prompt?

Upgrading should work. You can run the install/update from inside the notebook:
!pip install -U fastai

You can check the various fastai library and CUDA versions with show_install. It will give you an easy-to-share formatted summary, which is useful when reporting issues:

from fastai.test_utils import show_install
show_install() 

Yeah, it worked well. Thank you very much.

Hi! I am doing the basics as shown in the first chapters of the course like:

dblock = DataBlock(blocks    = (ImageBlock, CategoryBlock),
                   get_items = get_image_files,
                   get_y     = parent_label,
                   splitter  = GrandparentSplitter(),
                   item_tfms = Resize(224)
                   )

I have my images organized in two top directories train and valid, and each of these two directories has two directories cats and dogs.

The training directories each have 1000 images, and the valid directories each have 200.

I want to increase the number of images used between experiments: the first experiment would use 2x200 training images and 2x40 validation images, and later maybe double that.

Is there a way to do this in code, without adding/removing the actual images from the data source? Like slicing both the training and validation image lists somewhere?
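
One possible approach (a sketch, not from the thread: subsample_files is a hypothetical helper) is to keep the directory layout intact but wrap get_image_files so that only a random subset per split/class is returned:

import random
from fastai.vision.all import *

def subsample_files(path, n_train=200, n_valid=40, seed=42):
    "Hypothetical helper: keep at most n images per (split, class) pair."
    random.seed(seed)
    groups = {}
    for f in get_image_files(path):
        # grandparent = train/valid, parent = cats/dogs
        groups.setdefault((f.parent.parent.name, f.parent.name), []).append(f)
    keep = []
    for (split, _), fs in groups.items():
        n = n_train if split == 'train' else n_valid
        keep += random.sample(fs, min(n, len(fs)))
    return L(keep)

dblock = DataBlock(blocks    = (ImageBlock, CategoryBlock),
                   get_items = partial(subsample_files, n_train=200, n_valid=40),
                   get_y     = parent_label,
                   splitter  = GrandparentSplitter(),
                   item_tfms = Resize(224)
                   )

Doubling the dataset between experiments is then just a matter of changing n_train and n_valid, with a fixed seed keeping the smaller subsets reproducible.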

I am currently overwhelmed by the number of resources available. Is it still advisable to go through the fastai book given that there are newly uploaded resources for part 2 of the course?

Is it best to just focus on the latest part 1 and part 2 uploads, or is the book still relevant? Or should I do part 1 of the book and then do part 2 (latest version), which is now widely available? Do you have any recommended approach? Thank you.

Yeah, I’m also a bit confused. The video course jumps from chapter 4 in the book to chapter 10. Should I go through chapters 5-9 first before continuing with the video course? Or are these chapters covered later on?

How often do data scientists just fine-tune already pre-trained models at work, and how often do they need to build a model completely from scratch?

Hi all,

I’ve been trying to run through Chapter 1 of the associated jupyter notebook.
(i.e. by making a copy of 01_intro.ipynb)

I spend some time on it over multiple days, rather than finishing it in one go.

However, every time I start, I have to run the following code in the first cell, and then the rest of the code works fine.

! [ -e /content ] && pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

Running this also requests access to my Google account every time I run it.
Shouldn’t this ‘allow Google to run this’ step happen only once?

Also, is running this cell every new session the proper approach? Is there a way to run this cell only once, and in later sessions just run the next cell containing from fastbook import *?

I can’t speak to this, but it may be something to do with cookies? Or perhaps how you’re loading the notebook into Colab.

Google Colab isn’t persistent, meaning you’ll be starting from a blank slate each time you turn the environment running the notebook off and on. So that’s normal and how it works, inconvenient though it is. That said, certain libraries come preinstalled on Colab, so you won’t have to pip install them each time.

You can run ! pip freeze to view all preinstalled libraries. fastai, PyTorch, NumPy, Pandas, and Matplotlib are some of the libraries that are preinstalled, though the preinstalled versions most likely aren’t the latest.
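
For example, to quickly check a couple of the preinstalled versions inside a notebook cell:

import fastai, torch

# Print the installed fastai and PyTorch versions.
print(fastai.__version__, torch.__version__)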


Thank you for the response, Salman

Colab not being persistent explains this.

The following are part of the pip freeze output:
fastai==2.7.12
fastcore==1.5.29

However, fastbook isn’t present.

So the line ! [ -e /content ] && pip install -Uqq fastbook
seems to need to be run every time I open the notebook in Colab.
