Here are the questions:
- Why do we say that fastai has a “layered” API? What does it mean?
Fastai’s “layered” API means that a high-level API lets you train neural networks for common applications in just a few lines of code, while lower-level APIs offer more flexibility and are better suited to custom tasks.
- Why does a `Transform` have a `decode` method? What does it do?
The `decode` method reverses (if possible) the application of the transform. It is often used to convert predictions and mini-batches into human-understandable representations.
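For instance, a minimal sketch (the transform and values here are illustrative, not from fastai):

```python
from fastcore.transform import Transform

class IntToStr(Transform):
    def encodes(self, x:int): return str(x)   # forward direction
    def decodes(self, x:str): return int(x)   # reverses encodes for display

t = IntToStr()
y = t(3)        # '3'
t.decode(y)     # 3
```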
- Why does a `Transform` have a `setup` method? What does it do?
Sometimes it is necessary to initialize some inner state, like the vocabulary for a tokenizer. The `setup` method handles this.
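For example, a minimal sketch of a transform whose inner state (a vocabulary) is built in `setups` (names here are illustrative):

```python
from fastcore.transform import Transform

class Vocabify(Transform):
    def setups(self, items):
        # inner state built from the data seen
        self.vocab = sorted(set(items))
        self.o2i = {v:i for i,v in enumerate(self.vocab)}
    def encodes(self, x): return self.o2i[x]
    def decodes(self, x): return self.vocab[x]

tfm = Vocabify()
tfm.setup(['a','c','b','a'])   # setup dispatches to setups
tfm('b')                       # 1
```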
- How does a `Transform` work when called on a tuple?
The `Transform` is always applied to each item of the tuple. If a type annotation is provided, the `Transform` is only applied to the items with the correct type.
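A short demonstration of the type dispatch (toy transform, not part of fastai):

```python
from fastcore.transform import Transform

class Negate(Transform):
    def encodes(self, x:int): return -x   # only applied to ints

neg = Negate()
neg((1, 2.0, 3))   # (-1, 2.0, -3): the float is left untouched
```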
- Which methods do you need to implement when writing your own `Transform`?
Just the `encodes` method; optionally, the `decodes` method to make it reversible, and `setups` to initialize any inner state.
- Write a `Normalize` transform that fully normalizes items (subtract the mean and divide by the standard deviation of the dataset), and that can decode that behavior. Try not to peek!
Here is a `Normalize` transform:

```python
from fastcore.transform import Transform  # also exported by fastai

class Normalize(Transform):
    def setups(self, items): self.mean,self.std = items.mean(),items.std()
    def encodes(self, x): return (x-self.mean)/self.std
    def decodes(self, x): return x*self.std+self.mean
```
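A quick check of the round trip, assuming a float tensor as input:

```python
import torch

items = torch.tensor([1., 2., 3., 4.])
norm = Normalize()
norm.setup(items)    # computes mean/std via setups
t = norm(items)      # standardized values
norm.decode(t)       # recovers the original values
```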
- Write a `Transform` that does the numericalization of tokenized texts (it should set its vocab automatically from the dataset seen and have a `decode` method). Look at the source code of fastai if you need help.
Here is a numericalization transform:
```python
from fastai.text.all import *                    # store_attr, make_vocab, TensorText, tensor, L
from collections import Counter, defaultdict

class Numericalize(Transform):
    def __init__(self, min_freq=3, max_vocab=60000):
        store_attr('min_freq,max_vocab')

    def setups(self, dsets):
        # build the vocab from the dataset seen
        count = Counter(p for o in dsets for p in o)
        self.special_toks = dsets.special_toks
        self.vocab = make_vocab(count, min_freq=self.min_freq,
                                max_vocab=self.max_vocab,
                                special_toks=self.special_toks)
        self.o2i = defaultdict(int, {v:k for k,v in enumerate(self.vocab)
                                     if v != 'xxfake'})

    def encodes(self, o): return TensorText(tensor([self.o2i[o_] for o_ in o]))
    def decodes(self, o): return L(self.vocab[o_] for o_ in o)
```
- What is a `Pipeline`?
The `Pipeline` class is meant for composing several transforms together. It is defined by passing a list of `Transform`s to `Pipeline(...)`. When you call the `Pipeline` on an object, it will automatically call the transforms inside, in order.
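For example, composing two toy transforms (illustrative only):

```python
from fastcore.transform import Transform, Pipeline

class AddOne(Transform):
    def encodes(self, x): return x + 1
    def decodes(self, x): return x - 1

class Double(Transform):
    def encodes(self, x): return x * 2
    def decodes(self, x): return x // 2

pipe = Pipeline([AddOne(), Double()])
pipe(3)          # 8: AddOne then Double
pipe.decode(8)   # 3: decoded in reverse order
```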
- What is a `TfmdLists`?
A `TfmdLists` object groups together the raw items with a `Pipeline`.
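Reusing the toy transforms above (a sketch; in practice a `TfmdLists` is usually built from file paths):

```python
from fastai.data.core import TfmdLists

items = [1, 2, 3, 4]
tls = TfmdLists(items, [AddOne(), Double()])  # raw items + a Pipeline
tls[0]              # 4: (1+1)*2
tls.decode(tls[0])  # 1
```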
- What is a `Datasets`? How is it different from a `TfmdLists`?
`Datasets` will apply two (or more) pipelines in parallel to the same raw object and build a tuple with the result. This is different from a `TfmdLists`, which applies a single pipeline; getting separate inputs and targets would require two separate `TfmdLists` objects.
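For instance, a sketch with one pipeline for the input and one for the target:

```python
from fastai.data.core import Datasets

dsets = Datasets(items, [[AddOne()], [Double()]])  # one pipeline per tuple element
dsets[0]   # (2, 2): the same raw item run through each pipeline
```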
- Why are `TfmdLists` and `Datasets` named with an “s”?
Because they can handle a training and a validation set with the `splits` argument.
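For example (a sketch using `RandomSplitter` on the toy items above):

```python
from fastai.data.transforms import RandomSplitter

splits = RandomSplitter(valid_pct=0.25)(items)
dsets = Datasets(items, [[AddOne()], [Double()]], splits=splits)
len(dsets.train), len(dsets.valid)   # (3, 1)
```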
- How can you build a `DataLoaders` from a `TfmdLists` or a `Datasets`?
You can call the `dataloaders` method.
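For example, with the toy `tls` from above (the batch size is arbitrary here):

```python
dls = tls.dataloaders(bs=2)   # the same call works on a Datasets
```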
- How do you pass `item_tfms` and `batch_tfms` when building a `DataLoaders` from a `TfmdLists` or a `Datasets`?
You can pass `after_item` and `after_batch`, respectively, to the `dataloaders` method.
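A hedged sketch for an image `Datasets` (here `dsets` is assumed to yield PIL images; `Resize`, `ToTensor`, and `IntToFloatTensor` are standard fastai transforms):

```python
from fastai.vision.all import *

dls = dsets.dataloaders(
    after_item=[Resize(224), ToTensor],   # item_tfms: per item, on CPU
    after_batch=[IntToFloatTensor],       # batch_tfms: per collated batch, possibly on GPU
    bs=64)
```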
- What do you need to do when you want to have your custom items work with methods like `show_batch` or `show_results`?
You need to create a custom type with a `show` method, since `TfmdLists`/`Datasets` will decode the items until it reaches a type with a `show` method.
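A minimal sketch (the type and transform names are made up for illustration):

```python
from fastcore.transform import Transform

class ShowableInt(int):
    def show(self, ctx=None, **kwargs):
        print(f"value: {self}")    # how this item displays itself

class Intify(Transform):
    def encodes(self, x): return int(x)
    def decodes(self, x): return ShowableInt(x)  # decoding stops at a type with show
```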
- Why can we easily apply fastai data augmentation transforms to the `SiamesePair` we built?
Because they dispatch over tuples or their subclasses.
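A small demonstration with a `fastuple` subclass standing in for `SiamesePair`, using the `Negate` toy transform from above:

```python
from fastcore.basics import fastuple

class Pair(fastuple): pass   # stand-in for SiamesePair

out = Negate()(Pair(1, 2))
type(out), out   # (Pair, (-1, -2)): applied elementwise, subclass type retained
```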