Learning simple Boolean functions

Hi,

First of all thanks for fastai and the great online course (just went through Part I). It’s really impressive to see what you can achieve with fastai in a few lines of code :slight_smile:

However, I am currently struggling with something that seems to me like a very easy problem compared to those shown in the course. I wanted to start with a very simple Boolean function for my model to learn (and later develop it into a more difficult problem): e.g. a simple XOR of two binary inputs.

My Colab notebook is here, btw: https://colab.research.google.com/drive/1rCkqVUNG7JIkkYxVpPgRAcmkc5_Jk4wF

To pass my dataset (a numpy array containing the input bits and the output bit) to a DataBunch, I used the “ArrayDataset” from the fastai docs:

from torch.utils.data import Dataset

class ArrayDataset(Dataset):
    "Sample numpy array dataset"
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.c = 2  # binary label

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        return self.x[i], self.y[i]

# First two indices are the two inputs a and b, third index = output of the function to learn
train_ds = ArrayDataset(training_set[:,0:2], training_set[:,2])
valid_ds = ArrayDataset(validation_set[:,0:2], validation_set[:,2])

# Define data bunch to use for learner
db = DataBunch(train_dl=DataLoader(dataset=train_ds), valid_dl=DataLoader(dataset=valid_ds))
db.batch_size = int(num_training_samples / batches)

learn = Learner(db, model, loss_func=MSELossFlat())

The model also seems to learn the function (at least sometimes), since the MSE loss drops to 0.00… However, this is not deterministic: when I restart the whole notebook, the error sometimes gets stuck at 0.25 (I seeded the numpy RNG so that I always create the same data sets).

My questions in short:

  1. What else can I configure in fastai to make the learning reproducible (always learn the same model)?

  2. Also, is the code/model okay so far or did I make some fundamental mistakes, which is the reason for the strange behavior I get?

  3. Is the “c=2” parameter in ArrayDataset correct? Since I defined a regression problem (rather than a classification problem, which I also could have done), I am not sure what this parameter is used for in this case :frowning:

  4. Is there an easier way to get my data set into the DataBunch?

Thanks for helping me,
Hannes


Hi Hannes and welcome,

I will take a shot at some of your questions.

There are already many forum posts investigating this question. In your case, you have ensured the same training and validation sets. However, the Linear layers will be initialized differently each time. You could manually re-initialize them the same way after nn.Linear creates them; I suspect that then you’d get the same outcome each time.
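For example, the usual set of seeding calls looks like this (just a sketch; run it before the model is created, so the Linear init draws from a fixed RNG state):

import random
import numpy as np
import torch

def seed_everything(seed=42):
    # Seed every RNG that can influence weight init and data order
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic (only matters when training on the GPU)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(42)  # call before building the model/Learner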

> Also, is the code/model okay so far or did I make some fundamental mistakes, which is the reason for the strange behavior I get?

Your model is being asked to do something extremely simple: memorize 4 different inputs with their results. The validation set is irrelevant because it merely repeats the training examples. The model is clearly capable of learning this task, but does not learn it every time.

Your model contains very few parameters. What I suspect is that the loss-vs-weights landscape is low-dimensional and lumpy, with deep local minima. The weight solution gets trapped in a local minimum depending on where the Linear layers happened to be initialized that time.
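For concreteness, a minimal XOR-style net would look something like this (a sketch only; your notebook’s exact architecture may differ):

import torch.nn as nn

# 2 inputs -> small hidden layer -> 1 continuous output
model = nn.Sequential(
    nn.Linear(2, 4),  # a handful of hidden units is enough for XOR
    nn.ReLU(),
    nn.Linear(4, 1),  # output compared against the 0/1 target by MSE
)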

The .25 MSE loss simply means that the model is predicting one of the four inputs incorrectly (as its opposite): a single error of magnitude 1 averaged over four samples gives 1/4 = 0.25. You could run the four inputs through the trained model to test this speculation.
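Something like this would show which case comes out wrong (a sketch; I’m assuming the model takes a float tensor of shape (4, 2) — substitute learn.model for model if needed):

import torch

# The four XOR cases and their targets
xs = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
ys = torch.tensor([0., 1., 1., 0.])

model.eval()
with torch.no_grad():
    preds = model(xs).squeeze()
for x, y, p in zip(xs, ys, preds):
    print(f"input={x.tolist()}  target={y.item():.0f}  pred={p.item():.3f}")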

About #3, no idea. Seems like c should be irrelevant.

#4: You really do not need to build huge training arrays containing the same four inputs in random order. Unless you need exact reproducibility, generating training inputs on the fly would be enough.
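For example, a Dataset that draws random bits on demand (a sketch; length is arbitrary and just tells the DataLoader how many samples make up an epoch):

import numpy as np
from torch.utils.data import Dataset

class XorDataset(Dataset):
    "Generates random XOR samples on the fly instead of storing a big array."
    def __init__(self, length=1000):
        self.length = length
        self.c = 2  # same as in ArrayDataset above

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        a, b = np.random.randint(0, 2, size=2)
        x = np.array([a, b], dtype=np.float32)
        y = np.float32(a ^ b)  # XOR of the two input bits
        return x, y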

Hope this helps you to take the next steps,
Malcolm

Hi Malcolm and thanks for your answers :slight_smile:

Regarding memorizing the four inputs: in that case I would assume the error should be exactly 0, not 0.000xxx. But maybe this is just some imprecision problem.

About c: I also thought so, but when I set it to 0, None, or 4, for instance, I get some weird effects, like the model not learning at all.

About the huge training arrays: yes, I know. I just started off with a more complicated problem and then switched to a simpler one, so I thought it wouldn’t matter from a performance perspective. It also doesn’t seem to change anything, so I didn’t bother to change it back.

So is there a more elegant way without creating my own ArrayDataset class?

Does anybody know what this parameter does, and what difference it makes when learning a regression rather than a classification problem?

Cheers,
Hannes

Right. You are approximating a Boolean function with a continuous function, so the loss will never be exactly zero; more training would bring it closer. You could visualize what function is being learned by plotting the model’s output as a surface over [0,1] × [0,1].
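A quick matplotlib sketch of that (assuming model maps a float tensor of shape (N, 2) to one output per row):

import numpy as np
import torch
import matplotlib.pyplot as plt

# Evaluate the trained model on a grid over [0,1] x [0,1]
a = np.linspace(0, 1, 50)
A, B = np.meshgrid(a, a)
grid = torch.tensor(np.stack([A.ravel(), B.ravel()], axis=1), dtype=torch.float32)
with torch.no_grad():
    Z = model(grid).squeeze().numpy().reshape(A.shape)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(A, B, Z, cmap="viridis")
ax.set_xlabel("a"); ax.set_ylabel("b"); ax.set_zlabel("model output")
plt.show()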