**My question is this: can a simple visual CNN learn integer addition?**

I have been digging into this idea and I am eager to discuss it with other FastAI learners.

The training data are computer-generated images of expressions, like “1+1”, “3-4”, and “-2+4”.

The classes are their respective sums: integers spanning the range of possible results, like “-1”, “0”, and “2”.

The validation data should be an unfamiliar combination of familiar tokens, like “1+3”.
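To make the train/validation idea concrete, here is a minimal sketch of how the labels and the held-out split could be built. It only produces expression strings and integer labels; rendering them to images is left out. The operand ranges, the function names `make_expressions` and `split_unseen`, and the 20% hold-out are all assumptions for illustration, not part of my actual setup.

```python
import itertools
import random

def make_expressions(first_operands, second_operands, operators=("+", "-")):
    """Return {expression_string: integer_label} for every operand pair."""
    table = {}
    for a, b in itertools.product(first_operands, second_operands):
        for op in operators:
            label = a + b if op == "+" else a - b
            table[f"{a}{op}{b}"] = label
    return table

def split_unseen(table, valid_fraction=0.2, seed=0):
    """Hold out whole expressions so validation only has unseen combinations."""
    rng = random.Random(seed)
    exprs = sorted(table)
    rng.shuffle(exprs)
    n_valid = max(1, int(len(exprs) * valid_fraction))
    valid = {e: table[e] for e in exprs[:n_valid]}
    train = {e: table[e] for e in exprs[n_valid:]}
    return train, valid

# Assumed ranges: first operand in -4..4, second operand in 0..4.
table = make_expressions(range(-4, 5), range(0, 5))
train, valid = split_unseen(table)
```

A random hold-out like this does not guarantee that every validation label also appears as a training label, so a real split would probably need that extra check.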

The essential setup can be reduced to the following question:

TRAIN

1 >>> A

2 >>> B

3 >>> C

4 >>> D

1+2 >>> C

1+4 >>> E

2+2 >>> D

VALID

2+3 >>> ???

In my imagination, to fill in the question marks, the machine learner must recognize the equivalence of the following activations:

(B and C) == (B and (A and B)) == (A and (B and B)) == (A and D) == E

I see two problems with this. Either it would appear to take arbitrarily many layers (one for each addition you want to perform), or it would appear to require some internal representation of the candidate classes BEFORE the algorithm actually produces a classification (in a rather self-aware manner).

So far, I haven’t had success! My strategy has been to first train a model simply to recognize and classify the tokens themselves, e.g. “1”, “3”, and “-2”. Once it reaches arbitrarily high accuracy, I freeze the model, swap out the data/classes for the addition images, and train some new layers on top. I can’t figure out why this isn’t working!
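For anyone who wants to reproduce the two-stage strategy, here is a rough sketch in plain PyTorch rather than the fastai API. The layer sizes, image shape, class counts (`N_TOKENS`, `N_SUMS`), and the names `body`, `token_head`, and `sum_head` are assumptions chosen for illustration; my actual architecture differs.

```python
import torch
from torch import nn

# Assumed sizes: 9 token classes (-4..4) and 17 sum classes (-8..8).
N_TOKENS, N_SUMS = 9, 17

# Shared convolutional body, trained in stage 1 on single-token images.
body = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
)
token_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, N_TOKENS)
)
token_model = nn.Sequential(body, token_head)
# ... stage 1 training loop for token_model omitted ...

# Stage 2: freeze the body so only the new layers receive gradients.
for p in body.parameters():
    p.requires_grad_(False)

# New layers stacked on the frozen feature maps, classifying sums.
sum_head = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, N_SUMS),
)
sum_model = nn.Sequential(body, sum_head)

# Only the unfrozen parameters are handed to the optimizer.
trainable = [p for p in sum_model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# Dummy batch of 4 grayscale expression images, 32x96 pixels (assumed size).
x = torch.randn(4, 1, 32, 96)
logits = sum_model(x)
```

This only pins down the setup; it does not explain the failure. One design choice to note is that the new head sits on the spatial feature maps, not on the token logits.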

As a side note, I am not interested in hand-crafting a piecemeal model that recognizes digits and the “+” and “-” symbols separately and then builds an expression for Python to evaluate. I actually want the NN to discover/derive the concept of addition internally.

If you find this idea or its implementation interesting or challenging, please add your thoughts/comments/suggestions here. Is it possible?