Predicting if a number is even or odd

From the material in lesson 2, I wanted to see whether we could build a keras.Sequential model to predict if a number is even or odd. The following is what I tried.

# Imports used below (Keras 1.x API, as in the lesson).
>>> import numpy as np
>>> from keras.models import Sequential
>>> from keras.layers import Dense
>>> from keras.optimizers import SGD

# Number of samples in the training set.
>>> samples = 100

# NumPy column vector: each row holds one number.
>>> x = np.reshape(np.arange(samples), (samples, 1))

# An even number is labelled 0 and an odd number 1.
>>> y = np.array([temp[0] % 2 for temp in x])

# Sample values of x
>>> x[:5]
array([[0],
       [1],
       [2],
       [3],
       [4]])
# Sample values of y
>>> y[:5]
array([0, 1, 0, 1, 0])

>>> lm = Sequential([Dense(1, input_shape=(1,))])
>>> lm.compile(optimizer=SGD(lr=0.1), loss='mse')

# Computes the loss on some input data, batch by batch.
>>> lm.evaluate(x, y, verbose=1)
  32/100 [========>.....................] - ETA: 0s
  4156.3061328125004

>>> lm.fit(x, y, nb_epoch=5, batch_size=1)

Epoch 1/5
100/100 [==============================] - 0s - loss: nan          
Epoch 2/5
100/100 [==============================] - 0s - loss: nan     
Epoch 3/5
100/100 [==============================] - 0s - loss: nan     
Epoch 4/5
100/100 [==============================] - 0s - loss: nan     
Epoch 5/5
100/100 [==============================] - 0s - loss: nan     
<keras.callbacks.History at 0x7f847acf16a0>

>>> lm.evaluate(x, y)
  32/100 [========>.....................] - ETA: 0s
  nan

# Let's look at the weights.
>>> lm.get_weights()
[array([[ nan]], dtype=float32), array([ nan], dtype=float32)]

NOTE: The post ^ went out accidentally while I was still writing it.

I don’t understand why the loss from lm.evaluate is 4156.3061328125004 the first time but NaN on the later runs. Similarly, the weights are NaN too.

You can edit a post even after it is out in the wild if you would like.


Hit the grey pencil under your post.

Try using a smaller learning rate, e.g. lr=0.0001 to start.
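To illustrate why a smaller rate helps here, a minimal NumPy sketch of the same setup (per-sample SGD on f(x) = wx + b with MSE; this is my own re-implementation, not Keras code, and the lr values are just for illustration):

```python
import numpy as np

def sgd_mse(x, y, lr, epochs=5):
    """Per-sample SGD on f(x) = w*x + b with MSE loss; returns final (w, b)."""
    w, b = 0.0, 0.0
    with np.errstate(over='ignore', invalid='ignore'):
        for _ in range(epochs):
            for xi, yi in zip(x, y):
                err = (w * xi + b) - yi   # prediction error
                w -= lr * 2 * err * xi    # d(err^2)/dw
                b -= lr * 2 * err         # d(err^2)/db
    return w, b

x = np.arange(100, dtype=float)
y = x % 2

w_big, _ = sgd_mse(x, y, lr=0.1)             # inputs up to 99 make the updates explode
w_small, b_small = sgd_mse(x, y, lr=0.0001)  # small enough to stay stable

print(np.isfinite(w_big), np.isfinite(w_small))
```

With lr=0.1 the per-sample update on w is scaled by x², which reaches 99² here, so the weights overflow and turn into NaN, just like in your run; with lr=0.0001 they stay finite. Rescaling x (e.g. dividing by 100) would also help.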

What you can do is extract the weights, study them, and do the evaluation manually. See how to extract weights here: https://keras.io/layers/about-keras-layers/

That might give you some insight.

I don’t think a single Dense layer is expressive enough to represent the is_even function. In this setup, the Dense layer is merely the function f(x) = wx + b, and I can’t think of a w and a b that would give me the is_even function. I like this style, though, of choosing a simple function and seeing the different ways of getting a model to converge to it. I wonder if you can do it with a single hidden layer.
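That claim is easy to check numerically: fit the best possible w and b in closed form (ordinary least squares, my own check rather than anything from the lesson) and look at the loss they achieve:

```python
import numpy as np

x = np.arange(100, dtype=float)
y = x % 2

# Best possible f(x) = w*x + b under MSE, via ordinary least squares.
A = np.stack([x, np.ones_like(x)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

mse = np.mean((w * x + b - y) ** 2)
print(w, b, mse)
```

The best line is essentially flat at y = 0.5, and the MSE bottoms out just under 0.25. In other words, no amount of training lets a single Dense(1) layer learn parity on these inputs; you'd need a hidden layer or better input features.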

I agree with Kevin that you should try a smaller learning rate, although I don’t know why the loss is becoming NaN. Maybe you could print the weights (w, b) every so often to see how they evolve. Maybe they’re becoming huge or tiny.
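A sketch of that kind of logging, using the same f(x) = wx + b / MSE updates in plain NumPy (my own re-implementation, with lr=0.1 as in the original post) over a single pass of the data:

```python
import numpy as np

x = np.arange(100, dtype=float)
y = x % 2

w, b, lr = 0.0, 0.0, 0.1
with np.errstate(over='ignore', invalid='ignore'):
    for step, (xi, yi) in enumerate(zip(x, y)):
        err = (w * xi + b) - yi   # prediction error for this sample
        w -= lr * 2 * err * xi    # SGD update for w
        b -= lr * 2 * err         # SGD update for b
        if step % 10 == 0:
            print(f"step {step:3d}  w={w:12.4e}  b={b:12.4e}")
```

The magnitudes explode within a single pass over the data; once the weights overflow, the loss (and eventually the weights themselves) turn into NaN, which matches what lm.get_weights() showed.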