Lesson 4 discussion

prateek2686 · December 1, 2017, 8:42pm

I am using the following, fairly simple code to predict an output variable which may have 3 categories:

n_factors = 20
np.random.seed(42)

def embedding_input(name, n_in, n_out, reg):
    inp = Input(shape=(1,), dtype='int64', name=name)
    return inp, Embedding(n_in, n_out, input_length=1, W_regularizer=l2(reg))(inp)

user_in, u = embedding_input('user_in', n_users, n_factors, 1e-4)
artifact_in, a = embedding_input('artifact_in', n_artifacts, n_factors, 1e-4)

mt = Input(shape=(31,))
mr = Input(shape=(1,))
sub = Input(shape=(24,))

def onehot(featurename):
    onehot_encoder = OneHotEncoder(sparse=False)
    onehot_encoded = onehot_encoder.fit_transform(Modality_Durations[featurename].reshape(-1, 1))
    trn_onehot_encoded = onehot_encoded[msk]
    val_onehot_encoded = onehot_encoded[~msk]
    return trn_onehot_encoded, val_onehot_encoded

# One hot encode the categorical variables
trn_onehot_encoded_mt, val_onehot_encoded_mt = onehot('modality_type')
trn_onehot_encoded_mr, val_onehot_encoded_mr = onehot('roleid')
trn_onehot_encoded_sub, val_onehot_encoded_sub = onehot('subject')
trn_onehot_encoded_quartile, val_onehot_encoded_quartile = onehot('quartile')

# Model
x = merge([u, a], mode='concat')
x = Flatten()(x)
x = merge([x, mt], mode='concat')
x = merge([x, mr], mode='concat')
x = merge([x, sub], mode='concat')
x = Dense(10, activation='relu')(x)
BatchNormalization()
x = Dense(3, activation='softmax')(x)
nn = Model([user_in, artifact_in, mt, mr, sub], x)
nn.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

nn.optimizer.lr = 0.001
nn.fit([trn.member_id, trn.artifact_id, trn_onehot_encoded_mt, trn_onehot_encoded_mr, trn_onehot_encoded_sub], trn_onehot_encoded_quartile, 
       batch_size=256, 
       epochs=2, 
       validation_data=([val.member_id, val.artifact_id, val_onehot_encoded_mt, val_onehot_encoded_mr, val_onehot_encoded_sub], val_onehot_encoded_quartile)
      )

Here’s the summary of the model:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
user_in (InputLayer)             (None, 1)             0                                            
____________________________________________________________________________________________________
artifact_in (InputLayer)         (None, 1)             0                                            
____________________________________________________________________________________________________
embedding_9 (Embedding)          (None, 1, 20)         5902380     user_in[0][0]                    
____________________________________________________________________________________________________
embedding_10 (Embedding)         (None, 1, 20)         594200      artifact_in[0][0]                
____________________________________________________________________________________________________
merge_25 (Merge)                 (None, 1, 40)         0           embedding_9[0][0]                
                                                                   embedding_10[0][0]               
____________________________________________________________________________________________________
flatten_7 (Flatten)              (None, 40)            0           merge_25[0][0]                   
____________________________________________________________________________________________________
input_13 (InputLayer)            (None, 31)            0                                            
____________________________________________________________________________________________________
merge_26 (Merge)                 (None, 71)            0           flatten_7[0][0]                  
                                                                   input_13[0][0]                   
____________________________________________________________________________________________________
input_14 (InputLayer)            (None, 1)             0                                            
____________________________________________________________________________________________________
merge_27 (Merge)                 (None, 72)            0           merge_26[0][0]                   
                                                                   input_14[0][0]                   
____________________________________________________________________________________________________
input_15 (InputLayer)            (None, 24)            0                                            
____________________________________________________________________________________________________
merge_28 (Merge)                 (None, 96)            0           merge_27[0][0]                   
                                                                   input_15[0][0]                   
____________________________________________________________________________________________________
dense_13 (Dense)                 (None, 10)            970         merge_28[0][0]                   
____________________________________________________________________________________________________
dense_14 (Dense)                 (None, 3)             33          dense_13[0][0]                   
====================================================================================================
Total params: 6,497,583
Trainable params: 6,497,583
Non-trainable params: 0
_____________________________

But on the fit statement, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-71-7de0782d7d5d> in <module>()
      5        batch_size=256,
      6        epochs=2,
----> 7        validation_data=([val.member_id, val.artifact_id, val_onehot_encoded_mt, val_onehot_encoded_mr, val_onehot_encoded_sub], val_onehot_encoded_quartile)
      8       )
      9 # nn.fit([trn.member_id, trn.artifact_id, trn_onehot_encoded_mt, trn_onehot_encoded_mr, trn_onehot_encoded_sub], trn.duration_new,

/home/prateek_dl/anaconda3/lib/python3.5/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1520             class_weight=class_weight,
   1521             check_batch_axis=False,
-> 1522             batch_size=batch_size)
   1523         # Prepare validation data.
   1524         do_validation = False

/home/prateek_dl/anaconda3/lib/python3.5/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size)
   1380                                     output_shapes,
   1381                                     check_batch_axis=False,
-> 1382                                     exception_prefix='target')
   1383         sample_weights = _standardize_sample_weights(sample_weight,
   1384                                                      self._feed_output_names)

/home/prateek_dl/anaconda3/lib/python3.5/site-packages/keras/engine/training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    142                             ' to have shape ' + str(shapes[i]) +
    143                             ' but got array with shape ' +
--> 144                             str(array.shape))
    145     return arrays
    146 

ValueError: Error when checking target: expected dense_14 to have shape (None, 1) but got array with shape (1956554, 3)

How do I resolve this error? Why is the final layer expecting (None,1) when according to the summary() it has to output (None,3)?

@jeremy - Am I doing something obviously stupid?
Any help would be greatly appreciated.
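
A note on the error above, offered as a likely explanation rather than a confirmed diagnosis: with loss='sparse_categorical_crossentropy', Keras expects each target to be a single integer class id, so it checks the target against shape (None, 1) even though the softmax layer outputs (None, 3). The targets passed to fit here are one-hot encoded with shape (n, 3). Two ways out would be to compile with loss='categorical_crossentropy' and keep the one-hot targets, or to keep the sparse loss and pass integer labels. The sketch below uses NumPy only; y_onehot and probs are small made-up arrays standing in for trn_onehot_encoded_quartile and the model's predictions.

```python
import numpy as np

# Hypothetical stand-in for trn_onehot_encoded_quartile: 4 samples, 3 classes.
y_onehot = np.array([[1, 0, 0],
                     [0, 0, 1],
                     [0, 1, 0],
                     [0, 0, 1]], dtype=np.float32)

# Integer class ids with shape (n, 1): the layout a sparse loss expects.
y_sparse = np.argmax(y_onehot, axis=1).reshape(-1, 1)

print(y_onehot.shape)  # (4, 3) -> pair with categorical_crossentropy
print(y_sparse.shape)  # (4, 1) -> pair with sparse_categorical_crossentropy

# Made-up softmax outputs, only to show that both encodings give the same loss.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4]])

# Cross-entropy computed from one-hot targets.
loss_categorical = -np.mean(np.sum(y_onehot * np.log(probs), axis=1))

# Cross-entropy computed from integer targets (pick the true-class probability).
rows = np.arange(len(probs))
loss_sparse = -np.mean(np.log(probs[rows, y_sparse.ravel()]))

print(np.isclose(loss_categorical, loss_sparse))  # True
```

Both encodings describe the same labels, so the choice only has to match the loss passed to compile.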

amil · December 24, 2017, 8:08am

Hello,
I have a question. In the first 20 minutes of the lesson, convolution was explained using an Excel sheet. The second-layer filter was a 3x3x2 tensor. Why was this the case? Where did the 2 come from? Is it simply the number of filters in the previous layer? What will be the size of such a tensor after max pooling?

bharadwaj · March 18, 2018, 9:22pm

Yes, it was the number of filters previously used.

Bit of background: each filter is good at highlighting a certain pattern (horizontal edges for one filter, vertical edges for another, and so on), so we add a few of those at the beginning to find the most basic patterns, like lines. This is why we find such basic pattern detectors in the initial layers of models such as ResNet.

In the sheet, Jeremy convolved two of those with the input, which is why the next tensor to be applied had to have a depth of 2. Note that Jeremy decided to have two such 3x3x2 tensors in the second convolution operation. This was a choice, and it determines the depth of the filters in the next convolution (if we need one).

When we do a pooling operation, we don’t convolve with any filter. Rather, we simply take small patches of the previous layer’s activations and apply some operation to them (taking the max of each patch is called “max pooling”). So we’re in a way reducing the height and width of the previous layer’s output, but the depth remains the same.
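
To make the shapes concrete, here is a rough NumPy sketch of the two convolution layers followed by max pooling. The 28x28 single-channel input, the random filter values, the lack of padding, and the 2x2 pooling window are all assumptions for illustration, not values taken from the spreadsheet. The helper names conv2d_valid and max_pool_2x2 are made up for this example.

```python
import numpy as np

def conv2d_valid(x, filters):
    """Naive convolution without padding (cross-correlation, as used in deep learning).

    x: (H, W, C_in); filters: (n_filters, kH, kW, C_in).
    Returns an array of shape (H - kH + 1, W - kW + 1, n_filters).
    """
    n_f, kh, kw, c_in = filters.shape
    assert x.shape[2] == c_in, "filter depth must match input depth"
    h_out, w_out = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h_out, w_out, n_f))
    for f in range(n_f):
        for i in range(h_out):
            for j in range(w_out):
                # Each filter spans the full depth of its input patch.
                out[i, j, f] = np.sum(x[i:i + kh, j:j + kw, :] * filters[f])
    return out

def max_pool_2x2(x):
    """2x2 max pooling with stride 2; the depth axis is left untouched."""
    h, w, c = x.shape
    x = x[:h - h % 2, :w - w % 2, :]  # drop an odd trailing row/column
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

rng = np.random.RandomState(0)
image = rng.rand(28, 28, 1)    # assumed single-channel input
layer1 = rng.rand(2, 3, 3, 1)  # two 3x3 filters -> output depth 2
layer2 = rng.rand(2, 3, 3, 2)  # two 3x3x2 filters: depth 2 matches layer1's filter count

a1 = conv2d_valid(image, layer1)
a2 = conv2d_valid(a1, layer2)
pooled = max_pool_2x2(a2)

print(a1.shape, a2.shape, pooled.shape)  # (26, 26, 2) (24, 24, 2) (12, 12, 2)
```

The last axis of each filter has to equal the number of filters in the layer before it, while pooling halves only the height and width.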

janfytan · March 26, 2018, 3:12pm

Hello,
I’m watching the 2017 Lesson 4 video (Lesson 4: Practical Deep Learning for Coders),
and I noticed that at about 15 minutes you mention:
“Interestingly, the author of Keras, last week or maybe the week before made the contention that perhaps it will turn out that CNNs are the architecture that will be used for every type of ordered data. This was just after one of the leading NLP researchers released a paper basically showing a state of the art result in NLP using CNN.”
I’m now looking for articles or papers about the research mentioned above.
Can you provide links to them?
Thanks.

Kirillino · February 13, 2020, 9:04pm

Hi,

Has anyone encountered a problem where the accuracy does not change at all when running the tabular learner with continuous data as the target?

I am using the same code as in Lesson 4.

I have tried everything, but with no success.