Simple Model not converging

harveyslash · March 14, 2017, 10:22pm

I wanted to see whether an LSTM can predict if a name is female or male.

The dataset I am using is the names corpus (Index of /afs/cs/project/ai-repository/ai/areas/nlp/corpora/names).

My code is as follows:

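import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation, BatchNormalization
from keras.preprocessing import sequence

maxLen = 15  # names are padded/truncated to this length for keras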


"""
open the files , load them, and clean them
"""

# one name per line; [:-1] drops the empty string after the trailing newline
with open('../.datasets/genderNames/female.txt') as f:
    femaleArr = f.read().lower().split('\n')[:-1]
with open('../.datasets/genderNames/male.txt') as f:
    maleArr = f.read().lower().split('\n')[:-1]

characters = """abcdefghijklmnopqrstuvwxyz- '"""
print('femaleArrays length  '+str(len(femaleArr)))
print('maleArrays length  '+str(len(maleArr)))


"""
looks up the variable characters and converts each character 
to its index.
"""
def getIndices(name):
    indices = []
    for character in name:
        indices.append(characters.index(character)+1)
    return indices


male_x = [getIndices(x) for x in maleArr]

female_x = [getIndices(x) for x in femaleArr]

"""
at this point, male_x should be a list with one entry per male name;
each entry is the list of character indices that getIndices returned
for that name

"""

x = []
y = []
x.extend(male_x)
y = [0]*len(male_x)

x.extend(female_x)
y.extend( [1]*len(female_x))


print(len(x))
print(len(y))
print(y)
#0 is male, 1 is female



"""
adding a padding to X so that keras can accept batches
"""
X = sequence.pad_sequences(x,maxlen=maxLen)
# X = np.expand_dims(X,axis=2)
print(X.shape)

Y = np.array(y)
Y = Y.reshape((-1, 1))  # column vector, one label per name (7944 here)
print(Y.shape)





"""
The model. 
I have already tried removing Embedding, and it still 
does not work
"""
model = Sequential()
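# indices run from 1 to len(characters) and 0 is the padding value,
# so the embedding needs len(characters)+1 rows
model.add(Embedding(len(characters)+1,50))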
model.add(BatchNormalization())
model.add(LSTM(100))
model.add(BatchNormalization())
model.add(Dense(200,activation='relu'))
model.add(Dense(1))
model.add(Activation('softmax'))

model.compile(loss='binary_crossentropy', optimizer='adam')


model.fit(X, Y, batch_size=128, nb_epoch=100)

The loss is stuck at exactly

loss: 5.9061

Even after multiple epochs.

Is there something obvious that I am missing?
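
One thing that may be worth checking (a guess, not a confirmed diagnosis): softmax normalizes across the units of a layer, so when it is applied to the single unit of Dense(1) it returns 1.0 for any input. The sketch below illustrates this in plain Python without Keras. The helpers `softmax`, `sigmoid`, and `binary_crossentropy` are hypothetical names written for this illustration, and the `eps` clipping value is an assumption meant to mimic how frameworks avoid log(0).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logit):
    """Logistic function for a single logit."""
    return 1.0 / (1.0 + math.exp(-logit))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Binary cross-entropy for one sample; eps is an assumed clipping value."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Three very different pre-activation values from a one-unit output layer.
logits = [-3.0, 0.0, 2.5]

# Softmax over a one-element vector: the output no longer depends on the logit.
softmax_outputs = [softmax([z])[0] for z in logits]
# Sigmoid on the same logits: the output varies with the logit.
sigmoid_outputs = [sigmoid(z) for z in logits]

print("softmax over one unit:", softmax_outputs)
print("sigmoid:", sigmoid_outputs)

# With a constant prediction of 1.0, label 0 is always maximally wrong
# and label 1 is always (almost) perfectly right, whatever the weights are.
loss_male = binary_crossentropy(0, softmax_outputs[0])
loss_female = binary_crossentropy(1, softmax_outputs[0])
print("loss for label 0:", loss_male)
print("loss for label 1:", loss_female)
```

If this is what is happening here, the loss cannot move because the prediction does not depend on the weights, so no gradient reaches them. For a single-unit binary output paired with binary_crossentropy, Activation('sigmoid') would be the usual choice.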