Thanks @harveyslash,
I do believe the GPU is being used, since nvidia-smi shows an increase in memory usage and lists the PID of the Python process.
The problem, I believe, is that it's being under-used.
I’m using the AWS P2 instance. CPU usage does spike to 100% on a single core, so that may be a bottleneck - I’m open to trying a bigger P* instance if that would help. Memory usage is around 50%, but that’s mostly from reading the 10GB csv.
I believe the low GPU usage may be due to the architecture of the model since it is shallow (it’s just an embedding layer - I’m trying to fit the collaborative filtering example from lesson 4 but on new data).
I’ve got a lot of data (1.3 billion examples) but I’m sampling down to just 300m (which I’d rather avoid if possible, but I think I’d need a higher-memory instance - the P2 only has 60GB). I’ve tried various combinations of batch size and learning rate, and training is slow and not improving much. My hope is that’s because I’m under-utilizing the GPU somehow, and that I can fiddle with the batch size or something to increase GPU utilization - but I’m worried that if I increase it too much that’s bad?
Details of the model below:
from keras.layers import Input, Embedding, Flatten, merge
from keras.models import Model
from keras.optimizers import Adam
from keras.regularizers import l2

n_factors = 8
reg_strength = 1e-4
batch_size = 2**18  # 262144

# One L2-regularized embedding per user and per target
user_in = Input(shape=(1,), dtype='int64', name='user_in')
u = Embedding(n_users, n_factors, input_length=1, W_regularizer=l2(reg_strength))(user_in)

target_in = Input(shape=(1,), dtype='int64', name='target_in')
m = Embedding(n_targets, n_factors, input_length=1, W_regularizer=l2(reg_strength))(target_in)

# Dot product of the two embeddings -> (None, 1, 1), then flatten to (None, 1)
x = merge([u, m], mode='dot')
x = Flatten()(x)

model = Model([user_in, target_in], x)
model.compile(Adam(0.001), loss='binary_crossentropy', metrics=['accuracy'])
and here are the shapes of the inputs:
n_users, n_targets # (7161769, 4503329)
trn.shape # (240001071, 3)
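As a sanity check on those shapes, the parameter count is just the two embedding tables - the dot-product model has no other trainable weights:

```python
# Each embedding table has (vocabulary size x n_factors) weights
n_users, n_targets, n_factors = 7161769, 4503329, 8

user_params = n_users * n_factors      # 57294152
target_params = n_targets * n_factors  # 36026632
total_params = user_params + target_params

print(total_params)  # 93320784, matching the model summary
```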
model summary below:
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
user_in (InputLayer) (None, 1) 0
____________________________________________________________________________________________________
target_in (InputLayer) (None, 1) 0
____________________________________________________________________________________________________
embedding_1 (Embedding) (None, 1, 8) 57294152 user_in[0][0]
____________________________________________________________________________________________________
embedding_2 (Embedding) (None, 1, 8) 36026632 target_in[0][0]
____________________________________________________________________________________________________
merge_1 (Merge) (None, 1, 1) 0 embedding_1[0][0]
embedding_2[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 1) 0 merge_1[0][0]
====================================================================================================
Total params: 93320784
____________________________________________________________________________________________________
Thanks!