Hi all,
I have been experimenting with ULMFiT these days. It works fine with the fastai repo and the provided scripts. However, when I build ULMFiT on the latest version of PyTorch (0.4.0), the fine-tuning and train_clas.py scripts run fine, but I ran into a problem with prediction.
I have 412 test cases. On PyTorch 0.4, the first element of the tuple returned by learner.predict_with_targs()
has length 3296 and looks like this: [ 20 48 107 … 200 4 63]
The exact same code run with PyTorch 0.3.1 (the version pinned in the fastai repo) returns a 412-length ndarray, something like the below, which is what I want:
[[ 1.00275 -1.00087]
[ 1.25688 -1.56693]
[ 2.83312 -2.40475]
[ 0.38864 -0.38543]
[-4.41717 3.6363 ]
[ 3.51488 -3.07165]
[ 2.77479 -2.56887]
[ 2.07361 -1.77178]
[ 2.4589 -2.47992]
[ 2.2935 -2.06768]
[ 3.62962 -3.14912]
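For reference, this is the kind of shape check I used to compare the two runs (a minimal sketch; learner is the RNN_Learner built in the code further down):

preds_dist, preds = learner.predict_with_targs()
print(np.asarray(preds_dist).shape)
# PyTorch 0.3.1: (412, 2) -- one row of class scores per test case
# PyTorch 0.4.0: a flat array of length 3296
# For what it's worth, 3296 == 412 * 2 * 4, i.e. n_cases * n_class * BATCH_SIZE,
# though that may just be a coincidence.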
I have also tried learner.predict(is_test=True),
but I still get this weird prediction result. Everything is the same (including the code and the model) except the PyTorch version, so I am pretty sure the problem is version-related. I have tried digging into the fastai source code, but didn't find anything useful… Any clues? Thanks!
The prediction code is as below:
# imports (fastai 0.7-style, as in the ULMFiT scripts; PATH and model_name are defined earlier in my script)
from functools import partial
from os.path import join
import pickle
import numpy as np
import torch
from torch import optim
from fastai.text import *  # TextDataset, SortSampler, DataLoader, TextData, get_rnn_classifer, ...

# load vocabulary lookup
itos = pickle.load(open(join(PATH, 'tmp/itos.pkl'), 'rb'))
vs = len(itos)
# load data
test_data = np.load(join(PATH, "tmp/tst_ids.npy"))
test_data = np.squeeze(test_data)
test_lbls = np.load(join(PATH, "tmp/lbl_tst.npy"))
test_lbls = np.squeeze(test_lbls)
c = int(test_lbls.max()) + 1  # number of classes
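# (quick sanity check, optional) I expect 412 rows and c == 2 here:
# print(test_data.shape, test_lbls.shape, c)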
# build a TextDataset
test_dataset = TextDataset(test_data, test_lbls)
# build a SortSampler (batches sequences of similar length together)
BATCH_SIZE = 4
test_samp = SortSampler(test_data, key=lambda x: len(test_data[x]))
# build a DataLoader; pass the sampler (note: predictions then come back in the sampler's sorted order)
test_loader = DataLoader(test_dataset, BATCH_SIZE, transpose=True, num_workers=1, pad_idx=1, sampler=test_samp)
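# (hypothetical sanity check) with transpose=True each batch should come back as (seq_len, BATCH_SIZE):
# xb, yb = next(iter(test_loader)); print(xb.size(), yb.size())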
# build a TextData instance (args: path, trn_dl, val_dl, test_dl)
md = TextData(PATH, None, None, test_loader)
# build the classifier (exactly as it was in train_clas.py)
opt_fn = partial(optim.Adam, betas=(0.8, 0.99))
bptt = 70 # backpropagation through time
em_sz = 400 # size of embeddings
nh = 1150 # size of hidden
nl = 3 # number of layers
dps = np.array([0.4,0.5,0.05,0.3,0.4])
model = get_rnn_classifer(
    bptt=bptt,
    max_seq=20*70,  # upper bound on how many tokens the encoder keeps outputs for
    n_class=c,
    n_tok=vs,
    emb_sz=em_sz,
    n_hid=nh,
    n_layers=nl,
    pad_token=1,
    layers=[
        em_sz*3,  # the final LSTM layer outputs em_sz units, and concat pooling
                  # (last hidden state + max pool + mean pool) triples that to 1200;
                  # nh=1150 is only the hidden size of the intermediate LSTM layers
        50,       # hidden size of the classifier head, as in train_clas.py
        c         # number of classes
    ],
    drops=[dps[4], 0.1],
    dropouti=dps[0],
    wdrop=dps[1],
    dropoute=dps[2],
    dropouth=dps[3]
)
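# (optional check, assuming fastai 0.7's SequentialRNN layout) the classifier head
# should show linear layers of sizes 1200 -> 50 -> c:
# print(model[1])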
model.eval()  # eval mode *disables* dropout; note the parentheses -- a bare `model.eval` is a no-op
# build an RNN_Learner
learner = RNN_Learner(
    data=md,
    models=TextModel(to_gpu(model)),  # RNN_Learner expects the module wrapped in a TextModel, which supplies the layer groups
    opt_fn=opt_fn
)
learner.model.eval()  # again with parentheses, so the wrapped model is really in eval mode
learner.load(model_name)  # loads models/{model_name}.h5 itself, so a separate torch.load() is not needed
preds_dist, preds = learner.predict_with_targs()
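In case it helps with debugging, a manual evaluation loop like the one below (a minimal sketch of what fastai's predict_with_targs() does internally; VV and to_np are fastai 0.7 helpers, and the [0] assumes the classifier returns its class scores as the first element of its output tuple, as fastai's PoolingLinearClassifier does) should reproduce the same numbers:

m = learner.model
m.eval()
m.reset()  # clear the RNN hidden state before iterating
all_preds, all_targs = [], []
for x, y in test_loader:
    out = m(VV(x))[0]             # VV wraps the batch as a no-grad/volatile Variable
    all_preds.append(to_np(out))
    all_targs.append(to_np(y))
preds_dist = np.concatenate(all_preds)
targs = np.concatenate(all_targs)
print(preds_dist.shape)           # I would expect (412, 2) on both PyTorch versions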