Hello,
As part of a project, I’m working on the Mercari Price Suggestion Challenge, where the objective is to predict the price of a product given information about it. While the given details also include categorical variables, for now I’m concentrating only on the name and item_description columns, which contain free text (I will bring in the other variables in the next stage). Since the goal is to predict a price, this is a regression problem.
What I have done:
- Trained a LM on this data using fastai’s API, by fine-tuning the pre-trained WT103 LM
- Used only the name and item_description columns as the data
- Created a dummy “classification” task by adding a label column with random binary labels, and tested it by following the docs’ imdb_sample example to make sure the API works on this data (and it does!). A rough sketch of what I ran is below.
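For reference, this is roughly the code for that sanity check, following the imdb_sample docs (paths simplified; I concatenate name and item_description into a single text column beforehand, and exact function names may differ slightly across fastai v1 point releases):

```python
from fastai import *
from fastai.text import *

path = Path('data/price-pred')

# Fine-tune the pre-trained WT103 language model on the Mercari text
data_lm = TextLMDataBunch.from_csv(path)
learn = RNNLearner.language_model(data_lm, pretrained_model=URLs.WT103)
learn.fit_one_cycle(1, 1e-2)
learn.save_encoder('ft_enc')

# Sanity check: dummy binary "classification" on the same text, using the
# random label column, to confirm the pipeline runs end to end
data_clas = TextClasDataBunch.from_csv(path, vocab=data_lm.train_ds.vocab)
learn = RNNLearner.classifier(data_clas)
learn.load_encoder('ft_enc')
learn.fit_one_cycle(1, 1e-2)
```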
The next step is to actually set up and solve the regression problem. This entails the following:
- Create a DataBunch, similar to TextClasDataBunch, that formats the data such that each batch contains price values in the y variable corresponding to each x
- Create an RNNLearner for regression, similar to RNNLearner.classifier
- Set up a new loss function for the learner, namely the root mean squared log error (RMSLE) specified in the competition’s evaluation (sketched below)
- Replace the “head” of the model with a layer that outputs a single value. In particular, the current model architecture is:
```
RNNLearner(data=<fastai.text.data.TextClasDataBunch object at 0x7fd1a390f9e8>, model=SequentialRNN(
  (0): MultiBatchRNNCore(
    (encoder): Embedding(60093, 400, padding_idx=1)
    (encoder_dp): EmbeddingDropout(
      (emb): Embedding(60093, 400, padding_idx=1)
    )
    (rnns): ModuleList(
      (0): WeightDropout(
        (module): LSTM(400, 1150)
      )
      (1): WeightDropout(
        (module): LSTM(1150, 1150)
      )
      (2): WeightDropout(
        (module): LSTM(1150, 400)
      )
    )
    (input_dp): RNNDropout()
    (hidden_dps): ModuleList(
      (0): RNNDropout()
      (1): RNNDropout()
      (2): RNNDropout()
    )
  )
  (1): PoolingLinearClassifier(
    (layers): Sequential(
      (0): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Dropout(p=0.2)
      (2): Linear(in_features=1200, out_features=50, bias=True)
      (3): ReLU(inplace)
      (4): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (5): Dropout(p=0.1)
      (6): Linear(in_features=50, out_features=2, bias=True)
    )
  )
), opt_func=functools.partial(<class 'torch.optim.adam.Adam'>, betas=(0.9, 0.99)), loss_func=<function cross_entropy at 0x7fd34695ce18>, metrics=[<function accuracy at 0x7fd340669488>], true_wd=True, bn_wd=True, wd=0.01, train_bn=True, path=PosixPath('data/price-pred'), model_dir='models', callback_fns=[<class 'fastai.basic_train.Recorder'>], callbacks=[RNNTrainer(..., bptt=70, alpha=2.0, beta=1.0, adjust=False)], layer_groups=[...])
```
I’m thinking that I need to replace Linear(in_features=50, out_features=2, bias=True) with Linear(in_features=50, out_features=1, bias=True), or something similar.
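In code, I imagine the last two steps looking roughly like this (an untested sketch: I’m assuming the head is reachable as learn.model[1].layers, and that data_clas was somehow built with the price column as its label, which is exactly the part I don’t know how to do cleanly):

```python
import torch
import torch.nn as nn

def rmsle(preds, targs):
    "Root mean squared log error, the competition's evaluation metric."
    preds, targs = preds.view(-1), targs.view(-1).float()
    # Note: log1p of a negative prediction is undefined, so in practice it
    # may be safer to train on log1p(price) targets with a plain RMSE loss.
    return torch.sqrt(((torch.log1p(preds) - torch.log1p(targs)) ** 2).mean())

# Build the classifier-style learner, then adapt it for regression
learn = RNNLearner.classifier(data_clas)  # data_clas with price as the label?
learn.load_encoder('ft_enc')

# 1) Swap the 2-class output layer for a single-output layer
device = next(learn.model.parameters()).device
learn.model[1].layers[6] = nn.Linear(50, 1).to(device)

# 2) Use RMSLE instead of cross-entropy, and drop the accuracy metric
learn.loss_func = rmsle
learn.metrics = []
```

The layer swap itself looks straightforward; getting float targets through the TextClasDataBunch pipeline is the part I can’t see how to do with the current API.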
There is a thread from this year’s part 2 course that discusses a similar problem. However, I didn’t really understand how to apply that approach in the v1 library.
I’m looking for the easiest way to implement this, and would be grateful for any pointers on how to use or tweak the library to achieve my objective.
Thanks.