Are we able to download
I’m trying to follow lang_model-arxiv.ipynb to use my own dataset to build a language model and perform a classification task.
I also found this Torchtext guide, which uses a .tsv file format (and seems simpler for my task), but I’m not able to get it to work.
I got an error when running TextData.from_splits with all three of train, val, and test specified in the torch Dataset, seemingly because TextData.from_splits has a line that only expects to receive trn_iter,val_iter as a return value.
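If my reading is right, this is just plain Python tuple unpacking failing. Here is a minimal sketch of what I think is happening, assuming the underlying call returns one iterator per dataset passed in. Note that make_iters is a hypothetical stand-in I wrote for illustration, not the actual fastai or torchtext code:

```python
def make_iters(*datasets):
    """Hypothetical stand-in: returns one iterator per dataset passed in."""
    return tuple(iter(ds) for ds in datasets)


# Toy data, only here so the sketch runs on its own.
train, val, test = [1, 2], [3], [4]

# Two datasets unpack cleanly into two names.
trn_iter, val_iter = make_iters(train, val)

# Three datasets unpacked into two names raises ValueError.
try:
    trn_iter, val_iter = make_iters(train, val, test)
    unpack_error = None
except ValueError as err:
    unpack_error = str(err)

print(unpack_error)
```

If that is the cause, leaving out the test set would avoid the error, which matches what I saw.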
I worked around that by not specifying test data, but then ran into another issue when torchtext was sorting batches, so I suspect I’m not setting something up properly.
FWIW here’s a gist. The dataset is a very small one I made and not an interesting problem =). I just wanted to see if these new techniques we learned can work on a small dataset.