Maybe not exactly the same. I am running the lesson 10 NLP notebook. I can run all the exercises with no problems up to the creation of the language model and sampling from it, but when I try to build the classification model for sentiment analysis, creating the dls_clas DataBlock, I run into this problem.
Thank you, but it doesn't work. I believe the problem is not in displaying, but rather in building the dls_clas DataBlock, since training also fails.
The issue persists. I hacked together a function to display text without padding:
```python
from fastai.text.all import *

def show_batch_text(dls, max_n=10, ctxs=None, trunc_at=150, unpad=True, **kwargs):
    b = dls.one_batch()
    x, y, samples = dls._pre_show_batch(b, max_n=max_n)
    if ctxs is None: ctxs = get_empty_df(min(len(samples), max_n))
    # next line removes padding from the decoded text
    if unpad: samples = L((TitledStr(s[0].replace('xxpad', '').strip()), *s[1:]) for s in samples)
    if trunc_at is not None: samples = L((s[0].truncate(trunc_at), *s[1:]) for s in samples)
    for i in range_of(samples[0]):
        ctxs = [b.show(ctx=c) for b, c, _ in zip(samples.itemgot(i), ctxs, range(max_n))]
    display_df(pd.DataFrame(ctxs))
    return ctxs
```
It would be nice to add an option to remove padding from decoded text. I'm not sure where the best place to do it is; maybe in dls._decode_batch or in show_batch[TensorText] (similar to what I did above).
I wonder if I'm missing something and there is a reason not to do so?