SentencePiece: 'NoneType' object has no attribute 'EncodeAsPieces'

I have a stripped-down version of fastbook Chapter 10:

which runs without an issue. As soon as I swap the tok_func from spacy to SentencePieceTokenizer, learn.predict raises an error:

~/fastai/fastai2/fastai2/text/core.py in __call__(self, items)
    371 
    372     def __call__(self, items):
--> 373         for t in items: yield self.tok.EncodeAsPieces(t)
    374 
    375 # Cell

AttributeError: 'NoneType' object has no attribute 'EncodeAsPieces'

Here is the gist with just one difference.

I do see that self.tok is indeed None. Has anybody come across this before?
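To illustrate the failure mode, here is a minimal sketch in plain Python. It assumes the tokenizer wrapper is constructed without a loaded model, so its `tok` attribute is still None when it is called. `UnloadedTokenizer`, `FakeModel` and `reproduce` are hypothetical stand-ins written for this example; they are not fastai2's classes, and this is not a diagnosis of where fastai2 loses the model.

```python
class FakeModel:
    """Hypothetical model exposing the one method named in the traceback."""
    def EncodeAsPieces(self, text):
        # Whitespace split is only a placeholder for real subword pieces.
        return text.split()


class UnloadedTokenizer:
    """Hypothetical wrapper: `tok` stays None unless a model is passed in."""
    def __init__(self, model=None):
        self.tok = model

    def __call__(self, items):
        # Same shape as the line in the traceback: a generator, so the
        # error only surfaces once the result is iterated.
        for t in items:
            yield self.tok.EncodeAsPieces(t)


def reproduce(tokenizer, items):
    """Return the pieces, or the AttributeError message if `tok` is None."""
    try:
        return list(tokenizer(items))
    except AttributeError as e:
        return str(e)


# With no model loaded, iterating the generator raises the AttributeError.
broken = reproduce(UnloadedTokenizer(), ["hello world"])

# With a model attached, the same call succeeds.
working = reproduce(UnloadedTokenizer(FakeModel()), ["hello world"])
```

Because the call is a generator, nothing fails when the tokenizer is invoked; the error appears only when something downstream (here `list(...)`) consumes it, which may be why it shows up inside predict rather than at construction time.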

The commit of fastcore and fastai2 that I am running:
fastcore: 4a2d5ea702d0dc4a6c34c4acefafd9b494d9e222
fastai2: bf455de9bc37c76f7f92b3c43227ef9d4779b614

Quick check: is sentencepiece installed (via pip install)?
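One way to run that check from inside the notebook is sketched below, using only the standard library. `is_installed` is a helper name made up for this example; it only reports whether the module can be found on the import path, not which version is present.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True or False depending on the current environment.
print("sentencepiece installed:", is_installed("sentencepiece"))
```

If this prints False, the package is missing from the environment the notebook kernel is using, which can differ from the one where pip was run.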

Try this starter notebook. It is based on the wiki tutorial but uses SentencePiece instead: https://colab.research.google.com/drive/1xpue3b9DJNhUzRj5HHBdWpAB9SSsTTlS#scrollTo=W-lHBWZlklZJ

Yes!

sentencepiece             0.1.83                   pypi_0    pypi

The issue is somewhere in the predict method of the learner, so the starter notebook works fine for me.

Yes. I can reproduce the issue reported here in the same colab notebook.

[Edit] The issue has already been reported by @hiromi

Should be fixed now in master.
