[Solved] Issue with 'utf-8' codec when saving and/or loading np.ndarray

Hi,

I’ve found a bug that is preventing me from saving and/or loading numpy arrays.

Here’s the code I’ve used to reproduce it:

import numpy as np
from pathlib import Path

# Create the data directory if it does not exist yet
PATH = Path('data/')
PATH.mkdir(parents=True, exist_ok=True)

path = PATH/'test.npy'
a = np.random.rand(2, 2)
a

array([[0.0175015 , 0.83473786],
       [0.78010567, 0.11571356]])

Save works correctly:

np.save(path, a)

And load works as well:

np.load(path)

array([[0.0175015 , 0.83473786],
       [0.78010567, 0.11571356]])

But when I import fastai2:

from fastai2.imports import *
import fastai2, fastcore
print('fastai2 :', fastai2.__version__)
print('fastcore:', fastcore.__version__)

fastai2 : 0.0.17
fastcore: 0.1.17

Now load doesn’t work using the exact same path:

np.load(path)

produces this error:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-6-eefdf3b56709> in <module>
----> 1 np.load(path)

~/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
    434         _ZIP_SUFFIX = b'PK\x05\x06' # empty zip files start with this
    435         N = len(format.MAGIC_PREFIX)
--> 436         magic = fid.read(N)
    437         # If the file size is less than N, we need to make sure not
    438         # to seek past the beginning of the file

~/Documents/Machine_Learning/Jupyter_Notebooks/fastcore/fastcore/utils.py in read(self, size, encoding)
    474 def read(self:Path, size=-1, encoding='utf8'):
    475     "Read the content of `fname`"
--> 476     with self.open(encoding=encoding) as f: return f.read(size)
    477 
    478 # Cell

~/anaconda3/lib/python3.7/codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

Neither does save:

np.save(path, a)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-9cdb4d6f119b> in <module>
----> 1 np.save(path, a)

<__array_function__ internals> in save(*args, **kwargs)

~/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py in save(file, arr, allow_pickle, fix_imports)
    551         arr = np.asanyarray(arr)
    552         format.write_array(fid, arr, allow_pickle=allow_pickle,
--> 553                            pickle_kwargs=pickle_kwargs)
    554     finally:
    555         if own_fid:

~/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py in write_array(fp, array, version, allow_pickle, pickle_kwargs)
    658     """
    659     _check_version(version)
--> 660     _write_array_header(fp, header_data_from_array_1_0(array), version)
    661 
    662     if array.itemsize == 0:

~/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py in _write_array_header(fp, d, version)
    432     else:
    433         header = _wrap_header(header, version)
--> 434     fp.write(header)
    435 
    436 def write_array_header_1_0(fp, d):

~/Documents/Machine_Learning/Jupyter_Notebooks/fastcore/fastcore/utils.py in write(self, txt, encoding)
    481     "Write `txt` to `self`, creating directories as needed"
    482     self.parent.mkdir(parents=True,exist_ok=True)
--> 483     with self.open('w', encoding=encoding) as f: f.write(txt)
    484 
    485 # Cell

TypeError: write() argument must be str, not bytes

Please let me know if more details are required.

Just in case anybody has the same issue, I’ve found a simple workaround.
I still don’t understand why this issue occurs, but the solution is simple.

Instead of doing this:

np.load(path)

you should do this:

np.load(str(path))

At least it has worked for me.
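
A possible explanation, judging only from the tracebacks above: fastcore adds `read` and `write` methods to `Path`, and both open the file in text mode. If `np.load` and `np.save` treat any object that has a `read` or `write` attribute as an already-open file (which is what the tracebacks suggest, but is an assumption here), then a patched `Path` gets handled like a text file instead of being opened in binary mode. The sketch below reproduces that mechanism with the standard library only. `load_magic` is a hypothetical stand-in for the loader, and `text_read` only mimics the fastcore method shown in the traceback; neither is the real numpy or fastcore code.

```python
import tempfile
from pathlib import Path


def load_magic(file, n=6):
    """Hypothetical loader that duck-types its argument (an assumption about np.load)."""
    if hasattr(file, "read"):
        # Anything with a `read` attribute is treated as an open file object.
        return file.read(n)
    # Otherwise it is treated as a filename and opened in binary mode.
    with open(file, "rb") as fid:
        return fid.read(n)


def text_read(self, size=-1, encoding="utf8"):
    # Mimics the text-mode `read` visible in the traceback.
    with self.open(encoding=encoding) as f:
        return f.read(size)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test.bin"
    # 0x93 is the first byte of an .npy file and is not a valid UTF-8 start byte.
    path.write_bytes(b"\x93NUMPY\x01\x00")

    # Unpatched Path has no `read`, so the file is opened in binary mode.
    before = load_magic(path)

    Path.read = text_read  # simulate the patch
    try:
        try:
            load_magic(path)
            patched_error = None
        except UnicodeDecodeError as exc:
            patched_error = exc

        # Workaround 1: a plain string has no `read`, so binary mode is used again.
        via_str = load_magic(str(path))

        # Workaround 2: pass a file object that is already open in binary mode.
        with path.open("rb") as fid:
            via_handle = load_magic(fid)
    finally:
        del Path.read  # undo the simulated patch
```

If that reading is right, it also explains why `str(path)` helps: a string has neither `read` nor `write`, so numpy opens the file itself. Passing a file object opened with `open(path, 'rb')` to `np.load` (or `'wb'` to `np.save`) should work for the same reason.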
