'charmap' codec

Poltigo · January 18, 2021, 1:45am

Hi, while running the 01_intro notebook on my laptop (US English), I ran into this error:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 16-17: character maps to <undefined>

where do I fix it?

meanpenguin · January 19, 2021, 5:22pm

Probably need to post the section of code that is causing this error for anyone to help.

What version of Python are you using?

samb · February 8, 2021, 5:17pm

I just ran into the same error.
In my case I am running Jupyter Notebook on my local Windows machine.
From the error log I saw that there is some issue with encoding.

~\anaconda3\lib\pathlib.py in read_text(self, encoding, errors)
   1231         """
   1232         with self.open(mode='r', encoding=encoding, errors=errors) as f:
-> 1233             return f.read()
   1234 
   1235     def write_bytes(self, data):

~\anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24 
     25 class StreamWriter(Codec,codecs.StreamWriter):
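For anyone who wants to reproduce this outside the notebook, here is a minimal sketch of what the traceback above seems to be doing: text that was saved as UTF-8 is being pushed through the cp1252 codec. The sample string and the `attempt` helper are made up for illustration; they are not taken from the notebook or from fastai.

```python
# Hypothetical sample text. U+201D (right curly quote) encodes to the bytes
# E2 80 9D in UTF-8, and byte 0x9D has no mapping in Python's cp1252 table.
# U+2192 (right arrow) has no cp1252 representation at all.
text = "see \u201cnotes\u201d \u2192 done"
utf8_bytes = text.encode("utf-8")


def attempt(action):
    """Run action(); return its result, or a short description of the error."""
    try:
        return action()
    except (UnicodeDecodeError, UnicodeEncodeError) as exc:
        return f"{type(exc).__name__}: {exc.reason}"


# Reading UTF-8 bytes as cp1252 fails on the unmapped byte.
decode_cp1252 = attempt(lambda: utf8_bytes.decode("cp1252"))
# Writing the arrow with cp1252 fails because the character is not in the table.
encode_cp1252 = attempt(lambda: text.encode("cp1252"))
# The same bytes decode cleanly when the matching codec is used.
decode_utf8 = attempt(lambda: utf8_bytes.decode("utf-8"))

print(decode_cp1252)
print(encode_cp1252)
print(decode_utf8 == text)
```

Both the decode and the encode direction fail with the same kind of "charmap" message, which would explain why the first post shows a UnicodeEncodeError while this traceback fails while reading.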

A closer look into the signatures of the methods reveals the following:
def read_text(self, encoding=None, errors=None):
and
def write_text(self, data, encoding=None, errors=None):
in
“C:\Users\<user>\anaconda3\Lib\pathlib.py”

I was able to work around it temporarily by changing the default encoding of these methods in the “pathlib.py” file. Editing the standard library is not how the problem should really be tackled, but if you just want to get the notebook running you could apply the following:

def read_text(self, encoding='utf-8', errors=None):
and
def write_text(self, data, encoding='utf-8', errors=None):
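A less invasive variant, where you control the call yourself, is to pass the encoding explicitly instead of patching the defaults. The sketch below uses a made-up temporary file and sample string; it will not help when the `read_text` call sits inside a library you cannot change.

```python
import tempfile
from pathlib import Path

# Hypothetical sample text containing characters that cp1252 cannot encode.
text = "na\u00efve \u2192 caf\u00e9"

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"  # hypothetical file name
    # Passing encoding= overrides the locale default (cp1252 on many Windows setups).
    sample.write_text(text, encoding="utf-8")
    round_trip = sample.read_text(encoding="utf-8")

# Print only the comparison so the console encoding cannot get in the way.
print(round_trip == text)
```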

Maybe someone else has an explanation why this encoding issue is popping up.

Best regards

amrit · June 7, 2021, 8:15pm

I know this reply is quite late but posting anyway in case it helps anyone else reading in the future.

So I ran into a similar problem in notebook 10. To start, what the error is telling you is that the codec currently in use has no way to represent the characters at positions 16-17 of the text it is processing. So let’s check which encoding is currently being used with the following Python code:

import locale
print(locale.getpreferredencoding(False))

What you should get back (as I did, and as you probably will based on your error message) is cp1252, which is a legacy Windows encoding. To fix this we need to switch to the more widely adopted UTF-8, as you’ve shown. But how do we do this without modifying any library code? It turns out there is a PEP for exactly this (PEP 540, UTF-8 Mode). We simply need to run the following in the command line before starting Python or Jupyter from that same session, which tells Python to use UTF-8 by default wherever an encoding is not given explicitly:

set PYTHONUTF8=1

To confirm the fix worked, you can rerun the previous Python snippet, which should now print UTF-8.
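If you want a slightly fuller check, something like the sketch below may help. It assumes Python 3.7 or newer (where `sys.flags.utf8_mode` exists), and the `encoding_report` helper is just a name made up for this post.

```python
import locale
import sys


def encoding_report() -> dict:
    """Collect the encoding-related settings of the running interpreter."""
    return {
        # True when UTF-8 Mode is active (for example via PYTHONUTF8=1).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Default used by open() and Path.read_text() when no encoding is given.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }


report = encoding_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

Run it in a notebook cell after restarting Jupyter from the command prompt where the variable was set; if `utf8_mode` is still False, the kernel most likely did not inherit the variable.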

More information on this fix can also be found here.