• utf-8
• windows
• unicode
• character-encoding

When you open a file in text (non-binary) mode without specifying an encoding, Python falls back to the system’s default character encoding. This works well on most modern platforms, where the default is UTF-8, but often fails on Windows, where the default system encoding is typically a legacy 8-bit code page such as Windows-1252.
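To see which encoding applies on a given machine, a quick check along these lines may help. This is a minimal sketch; the variable name is only illustrative, and the printed value depends on the platform and its settings.

```python
import locale

# The encoding that open() falls back to in text mode when no
# encoding= argument is given. The value is platform-dependent:
# often "UTF-8" on Linux and macOS, and a code page name such as
# "cp1252" on many Windows installations.
default_text_encoding = locale.getpreferredencoding(False)
print("Default text encoding:", default_text_encoding)
```

If this prints something other than UTF-8, any open() call without an explicit encoding will use that code page.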

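A typical symptom is a UnicodeDecodeError when Python reads a byte which is not defined in that code page (commonly, in the 0x80-0x9F range; Windows-1252, for example, leaves 0x81, 0x8D, 0x8F, 0x90, and 0x9D unassigned). Bytes which do happen to be defined in the code page decode without an error, but to the wrong characters.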
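The failure can be reproduced on any platform by decoding UTF-8 bytes with a Windows code page by hand. The sketch below assumes cp1252 as the legacy code page; the sample strings are made up for illustration.

```python
# UTF-8 bytes for a right curly quote followed by a word.
# The quote encodes to E2 80 9D, and 0x9D is unassigned in cp1252.
data = "”hello".encode("utf-8")

error = None
try:
    data.decode("cp1252")
except UnicodeDecodeError as exc:
    error = exc
    print("Decoding failed:", exc)

# Bytes that are all defined in cp1252 do not raise an error;
# they silently decode to the wrong characters instead.
mojibake = "café".encode("utf-8").decode("cp1252")
print(mojibake)
```

The first decode fails on the 0x9D byte, while the second succeeds and quietly produces garbled text, which is why the problem sometimes goes unnoticed until a file with an unlucky byte shows up.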

The fix is trivial: always pass encoding='utf-8' to open() or, more generally, declare the actual encoding of the file whenever you open a text file for reading (or writing, for that matter, though the symptoms there are different).
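As a minimal sketch of the fix, the following writes and reads a file with an explicit encoding, so the result does not depend on the system default. The file name and sample text are placeholders, and a temporary directory is used only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

text = "naïve “quotes” and café"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"

    # Declare the encoding explicitly when writing ...
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)

    # ... and again when reading, instead of relying on the default.
    with open(path, encoding="utf-8") as handle:
        round_trip = handle.read()

print(round_trip == text)
```

If the file was actually produced in some other encoding, pass that encoding's name instead of 'utf-8'.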

See also https://sopython.com/canon/132/unicodedecodeerror-utf-8-codec-cannot-decode/