[Snake] Yisi year, Jimao month, Wuzi day / 22nd day of the 2nd lunar month
Thursday March 20, 2025

Python

Since the release of Python 3 (3-Dec-2008), all text strings are Unicode by default. Because this can cause backward-compatibility problems with the 2.x branch, it is important to understand the impact. Please read the Python Unicode HOWTO for an in-depth understanding.

Make sure you define the encoding of the file in the header of the file (e.g. # -*- coding: utf-8 -*-), and save the file in the same encoding. Python 3 assumes UTF-8 source files by default, so the declaration is only strictly needed for Python 2 or for files saved in another encoding.
In Python 2 you can tell the compiler that the string that follows is Unicode by putting a u in front (e.g. u'This is a string'). From Python 3 onwards the default is Unicode, so the u prefix is unnecessary (Python 3.3 and later still accept it for compatibility), and you need to put a b in front (e.g. b'This string is in bytes') to declare a sequence of bytes.
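As a rough illustration of the prefixes described above, the following sketch assumes Python 3.3 or later and a source file saved as UTF-8. The variable names and sample strings are made up for the example.

```python
# -*- coding: utf-8 -*-
# Sketch only: assumes Python 3.3+ and that this file is saved as UTF-8.

text = '中文'            # plain literal: a Unicode text string (str)
legacy = u'中文'         # u prefix: accepted for Python 2 compatibility, same type
data = b'raw bytes'      # b prefix: a sequence of bytes (ASCII characters only)

print(type(text).__name__)    # str
print(type(data).__name__)    # bytes
print(text == legacy)         # True
print(len(text))              # 2 (counted in characters, not bytes)
print(data[0])                # 114 (indexing bytes gives an integer)
```

Text and bytes are separate types in Python 3, so mixing them (for example text + data) raises a TypeError instead of converting silently as Python 2 did.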

Use .encode() to turn a text string into bytes in a given encoding, and .decode() to turn bytes back into a text string. You can find a list of encodings here
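A minimal sketch of the round trip, assuming Python 3. The sample string and variable names are only for illustration; 'utf-8', 'gbk' and 'ascii' are standard codec names.

```python
# Sketch only: assumes Python 3.
text = '中文'

utf8_bytes = text.encode('utf-8')      # str -> bytes
print(utf8_bytes)                      # b'\xe4\xb8\xad\xe6\x96\x87'
print(utf8_bytes.decode('utf-8'))      # bytes -> str, prints 中文

# To move bytes from one encoding to another, decode first, then encode.
gbk_bytes = utf8_bytes.decode('utf-8').encode('gbk')
print(len(utf8_bytes), len(gbk_bytes))  # 6 4

# Decoding with the wrong codec fails (or yields garbage), so handle the error.
try:
    utf8_bytes.decode('ascii')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)                   # True
```

Both methods also take an errors argument (for example errors='replace') if substituting undecodable bytes is preferable to raising an exception.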

Links
Python (official page)
Python based Pinyin parser

