
Python

Since the release of Python 3 (3-Dec-2008), all text strings are Unicode by default. Because this can cause backward compatibility problems with code written for the 2.x branch, it is important to understand the impact. Please read the Python Unicode HOWTO for an in-depth understanding.

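Declare the encoding of the source file in a comment on the first or second line of the file (e.g. # -*- coding: utf-8 -*-), and save the file in that same encoding. Python 3 assumes UTF-8 when no declaration is present, so the declaration matters mainly for Python 2 or for files saved in another encoding.
You can also tell the Python compiler that a string literal is Unicode by putting a u in front of it (e.g. u'This is a string'). This is only needed in Python 2; Python 3.3 and later still accept the prefix, but it has no effect. From Python 3 onwards string literals are Unicode by default, and you put a b in front (e.g. b'This string is in bytes') to declare a sequence of bytes.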
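
The difference between text and byte literals can be checked directly. The following is a minimal sketch for Python 3.3 or later; the variable names and the sample word are chosen here for illustration only.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch (Python 3.3+); names and sample text are assumptions.

text = '中文'         # plain literal: str, a sequence of Unicode code points
legacy = u'中文'      # u prefix is accepted for compatibility and changes nothing
raw = b'zhongwen'     # b prefix: bytes; the literal may contain ASCII only

print(type(text).__name__)         # str
print(type(raw).__name__)          # bytes
print(text == legacy)              # True
print(len(text))                   # 2 code points
print(len(text.encode('utf-8')))   # 6 bytes once encoded as UTF-8
```

Note that the length of a string counts characters, while the length of its encoded form counts bytes, so the two numbers differ for non-ASCII text.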

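Use .encode() to convert a string to bytes in a given encoding, and .decode() to convert bytes back to a string. The list of supported encodings is in the Standard Encodings section of the codecs module documentation.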
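
As a rough sketch of how the two methods fit together in Python 3, the example below encodes the same text in two encodings and converts bytes from one to the other. The helper transcode is a hypothetical name introduced for this example, not part of Python.

```python
# Illustrative sketch (Python 3); transcode() is a hypothetical helper.

def transcode(data, source, target):
    """Re-encode bytes: decode from the source encoding, encode to the target."""
    return data.decode(source).encode(target)

text = '中文'

utf8_bytes = text.encode('utf-8')   # str -> bytes
big5_bytes = text.encode('big5')    # same text, different byte sequence

print(utf8_bytes)                          # b'\xe4\xb8\xad\xe6\x96\x87'
print(utf8_bytes.decode('utf-8') == text)  # True
print(transcode(big5_bytes, 'big5', 'utf-8') == utf8_bytes)  # True

# Decoding with a codec that cannot represent the bytes raises an error.
try:
    utf8_bytes.decode('ascii')
except UnicodeDecodeError as exc:
    print('cannot decode as ASCII:', exc.reason)
```

Bytes carry no record of their encoding, so the source encoding passed to .decode() has to be known from elsewhere, for example from the file header or the protocol.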

Links
Python (official page)
Python based Pinyin parser

