Showing posts with label non-ascii. Show all posts
Sunday, March 11, 2012
Reading unicode data through Python
To read a file that contains Unicode characters, open it with the codecs module rather than the built-in open(), which in Python 2 returns raw byte strings without decoding them. The relevant code:
import codecs

# codecs.open decodes the file as UTF-8, so each line is a unicode string.
with codecs.open("file_with_unicode_data.txt", "r", "utf-8") as f:
    print(f.readlines())
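As a self-contained sketch of reading UTF-8 data with codecs.open, the snippet below first writes a small file and then reads it back. The file name and the sample text are made up for illustration; only codecs.open and the standard library are used.

```python
import codecs
import os
import tempfile

# Hypothetical sample text and file name, used only for illustration.
sample = u"caf\u00e9\nna\u00efve\n"
path = os.path.join(tempfile.mkdtemp(), "unicode_sample.txt")

# Write the text to disk encoded as UTF-8.
with codecs.open(path, "w", "utf-8") as out:
    out.write(sample)

# Read it back; codecs.open decodes each line to a unicode string.
with codecs.open(path, "r", "utf-8") as f:
    lines = f.readlines()

# Reading the raw bytes shows how the accented characters are stored on disk.
with open(path, "rb") as raw:
    raw_bytes = raw.read()
```

Here lines holds decoded text, while raw_bytes holds the encoded form, in which the accented e is stored as the two bytes '\xc3\xa9'.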
Thursday, July 7, 2011
Python Non-ASCII character '\xc3'
Running a Python script may fail with the following error:
SyntaxError: Non-ASCII character '\xc3' in file text2term_topia.py on line 21, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
Add the following declaration on the first or second line of the script, before all import statements.
# -*- coding: utf-8 -*-
The same fix applies when the error reports '\xe2' instead of '\xc3'.
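To find which scripts need the declaration before the interpreter complains, a check along the following lines may help. This is a minimal sketch: the helper names has_declaration and non_ascii_lines and the two sample sources are hypothetical, and the regular expression is the one given in PEP 263 for the first two lines of a file.

```python
import re

# Pattern from PEP 263 for the encoding declaration comment.
CODING_RE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def has_declaration(source):
    """Return True if one of the first two lines declares an encoding."""
    first_two = source.splitlines()[:2]
    return any(CODING_RE.match(line) for line in first_two)


def non_ascii_lines(source):
    """Return the 1-based numbers of lines containing bytes above 0x7F."""
    return [number
            for number, line in enumerate(source.splitlines(), 1)
            if any(byte > 0x7F for byte in bytearray(line))]


# Hypothetical script contents: one without the declaration, one with it.
missing = b"import sys\nname = '\xc3\xa9'\n"
fixed = b"# -*- coding: utf-8 -*-\n" + missing

needs_fix = bool(non_ascii_lines(missing)) and not has_declaration(missing)
```

A script is a candidate for the fix when it contains non-ASCII bytes and has no declaration, which is what needs_fix records for the first sample.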