text / binary io
binary (bytes / bytearray)
text (unicode, py3 str)

python2 open
open will always be in binary mode.
'wb' or 'rb' as the mode is superfluous (but good for documenting intent!).
unicode may be implicitly converted by the ASCII codec.

python2 open
>>> with open('f.txt', 'w') as f:
...     f.write(b'hello world!')
...     f.write(u'unicode ascii text')
...
>>> with open('f.txt') as f:
...     print(type(f.read()) is bytes)
...
True
>>> with open('f.txt', 'w') as f:
...     f.write(u'☃')
...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2603' in position 0: ordinal not in range(128)
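
The implicit conversion above behaves like an explicit encode with the ASCII codec. A minimal sketch of that equivalence, written so it also runs on python3 (the variable names are illustrative and not from the original):

```python
# Sketch: what the implicit unicode -> bytes conversion amounts to.
ascii_text = u'unicode ascii text'
snowman = u'\u2603'

# Pure-ASCII text encodes cleanly, which is why the first write succeeded.
encoded = ascii_text.encode('ascii')

# Non-ASCII text fails with the same error type as the last write above.
try:
    snowman.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc

# Choosing an encoding explicitly avoids the failure.
utf8_bytes = snowman.encode('UTF-8')
```

Encoding explicitly before writing keeps the result independent of the default codec.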
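
python2 open
The C API can set an encoding on a file object: PyFile_SetEncoding / PyFile_SetEncodingAndErrors.

python2 stdout / stderr / print
stdout / stderr are just file objects too.
PyFile_SetEncodingAndErrors is called for them, but only when they are ttys.

python2 stdout / stderr / print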
>>> import sys
>>> sys.stdout.write(u'☃\n')
☃
>>> sys.stderr.write(u'☃\n')
☃
>>> print(u'☃')
☃

python2 stdout / stderr / print
$ LANG=C python -c 'print(u"\u2603")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2603' in position 0: ordinal not in range(128)

python2 stdout / stderr / print
$ LANG=en_US.UTF-8 python -c 'print(u"\u2603")'
☃
$ LANG=en_US.UTF-8 python -c 'print(u"\u2603")' | cat
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2603' in position 0: ordinal not in range(128)
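
python2 stdout / stderr / print
The reliable way to use stdout / stderr / print is with bytes.

python2 cStringIO / StringIO
cStringIO.StringIO - a binary stream (similar to python2 open).
StringIO.StringIO - accepts both bytes and unicode, but mixing them only works for bytes the ASCII encoding can convert (it is also slower than cStringIO as it is implemented in pure python).

python2 cStringIO / StringIO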
>>> import StringIO
>>> x = StringIO.StringIO()
>>> x.write(b'\xe2\x98\x83')
>>> x.write(u'hi')
>>> x.getvalue()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

python3 open / io module
open in python3 is just io.open.
io.open(..., 'rb') or io.open(..., 'wb') produce a binary io object.
io.open otherwise returns an io.TextIOWrapper.
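
The split between binary objects and io.TextIOWrapper can be checked directly. A small sketch under python3, using a scratch file in a temporary directory (the path and variable names are assumptions for illustration):

```python
import io
import os
import tempfile

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'f.txt')

# A 'b' in the mode yields a binary (buffered) object.
with open(path, 'wb') as binary_file:
    binary_type = type(binary_file)

# Without 'b', the result is a text wrapper around a binary object.
with open(path, 'w') as text_file:
    text_type = type(text_file)

# In python3 the builtin and io.open are the same function.
is_same_open = open is io.open
```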

python3 open
>>> with open('f', 'wb') as f:
...     f.write('hi')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: a bytes-like object is required, not 'str'

python3 open
>>> with open('f', 'w') as f:
...     f.write(b'hi')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: write() argument must be str, not bytes
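
Both TypeErrors above go away once the conversion happens explicitly at the boundary. A minimal python3 sketch, assuming UTF-8 is the desired encoding and using a hypothetical scratch file:

```python
import os
import tempfile

# Hypothetical scratch file for the round trip.
path = os.path.join(tempfile.mkdtemp(), 'f')

# Binary mode: encode the text to bytes before writing.
with open(path, 'wb') as f:
    f.write('hi \u2603'.encode('UTF-8'))

# Binary mode: read bytes back, then decode them to text.
with open(path, 'rb') as f:
    raw = f.read()
text = raw.decode('UTF-8')
```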

python3 open
An encoding= keyword argument may be passed to change what encoding the TextIOWrapper uses to read and write.
If encoding= is not passed, the encoding is determined using locale.getpreferredencoding().
This is usually controlled by the LANG environment variable.

python3 stdout / stderr
stdout / stderr are TextIOWrappers.
Their .buffer attribute is the underlying binary stream.

python3 print
print in python3 will write as if writing to a text stream.
It calls str() on arguments if necessary.
>>> print(b'foo')
b'foo'
>>> print('hi')
hi
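
The encoding= keyword for open described above can be sketched as follows. This assumes python3 and a hypothetical scratch file; the value of default_encoding depends on the environment, so it is only looked up, not relied on:

```python
import locale
import os
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), 'f.txt')

# What open() would fall back to when encoding= is omitted.
default_encoding = locale.getpreferredencoding()

# Passing encoding= makes the bytes on disk independent of the locale.
with open(path, 'w', encoding='UTF-8') as f:
    f.write('\u2603')

# Reading in binary mode shows the encoded bytes.
with open(path, 'rb') as f:
    raw = f.read()

# Reading in text mode with the same encoding restores the text.
with open(path, encoding='UTF-8') as f:
    text = f.read()
```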

python3 stdout / stderr / print
$ LANG=C python3 -c 'print("\u2603")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u2603' in position 0: ordinal not in range(128)
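
The failure above comes from the text wrapper encoding with ASCII. It can be reproduced without touching the environment by pinning the encoding on a wrapper around an in-memory stream. This is only a sketch; raw and ascii_stream are illustrative names standing in for the real stdout:

```python
import io

# In-memory stand-in for the binary stream underneath stdout.
raw = io.BytesIO()

# A text wrapper pinned to ASCII, like stdout under an ASCII locale.
ascii_stream = io.TextIOWrapper(raw, encoding='ascii')

try:
    ascii_stream.write('\u2603')
    ascii_stream.flush()
except UnicodeEncodeError as exc:
    error = exc
```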

Call .flush() or use print(..., flush=True) so output shows immediately.

Use io.BytesIO for a binary in memory file-like object (bytes).
Use io.StringIO for a text in memory file-like object.
The io module is included in python2.6+.
Replace open calls with io.open.
Replace cStringIO / StringIO with either io.BytesIO or io.StringIO.
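
A short sketch of the two in-memory replacements listed above. Unlike python2's StringIO.StringIO, neither accepts the other's type, so mixing fails at the write instead of later (variable names are illustrative):

```python
import io

# Binary in-memory file: accepts bytes only.
binary_file = io.BytesIO()
binary_file.write(b'\xe2\x98\x83')

# Text in-memory file: accepts text only.
text_file = io.StringIO()
text_file.write(u'\u2603')

# Mixing types is rejected immediately with a TypeError.
try:
    text_file.write(b'hi')
except TypeError as exc:
    mixing_error = exc
```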
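
Writing bytes to stdout:
import sys

PY2 = sys.version_info[0] == 2
# python3 text streams expose the underlying binary stream as .buffer
if PY2:
    stdout_binary = sys.stdout
else:
    stdout_binary = sys.stdout.buffer
stdout_binary.write(b'\xe2\x98\x83\n')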
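
Writing text to stdout: in python2, io.TextIOWrapper does not work with the stdio streams; use codecs.getwriter instead!
from __future__ import print_function
import codecs
import locale
import sys

PY2 = sys.version_info[0] == 2
if PY2:
    stdout_text = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)
else:
    stdout_text = sys.stdout
stdout_text.write(u'☃\n')
print(u'☃', file=stdout_text)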
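
Writing UTF-8 text to stdout regardless of LANG:
from __future__ import print_function
import codecs
import io
import sys

PY2 = sys.version_info[0] == 2
if PY2:
    stdout_text = codecs.getwriter('UTF-8')(sys.stdout)
else:
    stdout_text = io.TextIOWrapper(sys.stdout.buffer, encoding='UTF-8')
stdout_text.write(u'☃\n')
print(u'☃', file=stdout_text)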
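
To see what the python3 branch above puts on the wire, the same wrapper can be pointed at an in-memory binary stream. A sketch assuming python3; fake_stdout_buffer is a hypothetical stand-in for sys.stdout.buffer, and newline='\n' is passed only to keep the bytes identical across platforms:

```python
import io

# Hypothetical stand-in for sys.stdout.buffer.
fake_stdout_buffer = io.BytesIO()

# Same construction as above, with newline translation disabled.
stdout_text = io.TextIOWrapper(
    fake_stdout_buffer, encoding='UTF-8', newline='\n',
)
stdout_text.write('\u2603\n')
print('\u2603', file=stdout_text)

# The wrapper buffers text, so flush before inspecting the bytes.
stdout_text.flush()
written = fake_stdout_buffer.getvalue()
```

The flush matters with the real stdout as well: text written through a second wrapper may otherwise appear out of order relative to ordinary print calls.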