Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32236: Issue RuntimeWarning if buffering=1 for open() in binary mode #4842

Merged
merged 12 commits into from
Oct 20, 2018
6 changes: 3 additions & 3 deletions Doc/library/codecs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,7 @@ recommended approach for working with encoded text files, this module
provides additional utility functions and classes that allow the use of a
wider range of codecs when working with binary files:

.. function:: open(filename, mode='r', encoding=None, errors='strict', buffering=1)
.. function:: open(filename, mode='r', encoding=None, errors='strict', buffering=-1)

Open an encoded file using the given *mode* and return an instance of
:class:`StreamReaderWriter`, providing transparent encoding/decoding.
Expand All @@ -194,8 +194,8 @@ wider range of codecs when working with binary files:
*errors* may be given to define the error handling. It defaults to ``'strict'``
which causes a :exc:`ValueError` to be raised in case an encoding error occurs.

*buffering* has the same meaning as for the built-in :func:`open` function. It
defaults to line buffered.
*buffering* has the same meaning as for the built-in :func:`open` function.
It defaults to -1 which means that the default buffer size will be used.


.. function:: EncodedFile(file, data_encoding, file_encoding=None, errors='strict')
Expand Down
5 changes: 5 additions & 0 deletions Lib/_pyio.py
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,11 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
raise ValueError("binary mode doesn't take an errors argument")
if binary and newline is not None:
raise ValueError("binary mode doesn't take a newline argument")
if binary and buffering == 1:
import warnings
warnings.warn("line buffering (buffering=1) isn't supported in binary "
"mode, the default buffer size will be used",
RuntimeWarning, 2)
raw = FileIO(file,
(creating and "x" or "") +
(reading and "r" or "") +
Expand Down
5 changes: 3 additions & 2 deletions Lib/codecs.py
Original file line number Diff line number Diff line change
Expand Up @@ -862,7 +862,7 @@ def __exit__(self, type, value, tb):

### Shortcuts

def open(filename, mode='r', encoding=None, errors='strict', buffering=1):
def open(filename, mode='r', encoding=None, errors='strict', buffering=-1):

""" Open an encoded file using the given mode and return
a wrapped version providing transparent encoding/decoding.
Expand All @@ -883,7 +883,8 @@ def open(filename, mode='r', encoding=None, errors='strict', buffering=1):
encoding error occurs.

buffering has the same meaning as for the builtin open() API.
It defaults to line buffered.
It defaults to -1 which means that the default buffer size will
be used.

The returned wrapped file object provides an extra attribute
.encoding which allows querying the used encoding. This
Expand Down
11 changes: 10 additions & 1 deletion Lib/subprocess.py
Original file line number Diff line number Diff line change
Expand Up @@ -721,12 +721,21 @@ def __init__(self, args, bufsize=-1, executable=None,

self._closed_child_pipe_fds = False

if self.text_mode:
if bufsize == 1:
line_buffering = True
# Use the default buffer size for the underlying binary streams
# since they don't support line buffering.
bufsize = -1
else:
line_buffering = False

try:
if p2cwrite != -1:
self.stdin = io.open(p2cwrite, 'wb', bufsize)
if self.text_mode:
self.stdin = io.TextIOWrapper(self.stdin, write_through=True,
line_buffering=(bufsize == 1),
line_buffering=line_buffering,
encoding=encoding, errors=errors)
if c2pread != -1:
self.stdout = io.open(c2pread, 'rb', bufsize)
Expand Down
32 changes: 27 additions & 5 deletions Lib/test/support/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,8 @@
# threads
"threading_setup", "threading_cleanup", "reap_threads", "start_threads",
# miscellaneous
"check_warnings", "check_no_resource_warning", "EnvironmentVarGuard",
"check_warnings", "check_no_resource_warning", "check_no_warnings",
"EnvironmentVarGuard",
"run_with_locale", "swap_item",
"swap_attr", "Matcher", "set_memlimit", "SuppressCrashReport", "sortdict",
"run_with_tz", "PGO", "missing_compiler_executable", "fd_count",
Expand Down Expand Up @@ -1208,6 +1209,30 @@ def check_warnings(*filters, **kwargs):
return _filterwarnings(filters, quiet)


@contextlib.contextmanager
def check_no_warnings(testcase, message='', category=Warning, force_gc=False):
"""Context manager to check that no warnings are emitted.

This context manager enables a given warning within its scope
and checks that no warnings are emitted even with that warning
enabled.

If force_gc is True, a garbage collection is attempted before checking
for warnings. This may help to catch warnings emitted when objects
are deleted, such as ResourceWarning.

Other keyword arguments are passed to warnings.filterwarnings().
"""
with warnings.catch_warnings(record=True) as warns:
warnings.filterwarnings('always',
message=message,
category=category)
yield
if force_gc:
gc_collect()
testcase.assertEqual(warns, [])


@contextlib.contextmanager
def check_no_resource_warning(testcase):
"""Context manager to check that no ResourceWarning is emitted.
Expand All @@ -1222,11 +1247,8 @@ def check_no_resource_warning(testcase):
You must remove the object which may emit ResourceWarning before
the end of the context manager.
"""
with warnings.catch_warnings(record=True) as warns:
warnings.filterwarnings('always', category=ResourceWarning)
with check_no_warnings(testcase, category=ResourceWarning, force_gc=True):
yield
gc_collect()
testcase.assertEqual(warns, [])


class CleanImport(object):
Expand Down
4 changes: 2 additions & 2 deletions Lib/test/test_cmd_line_script.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,10 +177,10 @@ def test_stdin_loader(self):
@contextlib.contextmanager
def interactive_python(self, separate_stderr=False):
if separate_stderr:
p = spawn_python('-i', bufsize=1, stderr=subprocess.PIPE)
p = spawn_python('-i', stderr=subprocess.PIPE)
stderr = p.stderr
else:
p = spawn_python('-i', bufsize=1, stderr=subprocess.STDOUT)
p = spawn_python('-i', stderr=subprocess.STDOUT)
stderr = p.stdout
try:
# Drain stderr until prompt
Expand Down
37 changes: 24 additions & 13 deletions Lib/test/test_file.py
Original file line number Diff line number Diff line change
Expand Up @@ -164,22 +164,33 @@ def testBadModeArgument(self):
f.close()
self.fail("no error for invalid mode: %s" % bad_mode)

def _checkBufferSize(self, s):
try:
f = self.open(TESTFN, 'wb', s)
f.write(str(s).encode("ascii"))
f.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the point of calling close() twice here? with should be used, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a refactor of an existing test.

f.close()
f = self.open(TESTFN, 'rb', s)
d = int(f.read().decode("ascii"))
f.close()
f.close()
except OSError as msg:
self.fail('error setting buffer size %d: %s' % (s, str(msg)))
self.assertEqual(d, s)

def testSetBufferSize(self):
# make sure that explicitly setting the buffer size doesn't cause
# misbehaviour especially with repeated close() calls
for s in (-1, 0, 1, 512):
try:
f = self.open(TESTFN, 'wb', s)
f.write(str(s).encode("ascii"))
f.close()
f.close()
f = self.open(TESTFN, 'rb', s)
d = int(f.read().decode("ascii"))
f.close()
f.close()
except OSError as msg:
self.fail('error setting buffer size %d: %s' % (s, str(msg)))
self.assertEqual(d, s)
for s in (-1, 0, 512):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to check that these calls don't emit RuntimeWarning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the check.

with support.check_no_warnings(self,
message='line buffering',
category=RuntimeWarning):
self._checkBufferSize(s)

# test that attempts to use line buffering in binary mode cause
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to test this both here and in test_io.py, IMHO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've preserved only one check here. I think that test_with_open in test_io.py wasn't a really good place for that check after all.

# a warning
with self.assertWarnsRegex(RuntimeWarning, 'line buffering'):
self._checkBufferSize(1)

def testTruncateOnWindows(self):
# SF bug <http://www.python.org/sf/801631>
Expand Down
2 changes: 1 addition & 1 deletion Lib/test/test_io.py
Original file line number Diff line number Diff line change
Expand Up @@ -592,7 +592,7 @@ def test_large_file_ops(self):
self.large_file_ops(f)

def test_with_open(self):
for bufsize in (0, 1, 100):
for bufsize in (0, 100):
f = None
with self.open(support.TESTFN, "wb", bufsize) as f:
f.write(b"xxx")
Expand Down
3 changes: 2 additions & 1 deletion Lib/test/test_subprocess.py
Original file line number Diff line number Diff line change
Expand Up @@ -1137,7 +1137,8 @@ def test_bufsize_equal_one_binary_mode(self):
# line is not flushed in binary mode with bufsize=1.
# we should get empty response
line = b'line' + os.linesep.encode() # assume ascii-based locale
self._test_bufsize_equal_one(line, b'', universal_newlines=False)
with self.assertWarnsRegex(RuntimeWarning, 'line buffering'):
self._test_bufsize_equal_one(line, b'', universal_newlines=False)

def test_leaking_fds_on_error(self):
# see bug #5179: Popen leaks file descriptors to PIPEs if
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Warn that line buffering is not supported if :func:`open` is called with
binary mode and ``buffering=1``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to be more explicit and list more impact functions: open, io.open, codecs.open. Would it be possible to put functions first in the sentence? Something like: "open(), io.open(), codecs.open() now emit a warning if the file is opened in binary mode with buffering=1." So the reader directly gets the context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other functions are impacted too, like pathlib.Path.open() and tempfile.TemporaryFile() (and some other classes from tempfile). Do you think we should find and list all such functions here? Maybe we should just provide a couple of examples: io.open and the standard library functions relying on it (e.g., codecs.open, pathlib.Path.open) now issue a warning if a file is opened in binary mode with buffering=1.

9 changes: 9 additions & 0 deletions Modules/_io/_iomodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -362,6 +362,15 @@ _io_open_impl(PyObject *module, PyObject *file, const char *mode,
goto error;
}

if (binary && buffering == 1) {
if (PyErr_WarnEx(PyExc_RuntimeWarning,
"line buffering (buffering=1) isn't supported in "
"binary mode, the default buffer size will be used",
1) < 0) {
goto error;
}
}

/* Create the Raw file stream */
{
PyObject *RawIO_class = (PyObject *)&PyFileIO_Type;
Expand Down