Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-28180: Implementation for PEP 538 #659

Merged
merged 43 commits into from
Jun 11, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
b5d125b
WIP: PEP 538 reference implementation
ncoghlan Mar 5, 2017
78c17a7
Fix test case failures
ncoghlan Mar 11, 2017
721b27f
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan Mar 11, 2017
16a4415
Merge branch 'master' into pep538-coerce-c-locale
ncoghlan Mar 13, 2017
d283de1
Clarify locale coercion warnings
ncoghlan Mar 13, 2017
f7a03fe
Avoid -Wformat-security warning
ncoghlan Mar 13, 2017
fe92a29
Support running tests under 'LANG=C'
ncoghlan Mar 13, 2017
64d9d2f
Suppress locale warning for PYTHONCOERCECLOCALE=0
ncoghlan Mar 13, 2017
384a146
Add test case for library runtime warning
ncoghlan Mar 13, 2017
b4f3a34
Add C locale coercion and warning build flags
ncoghlan Mar 13, 2017
4d684a6
Always use C.UTF-8 on Android
ncoghlan Mar 13, 2017
1c3a270
Fix PYTHONCOERCECLOCALE docs
ncoghlan Mar 15, 2017
7626fcf
Use Py_SetStandardStreamEncoding instead of PYTHONIOENCODING
ncoghlan Mar 15, 2017
d12b412
Some test cleanups suggested by Barry
ncoghlan Mar 15, 2017
ec4f2ea
Use more precise name for test file
ncoghlan Mar 15, 2017
4e6d502
Check standard stream settings in locale coercion tests
ncoghlan Mar 15, 2017
ccfc83f
Use US spelling
ncoghlan Mar 15, 2017
b173af3
Helper function to query PYTHONCOERCECLOCALE
ncoghlan Mar 15, 2017
501a829
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan Mar 15, 2017
d099a52
Fix ReST markup
ncoghlan Mar 15, 2017
762a09b
Fix Py_DEBUG/Py_SetStandardStreamEncoding compatibility problem
ncoghlan Mar 15, 2017
820bfad
Restore Windows _testembed compatibility
ncoghlan Mar 17, 2017
6a00ce6
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan May 6, 2017
188e780
Update to latest version of PEP 538
ncoghlan May 6, 2017
476a781
Change locale coercion to always respect LC_ALL
ncoghlan May 9, 2017
123ba24
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan May 27, 2017
939ba0a
Don't set LANG during locale coercion
ncoghlan May 27, 2017
6d564c9
Update docs to match current behaviour
ncoghlan Jun 3, 2017
53bd6da
Address CI failure and review comments
ncoghlan Jun 3, 2017
cad0669
OK, two-use function :)
ncoghlan Jun 3, 2017
421516f
Still check for the C locale in Windows
ncoghlan Jun 3, 2017
e48a378
Check actual control flow on Appveyor
ncoghlan Jun 3, 2017
f62dbd8
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan Jun 3, 2017
d181b92
Use correct reference type in docs
ncoghlan Jun 3, 2017
8cf0590
More Appveyor debugging
ncoghlan Jun 3, 2017
cea7970
New theory regarding the Windows problem
ncoghlan Jun 3, 2017
c63d5fa
Locale coercion may inject LC_CTYPE into environment
ncoghlan Jun 3, 2017
8e0e1ca
Ensure SYSTEMROOT is set in Windows embedding tests
ncoghlan Jun 3, 2017
7379398
Don't use the default pipe encoding in test_capi
ncoghlan Jun 3, 2017
89759b5
stdin encoding ends up normalised on Windows
ncoghlan Jun 3, 2017
5a56a3f
PEP 538: Add What's New entry
ncoghlan Jun 4, 2017
0036bea
Merge remote-tracking branch 'origin/master' into pep538-coerce-c-locale
ncoghlan Jun 11, 2017
5288662
Add NEWS entry
ncoghlan Jun 11, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Suppress locale warning for PYTHONCOERCECLOCALE=0
  • Loading branch information
ncoghlan committed Mar 13, 2017
commit 64d9d2f250c3c7c7e28d2bb4577262923b6a0882
7 changes: 0 additions & 7 deletions Lib/test/support/script_helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,13 +14,6 @@
from test.support import make_legacy_pyc, strip_python_stderr


RUNTIME_C_LOCALE_WARNING = (
"Python runtime initialized with LC_CTYPE=C (a locale with default ASCII "
"encoding), which may cause Unicode compatibility problems. Using C.UTF-8, "
"C.utf8, or UTF-8 (if available) as alternative Unicode-compatible "
"locales is recommended."
)

# Cached result of the expensive test performed in the function below.
__cached_interp_requires_environment = None

Expand Down
8 changes: 3 additions & 5 deletions Lib/test/test_cmd_line.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,7 @@
import tempfile
from test.support import script_helper, is_android
from test.support.script_helper import (
spawn_python, kill_python, assert_python_ok, assert_python_failure,
RUNTIME_C_LOCALE_WARNING
spawn_python, kill_python, assert_python_ok, assert_python_failure
)


Expand Down Expand Up @@ -162,18 +161,17 @@ def test_undecodable_code(self):
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
env=env)
stdout, stderr = p.communicate()
pattern = RUNTIME_C_LOCALE_WARNING.encode() + b'\n'
if p.returncode == 1:
# _Py_char2wchar() decoded b'\xff' as '\udcff' (b'\xff' is not
# decodable from ASCII) and run_command() failed on
# PyUnicode_AsUTF8String(). This is the expected behaviour on
# Linux.
pattern += b"Unable to decode the command from the command line:"
pattern = b"Unable to decode the command from the command line:"
elif p.returncode == 0:
# _Py_char2wchar() decoded b'\xff' as '\xff' even if the locale is
# C and the locale encoding is ASCII. It occurs on FreeBSD, Solaris
# and Mac OS X.
pattern += b"'\\xff' "
pattern = b"'\\xff' "
# The output is followed by the encoding name, an alias to ASCII.
# Examples: "US-ASCII" or "646" (ISO 646, on Solaris).
else:
Expand Down
21 changes: 11 additions & 10 deletions Lib/test/test_locale_coercion.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
from test.support.script_helper import (
run_python_until_end,
interpreter_requires_environment,
RUNTIME_C_LOCALE_WARNING
)

# In order to get the warning messages to match up as expected, the candidate
Expand All @@ -30,6 +29,14 @@ def _set_locale_in_subprocess(locale_name, category):
result, py_cmd = run_python_until_end("-c", cmd, __isolated=True)
return result.rc == 0

# Details of the shared library warning emitted at runtime
LIBRARY_C_LOCALE_WARNING = (
"Python runtime initialized with LC_CTYPE=C (a locale with default ASCII "
"encoding), which may cause Unicode compatibility problems. Using C.UTF-8, "
"C.utf8, or UTF-8 (if available) as alternative Unicode-compatible "
"locales is recommended."
)

# Details of the CLI warning emitted at runtime
CLI_COERCION_WARNING_FMT = (
"Python detected LC_CTYPE=C: {} coerced to {} (set another locale "
Expand Down Expand Up @@ -100,16 +107,10 @@ def _check_c_locale_coercion(self, expected_fsencoding, coerce_c_locale):
None: don't set the variable at all
str: the value set in the child's environment
"""
if coerce_c_locale == "0":
# Check the library emits a warning
expected_warning = [
RUNTIME_C_LOCALE_WARNING,
]
else:
expected_warning = []
if coerce_c_locale != "0":
# Check C locale is coerced with a warning on stderr
expected_warning = [
self.EXPECTED_COERCION_WARNING,
]
expected_warning.append(self.EXPECTED_COERCION_WARNING)
base_var_dict = {
"LANG": "",
"LC_CTYPE": "",
Expand Down
9 changes: 2 additions & 7 deletions Lib/test/test_sys.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,5 @@
import unittest, test.support
from test.support.script_helper import (
assert_python_ok, assert_python_failure,
RUNTIME_C_LOCALE_WARNING
)
from test.support.script_helper import assert_python_ok, assert_python_failure
import sys, io, os
import struct
import subprocess
Expand Down Expand Up @@ -679,8 +676,6 @@ def test_getfilesystemencoding(self):
expected = None
self.check_fsencoding(fs_encoding, expected)

_SKIP_RUNTIME_C_LOCALE_WARNING = len(RUNTIME_C_LOCALE_WARNING + "\n")

def c_locale_get_error_handler(self, isolated=False, encoding=None):
# Force the POSIX locale
env = os.environ.copy()
Expand Down Expand Up @@ -708,7 +703,7 @@ def c_locale_get_error_handler(self, isolated=False, encoding=None):
env=env,
universal_newlines=True)
stdout, stderr = p.communicate()
return stdout[self._SKIP_RUNTIME_C_LOCALE_WARNING:]
return stdout

def test_c_locale_surrogateescape(self):
out = self.c_locale_get_error_handler(isolated=True)
Expand Down
14 changes: 11 additions & 3 deletions Python/pylifecycle.c
Original file line number Diff line number Diff line change
Expand Up @@ -311,9 +311,17 @@ static const char *_C_LOCALE_WARNING =
static void
_emit_stderr_warning_for_c_locale(void)
{
const char *ctype_loc = setlocale(LC_CTYPE, NULL);
if (ctype_loc != NULL && strcmp(ctype_loc, "C") == 0) {
fprintf(stderr, "%s", _C_LOCALE_WARNING);
const char *coerce_c_locale = getenv("PYTHONCOERCECLOCALE");
/* We don't emit a warning if locale coercion has been explicitly disabled.
*
* For consistency with the corresponding check in Programs/python.c
* we ignore the Python -E and -I flags here.
*/
if (coerce_c_locale == NULL || strncmp(coerce_c_locale, "0", 2) != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do this same check in two places. Would it make sense to refactor that into a call or macro so there's no chance in getting out of sync? (See my question at the top of the PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to check a C global here, rather than checking the environment variable directly, and have a _Py_DisableCLocaleWarning() helper that the CLI calls when PYTHONCOERCECLOCALE=0 is set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of that approach would be that it means embedding applications can't use PYTHONCOERCECLOCALE=0 to silence the warning without changing the configured locale, and I don't want to define a new public embedding configuration API just to avoid checking the environment variable twice.

So I'll unconditionally export a private helper function from pylifecycle.c (so the ABI doesn't depend on the configure flag setting), and re-use that from the CLI implementation.

const char *ctype_loc = setlocale(LC_CTYPE, NULL);
if (ctype_loc != NULL && strcmp(ctype_loc, "C") == 0) {
fprintf(stderr, "%s", _C_LOCALE_WARNING);
}
}
}

Expand Down