gh-113993: Allow interned strings to be mortal, and fix related issues #120520

Merged 79 commits on Jun 21, 2024
2b74021
Declare explicit interning routines
encukou Jun 10, 2024
0aedb83
Use _PyUnicode_InternStatic for the statically allocated stuff
encukou Jun 10, 2024
aad79b2
Add _PyUnicode_InternStatic and extra checks
encukou Jun 10, 2024
5413223
Check against immortalizing un-interned strings
encukou Jun 10, 2024
23845fb
Add _PyUnicode_InternImmortal & make `marshal` use it
encukou Jun 10, 2024
9ece220
Remove special case that makes previously-immortal strings STATIC
encukou Jun 10, 2024
7c62641
Split _PyUnicode_InternMortal and _PyUnicode_InternImmortal
encukou Jun 10, 2024
62a35a1
Do the refcnt dance in _PyUnicode_InternMortal & _PyUnicode_ClearInte…
encukou Jun 10, 2024
ac55a05
Deallocate mortal interned strings
encukou Jun 10, 2024
2466ae9
Use _PyUnicode_InternMortal in codecs
encukou Jun 10, 2024
032ac17
Start a notes document
encukou Jun 10, 2024
198d9c6
Handle attempts to "overwrite" interned heap types by static ones
encukou Jun 10, 2024
86ccb08
Intern statically allocated non-identifier strings at init
encukou Jun 11, 2024
e34b8da
Parenthesize the LATIN1 macro argument
encukou Jun 11, 2024
66338fe
Don't create the per-interp interned_dict until after InitStaticStrings
encukou Jun 11, 2024
9f16cb0
Move hashmap destroy to _PyUnicode_ClearInterned (symmetry with creat…
encukou Jun 14, 2024
e27abfc
Special-case short string singletons
encukou Jun 11, 2024
89f24df
Verify we don't add process-global entries after a per-interp dict ex…
encukou Jun 11, 2024
b965acf
More editing of the InternalDocs write-up
encukou Jun 11, 2024
4b69712
Only readjust refleak tests for *immortal* interned strings
encukou Jun 12, 2024
b2c9865
Be pedantic with the ref total
encukou Jun 12, 2024
cf7cb72
Split InternInPlace in sysmodule
encukou Jun 12, 2024
85f9fe0
Split InternInPlace in import
encukou Jun 12, 2024
a288389
Split InternInPlace in getargs
encukou Jun 12, 2024
6036cb1
Use mortal strings in type_setattro
encukou Jun 12, 2024
01f2dbf
Use mortal string in type_module
encukou Jun 12, 2024
73f7fb3
Use mortal strings for object attributes
encukou Jun 12, 2024
13be4e7
Use mortal strings for code object names
encukou Jun 12, 2024
1930919
Use mortal strings for code constants
encukou Jun 12, 2024
d45f20b
Use mortals in pickle
encukou Jun 12, 2024
348d95c
Use mortals for PyDict_SetItemString keys
encukou Jun 12, 2024
04f080e
Use mortals in operator: methodcaller_new
encukou Jun 12, 2024
afe5400
Use mortals in operator: attrgetter_new
encukou Jun 12, 2024
fe0b8c5
Simplify logic in _PyUnicode_InternImmortal
encukou Jun 12, 2024
1116191
Immortalize all interned strings in the free-threaded build
encukou Jun 12, 2024
f10e521
Rewrite the write-up
encukou Jun 12, 2024
0787f8f
Restore immortalization for PyDict_SetItemString.
encukou Jun 12, 2024
0d56eba
Intern single-byte (latin1) strings at startup in free-threaded build
encukou Jun 12, 2024
8b32762
Make the three sets of singletons disjoint
encukou Jun 12, 2024
24bf76a
One more single-char string in _Py_STR
encukou Jun 13, 2024
2fb04fd
Use a less unwieldy name
encukou Jun 14, 2024
26fa26e
Adjust comments & writeup
encukou Jun 14, 2024
9b14dbb
Don't call _Py_SetImmortal on strings.
encukou Jun 14, 2024
ac6dfae
Beef up the tests
encukou Jun 14, 2024
a9e91b1
Fix comment
encukou Jun 14, 2024
ee0f068
A bit more thought-through error handling for failed PyDict_Pop
encukou Jun 14, 2024
61bf404
Switch parser to PyUnicode_InternImmortal
encukou Jun 14, 2024
80ce95b
Touch up comments
encukou Jun 14, 2024
70aa294
Switch public PyUnicode API to _PyUnicode_InternImmortal
encukou Jun 14, 2024
de2ff7f
Add an assert to _PyUnicode_EqualToASCIIId
encukou Jun 14, 2024
0e6744e
Remove #ifdef Py_DEBUG from the body of _PyUnicode_ClearInterned.
encukou Jun 14, 2024
c50e151
Consolidate the interning logic
encukou Jun 14, 2024
08798d0
Fix the free-threading initialization
encukou Jun 14, 2024
62959cd
Typo in comments
encukou Jun 14, 2024
f62ccc6
Add blurb
encukou Jun 14, 2024
86cf124
Guard call to debug function
encukou Jun 14, 2024
f2e857e
Avoid -bb warnings in tests
encukou Jun 14, 2024
6011c05
Add typing to a clinic function
encukou Jun 14, 2024
7e8d727
Work around build failure on macOS clang
encukou Jun 14, 2024
ccb7f42
Silence a mypy error
encukou Jun 17, 2024
e0bb1c2
_PyCodec_Lookup: Immortalize key on success
encukou Jun 17, 2024
975e4ba
getargs.c: Immortalize the kwtuple keys
encukou Jun 17, 2024
1c05a60
Don't re-mortalize interned immortals at interpreter shutdown (in non…
encukou Jun 17, 2024
5ac3c5f
Avoid `case` label on a declaration (invalid in standard C and, fortu…
encukou Jun 17, 2024
686d2b6
Merge in the main branch
encukou Jun 18, 2024
7d79d10
Remove PyUnicode_InternImmortal from the header
encukou Jun 19, 2024
d4eb879
Move _Py_LATIN1_CHR to pycore_global_strings.h
encukou Jun 19, 2024
f7df09a
Remove mistaken check in _pickle.c
encukou Jun 19, 2024
fe7fb13
Comment/doc clarifications, rewordings; PEP-7 style
encukou Jun 19, 2024
7a75099
Add a pedantic DECREF
encukou Jun 19, 2024
ac402d8
Use more straightforward signatures for the internal functions
encukou Jun 19, 2024
aa58c01
Group _PyUnicode_Intern funcs in the header
encukou Jun 19, 2024
929d0bc
Break out init_global_interned_strings & clear_global_interned_strings
encukou Jun 19, 2024
44c0192
Merge in the main branch
encukou Jun 19, 2024
9e3ce44
Fix function declaration
encukou Jun 20, 2024
2ebf8a0
Fix return value
encukou Jun 20, 2024
bf49f61
Convert check to assert
encukou Jun 20, 2024
6d668e6
Limit _PyUnicode_InternStatic to runtime initialization
encukou Jun 20, 2024
fd8ca83
Add a comment for _Py_hashtable_new_full destroys
encukou Jun 20, 2024
Handle attempts to "overwrite" interned heap types by static ones
encukou committed Jun 14, 2024
commit 198d9c630e368e3879e0996567a9fcd528d05b50
23 changes: 23 additions & 0 deletions Objects/unicodeobject.c
@@ -15022,6 +15022,29 @@ _PyUnicode_InternStatic(PyInterpreterState *interp, PyObject **p)
        return;
    }

    PyObject *interned = get_interned_dict(interp);
    if (interned) {
        int res = PyDict_GetItemRef(interned, s, &r);
        if (res < 0) {
            PyErr_WriteUnraisable(*p);
        }
        else if (res == 1) {
            // Someone "beat us to it"; there's an interpreter-specific key
            // we must use :(
            if (_Py_IsImmortal(r)) {
                *p = r;
            } else {
                // Immortalize it.
                // For now this is a slow path that re-does all the checks
                // and lookups.
                _PyUnicode_InternImmortal(interp, p);
            }
            _PyUnicode_STATE(*p).interned = SSTATE_INTERNED_IMMORTAL;
            assert(_Py_IsImmortal(*p));
            return;
        }
    }

    if (_Py_hashtable_set(INTERNED_STRINGS, s, s) < -1) {
        Py_FatalError("failed to intern static string");
    }
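
Reader's note: the hunk above checks the per-interpreter table first and, if an equal string is already there, reuses that object and makes sure it is immortal instead of registering the static one. The following is only a toy model of that order of checks, not CPython code. Every name in it (`ToyStr`, `ToyTable`, `toy_find`, `toy_add`, `toy_intern_static`) is hypothetical, the tables are plain arrays rather than a dict and a `_Py_hashtable`, and the duplicate check against the global table is an assumption added to keep the toy consistent, since the hunk does not show that part of the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object; not CPython's PyUnicodeObject. */
typedef struct {
    const char *text;
    bool immortal;
} ToyStr;

#define TOY_CAP 16

/* Hypothetical fixed-size table, used for both the per-interpreter
 * and the process-global set of interned strings. */
typedef struct {
    ToyStr *items[TOY_CAP];
    size_t len;
} ToyTable;

/* Linear search by content; returns NULL when no equal string is stored. */
static ToyStr *toy_find(const ToyTable *t, const char *text)
{
    for (size_t i = 0; i < t->len; i++) {
        if (strcmp(t->items[i]->text, text) == 0) {
            return t->items[i];
        }
    }
    return NULL;
}

/* Appends s; returns false when the table is full. */
static bool toy_add(ToyTable *t, ToyStr *s)
{
    if (t->len == TOY_CAP) {
        return false;
    }
    t->items[t->len++] = s;
    return true;
}

/* Returns the canonical object for s, or NULL if the global table is full.
 * per_interp may be NULL, modelling "no per-interpreter dict exists yet". */
static ToyStr *toy_intern_static(ToyTable *per_interp, ToyTable *global,
                                 ToyStr *s)
{
    assert(s != NULL && global != NULL);
    if (per_interp != NULL) {
        ToyStr *existing = toy_find(per_interp, s->text);
        if (existing != NULL) {
            /* An interpreter-specific entry got there first: reuse it
             * and promote it to immortal. */
            existing->immortal = true;
            return existing;
        }
    }
    /* Assumption: avoid duplicate global entries (not shown in the hunk). */
    ToyStr *registered = toy_find(global, s->text);
    if (registered != NULL) {
        return registered;
    }
    s->immortal = true;
    return toy_add(global, s) ? s : NULL;
}
```

In this model, interning a static `"spam"` after a heap `"spam"` is already in the per-interpreter table returns the heap object, now marked immortal, and leaves the global table untouched. A string that is in neither table is added to the global one.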