Detect and complain about use of `pgx-pg-sys` functions from multiple threads #777

thomcc · 2022-10-18T16:57:55Z

Doing this will likely break stuff like #775, although that code is already thread-unsafe (due to our setjmp shim)... That said, it reduces UB from data races (at best) to a panic with a reasonable backtrace.

thomcc · 2022-10-18T17:37:10Z

It looks like I was mislead by the docs I read -- this should always be fine then.

thomcc · 2022-10-18T17:38:06Z

Hmm, actually, maybe I need to set an atfork handler to clear this in the child...

pgx-pg-sys/src/submodules/thread_check.rs

eeeebbbbrrrr · 2022-10-18T17:40:53Z

pgx-pg-sys/src/submodules/setjmp.rs

 unsafe fn pg_guard_ffi_boundary_impl<T, F: FnOnce() -> T>(f: F) -> T {
    //! This is the version that uses sigsetjmp and all that, for "normal" Rust/PGX interfaces.
    use crate as pg_sys;

+    // The next code is definitely thread-unsafe (it manipulates statics in an
+    // unsynchronized manner), so we may as well check here.
+    super::thread_check::check_active_thread();


Can we #[cfg] this out of --release builds? Or make it an assert!() or have it behind a feature flag that's disabled by default or something?

We really don't need additional overhead for every call back into Postgres for something that's most likely not going to happen in the typical pgx extension.

Hmm, well, it's still unsound. I think we could have a feature to disable this check, but by default it feels like a bit of a footgun...

That said I could also just make this be a single load from %fs:0/tpidr_el0, as mentioned in the PR. In which case it should only be a few-cycle load and a (easily predicted) branch in the worst case, likely not much worse than array bounds checks.

I'm more concerned that this might break the fork case, which I need to do some testing with.

My view is that the ideal situation is that it's compiled out of --release builds.

The reasoning here is that basically all pgx extensions are going to be single-threaded, just because Postgres is. If an extension does threading, it'd likely be "accidentally" from some 3rd-party crate like rayon or crossbeam.

Keeping the check for debug builds would, in a perfect world, catch these problems during development and tests.

If we want a feature like pedantic-threadness-checks (or whatever) that can be enabled by the developer for --release builds, that'd be fine. I do see the usefulness of this in a release build for overly-paranoid runtime environments.

EDIT: As an aside, what kind of runtime overhead does #[track_caller] add per call?

I'm not sure. It seems pretty easy to accidentally end up calling into PGX from multiple threads given that so much of rust assumes thread-safety by default (outside of unsafe)...

EDIT: As an aside, what kind of runtime overhead does #[track_caller] add per call?

Adds a secret argument of an &'static Location<'static>, which gets loaded from static data and passed in. Concretely, less than passing an extra &'static str argument.

Well. Then I'd like to see some kind of "benchmark" to see if this shows up.

And I don't really mean anything scientific here. Like, does calling this function (https://github.com/tcdi/pgx/blob/d1ffdb2969f414652d48548341e9d9ff9d02e672/pgx-examples/srf/src/lib.rs#L15) through SQL as something like SELECT srf.generate_series(1, 10000000); show any measurable difference with and without this check?

Where I'm concerned about overhead is in a case like this where a #[pg_extern] function is being called 10s-or-100s-of-millions-of-times by some SQL statement.

Sure, a request for a benchmark is totally reasonable. I'll work on that (after I make sure the atfork handling is sufficient in the cross-process case).

I'll re-request review when that's ready.

For example, the Timescale folks have mentioned setjmp overhead already. My intuition says this would pop up on their radar too. ZomboDB too, for that matter.

If it doesn't turn out to be significant, then great!

This should be significantly cheaper than the call to sigsetjmp, but I'll take the time to ensure that.

I measured this in a few ways, both as standalone benchmarks, benchmarks that include the rest of the pgx sigsetjmp error-handling dance, and in real queries (SELECT my_generate_series(0, 10000000, 1), with \timing on). I did this on linux and macOS, which reported similar numbers (although Linux numbers were like 10x slower across the board due to the machine being literally the cheapest laptop I could find 3y ago).

The check in this PR takes around 700ps standalone, and increases the average cost the error handling from ~6ns to ~6.5ns in my benchmarks (there's probably around 200ps of overhead from measurement).

When used in a query, these checks had no reliable overhead that I could detect. With or without the thread safety check, SELECT my_generate_series(0, 10000000, 1), reported times between 950ms and 975ms (well, 4.5-5ms on my linux box). I tested this pretty thoroughly (several DBs/extensions), and am confident that it's not a case of the old code remaining in use.

We could get this down to essentially 0-cost¹ via the use of inline asm. I have implemented this in the thomcc/threadsafe-asm branch (the relevant code is https://github.com/tcdi/pgx/blob/thomcc/threadsafe-asm/pgx-pg-sys/src/submodules/thread_check.rs#L115-L194). It wouldn't be that hard for me to maintain (I only implemented versions I understood), but is definitely way worse than the version in this PR, so I don't know that I think this is worth it, since the change is not measurable in queries².

It's definitely your call, though -- I'd be totally fine to pull that into this PR if you'd rather take the optimization.

Footnotes

Small enough to be comparable with black_box(0) on its own (100ps), and to cause no measurable difference when run as part of the other pg error code. ↩

I wouldn't have bothered with this had I realized that, but my initial measurements accidentally compared versions with different compiler flags, so I thought the overhead was much more significant at first. ↩

… threads

thomcc · 2022-10-29T02:52:28Z

Because it got mentioned in #780 (and because I realized I was making a dumb mistake in the macOS reloading issue, and needed to look at something else), I finished this up. See #777 (comment) for a description of the overhead (very small, not measurable in queries) and a link to an alternate approach that is (much) worse to maintain but is basically 0 cost on the platforms we care about.

workingjubilee · 2022-10-31T21:02:21Z

I'm gonna need to make sure this works with PL/Rust before we merge this.

thomcc · 2022-10-31T22:39:49Z

I'm gonna need to make sure this works with PL/Rust before we merge this.

The only call to the check is inside a cfg(not(feature = "postgrestd")) function, but yeah, worth double-checking. (Actually, there might be a lot of unused code warnings if that feature is on...).

workingjubilee

This PR passed PL/Rust's test suite in tcdi/plrust#105

700 picoseconds is obviously worth the improvement in soundness, and unlike the previous version, this isn't likely to break on no-op thread renaming antics.

pgx-pg-sys/src/submodules/thread_check.rs

workingjubilee · 2022-11-02T23:44:32Z

pgx-pg-sys/src/submodules/thread_check.rs

+fn nonzero_thread_id() -> NonZeroUsize {
+    // Returns the `addr()` of a thread local variable.


You don't think the asm! version was cuter? 🐱

pgx-pg-sys/src/submodules/thread_check.rs

workingjubilee · 2022-11-02T23:52:13Z

pgx-pg-sys/src/submodules/thread_check.rs

+fn init_active_thread(tid: NonZeroUsize) {
+    match ACTIVE_THREAD.compare_exchange(0, tid.get(), Ordering::Relaxed, Ordering::Relaxed) {
+        Ok(_) => unsafe {
+            // We won the race. Register an atfork handler to clear the atomic


Here he comes, here comes Speed Racer, he's a demon on wheels...

pgx-pg-sys/src/submodules/thread_check.rs

Co-authored-by: Jubilee <46493976+workingjubilee@users.noreply.github.com>

Eric confirmed this was okay in Discord

thomcc requested a review from workingjubilee October 18, 2022 16:57

eeeebbbbrrrr previously requested changes Oct 18, 2022

View reviewed changes

This was referenced Oct 19, 2022

Initial stab at a design doc #780

Merged

Invalid memory reference error when running test in project that sets session_preload_libraries in _bootstart.sql #816

Closed

workingjubilee added the ffi-safety label Oct 26, 2022

thomcc added 3 commits October 28, 2022 14:48

Detect and complain about use of pgx-pg-sys functions from multiple…

1b2e963

… threads

Fix typo in panic message

cff97e9

Register an atfork handler to clear the atomic in child processes

7918ae1

thomcc force-pushed the threadsafe branch from 933fb35 to 7918ae1 Compare October 29, 2022 00:41

thomcc requested a review from eeeebbbbrrrr October 29, 2022 02:52

workingjubilee mentioned this pull request Nov 2, 2022

Run an experiment to validate pgx#777 tcdi/plrust#105

Closed

workingjubilee approved these changes Nov 2, 2022

View reviewed changes

workingjubilee reviewed Nov 3, 2022

View reviewed changes

pgx-pg-sys/src/submodules/thread_check.rs Outdated Show resolved Hide resolved

Apply suggestions from code review

0f8468b

Co-authored-by: Jubilee <46493976+workingjubilee@users.noreply.github.com>

eeeebbbbrrrr merged commit 3e4b188 into pgcentralfoundation:develop Nov 8, 2022

sumerman mentioned this pull request Feb 15, 2023

Extend UTF-8 detection in PGX init and test #1041

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detect and complain about use of `pgx-pg-sys` functions from multiple threads #777

Detect and complain about use of `pgx-pg-sys` functions from multiple threads #777

thomcc commented Oct 18, 2022 •

edited

Loading

thomcc commented Oct 18, 2022

thomcc commented Oct 18, 2022

eeeebbbbrrrr Oct 18, 2022

thomcc Oct 18, 2022 •

edited

Loading

eeeebbbbrrrr Oct 18, 2022 •

edited

Loading

thomcc Oct 18, 2022

eeeebbbbrrrr Oct 18, 2022

thomcc Oct 18, 2022 •

edited

Loading

eeeebbbbrrrr Oct 18, 2022

thomcc Oct 18, 2022

thomcc Oct 29, 2022

thomcc commented Oct 29, 2022

workingjubilee commented Oct 31, 2022

thomcc commented Oct 31, 2022

workingjubilee left a comment

workingjubilee Nov 2, 2022

thomcc Nov 3, 2022

workingjubilee Nov 3, 2022

workingjubilee Nov 2, 2022

		fn nonzero_thread_id() -> NonZeroUsize {
		// Returns the `addr()` of a thread local variable.

Detect and complain about use of pgx-pg-sys functions from multiple threads #777

Detect and complain about use of pgx-pg-sys functions from multiple threads #777

Conversation

thomcc commented Oct 18, 2022 • edited Loading

thomcc commented Oct 18, 2022

thomcc commented Oct 18, 2022

Choose a reason for hiding this comment

thomcc Oct 18, 2022 • edited Loading

Choose a reason for hiding this comment

eeeebbbbrrrr Oct 18, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thomcc Oct 18, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Footnotes

thomcc commented Oct 29, 2022

workingjubilee commented Oct 31, 2022

thomcc commented Oct 31, 2022

workingjubilee left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Detect and complain about use of `pgx-pg-sys` functions from multiple threads #777

Detect and complain about use of `pgx-pg-sys` functions from multiple threads #777

thomcc commented Oct 18, 2022 •

edited

Loading

thomcc Oct 18, 2022 •

edited

Loading

eeeebbbbrrrr Oct 18, 2022 •

edited

Loading

thomcc Oct 18, 2022 •

edited

Loading