Use Builtin Atomics for wait/waitAsync/notify #956

Open
komyg wants to merge 20 commits into trynova:main from komyg:fix/use-built-in-atomics

Conversation

@komyg komyg commented Feb 21, 2026

Fix: #899

@komyg komyg changed the title Create Parking Lot Mutex in SharedDataBlock Use Builtin Atomics for wait/waitAsync/notify Feb 21, 2026
@komyg komyg force-pushed the fix/use-built-in-atomics branch from 4482ead to 4a0935d on February 21, 2026 14:03
@komyg komyg requested a review from aapoalas February 23, 2026 12:23
@aapoalas aapoalas (Member) left a comment

> Ok, I've updated the code according to your comment.
>
> Can you take a look if the general idea is good?
>
> Note: it seems that I broke a few tests that were passing, so I still have to fix those.

Yeah, the general idea looks really good to me, excellent stuff! <3

It seems quite likely to me that the reason for the breakage is the too-strict-by-half locking of the mutex, since you're effectively making it so that only one thread can wait or notify concurrently. (I think.) The mutex should only be guarding adding a new waiter and removing an old waiter, but not the waiting itself.

@komyg komyg force-pushed the fix/use-built-in-atomics branch from 1cfe244 to 3ca1733 on February 24, 2026 11:37
@komyg komyg marked this pull request as ready for review February 24, 2026 14:45
@komyg komyg force-pushed the fix/use-built-in-atomics branch from 9f4eea6 to 0d39787 on February 24, 2026 15:21
@komyg komyg requested a review from aapoalas February 24, 2026 16:16

komyg commented Feb 24, 2026

Hey @aapoalas, I've addressed your feedback, can you review again, please?

@aapoalas aapoalas (Member) left a comment

Still some things I'd like changed; most importantly, we want to use std functions as much as possible.

```rust
let handle = thread::spawn(move || {
    // SAFETY: buffer is a cloned SharedDataBlock; non-dangling.
    let waiters = unsafe { buffer.get_or_init_waiters() };
    let waiter_record = Arc::new(WaiterRecord {
```

suggestion: Could create a WaiterRecord::new_shared() helper function that returns Arc<WaiterRecord>.

```rust
} else {
    let deadline = Instant::now() + Duration::from_millis(t);
    loop {
        if waiter_record.notified.load(StdOrdering::Acquire) {
```

issue: I think we should be fine using just Relaxed for the notification. Waking up is our synchronisation; we only really care about the value of the bool here, and it is atomic on its own.

@aapoalas

Oh! I forgot: the ecmascript_futex dependency should be removed.

@aapoalas

Hmm... I've been looking at the code and doing some cleanups and whatnot and ... I think we have a somewhat fundamental issue here in that the async side has been just bollixed by me. It's pretty badly mangled from what it ought to be now that we're actually using a real parking lot instead of Futexes directly.

Importantly: it seems like the critical section must be entered on the main thread, not on the async thread. E.g., the NotEqual check and return you've done on the waiter thread inside the critical section: that's not something that is allowed by the spec, i.e. the Promise should never resolve with not_equal. It should be done within the critical section indeed, as you've correctly identified, but if the check fails then we should still return synchronously from the waitAsync function with an object { async: false, value: "not-equal" }.

Maybe we can still hack the effect by changing the signal that we use to wait for the waiter thread to startup to being not a boolean but a state enum: Startup, Started, and NotEqual, but there might be a race condition somewhere there...

If we cannot hack it, we have to actually do the spec-things exactly and create a separate timeout job and a separate handler job. Also we'd need to somehow figure out if we're resolving a promise on our own Agent or not...


komyg commented Feb 25, 2026

> Maybe we can still hack the effect by changing the signal that we use to wait for the waiter thread to startup to being not a boolean but a state enum: Startup, Started, and NotEqual, but there might be a race condition somewhere there...
>
> If we cannot hack it, we have to actually do the spec-things exactly and create a separate timeout job and a separate handler job. Also we'd need to somehow figure out if we're resolving a promise on our own Agent or not...

I think we can start with the hack by having the enqueue_atomics_wait_async_job return an Enum with the state, and returning that in the main thread.

By doing that, and addressing your feedback above, I think we will have a solution that is good enough.

Then we can see if it is worth it to take the time to re-write this to be more adherent to the spec.

What do you think?

@aapoalas

>> Maybe we can still hack the effect by changing the signal that we use to wait for the waiter thread to startup to being not a boolean but a state enum: Startup, Started, and NotEqual, but there might be a race condition somewhere there...
>>
>> If we cannot hack it, we have to actually do the spec-things exactly and create a separate timeout job and a separate handler job. Also we'd need to somehow figure out if we're resolving a promise on our own Agent or not...
>
> I think we can start with the hack by having the enqueue_atomics_wait_async_job return an Enum with the state, and returning that in the main thread.
>
> By doing that, and addressing your feedback above, I think we will have a solution that is good enough.
>
> Then we can see if it is worth it to take the time to re-write this to be more adherent to the spec.
>
> What do you think?

Yeah, I guess that could work. I have something akin to this (without the enum) working locally on my laptop with a lot of cleanups, and I was just running test262 on it when the laptop ran out of battery and it turns out I've left my charger at work!

Based on my experience there, I must admit I'm not exactly thrilled about having the critical section effectively entirely on a separate thread with the main thread having to wait for a sync message from the waiter thread to see if the waiter thread's read of the byte index produced a "notEqual" result or not... But I think we can live with it temporarily.

Looking at it from a performance point of view now that we have a proper parking lot futex instead of native OS futexes, it makes sense that we'll avoid spawning a waiter thread unless specifically needed for a timeout (and actually even this should be delegated to the host, so that the host can decide if a timeout thread is needed or if it should instead be some sort of OS timer API that we rely on). For waitAsync in particular, the user expectation is that they'll wait for a sync message to come in from a foreign thread but while they wait for that, they want to continue working on other things. Spawning a thread is an unnecessarily complex thing to do when they could just check an atomic boolean after they've done the "other things" to see if a sync message has arrived or not.

@komyg komyg force-pushed the fix/use-built-in-atomics branch from 0d39787 to ba58810 on March 3, 2026 10:46
Development

Successfully merging this pull request may close these issues.

Atomics wait/waitAsync/notify are not correct