In bug 1081952 we changed the ordering in backgroundFinalize to explicitly release the GC lock *before* setting the background finalize state, on the theory that the atomic access to backgroundFinalizeState protects us from racing. However, I've had cause to think more about atomics and memory ordering since then, and I no longer think it provides any additional safety - and it may be slightly worse for performance.

backgroundFinalizeState has ReleaseAcquire memory ordering.

- A release-store operation on thread 1 issues a memory barrier, then performs the store.
- This guarantees that all threads that perform a load-acquire operation thereafter will see all the accesses that happened preceding the release-store operation on thread 1. The store itself may not be visible to the first subsequent load-acquire, as it is not followed by a memory barrier.
- A load-acquire operation on thread 1 performs the load, then issues a memory barrier.
- This guarantees that all subsequent accesses on thread 1 will see any stores performed prior to the most recent release-store operations performed by each other thread. The load itself may still see an older value, since it is not preceded by a memory barrier.

As a result, there may be a period of time during which (main thread) allocateFromArena sees backgroundFinalizeState[thingKind] == BFS_RUNNING, even though backgroundFinalize has already set it to BFS_DONE. However, if allocateFromArena sees backgroundFinalizeState[thingKind] == BFS_DONE, then we know that backgroundFinalize is definitely finished, *and* all its modifications to the ArenaList must be visible. Thus releasing the GC lock before setting the background finalize state provides no additional safety.

However, releasing the GC lock *does* issue a memory barrier of its own. If we revert to the pre-bug 1081952 ordering, this means the store will potentially be visible sooner*.
So I think we should revert to releasing the GC lock *after* setting the background finalize state to BFS_DONE. Let me know what you think. Waldo, could you check my reasoning? I realize this is probably a lot of effort for such a tiny change, but it was bothering me ;)

* Indeed, I'm not sure there are currently any guarantees about exactly when other threads will see that store. This probably works out because memory barriers act like a proper choke point on ARM (so the memory barrier caused by a load-acquire in allocateFromArena will also propagate the stores on other threads), and x86 is strongly ordered. Either that, or because threads are constantly switching between physical cores *anyway*, so we'll have another memory barrier on the right core pretty soon.