Closed Bug 1755823 Opened 2 years ago Closed 2 years ago

Reduce impact of ThreadRegistry locking in sampler thread

Categories

(Core :: Gecko Profiler, task, P2)

task

Tracking

()

RESOLVED FIXED
99 Branch
Tracking Status
firefox99 --- fixed

People

(Reporter: mozbugz, Assigned: mozbugz)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Example profile: https://share.firefox.dev/3sFjL36
"FileIO (non-main thread)" markers are generated from non-main threads (in this case "IndexDB #21", tid 2405483). But they target the main thread, so the marker code needs to check if that target thread is actively being sampled. This involves locking the ThreadRegistry, in order to access the thread data including its current is-being-profiled state (the locking is necessary to prevent that thread from ending, which would destroy the data we're trying to read).
Meanwhile the sampler thread regularly runs its sampling loop, which also locks the ThreadRegistry in order to sample all to-be-sampled threads, and this takes a fairly significant time.
In the example profile, notice how the FileIO marker row has gaps (around 300-600 microseconds each) around each sampling loop (the small blue boxes in the "Parent Process" track above).

And presumably, this long locking in the sampler could block other similar operations that need to access another thread's data.

I can think of a few possible solutions:

  1. The sampler thread could unlock the ThreadRegistry betweeen sampling each thread. This would allow other threads to do their own work, like writing these FileIO markers.
    However, this would be more complex to handle in the sampler, since it would need to be able to deal with the list of threads potentially changing, and so it would need to keep track of which threads have been sampled so far, and find the next candidate (if any) to continue sampling.

  2. Change the ThreadRegistry locking mechanism to a readers-writer lock.
    This would allow simultaneous reads (to access individual ThreadRegistration objects), in our case from the sampler and from markers, while preventing writes (to add/remove ThreadRegistrations to/from the list) when a thread starts or stops.

  3. Add locks around ThreadRegistration objects, that would at least prevent their destruction -- Maybe through a thread-safe RefPtr? This would allow the sampler to lock the ThreadRegistry for only a short time, just to read the list and lock/reference individual ThreadRegistrations to be sampled.

Option #2 feels like the easiest, and could be done using mozilla::RWLock or std::shared_mutex.

This is a profiler-specific shared lock (aka readers-writer lock) implemented on top of RWLockImpl.
Similar to BaseProfilerMutex, it records which thread is currently holding the exclusive lock.

Assignee: nobody → gsquelart
Status: NEW → ASSIGNED

This allows multiple threads to access ThreadRegistrations through the thread registry, as long as the registry itself is not modified. So in particular, ThreadRegistrations are guaranteed to stay alive as long as a non-exclusive lock is held.
This ensures cross-thread markers are not blocked while the sampler is running.

Depends on D139916

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/169754a1f337
BaseProfilerSharedMutex and exclusive&shared RAII locks - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/d758cab0d5cf
Lock the ThreadRegistry non-exclusively when only reading the thread list, exclusively when adding/removing threads - r=canaltinova

Sorry about that.
The test was slightly wrong: It should have updated the state before unlocking the mutex, so that the other thread would see the expected state when acquiring the lock.

Flags: needinfo?(gsquelart)
Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e1ddbd49a3db
BaseProfilerSharedMutex and exclusive&shared RAII locks - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/8a0cdf481812
Lock the ThreadRegistry non-exclusively when only reading the thread list, exclusively when adding/removing threads - r=canaltinova
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: