Closed Bug 1685937 Opened 3 years ago Closed 3 years ago

Revisit inter-sampling sleep time, to avoid tiny sleeps

Categories

(Core :: Gecko Profiler, enhancement, P2)

RESOLVED FIXED
86 Branch
Tracking Status
firefox86 --- fixed

People

(Reporter: mozbugz, Assigned: mozbugz)

References

(Blocks 3 open bugs)

Details

Attachments

(1 file)

Currently the SleepMicro time is computed with a goal of "keeping to schedule".
I.e., it tries to make the next sampling loop start at (time this loop started) + (user-requested interval).

This is great for handling sleep jitter.
E.g.: If the requested interval is 1ms, and one loop starts 1.2ms after the previous one (0.2ms overshoot), then the next loop will be scheduled 0.8ms later (to undo the overshoot); or conversely, if one loop starts 0.9ms after the previous one (0.1ms undershoot), the next one will be scheduled 1.1ms later.
So overall, the average sampling rate will stay around what the user requested.

Now, notice that the sleep time is computed as max(0.0, targetSleepDuration - lastSleepOvershoot).
This becomes important when a sampling loop takes almost as long as, or longer than, its timeslice. In this case the sleep time can easily become zero, meaning that we'll start the next sampling loop as soon as possible!
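To make the arithmetic concrete, here is a minimal sketch of that computation. It is not the profiler's actual code: the function name `CurrentSleepUs` and the use of integer microseconds are assumptions for illustration, and only the max(0.0, targetSleepDuration - lastSleepOvershoot) formula comes from this bug.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not the profiler's real API). All times are in
// microseconds. `lastSleepOvershootUs` is positive when the previous loop
// started late, and negative when it started early (an undershoot).
std::int64_t CurrentSleepUs(std::int64_t targetSleepDurationUs,
                            std::int64_t lastSleepOvershootUs) {
  // Compensate for the previous over/undershoot, clamping at zero.
  return std::max<std::int64_t>(0,
                                targetSleepDurationUs - lastSleepOvershootUs);
}
```

With a 1000us target, a 200us overshoot gives an 800us sleep and a 100us undershoot gives 1100us, matching the example above. Any overshoot of 1000us or more gives a sleep of 0, which is the case this bug is about.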

I don't think it's a good thing:

  • Doing two samples one after the other probably has less value, as there is some chance sampled threads may not have done any work in-between.
  • One likely cause of overshoots is that Firefox and/or the system is busy, so we're making things a bit worse by scheduling more work ASAP.
  • When an overshoot becomes quite long, sometimes multiple times the requested interval, "keeping to schedule" is a lost cause.
  • More urgently, since bug 1329600 (CPU measurements), tiny inter-loop intervals can exaggerate apparent CPU usage spikes. Bug 1685938 will fix the main issue (gap between CPU measurement and its associated timestamp), but fixing the tiny-interval issue will help as well.

I'm proposing the following:

  • Try keeping to schedule as before; this is useful in almost all cases.
  • When the sleeping time becomes "too small" (to be defined), impose a minimum sleep time.
  • When the computed sleeping time (before max(0.0, ...)) becomes negative, give up on "keeping to schedule", and wait for a "normal" (TBD) sleep interval.

Time diagram:
'S' Sample loop. '.' minimum sleep time (one quarter of the interval here).

0...1...2...3   <-- Ideal intervals
S...S...S...S   <-- All good.
S..S....S...S   <-- S1 undershoot, longer sleep until S2.
S....S..S...S   <-- S1 overshoot, shorter sleep until S2.
S.....S.S...S   <-- S1 overshoot, shorter sleep still fine.
S......S.S..S   <-- S1 big overshoot, use minimum sleep, S2 smaller overshoot, S3 ok.
S.......S.S.S   <-- S1 big overshoot, now effectively S2! Sleeping minimum time twice, S3 ok here.
S........S...S  <-- S1 huge overshoot, now effectively overshot S2! Sleeping full interval to attempt to resume normal schedule.
S.........S...S <-- same.

The "big overshoot" scenarios feel more complicated to program. If too complex I may skip them (i.e., go for "reset to full interval" instead); we can revisit again if needed.

More exactly: Instead of trying to compensate for only the previous sleep over/undershoot, we now try to keep each sampling loop to a schedule based on the very beginning of sampling, by adding the requested interval to the scheduled sampling time.

In addition, the sleep time is never allowed to drop below a minimum, to avoid making the system busier by running one loop right after the other; samples taken that close together may also be less useful.

And in presumably very busy times, one sleep and the following sampling work may take much more time than the requested sampling interval. Trying to keep to schedule is then futile (it would require effectively multiplying the sampling rate, which seems unlikely to succeed and would impact Firefox even more), so in that case we revert to the full sampling interval.
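The three rules in this description can be sketched as a single decision function. This is only an illustration under stated assumptions, not the landed patch: `SleepPlan`, `PlanSleep`, the integer-microsecond timestamps, and the choice to restart the schedule from "now" after a missed slot are all hypothetical. The 1/4 minimum comes from the commit title below.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical types and names, for illustration only.
struct SleepPlan {
  std::int64_t sleepUs;          // How long to sleep before the next loop.
  std::int64_t nextScheduledUs;  // Schedule slot the next loop is aiming for.
};

// `scheduledStartUs`: when the loop that just ran was scheduled to start.
// `nowUs`: current time, after that loop's sampling work has finished.
// `intervalUs`: the user-requested sampling interval.
SleepPlan PlanSleep(std::int64_t scheduledStartUs, std::int64_t nowUs,
                    std::int64_t intervalUs) {
  const std::int64_t minSleepUs = intervalUs / 4;
  // The schedule advances by adding the interval to the scheduled time,
  // not to the actual start time, so jitter does not accumulate.
  const std::int64_t nextScheduledUs = scheduledStartUs + intervalUs;
  const std::int64_t untilNextUs = nextScheduledUs - nowUs;

  if (untilNextUs < 0) {
    // The next slot has already passed: give up on catching up, sleep a
    // full interval, and (assumption) restart the schedule from now.
    return {intervalUs, nowUs + intervalUs};
  }
  // Otherwise keep to schedule, but never sleep less than the minimum.
  return {std::max(minSleepUs, untilNextUs), nextScheduledUs};
}
```

For a 1000us interval and a loop scheduled at 0: finishing at 200us yields an 800us sleep; finishing at 900us yields the 250us minimum instead of 100us; finishing at 2500us yields a full 1000us sleep with the schedule restarted at 3500us.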

Depends on D101546

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7d1df171a5b6
Between sampling loops, never sleep less than 1/4 the requested interval - r=canaltinova
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch