Closed Bug 1158110 Opened 9 years ago Closed 9 years ago

Throttle jank monitoring of quick-firing compartments

Categories

(Toolkit :: Performance Monitoring, defect)

Type: defect
Priority: Not set
Severity: normal

RESOLVED WONTFIX

People

(Reporter: Yoric, Unassigned)

Bug 1152930 indicates that quick-firing compartments, such as compartments that use lots of DOM Promise, suffer from considerable slowdown because of jank monitoring.

To remedy this, we can introduce a mechanism for throttling jank monitoring, then extrapolate the measurements we skipped from the data we did collect. I'll use $ITERATION for the number of times we have called ProcessNextEvent() in the entire process containing this XPConnect/Worker runtime (that's a piece of information that we already use for stopwatch monitoring).

The key idea is that if the stopwatch enters the same PerfGroup twice within $THROTTLING_LIMIT iterations, the PerfGroup is considered quick-firing, so we:
- deactivate the stopwatch temporarily for this PerfGroup;
- wait 2 * Random() * $THROTTLING_LIMIT iterations before we reactivate it;
- count how many times we have hit the PerfGroup while the stopwatch is deactivated;
- on the next attempt to enter the same PerfGroup after the stopwatch is reactivated, we extrapolate the data as follows:
  CPU time <- average of (CPU time spent in this PerfGroup just before deactivating, CPU time spent in the PerfGroup just after reactivating) * number of times we have hit the PerfGroup while the stopwatch was deactivated.
This algorithm should have no effect on non-quick-firing PerfGroups, but should decrease the overhead on quick-firing PerfGroups by a factor proportional to $THROTTLING_LIMIT.
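
The steps above can be sketched as follows. This is only an illustration in Python, not the stopwatch code itself: the names `PerfGroup`, `enter`, `measure` and `rng`, and the value chosen for `THROTTLING_LIMIT`, are all hypothetical. It also assumes that Random() is uniform in [0, 1), that the wait is counted in $ITERATION units, and that hits seen while the stopwatch is deactivated still count towards the "twice within $THROTTLING_LIMIT iterations" test.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional

THROTTLING_LIMIT = 10  # hypothetical value, in ProcessNextEvent() iterations


@dataclass
class PerfGroup:
    total_cpu: float = 0.0                 # measured + extrapolated CPU time
    last_hit: Optional[int] = None         # $ITERATION of the previous hit
    reactivate_at: Optional[float] = None  # None while the stopwatch is active
    cost_before: float = 0.0               # last cost measured before deactivating
    skipped: int = 0                       # hits seen while deactivated


def enter(group: PerfGroup, iteration: int, measure: Callable[[], float],
          rng: Callable[[], float] = random.random) -> bool:
    """Handle one entry into `group`; return True if it was actually measured."""
    if group.reactivate_at is not None and iteration < group.reactivate_at:
        group.skipped += 1          # stopwatch deactivated: only count the hit
        group.last_hit = iteration
        return False

    cost = measure()                # the expensive part we are trying to avoid
    group.total_cpu += cost
    if group.reactivate_at is not None:
        # First measurement after reactivation: fill in the skipped hits with
        # the average of the costs measured just before and just after.
        group.total_cpu += (group.cost_before + cost) / 2 * group.skipped
        group.skipped = 0
        group.reactivate_at = None

    if group.last_hit is not None and iteration - group.last_hit < THROTTLING_LIMIT:
        # Quick-firing: deactivate for a randomized number of iterations.
        group.cost_before = cost
        group.reactivate_at = iteration + 2 * rng() * THROTTLING_LIMIT
    group.last_hit = iteration
    return True
```

In this sketch a group that is hit less often than once every `THROTTLING_LIMIT` iterations never takes the throttled path, while a group hit on every iteration is measured roughly once per `THROTTLING_LIMIT` iterations on average. Hits skipped after the last measurement stay in `skipped` until the next measured entry, so a real implementation would need to decide what to report for them.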

A few possible drawbacks:
- we may accidentally miss occurrences that are extremely expensive in the middle of cheap occurrences, and vice-versa;
- the benefit of the approach decreases if we have several quick-firing PerfGroups running interleaved.
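
To make the first drawback concrete, here is a small worked example of the averaging rule. The costs are invented numbers (think microseconds) and `extrapolate` is a hypothetical helper, so this only shows the shape of the error, not measured data.

```python
def extrapolate(cost_before: float, cost_after: float, skipped: int) -> float:
    """Estimated CPU time for `skipped` unmeasured hits, per the averaging rule."""
    return (cost_before + cost_after) / 2 * skipped


# Case 1: one expensive hit hidden among cheap ones while deactivated.
hidden_spike = [1.0] * 4 + [10_000.0] + [1.0] * 4
actual_spike = sum(hidden_spike)                          # 10008.0
under = extrapolate(1.0, 1.0, len(hidden_spike))          # 9.0

# Case 2 (the vice-versa): the sample taken just before deactivating happens
# to be the expensive one, and every skipped hit is cheap.
all_cheap = [1.0] * 9
actual_cheap = sum(all_cheap)                             # 9.0
over = extrapolate(10_000.0, 1.0, len(all_cheap))         # 45004.5
```

With these made-up numbers the estimate is off by about three orders of magnitude in either direction, which is why the randomized wait only helps on average and not for a single window.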

Any thoughts?
Flags: needinfo?(padenot)
Flags: needinfo?(bzbarsky)
Flags: needinfo?(avihpit)
Flags: needinfo?(padenot)
This seems somewhat plausible to try, at least...
Flags: needinfo?(bzbarsky)
My knowledge of this context is limited, to say the least, so I can't comment on the actual algorithm, but I'd agree with comment 2.

However, whatever algorithms you end up trying, in order to get some rough understanding of inaccuracy levels this might be introducing, IMO it's worth trying to assess the following:

- How frequently this throttling happens in a few use cases, preferably some extreme cases and some real-world ones.

- Compare perf numbers collected with the throttling to non-throttled numbers of the same use case.
Flags: needinfo?(avihpit)
Depending on bug 1181175, this may be useless.
Depends on: 1181175
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX