Decouple WebRTC's LoadMonitor from RTC, move it to toolkit.




4 years ago
3 years ago


(Reporter: Yoric, Unassigned)


Firefox Tracking Flags

(Not tracked)



(1 attachment, 1 obsolete attachment)

Comment hidden (empty)
Assignee: nobody → dteller
Looking at the code, we come back to a frequent problem: we're polling the system for information at regular intervals. I haven't tried tracing the polling interval back to its value, but it's potentially battery-unfriendly. That's probably not a problem for WebRTC, but that can be a problem for PerfMonitoring and possibly other consumers.

I believe that the right thing to do is to expose as a public API the code used for accessing System/Process load, rather than the entire LoadMonitor. This will let us tie polling to other activities that take place when we already know that the thread is awake – either the instant we perform polling for add-on monitoring or a when we enter/exit the event loop.

gcp, pkerr, does this sound ok?
Flags: needinfo?(pkerr)
Flags: needinfo?(gpascutto)
I think it would be lead to a clean separation of the details: how to get the data from each operation system and the caller/consumer. The only tricky part is that you load estimate needs to take time into account when not running on a periodic callback.
Flags: needinfo?(pkerr)
So I'll try and move GetProcessLoad() and GetSystemLoad() to nsIPerformanceStatsService.
I have a WIP on moving UpdateProcessLoad()/UpdateSystemLoad() to toolkit, but 1/ that code is growing a bit too complex for my tastes; 2/ I'm starting to wonder whether this is the right thing to do.

I wonder if an API with the following primitives would not be better:
- acquire/release monitor;
- get latest value (without lock);
- get next value (also without lock).

With a single monitoring thread shared by the system, which runs iff at least one client has currently acquired the monitor. This would save resources, and the thread may be used by:
- WebRTC;
- the add-on monitor;
- the jank monitor;
- the unresponsive webpage monitor.

What do you think, Paul?
Flags: needinfo?(pkerr)
I think you have the clearest view on a solution that satisfies all consumers fwiw. From the WebRTC side the important thing is that it can get a timely notification if the actual load changes. From its perspective battery doesn't factor in (it's maxing the CPU) and it does want continuous monitoring.

From your description it sounds like you want LoadMonitor clients to tell it whether they want opportunistic polling (battery friendly) or a maximum interval and adopt accordingly. The actual functions for retrieving the load could be pushed in a separate API even lower, sure.
Flags: needinfo?(gpascutto)
Created attachment 8610507 [details]
MozReview Request: bz://1167175/Yoric

/r/9349 - Bug 1167175 - LoadMonitor, v2

Pull down this commit:

hg pull -r f1affbf998689c8c4e1ab949da414c24f6736458
Attachment #8610507 - Flags: review?(pkerr)
Attachment #8610507 - Flags: review?(gpascutto)
I have pushed to MozReview a possible replacement API for LoadMonitor. I believe that this API would satisfy the needs of PerfMonitoring. Would it also be sufficient for WebRTC?

::: dom/media/systemservices/LoadMonitor.h:86
(Diff revision 1)
> +     * thread, they may return NaN. Otherwise, a value in [.0, 1.0].

NaN sounds like a potential footgun.

Can you implement LoadManager in terms of this?

To me the fact that the clients don't signal the load granulairty they need makes it seem this can't achieve what was commented in this bug earlier.
We could certainly implement the following without too many difficulties:

Acquire(float maxGranularity)

Would that be sufficient?
Flags: needinfo?(gpascutto)
minGranularity sounds more logical :-)

LoadManager is currently the only consumer of LoadMonitor, so as long as you can keep that working your LoadMonitor changes should be fine for WebRTC.
Flags: needinfo?(gpascutto)
Ok. I'll experiment with this and I'll try and find something better than NaN.
Looks like the LoadManager can be reimplemented easily on top of this LoadMonitor v2.
Summary: Move dom/media/systemservices/LoadMonitor to toolkit → Decouple WebRTC's LoadMonitor from RTC, move it to toolkit.
Comment on attachment 8610507 [details]
MozReview Request: bz://1167175/Yoric

::: dom/media/systemservices/LoadMonitor.h:98
(Diff revision 1)
> +    explicit InternalLoadMonitor(int aLoadUpdateInterval);

How is the aLoadUpdateInterval selected? Can the user of LoadMonitor select a minimum update interval for a callback registration?

Also, can the user select the minimum change in a load value that will trigger a callback notification?
Attachment #8610507 - Flags: review?(pkerr)
comment via review board, see above
Flags: needinfo?(pkerr)
Comment on attachment 8610507 [details]
MozReview Request: bz://1167175/Yoric

Sorry but I find reviewboard completely unusable and I have no idea how to close this.
Attachment #8610507 - Flags: review?(gpascutto)
Comment on attachment 8610507 [details]
MozReview Request: bz://1167175/Yoric
Attachment #8610507 - Attachment is obsolete: true
Created attachment 8620346 [details]
MozReview Request: Bug 1167175 - Implement a CPU load monitor in toolkit;r=mossop
Comment on attachment 8620346 [details]
MozReview Request: Bug 1167175 - Implement a CPU load monitor in toolkit;r=mossop

Bug 1167175 - Implement a CPU load monitor in toolkit;r=mossop
Attachment #8620346 - Attachment description: MozReview Request: Bug 1167175 - LoadMonitor, v2 → MozReview Request: Bug 1167175 - Implement a CPU load monitor in toolkit;r=mossop
Comment on attachment 8620346 [details]
MozReview Request: Bug 1167175 - Implement a CPU load monitor in toolkit;r=mossop

Here's a first draft (not entirely tested yet).
For all non-Windows platforms, this largely a port of the code in LoadMonitor.cpp. For Windows, I used `GetSystemTimes` instead of a Process Counter, because it felt much simpler.
Attachment #8620346 - Flags: feedback?(pkerr)

::: toolkit/components/perfmonitoring/nsSystemLoad.cpp:138
(Diff revision 2)
> +  bool success = GetSystemTimes(&idleTime, &kernelTime, &userTime);

As long as the GetSystemTimes API call closely tracks the system load, in both value and response time, as show by the Windows performance monitor utility, then there should be no problem with changing the load calculation method for windows platforms. Using the system.diagnotic interface to the process counter was intended to work in the the same manner as the windows diagnostic tools and tracked the output of those tools very well.


4 years ago
Attachment #8620346 - Flags: feedback?(pkerr) → feedback+
No longer blocks: 1185926
By the way, do you know if the code you have been using is affected by the OS swapping to disk? I'm interested in [de]activating stuff conditionally based on whether the OS is responsive, and I'd like to include this in the definition of being not responsive.
Flags: needinfo?(pkerr)
Flags: needinfo?(gpascutto)
Based on general experience, you're more likely to see the load drop if the OS is swapping to disk, as processing gets blocked on disk IO.
Flags: needinfo?(gpascutto)
That's what I feared.
So how do you cope with swapping (memory pressure?) in WebRTC.
Flags: needinfo?(gpascutto)
Not? It's not like it's very relevant?
Flags: needinfo?(gpascutto)
To clarify, we care about CPU usage because video and audio encoders have high CPU complexity (and could easily take 100% of the CPU in the quest for better quality, if that was desirable). Memory requirements are much more modest.
Well, if I understand correctly, you use the LoadMonitor to decrease quality when you cannot cope with them because of system load. But swapping will also prevent the process from getting CPU time. which will cause the same kind of issues, no?
>swapping will also prevent the process from getting CPU time

No? Whatever needs to be swapped in won't be able to run at all but that shouldn't stop the (fairly small) core loops in the encoders (which are on separate threads) from pegging the CPU if possible.

A scenario where the WebRTC code is starved for CPU time because the DSP code in the encoders is swapped out is not one you need to consider, because before you get to that point everything else in the browser has already become completely unusable.
I agree with what gcp has posted on this subject.
Flags: needinfo?(pkerr)
Fair enough. The code attached should work, but I'm putting this on the backburner, because it looks like system load is not the information I'm looking for after all.
I've gone for an entire strategy change, so I'm dropping this bug. If someone wants to pick it, feel free.
Assignee: dteller → nobody
You need to log in before you can comment on or make changes to this bug.