Closed Bug 902000 Opened 11 years ago Closed 11 years ago

Add a service that monitors the system load


(Core :: WebRTC: Audio/Video, defect)

Not set





(Reporter: gcp, Assigned: gcp)




(2 files, 3 obsolete files)

For adapting the video encoding during a WebRTC call (bug 877954), we want to know the system load and how much we are responsible for that. 

Google's libjingle has an implementation in talk/base/ that we can use ideas from.

Bug 854204 implements something like this for B2G, but it looks like an e10s and Android/Linux specific hack.
Assignee: nobody → gpascutto
Blocks: 877954
Based on what libjingle does. We use a background thread so we can safely extend this to all OSes, and that probably also works a bit better if the system is heavily loaded, i.e. when we need this information the most. Only Linux/Android/B2G support for now. 

We collect the system and Firefox process load information once per second, and fire a runnable back to the main thread to update the nsSysinfo structure.
Attachment #790258 - Flags: review?(doug.turner)
Looks like hg ate part of the patch. Here's the correct version.
Attachment #790258 - Attachment is obsolete: true
Attachment #790258 - Flags: review?(doug.turner)
Attachment #790341 - Flags: review?(doug.turner)
Comment on attachment 790341 [details] [diff] [review]
Patch 1. Add load collection to nsSysInfo

God, this patch sucks. It will delay shutdown by up to 1 second and leak memory.

Also, there's no reason to incur the overhead of that extra thread if no WebRTC session is running, so this really should go into a seperate service.
Attachment #790341 - Attachment is obsolete: true
Attachment #790341 - Flags: review?(doug.turner)
Well, that escalated quickly. Somewhere between observers needing to be on the main thread, WebRTC being in a thread by itself, and this thing wanting its own thread this got a bit more complicated to deal with proper shutdown and object lifetimes. It think I got it correct now.

This might at some point better become a proper XPCOM service again, but I'd love to actually try to use it first to see how the API should evolve. Besides being an XPCOM service brings in all kind of threading issues again because then it needs to go back to the main thread instead of the WebRTC one.
Attachment #793416 - Flags: review?(rjesup)
I'd love to have this for Gecko/Firefox 26, but I need it for Gecko 27; so that's the milestone I'm setting.
Target Milestone: --- → mozilla27
Comment on attachment 793416 [details] [diff] [review]
Patch 1. Add service to monitor system load

Review of attachment 793416 [details] [diff] [review]:

r+ with a switch from runnables to update the load values to mutex-locking access to the load values.

::: content/media/webrtc/LoadMonitor.cpp
@@ +220,5 @@
> +  const uint64_t total_times = cpu_times + idle;
> +
> +  UpdateCpuLoad(total_times,
> +                cpu_times * PR_GetNumberOfProcessors(),
> +                &mSystemLoad);

For the future, we may want point measurements and also some type of smoothed measurement, but that can be defined later.

@@ +248,5 @@
> +      (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) * PR_USEC_PER_SEC +
> +       usage.ru_utime.tv_usec + usage.ru_stime.tv_usec;
> +#endif // defined(LINUX) || defined(ANDROID)
> +
> +#if defined(LINUX) || defined(ANDROID)

At least for now, merge these two blocks

@@ +268,5 @@
> +
> +  NS_IMETHOD Run()
> +  {
> +    mLoadMonitor->SetSystemLoad(mLoadInfo->GetSystemLoad());
> +    mLoadMonitor->SetProcessLoad(mLoadInfo->GetProcessLoad());

I have a concern about using a runnable (allocate, addrefs, run through the event queue, deallocate) for each update, and with the assumption this will only be accessed from mainthread.  The second may well not be true (in fact it's unlikely to be true), so let's switch to a mutex instead of using a runnable to update it.

@@ +291,5 @@
> +  {
> +    MutexAutoLock lock(mLoadMonitor->mLock);
> +    while (!mLoadMonitor->mShutdownPending) {
> +      mLoadInfo->UpdateSystemLoad();
> +      mLoadInfo->UpdateProcessLoad();

Let's add a LOG() here every second or so when log level is high, and a LOG every 10 seconds (or so) when logs are on but at a lower level.
Attachment #793416 - Flags: review?(rjesup) → review+
Since I think the emergencies on Android have abated, let's land this so we can make sure we don't have any surprises later, and to set it up for extending to make use of the load factor. Is there some reason not to?

Question: will this handle e10s?  I assume this won't handle it fully, since RUSAGE_SELF is only the current process (though that might work well enough, depending).

Is there an easy way to auto-disable this if E10S is in use? (until we code something up to handle E10S) and complain.
Nothing is holding this back besides that I was (and am) still putting out other fires, and I did want to have a first shot at using the info it provides to see if the API needs change.

Pretty sure we can disable it in e10s, but for now there's no reason to: system load should be correct and process load is also correct but with the caveat you mentioned.
Rebased. Incorporated review comments.
Attachment #793416 - Attachment is obsolete: true
Attachment #822289 - Flags: review?(rjesup)
Attachment #822291 - Flags: review?(rjesup)
Comment on attachment 822289 [details] [diff] [review]
Patch 1. Add service to monitor system load

Review of attachment 822289 [details] [diff] [review]:

::: content/media/webrtc/LoadMonitor.cpp
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 50; indent-tabs-mode: nil; c-basic-offset: 2 -*- */


@@ +130,5 @@
> +
> +    mLoadInfoThread = nullptr;
> +
> +    nsRefPtr<LoadMonitorRemoveObserver> remObsRunner = new LoadMonitorRemoveObserver(this);
> +    NS_DispatchToMainThread(remObsRunner, NS_DISPATCH_NORMAL);

I don't love dispatching to main when we may be on main already, but not a big deal.  It does mean the main event queue may need another cycle to remove the observer if we are in xpcom-shutdown observe.

@@ +232,5 @@
> +}
> +
> +nsresult LoadInfo::UpdateProcessLoad() {
> +#if defined(LINUX) || defined(ANDROID)
> +  // Common to both OSX and Linux.

Comment is confusing given the ifdef

@@ +238,5 @@
> +  gettimeofday(&tv, NULL);
> +  const uint64_t total_times = tv.tv_sec * PR_USEC_PER_SEC + tv.tv_usec;
> +#endif
> +
> +#if defined(LINUX) || defined(ANDROID)

Why end one ifdef and start another identical one?

::: content/media/webrtc/LoadMonitor.h
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 50; indent-tabs-mode: nil; c-basic-offset: 4 -*- */

Attachment #822289 - Flags: review?(rjesup) → review+
Attachment #822291 - Flags: review?(rjesup) → review+
Pushed with a check if we're on the main thread already before dispatching, and a small fix that resets the clock data if it's inconsistent - that allows us to recover if the device clock is shaky (*cough* Android). The tab-width: 50 is in several of our files. It's there to make tabs stand out and avoid mixed indentation.
You need to log in before you can comment on or make changes to this bug.