When the phone app starts up under process stress, it briefly gets vanilla FOREGROUND priority

RESOLVED FIXED in Firefox 22

Status

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)

Tracking

({verifyme})

unspecified
B2G C4 (2jan on)
Dependency tree / graph
Bug Flags:
in-moztrap -

Firefox Tracking Flags

(blocking-b2g:tef+, firefox20 wontfix, firefox21 wontfix, firefox22 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 fixed)

Details

(Whiteboard: QARegressExclude)

Attachments

(3 attachments, 1 obsolete attachment)

STR:

 1. Reboot the phone
 2. Open the browser app
 3. Load http://people.mozilla.com/~cjones/membuster.html
 4. Click "bust processes"
 5. A second or two later, place a call to the phone

Results:

The communications app briefly gets vanilla FOREGROUND priority before it gets upgraded to FOREGROUND_HIGH priority.

This can cause the app to be killed due to pressure.

We're giving the comms app FOREGROUND priority because, at the point when the first nsGlobalWindow is created, we don't have any TabChild objects attached to the ContentChild.  Therefore the process can't figure out whether it's high-priority.

I'll see if there's a simple fix here.  If not, we may need to rewrite the process priority manager /again/, this time moving it into the parent process.  :-/
Here are the relevant logs:

> I/Gecko:ProcessPriorityManager(20886): Starting up.
> I/Gecko:ProcessPriorityManager(20886): Done starting up.  mHoldsCPUWakeLock=1, mHoldsHighPriorityWakeLock=0
> I/Gecko   (20886): ###################################### forms.js loaded
> I/Gecko   (20886): ############################### browserElementPanning.js loaded
> I/Gecko:ProcessPriorityManager(20886): DocumentGlobalCreated
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> I/Gecko:ProcessPriorityManager(20886): SetIsForeground
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> I/Gecko:ProcessPriorityManager(20886): Setting priority to FOREGROUND.
> I/Gecko   (20886): ######################## BrowserElementChildPreload.js loaded
> I/Gecko:ProcessPriorityManager(20886): Got visibilitychange.
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> I/Gecko:ProcessPriorityManager(20886): Scheduling reset timer to fire in 1000ms.
> I/Gecko:ProcessPriorityManager(20886): DocumentGlobalCreated
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> I/Gecko:ProcessPriorityManager(20886): ScheduleResetPriority bailing; the timer is already running.
> I/Gecko:ProcessPriorityManager(20886): TemporarilyLockProcessPriority
> I/Gecko:ProcessPriorityManager(20886): Scheduling reset timer to fire in 5000ms.
> I/Gecko:ProcessPriorityManager(20886): Got process-priority:reset-now notification.
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning TRUE.
> I/Gecko:ProcessPriorityManager(20886): SetIsForeground
> I/Gecko:ProcessPriorityManager(20886): IsCriticalProcessWithWakeLock returning TRUE.
> I/Gecko:ProcessPriorityManager(20886): Setting priority to FOREGROUND_HIGH.
Here's a log from another run, this time with timestamps.

> 02-19 14:29:34.588 I/DMD     ( 1482): $DMD = '1'
> 02-19 14:29:34.598 I/DMD     ( 1482): DMD is enabled
> 02-19 14:29:35.969 E/profiler( 1482): Registering start signal
> 02-19 14:29:36.240 E/GeckoConsole( 1482): Could not read chrome manifest 'file:///system/b2g/chrome.manifest'.
> 02-19 14:29:36.790 I/Gecko:ProcessPriorityManager( 1482): Starting up.
> 02-19 14:29:38.382 I/Gecko   ( 1482): ###################################### forms.js loaded
> 02-19 14:29:38.412 I/Gecko   ( 1482): ############################### browserElementPanning.js loaded
> 02-19 14:29:38.562 I/Gecko:ProcessPriorityManager( 1482): DocumentGlobalCreated
> 02-19 14:29:38.562 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): SetIsForeground
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): Setting priority to FOREGROUND.
> 02-19 14:29:38.632 I/Gecko   ( 1482): ######################## BrowserElementChildPreload.js loaded
> 02-19 14:29:38.702 I/Gecko:ProcessPriorityManager( 1482): Got visibilitychange.
> 02-19 14:29:38.702 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> 02-19 14:29:38.702 I/Gecko:ProcessPriorityManager( 1482): Scheduling reset timer to fire in 1000ms.
> 02-19 14:29:38.802 I/Gecko:ProcessPriorityManager( 1482): DocumentGlobalCreated
> 02-19 14:29:38.812 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> 02-19 14:29:38.812 I/Gecko:ProcessPriorityManager( 1482): ScheduleResetPriority bailing; the timer is already running.
> 02-19 14:29:39.142 I/Gecko:ProcessPriorityManager( 1482): TemporarilyLockProcessPriority
> 02-19 14:29:39.142 I/Gecko:ProcessPriorityManager( 1482): Scheduling reset timer to fire in 5000ms.
> 02-19 14:29:39.883 I/Gecko:ProcessPriorityManager( 1482): Got process-priority:reset-now notification.
> 02-19 14:29:39.883 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning TRUE.
> 02-19 14:29:39.883 I/Gecko:ProcessPriorityManager( 1482): SetIsForeground
> 02-19 14:29:39.883 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning TRUE.
> 02-19 14:29:39.883 I/Gecko:ProcessPriorityManager( 1482): Setting priority to FOREGROUND_HIGH.
Blocks: 836199
blocking-b2g: --- → tef?
Assignee: nobody → justin.lebar+bug
This is the safest way I could come up with to fix this bug.  But I'm now convinced that we need to move this machinery into the main process at our earliest opportunity; the number of hacks and timeouts in here is now really gross.
We should block on this so that we can get a better situation for apps on memory stress, specifically dialer, and this work affects other blockers.
blocking-b2g: tef? → tef+
Attachment #716188 - Flags: review?(jones.chris.g) → review+
Comment on attachment 716190 [details] [diff] [review]
Part 3: Wait for a tab-child to be created before modifying priorities in the ProcessPriorityManager.

The code in part 1 looks fine.

>diff --git a/dom/ipc/ProcessPriorityManager.cpp b/dom/ipc/ProcessPriorityManager.cpp

>-   * But if the new priority is a BACKGROUND* priority and this process is not
>-   * currently in the background, we schedule a timer and run
>-   * ResetPriorityNow() after a short period of time.

>+   * So for example, a transition between any two BACKGROUND* priorites happens
>+   * immediately, but a transition from UNKNOWN to FOREGROUND_HIGH happens
>+   * after a grace period.

That's not at all obvious to me.  We wouldn't ever transition between
BACKGROUND, but why would they be excluded?

I'm having some trouble "seeing" the race condition.  It's not obvious to me from the log.  Can you help me out with a quick summary content?  (My eyes are starting to glaze over a bit, not the fault of this patch ;) .)
Attachment #716190 - Flags: review?(jones.chris.g)
Flags: needinfo?(justin.lebar+bug)
> That's not at all obvious to me.  We wouldn't ever transition between
> BACKGROUND, but why would they be excluded?

This was the behavior before, and I kept it mostly for the sake of not changing behavior without need.

We do transition between BACKGROUND priorities if e.g. a bg process holding the CPU lock (and thus with BG_PERCEIVABLE priority) releases it.

I agree it would be simpler if we always applied a grace period for "down" transitions and never applied it for "up" transitions.  I think that probably wouldn't cause problems, but I don't see a big advantage to taking that change at this point.

What do you think?

> I'm having some trouble "seeing" the race condition.  It's not obvious to me
> from the log.  Can you help me out with a quick summary content?

Sure, it's not obvious, and some of these log messages aren't even in the source.

> 02-19 14:29:38.562 I/Gecko:ProcessPriorityManager( 1482): DocumentGlobalCreated
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): SetIsForeground
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): IsCriticalProcessWithWakeLock returning false; no critical frames out of 0 frames.
> 02-19 14:29:38.572 I/Gecko:ProcessPriorityManager( 1482): Setting priority to FOREGROUND.

What happens is that DocumentGlobalCreated occurs, and that triggers ResetPriority.  We notice that we're in the foreground, and we notice that the main process has acquired the CPU wake lock on our behalf.  But because we do not have any TabChild's ("no critical frames out of **0 frames**"), we can't check them for the mozapptype=critical.  Thus IsCriticalProcessWithWakeLock returns false, and we get vanilla FG priority.
Flags: needinfo?(justin.lebar+bug)
Attachment #716187 - Flags: review?(jones.chris.g) → review+
Comment on attachment 716190 [details] [diff] [review]
Part 3: Wait for a tab-child to be created before modifying priorities in the ProcessPriorityManager.

>diff --git a/dom/ipc/ProcessPriorityManager.cpp b/dom/ipc/ProcessPriorityManager.cpp

>+   *  - UNKNOWN
>+   *  - FOREGROUND_HIGH
>+   *  - FOREGROUND
>+   *  - BACKGROUND*
>+   *
>+   * So for example, a transition between any two BACKGROUND* priorites happens
>+   * immediately, but a transition from UNKNOWN to FOREGROUND_HIGH happens
>+   * after a grace period.

I think it would be simpler and easier to understand if imposed a
total ordering on priorities and always used a grace period for
"down", but we can't be unnecessarily changing this kind of behavior
at this point.  So let's get a followup on file.

I would have found it helpful to have one more sentence of explanation
for the BACKGROUND* transitions.  Just a note "that's just the way we
do it now" is fine.

> NS_IMETHODIMP
> ProcessPriorityManager::Observe(
>   nsISupports* aSubject,
>   const char* aTopic,
>   const PRUnichar* aData)
> {

>+  if (!strcmp(aTopic, "tab-child-created")) {

In the case where c-d-g-c arrives before this, aren't we going to miss
hooking a visibility change watcher for the global?

>+  } else if (!strcmp(aTopic, "content-document-global-created")) {

It seems to me that if we can resolve the above problem, this and
tab-child-created might be redundant.  But I might misunderstand.

(r- not because I think the approach is wrong, but because I already cleared the review request and want to ensure you see this ;).)
Attachment #716190 - Flags: review-
> In the case where c-d-g-c arrives before this, aren't we going to miss
> hooking a visibility change watcher for the global?

Ouch, you're right.  :-/
This is much better than the previous approach, IMO.
Attachment #717339 - Flags: review?(jones.chris.g)
Attachment #716190 - Attachment is obsolete: true
Attachment #717339 - Flags: review?(jones.chris.g) → review+
> I think it would be simpler and easier to understand if imposed a
> total ordering on priorities and always used a grace period for
> "down", but we can't be unnecessarily changing this kind of behavior
> at this point.  So let's get a followup on file.

I filed bug 844323.
m-i is closed; I guess that's one way to get me to push to try.  :)

https://tbpl.mozilla.org/?tree=Try&rev=ff7943a84f2d
Whiteboard: he tc for Portuges
Whiteboard: he tc for Portuges → QARegressExclude
No Test case creation is needed in moztrap for this issue.
Flags: in-moztrap-
Cannot verify, need steps to blackbox test this issue.
Angela - Can you verify this per comment 0?
Flags: needinfo?(ahubenya)
Keywords: verifyme
Jason,
that link doesn't work anymore.
Flags: needinfo?(ahubenya)
(In reply to Angela Hubenya from comment #21)
> Jason,
> that link doesn't work anymore.

http://bit.ly/membuster
You need to log in before you can comment on or make changes to this bug.