Closed Bug 938470 Opened 6 years ago Closed 6 years ago

Investigate increasing usable memory on FxOS by delaying homescreen launch

Categories

(Core :: IPC, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla29
blocking-b2g 1.3+
Tracking Status
firefox27 --- unaffected
firefox28 --- fixed
firefox29 --- fixed
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed

People

(Reporter: cyu, Assigned: cyu)

References

Details

Attachments

(2 files, 8 obsolete files)

4.04 KB, patch
bent.mozilla
: review+
Details | Diff | Splinter Review
10.29 KB, patch
smaug
: review+
Details | Diff | Splinter Review
We can consider delaying launching the homescreen and the usage app and launch them after the Nuwa process is ready.
We'd save ~11mb.  That memory is available anyway, but with the current implementation, you have to get to an extreme OOM situation where homescreen and usage are killed.  Usage dies off early and gives you 5mb of usable memory back.  But because the oom_adj of homescreen is low, you probably won't ever get there - the lowmemorykiller will assassinate other processes before we get to <10mb of available memory which is when homescreen would be killed - so 6mb of memory are really wasted.

11mb of memory now is plenty to run another app with the fantastic Nuwa work.  If the implementation for delayed launching for this were straightforward, seems like a clear win.
ok, concrete data (finally back in front of a computer), before (homescreen and usage forked from b2g):

                          |    megabytes    |
           NAME  PID PPID NICE  USS  PSS  RSS VSIZE OOM_ADJ USER    
            b2g 2812    1    0 52.1 56.1 67.7 169.3       0 root    
         (Nuwa) 2847 2812    1  3.4  9.8 23.4  51.6       2 root    
          Usage 2880 2812   18 13.1 16.4 27.5  67.0      10 app_2880
     Homescreen 2920 2812    1 19.9 23.6 34.9  77.5       2 app_2920
(Preallocated a 3012 2847   18  2.9  7.5 15.3  55.6      10 root    

System memory info:

            Total 172.6 MB
     Used - cache 114.2 MB
  B2G procs (PSS) 113.4 MB
    Non-B2G procs   0.8 MB
     Free + cache  58.3 MB
             Free  19.2 MB
            Cache  39.1 MB

after (homescreen and usage forked from nuwa):

                          |    megabytes    |
           NAME  PID PPID NICE  USS  PSS  RSS VSIZE OOM_ADJ USER    
            b2g 3036    1    0 51.6 57.3 69.3 170.9       0 root    
         (Nuwa) 3068 3036    1  4.4  9.9 23.4  51.6       2 root    
     Homescreen 3205 3068    1 11.0 15.2 28.0  69.3       2 app_3205
          Usage 3243 3068   18  9.4 14.5 28.3  72.1      10 app_3243
(Preallocated a 3291 3068   18  2.9  5.7 14.2  55.6      10 root    

System memory info:

            Total 172.6 MB
     Used - cache 104.6 MB
  B2G procs (PSS) 102.7 MB
    Non-B2G procs   2.0 MB
     Free + cache  67.9 MB
             Free  17.5 MB
            Cache  50.4 MB

Low-memory killer parameters:

  notify_trigger 14336 KB

  oom_adj min_free
        0  4096 KB
        1  5120 KB
        2  6144 KB
        6  7168 KB
        8  8192 KB
       10 20480 KB

For these snapshots, homescreen is stealing an unneeded 7.9mb and Usage 3.7mb.
Does that slow down the boot to usable phone time? If yes, by how much?
Short answer: I think if done right it wouldn't slow down boot, the contrary.  On the outside worst case, you're looking at a second or two.

Longer answer: That's a very interesting question.  To answer it I added CPU (utime + stime ticks converted to seconds) to b2g-info and restarted b2g - right at the point the lock screen comes up, I see this:


                          |     megabytes     |
           NAME  PID PPID CPU(s) NICE  USS  PSS  RSS VSIZE OOM_ADJ USER    
            b2g 3988    1   15.9    0 51.9 55.7 67.5 167.3       0 root    
         (Nuwa) 4020 3988    1.1    1  3.4  9.8 23.4  51.6       2 root    
          Usage 4039 3988    1.6   18 13.0 16.4 27.5  67.0      10 app_4039
     Homescreen 4070 3988    3.0    1 18.9 22.6 34.0  77.5       2 app_4070
(Preallocated a 4156 4020    0.1   18  2.9  7.4 15.2  55.6      10 root 

b2g consumes between 10s-15s from launch to fully "ready".

Given that in agregate launching Usage, Nuwa and Homescreen take only 5.7s of compute time to start, I would think with creative re-ordering of launching you could actually *save* time.  and the worst possible case would be delaying first visible pixel by 3s.

questions & thoughts:
1. why start usage at all?  it's just killed off by lowmemorykiller as soon as the user opens a an app or two today.  and because it runs in parallel to homescreen it just slows things down
2. Homescreen takes 3s to launch without Numa, 2.2s to launch with.  So waiting until numa is around means you shave total compute time required to spawn up the process.
I also proposed to not start usage unless we want to display the widget in the notification tray, but the idea was shot down at the time, for no good reason imho.

3s to launch the homescreen seems way more than what we have now... And remember the lockscreen is rendered by the system app, not by the homescreen app so waiting for the lockscreen may not mean much. Also, the system app is the one that launches the homescreen, not some magic from the platform side. Welcome to b2g ;)
So this is someone hard to measure.  But the best I can test, with the following patch (which I think does what you propose):

diff --git a/apps/system/js/cost_control.js b/apps/system/js/cost_control.js
index 9f817be..c8c0fa5 100644
--- a/apps/system/js/cost_control.js
+++ b/apps/system/js/cost_control.js
@@ -117,7 +117,6 @@
           widgetFrame.setVisible(false);
         });
       } else {
-        _ensureWidget();
         widgetFrame.setVisible(false);
       }
     });

Time from b2g launch to lock screen visible is reduced from ~15.2 to ~14.7s  

Give usage at launch looks like this (1.7s of cpu time to startup):

          Usage 2095    1.7 2043   18 11.8 15.1 26.3  66.0      10 app_2095

This suggests we are hurting boot up time by parallelizing Usage startup and system startup.

And precisely at the 14.7s mark, b2g-info output looks like this:

[lloyd@flapper ~]$ adb shell b2g-info
                            |    megabytes    |
           NAME  PID CPU(s) PPID NICE  USS  PSS  RSS VSIZE OOM_ADJ USER    
            b2g 1756   12.0    1    0 54.9 59.8 68.7 162.1       0 root    
plugin-containe 1788    0.2 1756   18  3.2  4.4  9.1  39.7      10 root    
plugin-containe 1809    0.2 1756   18  3.4  4.8 10.0  48.6      10 app_1809
     Homescreen 1812    1.2 1756    1 12.1 16.8 25.3  66.2       2 app_1812

Note, homescreen has consumed 1.2s of CPU time.

Because humans are very slow compared to computers, If we simply waited to start homescreen spawning until the user unlocks (homescreen is already spawned at that point) or until Nuwa was initialized, then we'd save even more (presumably, another 1.2s) on perceived boot up time, as well as reduced memory usage.

Bottom line: it *looks* like you can shave 1/2 second off bootup time and reclaim 13mb for apps the user is actually using, with a one line patch.  

With a slightly more complex patch, I think you could save another 1.2s on perceived bootup and also get 4ish mb back for the user's apps (homescreen under nuwa).

sorry for all the typing, firefoxos is fun!
s/someone/somewhat/
What when the lockscreen is disabled? ;)
With lockscreen disabled you have options:
1. You can wait to display first pixel until b2g -> Nuwa -> Homescreen
2. You can maintain the status quo b2g, then Nuwa and Homescreen in parallel

Because homescreen spawns about a second faster when we wait for Nuwa, and because Nuwa spawns in 1.1s, well, do the math...

If there's a simple and reliable way to block higher level initialization until Nuwa is ready, then I'd be happy to apply that patch and give you timing information.  

My guess is based on the numbers you'll have no significant delta in bootup and and extra 4-6mb of memory to play with.
Summary: Investigate delaying homescreen launch from the Nuwa process → Investigate increasing usable memory on FxOS by delaying homescreen launch
But summary so far, my claims are thus:

1. You can improve boot time and save memory by not spawning Usage if the widget is not enabled (Comment #6 above 0 one line patch)
2. You can free 4-7mb of memory with negligible impact to bootup times (could be a bit faster) by not launching homescreen and Nuwa in parallel (Nuwa first, then homescreen)
(In reply to Lloyd Hilaiel [:lloyd] from comment #10)
> But summary so far, my claims are thus:
> 
> 1. You can improve boot time and save memory by not spawning Usage if the
> widget is not enabled (Comment #6 above 0 one line patch)

Vivien & Salva, do you remember why my patch doing that was r- a while ago? I can't find the bug number anymore...
Flags: needinfo?(salva)
Flags: needinfo?(21)
Yes, the reason (and I think it is a good reason) is that we **needed** to detect the SIM has changed as soon as possible as the NetworkStats API did not have the ability to distinguish between the data of two different SIM cards. So, if we does not launch Usage as soon as possible, all the data usage made before showing the widget was assigned to an incorrect SIM.

I said **needed** because now we have multi-SIM support so you're lucky as it is about to enter in m-c. See bug 937041 and bug 937240.
Flags: needinfo?(salva)
(In reply to Salvador de la Puente González [:salva] from comment #12)
> Yes, the reason (and I think it is a good reason) is that we **needed** to
> detect the SIM has changed as soon as possible as the NetworkStats API did
> not have the ability to distinguish between the data of two different SIM
> cards. So, if we does not launch Usage as soon as possible, all the data
> usage made before showing the widget was assigned to an incorrect SIM.
> 
> I said **needed** because now we have multi-SIM support so you're lucky as
> it is about to enter in m-c. See bug 937041 and bug 937240.

Does it mean we can just not start the widget at startup now? Or is there any followup bugs to fix in the cost control app?
Flags: needinfo?(21) → needinfo?(salva)
Hello Vivien. How long no see you!

I would prefer to wait a little bit because we are adapting the application to the new API. Once we know there is no more need of starting the app at the beginning, we can remove it from the start-up.
Flags: needinfo?(salva)
What's the estimated time for knowing? Seems like we have potential big savings in memory use if we don't have to start the app at startup.

Also, in the future, please do let us know that there's a problem with the API rather than work around those problems by introducing hacks like starting apps during startup. The APIs are generally designed the way they are because we think it's what makes the API good, not because it's the only thing we are willing to implement. We always aim to design APIs that are good for you gaia developers.
Cervantes, are you the best one to own this bug? Thanks.
Flags: needinfo?(cyu)
Attached patch notify-slowthings.patch (obsolete) — Splinter Review
This patch fires a system message once the pre-allocated process is loaded. The system app could use that to wait before launching the homescreen.
Yes, I am working on this.
Assignee: nobody → cyu
Flags: needinfo?(cyu)
(In reply to Salvador de la Puente González [:salva] from comment #14)
> Hello Vivien. How long no see you!
> 
> I would prefer to wait a little bit because we are adapting the application
> to the new API. Once we know there is no more need of starting the app at
> the beginning, we can remove it from the start-up.

Thanks for the information, is there a bug number for that ?
Flags: needinfo?(salva)
Yep, bug 858003
Flags: needinfo?(salva)
(In reply to Salvador de la Puente González [:salva] from comment #20)
> Yep, bug 858003

Sorry I meant to fix what you describe in the Cost Control app.
Flags: needinfo?(salva)
No, I think is ok. Maybe that bug is quite general but we are stabilizing what's happening on Cost Control there: new API (in Gecko) + adaptations (in Gaia). Maybe, after resolution of bug 937251 we could address this.
Flags: needinfo?(salva)
Attached patch Gaia patch (untested) (obsolete) — Splinter Review
This untested Gaia patch use the notification from Fabrice's patch to delay the opening of any apps until the preloaded process is ready. This should work for the homescreen, the first time experience, the keyboard and also the cost control widget.
(In reply to Vivien Nicolas (:vingtetun) (:21) (RTO - Review Time Off until 09/12/13) from comment #23)
> Created attachment 8339903 [details] [diff] [review]
> Gaia patch (untested)
> 
> This untested Gaia patch use the notification from Fabrice's patch to delay
> the opening of any apps until the preloaded process is ready. This should
> work for the homescreen, the first time experience, the keyboard and also
> the cost control widget.

And this probably break tests for non-oop builds.
I believe the next step is that someone needs to try Gecko & Gaia patch together on top of a Nuwa build :)
(In reply to Fabrice Desré [:fabrice] from comment #17)
> Created attachment 8337280 [details] [diff] [review]
> notify-slowthings.patch
> 
> This patch fires a system message once the pre-allocated process is loaded.
> The system app could use that to wait before launching the homescreen.

Thanks Fabrice, but we are trying another approach without modification to the system app. The plan is to make nsFrameLoader async, and the system app just adds a remote iframe as usual.
(In reply to Cervantes Yu from comment #26)
> (In reply to Fabrice Desré [:fabrice] from comment #17)
> > Created attachment 8337280 [details] [diff] [review]
> > notify-slowthings.patch
> > 
> > This patch fires a system message once the pre-allocated process is loaded.
> > The system app could use that to wait before launching the homescreen.
> 
> Thanks Fabrice, but we are trying another approach without modification to
> the system app. The plan is to make nsFrameLoader async, and the system app
> just adds a remote iframe as usual.

That sounds like the right approach, but if we want to activate Nuwa for 1.3, it is not much more risky and should we go with a safer approach first and make it better in 1.4 ?
Please note that this patch has 2 prerequisites:
1. patch from bug 930282.
2. patch from bug 944665.
Depends on: 944665
V2. Fix the bug that the the homescreen fails to launch when the property is set to false.

TODO: Delaying async messages is kind of dirty. Will try to figure out other alternatives.
Attachment #8340979 - Attachment is obsolete: true
Whiteboard: [tarako]
This fixes the assertion failure in the debug build:
PJavaScriptParent is created on the parent side, but we use RegisterID() with it.
Attachment #8343021 - Flags: review?(bent.mozilla)
Fix leaks in the static nsTArray
Attachment #8343018 - Attachment is obsolete: true
Attachment #8343018 - Flags: review?(khuey)
Attachment #8343625 - Flags: review?(khuey)
Comment on attachment 8343021 [details] [diff] [review]
Part 2: Fix CloneManagees() for registering parent-created actors

Review of attachment 8343021 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/ipdl/ipdl/lower.py
@@ +3828,5 @@
>                  cond=ExprBinary(ivar, '<', _callCxxArrayLength(kidsvar)),
>                  update=ExprPrefixUnop(ivar, '++'))
> +            registerstmt = StmtIf(ExprBinary(_actorId(actorvar),
> +                                             '>',
> +                                             ExprLiteral.ZERO))

Hm, why do we have to do this dynamic check? Shouldn't we be able to know at codegen time whether we will need to call Register or RegisterID?
Comment on attachment 8343021 [details] [diff] [review]
Part 2: Fix CloneManagees() for registering parent-created actors

Ah, because PJavaScript has 'both' constructors I think.

So it looks like we can know at codegen time if we need to call Register() or RegisterID() for 'child' or 'parent' constructors, and then I guess we need this dynamic check for for protocols that have 'both' constructors?
Blocks: 128RAM
blocking-b2g: --- → 1.3?
triage: 1.3+ to benefit tarako. unless after review we believe this is too risky to uplift then we can discuss further
blocking-b2g: 1.3? → 1.3+
Comment on attachment 8343021 [details] [diff] [review]
Part 2: Fix CloneManagees() for registering parent-created actors

Clearing review until those questions are answered (just to clean up my work queue).
Attachment #8343021 - Flags: review?(bent.mozilla)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #36)
> So it looks like we can know at codegen time if we need to call Register()
> or RegisterID() for 'child' or 'parent' constructors, and then I guess we
> need this dynamic check for for protocols that have 'both' constructors?
Yes, we can do that.
So we need to check that the message is ctor and direction is INOUT and only call Register() or RegisterID() on such ctors. I will update the patch to address this.
Only select Register() or RegisterID() for managees created by an INOUT ctor when cloning managees.
Attachment #8343021 - Attachment is obsolete: true
Attachment #8347164 - Flags: review?(bent.mozilla)
Depends on: 950266
Comment on attachment 8347164 [details] [diff] [review]
Part 2: Fix CloneManagees() for registering parent-created actors

Review of attachment 8347164 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks!
Attachment #8347164 - Flags: review?(bent.mozilla) → review+
Part 1 v2: fix a corner case revealed by a hamachi with 128MB config: when the preallocated process is killed by the lowmem killer, the request for launching an app with get stuck.
Attachment #8343625 - Attachment is obsolete: true
Attachment #8343625 - Flags: review?(khuey)
Attachment #8349325 - Flags: review?(khuey)
fwiw, I tested with these 2 patches and nuwa re-enabled, and this works.
I also tested on 128MB device, process can still fork from Nuwa even when Preallocated process is killed by LMK. 

The different is launch time without Preallocated process is longer than the one with Preallocated process. 
For example launch time of "Messages" is about "965ms" vs "1718ms". 

With Preallocated process App can be launched much quicker, but on a low memory devices, the possibility of Preallocated process been killed by LMK is very high. So with and without Preallocated process is much different for low memory devices. May be we can consider to support Nuwa without Preallocated process for low memory devices.

It's only my 2 cents...
We should spend time on making starting applications without having a preallocated process faster. I.e. we should analyze what code is running during the preallocation and try to move that to happen before the fork()ing. Hopefully this should remove the need to preallocate a process.

We should also look at what memory ends up getting allocated and copy-on-write'd once an app has started. I bet that through use of memory pools and smarter data structures we can enable much more memory to get shared between the nuwa processes.

Both of these things should help a lot with memory usage. I think this would make a particularly big difference on the 128MB device. But lets do this in a different bug.
(In reply to Jonas Sicking (:sicking) needinfo? or feedback? encouraged. from comment #46)
> We should spend time on making starting applications without having a
> preallocated process faster. I.e. we should analyze what code is running
> during the preallocation and try to move that to happen before the
> fork()ing. Hopefully this should remove the need to preallocate a process.

I rather to promote the priority of the preallocated process to avoid being killed, rather than remove it.  Since the preallocated process shares most pages with the Nuwa process.  It's overhead is really light.  (It is 4.5MB for now, we are studying why 4.5MB, it should be far smaller.)

> 
> We should also look at what memory ends up getting allocated and
> copy-on-write'd once an app has started. I bet that through use of memory
> pools and smarter data structures we can enable much more memory to get
> shared between the nuwa processes.

I had told with Ting-Yuan about analysis changes of values of data section and move unchanged data to a separated section to share more pages.

For heap, just like you said, we need to analysis data structures.
Jonas, relevant bugs for what you are proposing are bug 941466 and bug 948648.
I just open a new bug 952393.  It would work on data sections.
Comment on attachment 8349325 [details] [diff] [review]
Part 1: Make nsFrameLoader always use Nuwa-spawned processes for remote iframes

Review of attachment 8349325 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +581,5 @@
>          if (!p) {
> +            if (Preferences::GetBool("dom.ipc.processPrelaunch.enabled",
> +                                     false)) {
> +                return nullptr;
> +            }

This needs a comment.  Should it be an assertion?
Attachment #8349325 - Flags: review?(khuey) → review+
Comment on attachment 8349325 [details] [diff] [review]
Part 1: Make nsFrameLoader always use Nuwa-spawned processes for remote iframes


>+    nsSubDocumentFrame* subdocFrame = do_QueryFrame(frame);
>+    if (!subdocFrame) {
>+      return NS_OK;
>+    }
>+    nsIFrame* subdocRootFrame = subdocFrame->GetSubdocumentRootFrame();
>+    if (!subdocRootFrame) {
>+      return NS_OK;
>+    }
>+    nsIFrame* subdocRootScrollFrame = subdocRootFrame->PresContext()->
>+      PresShell()->GetRootScrollFrame();
>+    if (subdocRootScrollFrame) {
>+      frame->PresContext()->PresShell()->
>+        FrameNeedsReflow(frame, nsIPresShell::eResize, NS_FRAME_IS_DIRTY);
>+    }
Why do we need this subdoc stuff here? The layout for the subdocument lives in the child process.
Comment on attachment 8349325 [details] [diff] [review]
Part 1: Make nsFrameLoader always use Nuwa-spawned processes for remote iframes

I would like to get an answer to smaug's question before we land this.
Attachment #8349325 - Flags: review?(bugs)
Flags: needinfo?(cyu)
(In reply to Olli Pettay [:smaug] from comment #51)
> Why do we need this subdoc stuff here? The layout for the subdocument lives
> in the child process.
You're right. We don't need this part of code here. I will update the patch.
Flags: needinfo?(cyu)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #50)
> Comment on attachment 8349325 [details] [diff] [review]
> Part 1: Make nsFrameLoader always use Nuwa-spawned processes for remote
> iframes
> 
> Review of attachment 8349325 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/ipc/ContentParent.cpp
> @@ +581,5 @@
> >          if (!p) {
> > +            if (Preferences::GetBool("dom.ipc.processPrelaunch.enabled",
> > +                                     false)) {
> > +                return nullptr;
> > +            }
> 
> This needs a comment.  Should it be an assertion?

This is not an assertion. If this pref value is false, then we will not prelaunch any app process and every app will fork from the b2g process.
I will add a comment here.
Changes:
* Remove unnecessary reflow.
* #ifdef the change in ContentParent::CreateBrowserOrApp()
Attachment #8349325 - Attachment is obsolete: true
Attachment #8349325 - Flags: review?(bugs)
Attachment #8351172 - Flags: review?(bugs)
(In reply to Cervantes Yu from comment #54)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #50)
> > Comment on attachment 8349325 [details] [diff] [review]
> > Part 1: Make nsFrameLoader always use Nuwa-spawned processes for remote
> > iframes
> > 
> > Review of attachment 8349325 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: dom/ipc/ContentParent.cpp
> > @@ +581,5 @@
> > >          if (!p) {
> > > +            if (Preferences::GetBool("dom.ipc.processPrelaunch.enabled",
> > > +                                     false)) {
> > > +                return nullptr;
> > > +            }
> > 
> > This needs a comment.  Should it be an assertion?
> 
Actually this needs to be #ifdef'd, or homescreen won't launch because it fails to get ContentParent synchronously.
Attachment #8351172 - Flags: review?(bugs) → review+
Attachment #8337280 - Attachment is obsolete: true
Attachment #8339903 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/86b5c561608c
https://hg.mozilla.org/mozilla-central/rev/340153048276
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
status-b2g-v1.3T fixed. remove [tarako] whiteboard
Whiteboard: [tarako]
Depends on: 1033618
You need to log in before you can comment on or make changes to this bug.