Closed Bug 977026 Opened 10 years ago Closed 10 years ago

Fork b2g and Nuwa processes from the same process

Categories

(Firefox OS Graveyard :: Performance, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(tracking-b2g:backlog)

RESOLVED FIXED
2.1 S1 (1aug)
tracking-b2g backlog

People

(Reporter: sinker, Assigned: sinker)

References

Details

(Keywords: memory-footprint, perf, Whiteboard: [c=memory p= s=2014.08.01.t u=tarako] [partner-blocker][MemShrink:P1])

Attachments

(12 files, 26 obsolete files)

1.92 KB, patch
Details | Diff | Splinter Review
1.30 KB, patch
Details | Diff | Splinter Review
51.29 KB, patch
Details | Diff | Splinter Review
26.47 KB, patch
Details | Diff | Splinter Review
984 bytes, patch
Details | Diff | Splinter Review
6.23 KB, patch
Details | Diff | Splinter Review
22.20 KB, patch
Details | Diff | Splinter Review
1.82 KB, patch
sinker
: review+
Details | Diff | Splinter Review
28.65 KB, patch
sinker
: review+
Details | Diff | Splinter Review
1.64 KB, patch
sinker
: review-
Details | Diff | Splinter Review
50.69 KB, patch
sinker
: review+
Details | Diff | Splinter Review
977.75 KB, application/zip
Details
The size of relocation of libxul.so is more than 1.7mb, it could be shared among Nuwa, content, and b2g processes, but not at all for now.  The problem is b2g process and Nuwa process are not the same binary, they don't share the relocation data of libxul.so.

The idea:
 - create a main binary
   - link b2g as a shared object (b2g.so)
   - link mozplugin-container as a shared object (mozplugin-container.so)
 - main binary calls dlopen() to load libxul.so
   - initialize libxul(?)
 - main binary fork into parent and child process
   - the parent load b2g.so and call its main function.
   - the child load mozplugin-container.so and call its main function.

It could save at least 1.7mb(relo)+2mb(xpti).

This solution may introduce some security concerns.
Assignee: nobody → tlee
Attached patch WIP - loader of b2g and nuwa (obsolete) — Splinter Review
This is a loader of b2g and nuwa.  It loads libxul before forking for b2g and Nuwa process.  With this patch, 1.9MB of libxul of the b2g process become shared dirty from private dirty.  The xptinfo is a potential improvement for another 2MB.
Attachment #8391347 - Flags: feedback?(khuey)
Attachment #8391347 - Flags: feedback?(cyu)
Attachment #8391347 - Flags: feedback?(bent.mozilla)
For enabling the patch 8391347, plugin-container is no more, be linked as a part of libxul.so.  A new protocol PProcLoader is ran between b2g and the process for Nuwa.  With PProcLoader, b2g passes args, env, and socket pairs to the process for Nuwa.  These parameters are passed to the old main function of plugin-container to mimic the behavior of exec plugin-container.  After calling old main function of plugin-container, everything is just like what it used to be.
Assignee: tlee → nobody
Component: General → Performance
Attached patch WIP - loader of b2g and nuwa, v2 (obsolete) — Splinter Review
Attachment #8391347 - Attachment is obsolete: true
Attachment #8391347 - Flags: feedback?(khuey)
Attachment #8391347 - Flags: feedback?(cyu)
Attachment #8391347 - Flags: feedback?(bent.mozilla)
Attachment #8391860 - Flags: feedback?(khuey)
Attachment #8391860 - Flags: feedback?(cyu)
Attachment #8391860 - Flags: feedback?(bent.mozilla)
Attached patch WIP - loader of b2g and nuwa, v3 (obsolete) — Splinter Review
Add files being missed.
Attachment #8391860 - Attachment is obsolete: true
Attachment #8391860 - Flags: feedback?(khuey)
Attachment #8391860 - Flags: feedback?(cyu)
Attachment #8391860 - Flags: feedback?(bent.mozilla)
Attachment #8391863 - Flags: feedback?(khuey)
Attachment #8391863 - Flags: feedback?(cyu)
Attachment #8391863 - Flags: feedback?(bent.mozilla)
B2GLoader.cpp is now where the main function of b2g is.  It forks processes and loading old main functions of nsBrowserApp.cpp and MozillaRuntimeMain.cpp (plugin-container).
Blocks: 957509
blocking-b2g: --- → 1.3T?
(In reply to Thinker Li [:sinker] from comment #0)
> The size of relocation of libxul.so is more than 1.7mb, it could be shared
> among Nuwa, content, and b2g processes, but not at all for now.  The problem
> is b2g process and Nuwa process are not the same binary, they don't share
> the relocation data of libxul.so.
> 
> The idea:
>  - create a main binary
>    - link b2g as a shared object (b2g.so)
>    - link mozplugin-container as a shared object (mozplugin-container.so)
>  - main binary calls dlopen() to load libxul.so
>    - initialize libxul(?)
>  - main binary fork into parent and child process
>    - the parent load b2g.so and call its main function.
>    - the child load mozplugin-container.so and call its main function.
> 
> It could save at least 1.7mb(relo)+2mb(xpti).
> 
> This solution may introduce some security concerns.
Do you have a security concern in mind beyond essentially removing ASLR?
That looks very risky so late, so I'm unblocking for now. Re-nominate once we have more confidence.
Assignee: nobody → tlee
blocking-b2g: 1.3T? → ---
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #7)
> Do you have a security concern in mind beyond essentially removing ASLR?

This bug just load and initialize shared objects and shared static data, and fork into the b2g and the Nuwa processes.  No additional information would be passed from the b2g to the Nuwa by this change except ASLR, arguments and environment variables.

Do we need to raise a security review?
Nuwa already makes all content processes share the memory map, but there is still ASLR between chrome and all content processes. This will make all process having the same memory maps. I think we'd better raise this issue to team and let them make risk assessments.
Keywords: sec-want
Flags: sec-review?(ptheriault)
Keywords: sec-want
Comment on attachment 8391863 [details] [diff] [review]
WIP - loader of b2g and nuwa, v3

Review of attachment 8391863 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling for now until we get sec feedback. Happy to look then!
Attachment #8391863 - Flags: feedback?(bent.mozilla)
Sorry I only just saw this. Is this still a priority for review?
jld/kang - would you be able to take a look at this for me ?
Flags: needinfo?(jld)
Flags: needinfo?(gdestuynder)
Some high-level thoughts (I haven't looked at the patches yet):

The address layout would still be randomized when the b2g process starts, so it changes on reboot, and at any given time each phone is potentially different — *if* there's enough entropy available at that point in boot; I don't know if we've ever investigated that.  (See also bug 811671 comment #129.)

I suppose this also means that a compromised content process trying to attack the parent would have information on the parent's address space layout (load base of libxul/libc, etc.) instead of having to try to discover it some other way, and there are hypothetical scenarios where that could be useful, but that seems to me to be a relatively small benefit compared to saving 4 MB of memory.
Flags: needinfo?(jld)
(In reply to Paul Theriault [:pauljt] from comment #12)
> Sorry I only just saw this. Is this still a priority for review?

I think so.  I aim on landing this on 1.3T if possible.
Thinker, can you rebase on 1.3t ?
I usually really discourage risk this late but 4MB could help us quite a bit in corner cases. Can we get a rebase and some testing with this?
blocking-b2g: --- → 1.3T?
rebase to m-c.  1.3t be later.
Attachment #8391863 - Attachment is obsolete: true
Attachment #8391863 - Flags: feedback?(khuey)
Attachment #8391863 - Flags: feedback?(cyu)
Attached patch part 2: v4 rebase to v1.3t (obsolete) — Splinter Review
Rebased to v1.3t, but ran into SIGSEGV:

Program received signal SIGSEGV, Segmentation fault.
mozilla::ipc::MessageChannel::Close (this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/glue/MessageChannel.cpp:1615
1615	        MonitorAutoLock lock(*mMonitor);
(gdb) bt
#0  mozilla::ipc::MessageChannel::Close (this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/glue/MessageChannel.cpp:1615
#1  0x40c3cd74 in mozilla::dom::PContentParent::Close (
    this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/objdir-gecko/ipc/ipdl/PContentParent.cpp:211
#2  0x40c11b48 in _ProcLoaderParentDestroy (aLoader=0x30)
    at /home/ting/w/fx/os/tarako/gecko/ipc/glue/ProcessUtils_linux.cpp:79
#3  0x40c1161a in DispatchToFunction<void (*)(mozilla::ipc::ProcLoaderChild*), mozilla::ipc::ProcLoaderChild*> (this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/tuple.h:449
#4  RunnableFunction<void (*)(mozilla::ipc::ProcLoaderChild*), Tuple1<mozilla::ipc::ProcLoaderChild*> >::Run (this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/task.h:415
#5  0x40c07364 in MessageLoop::RunTask (this=0x403460c0, task=0x442d60a0)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:340
#6  0x40c080ce in MessageLoop::DeferOrRunPendingTask (this=0x30, 
    pending_task=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:348
#7  0x40c08c8c in MessageLoop::DoWork (this=0x403460c0)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:448
---Type <return> to continue, or q <return> to quit---
#8  0x40c11238 in mozilla::ipc::DoWorkRunnable::Run (
    this=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/ipc/glue/MessagePump.cpp:45
#9  0x40ae574c in nsThread::ProcessNextEvent (this=0x4031e8e0, 
    mayWait=<value optimized out>, result=0xbe97779f)
    at /home/ting/w/fx/os/tarako/gecko/xpcom/threads/nsThread.cpp:612
#10 0x40ab84e0 in NS_ProcessNextEvent (thread=0x30, mayWait=false)
    at /home/ting/w/fx/os/tarako/gecko/xpcom/glue/nsThreadUtils.cpp:263
#11 0x40c11380 in mozilla::ipc::MessagePump::Run (this=0x40302d30, 
    aDelegate=0x403460c0)
    at /home/ting/w/fx/os/tarako/gecko/ipc/glue/MessagePump.cpp:85
#12 0x40c07328 in MessageLoop::RunInternal (this=0x1000000)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:222
#13 0x40c073a6 in MessageLoop::RunHandler (this=0x403460c0)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:215
#14 MessageLoop::Run (this=0x403460c0)
    at /home/ting/w/fx/os/tarako/gecko/ipc/chromium/src/base/message_loop.cc:189
#15 0x40fcc874 in nsBaseAppShell::Run (this=0x433a3460)
    at /home/ting/w/fx/os/tarako/gecko/widget/xpwidgets/nsBaseAppShell.cpp:161
#16 0x416006e0 in nsAppStartup::Run (this=0x43901160)
---Type <return> to continue, or q <return> to quit---
    at /home/ting/w/fx/os/tarako/gecko/toolkit/components/startup/nsAppStartup.cpp:276
#17 0x415d81f4 in XREMain::XRE_mainRun (this=0xbe9779d4)
    at /home/ting/w/fx/os/tarako/gecko/toolkit/xre/nsAppRunner.cpp:4067
#18 0x415dabca in XREMain::XRE_main (this=0xbe9779d4, 
    argc=<value optimized out>, argv=<value optimized out>, aAppData=0x22270)
    at /home/ting/w/fx/os/tarako/gecko/toolkit/xre/nsAppRunner.cpp:4135
#19 0x415dad34 in XRE_main (argc=1, argv=0x4031f170, aAppData=0x22270, 
    aFlags=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/toolkit/xre/nsAppRunner.cpp:4345
#20 0x00009b42 in do_main (argc=1, argv=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/b2g/app/nsBrowserApp.cpp:164
#21 b2g_main (argc=1, argv=<value optimized out>)
    at /home/ting/w/fx/os/tarako/gecko/b2g/app/nsBrowserApp.cpp:279
#22 0x000098a0 in forkParentChild (argc=1, argv=0xbe979bd4)
    at /home/ting/w/fx/os/tarako/gecko/b2g/app/B2GLoader.cpp:101
#23 main (argc=1, argv=0xbe979bd4)
    at /home/ting/w/fx/os/tarako/gecko/b2g/app/B2GLoader.cpp:118
(gdb)
It is proved that gecko profiler conflicts with b2g loader.  So, the patch uploaded by ting is work without profiler.  I would update the patch later.
Fix profiler issue.
Attached patch part 1 rebased to 1.3T (obsolete) — Splinter Review
apply part 1 and WIP for 1.3T.
Attachment #8411700 - Attachment is patch: true
Attachment #8411700 - Attachment mime type: text/x-patch → text/plain
Attachment #8411562 - Attachment description: WIP - loader of b2g and nuwa, v4 → part 2: loader of b2g and nuwa, v4
Attachment #8411700 - Attachment description: WIPv4 rebase to v1.3t → part 2: v4 rebase to v1.3t
how about profiler_enabled instead of is_init?
I had used *_enabled, but *_enabled seems confusing with build-time enabling profiler.  I am not sure if it is a good name, and I don't insist *_is_init if people think profiler_enabled is not confusing.
profiler_started then maybe?
Drop USS of b2g process to 46.xMB from 48.xMB after screen-off of the second time of booting.

STR:
1. flash.sh gecko
2. wait lockscreen
3. stop and start b2g
4. wait locakscreen
5. wait screen-off
6. b2g-procrank
Last comment is for running on unagi.
Some known issues:
 - b2g-info is broken,
 - b2g-procrank is broken sometime.
(In reply to Thinker Li [:sinker] from comment #29)
> Some known issues:
>  - b2g-info is broken,
>  - b2g-procrank is broken sometime.

yep, and get_about_memory.py needs to be updated too.
triage: 1.3T+ for 3MB memory saving for tarako
blocking-b2g: 1.3T? → 1.3T+
Attached patch part 3 for v1.3t (obsolete) — Splinter Review
Attached patch part 2 v4 rebase to v1.3t (obsolete) — Splinter Review
Add code fixing the segmentation fault of comment 19.
Attachment #8411700 - Attachment is obsolete: true
Note: Kyle is on PTO until 5/9. I suggest getting reviews from bsmedberg in addition to Cervantes.
Update b2g-info to find b2g main and child processes correctly, as the exe is now always "/system/b2g/b2g".
Attachment #8414194 - Attachment description: part4: corresponding update for b2g-info → part 4: Corresponding update for b2g-info (v1.3t)
Rebase part 4 to master.
Update get_about_memory.py to find b2g main and child processes correctly.
Comment on attachment 8414194 [details] [diff] [review]
part 4: Corresponding update for b2g-info (v1.3t)

Obsolete since there's no differences for the modified files between v1.3t and master.
Attachment #8414194 - Attachment is obsolete: true
Attachment #8414203 - Attachment description: part 4: For master → part 4: Corresponding update for b2g-info (master)
Attachment #8411562 - Attachment is obsolete: true
Attachment #8414268 - Flags: review?(cyu)
Attachment #8414268 - Flags: review?(benjamin)
Attachment #8412841 - Flags: review?(bgirard)
Attachment #8413412 - Flags: review?(benjamin)
This is a work-around in order to make b2g-info does work properly after applied WIP from this bug.
Comment on attachment 8414289 [details] [diff] [review]
Fix-b2g-info-error-for-Bug-977026.patch

I didn't see Ting-Yu's updated this in part 4.
Attachment #8414289 - Attachment is obsolete: true
Use name() to determine it is a main or child process.
Attachment #8414203 - Attachment is obsolete: true
Comment on attachment 8413412 [details] [diff] [review]
part 3: preload XPT before calling fork.

This needs significant documentation, but I'm also skeptical of some things here:

It appears that XPCOM is not started when XRE_ProcLoaderPreload is being called. There is a bunch of code which is then run which currently assumes it is running in an XPCOM environment: dirservice, omnijar, and the manifest-parsing bits. This introduces considerable technical debt and confusion.

Long-term what we should do is either:

1) compile the XPTs into libxul.so, bug 685241.
2) keep the XPTs in omnijar but map the data directly into the xptinfo system (no byte-swapping or allocation) so that the majority of the data is automatically shared.

In any case, I'd really like to avoid the hacks around the dirservice especially. If we going to do something crazy like this we probably ought to explicitly initialize the omnijar path and pass it around and not touch the dirservice at all.
Attachment #8413412 - Flags: review?(benjamin) → review-
Comment on attachment 8414268 [details] [diff] [review]
part 2: loader of b2g and nuwa, v5

I can't review this patch as it stands:

* Please add a detailed commit comment explaining what the change actually does, and how it affects other build configs (desktop, b2g-desktop, etc)

* Please use standard coding style: especially common are patterns such as

+static int
+loadSharedData(const char *aProgram) {

The opening brace of a function goes in column 0.

* Many functions needs doc-comments explaining their purpose. Also in this particular patch, the function docs should explain whether the function expects to be called with XPCOM initialized or not.

Other review comments:

Defining MOZ_B2G_LOADER only in certain moz.build files instead of globally is unusual. Why are you doing that? If it's important to be that way, please document it. Otherwise make it a global AC_DEFINE.
Attachment #8414268 - Flags: review?(benjamin) → review-
Comment on attachment 8412841 [details] [diff] [review]
part 1: check init before registering thread to the profiler

Review of attachment 8412841 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/GeckoProfiler.h
@@ +100,5 @@
>  // also save a shutdown profile if requested. No profiler calls should happen
>  // after this point and all pseudo labels should have been popped.
>  static inline void profiler_shutdown() {};
>  
> +// Return if the profiler have been initialized.

I might be nit picking but I think this really makes the profiler API dirty. I'd rather keep it clean. You're using this to conditionally call thread_register/thread_unregister.

Instead we can count the number of register/unregister calls in their implementation and only perform the steps on the first/last call.
Attachment #8412841 - Flags: review?(bgirard) → review-
we need this patch, please update new patch.
Flags: needinfo?(tlee)
Change as what BenWa said.

I don't really like to hide potential errors in this way.  Without explicit flag/condition to check if the call flow is right make me nervous.  With this change, it will not detect the error of being called unexpected.
Attachment #8412841 - Attachment is obsolete: true
Attachment #8417149 - Flags: review?(bgirard)
Flags: needinfo?(tlee)
Comment on attachment 8417149 [details] [diff] [review]
part 1: allow thread registering before the profiler is initialized

Review of attachment 8417149 [details] [diff] [review]:
-----------------------------------------------------------------

This wasn't what I suggested as we're not counting anything here.

First can you explain why Nuwa would make register_thread prone to be called more then once? We don't have this logic in the normal flow of gecko and I don't see how Nuwa would change that.

Anyways, you should change mozilla_sampler_register_thread to first check with Sampler if the thread has been registered already in a new int field of ThreadInfo. If it has then keep track in that threads' own ThreadInfo how many times it has been registered.

On unregister check if that thread is still registered then decrement the count. Only when that count goes to zero should you delete the pseudostack and fully unregister the thread.

This is like the profiler' own global init/deinit works with sInitCount but we need to track this per thread.
Attachment #8417149 - Flags: review?(bgirard) → review-
Whiteboard: [partner-blocker]
@Benoit, I think we have misunderstanding here.  Threads are not registered for more than once.  This bug would run an IPC protocol between parent and child before XPCOM is initialized.  That means IO thread would be created before XPCOM is ready.  But, the IO thread and event loop would be shutdown before initializing XPCOM.  So, all I need is to disable profiler before the initialization of XPCOM since the profiler is called unconditional.  Is that make sense?
Change according the suggestions of comment 44.
Attachment #8414268 - Attachment is obsolete: true
Attachment #8414268 - Flags: review?(cyu)
Attachment #8417207 - Flags: feedback?(benjamin)
Pass the path of monijar from the caller of PreloadXPT(), and avoid to call any component during parsing manifest.

Passing path to monijar or dirservice are the same for me.  Both them are some kind of hacking, dirty or not.  In this patch, it creates nsFileLocation by itself, to skip both Omnijar and dirservice.
Attachment #8413412 - Attachment is obsolete: true
Attachment #8417340 - Flags: feedback?(benjamin)
Please land this patch on v1.3t before 5.10.
Severity: normal → blocker
Status: NEW → ASSIGNED
Keywords: footprint, perf
Priority: -- → P1
Whiteboard: [partner-blocker] → [c=memory p= s= u=tarako] [partner-blocker]
(same as jed comments, clearing flag)
Flags: needinfo?(gdestuynder)
(In reply to Jed Davis [:jld] from comment #14)
> Some high-level thoughts (I haven't looked at the patches yet):
> 
> The address layout would still be randomized when the b2g process starts, so
> it changes on reboot, and at any given time each phone is potentially
> different — *if* there's enough entropy available at that point in boot; I
> don't know if we've ever investigated that.  (See also bug 811671 comment
> #129.)

Thanks Jed - I'll investigate the entropy on restart question further but that's separate to this bug I think for now. I don't have any others concerns beyond those Jed discussed.
Flags: sec-review?(ptheriault) → sec-review+
Comment on attachment 8417207 [details] [diff] [review]
part 2: loader of b2g and nuwa, v6

This is not a full review: somebody from b2g should also review this.

>This patch makes a new main function to replace original main
>functions of b2g and plugin-container, they are renamed to b2g_main()
>and content_process_main() respectively.  The new main function, in
>B2GLoader.cpp, would load libxul and some static data, then fork into
>two processes to run original main functions, the parent for b2g and
>the child for plugin-container.  Then, relocation table of libxul and
>static data could be shared among b2g, Nuwa, and content processes.

I think I'm missing something here. Does plugin-container (the binary) still exist (in B2G)? If so, why?

Does it still exist in desktop builds?

>diff --git a/b2g/app/B2GLoader.cpp b/b2g/app/B2GLoader.cpp

>+#define ASSERT(x)  MOZ_ASSERT(x)

Why?

>+static int
>+getDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)
>+{
>+  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
>+  int cpsz = lastSlash - aPath + 1; // include slash
>+  if (aMaxLen <= cpsz)
>+    return 255;
>+  strncpy(aOutDir, aPath, cpsz);
>+  aOutDir[cpsz] = 0;
>+  return 0;

The purpose of this function is to copy all but the last element of a path to another buffer? Please document it. Also the "int" return value is unusual. We should use either `bool` or `nsresult`. Here and below please make it a bool (true=success). The only thing which returns an int success=0 is the main argc/argv functions.

>+static int
>+getXPCOMPath(const char *aProgram, char *aOutPath, int aMaxLen)

bool result here also.

>+{
>+  nsAutoArrayPtr<char> progBuf;
>+  progBuf = new char[aMaxLen];
>+  nsresult rv = mozilla::BinaryPath::Get(aProgram, progBuf);

MAXPATHLEN is hardcoded in BinaryPath::Get. Why don't we either consistently use MAXPATHLEN? If we need to pass around a dynamic value, we need to assert that it's at least MAXPATHLEN or we risk overflow of this buffer.

>+  int dirSize = strlen(aOutPath);
>+  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
>+    return 255;
>+  char *afterSlash = aOutPath + dirSize;
>+  strcpy(afterSlash, XPCOM_DLL);
>+  return 0;

This almost makes me feel like we should be using std::string. Or at least strncat.

I don't think it's necessary or desirable to declare b2g_main as "extern"... this is C++, just use a normal declaration.

>diff --git a/ipc/app/Makefile.in b/ipc/app/Makefile.in

>+include $(DEPTH)/config/autoconf.mk

This is weird and shouldn't be necessary: autoconf.mk is auto-included by config.mk.

>+
>+ifneq ($(MOZ_WIDGET_TOOLKIT),android)
>+LIBS += ../contentproc/$(LIB_PREFIX)plugin-container.$(LIB_SUFFIX)
>+endif
>+

>diff --git a/ipc/contentproc/moz.build b/ipc/contentproc/moz.build

>+LOCAL_INCLUDES += [
>+    '/toolkit/xre',
>+    '/xpcom/base',
>+]

Why are these local includes necessary?

>diff --git a/ipc/contentproc/plugin-container.cpp b/ipc/contentproc/plugin-container.cpp


Wasn't most of this code copied/moved from another location? Can we use `hg move` to preserve blame and make the patch easier to read? Or if plugin-container (the binary) doesn't need to exist any more, can this code just live in it's original location in ipc/app ?

>diff --git a/toolkit/xre/nsAppRunner.cpp b/toolkit/xre/nsAppRunner.cpp

> nsresult
> XREMain::XRE_mainRun()
> {
>   nsresult rv = NS_OK;
>   NS_ASSERTION(mScopedXPCom, "Scoped xpcom not initialized.");
> 
>+#ifdef MOZ_B2G_LOADER
>+  mozilla::ipc::ProcLoaderClientGeckoInit();

In which processes is this supposed to be called? This ends up using XRE_GetIOMessageLoop which I thought was for children only.

>diff --git a/xpcom/threads/BackgroundHangMonitor.h b/xpcom/threads/BackgroundHangMonitor.h
>--- a/xpcom/threads/BackgroundHangMonitor.h
>+++ b/xpcom/threads/BackgroundHangMonitor.h
>@@ -101,16 +101,18 @@ class BackgroundHangThread;
>  *        hangMonitor.NotifyWait();
>  *        wait_for_next_event();
>  *      }
>  *    } else {
>  *      process_nonsync_event();
>  *    }
>  *  }
>  *
>+ * Prohibit() and Allow() allow the background hang monitor working
>+ * safely before Startup().

This comment is insufficient. Why would this patch affect the background hang monitor? Who is expected to call Prohibit, and at what point during startup or shutdown? Perhaps it would make more sense to add a flag on the specific event loop which is triggering the hang monitor at the wrong time?
Attachment #8417207 - Flags: feedback?(benjamin) → feedback+
Benjamin,

Are you also planning to give feedback on attachment #8417340 [details] [diff] [review]? We're working to finalize Tarako changes this week.

Thanks,
Mike
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #55)
> Comment on attachment 8417207 [details] [diff] [review]
> part 2: loader of b2g and nuwa, v6

> I think I'm missing something here. Does plugin-container (the binary) still
> exist (in B2G)? If so, why?

It is still there as a fallback if Nuwa process is dead for some reason.

> 
> Does it still exist in desktop builds?

Yes!  I don't enable the loader for desktop builds.

> 
> This almost makes me feel like we should be using std::string. Or at least
> strncat.

As I know, strncat is not so viable for a lot of platforms.  I would move to std::string.

> >diff --git a/ipc/contentproc/plugin-container.cpp b/ipc/contentproc/plugin-container.cpp
> 
> 
> Wasn't most of this code copied/moved from another location? Can we use `hg
> move` to preserve blame and make the patch easier to read? Or if
> plugin-container (the binary) doesn't need to exist any more, can this code
> just live in it's original location in ipc/app ?

The code is called by plugin-container and b2g loader.  I move it for making it as a library.

moz.build seems allow to build at most one library or program in a directory.  Is it possible to build a program and a library in the same directory at simultaneously?  II would left it at the place if possible.

> 
> >diff --git a/toolkit/xre/nsAppRunner.cpp b/toolkit/xre/nsAppRunner.cpp
> 
> In which processes is this supposed to be called? This ends up using
> XRE_GetIOMessageLoop which I thought was for children only.

It is called by the parent.  XRE_GetIOMessageLoop() is available for both ends of IPC.  For example, CotentParent also use it.

> 
> >diff --git a/xpcom/threads/BackgroundHangMonitor.h b/xpcom/threads/BackgroundHangMonitor.h
> >--- a/xpcom/threads/BackgroundHangMonitor.h
> >+++ b/xpcom/threads/BackgroundHangMonitor.h
> This comment is insufficient. Why would this patch affect the background
> hang monitor? Who is expected to call Prohibit, and at what point during
> startup or shutdown? Perhaps it would make more sense to add a flag on the
> specific event loop which is triggering the hang monitor at the wrong time?

It is called by the child process before XPCOM is inited.  The background hang monitor is more like a feature being mixed into threads or event loops.  It is no good to expose the feature on the interface of threads or event loops.  How do you think?
Change according comment 55.
Attachment #8417207 - Attachment is obsolete: true
Attachment #8418637 - Flags: feedback?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #55)
> >+  nsAutoArrayPtr<char> progBuf;
> >+  progBuf = new char[aMaxLen];
> >+  nsresult rv = mozilla::BinaryPath::Get(aProgram, progBuf);
> 
> MAXPATHLEN is hardcoded in BinaryPath::Get. Why don't we either consistently
> use MAXPATHLEN? If we need to pass around a dynamic value, we need to assert
> that it's at least MAXPATHLEN or we risk overflow of this buffer.

Since boundary checking is always necessary, there is no much benefit with MAXPATHLEN.  And, these function are static, it means compiler is likely to expand and inline it.  The overhead of passing size is really not a mater.

> 
> >+  int dirSize = strlen(aOutPath);
> >+  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
> >+    return 255;
> >+  char *afterSlash = aOutPath + dirSize;
> >+  strcpy(afterSlash, XPCOM_DLL);
> >+  return 0;
> 
> This almost makes me feel like we should be using std::string. Or at least
> strncat.

There are boundary checking, using std::string or strncat don't simplify the code.
> Since boundary checking is always necessary, there is no much benefit with
> MAXPATHLEN.  And, these function are static, it means compiler is likely to
> expand and inline it.  The overhead of passing size is really not a mater.

I don't understand this paragraph. BinaryPath::Get bounds-checks to MAXPATHLEN, even though the rest of this code boundary-checks to an arbitrary value. That mismatch needs to be corrected.

> 
> > 
> > >+  int dirSize = strlen(aOutPath);
> > >+  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
> > >+    return 255;
> > >+  char *afterSlash = aOutPath + dirSize;
> > >+  strcpy(afterSlash, XPCOM_DLL);
> > >+  return 0;
> > 
> > This almost makes me feel like we should be using std::string. Or at least
> > strncat.
> 
> There are boundary checking, using std::string or strncat don't simplify the
> code.

If the function returns a std::string instead of doing hand-munged buffer math, you don't need some of the bounds checks to begin with, and the code will be easier to understand.
Flags: needinfo?(benjamin)
Comment on attachment 8417340 [details] [diff] [review]
part 3: preload XPT before calling fork, v2

>diff --git a/toolkit/xre/moz.build b/toolkit/xre/moz.build

>+    GENERATED_INCLUDES += [
>+        '/xpcom',
>+    ]

This doesn't make much sense to me. Why is this necessary?

>diff --git a/toolkit/xre/nsEmbedFunctions.cpp b/toolkit/xre/nsEmbedFunctions.cpp

>+#ifdef MOZ_B2G_LOADER
>+/**
>+ * Preload static data of Gecko for B2G loader.

Why is the implementation of this function here, instead of in xpcom/components? That would not only fix some of the LOCAL_INCLUDES mess, but it's also more logical since this entirely a function of the component manager.

Is failure of this method fatal, or will we simply load XPTs later?

>+void
>+XRE_ProcLoaderPreload(const char *aProgramDir) {

style, opening brace in column 0
style, snuggle the * with the type, so `const char* aProgramDir`

>+    nsresult rv;
>+    nsRefPtr<nsLocalFile> omnijarFile = new nsLocalFile;
>+    rv = omnijarFile->InitWithNativePath(nsCString(aProgramDir));

Use NS_NewNativeLocalFile here.

>diff --git a/xpcom/components/ManifestParser.cpp b/xpcom/components/ManifestParser.cpp

> static const ManifestDirective kParsingTable[] = {
>   { "manifest",         1, false, true, true, false,
>-    &nsComponentManagerImpl::ManifestManifest, nullptr },
>+    &nsComponentManagerImpl::ManifestManifest, nullptr,
>+    &XPTInterfaceInfoManager::ManifestManifest },

Does this require additional ifdefs?

Is this preload method only called for content processes? It appears to only preload from the GRE omni.ja, not the app omni.ja, and it doesn't load from flat files at all.

Also we're talking about having manifest directives that limit directives to the chrome process or the content process. Will this preloading affect that?

>+/**
>+ * This function could be called, as a noop, if the component manager
>+ * is not ready, it also means to be called before XPCOM be
>+ * initialized.

Reword: "If we are pre-loading XPTs, this method may do nothing because the console service is not initialized."

> void LogMessage(const char* aMsg, ...)
> {
>+  if (nsComponentManagerImpl::gComponentManager == nullptr) {

if (!nsComponentManagerImpl::gComponentManager)

>+    // Avoid to create any component before nsComponentManagerImpl is
>+    // initialized.

This comment is unnecessary.

>-  nsCOMPtr<nsIXULAppInfo> xapp (do_GetService(XULAPPINFO_SERVICE_CONTRACTID));
>+  nsCOMPtr<nsIXULAppInfo> xapp;
>+  if (!aXPTOnly) {
>+    // Avoid to create any component for XPT only mode.
>+    // No xapp means no ID, version, ..., modifiers checking.
>+    xapp = do_GetService(XULAPPINFO_SERVICE_CONTRACTID);
>+  }

This means that it isn't safe for the "manifest" or "interfaces" directives in the GRE omni.ja to ever have modifiers. Do we ever assert or log that this is the case?

"manifest", "interfaces"

>diff --git a/xpcom/components/nsComponentManager.cpp b/xpcom/components/nsComponentManager.cpp

>+typedef nsDataHashtable<nsCStringHashKey, bool> XPTIInfosBookType;
>+static XPTIInfosBookType *sXPTIInfosBook = nullptr;

This needs docs. It also seems as if we will be creating this hash even in desktop builds when it will never be used. Is the string hash key guaranteed to be the same across runs? Is it safe to call the GetURIString method when XPCOM is not started?


>+#ifdef MOZ_B2G_LOADER
>+
>+/* static */
>+void
>+XPTInterfaceInfoManager::ManifestManifest(ManifestProcessingContext &aCx,
>+                                          int aLineno,
>+                                          char * const * aArgv)
>+{
>+    char* file = aArgv[0];
>+    FileLocation f(aCx.mFile, file);
>+
>+    DoRegisterManifest(NS_COMPONENT_LOCATION, f, false, true);
>+}
>+
>+/* static */
>+void
>+XPTInterfaceInfoManager::ManifestXPT(ManifestProcessingContext &aCx,
>+                                     int aLineno,
>+                                     char * const * aArgv)
>+{
>+    FileLocation f(aCx.mFile, aArgv[0]);
>+    DoRegisterXPT(f);
>+}

If you're going to put these method implementations in this file, they should be members of nsComponentManager, not XPTIInterfaceInfoManager. Give them a unique name like "nsComponentManagerImpl::ManifestXPTPreload" or something.
Attachment #8417340 - Flags: feedback?(benjamin)
It should be possible to avoid moving MozillaRuntimeMain.cpp using this technique:

<bsmedberg> glandium: is it possible in today's moz.build world to have .cpp files in a single directory end up in different libraries/final link targets?
<glandium> bsmedberg: yes and no. No, not directly, but yes, you can if you refer it from several directories
<glandium> bsmedberg: that is, you can do SOURCES += ['otherdir/foo.c']
<bsmedberg> glandium: ah ok, is ".." allowed in that case?
<glandium> bsmedberg: yes
Comment on attachment 8418637 [details] [diff] [review]
part 2: loader of b2g and nuwa, v7

I don't think I need to provide more feedback on this, do I?
Attachment #8418637 - Flags: feedback?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #61)
> Comment on attachment 8417340 [details] [diff] [review]
> part 3: preload XPT before calling fork, v2
> 
> 
> This means that it isn't safe for the "manifest" or "interfaces" directives
> in the GRE omni.ja to ever have modifiers. Do we ever assert or log that
> this is the case?
> 
> "manifest", "interfaces"

Maybe we should skip all directives with flags/modifiers.  Then, it would fallback to normal load.

> 
> >diff --git a/xpcom/components/nsComponentManager.cpp b/xpcom/components/nsComponentManager.cpp
> 
> >+typedef nsDataHashtable<nsCStringHashKey, bool> XPTIInfosBookType;
> >+static XPTIInfosBookType *sXPTIInfosBook = nullptr;
> 
> This needs docs. It also seems as if we will be creating this hash even in
> desktop builds when it will never be used. Is the string hash key guaranteed
> to be the same across runs? Is it safe to call the GetURIString method when
> XPCOM is not started?

I think it is.  After checking the code base, GetURIString() works even without XPCOM.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #63)
> Comment on attachment 8418637 [details] [diff] [review]
> part 2: loader of b2g and nuwa, v7
> 
> I don't think I need to provide more feedback on this, do I?

So, would you review it?  Or I should find someone!?
Change for comment 61.
Attachment #8417340 - Attachment is obsolete: true
Comment on attachment 8417149 [details] [diff] [review]
part 1: allow thread registering before the profiler is initialized

Review of attachment 8417149 [details] [diff] [review]:
-----------------------------------------------------------------

Actually I misunderstood the problem. This is fine.
Attachment #8417149 - Flags: review- → review+
Attached patch part 1 for 1.3TSplinter Review
Attachment #8412846 - Attachment is obsolete: true
Attached patch part 2 for 1.3TSplinter Review
Attachment #8413606 - Attachment is obsolete: true
Attached patch part 3 for 1.3TSplinter Review
Attachment #8413600 - Attachment is obsolete: true
We'll verify these patches by monkey test now.
Flags: needinfo?(fabrice)
This patch is very very useful for tarako, let's land it quickly.
But I think Fabrice need revert some lazy load module when this patch land. Because b2g should load all necessary module before b2g fork, exclude NFC.
Flags: needinfo?(ttsai)
Flags: needinfo?(tlee)
Flags: needinfo?(kkuo)
I think this bug don't affect Fabrice's changes.
Flags: needinfo?(tlee)
(In reply to Thinker Li [:sinker] from comment #74)
> I think this bug don't affect Fabrice's changes.

That's correct!
Flags: needinfo?(fabrice)
Can you paste the information here? At least I have no easy access to your bugzilla. Thanks!
Flags: needinfo?(james.zhang)
(In reply to James Zhang from comment #77)
> Please see http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=311228.
> 
> Is it this patch side effect?

Our RD has checked this issue, it's not caused by this patch, it's caused by bug 935750.
We found two prealloced process in b2g-ps.
Flags: needinfo?(james.zhang)
(In reply to James Zhang from comment #79)
> (In reply to James Zhang from comment #77)
> > Please see http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=311228.
> > 
> > Is it this patch side effect?
> 
> Our RD has checked this issue, it's not caused by this patch, it's caused by
> bug 935750.
> We found two prealloced process in b2g-ps.

Please ignore my comment 77 and comment 79, I have checked the build revision with this bug, it's before 5/7, and we have fixed bug 1005787 after 5/7.
Flags: needinfo?(kkuo)
Attachment #8418637 - Flags: review?(khuey)
Attachment #8418637 - Flags: review?(cyu)
We can't print console.log with this patch.
(In reply to James Zhang from comment #81)
> We can't print console.log with this patch.

Thiner, help.
Flags: needinfo?(tlee)
Flags: needinfo?(fabrice)
It seems like that __android_log_print() is called before calling ShuffleFileDescriptors(), so the fds used by logd are over-written by ShuffleFileDescriptors().  The solution is to avoid calling __android_log_print() before calling ShuffleFileDescriptors().
Flags: needinfo?(tlee)
Flags: needinfo?(renfeng.mei)
Flags: needinfo?(fabrice)
Thinker, have you verify your solution?  would please update your patch?
Flags: needinfo?(renfeng.mei)
Attached patch disable-android-log.diff (obsolete) — Splinter Review
Apply this patch after part 2 to fix the console.log issue.
When can we land all patch?
(In reply to Thinker Li [:sinker] from comment #85)
> Created attachment 8427503 [details] [diff] [review]
> disable-android-log.diff
> 
> Apply this patch after part 2 to fix the console.log issue.

Can you give us the patch base v1.3t?
FWIW I've almost finished reviewing.  Expect my comments tomorrow.
Comment on attachment 8418637 [details] [diff] [review]
part 2: loader of b2g and nuwa, v7

Review of attachment 8418637 [details] [diff] [review]:
-----------------------------------------------------------------

This patch look good to me. My comments are mostly on maintainability and styles. I don't see major problems herein but there are some issues to address so I give r+. We still need another reviewer's review to get this patch to land.

::: b2g/app/B2GLoader.cpp
@@ +34,5 @@
> +  { nullptr, nullptr }
> +};
> +
> +static bool
> +getDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)

nit: capital g for camalcase function names. Same to other functions below in this file.

@@ +37,5 @@
> +static bool
> +getDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)
> +{
> +  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
> +  int cpsz = lastSlash - aPath + 1; // include slash

NULL check lastSlash or assert it to be non-NULL.

@@ +39,5 @@
> +{
> +  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
> +  int cpsz = lastSlash - aPath + 1; // include slash
> +  if (aMaxLen <= cpsz)
> +    return true;

nit: curly braces even for single-line if statements.

@@ +62,5 @@
> +  }
> +
> +  int dirSize = strlen(aOutPath);
> +  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
> +    return true;

nit: curly braces even for single-line if statements.

@@ +124,5 @@
> +  r = fcntl(childSock, F_SETFL, O_NONBLOCK);
> +  MOZ_ASSERT(r != -1);
> +
> +  pid_t pid = fork();
> +  MOZ_ASSERT(pid >= 0);

Consider _exit()ing the process and let the init daemon retry if socketpair(), fork() or other system calls fail in this function.

::: b2g/app/nsBrowserApp.cpp
@@ +167,5 @@
> +#ifndef MOZ_B2G_LOADER
> +#define MAIN main
> +#else
> +/*
> + * B2G loader is the main function instead of the main() here if it is

I think what you mean here is that B2G loader has a new main() and will call the renamed version in this file.

@@ +272,5 @@
>  
>    int result;
>    {
>      ScopedLogging log;
> +    char **_argv;

It's unclear why duping argv here. Please document the rationale for doing this.

::: configure.in
@@ +8717,5 @@
>  fi
>  AC_SUBST(MOZ_NUWA_PROCESS)
> +if test "$MOZ_WIDGET_TOOLKIT" = gonk -a -n "$MOZ_B2G_LOADER"; then
> +    if test -z "$MOZ_NUWA_PROCESS"; then
> +       AC_MSG_ERROR([B2G loader is only work with Nuwa]);

nit: AC_MSG_ERROR([B2G loader only works with Nuwa]);

::: ipc/chromium/src/base/process_util_linux.cc
@@ +240,5 @@
> +static bool
> +IsLaunchingNuwa(const std::vector<std::string>& argv) {
> +  std::vector<std::string>>::const_iterator it;
> +  for (it = argv.begin(); it != argv.end(); ++it) {
> +    if (argv[i] == NS_LITERAL_STRING("-nuwa")) {

We already have 3 uses of the literal "-nuwa" including this one. Please refactor to reduce future maintenance effort.

@@ +257,5 @@
>                 bool wait, ProcessHandle* process_handle,
>                 ProcessArchitecture arch) {
> +#ifdef MOZ_B2G_LOADER
> +  static bool firstTimeLaunch = true;
> +  if (!wait && firstTimeLaunch) {

I don't think we need to check wait here. We always pass false to this function and never wait for the child process to die. Also, what should we do if b2g loader is enabled and wait is false? Given that I would suggest just removing this check.

::: ipc/glue/PProcLoader.ipdl
@@ +27,5 @@
> +parent:
> +  /**
> +   * The acknowledge of Load().
> +   */
> +  async LoadComplete(int32_t pid, int32_t cookie);

Can we make Load() sync and eliminate LoadComplete()? I don't see the necessity to make it async. After all the child process sends LoadComplete() message back to the parent process when it receives the Load() request.

::: ipc/glue/ProcessUtils.h
@@ +18,5 @@
>  // this directly.
>  void SetThisProcessName(const char *aName);
>  
> +#ifdef MOZ_B2G_LOADER
> +void ProcLoaderClientGeckoInit();

I don't think this to be a good name. There are already too many initialization things going int the code. "Gecko" in this name doesn't tell us much about why or what it does.

::: ipc/glue/ProcessUtils_linux.cpp
@@ +37,5 @@
> +
> +int content_process_main(int argc, char *argv[]);
> +
> +#endif /* MOZ_B2G_LOADER */
> +

The addition to this file already takes a large portion, and functions related to ProcLoader are self-contained enough to deserve a separate file. I would suggest moving these functions out of this file.

@@ +102,5 @@
> +class ProcLoaderParent : public PProcLoaderParent
> +{
> +private:
> +  nsAutoPtr<FileDescriptor> mFd;
> +

Please comment what this FD is for.

@@ +171,5 @@
> + * Initialize the client of B2G loader for Gecko.
> + */
> +void
> +ProcLoaderClientGeckoInit()
> +{

It's unclear how come this function is to initialize Gecko. What it does is connecting to the PProcLoader service running in the child side.

In addition to the documentation above,  please also comment or assert (if applicable) in which process the functions in this file run to make them clearer.

@@ +220,5 @@
> +               const char *aEnvp[],
> +               const file_handle_mapping_vector &aFdsRemap,
> +               const ChildPrivileges aPrivs,
> +               ProcessHandle *aProcessHandle)
> +{

This function sounds like doing the loading on itself, but actually requests the child process to load. Consider renaming this to RequestProcLoaderToLoad() or something like that.

@@ +270,5 @@
> +};
> +
> +
> +class ProcLoaderNoopRunner : public ProcLoaderRunnerBase {
> +public:

Why creating a concrete class for Noop?

@@ +307,5 @@
> +};
> +
> +void
> +ProcLoaderLoadRunner::ArrangeFds()
> +{

This function does FD shuffling as in base::LauchApp(), right? Please document this to make it clear. Also why not use ShuffleFileDescriptors()?

@@ +420,5 @@
> +}
> +
> +void
> +ProcLoaderChild::OnChannelError()
> +{

Should we also exit the process in this function?

@@ +436,5 @@
> +                                   NewRunnableFunction(_ProcLoaderChildDestroy,
> +                                                       this));
> +}
> +
> +/**

We need some comment here about what this function does, considering the importance of it.

::: xpcom/threads/BackgroundHangMonitor.h
@@ +224,5 @@
> +   * Allow() and Prohibit() should be called in pair.
> +   *
> +   * \see Prohibit()
> +   */
> +  static void Allow();

Prohibiting and allowing the background hang monitor is tricky. The document here doesn't make it clear that it's used to run a temporary message loop for the PProcLoader actor in the child process (if I read it correctly). Please document the intention here. Also, is there any way around to start a MessageLoop without the BackgroundHangMonitor?
Attachment #8418637 - Flags: review?(cyu) → review+
Comment on attachment 8418637 [details] [diff] [review]
part 2: loader of b2g and nuwa, v7

Review of attachment 8418637 [details] [diff] [review]:
-----------------------------------------------------------------

The code looks pretty good.  Most of my comments are grammar/word choice stuff.

I assume you've run this with a debug build and made sure it doesn't trip assertions?

::: b2g/app/B2GLoader.cpp
@@ +34,5 @@
> +  { nullptr, nullptr }
> +};
> +
> +static bool
> +getDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)

Make this function return an int.

@@ +38,5 @@
> +getDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)
> +{
> +  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
> +  int cpsz = lastSlash - aPath + 1; // include slash
> +  if (aMaxLen <= cpsz)

This needs to be strictly less than, to leave space for the null terminator.

@@ +39,5 @@
> +{
> +  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
> +  int cpsz = lastSlash - aPath + 1; // include slash
> +  if (aMaxLen <= cpsz)
> +    return true;

and return 0 here.

@@ +42,5 @@
> +  if (aMaxLen <= cpsz)
> +    return true;
> +  strncpy(aOutDir, aPath, cpsz);
> +  aOutDir[cpsz] = 0;
> +  return false;

and return cpsz here

@@ +46,5 @@
> +  return false;
> +}
> +
> +static bool
> +getXPCOMPath(const char *aProgram, char *aOutPath, int aMaxLen)

and change the sense of the boolean return value here so that true means success

@@ +49,5 @@
> +static bool
> +getXPCOMPath(const char *aProgram, char *aOutPath, int aMaxLen)
> +{
> +  nsAutoArrayPtr<char> progBuf;
> +  progBuf = new char[aMaxLen];

on one line (might need to write progBuf(new ...);)

@@ +53,5 @@
> +  progBuf = new char[aMaxLen];
> +  nsresult rv = mozilla::BinaryPath::Get(aProgram, progBuf);
> +  if (NS_FAILED(rv)) {
> +    return true;
> +  }

NS_ENSURE_SUCCESS(rv, false);

@@ +55,5 @@
> +  if (NS_FAILED(rv)) {
> +    return true;
> +  }
> +
> +  bool err = getDirnameSlash(progBuf, aOutPath, aMaxLen);

rename this to int len once you change the boolean return values.

@@ +58,5 @@
> +
> +  bool err = getDirnameSlash(progBuf, aOutPath, aMaxLen);
> +  if (err) {
> +    return true;
> +  }

NS_ENSURE_TRUE(!!len, false)

@@ +60,5 @@
> +  if (err) {
> +    return true;
> +  }
> +
> +  int dirSize = strlen(aOutPath);

Then you can skip running strlen here and just use len from above.

@@ +62,5 @@
> +  }
> +
> +  int dirSize = strlen(aOutPath);
> +  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
> +    return true;

NS_ENSURE_TRUE(condition, false)

@@ +65,5 @@
> +  if ((dirSize + sizeof(XPCOM_DLL) + 1) > aMaxLen)
> +    return true;
> +  char *afterSlash = aOutPath + dirSize;
> +  strcpy(afterSlash, XPCOM_DLL);
> +  return false;

return true

@@ +68,5 @@
> +  strcpy(afterSlash, XPCOM_DLL);
> +  return false;
> +}
> +
> +static bool

this too

@@ +77,5 @@
> +  XPCOMGlueEnablePreload();
> +  rv = XPCOMGlueStartup(aXPCOMPath);
> +  if (NS_FAILED(rv)) {
> +    return true;
> +  }

NS_ENSURE_SUCCESS(rv, false)

@@ +82,5 @@
> +
> +  rv = XPCOMGlueLoadXULFunctions(kXULFuncs);
> +  if (NS_FAILED(rv)) {
> +    return true;
> +  }

and here

@@ +84,5 @@
> +  if (NS_FAILED(rv)) {
> +    return true;
> +  }
> +
> +  return false;

return true;

@@ +87,5 @@
> +
> +  return false;
> +}
> +
> +static bool

and here

@@ +92,5 @@
> +loadStaticData(const char *aProgram)
> +{
> +  char xpcomPath[MAXPATHLEN];
> +  bool err = getXPCOMPath(aProgram, xpcomPath, MAXPATHLEN);
> +  if (err) return true;

NS_ENSURE_SUCCESS(ok, false);

@@ +138,5 @@
> +     */
> +    return XRE_ProcLoaderServiceRun(getppid(), childSock, argc, argv);
> +  }
> +
> +  // The b2g processes

process

@@ +145,5 @@
> +  return b2g_main(argc, argv);
> +}
> +
> +/**
> + * B2G Loader is responsible for loading of the b2g process and the

loading the

@@ +151,5 @@
> + * process, and the child process, for the Nuwa process.
> + *
> + * The loader load libxul and doing initialization static data before
> + * forking, so relocation of libxul and static data could be shared
> + * among the b2g process, the Nuwa process, and content processes.

The loader loads libxul and performs initialization of static data before forking, so relocation of libxul and static data can be shared between the b2g process, the Nuwa process, and all the content processes.

::: b2g/app/nsBrowserApp.cpp
@@ +225,5 @@
>    DllBlocklist_Initialize();
>  #endif
>  
> +  // For B2G loader have initialized Gecko, here is no more used if
> +  // B2g loader is enabled.

B2G loader has already initialized Gecko so we can't initialize it again here.

@@ +272,5 @@
>  
>    int result;
>    {
>      ScopedLogging log;
> +    char **_argv;

yes, please

@@ +286,5 @@
> +
> +    for (int i = 0; i < argc; i++) {
> +      free(_argv[i]);
> +    }
> +    delete _argv;

delete[]

::: ipc/chromium/src/base/process_util_linux.cc
@@ +208,5 @@
> +                    const environment_map& env_vars_to_set,
> +                    ChildPrivileges privs,
> +                    ProcessHandle* process_handle) {
> +  size_t i;
> +  scoped_array<char*> argv_cstr(new char*[argv.size() + 1]);

I would prefer that you just used nsAutoArrayPtr, but I don't feel strongly about it.

@@ +212,5 @@
> +  scoped_array<char*> argv_cstr(new char*[argv.size() + 1]);
> +  for (i = 0; i < argv.size(); i++) {
> +    argv_cstr[i] = const_cast<char*>(argv[i].c_str());
> +  }
> +  argv_cstr[argv.size()] = NULL;

nullptr

@@ +220,5 @@
> +  for (environment_map::const_iterator it = env_vars_to_set.begin();
> +       it != env_vars_to_set.end(); ++it) {
> +    env_cstr[i++] = strdup((it->first + "=" + it->second).c_str());
> +  }
> +  env_cstr[env_vars_to_set.size()] = nullptr;

Unless I'm missing something you just wrote beyond the end of this array.

@@ +257,5 @@
>                 bool wait, ProcessHandle* process_handle,
>                 ProcessArchitecture arch) {
> +#ifdef MOZ_B2G_LOADER
> +  static bool firstTimeLaunch = true;
> +  if (!wait && firstTimeLaunch) {

+1

::: ipc/contentproc/moz.build
@@ +19,5 @@
> +    LOCAL_INCLUDES += [
> +        '/security',
> +        '/security/sandbox',
> +        '/security/sandbox/chromium',
> +    ]

Do we still need this in ipc/app?  File a followup on removing it if we don't?

::: ipc/glue/PProcLoader.ipdl
@@ +25,5 @@
> +             int32_t cookie);
> +
> +parent:
> +  /**
> +   * The acknowledge of Load().

acknowledgement

@@ +27,5 @@
> +parent:
> +  /**
> +   * The acknowledge of Load().
> +   */
> +  async LoadComplete(int32_t pid, int32_t cookie);

If we make this sync does that block startup any longer than the current code?

::: ipc/glue/ProcessUtils_linux.cpp
@@ +64,5 @@
> + *
> + * B2G loader includes an IPC protocol PProcLoader for communication
> + * between client (parent) and server (child).  The b2g process is the
> + * client, it request the server to load/start Nuwa process with give
> + * the arguments, env variables, and file descriptions.

The b2g process is the client.  It requests the server to load/start the Nuwa process with the given arguments, env variables, and file descriptors.

@@ +68,5 @@
> + * the arguments, env variables, and file descriptions.
> + *
> + * ProcLoaderClientInit() is called by B2G loader to initialize client
> + * side, the b2g process.  Then, the b2g_main() is called to start b2g
> + * process.

initialize the client side ... Then the (no comma)

@@ +73,5 @@
> + *
> + * ProcLoaderClientGeckoInit() is called by XRE_main() to create the
> + * parent actor, |ProcLoaderParent|, of PProcLoader for serving the
> + * request of running Nuwa process later, once Gecko have been
> + * initialized.

for servicing the request to run the Nuwa process once Gecko has been initialized.

@@ +80,5 @@
> + * an IOThread and event loop to serve the |ProcLoaderChild|
> + * implmentation of PProcLoader protocol as the child actor.  Once it
> + * recieves a load() request, it stops the IOThread and event loop,
> + * then start running the main function of the content process as the
> + * Nuwa process according the arguments.

then starts ... process with the given arguments

@@ +117,5 @@
> +
> +void
> +ProcLoaderParent::ActorDestroy(ActorDestroyReason aWhy)
> +{
> +}

Do we really not need to do anything here?  What happens if the child process crashes during loading?  We should make sure that that causes the phone to reboot at least.

@@ +148,5 @@
> +
> +/**
> + * Initialize the client of B2G loader for loader itself.
> + *
> +  * The initialization of B2G loader are divided into two stages. First

line up your *s

@@ +151,5 @@
> + *
> +  * The initialization of B2G loader are divided into two stages. First
> + * stage is to collect child info passed from the main program of the
> + * loader.  2nd stage is to initialize Gecko according info from the
> + * first stage and making the client of loader service ready.

Spell out second.

according to info ... and make the

@@ +357,5 @@
> +
> +  for (i = 0; i < argc; i++) {
> +    free(argv[i]);
> +  }
> +  delete argv;

delete[]

@@ +468,5 @@
> +    MOZ_ASSERT(!sProcLoaderServing);
> +    MessageLoop loop;
> +
> +    nsAutoPtr<ContentProcess> process;
> +    process = new ContentProcess(aPeerPid);

I worry that creating more than one of these in a process's lifetime will screw something up ... I guess we'll see what happens.

@@ +495,5 @@
> +
> +  for (int i = 0; i < aArgc; i++) {
> +    free(_argv[i]);
> +  }
> +  delete _argv;

delete[]

::: xpcom/threads/BackgroundHangMonitor.cpp
@@ +415,5 @@
>  {
>  #ifdef MOZ_ENABLE_BACKGROUND_HANG_MONITOR
> +  if (BackgroundHangManager::sInstance == nullptr) {
> +    MOZ_ASSERT(BackgroundHangManager::sProhibited,
> +               "BackbroundHandleManager is not initialized");

BackgroundHangManager

::: xpcom/threads/BackgroundHangMonitor.h
@@ +106,5 @@
>   *    }
>   *  }
>   *
> + * Prohibit() and Allow() allow the background hang monitor working
> + * safely before Startup().

Prohibit() and Allow() make the background hang monitor work safely before Startup().

@@ +206,5 @@
>     */
>    void NotifyWait();
> +
> +  /**
> +   * Prohibit the hang monitor from active.

Prohibit the hang monitor from activating.

@@ +210,5 @@
> +   * Prohibit the hang monitor from active.
> +   *
> +   * Startup() should not be called in the duration from Prohibit() to
> +   * Allow().  This function make the background hang monitor stopping
> +   * monitoring threads.

Startup() should not be called between Prohibit() and Allow().  This function makes the background hang monitor stop monitoring threads.

@@ +213,5 @@
> +   * Allow().  This function make the background hang monitor stopping
> +   * monitoring threads.
> +   *
> +   * Prohibit() and Allow() could be called before XPCOM is ready.
> +   * Without stop monitoring, it would go error if XPCOM is not ready.

Prohibit() and Allow() can be called before XPCOM is ready.  If we did not stop monitoring threads it could cause errors.

@@ +218,5 @@
> +   */
> +  static void Prohibit();
> +
> +  /**
> +   * Allow the hang monitor to be ran.

Allow the hang monitor to run.

@@ +220,5 @@
> +
> +  /**
> +   * Allow the hang monitor to be ran.
> +   *
> +   * Allow() and Prohibit() should be called in pair.

A call to Allow() must match a previous call to Prohibit().

@@ +224,5 @@
> +   * Allow() and Prohibit() should be called in pair.
> +   *
> +   * \see Prohibit()
> +   */
> +  static void Allow();

+1
Attachment #8418637 - Flags: review?(khuey) → review+
FWIW I think QA needs to run a full set of smoketests on this before we land it.
Comment on attachment 8418637 [details] [diff] [review]
part 2: loader of b2g and nuwa, v7

Review of attachment 8418637 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/chromium/src/base/process_util_linux.cc
@@ +257,5 @@
>                 bool wait, ProcessHandle* process_handle,
>                 ProcessArchitecture arch) {
> +#ifdef MOZ_B2G_LOADER
> +  static bool firstTimeLaunch = true;
> +  if (!wait && firstTimeLaunch) {

This check here is a fault back to plugin-container.

I will remove |!wait|, but add an assertion to make sure it is always false.  I don't think we should not depend on precondition without any check.

::: ipc/contentproc/moz.build
@@ +19,5 @@
> +    LOCAL_INCLUDES += [
> +        '/security',
> +        '/security/sandbox',
> +        '/security/sandbox/chromium',
> +    ]

plugin-container is still there as a fault back.

::: ipc/glue/ProcessUtils_linux.cpp
@@ +117,5 @@
> +
> +void
> +ProcLoaderParent::ActorDestroy(ActorDestroyReason aWhy)
> +{
> +}

The destructor would do all things, includes to close the FileDescriptor of the channel.

@@ +171,5 @@
> + * Initialize the client of B2G loader for Gecko.
> + */
> +void
> +ProcLoaderClientGeckoInit()
> +{

The comment near the head of the file explains what this function for.

@@ +270,5 @@
> +};
> +
> +
> +class ProcLoaderNoopRunner : public ProcLoaderRunnerBase {
> +public:

Use noop to avoid a condition checking in ProcLoaderServiceRun().

@@ +307,5 @@
> +};
> +
> +void
> +ProcLoaderLoadRunner::ArrangeFds()
> +{

Avoid a dependency to the internal of chromium from ipc/glue.

@@ +420,5 @@
> +}
> +
> +void
> +ProcLoaderChild::OnChannelError()
> +{

No!  A channel could be error for various reasons, one is that the peer is down.  For whatever type of error, the server should just shutdown itself normally.

::: xpcom/threads/BackgroundHangMonitor.h
@@ +224,5 @@
> +   * Allow() and Prohibit() should be called in pair.
> +   *
> +   * \see Prohibit()
> +   */
> +  static void Allow();

Since BackgroundHangMonitor already decided to call itself from MessageLoop unconditionally, following the design, it is more reasonable to check the condition by BackgroundHangMonitor.
Whiteboard: [c=memory p= s= u=tarako] [partner-blocker] → [c=memory p= s= u=tarako] [partner-blocker][MemShrink:P1]
Attachment #8419291 - Flags: review?(benjamin)
Patch review please?
Flags: needinfo?(benjamin)
Attachment #8419291 - Flags: review?(benjamin) → review+
Flags: needinfo?(benjamin)
Comment on attachment 8427503 [details] [diff] [review]
disable-android-log.diff

@Kyle, could you review this part?  This patch would disable android log temporarily before loading Nuwa on child side.
Attachment #8427503 - Flags: review?(khuey)
Comment on attachment 8427503 [details] [diff] [review]
disable-android-log.diff

Review of attachment 8427503 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsDebugImpl.cpp
@@ +72,5 @@
>  
> +#ifdef MOZ_B2G_LOADER
> +/* Avoid to call Android logger/logd for temporary during
> + * running service of B2GLoader at child process.
> + */

Avoid calling Android logger/logd temporarily while running B2GLoader to start the child process.
Attachment #8427503 - Flags: review?(khuey) → review+
Can not pass testcases with debug build.  I am working on it.
(In reply to Thinker Li [:sinker] from comment #96)
> Can not pass testcases with debug build.  I am working on it.

Which test case, can you tell us more detail information?
Renom. This is too high risk to take into the 1.3T release at this point. At this point of release, the quality organization needs the blocker list to only contain patches that have manageable risk since we need to have the Tarako release finished off with no instability changes due to our current resource capacity.
blocking-b2g: 1.3T+ → 1.3T?
if i recall correctly, James you already have taken the patches on your side, can you please confirm? thanks
Flags: needinfo?(james.zhang)
I have landed this WIP patch on my side and tested about 1 month, I don't think it has very high risk.
Why not land it on v1.3t?
Flags: needinfo?(james.zhang)
(In reply to James Zhang from comment #100)
> I have landed this WIP patch on my side and tested about 1 month, I don't
> think it has very high risk.
> Why not land it on v1.3t?

That doesn't really count though - that was tested with unfinished patches. There can be problems with risk with more updated patches post landing. Additionally, nuwa patches historically can have regressions in random, unexpected places, so I don't think I agree with the assessment here. https://bugzilla.mozilla.org/show_bug.cgi?id=977026#c91 also hints heavily that this is risky, since development is asking for a full smoketest pass before this lands.

Mozilla's quality team right now has a firm need that patches can only land on 1.3T if the patch is low risk & is deemed a critical blocker. I don't think this patch hits that criteria.
1.3T is a branch specifically for this device.  So if the OEM is going to take these patches on their branch why wouldn't we land them on 1.3T?

This is quite risky, but the partner has already decided they are willing to live with the risk.  What do we gain from not landing it on our branch?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #102)
> 1.3T is a branch specifically for this device.  So if the OEM is going to
> take these patches on their branch why wouldn't we land them on 1.3T?

That's probably a mistake on their part, but that's not something I can control.

> 
> This is quite risky, but the partner has already decided they are willing to
> live with the risk.  What do we gain from not landing it on our branch?

We risk not being able to shut down the release asap. That's a need that Mozilla QA needs asap, since we've hit resource capacity that prevents significant testing of Tarako going forward that can handle instability (aka Friday will be the last day smoketests will be supported). The mitigation plan that was established in place here was a requirement that we lock down the branch asap to low risk patches that are critical, since we need to stabilize and shutdown asap.

If you have concerns, please followup with me, Tony, and Clint offline.
1.3T is owned by SPRD. They can land any patch they like. SPRD taking over testing from here on makes sense.
(In reply to Andreas Gal :gal from comment #104)
> 1.3T is owned by SPRD. They can land any patch they like. SPRD taking over
> testing from here on makes sense.

Let me followup offline with Clint, Tony, & nhirata and get back to you on this comment.
(In reply to Joe Cheng [:jcheng] from comment #99)
> if i recall correctly, James you already have taken the patches on your
> side, can you please confirm? thanks

Yes, We applied these patches inside sprd.
and run monkey test continuesly with these patches for more than a week.

we meet one minidump
https://bugzilla.mozilla.org/show_bug.cgi?id=906108


ffos@ffos2:~/yingxu/6821/sprd_patch/gonk4.0$ ls bug977026*    
bug977026-part1.patch  bug977026-part2.patch  bug977026-part3.patch  bug977026-part4-disable-android-log.patch
FWIW, https://bugzilla.mozilla.org/show_bug.cgi?id=1007019#c70 is the sort of thing we want to avoid here ...

I think this discussion is already happening offline but after Mozilla QA terminates its responsibility for this branch (on Friday?) we should land everything SPRD is going to take on our 1.3t branch.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #107)
> FWIW, https://bugzilla.mozilla.org/show_bug.cgi?id=1007019#c70 is the sort
> of thing we want to avoid here ...
> 
> I think this discussion is already happening offline but after Mozilla QA
> terminates its responsibility for this branch (on Friday?) we should land
> everything SPRD is going to take on our 1.3t branch.

Yes, please land it or give ying v1.3t gecko merge permission to land it.
Blocks: 1007019
Comment on attachment 8440753 [details] [diff] [review]
part 1: allow thread registering before the profiler is initialized, v2

Minor changes to pass testcases with debug build.

bgirard, could you review it again?
Attachment #8440753 - Flags: review?(bgirard)
Merge part2 and disable-android-log.diff to one patch.  Call SendLoad() async and on main thread.

Kyle and Cervantes, could you review it again?
Attachment #8418637 - Attachment is obsolete: true
Attachment #8427503 - Attachment is obsolete: true
Attachment #8440762 - Flags: review?(khuey)
Attachment #8440762 - Flags: review?(cyu)
A minor change.  This patch passes an object of nsXREAppData to XRE_ProcLoaderPreload() and setup nsXULAppInfo that is required by manifest parser to evaluate flags.

Benjamin, could you review it again?
Attachment #8419291 - Attachment is obsolete: true
Attachment #8440769 - Flags: review?(benjamin)
Comment on attachment 8440753 [details] [diff] [review]
part 1: allow thread registering before the profiler is initialized, v2

Review of attachment 8440753 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry I'd like to understand better how this change works and hopefully I wont regress it when I make further changes to the profiler:

All of these calls need to be balanced. So if we ignore the register and then we init then the unregister will be unmatched. This means it's no longer safe to init the profiler while we're in the middle of these calls. So either these calls should happen for the nuwa process or they are potentially unsafe?
(In reply to Benoit Girard (:BenWa) from comment #113)
> Comment on attachment 8440753 [details] [diff] [review]
> part 1: allow thread registering before the profiler is initialized, v2
> 
> Review of attachment 8440753 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry I'd like to understand better how this change works and hopefully I
> wont regress it when I make further changes to the profiler:
> 
> All of these calls need to be balanced. So if we ignore the register and
> then we init then the unregister will be unmatched. This means it's no
> longer safe to init the profiler while we're in the middle of these calls.
> So either these calls should happen for the nuwa process or they are
> potentially unsafe?

The nuwa process would wait for the terminations of all threads, except the main thread, before calling NS_InitXPCOM2().  That means all function calls are in paired.
Attachment #8440753 - Flags: review?(bgirard) → review+
This is interdiff of part 3 between current and previous revision.
Attached patch interdiff-b2g-loader.diff (obsolete) — Splinter Review
This is interdiff of part 2 between the current and previous revision.
I believe that I have interdiff in the wrong direction. Sorry!
The interdiff is not v8 - v7. There is something missing in between. This is the correct interdiff.
Attachment #8442576 - Attachment is obsolete: true
disable-android-log.diff was merged to v7 of part 2 before running interdiff.
Target Milestone: --- → 2.0 S5 (4july)
I'm a little worried about the appdata preloading in this case. A pretty fundamental difference between content processes and the main process is that content processes only have the core-gecko components installed: they don't have any app components registered via XPCOM.

Now in this case, since it's only XPT files, that might not be a big deal, but we at least need to note this in various places that we're leaking app interfaces into content processes.
Comment on attachment 8440769 [details] [diff] [review]
part 3: preload XPT before calling fork, v4

So I'm going to mark a wary r+ on here, with the understanding that you're going to add scary comments to the preload functions about XPT leakage.
Attachment #8440769 - Flags: review?(benjamin) → review+
Comment on attachment 8440762 [details] [diff] [review]
part 2: loader of b2g and nuwa, v8

Review of attachment 8440762 [details] [diff] [review]:
-----------------------------------------------------------------

r+ but still needs another review before landing.

::: b2g/app/B2GLoader.cpp
@@ +39,5 @@
> +static int
> +GetDirnameSlash(const char *aPath, char *aOutDir, int aMaxLen)
> +{
> +  char *lastSlash = strrchr(aPath, XPCOM_FILE_PATH_SEPARATOR[0]);
> +  int cpsz = lastSlash - aPath + 1; // include slash

cpsz can be negative and this function will return a wrong value. Please add null check of lastSlash or assert it to be non-null.

@@ +147,5 @@
> + * shared between the b2g process, the Nuwa process, and the content
> + * processes.
> + */
> +int
> +main(int argc, const char* argv[])

Why do you add const to argv? This makes it necessary to make b2g_main() receive const char* argv, and then duplicate argv before calling do_main().

::: b2g/app/nsBrowserApp.cpp
@@ +277,5 @@
> +    char **_argv;
> +
> +    /*
> +     * Duplicate argument vector to conform non-const argv of
> +     * do_main() since XRE_main() is very stupid with non-const argv.

Another alternative is don't add const to main() in B2GLoader.cpp, and you won't need to duplicate argv to conform to do_main() and XRE_main(). How does it sound?

::: ipc/app/MozillaRuntimeMain.cpp
@@ -29,5 @@
> -#endif
> -
> -#ifdef MOZ_WIDGET_GONK
> -# include <sys/time.h>
> -# include <sys/resource.h> 

nit: trailing space.

@@ -56,5 @@
> -static void
> -InitializeBinder(void *aDummy) {
> -    // Change thread priority to 0 only during calling ProcessState::self().
> -    // The priority is registered to binder driver and used for default Binder
> -    // Thread's priority. 

nit: trailing space.

::: ipc/chromium/src/base/process_util_linux.cc
@@ +254,5 @@
>                 ChildPrivileges privs,
>                 bool wait, ProcessHandle* process_handle,
>                 ProcessArchitecture arch) {
> +#ifdef MOZ_B2G_LOADER
> +  static bool firstNuwaLaunch = true;

nit: firstNuwaLaunch doesn't sound right. This sounds like there is secondNuwaLaunch and so on. I think firstLaunch is enough.

::: ipc/glue/ProcessUtils_linux.cpp
@@ +191,5 @@
> +  sProcLoaderParent->Open(transport,
> +                          sProcLoaderPid,
> +                          XRE_GetIOMessageLoop(),
> +                          ParentSide);
> +  sProcLoaderLoop = MessageLoop::current();

Is there any special reason to save MessageLoop::current() here for later use? Is it different from MessageLoop::current() in ProcLoaderLoad()? If not then I think call MessageLoop::current() directly is clearer.

@@ +277,5 @@
> +  *aProcessHandle = sProcLoaderPid;
> +  sProcLoaderPid = 0;
> +
> +  sProcLoaderLoop->PostTask(FROM_HERE,
> +                            NewRunnableFunction(AsyncSendLoad, load));

Why do you call PProcLoaderParent::SendLoad() asynchronously? Is it necessary?

@@ +335,5 @@
> + * This is a duplication of ShuffleFileDescriptors() to avoid a
> + * dependency on the internal of chromium.
> + *
> + * see ShuffleFileDescriptors()
> + */

This comment is incorrect. This isn't a duplicate of ShuffleFileDescriptors() (and it calls ShuffleFileDescriptors()). It's part of base::LaunchApp() (the child side).

::: xpcom/threads/BackgroundHangMonitor.h
@@ +210,5 @@
> +   * Prohibit the hang monitor from activating.
> +   *
> +   * Startup() should not be called between Prohibit() and Allow().
> +   * This function makes the background hang monitor stopping
> +   * monitoring threads.

nit: This function makes the background hang monitor stop monitoring threads.
Attachment #8440762 - Flags: review?(cyu) → review+
Flags: needinfo?(ttsai)
with patches "part 1/2/3/4 for 1.3T"
and enable dmd on v1.3t. 

no apps can be launched. all apps were stuck at launching window. get_about_memory.py run fault.

seems nuwa process run fault
apusr@yingxuubt:~/buglog/479$ adb shell b2g-ps
APPLICATION      USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              root      653   1     150680 64912 ffffffff 400e9518 S /system/b2g/b2g
b2g              root      669   653   42692  5468  ffffffff 400e9724 S /system/b2g/b2g

after killing nuwa manully, apps can be launched successfully.

But get_about_memory.py still run fault.
ni? Thinker
Let's move this to backlog for now, partner can take this on their branch as they already have. Thanks
blocking-b2g: 1.3T? → backlog
Flags: needinfo?(kkuo)
Comment on attachment 8440762 [details] [diff] [review]
part 2: loader of b2g and nuwa, v8

I didn't look at this that closely ... I trust Cervantes :)
Attachment #8440762 - Flags: review?(khuey) → review+
Flags: needinfo?(kkuo)
--
Attachment #8440762 [details] [diff] - Attachment is obsolete: true
Attachment #8462944 - Flags: review?(khuey)
Attachment #8462944 - Flags: review?(cyu)
--
Attachment #8440769 [details] [diff] - Attachment is obsolete: true
Attachment #8462945 - Flags: review?(benjamin)
Attachment #8440753 - Attachment is obsolete: true
Attachment #8440762 - Attachment is obsolete: true
Attachment #8440769 - Attachment is obsolete: true
Attachment #8462943 - Flags: review?(bgirard) → review+
Attachment #8462944 - Flags: review?(khuey)
Attachment #8462944 - Flags: review?(cyu)
Attachment #8462944 - Flags: review+
Attachment #8462945 - Flags: review?(benjamin) → review+
please check in just for part 1 ~ 3 on m-c.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7ff3cd713466
https://hg.mozilla.org/mozilla-central/rev/f8ec5977a454
https://hg.mozilla.org/mozilla-central/rev/c7c37390b46b
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: 2.0 S5 (4july) → 2.1 S1 (1aug)
Whiteboard: [c=memory p= s= u=tarako] [partner-blocker][MemShrink:P1] → [c=memory p= s=2014.08.01.t u=tarako] [partner-blocker][MemShrink:P1]
19:09:56    ERROR -  ../../../gecko/ipc/glue/ProcessUtils_linux.cpp:272: error: no matching function for call to 'mozilla::ipc::FDRemap::FDRemap(const int&, const int&)'
19:09:56     INFO -  ../ipdl/_ipdlheaders/mozilla/ipc/PProcLoader.h:40: note: candidates are: mozilla::ipc::FDRemap::FDRemap(const mozilla::ipc::FDRemap&)
19:09:56     INFO -  ../ipdl/_ipdlheaders/mozilla/ipc/PProcLoader.h:32: note:                 mozilla::ipc::FDRemap::FDRemap(const mozilla::ipc::FileDescriptor&, const int&)
19:09:56     INFO -  ../ipdl/_ipdlheaders/mozilla/ipc/PProcLoader.h:30: note:                 mozilla::ipc::FDRemap::FDRemap()

gcc 4.9 adds a more obvious message:
note:   no known conversion for argument 1 from 'const int' to 'const FileDescriptor& {aka const mozilla::ipc::FileDescriptor&}'

I really don't know how this built at all before reaching m-i.
Assignee: tlee → karlt
Assignee: karlt → tlee
What is the rule to backout changeset here?!  Since my change had been committed into m-c before another, I don't why people choose to backout my changes.  Could any one explain to me the rule?
I don't think Mike understood that 62e1c853536b was involved when he backed this out.  The changes here seemed to be the code that didn't compile so he backed out that code.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8463815 [details] [diff] [review]
use int for 'from' fd in FDRemap, for consistency with 'to' fd

Review of attachment 8463815 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/PProcLoader.ipdl
@@ +7,5 @@
>  namespace mozilla {
>  namespace ipc {
>  
>  struct FDRemap {
> +  int mapfrom;

This change does not work since the file descriptor should be transferred through IPC. The original code sends fd to the peer of the IPC channel, but this change sends an int value, not a file descriptor expected.  So, you can not read()/write() on an int value get from a IPC message.
Attachment #8463815 - Flags: review?(tlee) → review-
(In reply to Thinker Li [:sinker] from comment #139)
> This change does not work since the file descriptor should be transferred
> through IPC.

Ah, OK.  Thanks for the explanation.
(In reply to Thinker Li [:sinker] from comment #137)
> What is the rule to backout changeset here?!  Since my change had been
> committed into m-c before another, I don't why people choose to backout my
> changes.  Could any one explain to me the rule?

There isn't a set rule for resolving bustage that results from merges; if anything, I think the right thing to do is back out the first patch found to be problematic, so that the tree can reopen.  It's a rare case and a difficult one to investigate.

That said, I think backing out this patch was reasonable even knowing both sides of the problematic merge, since the code from this bug was relying on implicit constructors on which it should not have been relying, and which seem likely to have been unclear to authors and/or reviewers of that code, and thus at least appears to be faulty if not actually being faulty.

If the dependence on the implicit constructors was actually intended, adding them in explicitly and getting review for that addition should be simple.
--
Attachment #8462944 [details] [diff] - Attachment is obsolete: true
Attachment #8462944 - Attachment is obsolete: true
Comment on attachment 8464606 [details] [diff] [review]
part 2: loader of b2g and nuwa, v9

call constructor of FileDescriptor explicit.
Attachment #8464606 - Flags: review+
See Also: → 845738
Please check in part 1 ~ 3 for m-c.  Thanks!
Keywords: checkin-needed
Blocks: 1048024
Hi Thinker,

First, thank you very much for your outstanding contribution in this bug.

Recently, I found the keyboard bn-Avro in SPRD version cannot work well as the attached video shown. After testing, it seems the issue is introduced by the patches here and we can see it works fine shown in '-without-patches-.3gp'.

It looks like that there's always an excrescent copy of current buffer(typed contents) inserted before the new input. Just like the Space typed, the actual content shown is 'previous contents'+' ' rather than ' '. Confusingly, the weird problem only happened on the keyboard bn-Avro.

Could you please help to check the issue?

Thanks a lot.

[1] 8420000: part 1 for 1.3T
[2] 8420001: part 2 for 1.3T
[3] 8420003: part 3 for 1.3T
Flags: needinfo?(tlee)
I suggest to open a new bug for this.  And, ni tim for the new bug too.
Flags: needinfo?(tlee)
(In reply to Thinker Li [:sinker] from comment #149)
> I suggest to open a new bug for this.  And, ni tim for the new bug too.

Let's track the issue on https://bugzilla.mozilla.org/show_bug.cgi?id=1051228

Thanks.
Depends on: 1054191
Depends on: 1079451
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.