Closed Bug 1314466 Opened 3 years ago Closed 3 years ago

Create child processes as an Android Service instead of using fork

Categories

(GeckoView :: General, defect)

defect
Not set

Tracking

(firefox53 fixed)

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: rbarker, Assigned: rbarker)

References

(Blocks 2 open bugs)

Details

Attachments

(7 files, 8 obsolete files)

6.41 KB, patch
Details | Diff | Splinter Review
3.92 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
1.43 KB, patch
Details | Diff | Splinter Review
7.91 KB, patch
Details | Diff | Splinter Review
33.73 KB, patch
snorp
: review+
Details | Diff | Splinter Review
5.07 KB, patch
Details | Diff | Splinter Review
3.89 KB, patch
Details | Diff | Splinter Review
Fennec currently uses fork to create child processes. Creating child processes as a service allows the child service to contain a dalvik VM that enables the usage of the binder between processes. The binder is required for passing surfaces between processes which is required to enable a GPU process as well as video in content processes.
Assignee: nobody → rbarker
Blocks: 1307960
Blocks: 1314759
Comment on attachment 8806877 [details] [diff] [review]
0001-Bug-1314466-part-1-Allow-android-to-set-client-channel-fd-for-ipc-16110213-e6c26b8.patch

This patch allows android to set the ipc FD handle when the service is started. The FD is mapped into the child service process by an android.os.ParcelFileDescriptor.
Attachment #8806877 - Flags: review?(wmccloskey)
Comment on attachment 8806878 [details] [diff] [review]
0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-crash-reporter-16110213-41f99ad6.patch

This patch allows android to set the Crash Reporter ipc FD handle when the child service process is started. The FD is mapped into the child service process by an android.os.ParcelFileDescriptor.
Attachment #8806878 - Flags: review?(ted)
Comment on attachment 8806879 [details] [diff] [review]
0003-Bug-1314466-part-3-Fennec-child-processes-are-no-longer-forked-so-kill-and-waitpid-do-not-work-as-expected-and-must-gracefully-handle-error-cases-16110213-40f4caa.patch

This patch prevent KillProcess from attempting 60 times to kill a child service process if it no longer exists.
Attachment #8806879 - Flags: review?(wmccloskey)
Comment on attachment 8806880 [details] [diff] [review]
0004-Bug-1314466-part-4-update-GeckoChildProcessHost-to-call-LaunchAndroidService-16110213-9825750.patch

This patch updates GeckoChildProcessHost so that it call LaunchAndriodService to start child processes. Additionally, it remove code that is no longer necessary for creating the child process on Android.
Attachment #8806880 - Flags: review?(wmccloskey)
Comment on attachment 8806881 [details] [diff] [review]
0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch

Code required to create and manage service based child processes in Fennec.
Attachment #8806881 - Flags: review?(nchen)
Comment on attachment 8806882 [details] [diff] [review]
0006-Bug-1314466-part-6-Add-new-files-to-build-16110213-0312bef.patch

Updates to build new java files and add new services to manifest files.
Attachment #8806882 - Flags: review?(nchen)
Attachment #8806883 - Flags: review?(nchen)
Comment on attachment 8806877 [details] [diff] [review]
0001-Bug-1314466-part-1-Allow-android-to-set-client-channel-fd-for-ipc-16110213-e6c26b8.patch

Review of attachment 8806877 [details] [diff] [review]:
-----------------------------------------------------------------

Please update the comment here:
http://searchfox.org/mozilla-central/rev/f5c9e9a249637c9abd88754c8963ecb3838475cb/ipc/chromium/src/chrome/common/ipc_channel_posix.cc#52-91
This would be a good place to describe the whole scheme you're setting up here.

::: ipc/chromium/src/chrome/common/ipc_channel.h
@@ +147,5 @@
>    // For portability the prefix should not include the \ character.
>    static std::wstring GenerateVerifiedChannelID(const std::wstring& prefix);
>  
> +#if defined(MOZ_WIDGET_ANDROID)
> +  static void SetClientChannelFd(int fd);

Please comment this as well. Maybe reference the comment at the top of ipc_channel_posix.cc.

::: ipc/chromium/src/chrome/common/ipc_channel_posix.cc
@@ +133,5 @@
>  
>  // This is the file descriptor number that a client process expects to find its
>  // IPC socket.
> +#if defined(MOZ_WIDGET_ANDROID)
> +static int kClientChannelFd = 3;

Let's make this non-const for all platforms and call it gClientChannelFd.
Attachment #8806877 - Flags: review?(wmccloskey)
Comment on attachment 8806879 [details] [diff] [review]
0003-Bug-1314466-part-3-Fennec-child-processes-are-no-longer-forked-so-kill-and-waitpid-do-not-work-as-expected-and-must-gracefully-handle-error-cases-16110213-40f4caa.patch

Review of attachment 8806879 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/chromium/src/base/process_util_posix.cc
@@ +70,5 @@
>  bool KillProcess(ProcessHandle process_id, int exit_code, bool wait) {
>    bool result = kill(process_id, SIGTERM) == 0;
>  
> +  if (!result && (errno == ESRCH)) {
> +    result = true;

Can you only set wait to false (and not set result to true). That way we get a log message, which seems good to me (you didn't explain why you need this patch).
Attachment #8806879 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #15)
> Comment on attachment 8806879 [details] [diff] [review]
> 0003-Bug-1314466-part-3-Fennec-child-processes-are-no-longer-forked-so-kill-
> and-waitpid-do-not-work-as-expected-and-must-gracefully-handle-error-cases-
> 16110213-40f4caa.patch
> 
> Review of attachment 8806879 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: ipc/chromium/src/base/process_util_posix.cc
> @@ +70,5 @@
> >  bool KillProcess(ProcessHandle process_id, int exit_code, bool wait) {
> >    bool result = kill(process_id, SIGTERM) == 0;
> >  
> > +  if (!result && (errno == ESRCH)) {
> > +    result = true;
> 
> Can you only set wait to false (and not set result to true). That way we get
> a log message, which seems good to me (you didn't explain why you need this
> patch).

Sorry, I thought I had explained it some where but it seems to have slipped through the cracks. The issue I was working around is that since the "child" processes are now new processes and not created from a fork, kill will fail if the service based child process has already exited. This would cause the waitpid to be called 60 times and hang the process for a minute (sleep(1) x 60). waitpid does seem to work on android even when the process being waited on is not a child from fork. However if the process has already exited then waitpid also fails unlike when the child is created from a fork. So both of these checks are to prevent KillProcess from spinning if the target process has already died.

As for leaving the result value as false, I actually didn't want the log message since technically we were not "Unable to terminate process.", something else just beat us to it. I can change it but the log message felt misleading to me.
> Sorry, I thought I had explained it some where but it seems to have slipped through the
> cracks. The issue I was working around is that since the "child" processes are now new
> processes and not created from a fork, kill will fail if the service based child process has
> already exited.

Is the problem coming from EnsureProcessTerminated? Given what you've described, it feels like it would be better to patch that code instead (maybe so it doesn't do anything on Android).
Comment on attachment 8806880 [details] [diff] [review]
0004-Bug-1314466-part-4-update-GeckoChildProcessHost-to-call-LaunchAndroidService-16110213-9825750.patch

Review of attachment 8806880 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +1224,5 @@
> +
> +  int argvSize = argv.size();
> +  jobjectArray jargs = env->NewObjectArray(argvSize, env->FindClass("java/lang/String"), nullptr);
> +  for (int ix = 0; ix < argvSize; ix++) {
> +    jstring string =  AndroidBridge::NewJavaString(env, argv[ix].c_str());

Two spaces after the =.

@@ +1227,5 @@
> +  for (int ix = 0; ix < argvSize; ix++) {
> +    jstring string =  AndroidBridge::NewJavaString(env, argv[ix].c_str());
> +    env->SetObjectArrayElement(jargs, ix, string);
> +  }
> +  base::file_handle_mapping_vector::const_iterator it = fds_to_remap.begin();

Please assert that fds_to_remap has size 1 or 2.
Attachment #8806880 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #17)
> > Sorry, I thought I had explained it some where but it seems to have slipped through the
> > cracks. The issue I was working around is that since the "child" processes are now new
> > processes and not created from a fork, kill will fail if the service based child process has
> > already exited.
> 
> Is the problem coming from EnsureProcessTerminated? Given what you've
> described, it feels like it would be better to patch that code instead
> (maybe so it doesn't do anything on Android).

Yes, I hit this path early on when the child process was not exiting cleanly due to the fact that I missed closing a dup of the ipc channel fd in Java land. I think it is okay to try and kill as a last resort. It is just kill and waitpid behave a little differently when the target process was not created from a fork of the parent process.

I looked into how this is currently handled in modern Cr for android. I believe they have a callback into Java to unbind the service which will cause it to exit. I thought about doing this but decided to leave the code mostly as-is since it seemed to work.
(In reply to Bill McCloskey (:billm) from comment #14)
> Comment on attachment 8806877 [details] [diff] [review]
> 0001-Bug-1314466-part-1-Allow-android-to-set-client-channel-fd-for-ipc-
> 16110213-e6c26b8.patch
> 
> Review of attachment 8806877 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Please update the comment here:
> http://searchfox.org/mozilla-central/rev/
> f5c9e9a249637c9abd88754c8963ecb3838475cb/ipc/chromium/src/chrome/common/
> ipc_channel_posix.cc#52-91
> This would be a good place to describe the whole scheme you're setting up
> here.
> 

Here is the comment I plan to add. I figure I would post it here first in case there are any points you would like me to cover that I failed to include.

// On Android, child processes are created as a service instead of
// forking the parent process. The Android Binder service is used to
// transport the IPC channel file descriptor to the child process.
// So rather than re-mapping the file descriptor to a known value,
// the received channel file descriptor is set by calling
// SetClientChannelFd before gecko has been initialized and started
// in the child process.
Attachment #8806883 - Flags: review?(nchen) → review+
(In reply to Randall Barker [:rbarker] from comment #20)
> (In reply to Bill McCloskey (:billm) from comment #14)
> > Comment on attachment 8806877 [details] [diff] [review]
> > 0001-Bug-1314466-part-1-Allow-android-to-set-client-channel-fd-for-ipc-
> > 16110213-e6c26b8.patch
> > 
> > Review of attachment 8806877 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Please update the comment here:
> > http://searchfox.org/mozilla-central/rev/
> > f5c9e9a249637c9abd88754c8963ecb3838475cb/ipc/chromium/src/chrome/common/
> > ipc_channel_posix.cc#52-91
> > This would be a good place to describe the whole scheme you're setting up
> > here.
> > 
> 
> Here is the comment I plan to add. I figure I would post it here first in
> case there are any points you would like me to cover that I failed to
> include.
> 
> // On Android, child processes are created as a service instead of
> // forking the parent process. The Android Binder service is used to
> // transport the IPC channel file descriptor to the child process.
> // So rather than re-mapping the file descriptor to a known value,
> // the received channel file descriptor is set by calling
> // SetClientChannelFd before gecko has been initialized and started
> // in the child process.

OK, that seems good. Please also update the existing references to kClientChannelFd in the comments to reflect the new name.
Comment on attachment 8806882 [details] [diff] [review]
0006-Bug-1314466-part-6-Add-new-files-to-build-16110213-0312bef.patch

Review of attachment 8806882 [details] [diff] [review]:
-----------------------------------------------------------------

Need :nalexander to review build config changes.

::: mobile/android/base/AndroidManifest.xml.in
@@ +443,5 @@
>              android:isolatedProcess="false">
>          </service>
>  
> +        <!-- New child services must also be added to the GeckoView AndroidManifest.xml -->
> +        <service

We probably won't support e10s in Fennec right? So we don't want these in the Fennec AndroidManifest? Or at least put these behind a build flag?
Attachment #8806882 - Flags: review?(nchen) → review?(nalexander)
(In reply to Jim Chen [:jchen] [:darchons] from comment #22)
> Comment on attachment 8806882 [details] [diff] [review]
> 0006-Bug-1314466-part-6-Add-new-files-to-build-16110213-0312bef.patch
> 
> Review of attachment 8806882 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Need :nalexander to review build config changes.
> 
> ::: mobile/android/base/AndroidManifest.xml.in
> @@ +443,5 @@
> >              android:isolatedProcess="false">
> >          </service>
> >  
> > +        <!-- New child services must also be added to the GeckoView AndroidManifest.xml -->
> > +        <service
> 
> We probably won't support e10s in Fennec right? So we don't want these in
> the Fennec AndroidManifest? Or at least put these behind a build flag?

Actually we do. Since both GMP and the GPU process (currently in progress) will be used in Fennec. Additionally, e10s is tested in Fennec on try (I was surprised to find this out).
Comment on attachment 8806882 [details] [diff] [review]
0006-Bug-1314466-part-6-Add-new-files-to-build-16110213-0312bef.patch

Review of attachment 8806882 [details] [diff] [review]:
-----------------------------------------------------------------

Just a nit/suggestion.

::: mobile/android/geckoview/build.gradle
@@ +36,5 @@
>  
>      sourceSets {
>          main {
> +            aidl {
> +                srcDir "${topsrcdir}/mobile/android/geckoview/src/aidl"

The default directory for aidl is "geckoview/src/main/aidl"; if you put the aidl definition there, you'll conform to the Android convention and shouldn't need this line.

::: mobile/android/geckoview/src/main/AndroidManifest.xml
@@ +35,5 @@
>  
>      <!-- App requires OpenGL ES 2.0 -->
>      <uses-feature android:glEsVersion="0x00020000" android:required="true" />
>  
> +    <!-- New child services must also be added to the Fennec AndroidManifest.xml.in -->

Man, I can't wait until we just use Gradle, so these types of duplications are better handled.
Attachment #8806882 - Flags: review?(nalexander) → review+
Comment on attachment 8806878 [details] [diff] [review]
0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-crash-reporter-16110213-41f99ad6.patch

Review of attachment 8806878 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/nsExceptionHandler.h
@@ +263,5 @@
>  
>  bool UnsetRemoteExceptionHandler();
>  
>  #if defined(MOZ_WIDGET_ANDROID)
> +// Android creates child process as services so we need explicitly set

nit: "so we need to", or "so we must".
Comment on attachment 8806881 [details] [diff] [review]
0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch

Review of attachment 8806881 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoThread.java
@@ +541,5 @@
>          Log.w(LOGTAG, "zerdatime " + SystemClock.uptimeMillis() + " - runGecko");
>  
>          if (!AppConstants.MOZILLA_OFFICIAL) {
> +            String msg = new String("RunGecko - args =");
> +            for(String arg : args) {

nit: space after for.

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoProcessManager.java
@@ +25,5 @@
> +import java.util.concurrent.TimeUnit;
> +import java.util.HashMap;
> +import java.util.Map;
> +
> +public final class GeckoProcessManager { // implements IBinder.DeathRecipient {

nit: drop comment.

Also, can you include a class comment explaining how this fits into the larger scheme?

@@ +115,5 @@
> +    public int start(String type, String[] args, int crashFd, int ipcFd) {
> +        if (mConnections.containsKey(type)) {
> +            Log.e(LOGTAG, "\nAttempting to start a child process service that is already running. Attempting to kill existing process first");
> +            ChildConnection connection = mConnections.get(type);
> +            connection.prepareToWait();

Isn't this racy?  You're re-setting the latch value, and waiting for it, potentially concurrently.  How are you guarding this?

Should the latch be specific to a connection _attempt_?  Or, should this just be `synchronized`, so you can't race starters?    Or, is there an external guarantee that start never races against the same `type`?  Or, do you want instead to wake up N readers after `stop`, and have a separate concurrency primitive for start and stop?

Apologies if this is all obvious to the platform Android team -- but it's definitely not clear what your expected semantics are to me.

@@ +125,5 @@
> +            }
> +        }
> +
> +        try {
> +            ChildConnection connection = new ChildConnection(type);

nit: final everything?
(In reply to Nick Alexander :nalexander (leave until January 2017) from comment #26)
> Comment on attachment 8806881 [details] [diff] [review]
> 0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch
> 
> Review of attachment 8806881 [details] [diff] [review]:
> ----------------------------------------------------------------- 
> @@ +115,5 @@
> > +    public int start(String type, String[] args, int crashFd, int ipcFd) {
> > +        if (mConnections.containsKey(type)) {
> > +            Log.e(LOGTAG, "\nAttempting to start a child process service that is already running. Attempting to kill existing process first");
> > +            ChildConnection connection = mConnections.get(type);
> > +            connection.prepareToWait();
> 
> Isn't this racy?  You're re-setting the latch value, and waiting for it,
> potentially concurrently.  How are you guarding this?
> 
> Should the latch be specific to a connection _attempt_?  Or, should this
> just be `synchronized`, so you can't race starters?    Or, is there an
> external guarantee that start never races against the same `type`?  Or, do
> you want instead to wake up N readers after `stop`, and have a separate
> concurrency primitive for start and stop?
> 
> Apologies if this is all obvious to the platform Android team -- but it's
> definitely not clear what your expected semantics are to me.
> 

I'm not sure I follow your question. There can only ever be one connection per service type. If we are trying to start a service and there is already a connection, it must be a zombie so the code attempts to kill the zombie before creating a new service process connection to start. The connection is only added to the map after the wait has completed so once it is in the map, it is okay to use the latch again to wait to stop. If start is called on multiple threads at the same time for the same type, something has gone horribly wrong. All child process are currently launched from the IO thread I believe.
Updated to address review comments.
Attachment #8806877 - Attachment is obsolete: true
Attachment #8807351 - Flags: review?(wmccloskey)
Comment on attachment 8806881 [details] [diff] [review]
0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch

Review of attachment 8806881 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoThread.java
@@ +160,5 @@
>          return false;
>      }
>  
> +    public static boolean initChildProcess(GeckoProfile profile, String[] args, int crashFd, int ipcFd, boolean debugging) {
> +        if (init(profile, "", "", debugging)) {

Use null instead of empty strings

@@ +431,5 @@
> +        if (args != null) {
> +            StringTokenizer st = new StringTokenizer(args);
> +            while (st.hasMoreTokens()) {
> +                String token = st.nextToken();
> +                if (token == "-P" || token == "-profile") {

"-P".equals(token) || "-profile".equals(token)

@@ +472,4 @@
>          }
>  
> +        String[] result = new String[args.size()];
> +        return args.toArray(result);

toArray(new String[args.size()])

@@ +521,5 @@
>              } catch (final InterruptedException e) {
>              }
>          }
>  
> +        String[] args = null;

final String[] args;

@@ +542,5 @@
>  
>          if (!AppConstants.MOZILLA_OFFICIAL) {
> +            String msg = new String("RunGecko - args =");
> +            for(String arg : args) {
> +                msg += " " + arg;

Use TextUtils.join

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/mozglue/GeckoLoader.java
@@ +149,5 @@
>                  Log.d(LOGTAG, "env" + c + ": " + env);
>              }
> +            if (envList.size() > 0) {
> +              sEnvList = new String[envList.size()];
> +              sEnvList = envList.toArray(sEnvList);

toArray(new String[envList.size()])

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoProcessManager.java
@@ +27,5 @@
> +import java.util.Map;
> +
> +public final class GeckoProcessManager { // implements IBinder.DeathRecipient {
> +    private static final String LOGTAG = "GeckoProcessManager";
> +    private static GeckoProcessManager sManager = null;

private static final GeckoProcessManager INSTANCE = new GeckoProcessManager();

Since the first reference to GeckoProcessManager is most likely to call getInstance(), it's better to just initialize it in a static final field. Then you can also drop `synchronized` from getInstance().

@@ +37,5 @@
> +
> +        return sManager;
> +    }
> +
> +    private class ChildConnection implements ServiceConnection, IBinder.DeathRecipient {

private static final class probably?

@@ +40,5 @@
> +
> +    private class ChildConnection implements ServiceConnection, IBinder.DeathRecipient {
> +        public final String mType;
> +        private volatile CountDownLatch mConnectionLatch = null;
> +        public volatile IChildProcess mChild = null;

I don't think mChild has to be volatile.

@@ +47,5 @@
> +            mType = type;
> +        }
> +
> +        void prepareToWait() {
> +            mConnectionLatch = new CountDownLatch(1);

Use Object.wait and Object.notify instead of CountDownLatch.

@@ +55,5 @@
> +            ThreadUtils.assertNotOnUiThread();
> +            try {
> +                mConnectionLatch.await(5, TimeUnit.SECONDS);
> +            } catch (InterruptedException e) {
> +                Log.e(LOGTAG, "ChildConnection interrupted waiting for connection: " + e);

`, e);` instead of ` + e);`

@@ +56,5 @@
> +            try {
> +                mConnectionLatch.await(5, TimeUnit.SECONDS);
> +            } catch (InterruptedException e) {
> +                Log.e(LOGTAG, "ChildConnection interrupted waiting for connection: " + e);
> +                e.printStackTrace();

Don't need this line.

@@ +69,5 @@
> +        @Override
> +        public void onServiceConnected(ComponentName name, IBinder service) {
> +            try {
> +                service.linkToDeath(this, 0);
> +            } catch (RemoteException e) {

final RemoteException e

And log the exception.

@@ +75,5 @@
> +            mChild = IChildProcess.Stub.asInterface(service);
> +            try {
> +                mPid = mChild.getPid();
> +            } catch (final RemoteException e) {
> +                Log.e(LOGTAG, "Failed to get child " + mType + " process PID. Process may have died.");

, e);

@@ +99,5 @@
> +        public void binderDied() {
> +            Log.e(LOGTAG,"Binder died. Attempt to unbind service: " + mType + " " + mPid);
> +            try {
> +                GeckoAppShell.getApplicationContext().unbindService(this);
> +            } catch (java.lang.IllegalArgumentException e) {

final IllegalArgumentException e

@@ +100,5 @@
> +            Log.e(LOGTAG,"Binder died. Attempt to unbind service: " + mType + " " + mPid);
> +            try {
> +                GeckoAppShell.getApplicationContext().unbindService(this);
> +            } catch (java.lang.IllegalArgumentException e) {
> +                Log.e(LOGTAG,"Looks like connection was already unbound");

, e);

@@ +108,5 @@
> +
> +    Map<String, ChildConnection> mConnections;
> +
> +    private GeckoProcessManager() {
> +        mConnections = Collections.synchronizedMap(new HashMap<String, ChildConnection>());

Use SimpleArrayMap/ArrayMap instead of HashMap

@@ +112,5 @@
> +        mConnections = Collections.synchronizedMap(new HashMap<String, ChildConnection>());
> +    }
> +
> +    public int start(String type, String[] args, int crashFd, int ipcFd) {
> +        if (mConnections.containsKey(type)) {

Use .get() and check result != null. Separate containsKey + get is potentially racy

@@ +113,5 @@
> +    }
> +
> +    public int start(String type, String[] args, int crashFd, int ipcFd) {
> +        if (mConnections.containsKey(type)) {
> +            Log.e(LOGTAG, "\nAttempting to start a child process service that is already running. Attempting to kill existing process first");

Log.w

@@ +126,5 @@
> +        }
> +
> +        try {
> +            ChildConnection connection = new ChildConnection(type);
> +            Intent intent = new Intent(GeckoAppShell.getApplicationContext(), Class.forName("org.mozilla.gecko.process.GeckoServiceChildProcess$" + type));

Intent intent = new Intent();
intent.setClassName(GeckoAppShell.getApplicationContext(),
                    GeckoServiceChildProcess.class.getName() + '$' + type);

@@ +131,5 @@
> +            GeckoLoader.addEnvironmentToIntent(intent);
> +            connection.prepareToWait();
> +            GeckoAppShell.getApplicationContext().bindService(intent, connection, Context.BIND_AUTO_CREATE);
> +            connection.waitForChild();
> +            if (connection.mChild != null) {

if (connection.mChild == null) {
    ...
    return 0;
}
...

@@ +135,5 @@
> +            if (connection.mChild != null) {
> +                ParcelFileDescriptor crashPfd = null;
> +                if (crashFd >= 0) {
> +                    crashPfd = ParcelFileDescriptor.fromFd(crashFd);
> +                    connection.mChild.setCrashFd(crashPfd);

I feel like `setCrashFd` can be combined into `start` below

@@ +147,5 @@
> +            } else {
> +                // FAILED TO CONNECT.
> +                Log.e(LOGTAG, "Failed to connect to child process of '" + type + "'");
> +                GeckoAppShell.getApplicationContext().unbindService(connection);
> +                connection = null;

Don't need this line

@@ +150,5 @@
> +                GeckoAppShell.getApplicationContext().unbindService(connection);
> +                connection = null;
> +            }
> +        } catch (final ClassNotFoundException e) {
> +            Log.e(LOGTAG, "Unable to create child process for: '" + type + "'. Error finding class needed to createintent: " + e);

`, e);` instead of ` + e);`; same below

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoServiceChildProcess.java
@@ +29,5 @@
> +    static private void stop() {
> +        ThreadUtils.postToUiThread(new Runnable() {
> +            @Override
> +            public void run() {
> +                System.exit(1);

Process.killProcess(Process.myPid());

Unless we want System.exit to do cleanup for some reason.

@@ +36,5 @@
> +    }
> +
> +    @Override // Service
> +    public void onCreate() {
> +        super.onCreate();

Don't need to override.

@@ +41,5 @@
> +    }
> +
> +    @Override // Service
> +    public void onDestroy() {
> +        super.onDestroy();

Don't need to override.

@@ +57,5 @@
> +        }
> +
> +        @Override
> +        public int getPid() {
> +            return android.os.Process.myPid();

Import android.os.Process.

@@ +72,5 @@
> +                Log.e(LOGTAG, "Attempting to start a service that has already be started.");
> +                return;
> +            }
> +            serviceStarted = true;
> +            final String[] initArgs = args;

Just make args itself final.

@@ +79,5 @@
> +                @Override
> +                public void run() {
> +                    GeckoAppShell.ensureCrashHandling();
> +                    GeckoAppShell.setApplicationContext(getApplicationContext());
> +                    GeckoThread.initChildProcess(null, initArgs, crashFd, ipcFd, false);

if (GeckoThread.initChildProcess(...)) {
    GeckoThread.launch();
}

@@ +94,5 @@
> +    }
> +
> +    @Override // Service
> +    public boolean onUnbind(Intent intent) {
> +        Log.e(LOGTAG, "Service has been unbound. Stopping.");

Is this really an error?

::: mozglue/android/APKOpen.cpp
@@ +392,5 @@
>  
> +static char**
> +CreateArgvFromObjectArray(JNIEnv *jenv, jobjectArray jargs, int* length)
> +{
> +  int stringCount = jenv->GetArrayLength(jargs);

size_t

@@ +398,5 @@
> +  if (length) {
> +    *length = stringCount;
> +  }
> +
> +  if (stringCount <= 0) {

if (!stringCount) {

@@ +402,5 @@
> +  if (stringCount <= 0) {
> +    return nullptr;
> +  }
> +
> +  char** argv = (char**)malloc(sizeof(char*) * (stringCount + 1));

Can we use `new` here?

@@ +408,5 @@
> +  argv[stringCount] = nullptr;
> +
> +  for (int ix = 0; ix < stringCount; ix++) {
> +    jstring string = (jstring) (jenv->GetObjectArrayElement(jargs, ix));
> +    const char* rawString = jenv->GetStringUTFChars(string, 0);

nullptr instead of 0

@@ +409,5 @@
> +
> +  for (int ix = 0; ix < stringCount; ix++) {
> +    jstring string = (jstring) (jenv->GetObjectArrayElement(jargs, ix));
> +    const char* rawString = jenv->GetStringUTFChars(string, 0);
> +    const int strLength = jenv->GetStringUTFLength(string);

const size_t

@@ +410,5 @@
> +  for (int ix = 0; ix < stringCount; ix++) {
> +    jstring string = (jstring) (jenv->GetObjectArrayElement(jargs, ix));
> +    const char* rawString = jenv->GetStringUTFChars(string, 0);
> +    const int strLength = jenv->GetStringUTFLength(string);
> +    char* copy = (char*)malloc(strLength + 1);

new or strndup

@@ +412,5 @@
> +    const char* rawString = jenv->GetStringUTFChars(string, 0);
> +    const int strLength = jenv->GetStringUTFLength(string);
> +    char* copy = (char*)malloc(strLength + 1);
> +    strncpy(copy, rawString, strLength);
> +    copy[strLength] = '\0';

Don't need this line.

@@ +414,5 @@
> +    char* copy = (char*)malloc(strLength + 1);
> +    strncpy(copy, rawString, strLength);
> +    copy[strLength] = '\0';
> +    jenv->ReleaseStringUTFChars(string, rawString);
> +    argv[ix] = copy;

jenv->DeleteLocalRef(string);

@@ +426,5 @@
>  {
> +  for (int ix=0; ix < argc; ix++) {
> +    if (argv[ix]) {
> +      // Some args may have been removed from the list.
> +      free(argv[ix]);

It's okay to call free(nullptr);
Attachment #8806881 - Flags: review?(nchen) → feedback+
(In reply to Randall Barker [:rbarker] from comment #27)
> (In reply to Nick Alexander :nalexander (leave until January 2017) from
> comment #26)
> > Comment on attachment 8806881 [details] [diff] [review]
> > 0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch
> > 
> > Review of attachment 8806881 [details] [diff] [review]:
> > ----------------------------------------------------------------- 
> > @@ +115,5 @@
> > > +    public int start(String type, String[] args, int crashFd, int ipcFd) {
> > > +        if (mConnections.containsKey(type)) {
> > > +            Log.e(LOGTAG, "\nAttempting to start a child process service that is already running. Attempting to kill existing process first");
> > > +            ChildConnection connection = mConnections.get(type);
> > > +            connection.prepareToWait();
> > 
> > Isn't this racy?  You're re-setting the latch value, and waiting for it,
> > potentially concurrently.  How are you guarding this?
> > 
> > Should the latch be specific to a connection _attempt_?  Or, should this
> > just be `synchronized`, so you can't race starters?    Or, is there an
> > external guarantee that start never races against the same `type`?  Or, do
> > you want instead to wake up N readers after `stop`, and have a separate
> > concurrency primitive for start and stop?
> > 
> > Apologies if this is all obvious to the platform Android team -- but it's
> > definitely not clear what your expected semantics are to me.
> > 
> 
> I'm not sure I follow your question. There can only ever be one connection
> per service type.

This "there an external guarantee that start never races against the same `type`".  This is not clear from the code written, so please comment that you only expect one invocation.  I still think you can race if this is called twice, i.e., you might start-stop-start, but that should be okay.

 If we are trying to start a service and there is already a
> connection, it must be a zombie so the code attempts to kill the zombie
> before creating a new service process connection to start. The connection is
> only added to the map after the wait has completed so once it is in the map,
> it is okay to use the latch again to wait to stop.

This is definitely not obvious -- it's only true due to the current consumer -- and so definitely worth commenting on.

 If start is called on
> multiple threads at the same time for the same type, something has gone
> horribly wrong. All child process are currently launched from the IO thread
> I believe.

Then assert the thread that this should be on, or at least comment on the threading.  As it stands, this looks like racy, non-thread-safe code.  (Worse, since it's using a concurrent map, it looks like code that's intended to be thread-safe, but is only thread-safe with significant other assumptions.)
(In reply to Jim Chen (on PTO) [:jchen] [:darchons] from comment #29)
> Comment on attachment 8806881 [details] [diff] [review]
> 0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch
> 
> Review of attachment 8806881 [details] [diff] [review]:
> -----------------------------------------------------------------
>  
> :::
> mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/
> GeckoProcessManager.java  
> @@ +40,5 @@
> > +
> > +    private class ChildConnection implements ServiceConnection, IBinder.DeathRecipient {
> > +        public final String mType;
> > +        private volatile CountDownLatch mConnectionLatch = null;
> > +        public volatile IChildProcess mChild = null;
> 
> I don't think mChild has to be volatile.
>

mChild is accessed from both the UI thread in onServiceConnected and the IO thread in start. Does it not need to be volatile if it is accessed in more than one thread?
 
> @@ +47,5 @@
> > +            mType = type;
> > +        }
> > +
> > +        void prepareToWait() {
> > +            mConnectionLatch = new CountDownLatch(1);
> 
> Use Object.wait and Object.notify instead of CountDownLatch.
> 

What happens if the Object.notify is called before Object.wait. Is it queued? The Object.notify can happen at *any* point after bindService I think so it seems that it can bee called before Object.wait is called.
  
> @@ +108,5 @@
> > +
> > +    Map<String, ChildConnection> mConnections;
> > +
> > +    private GeckoProcessManager() {
> > +        mConnections = Collections.synchronizedMap(new HashMap<String, ChildConnection>());
> 
> Use SimpleArrayMap/ArrayMap instead of HashMap
>

SimpleArrayMap/ArrayMap don't seem to be compatible with Collections.synchronizedMap. Is there something else I should use to make them thread safe?

> @@ +135,5 @@
> > +            if (connection.mChild != null) {
> > +                ParcelFileDescriptor crashPfd = null;
> > +                if (crashFd >= 0) {
> > +                    crashPfd = ParcelFileDescriptor.fromFd(crashFd);
> > +                    connection.mChild.setCrashFd(crashPfd);
> 
> I feel like `setCrashFd` can be combined into `start` below
> 

The crash reporter fd ParcelFileDescriptor will be null sometimes. It seems you should not pass null values to the binder? Is this not the case? The documentation was vague so I erred on the side of caution.
   
> @@ +72,5 @@
> > +                Log.e(LOGTAG, "Attempting to start a service that has already be started.");
> > +                return;
> > +            }
> > +            serviceStarted = true;
> > +            final String[] initArgs = args;
> 
> Just make args itself final.
> 

The base class IChildProcess.Stub is generated from an aidl. I could not find any way to make parameters final in aidl. Does that matter?
(In reply to Randall Barker [:rbarker] from comment #31)
> (In reply to Jim Chen (on PTO) [:jchen] [:darchons] from comment #29)
> > Comment on attachment 8806881 [details] [diff] [review]
> > 0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch
> > 
> > Review of attachment 8806881 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >  
> > :::
> > mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/
> > GeckoProcessManager.java  
> > @@ +40,5 @@
> > > +
> > > +    private class ChildConnection implements ServiceConnection, IBinder.DeathRecipient {
> > > +        public final String mType;
> > > +        private volatile CountDownLatch mConnectionLatch = null;
> > > +        public volatile IChildProcess mChild = null;
> > 
> > I don't think mChild has to be volatile.
> >
> 
> mChild is accessed from both the UI thread in onServiceConnected and the IO
> thread in start. Does it not need to be volatile if it is accessed in more
> than one thread?

While we haven't made all variable that are shared between threads volatile, it does appear to be the right thing to do IMHO.

>  
> > @@ +47,5 @@
> > > +            mType = type;
> > > +        }
> > > +
> > > +        void prepareToWait() {
> > > +            mConnectionLatch = new CountDownLatch(1);
> > 
> > Use Object.wait and Object.notify instead of CountDownLatch.
> > 
> 
> What happens if the Object.notify is called before Object.wait. Is it
> queued? The Object.notify can happen at *any* point after bindService I
> think so it seems that it can bee called before Object.wait is called.
>

Object.notify() is not queued. If the situation you describe is possible, the latch is the best solution.
   
> > @@ +108,5 @@
> > > +
> > > +    Map<String, ChildConnection> mConnections;
> > > +
> > > +    private GeckoProcessManager() {
> > > +        mConnections = Collections.synchronizedMap(new HashMap<String, ChildConnection>());
> > 
> > Use SimpleArrayMap/ArrayMap instead of HashMap
> >
> 
> SimpleArrayMap/ArrayMap don't seem to be compatible with
> Collections.synchronizedMap. Is there something else I should use to make
> them thread safe?
> 

It looks like it should be? Collections.synchronizedMap() takes a Map<K, V>, which ArrayMap inherits from.
Comment on attachment 8806880 [details] [diff] [review]
0004-Bug-1314466-part-4-update-GeckoChildProcessHost-to-call-LaunchAndroidService-16110213-9825750.patch

Review of attachment 8806880 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +1222,5 @@
> +  JNIEnv* env = mozilla::jni::GetEnvForThread();
> +  MOZ_ASSERT(env);
> +
> +  int argvSize = argv.size();
> +  jobjectArray jargs = env->NewObjectArray(argvSize, env->FindClass("java/lang/String"), nullptr);

jargs is leaked at the end of the function; use jni::ObjectArray::LocalRef. Bug 1307820 comment 10 also adds a jni::ObjectArray::New that you can use instead of NewObjectArray.

@@ +1225,5 @@
> +  int argvSize = argv.size();
> +  jobjectArray jargs = env->NewObjectArray(argvSize, env->FindClass("java/lang/String"), nullptr);
> +  for (int ix = 0; ix < argvSize; ix++) {
> +    jstring string =  AndroidBridge::NewJavaString(env, argv[ix].c_str());
> +    env->SetObjectArrayElement(jargs, ix, string);

string is leaked at the end of the loop. Use jni::StringParam(argv[ix].c_str(), env) and pass it to jni::ObjectArray::SetElement.

@@ +1233,5 @@
> +  it++;
> +  // If the Crash Reporter is disabled, there will not be a second file descriptor.
> +  int32_t crashFd = (it != fds_to_remap.end()) ? it->first : -1;
> +  int32_t handle = java::GeckoAppShell::StartGeckoServiceChildProcess(
> +    mozilla::jni::String::Ref::From(AndroidBridge::NewJavaString(env, type)),

This new string is leaked too. Just pass `type` in directly.
Attachment #8806880 - Flags: feedback+
(In reply to Randall Barker [:rbarker] from comment #31)
> (In reply to Jim Chen (on PTO) [:jchen] [:darchons] from comment #29)
> > Comment on attachment 8806881 [details] [diff] [review]
> > 0005-Bug-1314466-part-5-Add-service-process-manager-16110213-ae2460b.patch
> > 
> > Review of attachment 8806881 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >  
> > > +        private volatile CountDownLatch mConnectionLatch = null;
> > > +        public volatile IChildProcess mChild = null;
> > 
> > I don't think mChild has to be volatile.
> >
> 
> mChild is accessed from both the UI thread in onServiceConnected and the IO
> thread in start. Does it not need to be volatile if it is accessed in more
> than one thread?

In this case, mConnectionLatch, which is volatile, acts as a "fence" for mChild so changes to mChild are visible on other threads whenever mConnectionLatch is changed afterwards. So mChild itself doesn't have to be volatile. If you use Object.wait and Object.notify, the synchronized block acts as the fence and accomplishes the same thing.

> > > +        void prepareToWait() {
> > > +            mConnectionLatch = new CountDownLatch(1);
> > 
> > Use Object.wait and Object.notify instead of CountDownLatch.
> > 
> 
> What happens if the Object.notify is called before Object.wait. Is it
> queued? The Object.notify can happen at *any* point after bindService I
> think so it seems that it can bee called before Object.wait is called.

It's not queued, but you're supposed to use Object.wait with a flag, so the flag should already be set. e.g.,

synchronized (this) {
    flag = true;
    this.notifyAll();
}

... and...

synchronized (this) {
    while (!flag) {
        this.wait();
    }
}

> > 
> > Use SimpleArrayMap/ArrayMap instead of HashMap
> >
> 
> SimpleArrayMap/ArrayMap don't seem to be compatible with
> Collections.synchronizedMap. Is there something else I should use to make
> them thread safe?

I would just use synchronized blocks tbh. synchronizedMap doesn't get you much benefit.

> > 
> > I feel like `setCrashFd` can be combined into `start` below
> > 
> 
> The crash reporter fd ParcelFileDescriptor will be null sometimes. It seems
> you should not pass null values to the binder? Is this not the case? The
> documentation was vague so I erred on the side of caution.

It seems weird to not allow nulls, but not big deal.

> > 
> > Just make args itself final.
> > 
> 
> The base class IChildProcess.Stub is generated from an aidl. I could not
> find any way to make parameters final in aidl. Does that matter?

You can make it final in your declaration, .e.g

@Override
public void start(final String[] args, final ParcelFileDescriptor ipcPfd) {
    ...
}
Comment on attachment 8807351 [details] [diff] [review]
0001-Bug-1314466-part-1-Allow-android-to-set-client-channel-fd-for-ipc-r-billm-16110315-2031dc4.patch

Review of attachment 8807351 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.

::: ipc/chromium/src/chrome/common/ipc_channel.h
@@ +147,5 @@
>    // For portability the prefix should not include the \ character.
>    static std::wstring GenerateVerifiedChannelID(const std::wstring& prefix);
>  
> +#if defined(MOZ_WIDGET_ANDROID)
> +  // Used to set the IPC file descriptor in the child process on Android. See

s/IPC file descriptor/first IPC file descriptor/.
Attachment #8807351 - Flags: review?(wmccloskey) → review+
Fixed comment typo. Carry forward r+ from :billm
Attachment #8807351 - Attachment is obsolete: true
Comment on attachment 8808377 [details] [diff] [review]
0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-crash-reporter-16110713-7f2a4de.patch

Updated patch to match the requested changes made in ipc/chromium/src/chrome/common/ipc_channel_posix.cc in part 1.
Attachment #8808377 - Flags: review?(ted)
Addressed review comments from :nchen. Carry forward r+ from :billm
Attachment #8806880 - Attachment is obsolete: true
Updated to address review comments. Passing review to :snorp while :nchen is on PTO.
Attachment #8806881 - Attachment is obsolete: true
Attachment #8808380 - Flags: review?(snorp)
Addressed review comments. Carry forward r+ from :nalexander
Attachment #8806882 - Attachment is obsolete: true
Rebased patch. Carry forward r+ from :nchen
Attachment #8806883 - Attachment is obsolete: true
Comment on attachment 8808380 [details] [diff] [review]
0005-Bug-1314466-part-5-Add-service-process-manager-16110713-b7d895e.patch

Review of attachment 8808380 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok with nits

::: mobile/android/geckoview/src/main/aidl/org/mozilla/gecko/process/IChildProcess.aidl
@@ +7,5 @@
> +
> +interface IChildProcess {
> +    void stop();
> +    int getPid();
> +    void setCrashFd(in ParcelFileDescriptor pfd);

maybe setCrashReporterFd?

@@ +8,5 @@
> +interface IChildProcess {
> +    void stop();
> +    int getPid();
> +    void setCrashFd(in ParcelFileDescriptor pfd);
> +    void start(in String[] args, in ParcelFileDescriptor pfd);

Why can't the crash fd go here too?

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoServiceChildProcess.java
@@ +24,5 @@
> +
> +    static private String LOGTAG = "GeckoServiceChildProcess";
> +
> +    private int crashFd = -1;
> +    private boolean serviceStarted = false;

'false' is the default initializer, no need to specify that here

@@ +26,5 @@
> +
> +    private int crashFd = -1;
> +    private boolean serviceStarted = false;
> +
> +    static private void stop() {

I think you should probably just call System.exit() here. Also, why do you need to call it from the UI thread?

@@ +97,5 @@
> +    }
> +
> +    @JNITarget
> +    static public final class geckomediaplugin extends GeckoServiceChildProcess {
> +        public geckomediaplugin() {};

You shouldn't need this, as a default no-arg ctor should already be generated
Attachment #8808380 - Flags: review?(snorp) → review+
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #44)
> Comment on attachment 8808380 [details] [diff] [review]
> 0005-Bug-1314466-part-5-Add-service-process-manager-16110713-b7d895e.patch
> 
> Review of attachment 8808380 [details] [diff] [review]:
> ----------------------------------------------------------------- 
> @@ +8,5 @@
> > +interface IChildProcess {
> > +    void stop();
> > +    int getPid();
> > +    void setCrashFd(in ParcelFileDescriptor pfd);
> > +    void start(in String[] args, in ParcelFileDescriptor pfd);
> 
> Why can't the crash fd go here too?
> 

The docs were vague over sending null objects over binder. Erred on the side of caution. 

> @@ +26,5 @@
> > +
> > +    private int crashFd = -1;
> > +    private boolean serviceStarted = false;
> > +
> > +    static private void stop() {
> 
> I think you should probably just call System.exit() here. Also, why do you
> need to call it from the UI thread?
> 

I had System.exit(), but :nchen asked me to change it to kill (somewhere in Comment 29). If I don't dispatch the kill on the UI thread, the stop call with throw an exception in GeckoProcessManager when invoked because the process disappears before the child process stop invocation has returned so we get a RemoteException.
(In reply to Randall Barker [:rbarker] from comment #45)
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #44)
> > Comment on attachment 8808380 [details] [diff] [review]
> > 0005-Bug-1314466-part-5-Add-service-process-manager-16110713-b7d895e.patch
> > 
> > Review of attachment 8808380 [details] [diff] [review]:
> > ----------------------------------------------------------------- 
> > @@ +8,5 @@
> > > +interface IChildProcess {
> > > +    void stop();
> > > +    int getPid();
> > > +    void setCrashFd(in ParcelFileDescriptor pfd);
> > > +    void start(in String[] args, in ParcelFileDescriptor pfd);
> > 
> > Why can't the crash fd go here too?
> > 
> 
> The docs were vague over sending null objects over binder. Erred on the side
> of caution. 
> 

Looking at the generated code, while other types can not be null, ParcelFileDescriptor can so I'll combine into one function.
Comment on attachment 8808377 [details] [diff] [review]
0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-crash-reporter-16110713-7f2a4de.patch

:ted appears to be too busy to look at this. Changing reviewers.
Attachment #8808377 - Flags: review?(ted) → review?(nfroyd)
Comment on attachment 8808377 [details] [diff] [review]
0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-crash-reporter-16110713-7f2a4de.patch

Review of attachment 8808377 [details] [diff] [review]:
-----------------------------------------------------------------

r=me; I prefer a little more paranoia as suggested below, but if you can tell me why my paranoia is groundless, I will happily withdraw my comments.

::: toolkit/crashreporter/nsExceptionHandler.cpp
@@ +3727,5 @@
>  
>    MOZ_ASSERT(OOPInitialized());
>  
>    *childCrashFd = clientSocketFd;
> +  *childCrashRemapFd = gMagicChildCrashReportFd;

How do we know this is always going to get called after the notification pipe fd has been set on Android?  Could we do something like:

static int gMagicChildCrashReportFd =
#if ANDROID
  -1
#else
  4
#endif
  ;

and then assert wherever we access gMagicChildCrashReportFd that it's != -1?

@@ +3747,5 @@
>                       ChildFilter,
>                       nullptr,    // no minidump callback
>                       nullptr,    // no callback context
>                       true,       // install signal handlers
> +                     gMagicChildCrashReportFd);

Same question here.
Attachment #8808377 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #48)
> Comment on attachment 8808377 [details] [diff] [review]
> 0002-Bug-1314466-part-2-Allow-android-to-set-channel-file-descriptor-for-
> crash-reporter-16110713-7f2a4de.patch
> 
> Review of attachment 8808377 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me; I prefer a little more paranoia as suggested below, but if you can
> tell me why my paranoia is groundless, I will happily withdraw my comments.
> 
> ::: toolkit/crashreporter/nsExceptionHandler.cpp
> @@ +3727,5 @@
> >  
> >    MOZ_ASSERT(OOPInitialized());
> >  
> >    *childCrashFd = clientSocketFd;
> > +  *childCrashRemapFd = gMagicChildCrashReportFd;
> 
> How do we know this is always going to get called after the notification
> pipe fd has been set on Android?  Could we do something like:
> 
> static int gMagicChildCrashReportFd =
> #if ANDROID
>   -1
> #else
>   4
> #endif
>   ;
> 

This is a good idea. I had originally left the default value when I was switching back and forth between forking and service based children for debugging but now that all the code to support forking has been remove for android there is no need to keep the magic number for android.

> and then assert wherever we access gMagicChildCrashReportFd that it's != -1?
> 
> @@ +3747,5 @@
> >                       ChildFilter,
> >                       nullptr,    // no minidump callback
> >                       nullptr,    // no callback context
> >                       true,       // install signal handlers
> > +                     gMagicChildCrashReportFd);
> 
> Same question here.

I can't guarantee that it will be set, but the way the code is currently written, the handles for both crash reporter and ipc are passed to the service before gecko is started.

The code is here:https://bugzilla.mozilla.org/attachment.cgi?id=8808380&action=diff#a/mozglue/android/APKOpen.cpp_sec2

Starting at line 435.

The basic premise is when a child service launch is requested, the two pipe handles are passed over the binder as part of the start call. They are set before XRE_InitChildProcess is invoked.

I don't know that this will help your paranoia but I will change the default value to -1 for android.
Pushed by rbarker@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b59d2d54fc63
part 1, Allow android to set client channel fd for ipc r=billm
https://hg.mozilla.org/integration/mozilla-inbound/rev/03696808ec58
part 2, Allow android to set channel file descriptor for crash reporter r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/d6e36e11cfa3
part 3, Fennec child processes are no longer forked so kill and waitpid do not work as expected and must gracefully handle error cases. r=billm
https://hg.mozilla.org/integration/mozilla-inbound/rev/524f33fa66c6
part 4, update GeckoChildProcessHost to call LaunchAndroidService r=billm
https://hg.mozilla.org/integration/mozilla-inbound/rev/a8198e0db7b5
part 5, Add service process manager r=snorp
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ac9b7e9b8a2
part 6, Add new files to build r=nalexander
https://hg.mozilla.org/integration/mozilla-inbound/rev/33d7c7b42d7a
part 7, Update JNI Wrappers r=nchen
Depends on: 1318245
Product: Firefox for Android → GeckoView
Target Milestone: Firefox 53 → mozilla53
You need to log in before you can comment on or make changes to this bug.