Last Comment Bug 683245 - navigator.buildID throws in content processes
: navigator.buildID throws in content processes
Status: RESOLVED FIXED
[good first bug] [mentor=jdm]
: mobile
Product: Core
Classification: Components
Component: DOM (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla10
Assigned To: Jiten [:deLta30]
:
: Andrew Overholt [:overholt]
Mentors:
: 660457 (view as bug list)
Depends on:
Blocks: 682416
  Show dependency treegraph
 
Reported: 2011-08-30 11:54 PDT by Josh Matthews [:jdm] (on vacation until Dec 5)
Modified: 2012-03-02 15:16 PST (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Added sync message between ContentParent and ContentChild to send appInfo (7.37 KB, patch)
2011-09-15 14:16 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_2 (9.54 KB, patch)
2011-09-17 12:35 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_3 (8.71 KB, patch)
2011-09-18 22:12 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_4 (9.00 KB, patch)
2011-09-19 01:01 PDT, Jiten [:deLta30]
benjamin: review-
josh: feedback+
Details | Diff | Splinter Review
patch_5 (7.31 KB, patch)
2011-09-30 23:27 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_5 (7.94 KB, patch)
2011-10-01 12:29 PDT, Jiten [:deLta30]
benjamin: review+
Details | Diff | Splinter Review
patch_6 (7.86 KB, patch)
2011-10-03 12:36 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_6 (8.13 KB, patch)
2011-10-03 13:08 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review
patch_7 (8.25 KB, patch)
2011-10-04 16:31 PDT, Jiten [:deLta30]
no flags Details | Diff | Splinter Review

Description Josh Matthews [:jdm] (on vacation until Dec 5) 2011-08-30 11:54:30 PDT
The app info service is null in content processes, so nsNavigator::GetBuildID throws when executing relevant mochitests.
Comment 1 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-01 08:49:49 PDT
*** Bug 660457 has been marked as a duplicate of this bug. ***
Comment 2 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-01 08:51:52 PDT
Benjamin, I recall you pushing back last time I tried to get gAppData set in content processes. Any thoughts on how to proceed here?
Comment 3 Benjamin Smedberg [:bsmedberg] 2011-09-01 08:54:35 PDT
forward it over during initialization?
Comment 4 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-01 09:07:31 PDT
This sounds like a pretty small introductory electrolysis bug for someone to take on.
Comment 5 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-01 12:13:51 PDT
For reference, the failing function is http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#11085, which tries to call http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsAppRunner.cpp#666 but doesn't actually reach that line because of http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsAppRunner.cpp#610 (gAppData is null is content processes). I'm considering the best way to deal with this.
Comment 6 Christopher Blizzard (:blizzard) 2011-09-01 14:37:56 PDT
Not tracking for Update 8 or Update 9.
Comment 7 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-13 06:02:22 PDT
Benjamin, were you suggesting forwarding all of the app data and setting gAppData in the content process from that, or just finding somewhere to store it and using that data in favour of gAppData?
Comment 8 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-13 06:15:48 PDT
Hmm, given how a bunch of places use gAppData checks and then grab things like the profile directory, I think we should store the vendor, name, version, buildID, ID, and
copyright fields (http://mxr.mozilla.org/mozilla-central/source/xpcom/build/nsXULAppAPI.h#57) in a shadow structure in the content process and return those if they exist in favour of gAppData. We'll need to modifiy the check at http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsAppRunner.cpp#610 to work with the shadow structure as well.
Comment 9 Jiten [:deLta30] 2011-09-15 14:16:37 PDT
Created attachment 560450 [details] [diff] [review]
Added sync message  between ContentParent and ContentChild to send appInfo

I have added a structure in ContentParent which gets data from gAppData of nsAppRunner and sends requested data to ContentChild.Structure contains vendor,name,buildID,ID,copyright,version.
Comment 10 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-15 20:48:12 PDT
Comment on attachment 560450 [details] [diff] [review]
Added sync message  between ContentParent and ContentChild to send appInfo

This is a good first start, but it's a bit backwards. We want to use as few sync messages as possible, since they noticeably lag web content, so we basically do this in reverse instead.

>     chromeRegistry->SendRegisteredChrome(this);
>     mMessageManager = nsFrameMessageManager::NewProcessMessageManager(this);
>+    if(!gAppData){
>+        appData->vendor = NS_strdup(gAppData->vendor);
>+        appData->name = NS_strdup(gAppData->name);
>+        appData->version = NS_strdup(gAppData->version);
>+        appData->buildID = NS_strdup(gAppData->buildID);
>+        appData->ID = NS_strdup(gAppData->ID);
>+        appData->copyright = NS_strdup(gAppData->copyright);        
>+    }

What we should do is send a message to the child containing all of these strings. Store the AppData object in the ContentChild, and add the following to each relevant nsAppRunner method:

>if (XRE_GetProcessType() == GeckoProcessType_Content) {
>  mozilla::dom::ContentChild* cc = mozilla::dom::ContentChild::GetSingleton();
>  ... // get the info from cc
>  return NS_OK;
>}

Does that make sense?
Comment 11 Jiten [:deLta30] 2011-09-17 12:35:11 PDT
Created attachment 560723 [details] [diff] [review]
patch_2
Comment 12 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-17 16:49:19 PDT
Comment on attachment 560723 [details] [diff] [review]
patch_2

This is nearly there; great work! I'm sorry I wasn't clearer before - we can send one message with six string parameters instead of six different messages. You also need to be sure that you're testing this patch, because as written it still won't work. To do so, you need to build Fennec (just add ac_add_option --enable-app=mobile to your mozconfig) and run the testcase in bug 660457. The background will be green if this patch works, and red if it doesn't.

>diff --git a/dom/ipc/ContentChild.h b/dom/ipc/ContentChild.h

>+bool
>+ContentChild::RecvAppVendor(const nsCString& vendor)
>+{
>+    appInfo->vendor = vendor.get();

This is not safe, as the pointer that get() returns will be destroyed once the argument goes out of scope in the caller. If we make all of the structure members be nsCString variables instead of char*, this problem goes away.

>+    struct ReceivedAppInfo

AppInfo is a better name.

>+        const char *vendor; 
>+        const char *name;
>+        const char *version;
>+        const char *buildID;
>+        const char *ID;
>+        const char *copyright;

As I mentioned, nsCString, please.

>+    ReceivedAppInfo* getAppInfo() {
>+        return appInfo;
>+    }

Let's call this AppInfo, as the get prefix historically means "this function could return null".

>+    ReceivedAppInfo* appInfo;

Make this a non-pointer, and call it mAppInfo to match the existing style. Also, move this above sSingleton so it doesn't get lost.

>diff --git a/toolkit/xre/nsAppRunner.cpp b/toolkit/xre/nsAppRunner.cpp

>+mozilla::dom::ContentChild* cc = mozilla::dom::ContentChild::GetSingleton();

This doesn't look safe to me. Just add a local cc to every function that uses it.

With these changes, this patch should be ready for bsmedberg's review!
Comment 13 Jiten [:deLta30] 2011-09-18 01:22:03 PDT
What do you say about making mAppInfo static?
Comment 14 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-18 05:09:41 PDT
There's no need for it to be static. Also, I forgot an important change that is required: http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsAppRunner.cpp#610 means that the nsAppRunner::GetX functions won't even be called right now - for that to be happen, the condition there (gAppData != NULL) must be true. You'll need to modify that condition so it's true in the content process. I think || XRE_GetProcessType() == Gecko_ProcessTypeContent should do it.
Comment 15 Jiten [:deLta30] 2011-09-18 05:49:24 PDT
Why do we need that condition? I am using gAppData itself to get information in ContentParent.cpp. So if it is necessery  then the operation in condition should be '&&' rather then '||'.
Comment 16 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-18 09:32:07 PDT
Web content that checks navigator.buildID runs in the content process, where gAppData is null.
Comment 17 Jiten [:deLta30] 2011-09-18 22:12:20 PDT
Created attachment 560849 [details] [diff] [review]
patch_3

I have checked this using test case and it is working.
Comment 18 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-18 23:34:56 PDT
Comment on attachment 560849 [details] [diff] [review]
patch_3

Great! Just a couple style nitpicks that I'm sorry I didn't point out before:

>diff --git a/dom/ipc/ContentChild.h b/dom/ipc/ContentChild.h

>+    AppInfo appInfo() {

First off, let's make this return |const AppInfo&|. Also, change it back to GetAppInfo, we want to stay with the surrounding style of MemberName, but AppInfo is a type name so we can't use it.

>diff --git a/dom/ipc/ContentParent.cpp b/dom/ipc/ContentParent.cpp

>+    if(gAppData){  

Add a newline above this, and use the |if (condition) {| spacing style, please.

>diff --git a/dom/ipc/PContent.ipdl b/dom/ipc/PContent.ipdl

>+    AppInfo(nsCString vendor, nsCString name, nsCString version, nsCString buildID, nsCString ID, nsCString copyright);

Put buildID, ID and copyright on a new line, please. We try to keep lines to about 80 characters or less.

>diff --git a/toolkit/xre/nsAppRunner.cpp b/toolkit/xre/nsAppRunner.cpp

>+  NS_INTERFACE_MAP_ENTRY_CONDITIONAL(nsIXULAppInfo, gAppData || XRE_GetProcessType() == GeckoProcessType_Content)

Same thing here. Keep the || on the current line, and move the new condition onto a new one.

>+  if(XRE_GetProcessType() == GeckoProcessType_Content){
>+      mozilla::dom::ContentChild* cc = mozilla::dom::ContentChild::GetSingleton();
...
>+      
>+      return NS_OK;
>+  }

Same if spacing style as above, and remove the blank line before the return. Also, let's add a |using mozilla:dom::ContentChild;| at the top of the file after the #includes, and get rid of the mozilla::dom:: prefixes here.

Thanks so much for doing this! This is a great patch!
Comment 19 Jiten [:deLta30] 2011-09-19 01:01:10 PDT
Created attachment 560860 [details] [diff] [review]
patch_4
Comment 20 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-09-19 04:22:40 PDT
Comment on attachment 560860 [details] [diff] [review]
patch_4

Time for Benjamin to look at this. Nice work!
Comment 21 Benjamin Smedberg [:bsmedberg] 2011-09-30 12:59:04 PDT
Comment on attachment 560860 [details] [diff] [review]
patch_4

General comments: we should only provide the data we absolutely have to provide: AFAICT that is version and buildid. We should simply throw or assert if somebody asks for the app vendor/name/ID/copyright info in the content process. The less information we can expose, the better.

Also there are some snuggly ifs: always put a space after if before the paren, according to our standard code style.

In general this looks ok, although I kinda wish we didn't have to do it at all, but I guess getting rid of navigator.buildID is a task for another day.
Comment 22 Jiten [:deLta30] 2011-09-30 23:27:22 PDT
Created attachment 563941 [details] [diff] [review]
patch_5

Removed unnecessary information provided to content process.Only BuildID and Version are passed to content process.Any request for information other then that will throw.
Comment 23 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-10-01 11:27:03 PDT
You're going to need to add checks to the other nsIXULAppInfo getter functions that use gAppData and return NS_ERROR_NOT_AVAILABLE if they're called in the content process. Right now, checking navigator.vendor will cause a crash.
Comment 24 Jiten [:deLta30] 2011-10-01 12:29:31 PDT
Created attachment 563982 [details] [diff] [review]
patch_5
Comment 25 Benjamin Smedberg [:bsmedberg] 2011-10-03 12:17:04 PDT
Comment on attachment 563982 [details] [diff] [review]
patch_5

+    
+    if(gAppData) {   

* nit: remove all whitespace at end of lines throughout this patch
* nit: if-paren still needs un-snuggling

In all of the NS_ERROR_NOT_AVAILABLE cases, can you add either a NS_ERROR or a NS_WARNING?

r=me with those changes
Comment 26 Jiten [:deLta30] 2011-10-03 12:36:59 PDT
Created attachment 564295 [details] [diff] [review]
patch_6
Comment 27 Jiten [:deLta30] 2011-10-03 13:08:30 PDT
Created attachment 564304 [details] [diff] [review]
patch_6
Comment 28 Phil Ringnalda (:philor) 2011-10-03 15:19:56 PDT
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/25b8388347af - too early to tell if it will be just debug, but so far four debug xpcshell runs timed out in test_content_annotation.js, like https://tbpl.mozilla.org/php/getParsedLog.php?id=6659289&tree=Mozilla-Inbound
Comment 29 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-10-03 20:27:00 PDT
Jiten, are you able to debug this failure? Try running |SOLO_FILE=test_content_annotation.js make -C objdir/toolkit/crashreporter/test/ check-interactive|, type |_execute_test()| at the prompt, and you should see the test hang with this patch applied. You can use gdb to attach to the parent xpcshell process before starting the test, or if you run with MOZ_DEBUG_CHILD_PROCESS=1, you can attach gdb to the child process (plugin-container) after the test starts. See https://developer.mozilla.org/en/Writing_xpcshell-based_unit_tests#Running_unit_tests_under_a_C.2B.2B_debugger for more complete documentation on using the xpcshell test harness.
Comment 30 Jiten [:deLta30] 2011-10-04 03:55:10 PDT
I think "NS_ERROR("Attempt to get unavailable information in content process.");" is causing this. I don't know why and I am not sure that even this is the problem.

When I ran command |SOLO_FILE=test_content_annotation.js make -C objdir/toolkit/crashreporter/test/ check-interactive| I got this,
"###!!! ASSERTION: Attempt to get unavailable information in content process.: 'Error', file /home/jiten/Mozilla/mozilla-central/toolkit/xre/nsAppRunner.cpp, line 656

Program /home/jiten/Mozilla/mozilla-central/obj-i686-pc-linux-gnu/dist/bin/plugin-container (pid = 27625) received signal 11."

This shows that signal 11(SIGSEGV) was sent to plugin-container(child) process. So I rechecked it with NS_WARNING instead of NS_ERROR and it didn't hang and passed testing of test_content_annotation.js.
Comment 31 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-10-04 05:49:56 PDT
Yes, assertions trigger failures in xpcshell tests, so let's change to using NS_WARNING instead. However, before you do, could you run the test following the e10s instructions from the docs and attach gdb to the content process, and set a breakpoint in the content process at nsAppRunner.cpp:656? I'm really interested in who's calling those functions.
Comment 32 Jiten [:deLta30] 2011-10-04 16:31:28 PDT
Created attachment 564701 [details] [diff] [review]
patch_7

Josh, I tried to find out who is calling. It jumped to nsXULAppInfo::GetID() method after http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsEmbedFunctions.cpp#510.
Comment 33 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-10-05 08:25:50 PDT
Ok, this is being called as part of InitXPCOM while parsing manifests. Since we're still not returning any data for most of the fields of nsIXULAppInfo, this change shouldn't make a difference to the process, so it should be good to land with NS_WARNING.
Comment 34 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-10-05 10:49:24 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/9687f4b3cefb
Comment 35 Ed Morley [:emorley] 2011-10-06 03:50:35 PDT
https://hg.mozilla.org/mozilla-central/rev/9687f4b3cefb

Thanks for the patch :-)
Comment 36 Jason Duell [:jduell] (needinfo me) 2012-03-02 15:16:57 PST
This is causing a lot of NS_WARNING spew in e10s--about 30-40 lines of the same warning (from parsing manifests) every time you run an e10s xpcshell test.  I'm not a fan of just leaving warnings in--can we either throw in an #if child to avoid the call, or provide the info it wants?

Also nsHttpHandler::Init in the child barfs an warning trying to get the app->Name.  The only thing it uses the value for is to generate the User-Agent string (which is currently never used in the child).  I can fix this by shoehorning in a check for isChild before getting the appname, and causing nsHttpHandler::BuildUserAgent to fail on the child.  But really, wouldn't it be simpler to just pass the appname to the child?  So far I've needed to change literally about 3 lines of nsHttpHandler to make it work with e10s.  Is there some reason it's a Bad Thing to give the appname to the child?

Note You need to log in before you can comment on or make changes to this bug.