Last Comment Bug 834672 - AppProtocolHandler.js should never throw
: AppProtocolHandler.js should never throw
Status: RESOLVED FIXED
[qa-]
: dataloss
Product: Core
Classification: Components
Component: DOM: Apps (show other bugs)
: Trunk
: ARM Gonk (Firefox OS)
: -- normal (vote)
: mozilla22
Assigned To: [:fabrice] Fabrice Desré
:
Mentors:
: 850382 (view as bug list)
Depends on:
Blocks: b2g-app-updates
  Show dependency treegraph
 
Reported: 2013-01-25 05:37 PST by Julien Wajsberg [:julienw]
Modified: 2013-06-06 08:02 PDT (History)
10 users (show)
jsmith: in‑moztrap-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
leo+
wontfix
wontfix
fixed
fixed
wontfix
wontfix


Attachments
patch (3.67 KB, patch)
2013-02-21 22:43 PST, [:fabrice] Fabrice Desré
no flags Details | Diff | Splinter Review
"trigger the error" patch (705 bytes, patch)
2013-02-25 07:25 PST, Julien Wajsberg [:julienw]
no flags Details | Diff | Splinter Review
patch v2 (5.77 KB, patch)
2013-02-25 17:59 PST, [:fabrice] Fabrice Desré
no flags Details | Diff | Splinter Review
patch v3 (5.28 KB, patch)
2013-02-25 18:28 PST, [:fabrice] Fabrice Desré
jduell.mcbugs: review+
Details | Diff | Splinter Review

Description Julien Wajsberg [:julienw] 2013-01-25 05:37:18 PST
In some situations, most notably because Homescreen bugs (yeah, our coders are great but still... :) ), we get to do XHR to app:// URL that have an invalid ID.

This leads to the following logs :

I/Gecko   (  108): -*-*- Webapps.jsm : No webapp for {7120fb68-4942-44f2-a02f-84a6d32efcab}
E/GeckoConsole(  413): [JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 69}]
E/GeckoConsole(  413): [JavaScript Error: "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: '[JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 69}]' when calling method: [nsIProtocolHandler::newChannel]" {file: "app://homescreen.gaiamobile.org/js/page.js" line: 158}]
E/GeckoConsole(  413): [JavaScript Error: "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: '[JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 69}]' when calling method: [nsIProtocolHandler::newChannel]" {file: "app://homescreen.gaiamobile.org/js/page.js" line: 158}]

As we can see, AppProtocolHandler tries to get the app informations for a ID that doesn't exist anymore. And then it throws in the calling code.

If that's possible, I'd return a 404 error or something like that.
Or maybe use a try/catch block around the whole block and throw a 50x error if we get an exception. Or maybe both.

But I really don't know how that stuff works so..

The implications for the user code is that if this throws, it means that the phone in a bad state and we may not showing the icons any more, or not showing some pages in the Homescreen, without being able to recover without a full reset.

That's why I'd like to get this fixed, even if we obviously need to fix the other bugs in Homescreen too.

I really would like this to go to shira because I think it would be a small patch with a low risk with big reward. Of course Fabrice could disagree :)
Comment 1 Jason Smith [:jsmith] 2013-01-25 08:41:10 PST
I don't think we can block on this until we know what the implications are. A bad logcat bug with no understanding of the user impact will not block.
Comment 2 Julien Wajsberg [:julienw] 2013-01-25 09:38:02 PST
Actually I see this as an easy last guard against yet unfound bugs and so that is the reason I think it should block.
Comment 3 Joe Cheng [:jcheng] 2013-01-31 01:41:33 PST
Triage: shira is to be as close to tef as possible. before understanding a confirmed STR and user impact, move to leo? for consideration
Comment 4 [:fabrice] Fabrice Desré 2013-02-21 22:43:26 PST
Created attachment 716954 [details] [diff] [review]
patch

It's hard to verify since we hit that in cases that I suspect to be race conditions. Julien, do you have an way to tweak gaia to hit this?
Comment 5 Julien Wajsberg [:julienw] 2013-02-22 01:34:50 PST
IIRC this happens when we uninstall an app and install it again (because homescreen doesn't handle correctly the uninstall event). Maybe it was fixed though, I'll have a look today.

(the feedback to jduell was an error ?)
Comment 6 [:fabrice] Fabrice Desré 2013-02-22 08:16:07 PST
(In reply to Julien Wajsberg [:julienw] from comment #5)
> 
> (the feedback to jduell was an error ?)

No, I want some feedback on the DummyChannel code.
Comment 7 Jason Smith [:jsmith] 2013-02-22 17:14:32 PST
So I'm actually seeing this happening during packaged app updates. When this happens, the app update completes, but the app won't be able to be launched. Which then makes this data loss now. I've hit this twice in a row so far.

02-22 17:09:20.756: E/GeckoConsole(392): [JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 68}]
02-22 17:09:20.756: E/GeckoConsole(392): [JavaScript Error: "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: '[JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 68}]' when calling method: [nsIProtocolHandler::newChannel]" {file: "app://homescreen.gaiamobile.org/js/page.js" line: 158}]
02-22 17:09:20.756: E/GeckoConsole(392): [JavaScript Error: "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: '[JavaScript Error: "appInfo is null" {file: "jar:file:///system/b2g/omni.ja!/components/AppProtocolHandler.js" line: 68}]' when calling method: [nsIProtocolHandler::newChannel]" {file: "app://homescreen.gaiamobile.org/js/page.js" line: 158}]
Comment 8 Jason Duell [:jduell] (needinfo me) 2013-02-22 21:22:21 PST
Comment on attachment 716954 [details] [diff] [review]
patch

Review of attachment 716954 [details] [diff] [review]:
-----------------------------------------------------------------

My gut sense is that you'd be better served by throwing an error in newChannel instead of constructing a dummychannel.  But if it makes life a lot easier to have the error happen later in OnStopRequest, then I'm fine with that.  More details below.

::: netwerk/protocol/app/AppProtocolHandler.js
@@ +54,5 @@
>      // We map app://ABCDEF/path/to/file.ext to
>      // jar:file:///path/to/profile/webapps/ABCDEF/application.zip!/path/to/file.ext
> +    let url = aURI.QueryInterface(Ci.nsIURL);
> +    let appId = aURI.host;
> +    let fileSpec = url.filePath;

So you're relying on nsIURL parsing to treat everything up to first / as host/appID.  I don't know the URL code well enough to know if that's true--sounds like it ought to work.  :bz would know, or maybe you've already looked at the code.

@@ +62,5 @@
>      let uri;
> +
> +    if (!appInfo) {
> +      dump("!! got no appInfo for " + appId + "\n");
> +      return new DummyChannel();

Any reason you don't just return an error here instead of constructing a dummy channel?  It's certainly a cleaner approach.  But if it makes life easier upstream there's nothing inherently wrong with the dummy channel approach.  It's just kludgy.

@@ +89,5 @@
> +  dump("DummyChannel constructor");
> +}
> +
> +DummyChannel.prototype = {
> +  QueryInterface: XPCOMUtils.generateQI([Ci.nsIChannel]),

nsIChannel inherits from nsIRequest, so there's a bunch of members/methods there that you theoretically also need to implement to be a properly functioning nsIChannel.  Whether you actually need to implement them or not depends of course on whether the code upstream actually uses/relies on them.

@@ +115,5 @@
> +    Services.tm.currentThread.dispatch(
> +    {
> +      run: function dc_run() {
> +        aListener.onStartRequest(request, aContext);
> +        aListener.onStopRequest(request, aContext, Cr.NS_ERROR_FILE_NOT_FOUND);

My JS-Fu is poor.  Does this use of aListener in a nested function that gets run in a different event preseve a refcount for aListener the entire time?  A call to AsyncOpen that returns NS_OK is responsible for AddRef'ing the listener and then we must release it after calling OnStopRequest.

For thoroughness's sake, you might also want to set notificationCallbacks=null (after calling OnStopRequest), in case some code set it after the channel was created to avoid possibly ref cycle.

It's also generally the case that if nsIRequest.loadGroup has been set, AsyncOpen calls loadGroup.AddRequest(this) and after OnStopRequest is called we call loadgroup.RemoveRequest.  Don't know if you need it in the case here.  IsPending() would also generally return true in between AsyncOpen and OnStop, and false otherwise.

suspend() would prevent OnStart/Stop from happening until resume() is called (and could be called by aListener.OnStart())  Again maybe no need to implement.  

Perhaps you get a taste for how faking a channel is complicated, and this dummy approach could easily blow up on you later if some new code is added that assumes one of these things about the channel and you haven't implemented it :)  That's why I recommend the simple newChannel error throw if it's possible.
Comment 9 Julien Wajsberg [:julienw] 2013-02-25 06:34:19 PST
(In reply to Jason Duell (:jduell) from comment #8)

> My gut sense is that you'd be better served by throwing an error in
> newChannel instead of constructing a dummychannel.  But if it makes life a
> lot easier to have the error happen later in OnStopRequest, then I'm fine
> with that.  More details below.

In the calling code, we use a system XHR to get a file whose URL is an app:// url. I was suggesting that having a 404 or 50x error would be better than throwing at "open" (what happens now if the app id is not found).

At least we should not throw with a "null value error" but maybe the "50x" error could be generated by the XHR implementation if the protocol handler throws an Error ?
Comment 10 Julien Wajsberg [:julienw] 2013-02-25 06:59:07 PST
(In reply to Jason Smith [:jsmith] from comment #7)
> So I'm actually seeing this happening during packaged app updates. When this
> happens, the app update completes, but the app won't be able to be launched.
> Which then makes this data loss now. I've hit this twice in a row so far.
> 

I can't reproduce this (even if I got other unrelated issues trying to reproduce it). Would you have consistent STR ?
Comment 11 Julien Wajsberg [:julienw] 2013-02-25 07:25:02 PST
Created attachment 717883 [details] [diff] [review]
"trigger the error" patch

With this patch you can easily reproduce the error.

Don't use it at home as this will break homescreen until this bug is fixed ;)
Comment 12 Jason Smith [:jsmith] 2013-02-25 07:51:25 PST
(In reply to Julien Wajsberg [:julienw] from comment #10)
> (In reply to Jason Smith [:jsmith] from comment #7)
> > So I'm actually seeing this happening during packaged app updates. When this
> > happens, the app update completes, but the app won't be able to be launched.
> > Which then makes this data loss now. I've hit this twice in a row so far.
> > 
> 
> I can't reproduce this (even if I got other unrelated issues trying to
> reproduce it). Would you have consistent STR ?

There isn't really a consistent STR, but you've got about a 1/5 of a chance of hitting this and getting that error. Which means it's probably a timing issue. But I most definitely have seen this happen and have seen the app that gets hit by it become unlaunchable.

And if it's a timing issue, I doubt I'm going to get consistent STR.
Comment 13 Julien Wajsberg [:julienw] 2013-02-25 08:08:15 PST
(In reply to Jason Smith [:jsmith] from comment #12)
> 
> There isn't really a consistent STR, but you've got about a 1/5 of a chance
> of hitting this and getting that error. Which means it's probably a timing
> issue. But I most definitely have seen this happen and have seen the app
> that gets hit by it become unlaunchable.

Does a reboot fix it ?
Comment 14 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-25 08:12:52 PST
With the lack of consistent reproducibility we won't block on tef but will push to leo nomination so it can be considered for blocking that version.
Comment 15 Jason Smith [:jsmith] 2013-02-25 08:13:28 PST
(In reply to Julien Wajsberg [:julienw] from comment #13)
> (In reply to Jason Smith [:jsmith] from comment #12)
> > 
> > There isn't really a consistent STR, but you've got about a 1/5 of a chance
> > of hitting this and getting that error. Which means it's probably a timing
> > issue. But I most definitely have seen this happen and have seen the app
> > that gets hit by it become unlaunchable.
> 
> Does a reboot fix it ?

Yes.
Comment 16 Jason Duell [:jduell] (needinfo me) 2013-02-25 11:25:02 PST
> we use a system XHR to get a file whose URL is an app:// url. I was suggesting
> that having a 404 or 50x error would be better than throwing at "open" 

If that's easier, then the dummychannel approach should be workable.
Comment 17 Julien Wajsberg [:julienw] 2013-02-25 15:03:53 PST
(In reply to Jason Duell (:jduell) from comment #16)
> > we use a system XHR to get a file whose URL is an app:// url. I was suggesting
> > that having a 404 or 50x error would be better than throwing at "open" 
> 
> If that's easier, then the dummychannel approach should be workable.

I wonder if throwing an error here, and catching it in the XMLHttpRequest implementation instead, would not be easier ? This would prevent any programming mistake in all protocol handlers to cause a problem to the calling code. Also, this is not unlike how servers deal with programming errors: with errors 50x.
Comment 18 [:fabrice] Fabrice Desré 2013-02-25 17:59:11 PST
Created attachment 718194 [details] [diff] [review]
patch v2

So if we want to return an HTTP error, we can either use a dummy channel, or tweak the XHR implementation to do something special in our case. Jason, if you think the later is the good thing to do, don't read further and r- this patch.

If not, I implemented the behavior you asked for in your first comment. I checked that gaia was indeed getting 404s.

The changes to the URL decoding code were asked by Gavin in bug 811540, but we finally used another solution so we never landed them.
Comment 19 [:fabrice] Fabrice Desré 2013-02-25 18:28:10 PST
Created attachment 718207 [details] [diff] [review]
patch v3

Correct version of the patch...
Comment 20 Jason Duell [:jduell] (needinfo me) 2013-02-25 19:06:01 PST
Comment on attachment 718207 [details] [diff] [review]
patch v3

Review of attachment 718207 [details] [diff] [review]:
-----------------------------------------------------------------

Look good.  If you fix up suspend/resume you can mark next patch +r and land w/o another review from me.

::: netwerk/protocol/app/AppProtocolHandler.js
@@ +63,5 @@
> +
> +    if (!appInfo) {
> +      // That should not happen, so dump() inconditionnally.
> +      dump("!! got no appInfo for " + appId + "\n");
> +      return new DummyChannel();

Can you put a comment that explains why we prefer to return a dummy channel instead of just throwing here?

@@ +86,5 @@
>    }
>  };
>  
> +/**
> +  * This dummy channel implementation only provides enough functionnality

functionality

@@ +88,5 @@
>  
> +/**
> +  * This dummy channel implementation only provides enough functionnality
> +  * to return a fake 404 error when the caller asks for an incorrect
> +  * app:// url in the case of an unknonw appId.

"asks for an app:// URL containing an unknown appId"

@@ +110,5 @@
> +  cancel: function dc_cancel() {
> +  },
> +
> +  suspend: function dc_suspend() {
> +    this._suspended = true;

this._suspendCount++

@@ +114,5 @@
> +    this._suspended = true;
> +  },
> +
> +  resume: function dc_resume() {
> +    this._suspended = false;

Suspend/resume can be recursive, so make this 

  if (suspendCount <= 0)
     throw NS_ERROR_UNEXPECTED;

  if (--suspendCount == 0) { if (pending) dispatch() }

@@ +137,5 @@
> +  contentDisposition: Ci.nsIChannel.DISPOSITION_INLINE,
> +  contentDispositionFilename: "",
> +
> +  _pending: false,
> +  _suspended: false,

_suspendCount = 0;
Comment 21 Jason Duell [:jduell] (needinfo me) 2013-02-25 19:08:22 PST
Julien:  I don't actually know XHR well enough to say if hooking there is significantly easier than using a dummy JAR channel.  This seems to work well enough?
Comment 22 Julien Wajsberg [:julienw] 2013-02-25 23:58:48 PST
Whichever is fine for me as soon as this is fixed, I was only exploring all possible solutions.
Comment 23 [:fabrice] Fabrice Desré 2013-02-27 11:32:37 PST
hg.mozilla.org/integration/mozilla-inbound/rev/7a79ddc7bedf
Comment 24 Ryan VanderMeulen [:RyanVM] 2013-02-27 12:29:22 PST
Backed out for debug-only bustage during packaging.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a87453b34058

https://tbpl.mozilla.org/php/getParsedLog.php?id=20153551&tree=Mozilla-Inbound

e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\toolkit\mozapps\installer\packager.mk:590:0$ e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer/packager.py -DNO_NSPR_10_SUPPORT -DAB_CD=en-US -DMOZ_APP_NAME=firefox -DPREF_DIR=defaults/preferences -DMOZ_DEBUG=1 -D_MSC_VER=1600 -DJAREXT= -DMOZ_ANGLE_RENDERER=1 -DMOZ_D3DCOMPILER_DLL=D3DCompiler_43.dll -DMOZ_CHILD_PROCESS_NAME=plugin-container.exe -DMOZ_MSVC_REDIST=1600 -DMOZ_SHARED_MOZGLUE=1 -DMOZ_JSDEBUGGER -DNECKO_WIFI -DDLL_PREFIX= -DDLL_SUFFIX=.dll -DBIN_SUFFIX=.exe -DBINPATH=bin -DENABLE_MARIONETTE=1 \
	--format omni \
	--removals e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/browser/installer/removed-files.in \
	 \
	 \
	 \
	--optimizejars \
	 \
	package-manifest ../../dist ../../dist/firefox \
	
Executing e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/obj-firefox/dist\bin\xpcshell.exe -g e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\dist\bin/ -a e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\dist\bin/ -f e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer\precompile_cache.js -e precompile_startupcache("resource://gre/");
resource://gre/components/ActivityOptions.js
resource://gre/components/ActivityProxy.js
resource://gre/components/ActivityRequestHandler.js
resource://gre/components/ActivityWrapper.js
resource://gre/components/addonManager.js
resource://gre/components/AlarmsManager.js
resource://gre/components/amContentHandler.js
resource://gre/components/amWebInstallListener.js
resource://gre/components/AppProtocolHandler.js
###!!! ASSERTION: Recursive GetService!: 'Error', file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/xpcom/components/nsComponentManager.cpp, line 1385
xul!CallGetService+0x000000000000003B (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\xpcom\build\nscomponentmanagerutils.cpp, line 63)
xul!CallGetService<nsIProtocolHandler>+0x0000000000000051 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\dist\include\nsservicemanagerutils.h, line 98)
xul!nsIOService::GetProtocolHandler+0x0000000000000110 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\netwerk\base\src\nsioservice.cpp, line 418)
xul!nsIOService::NewURI+0x00000000000000E1 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\netwerk\base\src\nsioservice.cpp, line 540)
xul!NS_InvokeByIndex_P+0x0000000000000027 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\xpcom\reflect\xptcall\src\md\win32\xptcinvoke.cpp, line 71)
xul!XPCWrappedNative::CallMethod+0x0000000000001084 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\src\xpcwrappednative.cpp, line 2385)
xul!XPC_WN_CallMethod+0x00000000000002A1 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\src\xpcwrappednativejsops.cpp, line 1417)
mozjs!js::InvokeKernel+0x000000000000043D (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsinterp.cpp, line 383)
mozjs!js::Interpret+0x000000000000A142 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsinterp.cpp, line 2362)
mozjs!js::RunScript+0x00000000000004B8 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsinterp.cpp, line 340)
mozjs!js::ExecuteKernel+0x00000000000001EC (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsinterp.cpp, line 530)
mozjs!js::Execute+0x00000000000004A5 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsinterp.cpp, line 570)
mozjs!JS_ExecuteScript+0x00000000000001E8 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsapi.cpp, line 5516)
mozjs!JS_ExecuteScriptVersion+0x0000000000000033 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\src\jsapi.cpp, line 5525)
xul!mozJSComponentLoader::ObjectForLocation+0x0000000000000C76 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\loader\mozjscomponentloader.cpp, line 1103)
xul!mozJSComponentLoader::LoadModule+0x000000000000016B (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\loader\mozjscomponentloader.cpp, line 532)
xul!nsComponentManagerImpl::KnownModule::Load+0x000000000000002E (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\xpcom\components\nscomponentmanager.cpp, line 698)
xul!nsComponentManagerImpl::CreateInstanceByContractID+0x0000000000000083 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\xpcom\components\nscomponentmanager.cpp, line 1030)
xul!nsComponentManagerImpl::GetServiceByContractID+0x00000000000002BE (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\xpcom\components\nscomponentmanager.cpp, line 1426)
xul!CallGetService+0x000000000000003B (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\xpcom\build\nscomponentmanagerutils.cpp, line 63)
xul!CallGetService<nsIProtocolHandler>+0x0000000000000051 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\obj-firefox\dist\include\nsservicemanagerutils.h, line 98)
xul!nsIOService::GetProtocolHandler+0x0000000000000110 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\netwerk\base\src\nsioservice.cpp, line 418)
xul!nsIOService::NewURI+0x00000000000000E1 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\netwerk\base\src\nsioservice.cpp, line 540)
xul!NS_InvokeByIndex_P+0x0000000000000027 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\xpcom\reflect\xptcall\src\md\win32\xptcinvoke.cpp, line 71)
xul!XPCWrappedNative::CallMethod+0x0000000000001084 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\src\xpcwrappednative.cpp, line 2385)
xul!XPC_WN_CallMethod+0x00000000000002A1 (e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\js\xpconnect\src\xpcwrappednativejsops.cpp, line 1417)
mozjs!js::InvokeKernel+0x000000000000043D (e:\builds\moe:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\toolkit\mozapps\installer\packager.mk:590:0: command 'e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer/packager.py -DNO_NSPR_10_SUPPORT -DAB_CD=en-US -DMOZ_APP_NAME=firefox -DPREF_DIR=defaults/preferences -DMOZ_DEBUG=1 -D_MSC_VER=1600 -DJAREXT= -DMOZ_ANGLE_RENDERER=1 -DMOZ_D3DCOMPILER_DLL=D3DCompiler_43.dll -DMOZ_CHILD_PROCESS_NAME=plugin-container.exe -DMOZ_MSVC_REDIST=1600 -DMOZ_SHARED_MOZGLUE=1 -DMOZ_JSDEBUGGER -DNECKO_WIFI -DDLL_PREFIX= -DDLL_SUFFIX=.dll -DBIN_SUFFIX=.exe -DBINPATH=bin -DENABLE_MARIONETTE=1 \
	--format omni \
	--removals e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/browser/installer/removed-files.in \
	 \
	 \
	 \
	--optimizejars \
	 \
	package-manifest ../../dist ../../dist/firefox \
	' failed, return code 1
e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\toolkit\mozapps\installer\packager.mk:617:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/build/pymake/pymake/../make.py make-package-internal' failed, return code 2
e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\config\rules.mk:596:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/build/pymake/pymake/../make.py libs' failed, return code 2
e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\browser\build.mk:45:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/build/pymake/pymake/../make.py -C browser/installer' failed, return code 2
Traceback (most recent call last):
  File "e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer/packager.py", line 366, in <module>
    main()
  File "e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer/packager.py", line 358, in main
    args.source, gre_path, base)
  File "e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/toolkit/mozapps/installer/packager.py", line 140, in precompile_cache
    errors.fatal('Error while running startup cache precompilation')
  File "e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\python\mozbuild\mozpack\errors.py", line 101, in fatal
    self._handle(self.FATAL, msg)
  File "e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\python\mozbuild\mozpack\errors.py", line 96, in _handle
    raise ErrorMessage(msg)
mozpack.errors.ErrorMessage: Error: Error while running startup cache precompilation
Comment 25 [:fabrice] Fabrice Desré 2013-02-28 10:08:38 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ac24f8c6d21
Comment 26 [:fabrice] Fabrice Desré 2013-02-28 11:28:07 PST
and https://hg.mozilla.org/integration/mozilla-inbound/rev/5f4904d3191f to fix bustage
Comment 28 Ryan VanderMeulen [:RyanVM] 2013-02-28 20:09:05 PST
https://hg.mozilla.org/releases/mozilla-b2g18/rev/8ecbfd32e788

I folded in the bustage fix with the other cset.
Comment 29 Julien Wajsberg [:julienw] 2013-03-21 07:28:01 PDT
Nom for tef per Bug 851281 comment 30.
Comment 30 Julien Wajsberg [:julienw] 2013-03-21 14:56:04 PDT
Even if we found the root cause in Bug 851281, this patch still provides us guards around such unexpected problems.
Comment 31 Michael Vines [:m1] [:evilmachines] 2013-03-21 15:04:46 PDT
I agree it's nice to have but doesn't seem mandatory for tef+, so rather not take the risk.  Instead lets, umm, fall on our face the next time this class of problem manifests on v1.0.1 :)
Comment 32 Julien Wajsberg [:julienw] 2013-03-22 03:16:31 PDT
Michael, I'd like to insist here.

* A homescreen that doesn't show anything is the last thing we want. This patch prevents a whole class of potential bugs that lead to this state.
* this patch landed for nearly 1 month in both m-c and b2g18, we would have caught any existing regression as of now. AFAIK we had none.
* even if the code added seems big, in reality there is not much changed in the existing code: a safer way to split the URL, a 404 error when we would have badly thrown instead.
Comment 33 Michael Vines [:m1] [:evilmachines] 2013-03-22 11:40:27 PDT
np.  The class of problems that this bug prevents are ones that would we would not expect the end-user to ever encounter right?  The bug reference in comment 30 looked to be more a build configuration issue.
Comment 34 Julien Wajsberg [:julienw] 2013-03-22 11:58:40 PDT
The fact that we don't expect them does not mean that a user won't trigger it...
Comment 35 Michael Vines [:m1] [:evilmachines] 2013-03-22 14:40:06 PDT
Yep, I know.  I'm on your side here!  Help me turn that - into a + with a strong justification for how this bug could manifest in the commercial device.
Comment 36 Julien Wajsberg [:julienw] 2013-03-23 09:01:30 PDT
When the homescreen starts up, it first displays the apps that are in its local indexed db and only then it synchronizes with the Gecko apps registry.

The problem manifests when the homescreen has an unavailable app in its local indexed db. "Unavailable" means here that the app is not on the device anymore (ie not in the apps registry).

I believe that this specific bug will prevent us from having a system update that removes apps from the device. This will prevent a user from removing an app "by hand". Even if that's unexpected, there could be bugs in the homescreen that unsynchronize the local indexed db, and without this patch, the local indexed db could not be resynchronized.

If that's not enough, I won't fight more. I believe we include much less important and much less safe patches in 1.0.1, having to fight about this drives me crazy.
Comment 37 Michael Vines [:m1] [:evilmachines] 2013-03-24 15:37:35 PDT
(In reply to Julien Wajsberg [:julienw] from comment #36)
> I believe that this specific bug will prevent us from having a system update
> that removes apps from the device.

Ok, not sure this is even a requirement at the moment.  If this becomes a requirement then I think that system update could also simply include this patch.

> This will prevent a user from removing an app "by hand".

You mean through |adb shell| right?  This is a developer use-case so not really blocking.  

> Even if that's unexpected, there could be bugs in the homescreen that 
> unsynchronize the local indexed db, and without this patch, the local 
> indexed db could not be resynchronized.

I'm *certain* there are bugs but really the concern is whether this patch will fix more bugs than it introduces. :)

> If that's not enough, I won't fight more. I believe we include much less
> important and much less safe patches in 1.0.1, having to fight about this
> drives me crazy.

I understand your frustration.  We had a window to get slightly risker patches in v1.0.1, as well as "would really be nice and very likely low risk" patches like this, but that window has now closed on us.  If this was a week back I'd have tef+'ed it already.  However I'm certainly not the sole arbiter of what lands in v1.0.1 so I do encourage you to tef? this bug again for somebody else to assess if you feel this bug could result in a significant number of support calls or device returns.
Comment 38 Myk Melez [:myk] [@mykmelez] 2013-03-29 15:55:02 PDT
*** Bug 850382 has been marked as a duplicate of this bug. ***
Comment 39 Jason Duell [:jduell] (needinfo me) 2013-04-01 14:12:24 PDT
re: whether this is tef+ worthy: comment 7 and comment 12 make it sound like w/o this patch, updating a packaged app has a 1/5 chance of getting it into a wedged state when it won't launch.

Note You need to log in before you can comment on or make changes to this bug.