Trusted UI and friends (identity, marketplace) should listen for mozbrowsererror and clean up when the process dies

RESOLVED FIXED

Status

Firefox OS
Gaia::System
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: khuey, Assigned: jedp)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

The Persona sign in feature creates a mozbrowser iframe in the system app to display its UI.  It doesn't listen for mozbrowsererror though, so if we kill -9 the persona browser process the UI does not get cleaned up.

Justin, are there other mozbrowser events we should care about here, like mozbrowserclosed or something?
Assignee: nobody → jparsons
Created attachment 788481 [details] [diff] [review]
Possible approach for this fix

I am on PTO this week, and I have no FirefoxOS device with me, so this patch is untested.  But if anyone can hack on it, this is the file that needs the fix, and the patch could look something like the attachment.
Ah - it's actually the trusted UI that should be closing on mozbrowsererror.
Summary: identity.js should listen for mozbrowsererror and clean up when the persona process dies → Trusted UI and friends (identity, marketplace) should listen for mozbrowsererror and clean up when the process dies
Created attachment 796821 [details]
Clean up Trusted UI on mozbrowsererror

This patch allows the Trusted UI to catch a mozbrowsererror and clean up.  There doesn't seem to be any point in putting listeners for mozbrowsererror in either identity.js or payments.js, because they are contained within the Trusted UI process.

This patch applies cleanly on today's gaia master and v1-train.
Attachment #788481 - Attachment is obsolete: true
Attachment #796821 - Flags: review?(ferjmoreno)
Attachment #796821 - Flags: review?(ferjmoreno)
Thanks Jed!

Shouldn't we show an error message instead of just closing the dialog? Check [1] for example.

I would also handle "mozbrowserclose", so "window.close" can work within the trusted UI.

[1] https://mxr.mozilla.org/gaia/source/apps/system/js/popup_manager.js#160
Oh nice - mozbrowserclose should indeed be handled.  Thanks for the comments.  I'll update the patch.
Having watched 2001 again last night, I now believe this should show a bright red error frame flashing the words COMPUTER MALFUNCTION.
Created attachment 797925 [details]
2013-08-30-13-06-56.png

Hi, Fernando,

I've updated the patch to display an error screen when there is a crash.  It also catches mozbrowserclose.

I wasn't sure which error property was the correct one to use.  Here's a screenshot.  Comments?

Thanks!
j
Attachment #797925 - Flags: feedback?(ferjmoreno)
Comment on attachment 797925 [details]
2013-08-30-13-06-56.png

Thanks Jed! I left a few comments in the PR.
Attachment #797925 - Flags: feedback?(ferjmoreno) → feedback+
Thanks for your close reading as always, Fernando!
With your suggestion of handling additional known error cases (like airplane mode and loss of network), I'll add a "Try again" button alongside "Close".  

It turns out to be a bit of a squeeze getting these longer error messages inside the Trusted UI.
Hey Jed, what's the next step here?
Flags: needinfo?(jparsons)
Hi, Kyle.  Next step is UI design, with me, or someone enthusiastic about doing so, making the necessary layout changes to accommodate the additional error traps and lengthier error strings.  I will check with :mhanratty for guidance on some layout questions.
I don't want to scope creep this bug if we don't have to.  Can we land the code that removes the iframe mozbrowser (and thus shuts down the process, which uses several MB of memory) and deal with giving this good UI in a followup.  IIRC the current UI just gives you a big empty gray box if the process crashes so we're not making things any worse here.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #13)
> I don't want to scope creep this bug if we don't have to.  Can we land the
> code that removes the iframe mozbrowser (and thus shuts down the process,
> which uses several MB of memory) and deal with giving this good UI in a
> followup.  IIRC the current UI just gives you a big empty gray box if the
> process crashes so we're not making things any worse here.

That sounds like a plan
Flags: needinfo?(jparsons)
Great.  Thanks.

I'm told that we're branching for 1.2 on Monday but I can bless this bug with blocking-koi+ to land a safe patch for some time after that.
Thanks, Kyle.  Working on it now.
Got some weird html layout problems that just started appearing after reverting and rebasing.  (And alas, the awesome new Inspector for the App Manager, doesn't yet work on the System App.)
Oh wow.  Back to the drawing board.  As of today's m-c trunk and gaia master, killing the persona browser process reboots the phone!  The mozbrowsererror is not captured any more.
Created attachment 806241 [details]
reboot-log.txt

Here's the lolcat emitted when the process is kill -9'd
Adding some dump statements line-by-line in the patch (lines starting with "TRUI"), it's actually clear that mozbrowsererror is getting caught, and the code is starting to create an error message.  But in the midst of it all, we end up with a SIGSEGV

I/GeckoDump( 1718): TRUI handleEvent: mozbrowsererror
I/GeckoDump( 1718): TRUI handling mozbrowserloaderror
I/GeckoDump( 1718): TRUI added .error to data; now show error
I/GeckoDump( 1718): TRUI getTopDialog
I/Gecko   ( 1718): [Parent 1718] ###!!! ABORT: actor has been |delete|d: file /Users/zeus/code/b2g/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp, line 378
E/Gecko   ( 1718): mozalloc_abort: [Parent 1718] ###!!! ABORT: actor has been |delete|d: file /Users/zeus/code/b2g/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp, line 378
F/libc    ( 1718): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
I/GeckoDump( 1718): TRUI get name and stuff
I/GeckoDump( 1718): TRUI set textContexts
I/GeckoDump( 1718): TRUI add .error class
E/GeckoConsole( 1718): [JavaScript Error: "[Exception... "'Error: no message manager set' when calling method: [nsIObserver::observe]"  nsresult: "0x8057001c (NS_ERROR_XPC_JS_THREW_JS_OBJECT)"  location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0"  data: no]"]
I/DEBUG   ( 1484): debuggerd committing suicide to free the zombie!
The problem isn't confined to a trusted_ui Browser process; kill -9 on any app will cause the segfault and reboot.
Depends on: 917957
Yay, the bug described in Comment 18 is no longer occurring in the latest m-c.
With 100% repeatability, with this patch applied, the Persona Trusted UI will crash if the OTA update notification banner appears as it is loading.  If, however, I open a web browser first, and go to some random web page to induce the OTA updates notification to arrive, and *then* I go and open a Persona sign-in flow, the flow works perfectly.

STR:
- Apply the gaia patch above
- make reset-gaia
- choose wifi network in settings
- go to UI Tests -> navigator.mozId
- click 'request'

Result: Updates notification arrives just as Persona Sign-In is loading; Trusted UI displays 'Sign-In has crashed' error.

Log looks like:

I/Gecko   (16502): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
I/GeckoDump(16502): trusted_ui: handleEvent: mozbrowsererror
I/GeckoDump(16502): trusted_ui: handleEvent: appterminated

Alternately, STR to make it not crash:
- As above, but before going to navigator.mozId tests, visit some other web page first.
- Wait for OTA banner to come and go
- Then go to navigator.mozId tests

Result: things work nicely.  Persona login does not say it has crashed.

Fernando, do you have any idea what might be causing this?  Am I somehow catching a mozbrowsererror event that's not intended for me?
Flags: needinfo?(ferjmoreno)
Depends on: 920851
Having spoken to :fabrice on IRC, I've opened bug 920851 for the OTA banner issue described in Comment 23.
Flags: needinfo?(ferjmoreno)
As Bug 920851 Comment 12 notes, we probably need to be listening for these error events on the frame within the trusted ui, not the trusted ui itself.  I will update the patch.
I've updated the patch; once the PR in Bug 920851 is merged, I'll rebase and re-submit this.
Is this ready to go now Jed?
Flags: needinfo?(jparsons)
Oh wow.  That is a blast from the past.  Apologies that this fell off my radar.  Rebased and pushed the PR again.  I'll ask the good Fernando for a review.

Thanks!
Flags: needinfo?(jparsons)
Comment on attachment 796821 [details]
Clean up Trusted UI on mozbrowsererror

Hi, Fernando, thanks for looking at this again after all this time!
j
Attachment #796821 - Flags: review?(ferjmoreno)
Comment on attachment 796821 [details]
Clean up Trusted UI on mozbrowsererror

I am getting this error after applying the patch:

[JavaScript Error: "SyntaxError: property name handleBrowserEvent appears more than once in object literal" {file: "app://system.gaiamobile.org/js/trusted_ui.js" line: 461}]
Attachment #796821 - Flags: review?(ferjmoreno)
Oh?  Ok, taking a look now.  Thanks, ferjm.
Any luck with comment 31?
Flags: needinfo?(jparsons)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #32)
> Any luck with comment 31?

Sorry, I've been underwater with 1.4 blockers.  I will try to get to this as soon as I can.  I expect by the end of the week.
Flags: needinfo?(jparsons)
Status: NEW → ASSIGNED
Comment on attachment 796821 [details]
Clean up Trusted UI on mozbrowsererror

Fixed and rebased

I've tested on device with today's m-c and gaia master as follows:
- in a browser, visit 123done.org
- click sign in
- when persona dialog appears, kill it via adb shell

mozbrowsererror is caught and error dialog appears as expected.

Closing the error dialog and attempting to sign in again works as expected.

Thanks again for your review, ferjm!
Attachment #796821 - Flags: review?(ferjmoreno)
Comment on attachment 796821 [details]
Clean up Trusted UI on mozbrowsererror

Thanks Jed!
Attachment #796821 - Flags: review?(ferjmoreno) → review+
Thanks, ferjm.

There were Travis build errors due to the analog clock app not adapting correctly to orientation changes.  Is there a way to "star" things like that for b2g, as one would with the try builder?

Anyway, merging the PR.
Cheers
j
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Why is there no new tests with this commit?
Good point, :rik, there should be.

I've opened follow-up Bug 976103 to add tests for this.
This patch should not have landed without new tests. Reviewers, please pay attention to this.
Depends on: 976103
You need to log in before you can comment on or make changes to this bug.