Closed Bug 824677 Opened 7 years ago Closed 7 years ago

Displaying error pages for hosted apps that we can't reach closes the app

Categories

(Firefox OS Graveyard :: Gaia::System, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-basecamp:+)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-basecamp +

People

(Reporter: alive, Assigned: alive)

References

(Blocks 1 open bug)

Details

(Keywords: late-l10n, regression)

Attachments

(2 files)

STR:
1. Flash a new phone
2. Make sure you have no network access.
3. Open Map

Expected:
Map is opened and the system tells you that you don't have network access.

Actual:
The transition of app is opened and closed immediately.
Correction: not only map. Any app is located on web and is opened without network first time is closing automatically. Very bad. set bb?
Triage: QAWANTED, can we understand how many preloaded app will behave this way? if only maps, then severity may be lower but if it's happening to many preloaded apps, then this is a bigger problem.
Keywords: qawanted
Summary: Map is not opened when you have no network first time → The app who lives in remote location(http:// or https://) is not opened when you have no network first time
i think you mean connectivity (cellular data / wifi) instead of network here, correct?
No network access..ya like you say.
Josh, Casey we need your input for that.
Flags: needinfo?(kyee)
Flags: needinfo?(jcarpenter)
Note, bug 817154 is the maps app specifically, which we have to fix anyway because it's a tremendous usability problem.
For now, marketplace and Maps will be blocked if there's no available network.
Keywords: qawanted
The patch must show a massage saying "You must be connected to a network for launching this app for the firstime"
blocking-basecamp: ? → +
Priority: -- → P3
Target Milestone: --- → B2G C4 (2jan on)
Assignee: nobody → mbudzynski
Attached file workaround in Gaia
Attachment #696206 - Flags: review?(21)
Attachment #696206 - Flags: review?(stas)
Comment on attachment 696206 [details]
workaround in Gaia

This is just workaround.
There's already a customized error page in system app. This bug is a regression.
Vivien knows the error page well. Please find out why open-app transition is failed instead of hacking homescreen.
Attachment #696206 - Flags: feedback-
(In reply to Alive Kuo [:alive] from comment #10)
> Comment on attachment 696206 [details]
> patch
> 
> This is just workaround.
> There's already a customized error page in system app. This bug is a
> regression.
> Vivien knows the error page well. Please find out why open-app transition is
> failed instead of hacking homescreen.

Agreed. We should resolve the core issue.

If we _do_ keep the work around, we need to massage the string for consistency with other error messages:

Before: "You must be connected to a network for launching this app"
Change to: "[AppName] requires a network connection. Connect to a data or wifi network."

Why? Our style guide avoids using "You" phrasing as much as possible. I'm not 100% set on thi
Flags: needinfo?(jcarpenter)
After investigating, this is a platform bug about mozbrowsererror, exactly.
In system app, the window manager is notified by a mozbrowsererror event, which detail=fatal, from the app which is opened under network-less condition. This was worked as detail=other before.

This should be promoted to P2 or P1. An app frame which is not crashing is firing a crash event. Serious regression I think.

[Log]

E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org/js/window_manager.js:1540 in anonymous: other https://marketplace.firefox.com/telefonica/ =====
E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org/js/window_manager.js:1540 in anonymous: fatal https://marketplace.firefox.com/telefonica/ =====

The 'other' error is fired first and then the fatal event. Don't know what happened here.
Ping platform guys I know they had been working on mozbrowser events.
Switch to general component first.
Changing the bug summary.
Assignee: mbudzynski → nobody
Component: Gaia → General
Keywords: regression
Priority: P3 → P2
Summary: The app who lives in remote location(http:// or https://) is not opened when you have no network first time → A crash event is fired on non-crashed app iframe.
Attachment #696206 - Attachment description: patch → workaround in Gaia
Comment on attachment 696206 [details]
workaround in Gaia

We should fix this in the platform.
Attachment #696206 - Flags: review-
I can take a look at this.
Assignee: nobody → philipp
If we block the first mozbrowsererror event, the second one will not be fired. I think that 'fatal' is caused by the action that tries to handle the error.
(In reply to Patrick Wang [:kk1fff] from comment #15)
> If we block the first mozbrowsererror event, the second one will not be
> fired. I think that 'fatal' is caused by the action that tries to handle the
> error.

Do you mean evt.preventDefault() or block it in gecko?
It worked fine before. I mean I never saw the second fatal event.
Comment on attachment 696206 [details]
workaround in Gaia

Canceling the review request on stas, though this would look OK from an l10n pov. Just don't want to confuse the r-/f- that we have so far.
Attachment #696206 - Flags: review?(stas)
> E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org
> /js/window_manager.js:1540 in anonymous: other https://marketplace.firefox.com
> /telefonica/ =====

> E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org
> /js/window_manager.js:1540 in anonymous: fatal https://marketplace.firefox.com
> /telefonica/ 

Are you positive that the event was actually fired twice?  We had a bug some time in the past where certain console.log() calls were writing to logcat twice.  I don't know if we've fixed that.
(In reply to Alive Kuo [:alive] from comment #16)
> (In reply to Patrick Wang [:kk1fff] from comment #15)
> > If we block the first mozbrowsererror event, the second one will not be
> > fired. I think that 'fatal' is caused by the action that tries to handle the
> > error.
> 
> Do you mean evt.preventDefault() or block it in gecko?
> It worked fine before. I mean I never saw the second fatal event.

When I block the mozbrowsererror in BEC, the screen shows a blue "Offline" page (the gecko's error page) and no fatal error fired. But If I block the mozbrowsererror in the handler of mozbrowsererror in window_manger.js by e.preventDefault() and return directly, the fatal error still fired, and the screen becomes black.
Comment on attachment 696206 [details]
workaround in Gaia

I guess philipp and alive reviews are enough.
Attachment #696206 - Flags: review?(21)
Flags: needinfo?(kyee)
(In reply to Justin Lebar [:jlebar] from comment #18)
> > E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org
> > /js/window_manager.js:1540 in anonymous: other https://marketplace.firefox.com
> > /telefonica/ =====
> 
> > E/GeckoConsole(  108): Content JS LOG at app://system.gaiamobile.org
> > /js/window_manager.js:1540 in anonymous: fatal https://marketplace.firefox.com
> > /telefonica/ 
> 
> Are you positive that the event was actually fired twice?  We had a bug some
> time in the past where certain console.log() calls were writing to logcat
> twice.  I don't know if we've fixed that.

Note that the first is an error event of type 'other', the second is 'fatal'. So yes, these are two separate error events.
(In reply to Patrick Wang [:kk1fff] from comment #19)
> (In reply to Alive Kuo [:alive] from comment #16)
> > (In reply to Patrick Wang [:kk1fff] from comment #15)
> > > If we block the first mozbrowsererror event, the second one will not be
> > > fired. I think that 'fatal' is caused by the action that tries to handle the
> > > error.
> > 
> > Do you mean evt.preventDefault() or block it in gecko?
> > It worked fine before. I mean I never saw the second fatal event.
> 
> When I block the mozbrowsererror in BEC, the screen shows a blue "Offline"
> page (the gecko's error page) and no fatal error fired.

I can confirm this. Commenting out

760         // TODO See nsDocShell::DisplayLoadError for a list of all the error
761         // codes (the status param) we should eventually handle here.
762         sendAsyncMsg('error', {type: 'other'});

makes the crash go away.

So the bug description is a little wrong. There *is* a crash. The Maps content process definitely crashes. So the 'fatal' error is quite correct. The question is, why does the 'other' error that happens because the docshell can't load eventually make the content process crash? I attached gdb to the content process and all I got was

Program received signal SIGTERM, Terminated.
write () at bionic/libc/arch-arm/syscalls/write.S:10
10	    ldmfd   sp!, {r4, r7}
(gdb) bt
#0  write () at bionic/libc/arch-arm/syscalls/write.S:10
Cannot access memory at address 0x40049588

Not very helpful. Looks like memory corruption.
Summary: A crash event is fired on non-crashed app iframe. → Opening a hosted app while Gecko is in offline mode crashes iframe mozapp content process
(In reply to Philipp von Weitershausen [:philikon] from comment #22)
> (In reply to Patrick Wang [:kk1fff] from comment #19)
> > (In reply to Alive Kuo [:alive] from comment #16)
> > > (In reply to Patrick Wang [:kk1fff] from comment #15)
> > > > If we block the first mozbrowsererror event, the second one will not be
> > > > fired. I think that 'fatal' is caused by the action that tries to handle the
> > > > error.
> > > 
> > > Do you mean evt.preventDefault() or block it in gecko?
> > > It worked fine before. I mean I never saw the second fatal event.
> > 
> > When I block the mozbrowsererror in BEC, the screen shows a blue "Offline"
> > page (the gecko's error page) and no fatal error fired.
> 
> I can confirm this. Commenting out
> 
> 760         // TODO See nsDocShell::DisplayLoadError for a list of all the
> error
> 761         // codes (the status param) we should eventually handle here.
> 762         sendAsyncMsg('error', {type: 'other'});
> 
> makes the crash go away.
> 
> So the bug description is a little wrong. There *is* a crash. The Maps
> content process definitely crashes. 

The same happens to Marketplace.
It's worse than firing wrong events if these two apps are crashing at the same time......:/
Duplicate of this bug: 827011
Blake said he could take a look here since Philikon's got the FM-radio-through-analog-path work to do.
Assignee: philipp → mrbkap
The "crash" here is Gecko killing the app that we're trying to load. When we launch an application, we set its manifest URL to a given manifest and start the load. If that load doesn't succeed, we fire a mozbrowsererror (type: other) at the frame, which is eventually handled by the system app. The system app, in order to display the error page, navigates the iframe to the system app's error page. However, Necko refuses to let mozbrowsers change their origin and kills them if they try.

Vivien and I talked about this and the easiest way to fix this is probably to launch the error page in a separate frame that has a button which relaunches the original app. To avoid string changes, we could probably use the same text as the existing error.html.
Component: General → Gaia::System
Summary: Opening a hosted app while Gecko is in offline mode crashes iframe mozapp content process → Displaying error pages for hosted apps that we can't reach doesn't work
Summary: Displaying error pages for hosted apps that we can't reach doesn't work → Displaying error pages for hosted apps that we can't reach closes the app
Why is there a difference between the first run of the app without an Internet connection and subsequent runs without an Internet connection where the app was previously loaded successfully?

Blake, are you working on this or would you like a Gaia dev to take it? I can help if needed.
I am grabbing this bug according to comment 26.
Josh, Vivien, we may have a chance to revisit the implementation in bug 824672 as well as this bug.
Assignee: mrbkap → alive
> However, Necko refuses to let mozbrowsers change their origin and kills them if they try.

What does Necko refuse to let mozbrowsers do, exactly?  You can navigate a mozbrowser/mozapp to whatever origin you want, I thought.
The problem might be that we're loading the error page as an app:// URL which checks. Is there another protocol that would work instead?
I have talked to Vivien and Josh that about my proposal to fix this.
Now we know that embedding an error page in app iframe may be a cool but doesn't work.

So I re-raised my previous idea about this:
Make app iframe living in its own container element in windows.
Put any dialog which is app-specific on the container.

This could resolve many pains we have from long time ago. And this could create the way to connect what we could have in the future. A self-containing window is much easilier to maintain any UX/UI change if you'd like.

This should be a big change, but I am willing to do this just like what I do to coordinate z-index stuff.

I am making some drafts now.
https://github.com/mozilla-b2g/gaia/pull/7460

This is still a WIP...in very initial version but a protective patch.
Just do adding a div container to the mozbrowser iframe.

However my final plan is to make a mozbrowser window be a instance of 'Window' object.

Let the object be in charge of resize, open, close, show/hide app-specific dialog, firing app-specific custom events
Let the window manager be in charge of centralized control like visibilitychange, transition, homebutton event...

I wrote a draft mail about this change but didn't send. Append it here.
===
I feel there's one thing that we should do but never did:

Make every mozBrowser iframe or iframe which is provided by platform having specific tasks live in its own DOM element container, something like a <div>. Now they are grouped together without any container.

It includes:
* A normal mozApp iframe [frameType="window"]
* Inline-actvity iframe [frameType="inline"]
* Popup dialog[frameType="popup"] (by window.open)
* Attention dialog[frameType="attention"] (by window.open with 'attention' specified)
* Wrapper[frameType='wrapper"…but now it's identified as dataset.wrapper] (a bookmark when you choose to bookmark to home screen in browser app)

What's the benefit?
1. We should know that some dialogs are app-specific and some dialogs are system-specific.

App specific dialogs:
* Modal Dialog(Alert, confirm, prompt, custom_dialog)
* Error Dialog(Tells you that we have network error when loading the app page)
* Popup!(By window.open)
* Inline Activity!(By design, an inline activity is tightened to the caller app. Any inline activities should be canceled/closed if we leave the caller app. I had a patch to enable multiple inline activities on single app already. So an inline activity 

Maybe System-wide
* Permission Screen - Josh tells me this is still under consideration.
  Currently the implementation of permission dialog in platform doesn't tell which iframe it's currently happening -- I begin to believe this is intentional if we confirm that any permission dialog should be system wide.

System-wide dialog
* SIM-PIN Dialog
* Crash Report
* System update prompt
* Attention screen

We are now make everything `singleton`ed in system/index.html
And use 'appopen' or 'appwillopen' event fired by window manager to deal with app switching case to determine show them or not.
* It's bad on UX because when the app is transitioning, the app-specific dialog should be transitioned as well as it could.

We could just resize the app Window who contains any dialog in its container regardless to what's being displayed. Regardless of z-index!

2. Gather mozBrowser API if possible
We have much function work relying on mozbrowser API and mozbrowser frames.
* mozbrowsererror
* mozbrowsershowmodalprompt
* mozbrowserloadstart
* mozbrowserloadend
* mozbrowsercontextmenu
* mozbrowseropenwindow
* mozbrowserclose
* mozbrowserusernameandpasswordrequired

These events handler are everywhere in system app in many modules.
Do we have a chance to gather it together to avoid rewrite event handler everywhere? Yes I think.

3. UX could easily change the style of dialogs if he wants. IMO.
Currently every dialog is designed to be fullscreen or cover the whole app.
Do we have a chance to change the alert dialog to be only a little dialog showing in center of the app? Yes.

4. Resize pains, fullscreen issue, keyboard size issue
Resize in Window manager is also a pain. Window manager is like a centralized manager to manage the windows but there's no other module is helping it.
For example, we need to resize the app/inline-activity-frame/homescreen app according to:
* Keyboard status
* Attention screen status
* Fullscreen status(mozRequireFullscreen or manifest.fullscreen is a different case.)

5. Visibilitychange pains.
We know that setVisible is also part of Browser API. Now window manager is in charge of change the visibility of every mozbrowser iframe by some complex code paths. And there are also some cases/bugs are not being found now.
For example, if we are having two activities at the same time, currently we don't setVisible(false) to the activity frame under the most top one. This is undefined because some app(Communications - Contact) is using mozvisibilitychange to close itself :O I don't think this is needed to be avoided. We still need to investigate this.

6. Home button events and holdhome events
Now this is not severe -- many effort had be done to keep 'home button' really returns the user to home screen now.

7. Transition relevant to app/inline activity/.…
Tim does a good job in implemanting app transition. But I think we can't avoid to make this code sections change if we are going to create a new pattern for system UI.

What we could do? What's the plan?
* Make (all) mozbrowser iframe be contained in its own container instead of '#windows'.
* Make it as being an object pattern if possible.
* Make it feasible.

Note:
1. We can't avoid to use some kind of HTML templates.
I am happy to learn about that James Lal is inventing some MVC in his Calendar app.
2. We also can't avoid to use object pattern to lessen the effort of window manager.
Something like this:

var Window = function Window(config){
	// Configuration to this window
	var default_config = {
		'id': GLOBAL_APP_ID++,
                'origin': origin,
                'manifest': manifest
	}
	for (var key in this.config)
	   this[key] = config[key] || default_config[key];
	}
}

Window.prototype.resize = function (){...};

*What could be done now?*
I am not going to do EVERYTHING here mentioned. But we could step out with adding the initial change.
Update: After lengthy discussion with Alive offline, we decided to limit the scope of change to limit the risk. Specifically, we will still roll out a <div class="appWindow"> container, but it will still being referred as |frame|. We the actual iframe of the app within the container as |iframe|, gettable with |app.iframe| or |frame.firstChild|.

Specifically, this is the Part I fix of this bug I delivered to Alive:

https://github.com/timdream/gaia/commit/c68e47e7dee06e33acfcfd634f507cf65b6d1781

He will be build this solution to this bug on top of it.
Depends on: 818575
BTW, the fix unfortunately hits Gecko bug 818575 for app frames, hence the dependency. cjones told me that bug is also blocking+ and will be landing soon, so I will not worry about it.
> The problem might be that we're loading the error page as an app:// URL

Yes, this is why the process is getting killed.  My understanding from talking with fabrice is that hosted apps have no application.zip file. Any time you use an app:// uri, you're asking necko for a JAR file, and in this case we're asking for a JAR file that doesn't belong to the app, so necko kills it.  Here's the logcat output:

  NeckoParent::AllocPRemoteOpenFile: FATAL error: app without webapps-manage permission is requesting file '/data/local/webapps/system.gaiamobile.org/application.zip' but is only allowed to open its own application.zip: KILLING CHILD 

It sounds like we really shouldn't be using an app:// URI here.
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) from comment #33)
> Update: After lengthy discussion with Alive offline, we decided to limit the
> scope of change to limit the risk. Specifically, we will still roll out a
> <div class="appWindow"> container, but it will still being referred as
> |frame|. We the actual iframe of the app within the container as |iframe|,
> gettable with |app.iframe| or |frame.firstChild|.
> 
> Specifically, this is the Part I fix of this bug I delivered to Alive:
> 
> https://github.com/timdream/gaia/commit/
> c68e47e7dee06e33acfcfd634f507cf65b6d1781
> 
> He will be build this solution to this bug on top of it.

Much thanks to Tim, my second part is complete.
https://github.com/alivedise/gaia/tree/bugzilla/824677-v3/alive+tim

Close and Reload is working fine now.
I am finding out the way to implement 3rd part:
* Work with Card view
* Work with other BrowserAPI events
Patch v1:
* Move app frame into self-container
* Make a Window Object
* Make a Error Object
* Remove original error logic in Modal Dialog

Seems there's trouble making -moz-element as background now, so I am postponing the cards view issue. Let's fix it in another bug.
Attachment #700688 - Flags: review?(21)
Attachment #700688 - Flags: feedback?(timdream+bugs)
(In reply to Alive Kuo [:alive] from comment #38)
> Created attachment 700688 [details]
> https://github.com/mozilla-b2g/gaia/pull/7520
> 
> Patch v1:
> * Move app frame into self-container
> * Make a Window Object
> * Make a Error Object
> * Remove original error logic in Modal Dialog
> 
> Seems there's trouble making -moz-element as background now, so I am
> postponing the cards view issue. Let's fix it in another bug.

We tried to avoid moz-element as much as possible those days. I would like to discuss with you tomorrow to see why you need it and what exactly is the card view regression.

In the worst case I start to wonder if we can't simply move some part of the error page in b2g/chrome/content/ and use a different protocol than app:// to load it. That would avoid all those changes - which I think are good for the future but they makes me a little bit nervous at this point.
(In reply to Vivien Nicolas (:vingtetun) from comment #39)
> (In reply to Alive Kuo [:alive] from comment #38)
> > Created attachment 700688 [details]
> > https://github.com/mozilla-b2g/gaia/pull/7520
> > 
> > Patch v1:
> > * Move app frame into self-container
> > * Make a Window Object
> > * Make a Error Object
> > * Remove original error logic in Modal Dialog
> > 
> > Seems there's trouble making -moz-element as background now, so I am
> > postponing the cards view issue. Let's fix it in another bug.
> 
> We tried to avoid moz-element as much as possible those days. I would like
> to discuss with you tomorrow to see why you need it and what exactly is the
> card view regression.

Tim told me that getScreenshot only works on iframe element. Cards View use it the get the screenshot of running apps.
If we want to replace it with the error dialog, I think -moz-element is the only solution now.
I know -moz-element is having some performance issue because it's live.

> 
> In the worst case I start to wonder if we can't simply move some part of the
> error page in b2g/chrome/content/ and use a different protocol than app://
> to load it. That would avoid all those changes - which I think are good for
> the future but they makes me a little bit nervous at this point.

In the worst case we could back out bug 811243 manually(conflict I am sure) and keep original error design until we have the best solution to these pains.
Josh, Patryk, 
Please let me know if we have a chance to get a static image for the purpose to replace the screenshot in task manager for erro r screen *TODAY*.
Flags: needinfo?(padamczyk)
Flags: needinfo?(jcarpenter)
Duplicate of this bug: 827682
Comment on attachment 700688 [details]
https://github.com/mozilla-b2g/gaia/pull/7520

Everything is good, at least acceptable for v1, if you rename the |Window| constructor to |AppWindow|.
Attachment #700688 - Flags: feedback?(timdream+bugs) → feedback+
Flags: needinfo?(padamczyk)
Flags: needinfo?(jcarpenter)
Comment on attachment 700688 [details]
https://github.com/mozilla-b2g/gaia/pull/7520

I can r+ this if we rename iframe -> browser and if we don't override the Error object. 

Also this patch will more or less regress bug 811243 but as Tim says it seems like an acceptable solution for v1. Having an overlay for Error pages is not an ideal solution but all the other one that I can think of requires some platform changes :/

But this patch will put us in a good shape for the future. Not having to keep a stack of object for each dialogs and each singletons that manage it is a very nice improvement. Congrats for the idea. Fixing the Cards View stuff can be done in a followup as you said (hopefully by fixing the missing pieces in the platform).
Attachment #700688 - Flags: review?(21) → review+
(In reply to Vivien Nicolas (:vingtetun) from comment #44)
> Comment on attachment 700688 [details]
> https://github.com/mozilla-b2g/gaia/pull/7520
> 
> I can r+ this if we rename iframe -> browser.

Given the fact that ll |app.iframe| calls will be removed, in favor of the abstracted/wrapped methods in |AppWindow| instances, I don't think renaming that will do as any good here.

> and if we don't override the Error object. 

agree.
https://github.com/mozilla-b2g/gaia/commit/f35710af2c6794a377c7558ea275794bd6a1488d

Thank you all!!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
this regressed show time to load - bug 829869
Hey you describe a behavior existed beyond this patch.
It is not a regress.
(In reply to Zbigniew Braniecki [:gandalf] from comment #47)
> this regressed show time to load - bug 829869
Alive: with last changeset before this patch landed the "Show time to load" shows the number more or less close to the actual app loading time. With this patch, it shows static ~300ms. 
I believe that this is a regression, even if the original behavior was not optimal it was still the best and only way to debug app load timing. Now it's not useful at all.
Keywords: verifyme
QA Contact: jsmith
Depends on: 830240
Verified on 1/17.
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.