Closed Bug 770047 (black-jellybean) Opened 12 years ago Closed 12 years ago

Quick black screen flash on startup with 4.1 (Jelly Bean)

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(firefox14 affected, firefox15 verified, firefox16 verified, firefox17 verified, fennec+)

VERIFIED FIXED
Firefox 17
Tracking Status
firefox14 --- affected
firefox15 --- verified
firefox16 --- verified
firefox17 --- verified
fennec + ---

People

(Reporter: aaronmt, Assigned: kats)

References

Details

(Whiteboard: [jellybean])

Attachments

(2 files, 1 obsolete file)

On startup of Nightly/Aurora/Beta there is a quick black flash seen immediately after one see's about:home.

Tracking for 15?

--
Samsung Galaxy Nexus (Android 4.1)
Nightly (07/01)
Component: General → Plugins
Alias: black-jellybean
confirmed here. Android 4.1.1, Galaxy Nexus, Nightly 07/12.
Component: Plugins → General
+1 - It only happens on application startup, not when switching to it after it's started up. It happens whether launching with a link or launching the app directly.
Updated title to make this bug easier to find
Summary: Quick black flash on startup with 4.1 → Quick black screen flash on startup with 4.1 (Jelly Bean)
tracking-fennec: ? → +
Assignee: nobody → sriram
(In reply to Chris Lord [:cwiiis] from comment #3)
> +1 - It only happens on application startup, not when switching to it after
> it's started up. It happens whether launching with a link or launching the
> app directly.

Here too obviously, doesn't affect functionality, but *super* ugly.
Nexus S users just got 4.1.1 [1], just got it on my phone. Confirming same issue on my Nexus S.

[1] http://www.androidcentral.com/android-411-jelly-bean-ota-now-appearing-some-nexus-s-variations
Hypothesis:
The background texture is specified in windowBackground -- which is responsible for showing a screen right when the application icon is tapped in launcher.
The setContentView() happens "after" Gecko initialization currently.
So, the windowBackground and the setContentView() arent working in tandem!

One possible solution:
Remove windowBackground attribute.
Loss: Faster showing of the screen when icon is tapped.
This happens well before Gecko starts up.
I know this happens before Gecko starts up because it happens before loadGeckoLibs runs. I can start this in gdb, and the black flash happens before I even get a command prompt, which makes me think it's in Java.

I have also entirely failed to get the Android developer tools in Eclipse up and running so that I can debug this before Firefox starts on my handset. More trying tomorrow...
Assignee: sriram → joe
I gave up on debugging from the first launch in Eclipse, and just put a 30s delay at the beginning of GeckoApp.initialize().

I now know some things it definitely isn't, but I don't know what it is. Sriram had a theory that it's from quickly opening and closing the profile migration dialog, but I commented out that code and it didn't fix anything.

I thought this might have something to do with about:home, so I hooked my Galaxy Nexus up to an Eideticker box (thanks mdas!) and captured the output. Frequently, but not every time, about:home appears immediately after the black flash - i.e., the very next frame is about:home.

At this point, I am running up against the limits of my Android knowledge. I'm happy to learn more and continue debugging this, but I also wouldn't be offended if someone took this bug off my hands.
With absolutely no evidence to back this up, I would try disabling the GfxInfoThread code and see if that helps any. My half-baked theory is that the triple-buffering in JB tries to do something clever if it detects the app create a GL context and that causes some reinit to happen resulting in the black screen.
Actually, scratch that. The flash happens on 14 as well so it can't be because of that.
Some other things it isn't:
 - GfxInfoThread :(
 - Flash
 - Form Assistant popup

I give up :(
Assignee: joe → sriram
Confirming that this black screen is from GL layer initialization.
I did a trace view and the time of flashing was just around GL layer initialization (took some 40 ms or so).
I suspected it and commented out code here: https://hg.mozilla.org/mozilla-central/file/tip/mobile/android/base/GeckoApp.java#l1827 to avoid GL layer initialization. The app started without any screen flickering on JB. (And the app crashed ofcourse :P).
Adding kats...
Commenting out that line kills a lot more than just GL layer initialization, it kills all of gecko startup...
I isolated the flash to this line of code:

https://hg.mozilla.org/mozilla-central/file/399aff2fb74f/mobile/android/base/GeckoApp.java#l1869

Commenting it out makes it go away. I suspect it is GL-related, and android is doing something that causes the flash when we add the SurfaceView to the layout. I can dive through the android code to try to see what's going on, but I suspect the best way to fix it will likely involve moving that line of code so that the flash happens at a less noticeable time (either earlier during startup, or when we navigate to the first non-about:home page)
Attached patch WIP - attempt at a fix (obsolete) — Splinter Review
I tried to fix this by moving the layerview directly into the XML layout, so that hopefully the flash happens way earlier on startup and isn't noticeable. However for some reason this patch is throwing an NPE because the findViewById(R.id.layer_view) is returning null. I have no idea why. Sriram, any thoughts?

Also note that the Log.e I added to GeckoViewsFactory exposes some other errors that should be fixed.
NPE is because of this:
When LayoutInflater inflates, it requires this constructor:
CustomView(Context, AttributeSet);

You have supplied this: return new LayerView(context);

Please change it to take AttributeSet. The AttributeSet passed from GeckoViewsFactory can be null -- which is not a problem. (But I guess, in your case, it's better to pass the attrs you get to the LayerView, and the LayerView can call its super with attrs, so that the layout_params are set properly).
Attached patch PatchSplinter Review
Assignee: sriram → bugmail.mozilla
Attachment #646597 - Attachment is obsolete: true
Attachment #646712 - Flags: review?(sriram)
Comment on attachment 646712 [details] [diff] [review]
Patch

Review of attachment 646712 [details] [diff] [review]:
-----------------------------------------------------------------

This is awesome! :)
I'm loving LayerView :)
Attachment #646712 - Flags: review?(sriram) → review+
I have some doubts:
Why should GeckoApp holds mLayerController? Why can't it hold mLayerView and get the controller from it. After this patch -- LayerController.setView() and LayerView.connect() -- I feel we can restructure it this way.

In GeckoApp.java:
LayerView mLayerView; is stored.

getLayoutController() {
  return mLayerView.getLayoutController();
}

In LayerView.java:
public LayerView() {
  LayerController = new LayerController(this);
}

In LayerController.java: 
public LayerController(LayerView view) {
   setView(view);
}

-----
Basically GeckoApp holds mLayerView instead of mLayerController -- thereby it feels like holding a view than a controller for the view.
On a side note: Do we need those calls to getLayoutController() in other places? Is there a way to avoid it?
Looking into the code again:
1. Most of the places we do: layerController.getView().doSomething(). Instead layerView.doSomething() would be better.
2. getViewportMetrics() can be at LayerView and inturn return mController.getViewportMetrics().
3. convertLayerPointToViewPoint() and its converse can be at View level.

Basically the Controller and Renderer are more like "presentation" of the LayerView and can be known only to the view. The Activity and other dependent Views can get required values from LayerView.
Attached patch WIPSplinter Review
I couldn't stop myself from experimenting this :P
This places LayerController and LayerClient inside LayerView. GeckoApp knows only about LayerView (just like a SurfaceView). And required calls are through LayerView.

GeckoLayerClient initialization happens at LayerView's creation -- not sure if this is right.

There are so many checks for LayerController being null. What's the possibility of this happening -- given that initialization of LayerController happens at creation of LayerView -- which inturn happens during inflation of XML -- which is during onCreate -- which makes me feel LayerController can never be null.
Attachment #646825 - Flags: review?(bugmail.mozilla)
Note: Robocop tests might need to be changed to make sure it's not dependent on getLayerClient().
These are some good ideas but they should be done in a separate bug since they are refactoring that should not be uplifted, whereas this bug will need to be uplifted.

Also I have a long series of changes in the dependency list of bug 776030 that I've been working on which changes a lot of the gfx code. I will try to do this on top of that work, it will be much easier. One of the patches merges LayerController into GeckoLayerClient, for example.
Comment on attachment 646825 [details] [diff] [review]
WIP

Minusing this for now, but I can move it to a new bug and rebase it on top of bug 777351.
Attachment #646825 - Flags: review?(bugmail.mozilla) → review-
Looks good on mozilla-inbound
Comment on attachment 646712 [details] [diff] [review]
Patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): none
User impact if declined: there is an unsightly black flash when starting fennec
Testing completed (on m-c, etc.): on inbound
Risk to taking this patch (and alternatives if risky): mobile only. in theory it shouldn't be too risky by itself, but these types of changes (which change the order in which things are initialized) have a history of exposing more prominently races that already exist in the code. this might result in a higher crash rate, for example. i suggest letting bake on nightly/aurora for a few days before uplifting to beta.
String or UUID changes made by this patch: none
Attachment #646712 - Flags: approval-mozilla-beta?
Attachment #646712 - Flags: approval-mozilla-aurora?
Comment on attachment 646712 [details] [diff] [review]
Patch

[Triage Comment]
Approving for Aurora 16, but I don't think we need to risk any new regressions on Beta (esp. given the fact that 14 was affected).
Attachment #646712 - Flags: approval-mozilla-beta?
Attachment #646712 - Flags: approval-mozilla-beta-
Attachment #646712 - Flags: approval-mozilla-aurora?
Attachment #646712 - Flags: approval-mozilla-aurora+
Giving that there are many jelly bean devices now -- and even Galaxy Nexus S got it! :O -- this is going to affect a lot of users. Why don't we want this for beta? The release is one month away, if I'm right.
(In reply to Sriram Ramasubramanian [:sriram] from comment #35)
> Giving that there are many jelly bean devices now -- and even Galaxy Nexus S
> got it! :O -- this is going to affect a lot of users. Why don't we want this
> for beta? The release is one month away, if I'm right.

It's the tradeoff between a black Flash and the possibility for bad race conditions that possibly aren't caught by the crash reporter. We'd need somebody from product to weigh in on the necessity of this fix.
(this feels cosmetic)
This is the first thing I noticed on upgrading to Jelly Bean and it's super-ugly. It's viscerally bad. 

I don't know what our testing coverage is likely to catch or not catch so it's hard to make a call here. Can we get it into _a_ beta and pull it from the final beta if we're seeing a spike in crashes?

Also, is the risk as limited to Jelly Bean as the reward is?
I think this is not purely cosmetic but is also a perceived-performance issue, since our startup no longer feels instant and seamless.
The change that happened here is that the View in inflated in XML (like all other views) and then accessed in Java -- instead of creating in java and inflating into exisiting view.

The "creating in java" caused the black screen (of not death) in Jelly Bean. Now that it's in XML, we won't have any issues.
I'd like to see this patch in the next beta. That means we can observe Nightly and Aurora for a few days.

Matt is right about this being a perceived-performance issue and that is the _only_ reason I'd consider it for Fx15. I don't care about the super-ugly factor. We would just let it ride the trains if that was it.

Also, the patch is not too risky. Let's just watch Nightly and Aurora for any issues.
(In reply to Mark Finkle (:mfinkle) from comment #41)
> Matt is right about this being a perceived-performance issue and that is the
> _only_ reason I'd consider it for Fx15. I don't care about the super-ugly
> factor. We would just let it ride the trains if that was it.

The perceived-performance issue is the super-ugly factor. If feels (and is) a broken start-up experience. That's just ugly.
https://hg.mozilla.org/mozilla-central/rev/14ed511f2d6b
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 17
(In reply to Mark Finkle (:mfinkle) from comment #41)
> Also, the patch is not too risky. Let's just watch Nightly and Aurora for
> any issues.

OK - risk of startup regressions (without feedback from beta users) is what I'm most worried about. Sounds like this may be worth the risk, and we think we'll catch most new regressions through crash reporter.

We'll leave this nominated for beta and check back before the next build.
Attachment #646712 - Flags: approval-mozilla-beta- → approval-mozilla-beta?
Comment on attachment 646712 [details] [diff] [review]
Patch

Let's get it into beta 4 and see what happens, with time to backout if needed.
Attachment #646712 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Jumping into this thread now, for what it's worth. I'd really like to see this make it into the next beta so there's time to pull it if crash rates spike as a result. Whether it feels cosmetic or not, it's a very bad user experience and taints overall perception of the browser upon startup, not a place we want to be. Jelly Bean has been getting lots of press and upgrades are happening all around, so we want to do what we can to get the best experience we can.
This is approved for mozilla-beta and needs to be landed before EOD Pacific tomorrow to make it into the next beta.  Please land asap.
Keywords: checkin-needed
Sorry for the delay, I was gone for the long weekend. Landed on beta:

https://hg.mozilla.org/releases/mozilla-beta/rev/b40643c774f6
Please note that this patch caused bug 769269 to be un-fixed by exposing an error in the code. Bug 780699 is tracking the regression and I have a patch that seems to work, but may need to be tweaked and tested more.
Status: RESOLVED → VERIFIED
A Talos Ts regression on Beta was noted this week and blamed on this patch: see bug 781907.
Depends on: 781907
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: