Closed Bug 993666 Opened 6 years ago Closed 6 years ago

Home banner message can change while visible to user

Categories

(Firefox for Android :: General, defect)

ARM
Android
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 31
Tracking Status
firefox30 --- fixed
firefox31 --- fixed
fennec 30+ ---

People

(Reporter: Margaret, Assigned: Margaret)

Details

Attachments

(1 file, 1 obsolete file)

ibarlow ran into this while trying out my hub user testing add-on. I wonder if we're calling the HomePager's load method a second time by accident? That's the only place where we make a call to update the home banner.

I haven't been able to reproduce this locally. ibarlow, which version of fennec were you using? And can you consistently reproduce this?
Flags: needinfo?(ibarlow)
This is today's Nightly. 

I seem to be able to reproduce it about 2 out of 3 attempts -- it happens when I do a cold start to the browser.
Flags: needinfo?(ibarlow)
Does this happen without the add-on installed? Like, just regular snippets replacing each other? Also, what device?
Flags: needinfo?(ibarlow)
This is happening on a Nexus 5, and only seems to be happening with your add-on installed, Margaret. When I uninstalled it I couldn't reproduce the issue.
Flags: needinfo?(ibarlow)
If that's the case, I feel like this must be caused by us calling _handleGet more than we intend to. I want to try getting rid of this call and seeing what happens:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/modules/Home.jsm#131

This logic was introduced when the lifecycle of the banner view was different, but now we send a "HomeBanner:Get" message every time the HomePager is loaded, so it doesn't matter if there's a slight delay between an add-on adding a banner message and it showing up in about:home.
I was able to reproduce this on Nightly, but not on Aurora, but it's hard to tell if that is actually meaningful, since there aren't any live snippets on aurora, but I was able to reproduce this on my local build with test snippets.

I want to debug this more to figure out exactly what's going on here.
Aha, I see this happen even without the add-on installed. It looks like we are responding to the "HomeBanner:Get" message after the observer is added, even if the message was sent before the observer was added. So we definitely could get rid of that _handleGet call in HomeBanner.add.

However, this was definitely not always the case. bnicholson, do you know if any of the recent gecko message code changes may have affected this? I'm happy to land the fix I suggested, but I want to know how far I might need to uplift it (and also if other parts of our code are affected by this issue).
Flags: needinfo?(bnicholson)
(In reply to :Margaret Leibovic from comment #6)
> Aha, I see this happen even without the add-on installed. It looks like we
> are responding to the "HomeBanner:Get" message after the observer is added,
> even if the message was sent before the observer was added. So we definitely
> could get rid of that _handleGet call in HomeBanner.add.

Sounds like a message is getting added to our pending event queue in Java before Gecko is up, then once Gecko is loaded and our listeners are ready, the queued message is sent. That's not new behavior though -- we've had a pending event queue since the beginning. I'm not aware of any recent changes that would affect this.

One drawback to removing that _handleGet call is that we'll never show add-on snippets on the initial Home load after a cold start. Not sure how big of an issue that is.
Flags: needinfo?(bnicholson)
(In reply to Brian Nicholson (:bnicholson) from comment #7)
> (In reply to :Margaret Leibovic from comment #6)
> > Aha, I see this happen even without the add-on installed. It looks like we
> > are responding to the "HomeBanner:Get" message after the observer is added,
> > even if the message was sent before the observer was added. So we definitely
> > could get rid of that _handleGet call in HomeBanner.add.
> 
> Sounds like a message is getting added to our pending event queue in Java
> before Gecko is up, then once Gecko is loaded and our listeners are ready,
> the queued message is sent. That's not new behavior though -- we've had a
> pending event queue since the beginning. I'm not aware of any recent changes
> that would affect this.
> 
> One drawback to removing that _handleGet call is that we'll never show
> add-on snippets on the initial Home load after a cold start. Not sure how
> big of an issue that is.

I think this is fine, since it's better than the alternative of replacing a message while it's visible.

However, the bug is caused by the fact that we are indeed handling that "HomeBanner:Get" message after the banner messages are added, so with some test snippets on my default build, I am seeing the snippet message shown on cold start.

But the weird thing is that this is only true of snippet messages, not the sync promo banner or add-on messages. Could this be caused by the fact that the snippet messages are added asynchronously after reading from disk?
I think we should go ahead with this fix, but I want to understand what's going on.
Attachment #8404792 - Flags: review?(bnicholson)
Comment on attachment 8404792 [details] [diff] [review]
Don't call _handleGet when first message is added to banner

Review of attachment 8404792 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to :Margaret Leibovic from comment #8)
> But the weird thing is that this is only true of snippet messages, not the
> sync promo banner or add-on messages. Could this be caused by the fact that
> the snippet messages are added asynchronously after reading from disk?

Yeah, it's a startup race condition -- we're getting the HomeBanner:Get before the async callbacks have completed. But after looking at Snippets.js again, I see that we don't even try to load from the cache until "browser-delayed-startup-finished". That surprised me that we see *any* home banner at all without this _handleGet() call since "browser-delayed-startup-finished" will generally happen after Gecko:Ready, and that means it will fire after we've flushed our pending event queue.

I added some logging to see what was going on:

D/BRN     (22216): HomeBanner#update: sending HomeBanner:Get
E/GeckoConsole(22216): BRN: browser.js#startup: sending Gecko:Ready to Java
D/BRN     (22216): GeckoAppShell#sendPendingEventsToGecko: sending delayed event: HomeBanner:Get
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Accessibility:Settings
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of PanZoom:StateChange
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Session:Restore
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Tab:Load
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Tab:Selected
E/GeckoConsole(22216): BRN: browser.js#startup: notifying browser-delayed-startup-finished
I/Gecko   (22216): BRN: Home.jsm#add: registering HomeBanner observers
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of HomeBanner:Get
I/Gecko   (22216): BRN: Home.jsm#observe: got HomeBanner:Get
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Accessibility:Settings
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Window:Resize
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Intent:GetHandlers:Response
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Accounts:Exist:Response
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of HomeBanner:Shown
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Viewport:Change
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Viewport:Change
I/Gecko   (22216): BRN: nsAppShell.cpp#ProcessNextNativeEvent: notifying observers of Viewport:Change

It looks like the pending HomeBanner:Get event is in fact sent from Java before browser-delayed-startup-finished, but in nsAppShell, we don't actually notify observers of HomeBanner:Get until *after* browser-delayed-startup-finished. This looks like just a lucky coincidence -- notice how "notifying browser-delayed-startup-finished" just happens to kick in right before "notifying observers of HomeBanner:Get". If browser-delayed-startup-finished was only slightly later, we wouldn't be showing any snippets at all until we reloaded the Home fragment. That explains why this bug randomly started showing up -- it's triggered by a very fragile startup path, and something unrelated probably affected the timing of browser-delayed-startup-finished.

So in short, having the _handleGet guarantees that we'll show something, but results in this bug; removing the _handleGet results in the race above where the banner may or may not be shown. 

One option would be conditional _handleGet call: one that fires only if we haven't sent snippets data already. You could register the HomeBanner observers immediately at startup (i.e., without waiting for browser-delayed-startup-finished), and have a boolean to keep track of whether we've received a HomeBanner:Get request that we weren't able to respond to (i.e., _messages is empty). On the conditional _handleGet, send the data only if this flag is true.

Another--and probably better--way to solve this would be to use promises. Create a promise for loading the snippets from cache, a promise for the sync promo, and a promise for HomeBanner:Get. For each resolve, use Promise.all() to see if all of these promises have been resolved. If they have, send the HomeBanner:Data. If they haven't, you know you're still waiting on one of the conditions to complete.
Attachment #8404792 - Flags: review?(bnicholson) → review-
Here's a patch that implements your suggestion to keep track of whether or not there was a "HomeBanner:Get" request that we couldn't fulfill.

I prefer this to the promises approach, since I don't want the home banner code to depend on the snippets code (right now it's just a dumb UI widget).

Theoretically add-ons (or other future modules) could make their own Home.banner.add calls, and that shouldn't have any interaction with snippets code.
Attachment #8404792 - Attachment is obsolete: true
Attachment #8406396 - Flags: review?(bnicholson)
Comment on attachment 8406396 [details] [diff] [review]
Keep track of whether or not to send banner message to Java when a banner is added

Review of attachment 8406396 [details] [diff] [review]:
-----------------------------------------------------------------

OK, this works for me.
Attachment #8406396 - Flags: review?(bnicholson) → review+
https://hg.mozilla.org/integration/fx-team/rev/cd04719806d8

We should uplift this after it bakes on Nightly, since snippets are enabled in fx29.
tracking-fennec: --- → ?
Flags: needinfo?(margaret.leibovic)
https://hg.mozilla.org/mozilla-central/rev/cd04719806d8
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 31
tracking-fennec: ? → 30+
Comment on attachment 8406396 [details] [diff] [review]
Keep track of whether or not to send banner message to Java when a banner is added

[Approval Request Comment]
Bug caused by (feature/regressing bug #): enabling snippets (fx29, but too late for that)
User impact if declined: banner message may change while visible to user
Testing completed (on m-c, etc.): landed on m-c 4/15
Risk to taking this patch (and alternatives if risky): small change to how we update banner messages, baked on nightly for a week
String or IDL/UUID changes made by this patch: none
Attachment #8406396 - Flags: approval-mozilla-aurora?
Flags: needinfo?(margaret.leibovic)
Attachment #8406396 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.