Closed Bug 1116087 Opened 9 years ago Closed 9 years ago

[Flame][Email]The Email will have two option bars.

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

ARM
Gonk (Firefox OS)
defect
Not set
major

Tracking

(blocking-b2g:2.2+, b2g-v2.1 unaffected, b2g-v2.2 verified)

VERIFIED FIXED
blocking-b2g 2.2+
Tracking Status
b2g-v2.1 --- unaffected
b2g-v2.2 --- verified

People

(Reporter: liuke, Assigned: asuth)

References

Details

(Keywords: regression)

Attachments

(4 files)

Attached file logcat_1436.txt
[1.Description]:
[Flame][2.2][Email]The Email will have two option bars after you kill the email progress then launch it again.
Found time:14:36
See attachment:1436.mp4 and logcat_1436.txt

[2.Testing Steps]: 
1.Launch Email.
2.Log in account successfully.
3.Long tap Home key to kill the Email progress.
4.Launch Email again.

[3.Expected Result]: 
4.The mail list should display normally, and only have one option bar in one page.

[4.Actual Result]: 
4.The account mail list can't display and have two option bars in page top and bottom.

[5.Reproduction build]: 
Gaia-Rev        3554ea9504046646b4679e3a61317c49fc55ca87
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/67c42c076393
Build-ID        20141228010205
Version         37.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141228.045435
FW-Date         Sun Dec 28 04:54:46 EST 2014
Bootloader      L1TC00011880

[6.Reproduction Frequency]: 
occasionally Recurrence,3/5
TCID: Free Test
Attached video 1436.MP4
Upgrading to a Smoketest blocker.  This prevents the user from using the app properly if it is reopened after an app closure.  This issue persists after a device reboot.

If the user does not have an account set up on the device, this issue will prevent them from adding an account. They are also unable to view or select mail items in the currently selected folder (inbox, sent, etc.)

This issue occurs on both v188-1 and v18D bases.

Device: Flame Unknown
BuildID: 20141229010215
Gaia: bdedbaf9f18a43c091ede770407d68d38582fe29
Gecko: 8850aa0f5332
Gonk: a814b2e2dfdda7140cb3a357617dc4fbb1435e76
Version: 37.0a1 (Unknown)
Firmware: v188-1
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
This flow isn't directly in smoke test reports (closing/opening app).  Waiting on smoketest blocker confirmation.

Requesting a window, issue is a functional regression of a core feature.
blocking-b2g: --- → 2.2?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
QA Contact: jmercado
:asuth, do you think this is a smoketest blocker?
Flags: needinfo?(bugmail)
I need more info.  It's pretty confusing what's going on; the logcat provided only seems to cover the time period after the problem has occurred, presumably resulting in a corrupt cookie cache state (bindContainerHandler is throwing is the line 940 thing).  I'm also unable to correlate the video to the log time-wise.

The notable thing in the video is that when the task switcher is brought up, the email app's screenshot is visibly changing to something very messed-up looking.  This makes me suspect some kind of weird visibilitychange notification or spurious 'transitionend' is resulting in the HTML state becoming inconsistent and persisted in the HTML cookie cache, but that still raises a whole bunch of questions as to what is going on.  If this theory is sound, we would expect that the problem is timing dependent; if the user waits slightly longer to kill the email app, we would expect the UI to fully rebuild and update the cookie cache.

However, comment 2 seems to indicate this is also happening with no accounts configured as well, which is a much simpler situation and should not be happening.  It would suggest some type of platform regression related to custom elements or fundamental changes in cookie handling or something like that.

So info-wise, it'd be very useful to have a regression range for this.  Assuming it's not just a platform issue, it would be extremely useful to have:
- Steps to reproduce from the triggering of the email app during that the time that the corruption occurred.  (For example, the video starts with the email app already open but apparently looking fine.  I would want to know how the email app was started to cause it to be in the foreground and all steps taken.)
- A logcat gathered starting from the start of opening the email app during the time that the corruption occurred.  This means running logcat streaming to a file/disk before starting the testing.
- An indication of the current time when the error was reproduced to make it easier to find things in the log.
- How many accounts are defined and whether any of them had notifications in the system tray.  (I guess I would have expected to see them on the lock screen, but I'm not 100% clear on the current UX).
Flags: needinfo?(bugmail)
Note that even a coarse regression window is fine, especially if it's accompanied by a logcat.  A video isn't particularly required, but if it's possible to note if the weird task switcher visual corruption occurs, that's super useful to know.
Seems to have been caused by Bug 1113811.

Mozilla-inbound Regression Window

Last Working 
Environmental Variables:
Device: Flame 2.2
BuildID: 20141222170428
Gaia: c2da2bafd4e809317e2ca70c9bf5c11136a32818
Gecko: 2b933356daf7
Version: 37.0a1 (2.2) 
Firmware Version: v188-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

First Broken 
Environmental Variables:
Device: Flame 2.2
BuildID: 20141222210536
Gaia: c2da2bafd4e809317e2ca70c9bf5c11136a32818
Gecko: ad0bb596da2c
Version: 37.0a1 (2.2) 
Firmware Version: v188-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

Last Working gaia / First Broken gecko - Issue DOES occur
Gaia: c2da2bafd4e809317e2ca70c9bf5c11136a32818
Gecko: ad0bb596da2c

First Broken gaia / Last Working gecko - Issue does NOT occur
Gaia: c2da2bafd4e809317e2ca70c9bf5c11136a32818
Gecko: 2b933356daf7

Gecko Pushlog: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=2b933356daf7&tochange=ad0bb596da2c
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
NI on Daniel, can you take a look please?
Blocks: 1113811
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga) → needinfo?(dholbert)
I have only one cset in that range, which was removing an unneeded null-check, whose only conceivable effect on behavior could be to cause crashes on null-deref. (in the unlikely event the null-check was actually needed.)

So, I'll claim innocence here. :) I'd place my bets on wchen's commit at the very bottom of the range about "cloneNode on a custom element" (bug 1081039), since I know gaia apps use custom elements, and this bug seems to be about some cloned content.

(Could conceivably also be fallout from one of botond's build-warning-fixes in that range, but that seems less likely, since build-warning-fixes are usually not expected to impact behavior much.)
Flags: needinfo?(dholbert) → needinfo?(wchen)
Blocks: 1081039
No longer blocks: 1113811
Ah, yeah, this is almost certainly the cloneNode thing.  The HTML cookie cache thing (which is a hack; there's no pretending otherwise!) wants cloneNode to be producing an inert node (like in a data document) when it is calling cloneNode on a custom element (ack!) so that it can slice and dice things down before it eventually uses outerHTML.  (Noting that the custom element is very intentionally not using the Shadow DOM; the semantics on that would be nuts.)

It seems like the platform is in the right here.  The email app should probably be using someDataDocument.importNode(cloneThisNode, deep=true) where someDataDocument lacks the custom element registration.  Ideally the semantics and implementation are such that this is a more efficient version of someDataDocument.someContainer.innerHTML = cloneThisNode.outerHTML.  Or maybe we just need to use outerHTML directly.  I'll take a look.

I'm going to leave the needinfo on :wchen for now in case any helpful tips are available.  (Noting that startup time constraints requires us to either use the evil cookie cache or an evil localStorage cache.)
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Bug 1081039 fixed a custom element bug where if you cloned or imported a node, you wouldn't get a created callback for the newly created node. Now we do in order to match the spec. Web components provides the template element (<template>) to store inert elements. The template content (HTMLTemplateElement.content) is a document fragment that is owned by a different document (without an empty custom elements registry) so things living inside behave as if inert. Depending on what you're doing, it may make sense to slice and dice the custom element within a template element before importing it into the main document.
Flags: needinfo?(wchen)
Thanks for the tips!  Along with http://w3c.github.io/webcomponents/spec/custom/#creating-and-passing-registries and some other spec reading I've managed to get things happy with a patch that looks like so (but with more comments explaining what is going on):

-      var cacheNode = this.cloneNode(true);
+      var templateNode = document.createElement('template');
+      // content is a DocumentFragment, we need its ownerDocument.
+      var cacheDoc = templateNode.content.ownerDocument;
+      var cacheNode = cacheDoc.importNode(this, true); // yes, deep

I'll centralize the anti-footgun logic and have our existing caching logic use it and touch the code that impacts the cookie cache hash and have a fix up very shortly.
Fix.  :jrburke is on PTO right now so using the productivity devs who are currently arround.
Attachment #8542759 - Flags: review?(mmedeiros)
Attachment #8542759 - Flags: review?(gaye)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][COM=Gaia::Email]
Comment on attachment 8542759 [details] [review]
Use importNode with a template's content document's empty custom element registry

Stealing review, just happened to wander through, this was a fascinating bug.

The changes make sense. Lots of edges still to learn on the custom elements. Probably good our usage is light so far.
Attachment #8542759 - Flags: review?(mmedeiros)
Attachment #8542759 - Flags: review?(gaye)
Attachment #8542759 - Flags: review+
landed on gaia/master:
https://github.com/mozilla-b2g/gaia/pull/27069
https://github.com/mozilla-b2g/gaia/commit/f90f7146c6b71bc9b5ae96fb3f97c1c56a81517f

Many thanks to all involved in helping with this (regression window identification, suspicious commit identification, helpful tips, etc.)!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
This bug will still reproduce 10/10 times on today's build

When signing into an email account, then force closing the email app, and reopening it, the app will show 1 header at the top, and 1 header at the bottom. The user must factory reset to recover from this. Screenshot added showing results

Device: Flame 2.2
BuildID: 20141231010205
Gaia: 26d479f0fccb7174e06255121e4e938c1b280d67
Gecko: 88037f94b7d7
Gonk: a814b2e2dfdda7140cb3a357617dc4fbb1435e76
Version: 37.0a1 (2.2)
Firmware: V18D
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [QAnalyst-Triage+][COM=Gaia::Email] → [QAnalyst-Triage+][COM=Gaia::Email][failed-verification]
(In reply to Derek Harris [:DerekH] from comment #17)
> Device: Flame 2.2
> BuildID: 20141231010205
> Gaia: 26d479f0fccb7174e06255121e4e938c1b280d67

This build does not include the fix; commit f90f7146c6b71bc9b5ae96fb3f97c1c56a81517f with the fix has 3 more merges in it (including mine).  It's possible the bumper bot fell behind or something.
QA Whiteboard: [QAnalyst-Triage+][COM=Gaia::Email][failed-verification] → [QAnalyst-Triage+][COM=Gaia::Email]
The full time sheriffs are basically all on pto. Almost certainly the issue is that b2g-inbound wasn't merged to m-c in time for today's nightlies.
(In reply to [Away 24-Dec to 4-Jan] Ryan VanderMeulen [:RyanVM UTC-5] from comment #19)
> The full time sheriffs are basically all on pto. Almost certainly the issue
> is that b2g-inbound wasn't merged to m-c in time for today's nightlies.

This is now merged, so the next Nightlies should include it.
Adding qawanted to check to see if we can test the patch in tinderbox builds or when the next nightlies get spun.
Keywords: qawanted
I can confirm that this has been fixed with the latest nightly OTA build :)
Per comment 24.
Status: RESOLVED → VERIFIED
(In reply to Peter Bylenga [:PBylenga] from comment #21)
> Adding qawanted to check to see if we can test the patch in tinderbox builds
> or when the next nightlies get spun.

Confirmed that this issue is no longer occurring on today's build. Email app can now be properly re-launched after it has been killed.

Tested on:
Device: Flame 2.2 Master
BuildID: 20150105033341
Gaia: 4ceeff19086b2a2955f044ad923dcfa63a293de3
Gecko: 912036eeb024
Version: 37.0a1 (2.2 Master)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [QAnalyst-Triage+][COM=Gaia::Email] → [QAnalyst-Triage?][COM=Gaia::Email]
Flags: needinfo?(pbylenga)
Keywords: qawanted
blocking-b2g: 2.2? → 2.2+
Flags: needinfo?(pbylenga) → needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?][COM=Gaia::Email] → [QAnalyst-Triage+][COM=Gaia::Email]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: