[email] Weird re-entrant function case leading to jumbled message list

NEW
Unassigned

Status

()

4 years ago
4 years ago

People

(Reporter: jrburke, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Reporter)

Description

4 years ago
Created attachment 8568273 [details]
message-list-jumble.png

Steps to reproduce:

* Set up email account, make sure email app is closed.
* In Contacts app, select an contact with an email address, and tap on the email address to launch the Email app in Compose view.
* Once the name of the contact shows up in the Compose form, press the Back button.

The device should transition back to the Contacts app. Every so often, the Email app will have a failure during this transition, perhaps 25%-50% of the time.

When you switch back to the still running Email app via the app switcher, you might see all the messages in the message list overlaid on top of each other. See the attached screenshot.

The adb logcat will contain something like:

I/GeckoDump( 2982): LOG: compose: back: save needed, prompting
I/GeckoDump( 2982): LOG: compose: explicit draft discard on exit
I/GeckoDump( 2982): LOG: pushCard for type: message_list
I/Gecko   ( 3088): SystemMessageCache: init
I/Gecko   ( 2982): WLOG: [slog] mailslice:mutex-released {"folderId":"0/0"}
I/GeckoDump( 2982): ERR: onerror reporting: NotFoundError: Node was not found @ app://email.gaiamobile.org/js/mail_app.js : 1857
W/E-Mail  ( 2982): [JavaScript Error: "NotFoundError: Node was not found" {file: "app://email.gaiamobile.org/js/mail_app.js" line: 1857}]
I/GeckoDump( 2982): LOG: VSCROLL scrollTop: 37, RECALCULATE: 0, NaN
I/GeckoDump( 2982): LOG: message_list complete: 15 items of 23 alleged known headers. canGrow: false
I/GeckoDump( 2982): LOG: message_list loadNextChunk growing 8 to 23 items out of 23 alleged known
I/Gecko   ( 2982): WLOG: [slog] mailslice:mutex-released {"folderId":"0/0"}
I/GeckoDump( 2982): LOG: message_list complete: 23 items of 23 alleged known headers. canGrow: true

The strange thing, doing a deeper trace, vscroll._init is called twice, and it leads to this problem.

Call 1 gets as far as the `this.itemHeight = node.clientHeight;` line. The clientHeight triggers sync reflow, which apparently does some sync event notifications because vscroll's `nowVisible` is called, and init() is entered a second time. The first call is still paused on the clientHeight line, but the second one runs to completion.

After that though, the first call continues to run and generates the error.

nowVisible is triggered by this chain:

* cards _onTransitionEnd
* cards _onCardVisible
* message_list onCardVisible
* vscroll nowVisible
* vscroll _init

So the .clientHeight seems to trigger the _onTransitionEnd, which is triggered when the cards are transitioning back to showing the newly created message_list.

I will attach an email build that has more console.logs that log out the problem.

I have a patch on a branch that I will pull out as a standalone fix. It is a bit crazy that it is needed though, as it makes me paranoid about any sort of init setup that only needs to be done once, but could be called multiple times.

So this might be appropriate to fix in the platform. Triggering a transitionend as part of a sync clientHeight call seems undesirable because of this side effect, but perhaps it is required by other platform features.
(Reporter)

Comment 1

4 years ago
Created attachment 8568363 [details] [diff] [review]
log.patch

This is the patch that I applied to gaia master just to log the issue. Also, I ran this on a flame device with 1GB memory configured, to make sure both Contacts and Email would continue to run.

Log results when I run with this patch. The VSCROLL INIT4 comes right before the clientHeight call. The 1 or 2 at the end of the VSCROLL line indicates which init call is logging, the first or second one:

I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT: 1
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT2: 1
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT3: 1
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT4: 1
I/GeckoDump(10493): LOG: @@@ CARDS _onTransitionEnd
I/GeckoDump(10493): LOG: @@@ CARDS _onTransitionEnd
I/GeckoDump(10493): LOG: @@@ CARDS about to call _onCardVisible
I/GeckoDump(10493): LOG: @@@ CARDS _onCardVisible
I/GeckoDump(10493): LOG: @@@ MESSAGE_LIST onCardVisible
I/GeckoDump(10493): LOG: @@@ VSCROLL onCardVisible
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT2: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT3: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT4: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT5: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT6: 2
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT7: 2
I/GeckoDump(10493): LOG: @@@ VSCROLL INIT8 2 OK: 80, 444.33331298828125
I/GeckoDump(10493): LOG: @@@ IN VSCROLL INIT5: 1
I/GeckoDump(10493): LOG: VSCROLL scrollTop: 0, RECALCULATE: 0, NaN
I/GeckoDump(10493): LOG: message_list complete: 15 items of 30 alleged known headers. canGrow: false
I/GeckoDump(10493): LOG: message_list loadNextChunk growing 15 to 30 items out of 30 alleged known
I/GeckoDump(10493): ERR: onerror reporting: NotFoundError: Node was not found @ app://email.gaiamobile.org/js/mail_app.js : 6393
W/E-Mail  (10493): [JavaScript Error: "NotFoundError: Node was not found" {file: "app://email.gaiamobile.org/js/mail_app.js" line: 6393}]
(Reporter)

Comment 2

4 years ago
Created attachment 8568364 [details]
email-1135960.zip

This is a zip of the email app with the logging patch applied, which can be used to demonstrate the error with a bit more logging in it.
(Reporter)

Comment 3

4 years ago
Asking Core Layout if this is expected behavior for a clientHeight/transitionend combo. I hope not as it breaks some fundamental assumptions we had in email. Hopefully the transitionend can be triggered async.
Component: Gaia::E-Mail → Layout
Product: Firefox OS → Core
David, is dispatching transitionend synchronously when flushing style like this the expected behaviour?
Flags: needinfo?(dbaron)
The spec is pretty vague but sounds a little bit like events are dispatched synchronously.
http://dev.w3.org/csswg/css-transitions/#transition-events

For transitions we appear to do the dispatch synchronously:

https://dxr.mozilla.org/mozilla-central/source/layout/style/nsTransitionManager.cpp#763

But for SMIL we do this asynchronously because we ran into these kind of issues in the past:

https://dxr.mozilla.org/mozilla-central/source/dom/smil/nsSMILTimedElement.cpp#107
I think it is the expected behavior, though it's worth double-checking the behavior of other browsers.
Flags: needinfo?(dbaron)
If it is expected, is it something we can address in a revision to a current spec or in a successor spec?  Having a JS property-read trigger synchronous event dispatch would seem to violate at least the spirit of run-to-completion semantics for JS, even if we are intentionally poking layout with a stick.  (And it would suck if the end result was that everyone mediated all DOM access through react.js or things like https://github.com/wilsonpage/fastdom because to do otherwise was to court frustration.)
Note that HTML also seems to suggest these events are dispatched synchronously under the "update the rendering" steps--although that probably doesn't apply to flushing layout from querying clientTop.

  https://html.spec.whatwg.org/multipage/webappapis.html#processing-model-9

(And dbaron and I have both indicated that the steps there don't really correspond to what we do and probably should be revised.)
(Reporter)

Comment 9

4 years ago
Created bug 1141362 to uplift a certain mitigation for b2g 2.2. However, I believe this behavior will cause subtle bugs in general user expectation around reading clientHeight and ontransitionend. Bug 1141362 fixes a specific case that might be visible to QA, but I expect other issues lurking because of this behavior.
See Also: → bug 1141362
This behavior does not happen anymore. This has been thoroughly tested in PVT builds for the last 2 days.
You need to log in before you can comment on or make changes to this bug.