Closed Bug 1206156 Opened 9 years ago Closed 9 years ago

[email] Make requestGrowth not throw on misuse to avoid breaking vscroll state machine

Tracking

(blocking-b2g:2.5+)

Status:

RESOLVED FIXED

Project Flags:

blocking-b2g

2.5+

People

(Reporter: wilsonpage, Assigned: asuth)

Details

(Keywords: foxfood, Whiteboard: [bzlite])

Attachments

(6 files, 1 obsolete file)

system-b2g-application.ini 9 years ago Wilson Page [:wilsonpage] 602 bytes, application/octet-stream		Details
dev-log-main.log 9 years ago Wilson Page [:wilsonpage] 274.28 KB, application/octet-stream		Details
properties.log 9 years ago Wilson Page [:wilsonpage] 9.96 KB, application/octet-stream		Details
screenshot.png 9 years ago Wilson Page [:wilsonpage] 37.55 KB, image/png		Details
Empt_Inbox_Logcat.txt 9 years ago Jayme Mercado [:JMercado] (Inactive 3/4/2016) 91.86 KB, text/plain		Details
[gaia-email-libs-and-more] asutherland:email-no-grow-throw > mozilla-b2g:master 9 years ago GitHub Autolander Bot 64 bytes, text/x-github-pull-request	jrburke : review+	Details \| Review
[gaia] asutherland:email-no-grow-throw > mozilla-b2g:master 9 years ago GitHub Autolander Bot 46 bytes, text/x-github-pull-request		Details \| Review

Wilson Page [:wilsonpage]

Reporter

Description

•

9 years ago

User-Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0

Not sure how to reproduce. I got.into this state after I tapped on an email notification and then pressed the back button.

Wilson Page [:wilsonpage]

Reporter

Comment 1

•

9 years ago

Attached file system-b2g-application.ini — Details

Wilson Page [:wilsonpage]

Reporter

Comment 2

•

9 years ago

Attached file dev-log-main.log — Details

Wilson Page [:wilsonpage]

Reporter

Comment 3

•

9 years ago

Attached file properties.log — Details

Wilson Page [:wilsonpage]

Reporter

Comment 4

•

9 years ago

Attached image screenshot.png — Details

Mahendra Potharaju [:mahe]

Comment 5

•

9 years ago

Blocking 2.5 for this defect. P1 priority.

blocking-b2g: --- → 2.5+

QA Whiteboard: [foxfood-triage]

Component: Gaia::Feedback → Gaia::E-Mail

Priority: -- → P1

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 6

•

9 years ago

I suspect this bug may be moot. This bug is reported against build 20150812231434 which definitely does not have the custom elements fix of bug 1176829 which could potentially be involved. However, it seems like our failure is in message_list which was already on the card-stack, so maybe it's not that bug. But then the refactoring changes on bug 1196077 landed after this build and may have corrected any state machine failures.

So I think we want a reproduction on a current build to identify that this still happens and get a logcat of what's going on. If we can't reproduce, we resolve WFM.

From the log Wilson provided (thank you!), I think these are basically the steps to reproduce:

1) Have the email app configured with a single IMAP account with a small number of messages in the inbox. In this case, there were only seven messages in the inbox. (I believe this is significant to the state machine and the "grow" errors generated that may have contributed to things breaking.)

2) Have the email app doing periodic sync. Using the secret debug mode (https://wiki.mozilla.org/Gaia/Email/SecretDebugMode#Getting_to_the_Secret_Debug_Menu) an option can be added to the menu to set the sync interval to 2 minutes/120 seconds to make it go faster. Alternately, the https://github.com/jrburke/requestsyncmanager app can be sideloaded to manipulate things.

3) Have the email app active in the background with the inbox displayed (probably). (So, do not close the email app, and do not test on a 319M device where the email app will almost certainly be killed. Test on a 1gig device or higher.) Basically, just hit the home button.

4) Send a single email to the account so that when email performs periodic sync, it will generate a notification that you can tap on and that will bring you directly to the message reader.

5) When the notification triggers and the message reader displays, tap the back button.

6) If the message list is displaying the "loading" fake-messages whose contents are just bars, like in the screenshot, then the bug has reproduced.

Keywords: qawanted

Comment hidden (obsolete)

New Inbox blank Repro steps.

Description:
Inbox with several unread email messages appears blank after navigating to sent, drafts, outbox,..etc, then back to Inbox. 


Repro Steps:
1) Update an Aries to 20150921142023
2) Setup an email account, such as Hotmail,.. sync up latest messages. 
3) Send email to the Hotmail account to the phone from desktop or other device. 
4) Let email account on phone default to background by pressing home button. 

NOTE: email should be set at default to update periodically ( 15 min. ) and should update immediatley after sync is manually implemented. 
 
Actual:
Inbox continues to state, "Loading Messages" after 29 minutes of syncing


Expected:
Inbox updates with latest message after navigating to sent, outbox, etc,..



Environmental Variables:
Device: Aries 2.5
Build ID: 20150921142023
Gaia: 2d370fa35c1a0ee2a637e3772c0843586a5f96c9
Gecko: fcef8ded82219c89298b4e376cfbdfba79a1d35a
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5)
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0


User Impact: 

Repro frequency: 100 %

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 8

•

9 years ago

Do you have logcat logs you can provide from your reproduction?  (Also, you may be reproducing a different bug, especially since ActiveSync is involved rather than IMAP; your problem sounds like a hung sync, possibly involving mutex loss.  Wilson's logs indicated no hangs or mutex losses.)

Flags: needinfo?(sleedavid)

Jayme Mercado [:JMercado] (Inactive 3/4/2016)

Comment 9

•

9 years ago

Setting qawanted to get the log requested in comment 8 using the steps from comment 7.

QA Whiteboard: [foxfood-triage?] → [foxfood-triage+]

Flags: needinfo?(sleedavid)

Flags: needinfo?(jmercado)

Keywords: qawanted

Jayme Mercado [:JMercado] (Inactive 3/4/2016)

Updated

•

9 years ago

QA Whiteboard: [foxfood-triage+] → [foxfood-triage?]

Jayme Mercado [:JMercado] (Inactive 3/4/2016)

Updated

•

9 years ago

QA Whiteboard: [foxfood-triage?] → [foxfood-triage]

Jayme Mercado [:JMercado] (Inactive 3/4/2016)

Comment 10

•

9 years ago

Attached file Empt_Inbox_Logcat.txt (obsolete) — Details

Jayme Mercado [:JMercado] (Inactive 3/4/2016)

Comment 11

•

9 years ago

I was able to reproduce an empty inbox following steps in comment 7.  Hopefully this log will help.

QA Whiteboard: [foxfood-triage] → [foxfood-triage][QAnalyst-Triage?]

Flags: needinfo?(ktucker)

Keywords: qawanted

Andrew Sutherland [:asuth] (he/him)

Assignee

Updated

•

9 years ago

Attachment #8663942 - Attachment is obsolete: true

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 12

•

9 years ago

Unfortunately that log was fairly boring.  It looks like the "Diag_Lib" logger is filling up the logcat or something; there was less than 20 seconds of data in the logcat.  And although that included a fresh email run, nothing interesting happened there because email chose to not automatically refresh because it had synced sufficiently recently.

To avoid confusion about the STRs, I'm marking comment 7 obsolete since it probably is a different bug (maybe just device id exhaustion that will clear itself tomorrow?).

Anywho, this is a tricky one and it can be handy for me to be able to inspect state when it reproduces, so I'll take over reproducing this for now.

Assignee: nobody → bugmail

Status: NEW → ASSIGNED

Flags: needinfo?(bugmail)

KTucker [:KTucker][Inactive 3/4/2016]

Updated

•

9 years ago

QA Whiteboard: [foxfood-triage][QAnalyst-Triage?] → [foxfood-triage][QAnalyst-Triage+]

Flags: needinfo?(ktucker)

Mahendra Potharaju [:mahe]

Comment 13

•

9 years ago

Hey Andrew, 

Any luck reproducing this and getting a fix? Its a bad P1 that's a must fix for 2.5.

Thank you!

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 14

•

9 years ago

I've been trying to reproduce in the background while dogfooding, but so many key features of the aries builds have been completely unreliable (can't hit "ok" after entering a wifi password, already-configured wifi doesn't reconnect, the requestsync API gives up on doing its job after a few cycles, etc.), it petered out pretty quick.  I'm spinning a build now so I can investigate why requestsync is unreliable (which is arguably the bigger problem) and will actively work that and then this.

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 15

•

9 years ago

I've re-created the triggering scenario (as I understand it) several times and a few variations on it and am unable to reproduce.

Since that's not entirely reassuring on its own, I also did some code inspection of the code as it existed for comment 0. I found that in general that things should be self-healing; barring permanent mutex-lossage (which the log did not indicate), attempting to scroll the pane should result in new requests being issued and things like the "scrollStopped" event to fire (even in the face of a jumpy clock). Having said that, I think there are two ingredients to the breakage:
1) Something was spamming the messages_complete event handler. I'm thinking the model refactor as part of bug 1196077 probably made this stop happening.
2) The "Error: Already growing in 7 dir." is a side-effect of that spamming of messages_complete. There's a call to loadNextChunk if `this.desiredHighAbsoluteIndex` in there which calls that can-explode requestGrowth method. If that call explodes, `this.desiredHighAbsoluteIndex` will not be cleared and all subsequent calls will keep failing to get down to the call to `this.vScroll.updateDataBind` which could explain the failure to update the DOM. If the timing works out right, we can end up never making it to that call. It is likely that changing folders away and back would have "healed" the problem.

As noted, I think point 1 is already addressed, but fixing point 2 makes point 1 moot. The `throw` in requestGrowth is one of those misuse assertions that made sense in unit tests but results in issues like this and where we have now more properly adopted use of `logic.fail()`. I'll attach a patch momentarily that does this. I believe that this will alleviate the problem if the root cause (point 1) recurs.

Andrew Sutherland [:asuth] (he/him)

Assignee

Updated

•

9 years ago

Flags: needinfo?(bugmail)

Summary: [email] Inbox blank → [email] Make requestGrowth not throw on misuse to avoid breaking vscroll state machine

GitHub Autolander Bot

Comment 16

•

9 years ago

Attached file [gaia-email-libs-and-more] asutherland:email-no-grow-throw > mozilla-b2g:master — Details

Andrew Sutherland [:asuth] (he/him)

Assignee

Updated

•

9 years ago

Attachment #8672156 - Flags: review?(jrburke)

James Burke [:jrburke]

Updated

•

9 years ago

Attachment #8672156 - Flags: review?(jrburke) → review+

GitHub Autolander Bot

Comment 17

•

9 years ago

Attached file [gaia] asutherland:email-no-grow-throw > mozilla-b2g:master — Details

Andrew Sutherland [:asuth] (he/him)

Assignee

Comment 18

•

9 years ago

landed:
https://github.com/mozilla-b2g/gaia/pull/32424
https://github.com/mozilla-b2g/gaia/commit/6c7548db65047541e24b1500c53e16041509d435

https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/395
https://github.com/mozilla-b2g/gaia-email-libs-and-more/commit/43d482e288d70f051a9eb69d07a614d1767edccf

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.