Closed Bug 972514 Opened 7 years ago Closed 7 years ago

[B2G][Email] ActiveSync message state changes from server not synchronized error "Failed to parse a message: Error: Bad body type: undefined" thrown

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:1.3+, b2g-v1.2 unaffected, b2g-v1.3 fixed, b2g-v1.3T fixed, b2g-v1.4 verified)

VERIFIED FIXED
1.4 S2 (28feb)
blocking-b2g 1.3+
Tracking Status
b2g-v1.2 --- unaffected
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- verified

People

(Reporter: mclemmons, Assigned: asuth)

Details

(Keywords: regression, verifyme, Whiteboard: [p=2][burirun1.3-3])

Attachments

(4 files)

Having an ActiveSync account established on device using Hotmail, when user flags or removes the flag indicator on web interface and syncs on the device, the user selection is not reflected on device regardless of wait time.

Repro Steps:
1) Updated Buri to BuildID: 20140212004003
2) Have ActiveSync account established on device 
3) From web interface, flag one or more emails
4) On device, select the sync icon 

Actual:
Device syncs but no visible changes to emails (no flag indicator changes)

Expected:
Device syncs and appropriate change(s) displays

Environmental Variables:
Device: Buri 1.3 MOZ
BuildID: 20140212004003
Gaia: ce17d5eae7b1893ae4397c814b10ae598fcbdb58
Gecko: ab07e61c2eb0
Version: 28.0

1. Repro frequency: (10/10, 100%) 
2. Link to failed test case: https://moztrap.mozilla.org/manage/case/4352/
3. See attached: (logcat) 
4. The type of device in use - Buri
5. The "Build Identifier" of the build you are using - BuildID: 20140212004003
6. Confirm the net connection is working - confirmed
7. The e-mail domain you used/are using to create the account – hotmail.com
8. If this is not a problem creating the account, the account type the e-mail client thinks it is using – ActiveSync
9. A log of what was happening around the time the problem happened - Attached
Update to Comment 0

Environmental Variables:
Device: Buri 1.3 MOZ
BuildID: 20140212004003
Gaia: ce17d5eae7b1893ae4397c814b10ae598fcbdb58
Gecko: ab07e61c2eb0
Version: 28.0
Base Image: v1.2-device.cfg
This issue does not reproduce on Buri 1.2. When user flags one or more emails on web interface and syncs on device, the user changes reflect immediately on device. 


Environmental Variables:
Device: Buri 1.2 MOZ
BuildID: 20140212004000
Gaia: 539a25e1887b902b8b25038c547048e691bd97f6
Gecko: 4fd893896d02
Version: 26.0
Based on the log (thanks!) one or two things are going on:

- Bug 825538 about multi-device interaction looks like a given thanks to the "bad sync key" error.  The empty response stuff leads to all kinds of sadness.  This could be where the flag notification is disappearing into a black hole.

- We're exploding when consuming the body.  This is super very bad, and could also be how we're managing to lose some data, although maybe not.

Relevant log excerpt:

WWAR: ActiveSync had a bad sync key
WLOG: Sync completed: added 13, changed 0, deleted 0
WLOG: Sync Completed! 13 messages synced
WLOG: queueOp 2 downloadBodies
WLOG: runOp(do: {"type":"downloadBodies","longtermId":"2/A","lifecycle":"do","localStatus":"done","serverStatus":"doing","tryCount":0,"humanOp":"downloadBodies","messages":[{"s)
WLOG: runOp_end(do: {"type":"downloadBodies","longtermId":"2/A","lifecycle":"do","localStatus":"done","serverStatus":"doing","tryCount":0,"humanOp":"downloadBodies","messages":[{"s)
WLOG: Sync completed with empty response
WLOG: Sync Completed! 0 messages synced

WERR: Failed to parse a message: Error: Bad body type: undefined 
 makeBodyPart@app://email.gaiamobile.org/js/ext/mailapi/worker-bootstrap.js:9522
asfc__parseMessage@app://email.gaiamobile.org/js/ext/mailapi/activesync/configurator.js:1343
asfc__enumerateFolderChanges/</<@app://email.gaiamobile.org/js/ext/mailapi/activesync/configurator.js:1040
EventParser.prototype.run@app://email.gaiamobile.org/js/ext/mailapi/activesync/protocollayer.js:1290
asfc__enumerateFolderChanges/<@app://email.gaiamobile.org/js/ext/mailapi/activesync/configurator.js:1074
Connection.prototype.postData/xhr.onload@app://email.gaiamobile.org/js/ext/mailapi/activesync/protocollayer.js:3179

WLOG: Sync completed: added 0, changed 0, deleted 0
WLOG: Sync Completed! 0 messages synced
WLOG: Sync completed with empty response
WLOG: Sync Completed! 0 messages synced
WLOG: Sync completed with empty response
WLOG: Sync Completed! 0 messages synced
WLOG: Sync completed with empty response
WLOG: Sync Completed! 0 messages synced
WLOG: Sync completed with empty response
WLOG: Sync Completed! 0 messages synced
Sounds like flagging is busted.
blocking-b2g: --- → 1.3?
(In reply to Jason Smith [:jsmith] from comment #4)
> Sounds like flagging is busted.

Specifically - syncing a flagged an email is busted.
Elaborating on comment 3, I think this is just a side effect of one or more testers using the same ActiveSync account credentials across multiple devices.  Sync fundamentally breaks in that case, and that's tracked as bug 825538.  The second error which is a body failure thing is a different bug, although possibly also fall out from the sync problems.
(In reply to Andrew Sutherland (:asuth) from comment #6)
> Elaborating on comment 3, I think this is just a side effect of one or more
> testers using the same ActiveSync account credentials across multiple
> devices.  Sync fundamentally breaks in that case, and that's tracked as bug
> 825538.  The second error which is a body failure thing is a different bug,
> although possibly also fall out from the sync problems.

Do we know what specifically is being called out as the regression here then? Is it potentially possible that the account used to test 1.2 is different than 1.3's account, which is why this doesn't reproduce on 1.2?
(In reply to Jason Smith [:jsmith] from comment #7)
> Do we know what specifically is being called out as the regression here
> then? Is it potentially possible that the account used to test 1.2 is
> different than 1.3's account, which is why this doesn't reproduce on 1.2?

Sure, that could be why.  Or if the 1.2 device was a physically different device and it had periodic sync on, it could have been the source of the repro for the 1.3 device.

Alternately, the body sync explosion, which probably is a new 1.3 regression, could be involved.  I wouldn't expect things to line up there, but that's something where I'd want to see logcats from multiple runs rather than one run.  But really we should just fix bug 825538 and the body explosion bug and then we can just make sure the problem goes away.
Okay - let's gather more data then.

QA Wanted - see above. Was the same account used to test 1.2/1.3 previously used on other devices?
blocking-b2g: 1.3? → ---
I tested this with a brand new account on a Buri 1.3 and Buri 1.2, flags set on a web interface would not appear on the Buri 1.3 after selecting the sync icon. On Buri 1.2 flags do appear.

Environmental Variables
Device: Buri 1.3 Mozilla
Build ID: 20140218004003
Gecko: https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/b5afe0b10e93
Gaia: df60e9b49207d64da5647ab15760c122adabfba4
Platform Version: 28.0
Firmware Version: v1.2-device.cfg

Environmental Variables
Device: Buri 1.2 Mozilla
Build ID: 20140220004002
Gecko: https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/2ea6a65eea23
Gaia: 539a25e1887b902b8b25038c547048e691bd97f6
Platform Version: 26.0
Firmware Version: v1.2-device.cfg
Assignee: nobody → jharvey
Keywords: qawanted
Was there new accounts used for 1.2 & 1.3 separately? Or did you test this with the same account across 1.2 & 1.3?
Flags: needinfo?(jharvey)
Separate brand new accounts were used for 1.3 and 1.2
Flags: needinfo?(jharvey)
In the absence of logcats from those runs, it sounds like this is likely caused by the bad body type error then.  I'll duplicate in a back-end test and then fix.
Assignee: jharvey → bugmail
Status: NEW → ASSIGNED
blocking-b2g: --- → 1.3?
Attached file GELAM test and fix
This pull request duplicates the failure in a test and has a fix.  I refactored the existing "stuff happened on the server somehow" (webmail/other client/etc.) case from IMAP so that we can also run it for ActiveSync as well.  Our ActiveSync test coverage gaps bit us here and this goes a long way to addressing those.

Good news/bad news is that this revealed a bug in our slice batching fix landed on bug 862870.  Specifically, our strategy of applying updates immediately is no good when we're keying off of an index offset that won't be valid until after our deferred splices are applied.  If we were keying off the suid or something like that, we'd be okay.  Or at least this is the operating theory based on the behaviour in the test that indicates state corruption in the test.  I'm crashing now, but will address this tomorrow.

This does suggest that uplift of this fix to 1.3 and landing on 1.4 will be critical to correctness.
Summary: [B2G][Email]Flagged Hotmail emails do not display the changed on or off flag indicator when synced from the server to the device using ActiveSync → [B2G][Email] ActiveSync message state changes from server not synchronized error "Failed to parse a message: Error: Bad body type: undefined" thrown
Attachment #8379610 - Flags: review?(m)
Comment on attachment 8379612 [details] [review]
GELAM test and fix

The update ordering was dealt with by forcing it to happen in sequence with the splices.  I'm not entirely sure how I managed to rationalize dispatching them immediately in review; I just have thought there was some type of suid assertion magic.


The one glitch with this pull request is that I didn't do a review-only package.json git ref instead of the mail-fakeservers npm version ref, so stuff explodes immediately.  To test you'll need to have a local checkout of mail-fakeservers.  Or trust in my having run the tests/etc. other

I've tested this in Gaia and it does resolve the flag problem.


We 100% super want this uplifted to 1.3 when we land it because the breakage is really two-fold:
- ActiveSync flag/other delta glitch
- Updates in general may corrupt the visible messages badly.
Attachment #8379612 - Flags: review?(m)
Attachment #8379610 - Flags: review?(m) → review+
Comment on attachment 8379612 [details] [review]
GELAM test and fix

Looks good to me. I pulled your branch of mail-fakeservers as well; tests run fine here as well (save for the POP3 intermittent that you already fixed on master and an unrelated IMAP test_account_logic "Invariant problemo; did not scan all folders; 0 observed, 12 expected" that I've seen before).
Attachment #8379612 - Flags: review?(m) → review+
== backend drama

landed in mail-fakeservers/master:
https://github.com/mozilla-b2g/mail-fakeservers/pull/11
https://github.com/mozilla-b2g/mail-fakeservers/commit/eb88e49568ef9bed78b41bdcb4244f4c7446de0a

published to npm as mail-fakeservers 0.0.15:
https://registry.npmjs.org/mail-fakeservers/-/mail-fakeservers-0.0.15.tgz/-rev/31-39c2f3c89237225e8437bb07306eb6e9


landed in gaia-email-libs-and-more/master after travis greened after npm push:
https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/286
https://github.com/mozilla-b2g/gaia-email-libs-and-more/commit/8f2996aa17f4795dcff7acb130897bed8dedd3ae

it then turns out I managed to not save my emacs buffer at the right time and a hunk was not migrated from the test, so a fix-up test-only PR was made.

landed in gaia-email-libs-and-more/master:
https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/287
https://github.com/mozilla-b2g/gaia-email-libs-and-more/commit/10f1c905d41b7a5e34e9b3aa40c1559da603aac7


== gaia, the stuff people care about

landed in gaia/master:
https://github.com/mozilla-b2g/gaia/pull/16585
https://github.com/mozilla-b2g/gaia/commit/da53aa80ae43f1c58b18ce18317ce96c9989b9a6


I'll request uplift approval once verified.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Keywords: verifyme
Resolution: --- → FIXED
Whiteboard: burirun1.3-3 → [p=2][burirun1.3-3]
Target Milestone: --- → 1.4 S2 (28feb)
1.3+ for profound user impact-> unable to sync contacts.
blocking-b2g: 1.3? → 1.3+
(In reply to Preeti Raghunath(:Preeti) from comment #20)
> 1.3+ for profound user impact-> unable to sync contacts.

I agree this should be 1.3+, but the e-mail app's ActiveSync implementation has nothing to do with syncing contacts.
Any news on the verification? :)
Flags: needinfo?(bugmail)
[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): 871897
[User impact] if declined: ActiveSync accounts (currently hotmail.com/outlook.com if autoconfig is used, plus any corporate exchange activesync servers) will fail to sync changes to messages.  New messages we hear about are okay, but then if the message gets marked read later or flagged, we would explode and fail to make that change without this patch.  Additionally, there is a risk when performing any synchronization for IMAP or ActiveSync that the UI will visually update the wrong message elements in the case of asynchronous mozContact resolution in the face of message modifications accompanied by new or deleted messages.
[Testing completed]: Detailed analysis of the problem. manual verification of the fix by the author, thorough back-end unit tests, no one has complained about new regressions.
[Risk to taking this patch] (and alternatives if risky): Very low.  Both fixes are extremely limited in scope and the ActiveSync glitch was already limited in its damage by a friendly try/catch, so the patch couldn't really make anything worse.
[String changes made]: None.
Attachment #8383855 - Flags: approval-gaia-v1.3?
Flags: needinfo?(bugmail)
(I probably should have used qawanted in addition to verifyme.  Don't need it quite so much now.  Will wait for uplift to request explicit verification.)
Keywords: verifyme
Attachment #8383855 - Flags: review+
(In reply to Andrew Sutherland (:asuth) from comment #24)
> (I probably should have used qawanted in addition to verifyme.  Don't need
> it quite so much now.  Will wait for uplift to request explicit
> verification.)

I need verification to grant approval ;)
Keywords: verifyme
Keywords: qawanted
(In reply to Andrew Sutherland (:asuth) from comment #24)
> (I probably should have used qawanted in addition to verifyme.  Don't need
> it quite so much now.  Will wait for uplift to request explicit
> verification.)

Actually, verifyme is only needed here - that's used for post landing verification. qawanted is used for misc QA tasks before the patch lands.
Keywords: qawanted
Verified fixed in v1.4. Setting flags on web client is reflected on device.
Environmental Variables:
Device: Buri v1.4 Moz RIL
BuildID: 20140304040200
Gaia: 6df4533f14ec2645bb13d8a803a5151583ca13a8
Gecko: 529b86b92b1d
Version: 30.0a1
Firmware Version: v1.2-device.cfg
Status: RESOLVED → VERIFIED
Attachment #8383855 - Flags: approval-gaia-v1.3? → approval-gaia-v1.3+
v1.3: 3bba74d518437e013d080e3da0c993b03bae5e8f
You need to log in before you can comment on or make changes to this bug.