Closed Bug 1028192 Opened 10 years ago Closed 6 years ago

Problems with ActiveSync accounts

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: AndreiH, Unassigned)

References

Details

(Whiteboard: [xfail])

Attachments

(3 files)

On the latest run you can see that the emails don't load on b2g-9.

http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.flame.mozilla-central.ui.smoketest/80/HTML_Report/

When using b2g-9's credentials locally I get the same results.
Do we need to change the account?
Priority: -- → P1
Attached image Screenshot.png
This looks like the issue of the account being used on more than one device and thus the emails not loading.
Looking through the credentials it doesn't seem like any of the automated devices in the lab uses the same account...
Guys the tests still failed in the last build:
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Aurora/job/b2g.flame.mozilla-aurora.ui.smoketest/30/HTML_Report/

Shouldn't we change the credentials?
Flags: needinfo?(mozbugs.retornam)
(In reply to [:AndreiH] from comment #4)
> Guys the tests still failed in the last build:
> http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Aurora/job/b2g.flame.
> mozilla-aurora.ui.smoketest/30/HTML_Report/
> 
> Shouldn't we change the credentials?


We can't easily change ActiveSync accounts now. We have crossed the $1000 limit and now need the Finance team's approval before adding new ones. I'd have to talk to Stephen about this first.
Flags: needinfo?(mozbugs.retornam)
Logging into the Outlook web app, it looks like the bug stems from reaching over 100 devices syncing to the account. I believe that with every restart the phone is considered a different device, which increments the number of devices the server counts as syncing on the account. I've gone through and deleted all the connected devices, but this could crop up again in the future. The sync problem might be considered a bug, since ActiveSync treats previously connected devices as new devices after a reset and setup.
RobertC started cleaning that account also. Do we have to do this every time we reach 100 new devices?
Ni? zac to see this issue
Flags: needinfo?(zcampbell)
Attached image e-mail.png
This issue reappeared on b2g-9. I started removing devices, but after deleting 20 I got an error stating that I can only delete 20 devices a month (see attached screenshot).
I've been trying to hack together a script to log in and do it for us, but I'm having trouble with how long the devices list takes to load. The 20 devices per month limit is concerning though.
Tests failed on b2g-2 because of this limit and we already deleted 20 devices this month.
This issue is starting to snowball and we might run out of available accounts.

This said, I don't think deleting the devices is a solution.

We are now stuck at running 20 tests per account per month, so we will run out in approximately a week.

We need a more long term option for running the active sync tests.

:asuth, do you have any ideas on how we can prevent this from happening?
Severity: normal → blocker
Flags: needinfo?(mozbugs.retornam)
Flags: needinfo?(bugmail)
Let's just disable all of the ActiveSync tests for now.
Flags: needinfo?(zcampbell)
Summary: Emails for b2g-9 are not loading → Problems with ActiveSync accounts
Why are we still dependent on the multi-account workaround we have implemented for ActiveSync? Bug 825538 landed, so we shouldn't need that anymore, right?
I think what might be happening is that, since we're resetting the phone every time, it gets a new Device ID generated on each run (unless the ID is stored somewhere else). ActiveSync then recognizes it as a different device each time, which adds to the limit. Would it be possible to specify static Device IDs in our testvars? I'm not too sure how they work.
Also, I am meeting with the owner of the accounts (Jennifer Hayashi) this morning at 9am PST to see if anything can be done on Microsoft's side to remove either limit.
Comment on attachment 8447979 [details] [review]
Disabled activesync tests: https://github.com/mozilla-b2g/gaia/pull/21182

I r+ this because of the infrastructure problems.
Attachment #8447979 - Flags: review?(zcampbell) → review+
Flags: needinfo?(mozbugs.retornam)
Update: Jennifer will be contacting Microsoft to see what can be done on their side. If they can't remove the limits, they could at least delete all devices as a stopgap to buy time. We're fine for now since the tests have been disabled.

Another thing: I believe clicking wipe, then block device, will also remove devices from the list, but it takes a lot longer to process.
Whiteboard: [xfail]
Currently mozusers 2, 3, and 7 have had their device lists cleaned, though the process to clean them is long-winded and not an ideal solution.
Hm, I was somewhat assuming from the spec that ActiveSync servers would just do a least-recently-used (LRU) eviction policy for Devices from its synchronization table.
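
As a rough illustration of the difference between an LRU eviction policy and the hard cap the lab accounts appear to be hitting, here is a toy model in Python. The class name, the method names, and the default limit of 100 are assumptions made for this sketch; it does not describe how any real ActiveSync server is implemented.

```python
from collections import OrderedDict


class DeviceTable:
    """Toy model of a per-account device table (hypothetical, for illustration)."""

    def __init__(self, limit=100, evict_lru=False):
        self.limit = limit
        self.evict_lru = evict_lru
        self._devices = OrderedDict()  # device_id -> sync count, oldest first

    def sync(self, device_id):
        """Record a sync attempt; return True if the device may sync."""
        if device_id in self._devices:
            self._devices[device_id] += 1
            self._devices.move_to_end(device_id)  # mark as most recently used
            return True
        if len(self._devices) >= self.limit:
            if not self.evict_lru:
                return False  # hard cap: unknown devices are refused
            self._devices.popitem(last=False)  # evict the least recently used
        self._devices[device_id] = 1
        return True

    def __len__(self):
        return len(self._devices)
```

With `evict_lru=True` a phone that shows up under a fresh ID on every run keeps syncing and old entries fall off the table. With the default hard cap, the first `limit` IDs fill the table and every later new ID is refused, which matches the symptom described in this bug.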

So yeah, the only practical solution to the problem for automated tests is to have each device use the same device ID every time.  And I mean *each* device; multiple devices cannot/must not use the same device ID or we're back where we started.
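
One way to picture "the same ID every time, but a different ID per device" is to derive the ID from the lab device's name. The sketch below is a hedged illustration only: the function names, the salt string, and the choice of 32 hex characters are assumptions, and nothing here is existing test-harness code.

```python
import hashlib


def stable_device_id(harness_name, salt="gaia-ui-tests"):
    """Derive the same 32-hex-char ID for a given lab device on every run.

    The salt and the 32-character length are assumptions for this sketch.
    """
    digest = hashlib.sha256((salt + ":" + harness_name).encode("utf-8"))
    return digest.hexdigest()[:32]


def assign_device_ids(harness_names):
    """Map each lab device name to its ID, refusing duplicate IDs."""
    ids = {}
    for name in harness_names:
        device_id = stable_device_id(name)
        if device_id in ids.values():
            # Two pool entries sharing one ID would put us back where we started.
            raise ValueError("device ID collision for %r" % name)
        ids[name] = device_id
    return ids
```

For example, `assign_device_ids(["b2g-2", "b2g-9"])` returns the same two distinct IDs on every run, so wiping a phone would not register a new device with the server.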

There are three scenarios we need to be concerned about here:
A) These automated tests using a pool of devices and potentially performing aggressive wiping in-between.
B) Manual QA testing.  In this case, the device may not be quite so thoroughly wiped between runs, but the accounts are likely to be repeatedly added.
C) Real users who ideally are only rarely deleting and re-adding the account when there is a problem, and even more rarely completely wiping their devices.

Situation 'A' is very special.  We have the ability to poke/magically hack things about the state of the email app to allow workarounds like forcing the use of a specific device ID.  It's also the only one where this is basically required.

Situation 'B' is probably the most troubling because we cannot really expect manual QA testers to reliably jump through special debugging hoops.  Mitigations for this that jump out at me are:

- Keep the device IDs we have created in the database so that if the user deletes foo@example.com and then re-creates it, we can reuse that same device ID again.

- If there's a way to tell an ActiveSync server that we are unbinding the device, we should totally 100% do that when the user deletes the account.  We'd do that opportunistically but we could have an enhancement bug to make sure we eventually do it if we were offline at the time.

If it really takes 100 devices to make an ActiveSync server angry, we're probably okay in the 'C' case most of the time, but the mitigation of persisting the device ID even when an account is removed from step 'B' is likely the most useful thing we can do.
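
The "persist the device ID even when the account is removed" mitigation can be sketched as a small lookup keyed by account address. This is a minimal Python illustration under stated assumptions: the class and method names are hypothetical, the in-memory dict stands in for whatever database the email app actually uses, and the ID format (a random 32-character hex string) is assumed.

```python
import uuid


class DeviceIdStore:
    """Hypothetical store mapping an account address to its device ID."""

    def __init__(self):
        self._ids = {}  # stand-in for a persistent database table

    def device_id_for(self, address):
        """Return the remembered ID for this address, creating one if needed."""
        key = address.strip().lower()
        if key not in self._ids:
            self._ids[key] = uuid.uuid4().hex  # random, 32 hex characters
        return self._ids[key]

    def forget(self, address):
        """Drop the stored ID so the next account creation gets a fresh one."""
        self._ids.pop(address.strip().lower(), None)
```

Deleting the account would leave the stored entry alone, so re-creating foo@example.com reuses the old ID. Only an explicit `forget` call produces a new one.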

Jim, any thoughts?
Flags: needinfo?(bugmail) → needinfo?(squibblyflabbetydoo)
See Also: → 825538
We could add a pref that you can use to hard code the device ID, and then (A) and (B) users can just be expected to set that pref. Otherwise, I think we're doing the right thing. If we can maintain the ability to change the device ID by deleting and recreating the account, I think that's ideal, since it helps reduce fingerprinting.
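
A hedged sketch of what such an override could look like, written in Python for brevity. The pref name `activesync.forced_device_id` is invented for this example, and the validation rule (1 to 32 alphanumeric characters) is an assumption rather than something confirmed in this bug.

```python
import re
import uuid

# Assumed constraint on device IDs, used for this sketch only.
_VALID_ID = re.compile(r"[A-Za-z0-9]{1,32}")


def resolve_device_id(prefs):
    """Return the forced ID from prefs if set, else a fresh random one."""
    forced = prefs.get("activesync.forced_device_id")  # hypothetical pref name
    if forced is None:
        # Default behaviour: a new random ID for each account creation.
        return uuid.uuid4().hex
    if not _VALID_ID.fullmatch(forced):
        raise ValueError("forced device ID must be 1-32 alphanumeric characters")
    return forced
```

Test runs would set the pref once per lab device. Everyone else would leave it unset and keep the current behaviour, including the ability to get a new ID by deleting and re-creating the account.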
Flags: needinfo?(squibblyflabbetydoo)
A pref that we could pass in during tests would be great. I see where Device IDs and so forth are passed around and defined, but I haven't worked on the code that interfaces between our tests and FFxOS, so I'm not sure how difficult that would be to implement. This would also allow us on the automated testing side to cut down on the number of ActiveSync test accounts we use. We currently have one account per device as a workaround for the earlier problem of all Device IDs being the same (now patched).
I created bug 1033923 for 'a'; this probably will happen next week since there's some synergy with integration testing stuff I need to finish out too.

:squib, can you clarify what you mean by a preference?  Do you mean a mozSettings setting, a build-time compilation/customization thing, or a secret debug menu mode?  We are trying to get rid of the mozSettings dep in the email app, and I think compilation stuff is probably heavyweight and likely to end up not enabled for testing or accidentally shipped.  The last one I'm not expecting QA to do.

Can you elaborate on the fingerprinting scenario you are thinking would be a problem if we did option 'B'?  (Not that I'm crazy about implementing option 'B').

I think my main concern for fingerprinting was that the deterministic sources of entropy we could get are potentially sensitive and not something that the server could retrieve otherwise.  For example, the MAC address is great deterministic entropy, but is super-predictable and so relatively cheaply reversible despite hashing.  For a ne'er-do-well to otherwise get the MAC they'd need to correlate the MAC with an IP which, especially with NATs and such, is not an entirely trivial undertaking.
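
To make the "cheaply reversible despite hashing" point concrete: the first 24 bits of a MAC address are a vendor prefix (OUI) from a public list, which leaves only about 16.7 million (2^24) candidates per vendor. The sketch below shows the search in Python. The function names, the use of unsalted SHA-256, and the 12-hex-digit MAC format are assumptions chosen for illustration, not anything the email app does.

```python
import hashlib


def hash_mac(mac):
    """Hash a MAC address written as 12 lowercase hex digits (assumed format)."""
    return hashlib.sha256(mac.encode("ascii")).hexdigest()


def recover_mac(digest, oui, max_suffix=1 << 24):
    """Search the 24-bit device part of a MAC for a known vendor prefix.

    Returns the matching MAC string, or None if no candidate below
    max_suffix hashes to the given digest.
    """
    for suffix in range(max_suffix):
        candidate = "%s%06x" % (oui, suffix)
        if hash_mac(candidate) == digest:
            return candidate
    return None
```

A full sweep for one vendor prefix is at most 2^24 hash computations, which is why hashing alone does not hide a MAC address. A completely random device ID has no such structure to search.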

If we're just reusing a completely random device ID when we create an account we previously deleted, I'm not sure what we'd be revealing to the server that it couldn't already cheaply infer.

Note that the impact of this is really a question of whether we file an eventual enhancement bug for 'b' that we fix when we find that manual QA testing is running into the problem.  If we think it's a bad idea, maybe I file a bug just to WONTFIX it for if/when that QA problem comes up in the future so we can tell QA to just clear the account out/etc.
Flags: needinfo?(squibblyflabbetydoo)
Can we just do some kind of build-time config?

I think it's useful to be able to reset the device ID in order to, say, work around buggy servers. I can imagine that if something goes awry, a particular device ID might be "poisoned", and give back erroneous/unexpected data.

Also, you might want to reset your device ID if you're afraid your account got hijacked (and if you have access to see the IDs of all devices connecting to your account), since then you'd be able to see if there were any devices using something other than your new device ID.

Granted, these are edge cases and I'm kinda just making excuses here, but I'd rather be a little too paranoid than not enough.
Flags: needinfo?(squibblyflabbetydoo)
QA Whiteboard: [fxosqa-auto-backlog+]
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX