Closed Bug 840935 Opened 11 years ago Closed 6 years ago

Create Email workloads for B2G performance tests

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

(tracking-b2g:backlog, b2g18+)

RESOLVED WONTFIX
tracking-b2g backlog
Tracking Status
b2g18 + ---

People

(Reporter: davehunt, Unassigned)

References

Details

(Keywords: perf, Whiteboard: c= ,)

When measuring the performance of launching the Email app, we want to pre-load a number of emails.
Andrew: Would you be able to provide some suggestions for how we might pre-load the email app before we launch it?
Flags: needinfo?(bugmail)
(In reply to Dave Hunt (:davehunt) from comment #1)
> Andrew: Would you be able to provide some suggestions for how we might
> pre-load the email app before we launch it?

In short, the e-mail app needs to be spun up and connected to a mail account and the sync process run to completion.

Our general game plan for standalone back-end and integration testing has been to use 'fake-servers' to have the app connect to a JS-implemented ActiveSync or IMAP server.  :willkg created an integration test that can create accounts, but we may have broken that and not fixed it when we updated the account creation process?  (I/we are cutting corners out of necessity all over the place :( )

Right now, we have a very limited ActiveSync fake-server in https://github.com/mozilla-b2g/gaia-email-libs-and-more.  If you do follow the README.md to provide an xpcshell setup, then running "make activesync-server" should hook you up.  (Our unit tests which use xpcshell internally spin up the fake activesync server.)

For IMAP/SMTP, we don't have a fake-server in-tree yet, but Thunderbird has one of those that we can probably port without too much effort, see bug 813411.

Since the goal is just to fill an account up with some messages, it's possible things could be simplified by running a variant of the fake-server just on a node on the local subnet persistently, and pointing the account creation at that server.  We could even use real servers, but the fake server has the advantage that it can generate a consistent message distribution that we control very easily.  Otherwise, we run into the problem that hotmail.com is an unreliable network endpoint in terms of latency and not being down, or that a locally hosted IMAP server will need extra moving parts to cram in the consistent message density.  (IMAP unit tests operate by nuking folders, creating folders, then cramming the synthesized messages in, so we already have the code for this.)

We could use the general mechanism to pre-create IndexedDB contents, but I think we probably want numbers for: 1) bring-up with no accounts, 2) time to create the account to "full initial sync", 3) time to show messages in first open after that.
Component: Gaia → Gaia::E-Mail
Flags: needinfo?(bugmail)
QA Contact: nhirata.bugzilla
See Also: → 813411
Note that private bug 841575 is a MoCo bug about creating an account on our Zimbra server and could end up involving credentials, which is why it is secret.  I say this mainly so I can respond to its comment 0 with some relevant info:

(In reply to Dave Hunt (:davehunt) from bug 841575 comment #0)
> We'd prefer not to use a third-party for this, to reduce unreliable network
> endpoints in terms of latency and availability. As I understand it, we
> should be able to ensure we set up a test's prerequisites (number of emails)
> fairly easily using IMAP.

The key thing is that the sync process for IMAP is time-aware; we start by looking for messages in the last 3 days, and then we keep moving the window backward (and expanding it if we don't find any messages).  So if we use this account for any timing that involves the duration of an initial sync, we need to deal with this.  If we ignore the sync time, it doesn't matter.
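
To make the timing concern concrete, here is a rough sketch of a backward-moving sync window.  It is not the e-mail app's actual code: the function name, the doubling rule used when a window comes back empty, and the max_days cut-off are all assumptions made for illustration.  It only shows why stale messages cost extra round trips before anything is found.

```python
from datetime import datetime, timedelta


def sync_windows(message_dates, now, initial_days=3, max_days=365):
    """Yield (start, end, found) for each window, walking backward from `now`.

    Illustrative only. Doubling the span after an empty window is an
    assumed expansion rule, not the one the e-mail app really uses.
    """
    end = now
    span = timedelta(days=initial_days)
    oldest = now - timedelta(days=max_days)
    while end > oldest:
        start = max(end - span, oldest)
        # A message belongs to the window if start < date <= end.
        found = [d for d in message_dates if start < d <= end]
        yield start, end, found
        if not found:
            span *= 2  # assumed: widen the window after an empty result
        end = start


if __name__ == "__main__":
    now = datetime(2013, 2, 13, 12, 0)
    dates = [now - timedelta(days=1), now - timedelta(days=10)]
    for start, end, found in sync_windows(dates, now, max_days=12):
        print(start.date(), end.date(), len(found))
```

If all the messages were crammed in a week before the test runs, the first windows come back empty, so the measured initial sync time depends on when the mailbox was last populated.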

As such, the test infrastructure needs to make sure it crams in new messages appropriately.  So if the test is run at the same time every day, messages should also be crammed in at the same time every day, with no chance of the cramming and the test run overlapping.

Alternatively, the e-mail app can be tricked into thinking it is operating in the past.  As a side effect of unit test support for doing this, the mailapi/date.js's "TEST_LetsDoTheTimewarpAgain" method can be used.  Or you can just make the whole b2g-desktop instance think it's in the past so that Date.now() tells horrible lies.  If you do the latter, it is essential that new messages are not introduced into the e-mail account.
(In reply to Andrew Sutherland (:asuth) from comment #3)
> The key thing is that the sync process for IMAP is time-aware; we start by
> looking for messages in the last 3 days, and then we keep moving the window
> backward (and expanding it if we don't find any messages).  So if we use
> this account for any timing that involves the duration of an initial sync,
> we need to deal with this.  If we ignore the sync time, it doesn't matter.
> 
> As such, the test infrastructure needs to make sure it crams in new messages
> appropriately.  So if the test is run at the same time every day, messages
> should also be crammed in at the same time every day, with no chance of the
> cramming and the test run overlapping.
> 
> Alternatively, the e-mail app can be tricked into thinking it is operating
> in the past.  As a side effect of unit test support for doing this, the
> mailapi/date.js's "TEST_LetsDoTheTimewarpAgain" method can be used.  Or you
> can just make the whole b2g-desktop instance think it's in the past so that
> Date.now() tells horrible lies.  If you do the latter, it is essential that
> new messages are not introduced into the e-mail account.

I (perhaps naively) was thinking that the test could be immediately preceded with a script that cleared the inbox via IMAP and populated it with X new messages. I'd rather not mess around with dates, as from my experience this can have nasty side-effects.
(In reply to Dave Hunt (:davehunt) from comment #4)
> I (perhaps naively) was thinking that the test could be immediately preceded
> with a script that cleared the inbox via IMAP and populated it with X new
> messages. I'd rather not mess around with dates, as from my experience this
> can have nasty side-effects.

That should be perfect (as long as tests won't be running concurrently for multiple devices).  That's how the IMAP tests currently operate (from inside each test case, but using newly nuked folders so they don't interfere with a real account or receive surprise messages).
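
A minimal sketch of such a pre-test script, using Python's standard imaplib rather than the gaia test helpers.  The function names, the placeholder address, and the one-message-per-hour spacing are assumptions for illustration; the host and credentials in the usage note are placeholders too.

```python
import imaplib
import time
from email.message import EmailMessage
from email.utils import formatdate


def build_messages(count, now=None, address="tester@example.com"):
    """Return `count` (timestamp, raw_bytes) pairs, one message per hour going back."""
    now = time.time() if now is None else now
    messages = []
    for i in range(count):
        timestamp = now - i * 3600  # assumed spacing: one message per hour
        msg = EmailMessage()
        msg["From"] = address
        msg["To"] = address
        msg["Subject"] = "Perf test message %d" % i
        msg["Date"] = formatdate(timestamp)
        msg.set_content("Synthetic body for message %d.\n" % i)
        messages.append((timestamp, msg.as_bytes()))
    return messages


def reset_inbox(conn, messages, mailbox="INBOX"):
    """Delete everything in `mailbox`, then append `messages`.

    `conn` is expected to be an already logged-in imaplib connection.
    """
    conn.select(mailbox)
    conn.store("1:*", "+FLAGS", "\\Deleted")  # flag every existing message
    conn.expunge()                            # and remove them
    for timestamp, raw in messages:
        conn.append(mailbox, None, imaplib.Time2Internaldate(timestamp), raw)
```

Usage would look roughly like: conn = imaplib.IMAP4_SSL("imap.example.com"); conn.login(user, password); reset_inbox(conn, build_messages(100)); conn.logout().  Because every message is dated relative to the moment the script runs, running it immediately before the test keeps the messages inside the initial sync window.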

An example of a unit test along those lines is:
https://github.com/mozilla-b2g/gaia-email-libs-and-more/blob/master/test/unit/test_torture_imap.js

That tries to put in too many messages (300 a day), and the inbox would want to be used instead, of course:
https://github.com/mozilla-b2g/gaia-email-libs-and-more/blob/master/test/unit/test_compose.js
During a recent meeting on performance, it was mentioned that we should be able to prepopulate a database of emails on the device and start the email application in offline mode. If this is possible, it would be preferable to relying on an actual email account.
Yes, definitely possible.  In some other thread/context (email/IRC/in-person?) I had also previously mentioned the offline case, but it didn't end up here.  To recap that:

If navigator.onLine is false, the e-mail app starts up offline and won't even try to talk to the server.

Also, since we've changed how the e-mail app works for synchronization purposes (we show the offline messages before trying to talk to the server, no matter how long ago we last synchronized with the server), it might also not be the end of the world if we have a pre-filled database that has a configured IMAP account that it does talk to and allow online activity to happen.  Key things:

- This definitely will not work with ActiveSync since the protocol is stateful and it agrees with the server on tokens that are used to indicate synchronization state that are destroyed and replaced with new ones every time we talk to the server.

- If the IMAP account talks to a server and the pre-canned data is distributed to more than a few people, it's very important that the server's IMAP connection limits be relaxed and that the server won't freak out if there are a large number of concurrent connections to each folder.


Since in general we may not want the account to ever go online, we can modify the e-mail app to have a debug-only flag that will completely disable an account from ever going online.  This would probably want to make it so the server operations think they have completed so that we can avoid an ever-increasing list of operations to run.

We could expose UI to set all of a device's accounts to perma-offline via the secret debug mode menu.  See https://wiki.mozilla.org/Gaia/Email/SecretDebugMode for instructions on how to get there.  We'd add a "make accounts perma-offline" button.
This would be awesome for reference-workloads... Ideally I would want a way to set that flag from the desktop command line (using adb or something similar). I don't want to force the user to have to fiddle with a freshly flashed phone before they can load reference workloads.
The flag would exist in the database.  Since I assume you're just copying an IndexedDB database into place that was 'mastered' somewhere by syncing an account and setting the flag, the flag would travel with the data and everything would be good.
Okay, so it sounds like we should go ahead and create reference workloads for email. Is this something you can work on Jon? Do you need an IMAP account to help set this up? We have bug 841575 for setting up an account for testing purposes, but I'm not sure how soon this will be resolved.
Yes, I'm definitely going to add email to reference workloads. I will need some kind of account that has a pile of email in it (preferably in the thousands of messages). I can make my own email accounts (on my personal domain), but I don't have any decent way of populating it short of writing an email generator, which I guess would work.

Reading the bug mentioned above (bug 841575), I think in respect to reference-emails, it won't matter much where the email account lives, since it will become a completely offline access system. I'll just use the email account to pre-populate the database.

Andrew - I need to know specifically which database(s) are involved with this.
It's just the one IndexedDB database for e-mail.  For my unagi, that's /data/local/indexedDB/3+f+app+++email.gaiamobile.org/1840232287bl2iga-me.sqlite and the /indexedDB/3+f+app+++email.gaiamobile.org/1840232287bl2iga-me sub-directory tree for storing associated blobs.  Blobs are only used to store embedded images that have been downloaded.
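
For copying a mastered database into place from the desktop, something along these lines might work.  It is only a sketch: the directory and database names are the ones from my unagi above and will differ per device, the blob directory is assumed to sit next to the .sqlite file under /data/local, and the function names are made up for illustration.

```python
import os
import posixpath
import subprocess

# Device-specific names taken from the comment above; expect different ones elsewhere.
IDB_DIR = "/data/local/indexedDB/3+f+app+++email.gaiamobile.org"
DB_NAME = "1840232287bl2iga-me"


def push_commands(local_dir, idb_dir=IDB_DIR, db_name=DB_NAME):
    """Build the adb command lines that install a saved e-mail database."""
    local_db = os.path.join(local_dir, db_name + ".sqlite")
    local_blobs = os.path.join(local_dir, db_name)
    return [
        ["adb", "shell", "stop", "b2g"],  # stop b2g so the database is not in use
        ["adb", "push", local_db, posixpath.join(idb_dir, db_name + ".sqlite")],
        # Assumption: blobs live beside the .sqlite file. How adb push treats
        # directories varies between adb versions, so check the result.
        ["adb", "push", local_blobs, posixpath.join(idb_dir, db_name)],
        ["adb", "shell", "start", "b2g"],
    ]


def install_workload(local_dir):
    """Run each command in order, stopping at the first failure."""
    for cmd in push_commands(local_dir):
        subprocess.run(cmd, check=True)
```

Calling install_workload("email-workload") would expect email-workload/1840232287bl2iga-me.sqlite and the matching blob directory to exist locally.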

For a reference setup, it's probably better to have more realistic messages than what our generator creates.  For an overview of a test that uses the generator to put messages somewhere, see my comment 5.  There are several non-generator options.

First, there's importing the contents of mailing lists:
- Some mailing lists provide mbox archives that can be imported.  Apache is particularly good about this, although they may have changed their policy about publishing the archives.  At least the httpd project still does it: http://httpd.apache.org/mail/
- mail.mozilla.org provides mbox lists (ex: https://mail.mozilla.org/pipermail/tb-planning/), but that's not where most of our mailing lists live.
- There's a Thunderbird extension that provides easy mbox import capabilities: https://addons.mozilla.org/en-US/thunderbird/addon/importexporttools/
- Using Thunderbird, you can subscribe to the newsgroup representation of a mailing list and then copy the messages 
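
If a scripted route is preferred over Thunderbird, Python's standard mailbox module can read an mbox archive directly.  A minimal sketch (the function name and the limit parameter are illustrative); the raw messages it yields could then be fed to an IMAP APPEND:

```python
import mailbox


def iter_mbox_messages(path, limit=None):
    """Yield raw message bytes from an mbox file, at most `limit` of them."""
    box = mailbox.mbox(path, create=False)  # raise instead of creating a missing file
    try:
        for index, message in enumerate(box):
            if limit is not None and index >= limit:
                break
            yield message.as_bytes()
    finally:
        box.close()
```

For example, list(iter_mbox_messages("archive.mbox", limit=1000)) would load the first thousand messages of a downloaded archive.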

The main problem with mailing lists is that frequently they nuke HTML or the users of the lists aren't huge HTML fans and so it's rare to see HTML on there.  In that case, it's useful to just have a mail account that was signed up for lots of newsletters, etc.  We don't really have one of those for gaia, but we do have one that existed for the raindrop project.  The credentials for that account are under 'Google IMAP' here: https://intranet.mozilla.org/Gaia/Email.  Note that gmail tends to get paranoid about access from many different locations, so sign-in to the web interface may be required to make google happy.

One other option is that jlebar created a test gmail account that is subscribed to basically every bugmail bugzilla.mozilla.org creates.  He is probably willing to share those credentials since I think he created the account to investigate gaia email problems.

ActiveSync (ex: hotmail.com or hosted exchange) may be an easier way to get a lot of messages into the device since you can explicitly set the sync settings for a folder to sync the entire folder rather than having to manually keep scrolling down and hit 'load more messages'.  ActiveSync will also currently fetch the bodies in their entirety.  We have some hosted exchange accounts under 'QA-owned credentials' on https://intranet.mozilla.org/Gaia/Email where you might be able to then also use IMAP to use Thunderbird to cram messages in, then sync the messages out with ActiveSync.  If you do go with IMAP, you'll definitely want to modify the source for gaia that you use so that the INITIAL_FILL_SIZE constant is higher than 15 (and then that BISECT_DATE_AT_N_MESSAGES is at least several times that number) in order to reduce the number of times you have to hit "load more messages".  For IMAP, bodies will currently *not* be fetched until you view the message.  And snippets will only be opportunistically fetched as you scroll.
Assignee: nobody → jhylands
Andrew - how much does it matter that these email repositories are pretty much all receive-only? Performance-wise is there a difference between sent email and received email?
No difference.  Sent e-mail is just like all other e-mail, it just happens to live in the 'sent' folder by default.
Any progress on this Jon? I now have an IMAP account dedicated for automated B2G testing, which you can use if you like to populate an initial database.
Flags: needinfo?(jhylands)
Dave - I talked with Andrew Sutherland at length about this last week, and we're still waiting on the special offline mode flag being added.
Flags: needinfo?(jhylands)
Great, thanks. Do you know if that's being tracked anywhere? Would be great to add it as a blocker.
Andrew, is this (the offline flag) being tracked in another bug?
Flags: needinfo?(bugmail)
Status: NEW → ASSIGNED
Keywords: perf
Whiteboard: c= ,
tracking-b2g18: --- → +
Priority: -- → P2
Summary: Prepopulate email app with emails for the B2G performance tests → Create Email workloads for B2G performance tests
Blocks: 896839
We're almost able to do this in an automated-ish fashion.

When bug 892519 lands we will have integration tests in gaia that use our IMAP fake-server.  Bug 908944 is about being able to cram synthetic messages into the IMAP server so we can then sync them.  This allows us to get a bunch of messages into the mailbox.

In my optimistic discussion with :jhylands back in the day I proposed adding some secret debug menu options relating to letting us mark the account as offline/connections should not occur, etc. and more easily synchronizing the entire contents of a folder.

I think the marionette automated tooling stuff should eliminate the need for the specialized back-end stuff.  Some marionette support logic will be required to trigger a sync loop that keeps scrolling down and selects "Load more messages from server" until we're fully synchronized, but that's pretty trivial and much simpler than doing it in the back-end.
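
The shape of that sync loop, sketched in Python with the UI interaction abstracted away.  count_messages and load_more are hypothetical callables that the marionette support logic would have to provide; nothing here is an existing marionette or gaia API.

```python
def sync_until_complete(count_messages, load_more, max_rounds=200):
    """Keep requesting more messages until the count stops growing.

    count_messages() -> int: number of messages currently shown (hypothetical).
    load_more() -> bool: scroll down and tap "Load more messages from server";
        returns False if the option is no longer offered (hypothetical).
    Returns the final message count.
    """
    previous = count_messages()
    for _ in range(max_rounds):  # hard cap so a stuck UI cannot loop forever
        if not load_more():
            break
        current = count_messages()
        if current == previous:
            break  # nothing new arrived, assume the folder is fully synced
        previous = current
    return previous
```

A real implementation would also need to wait for each sync round to finish before counting again; that wait is left out here.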

The offline flag we talked about was mainly to scrub credentials and avoid trying to connect to the server that whoever was creating the canned data was using.  The marionette stuff largely uses localhost, so that should end up failing fast if the device is not put into offline mode before testing.  That seems like it should be fine for the use-cases I understand.
Depends on: 908944, 892519
Flags: needinfo?(bugmail)
Blocks: 914907
Assignee: jhylands → nobody
blocking-b2g: --- → backlog
blocking-b2g: backlog → ---
Firefox OS is not being worked on
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX