Closed Bug 799828 Opened 7 years ago Closed 7 years ago
[email/IMAP] data eviction from cache; discard messages outside sync window periodically
134 bytes, text/plain
This was previously tracked on https://github.com/mozilla-b2g/gaia/issues/2530 The relevant snippet from a comment I made is: Right now the [IMAP] e-mail database will never forget about messages. So if you are sufficiently popular, your indexedDB database will eventually either fill up the device or hit whatever quota constraints the e-mail app and then the e-mail app will break. The specific to-do is to periodically discard all message headers and their associated bodies that are older than the requested synchronization range. We want to discard older messages before newer messages since our UI requires you to always go through the more recent messages first. Some amount of recency checking is appropriate; the accuracy range information serves as a sufficient proxy for this since the time interval will identify when the user last scrolled that far back in time. Some concept of disk usage/storage pressure is also appropriate, and this is already covered by the block infos.
Chris - Should we break feature freeze for this feature? It is highly likely we could hit the quota limit.
E-mails with embedded images will have the images stored into the database which could magnify the growth factor with no UI to discard them. Is there a specific number you are thinking of as the quota limit? (Like, just the IndexedDB default, or maybe we have a different one for privileged apps, or a notional max for the e-mail app on the production device?)
We're marking this bug with the C1 milestone since it follows the criteria of "unfinished feature work" (see https://etherpad.mozilla.org/b2g-convergence-schedule). If this work is not finished by Nov19, this bug will need an exception and will be called out at the upcoming Exec Review.
Target Milestone: --- → B2G C1 (to 19nov)
Cody, whats going on here? No updates in a week for a critical blocker that missed the convergence date.
I worked on this a little bit on Tuesday and Wednesday last week (before the holiday), but I've mainly been laying the groundwork and learning the DB side of the Mail app. I'm aiming to have an initial patch up tonight/tomorrow. Aside from the standard changes that'll be required for review and such, we're going to have to do a decent bit of tweaking to get it to flush out the right emails; weighing based on folder type and all that. But we can at least get this to a point where the bug itself (crashing the app when too many emails are present) is taken care of quickly.
If you just want to push your changes as you make them to a branch on your gaia-email-libs-and-more fork, that would be great so that I can follow along even if the patch isn't quite ready yet. As a nice bonus, that way we never have to worry about disk crashes and the like. Related Note: There's no need to worry about having particularly clean commits for gaia-email-libs-and-more. Especially since we may frequently have to do a bunch of merges, it's desirable to have the true history available.
Target Milestone: B2G C1 (to 19nov) → B2G C2 (20nov-10dec)
Cody, please give an update on this one. This is the last remaining email "feature" bug so we need to knock it out. Can you get your patch up today?
I ran into issues implementing my initial solution; I was attempting to implement it at too low a level, directly manipulating the slice storage. After talking to asuth earlier, I'm implementing this in a far simpler way, simply using the existing deletion facilities and just selecting messages outside of the last sync to delete. This is almost definitely too aggressive, but it provides a good starting point and will be landable very quickly.
I now have a buggy (having issues with the actual deletion) but mostly functional implementation of the feature. I'm writing a unit test to try to cover a few big cases, one of which is certainly not correct right now: 1) Sync X days of emails, then get another Y emails after that. Those last Y emails should be evicted from the cache, leaving just the sync period (for now). 2) Search for emails in the distant past and ensure those are evicted properly during the next period. 3) Ensure attachments are properly deleted when headers and bodies are. (These are stored separately -- does the standard deletion interface take care of this for us?) 4) Do all this while ensuring that the accuracy ranges remain accurate. This is currently not functional, as the standard deletion facilities don't update the accuracy ranges in the way we need for this bug. I'll have a version missing #4 and evicting data on every sync (rather than periodically as intended; do we have a good way to handle that, outside of just checking how long it's been since the last eviction?) up on my branch tomorrow for review and commenting. Hopefully the finishing touches won't take much longer after that.
(In reply to Cody Brocious [:Daeken] from comment #9) > 3) Ensure attachments are properly deleted when headers and bodies are. > (These are stored separately -- does the standard deletion interface take > care of this for us?) We don't need to worry about embedded images, since IndexedDB does the book-keeping on those. For attached images which get stored in DeviceStorage, we've declared it not the e-mail app's problem at this time. The user can delete the images from the gallery if they don't want them anymore. > I'll have a version missing #4 and evicting data on every sync (rather than > periodically as intended; do we have a good way to handle that, outside of > just checking how long it's been since the last eviction?) up on my branch > tomorrow for review and commenting. I think the best option is to: - Have the eviction logic run as a job/operation that takes a specific folder to process as its argument. mailapi/imap/jobs.js would be the right place for it since it does not make any sense for ActiveSync. - Queue the job to run as the result of the allocation of new blocks in the FolderStorage instance as a proxy for "non-trivial amounts of new storage used". Specifically, have FolderStorage._makeHeaderBlock check if (this._folderImpl.nextHeaderBlock % $syncbase.DATA_EVICT_EVERY_N_BLOCKS === 0) post-increment where we set DATA_EVICT_EVERY_N_BLOCKS to 4 initially. When it's true, we queue the operation to be run. The operation will acquire the folder mutex, so it will stay out of the way of the synchronization operation.
This patch (pretty version at https://github.com/daeken/gaia-email-libs-and-more/commit/c8108ef7bc182a56177601be3632b250d0705932 ) adds data eviction to mail storage and adds an accompanying unit test. The current version is full of debug logging so you can see what's going on at the moment, and it needs proper comments, but is otherwise solid I believe. The big outstanding question is: where should this be called from? Doing it as part of folder refreshes (specifically, I was trying from _sliceOpenFromNow) causes something to go wrong due to the deletion. We should also be updating accuracy ranges, but that's easy in this case since we're doing purely date-base deletion right now. Going to add tests for that in my next patch. As a sidenote, should getMessagesInImapDateRange() allow mutation of blocks while iterating? I was originally using it for the eviction, but after a lot of debugging found that something was going wrong internally; this seems likely to complicate things in the future. I'm using getAllMessagesInImapDateRange() now, but that seems wasteful.
Attachment #687638 - Flags: feedback?(bugmail)
Disregard the outstanding question -- the job solution should, if it's holding a lock, fix the issue. Going to roll that into my patch now.
Attachment #690065 - Flags: review?(squibblyflabbetydoo)
Notes from testing: My patch includes a torture test that adds 1200 messages at 300 messages per day. I've been using the leftover folder from the unit test to test the app in Firefox nightly, and scrolling around the entirely synchronized folder, cached header and body blocks are flushed as expected. (Uncomment the various console.log/console.warn calls related to flushing to see.) about:memory also does not show any scaling/leak problems from this, although our header records are compact enough that even if we leak them, they would be largely lost in the noise. In my testing, about:memory footprint was 6,524K with a small/reasonable inbox with minimal sync (just 15 messages) from testy@localhost opened on an account with 3 accounts (testy@localhost, testy@aslocalhost, yahoo test account). When opening the already fully synchronized torture test folder, usage increased to 7,071K. After scrolling to the bottom of the message list (spanning all 1200 messages), usage was at 9,191K. In both cases, most of the memory increase was in DOM/layout: 451k and 1926K (also against the first value). This makes sense because when opening the torture test we would buffer up to our minimum buffering level in one direction in the first case, and in the second case we would be retaining our maximum buffer in one direction. It's worth noting that this was on a 1700px high window, so we buffer a pretty good number of messages since buffering scales with the number of visible messages at a time.
Status: NEW → ASSIGNED
Attachment #690065 - Flags: review?(squibblyflabbetydoo) → review+
Landed on gaia-email-libs-and-more/master: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/96 Landed on gaia/master: https://github.com/mozilla-b2g/gaia/pull/6960
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
The above PR had some fallout. A PR to fix it is up here: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/98
I've reviewed and landed Jim's fix. (Thanks Jim!): Landed on gaia-email-libs-and-more/master: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/98 Landed on gaia/master: https://github.com/mozilla-b2g/gaia/pull/6978
we need a test case for this in moztrap
You need to log in before you can comment on or make changes to this bug.