Closed Bug 991567 Opened 10 years ago Closed 10 years ago

[Tarako][Email] Email app could use less memory when downloading attachments; OOM crashes on very small attachments

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

Other
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:1.3T+, b2g-v1.3 fixed, b2g-v1.3T verified, b2g-v1.4 fixed, b2g-v2.0 fixed)

RESOLVED FIXED
1.4 S5 (11apr)
blocking-b2g 1.3T+
Tracking Status
b2g-v1.3 --- fixed
b2g-v1.3T --- verified
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: bli, Assigned: mcav)

References

Details

(Whiteboard: OOM [priority] [p=3])

Attachments

(8 files, 1 obsolete file)

Environment:
-------------------------------------------------------
Gaia      c418ec10d1e1d53c6757ad12b2320c204808d251
Gecko     https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/36b1279ef6df
BuildID   20140402164000
Version   28.1
ro.build.version.incremental=eng.cltbld.20140402.202437
ro.build.date=Wed Apr  2 20:24:44 EDT 2014


Steps to reproduce:
------------------------------------------------------
1. Launch Email, and add an email account.
2. Send an email to this email address with an attachment.(e.g. a jpg file)
3. Launch Email, and open the email sent in step 2
4. Tap on the downloading icon to download the attachment.
5. Wait for a while


Actual results:
--------------------------------------------------------
Email app is killed while downloading the attachment.


Additional info:
--------------------------------------------------------
When the size of the attachment is larger than 1M, the reproduce rate is 100%

When the size of the attachment is about 300K, this happens occasionally.
Similar bug context:
- Bug 943798 is about streaming attachments to disk.  But that's a platform issue that is now extremely unlikely to happen for v1.3t.
- Bug 972632 is the POP3 specific variant for that.

The Tarako failure case is pretty extreme, so I think it makes sense to leave this bug open for Tarako-specific discussion for now rather than duping.


What domain / type of account are you reproducing this on?  Also, are you letting the screen turn off?  See https://wiki.mozilla.org/Gaia/Email/RequiredBugInfo.  Also, please provide a logcat as requested there.  (Note that this is information we will always want for bugs filed in the e-mail app.)
Bingqing says she use gmail to reproduce it.
Bingqing, can you provide what Andrew asked in comment 1? Thanks,
Flags: needinfo?(bli)
blocking-b2g: --- → 1.3T?
Attached file log
Flags: needinfo?(bli)
Attached file bug-991567-logcat
Okay, so I can reproduce this when I try to read an email with 1M attachment.
Here is my STR:
1. launch email app
2. see there is a mail with an attachment (it's a JPEG file, size: 1.1M)
3. click on the mail to read it

Expected Result:
entering the mail

Actual Result:
Email app is killed


shell@android:/ $ b2g-ps
APPLICATION      USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              root      84    1     147716 29820 ffffffff 00000000 S /system/b2g/b2g
(Nuwa)           root      330   84    51664  3208  ffffffff 00000000 D /system/b2g/plugin-container
E-Mail           app_1966  1966  330   86568  37924 ffffffff 00000000 R /system/b2g/plugin-container
Flags: needinfo?(bugmail)
Attached file bug-991567-logcat_2 (obsolete) —
more logs. This time, email is kill while downloading.

b2g-info BEFORE download starts

                          |     megabytes     |
           NAME  PID PPID CPU(s) NICE  USS  PSS  RSS VSIZE OOM_ADJ USER
            b2g   84    1  261.0    0 22.8 25.0 30.7 146.4       0 root
         (Nuwa)  330   84    4.4    0  0.0  0.7  2.7  50.5       0 root
         E-Mail 2584  330   13.8    1  6.8  9.3 15.4  85.3       2 app_2584
     Homescreen 2633  330    7.8    0  8.8 11.3 17.4  65.7       2 app_2633
(Preallocated a 2644  330    2.8    0  3.1  4.6  9.2  59.4       1 root

System memory info:

            Total 98.1 MB
     Used - cache 76.5 MB
  B2G procs (PSS) 50.8 MB
    Non-B2G procs 25.7 MB
     Free + cache 21.6 MB
             Free  4.0 MB
            Cache 17.6 MB

Low-memory killer parameters:

  notify_trigger 14336 KB

  oom_adj min_free
        0  1024 KB
        1  2048 KB
        2  4096 KB
        6 18432 KB
        8 18432 KB
       10 24576 KB
Attached file bug-991567-logcat_3
This time, the download ran a while(>10s), but eventually OOMed...

* b2g-info BEFORE download starts

                          |     megabytes     |
           NAME  PID PPID CPU(s) NICE  USS  PSS  RSS VSIZE OOM_ADJ USER
            b2g   84    1  288.7    0 17.1 19.6 24.9 140.1       0 root
         (Nuwa)  330   84    4.9    0  0.0  0.7  2.7  50.5       0 root
         E-Mail 3135  330   19.0    1 15.9 18.3 23.8  86.3       2 app_3135
     Homescreen 3167  330    7.8    0  1.8  3.4  8.1  65.7       2 app_3167
(Preallocated a 3184  330    3.2   18  2.8  4.1  8.1  59.4       1 root

System memory info:

            Total 98.1 MB
     Used - cache 74.6 MB
  B2G procs (PSS) 46.0 MB
    Non-B2G procs 28.6 MB
     Free + cache 23.5 MB
             Free  6.7 MB
            Cache 16.8 MB

Low-memory killer parameters:

  notify_trigger 14336 KB

  oom_adj min_free
        0  1024 KB
        1  2048 KB
        2  4096 KB
        6 18432 KB
        8 18432 KB
       10 24576 KB

* dmesg (full log attached in next attachment)
2379 <6>0[ 3157.536314] [ 3135] 13135  3135    24252     4876   0       2           134 E-Mail
2380 <6>0[ 3157.536378] [ 3167] 13167  3167    16571     2703   0       2           134 Homescreen
2381 <6>0[ 3157.536445] [ 3184]     0  3184    14962     1241   0       1            67 (Preallocated a
2382 <6>0[ 3157.536513] [ 3263]     0  3263      178       41   0       0             0 logcat
2383 <6>0[ 3157.536577] [ 3273]     0  3273      192       22   0       0             0 sh
2384 <3>0[ 3157.536636] Out of memory: Kill process 3135 (E-Mail) score 374 or sacrifice child
2385 <3>0[ 3157.536702] Killed process 3135 (E-Mail) total-vm:97008kB, anon-rss:14832kB, file-rss:4672kB
2386 <4>0[ 3157.632289] slog invoked oom-killer: gfp_mask=0xd0, order=2, oom_adj=-16, oom_score_adj=-941
2387 <4>0[ 3157.632367] Backtrace:
2388 <4>0[ 3157.632399] Function entered at [<c4537a30>] from [<c48838e4>]
2389 <4>0[ 3157.632448]  r7:000000d0 r6:000000d0 r5:00000002 r4:c59e8000
Attachment #8401233 - Attachment is obsolete: true
Attached file bug-991567-dmesg
could this be caused my the new LMK setting? perhaps after we roll back the LMK setting, we should test this again. thanks
Flags: needinfo?(tchou)
According to comment 7, it is killed by Linux OOM killer, which is not about recent LMK change.
Flags: needinfo?(tchou)
Let's visit the product requirement for Tarako. 
--------------------------------------------------------
When the size of the attachment is larger than 1M, the reproduce rate is 100%
When the size of the attachment is about 300K, this happens occasionally.


do we have idea on how this can be fixed?
on such a low memory device, i feel that we need to limit this use case
can we disable the support of attachment downloading? if the size the over a certain limit?
or could be just completely disable the download feature? Thanks
Flags: needinfo?(mkhoo)
Investigating currently.
Assignee: nobody → m
In a world, filled with ZOMBIES, you must not fall asleep, lest you
turn into a zombie yourself. But more seriously:

I'm not sure how thoroughly this patch actually addresses the issue,
as Tarako crashes if you look at it wrong, and I don't know the
specific workload the phone was under in the conditions described in
the STR. In my head, and in my manual testing, it seems to help.

This patch (for IMAP) changes attachment parsing as follows:
Attachments are decoded as they come, rather than attempting to decode
the entire attachment after it has been downloaded. Accordingly, the
previously-encoded-and-now-binary buffers are accumulated and passed
directly to `new Blob()` without being copied further.

If I understand the code path, it goes something like:

   socket -> fetcher -> mailparser -> streaming decode ==> Blob

Which, given that all of the operations prior to the Blob are chunked,
there shouldn't be any extraneous copying going on.

I tested with a couple of attachment sizes:

- 1MB-ish photo, no problems downloading or viewing in my testing.

- 2.8MB-ish photo; it always downloads completely, but then
  _sometimes_ the app dies somewhere between creating the Blob and
  saving it to storage. Sometimes it works too. But this is somewhat
  irrelevant, because the Photo Viewer will refuse to display a file
  this large (saying it's too big to be displayed on the device).

- 20MB photo: It happily downloads about 4MB before crashing.

YMMV. Would be nice to have someone else test this too.

*However*, there's one fairly big caveat, which would probably be
worth scoping in a different bug: The results I list above are what
happen when the phone stays awake, screen on.

If you let the phone lock, or it locks itself (by waiting as a large
file downloads without tapping the screen once in a while), the app
consistenly gets killed almost right away. It seems as though the
system cares more about the Lock Screen staying alive than E-mail,
even if we have a wake lock.

Asking 'feedback?' on this patch, as it hasn't been cleaned up or
unit-tested, and I haven't given any thought as to how this affects
ActiveSync or POP3 (ha!). We hadn't used the Streams functionality of
MailParser yet, so I have a lingering doubt that we may have not used
it for a reason.
Attachment #8402104 - Flags: feedback?(bugmail)
Comment on attachment 8402104 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/299

No real intent in not using streams; I believe the original hookup to mailparser and buffering approach was for bodies and attachment download was able to reuse this code without deeper thought.  Because of usual time-pressure reasons changing to proper streaming was deferred as an enhancement with the idea that that should be easy since we'd already have the unit tests in place.  I think I also initially used a transformation hack to get around the angry nature of window.atob when encountering whitespace or gibberish before we eventually moved to having a Buffer shim that was more consistent with node.  (Although that might have been in mimelib?)

This looks like a good direction and I'm happy to see it's not a ton of extra code.

test_compose.js already has a test of round-tripping of attachments and checking that the received file is byte-for-byte identical with the sent file.  I think the extra coverage we would want given this change is mainly to ensure that our attachment is big enough that it will be fragmented into multiple TCP packets.  If you modify test_compose.js's hard-coded length of 256 to a somewhat larger size with the same 0-255 pattern, I think that will be sufficient for the patch.
Attachment #8402104 - Flags: feedback?(bugmail) → feedback+
Flags: needinfo?(bugmail)
Whiteboard: OOM → OOM [MP_Blocker]
triage: partner blocker. 1.3T+
blocking-b2g: 1.3T? → 1.3T+
Whiteboard: OOM [MP_Blocker] → OOM [priority]
Comment on attachment 8402104 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/299

Cleaned up the patch a tad and adjusted test_compose.js as you suggested. POP3 and ActiveSync build out the messages differently so I didn't modify those here; happy to check those out if you think they might have similar low-hanging fruit.
Attachment #8402104 - Flags: review?(bugmail)
landed:

master: https://github.com/mozilla-b2g/gaia/commit/3aef168f164becc5fa6564045bc4e17d5c300959

gelam: https://github.com/mozilla-b2g/gaia-email-libs-and-more/commit/9cb6f3aa21c399054a430baaacd0676b78a2ef8d
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(mkhoo)
Resolution: --- → FIXED
Whiteboard: OOM [priority] → OOM [priority] [p=5]
Target Milestone: --- → 1.4 S5 (11apr)
[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): n/a
[User impact] if declined: Downloading a large attachment (anywhere from several hundred KB upward) is likely to cause the e-mail app to crash.
[Testing completed]: manual/automated test
[Risk to taking this patch] (and alternatives if risky): low
[String changes made]: none
Attachment #8403016 - Flags: approval-gaia-v1.3?(fabrice)
Attachment #8403016 - Flags: approval-gaia-v1.3?(fabrice) → approval-gaia-v1.3+
Is it land on v1.3T? I can't find this commit.
Flags: needinfo?(fabrice)
Is it land on v1.3T? I can't find this commit.
Is it land on v1.3T? I can't find this commit.
Yes, it's 62acb4b0e774b6709b8be400d849f807404bb21b
Flags: needinfo?(fabrice)
Whiteboard: OOM [priority] [p=5] → OOM [priority] [p=3]
Attached file dmesg.txt
verify with today's daily build, still has this problem
Gaia      c62bff0bfb8b069c962dfae2640e931cc9ad1bdf
Gecko     https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/7e764399b4fc
BuildID   20140415164003
Version   28.1
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
test picture is 1.5MB
(In reply to Mike Lien[:mlien] from comment #25)
> verify with today's daily build, still has this problem

This fix was just to improve things and since a patch has landed without regression no further patches will be developed on this bug.  Please see bug 996232 which tracks us crashing on larger attachments and is marked as dependent on this bug.  I have clarified the summary of this bug to make it clear that the fix is not intended to make us never crash on downloading attachments.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Summary: [Tarako][Email] Email app is killed while downloading an attachment → [Tarako][Email] Email app could use less memory when downloading attachments; OOM crashes on very small attachments
Fixed in today's Tarako build 1.3t

1.3 Environmental Variables:
Device: Buri 1.3 MOZ
BuildID: 20140429024001
Gaia: caab7a78f0c04913f48a3051259663ee85625906
Gecko: f84a8ffbc552
Version: 28.0
Firmware Version: v1.2-device.cfg
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: