941160 - [B2G][MMS] Crash will occur when attaching images to an MMS

Reporter

Description

•

11 years ago

Attached file log.txt — Details

Description: When adding multiple images to an MMS a crash will occur, sending the user back to the home screen. NOTE: This issue does not reproduce on the Buri 20131104 Base Image, on the latest Buri 1.2 build. Repro Steps: 1) Updated Buri to V1.2 Com RIL Build ID: 20131118004001 2) Open the Messenger app, create a new message. 3) Select the attachment icon and select gallery. 4) Select an Image and tap on the check box to add to the MMS. 5) Repeat step 3-4 two or three times. Actual: Crash occurred when adding MMS Expected: Crash never occurs when adding MMS Environmental Variables Device: Buri ver 1.2 COM RIL Build ID: 20131118004001 Gecko: http://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/3e8b854a992e Gaia: 7a23f8c53ba97da9c63a7275b36d155b4526a639 Platform Version: 26.0 RIL Version: 01.02.00.019.102 Firmware Version: V1.2_US_20131115 Notes:Does not generate crash report but app does crash and force close. Always takes more than one MMS to cause crash. Repro frequency: 70% Link to failed test case: https://moztrap.mozilla.org/manage/case/8511/ See attached: Logcat

Jason Smith [:jsmith]

Comment 1

•

11 years ago

This looks graphics related based on the log. Milan - Can someone on your team look into this to see why this is happening? I'm going to assume defensively that this is a bug on our side, as the 1.1 base image doesn't trigger the correct gfx workflows anyways. This is likely a bug only seen when we're using the correct base image. I actually don't know how this could be a vendor bug, so I'd like input from gfx to see what they think is going on here.

blocking-b2g: --- → koi?

Component: Gaia::SMS → Graphics

Flags: needinfo?(milan)

Keywords: perf

Product: Firefox OS → Core

Whiteboard: burirun4 → burirun4 [MemShrink]

Milan Sreckovic [:milan] (needinfo for best results)

Comment 2

•

11 years ago

Sotaro, could you take a look if this is us or vendor issue?

Assignee: nobody → sotaro.ikeda.g

Flags: needinfo?(milan) → needinfo?(sotaro.ikeda.g)

Sotaro Ikeda [:sotaro]

Comment 3

•

11 years ago

(In reply to Milan Sreckovic [:milan] from comment #2) > Sotaro, could you take a look if this is us or vendor issue? I take a look.

Flags: needinfo?(sotaro.ikeda.g)

Sotaro Ikeda [:sotaro]

Comment 4

•

11 years ago

I tested on master b2g with pmem fallback ROM built from today's latest source. I also see the problem. from 'adb shell dmesg', Message app was killed by low memory killer. <4>[ 251.902831] select 636 ((Preallocated a), adj 10, size 4828, to kill <4>[ 251.902854] send sigkill to 636 ((Preallocated a), adj 10, size 4828 <4>[ 253.421836] idle_stats_device: Overwriting samples <4>[ 288.474128] select 559 (Gallery), adj 10, size 7091, to kill <4>[ 288.474153] send sigkill to 559 (Gallery), adj 10, size 7091 <4>[ 289.216199] select 424 (Homescreen), adj 8, size 5112, to kill <4>[ 289.216226] send sigkill to 424 (Homescreen), adj 8, size 5112 <4>[ 307.115284] select 718 (Gallery), adj 10, size 8536, to kill <4>[ 307.115306] send sigkill to 718 (Gallery), adj 10, size 8536 <4>[ 309.359411] select 759 (plugin-containe), adj 10, size 124, to kill <4>[ 309.359453] send sigkill to 759 (plugin-containe), adj 10, size 124 <4>[ 333.999933] select 762 (Settings), adj 10, size 8153, to kill <4>[ 333.999956] send sigkill to 762 (Settings), adj 10, size 8153 <4>[ 334.605773] select 894 (plugin-containe), adj 10, size 496, to kill <4>[ 334.605788] send sigkill to 894 (plugin-containe), adj 10, size 496 <4>[ 338.606279] select 820 (Gallery), adj 10, size 7736, to kill <4>[ 338.606298] send sigkill to 820 (Gallery), adj 10, size 7736 <4>[ 339.820094] select 765 (Homescreen), adj 8, size 4855, to kill <4>[ 339.820114] send sigkill to 765 (Homescreen), adj 8, size 4855 <4>[ 340.338408] select 565 (Messages), adj 2, size 25995, to kill <4>[ 340.338429] send sigkill to 565 (Messages), adj 2, size 25995

Sotaro Ikeda [:sotaro]

Comment 5

•

11 years ago

(In reply to Jason Smith [:jsmith] from comment #1) This is likely > a bug only seen when we're using the correct base image. If we do not use correct base image (no fallback from pmem to ashmem), memory allocation failed because of out of pmem.

Sotaro Ikeda [:sotaro]

Comment 6

•

11 years ago

> > Notes:Does not generate crash report but app does crash and force close. > Always takes more than one MMS to cause crash. This bug seems not crash, but the MMS killed by low memory killer because of out of memory.

Sotaro Ikeda [:sotaro]

Comment 7

•

11 years ago

pfield, can you attach also "adb shell dmesg" log?

Flags: needinfo?(pfield)

Sotaro Ikeda [:sotaro]

Comment 8

•

11 years ago

Recently, memory usage of 2d canvas was shrunk by Bug 920160 and Bug 927254. They decreases memory usage a lot. If there is no gralloc fallback from pmem to ashmem, gralloc usage hit the limit when the device became out of pmem. But the fallback enables more memory usage.

pfield

Reporter

Comment 9

•

11 years ago

Attached file $ adb shell dmesg.txt — Details

Flags: needinfo?(pfield)

Sotaro Ikeda [:sotaro]

Comment 10

•

11 years ago

From attachment 8336378 [details], Messages was killed by low memory killer.

Sotaro Ikeda [:sotaro]

Comment 11

•

11 years ago

When disable SkiaGL by disabling following, the problem did not happen. >pref("gfx.canvas.azure.backends", "skia"); >pref("gfx.canvas.azure.accelerated", true);

Jason Smith [:jsmith]

Comment 12

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #11) > When disable SkiaGL by disabling following, the problem did not happen. > >pref("gfx.canvas.azure.backends", "skia"); > >pref("gfx.canvas.azure.accelerated", true); That concludes this is a regression from Skia GL, which I think makes this a cut and dry blocker.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 13

•

11 years ago

Sotaro, what happens with these settings: pref("gfx.canvas.azure.backends", "skia"); pref("gfx.canvas.azure.accelerated", false); as in, Skia, but not SkiaGL. In general, I would expect us to run out of memory sooner with SkiaGL - after all, we have the caching enabled, and caching means more memory usage. This is not necessarily a bug, just "we're using more memory". It may be worth testing with caching turned off completely (or down to that 2MB minimum). We're supposed to get back 2MB from bug 939392 anyway :) George, Snorp, let's see if we can keep Skia/GL in 1.2, we know it gives us a lot of performance improvement.

Flags: needinfo?(snorp)

Flags: needinfo?(gwright)

Sotaro Ikeda [:sotaro]

Comment 14

•

11 years ago

(In reply to Milan Sreckovic [:milan] from comment #13) > Sotaro, what happens with these settings: > > pref("gfx.canvas.azure.backends", "skia"); > pref("gfx.canvas.azure.accelerated", false); By the above setting, the STR was passed without the crash(not killed). GL/GLCache part seems to be a problem.

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Assignee

Comment 15

•

11 years ago

(In reply to Milan Sreckovic [:milan] from comment #13) > Sotaro, what happens with these settings: > > pref("gfx.canvas.azure.backends", "skia"); > pref("gfx.canvas.azure.accelerated", false); > > as in, Skia, but not SkiaGL. > > In general, I would expect us to run out of memory sooner with SkiaGL - > after all, we have the caching enabled, and caching means more memory usage. > This is not necessarily a bug, just "we're using more memory". > > It may be worth testing with caching turned off completely (or down to that > 2MB minimum). We're supposed to get back 2MB from bug 939392 anyway :) > > George, Snorp, let's see if we can keep Skia/GL in 1.2, we know it gives us > a lot of performance improvement. Right now 2MB is the minimum Skia texture cache size, but it's shared between all canvas instances. I'd recommend against making that much lower (and certainly not zero), as performance would be much worse than software because of constant uploads. Assuming the patch from bug 927254 was in the build, I don't know what we can really do to reduce memory usage much more. One idea I've had in the past is to back immutable Skia bitmaps with ashmem gralloc buffers, which would remove the tex2d copy, but that's going to be significant work with a lot of edge cases I'd imagine.

Flags: needinfo?(snorp)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

11 years ago

Whiteboard: burirun4 [MemShrink] → burirun4 [MemShrink:P1]

Sotaro Ikeda [:sotaro]

Comment 16

•

11 years ago

Milan, I might be better to re-assign this bug to a engineer knows well about SkiaGL. How do you thinks about it?

Flags: needinfo?(milan)

Mike Lee [:mlee]

Updated

•

11 years ago

Severity: normal → major

Priority: -- → P1

Whiteboard: burirun4 [MemShrink:P1] → [c=memory p= s= u=] burirun4 [MemShrink:P1]

Jason Smith [:jsmith]

Updated

•

11 years ago

blocking-b2g: koi? → koi+

Dave Huseby [:huseby]

Comment 17

•

11 years ago

We need product to make the call on SkiaGL in 1.2. See Bug 939962 as it is the same underlying cause.

Flags: needinfo?(clee)

Dave Huseby [:huseby]

Updated

•

11 years ago

Whiteboard: [c=memory p= s= u=] burirun4 [MemShrink:P1] → [c=memory p= s= u=1.2] burirun4 [MemShrink:P1]

Dave Huseby [:huseby]

Updated

•

11 years ago

Severity: major → blocker

Milan Sreckovic [:milan] (needinfo for best results)

Comment 18

•

11 years ago

We have a product decision - SkiaGL stays in. James, I gave you this one and :gw280 bug 939962, though they are probably the same. I don't want to make that assumption though. Can you two please work together to see what the alternatives are.

Assignee: sotaro.ikeda.g → snorp

Flags: needinfo?(milan)

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Assignee

Comment 19

•

11 years ago

Does the patch for bug 946541 fix this?

Jason Smith [:jsmith]

Comment 20

•

11 years ago

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #19) > Does the patch for bug 946541 fix this? Putting qawanted to test this in a build with the patch from bug 946541 included (i.e. test this on a 12/6 1.2 build or later).

Keywords: qawanted

Mike Lee [:mlee]

Updated

•

11 years ago

Keywords: perf → footprint

Whiteboard: [c=memory p= s= u=1.2] burirun4 [MemShrink:P1] → burirun4 [MemShrink:P1]

David Flanagan [:djf]

Comment 21

•

11 years ago

As Milan guessed, this bug is probably related to bug 939962. When the MMS app picks an image from the Gallery app, the gallery app offers the user the chance to crop the image. And image cropping is done by the same part of the Gallery app that does image editing. For bug 935273 I've made this more efficient, so that the full-size image is not decoded unless the user actually does a crop. Now that 935273 has landed, I would expect that the OOM will now only occur if you actually crop the image after picking it. If you can't solve the underlying issue, a possible workaround would be to modify the MMS app to set nocrop:true in the activity request it uses to pick an attachment. This will tell the gallery to not allow cropping. That would be a UX change, however, so you'd want to get UX approval first, of course.

David Flanagan [:djf]

Comment 22

•

11 years ago

Actually, after reading the STR more carefully, I'm not sure whether the MMS app is OOMing because the gallery app is using too much memory or whether it OOMs when it tries to resize the image it just received. Resizing the image requires decoding it at full size which allocates a bunch of memory. Another possible (last-resort) workaround would be to modify the pick activity so the gallery returns both the full-size image and the preview image. The preview image is at least as big as the screen, and that might be about the size that is desired here. It would probably only make sense with the nocrop:true fix mentioned above, however.

Sarah Parsons

Comment 23

•

11 years ago

I can still repro this issue on Buri 1.2 Build ID: 20131206004002 Gaia d48df33cbdae2063f12b09e3dca07ff7a2f2a3cc SourceStamp fd0302696c57 BuildID 20131206004002 Version 26.0 (In reply to Jason Smith [:jsmith] from comment #20) > (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #19) > > Does the patch for bug 946541 fix this? > > Putting qawanted to test this in a build with the patch from bug 946541 > included (i.e. test this on a 12/6 1.2 build or later).

Keywords: qawanted

Preeti Raghunath(:Preeti)

Comment 24

•

11 years ago

Peter Please pitch in for product. Moving to koi? till product decision is made.

blocking-b2g: koi+ → koi?

Flags: needinfo?(pdolanjski)

Peter Dolanjski [:pdol]

Comment 25

•

11 years ago

Based on comment #18, it sounds like product chimed in. Milan, who made the SkiaGL decision?

Flags: needinfo?(pdolanjski) → needinfo?(milan)

Jason Smith [:jsmith]

Comment 26

•

11 years ago

(In reply to Peter Dolanjski [:pdol] from comment #25) > Based on comment #18, it sounds like product chimed in. Milan, who made the > SkiaGL decision? Andreas did. This was already discussed on b2g-release-drivers that we are moving forward with blocking on fixing the memory issues for Skia GL. Please move this back to koi+.

Jason Smith [:jsmith]

Updated

•

11 years ago

blocking-b2g: koi? → koi+

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

11 years ago

Flags: needinfo?(milan)

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Assignee

Comment 27

•

11 years ago

I commented in another bug that we could consider demoting to a software canvas under memory pressure. This would eliminate the GLContext and friends (SurfaceStream, etc) and also whatever GPU stuff Skia has. The problem with this is that in order to do the demotion, we have to allocate a software DrawTarget and then copy the contents from original one. This will obviously use even more memory, which is clearly not a great idea when we are already in a low memory situation. I thought about having an "emergency" demotion mode that would skip this copy, but I don't think it's a very good idea. Content would have no idea when this occurred, so it wouldn't have an opportunity to recover. We would need to add something similar to the "lost context" event that WebGL has.

Jason Smith [:jsmith]

Comment 29

•

11 years ago

Can we check this again on today's 1.2 build? The automation report is saying this is passing now, so I want to double check if this reproduces manually.

Keywords: qawanted

Sarah Parsons

Comment 30

•

11 years ago

I can still repro this issue on the Buri 1.2 Build ID: 20131211004007 Gaia 096722a9e2510ecdfe45ba7382d7d50826b82feb SourceStamp 43d7b300241a BuildID 20131211004007 Version 26.0

Keywords: qawanted

Milan Sreckovic [:milan] (needinfo for best results)

Comment 31

•

11 years ago

We're at a bit of an end of the road here. George is going to take the upstream Skia and see if it fixes the issue (they handle cache clearing better in the newer versions); either way, I don't think we have a 1.2 solution to this issue, short of disabling SkiaGL, at startup, on any phone < 256mb (or perhaps even < 384mb)

Flags: needinfo?(gwright)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 32

•

11 years ago

James will try adding a flag that forces the canvas used as a part of the process to software instead, see if that saves us enough memory to survive the operation.

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

11 years ago

Comment 32 is private: false

Tony Chung [:tchung]

Comment 33

•

11 years ago

(In reply to Milan Sreckovic [:milan] from comment #32) > James will try adding a flag that forces the canvas used as a part of the > process to software instead, see if that saves us enough memory to survive > the operation. Hi Milan, Snorp. any progress on the patch here?

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Assignee

Comment 34

•

11 years ago

There is a patch that adds a flag to force a software context on bug 884226. Just need one more r+ there and we can put that in. Once that is in we'll need an additional patch to set the 'gfx.canvas.willReadFrequently.enable' pref, and then the Gallery app will need to use that flag in its canvas.

Noemí Freire (:noemi)

Updated

•

11 years ago

status-b2g-v1.2: --- → affected

status-b2g-v1.3: --- → affected

Jason Smith [:jsmith]

Comment 36

•

11 years ago

Given Milan's email, this is a dupe of bug 939962 (same problem).

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → DUPLICATE

Dave Huseby [:huseby]

Updated

•

11 years ago

Flags: needinfo?(clee)

log.txt 11 years ago pfield 28.40 KB, text/plain		Details
$ adb shell dmesg.txt 11 years ago pfield 53.73 KB, text/plain		Details