Closed Bug 974716 Opened 11 years ago Closed 10 years ago

Gallery app is killed by OOM during swapping pic

Categories

(Firefox OS Graveyard :: Gaia::Gallery, defect, P3)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:-)

RESOLVED WORKSFORME
blocking-b2g -

People

(Reporter: tkundu, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [c=memory p= s= u=] [CR 619861][MemShrink:P3] [priority])

Attachments

(1 file)

Attached image IMG_0001.jpg
STR:
1) Insert an SD card with 100 copies of the attached image file.
2) Flash the userdata partition or reset Gaia to force the Gallery app to scan all images on the SD card.
3) Open the Gallery app. You will see that it is scanning images.
4) Open an image immediately after opening the Gallery app.
5) Swipe the image to see the next image in the gallery.
6) Repeat step (5) 30 times.
7) dmesg shows the Gallery app receiving SIGKILL, and the app crashes.
This issue is observed in both ICS-based and JB-based FFOS 1.3. Gaia and Gecko revisions are as below:
gaia commit hash: a43904d9646109b48836d62f7aa51e499fbf4b2e
gecko commit hash: 0b4f0a800c967e94edf23180d19f42e3fbbb2a9d
You can use any QRD msm7627a device to reproduce it easily.
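For anyone preparing the SD card contents for step 1, the copies can be generated with a short script. This is only a sketch and not part of the original report: the function name `make_copies`, the numbered output file names, and the destination directory are all assumptions chosen for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def make_copies(source: Path, dest_dir: Path, count: int = 100) -> List[Path]:
    """Copy `source` into `dest_dir` `count` times under numbered names.

    Hypothetical helper: the naming scheme (IMG_0001_001.jpg, ...) is an
    assumption, not something the bug report specifies.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for i in range(1, count + 1):
        target = dest_dir / f"{source.stem}_{i:03d}{source.suffix}"
        shutil.copyfile(source, target)  # copy file contents only
        copies.append(target)
    return copies
```

Calling `make_copies(Path("IMG_0001.jpg"), Path("sdcard_staging"))` would fill a local staging directory that can then be pushed to the device's SD card by whatever means is convenient.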
blocking-b2g: --- → 1.3?
Why is this nominated? The use case feels entirely unrealistic & easy for a user to recover from.
What is a QRD msm7627a device? Does that correspond to any of the devices that we produce nightly builds for?
I agree with Jason that this is an unrealistic test. Note also that the test image does not have a suitable EXIF preview image. This wouldn't reproduce with 100 photos from the camera. And on the other hand, if you had a 5mp test image with no EXIF preview instead of a 2mp test image, then it would be even easier to reproduce this.
Big images are hard. Gecko doesn't give us good primitives for managing image memory, and the web platform doesn't have any kind of OutOfMemoryError that we can handle. We don't know that we're using too much memory until we've already been killed.
I've optimized all of this to the best of my ability for previous releases. There is already special case code that prevents image editing while scanning. I don't think it would be acceptable to prevent swiping between images while scanning. I don't think there is any way we can fix this in gaia until 854795 lands.
I recommend making this 1.3- and marking it as a dependency of 949755 (which in turn depends on 854795).
Tapas, is this observed on v1.2 w/ a QRD7627a too?
(In reply to Michael Vines [:m1] [:evilmachines] from comment #4)
> Tapas, is this observed on v1.2 w/ a QRD7627a too?
v1.2 ICS on the QRD7627a is fine; the issue is not present there. It is present only in v1.3 ICS (QRD 7627a) and v1.3 JB. This is a regression in v1.3.
This is a gaia bug. Gaia developers don't speak chipset. What the heck is a 7627a? I have no idea what device to attempt to duplicate this on. I've got a hamachi, a helix and a nexus 4. Is one of those a 7627a device?
Flags: needinfo?(tkundu)
If this is a regression, then comment #3 doesn't apply. Taking it to investigate, though I'm quite pessimistic about being able to do anything about it.
Assignee: nobody → dflanagan
Keywords: regression
Keywords: perf
Whiteboard: [CR 619861] → [CR 619861][MemShrink]
(In reply to David Flanagan [:djf] from comment #7)
> If this is a regression, then comment #3 doesn't apply. Taking it to
> investigate, though I'm quite pessimistic about being able to do anything
> about it.
Can you please try with Hamachi and let us know if you have any difficulty reproducing this bug? Thanks a lot for your help.
Flags: needinfo?(tkundu)
Keywords: qawanted
(In reply to David Flanagan [:djf] from comment #3)
> Note also that the test image does not have a suitable EXIF preview image.
> This wouldn't reproduce with 100 photos from the camera.
Looking at our internal bug report I don't see any details on the exact gallery images used to originally reproduce this bug. Tapas, can you please confirm those details.
Flags: needinfo?(tkundu)
I was unable to reproduce this bug on my hamachi with today's nightly 1.3 engineering build. I used 100 copies of the sample image attached to this bug. I haven't tried Helix or nexus4 since they both have more memory and are less likely to reproduce an OOM bug.
Bhavana has already set qawanted. If anyone can figure out how to reproduce on a hamachi, let me know and I'll try to investigate some more. But in the meantime, I'm going to unassign myself.
Tapas: could this be another case like the initial scan bug where your reference devices had a more aggressive LMK or were not sending proper memory pressure signals to gecko? As I said in that bug, Gallery is the most memory intensive app we have, and it can serve as a stress test for memory management in gecko and gonk.
Assignee: dflanagan → nobody
>> I was unable to reproduce this bug on my hamachi with today's nightly 1.3 engineering build Can you please run |adb shell procrank| and confirm me "Total memory" on your device.
Flags: needinfo?(tkundu) → needinfo?(dflanagan)
Flags: needinfo?(tkundu)
Tapas, my hamachi does not have procrank installed. I use b2g-ps instead, but that does not report total memory. Here are two lines from dmesg:
<6>[ 0.000000] Memory: 194MB = 194MB total
<5>[ 0.000000] Memory: 184232k/184232k available, 39000k reserved, 0K highmem
And here's a line from /proc/meminfo:
MemTotal: 184768 kB
I would guess that means that it is a 256mb device with 70mb or so taken up by the kernel.
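The arithmetic behind that guess can be checked with a small script. This is a sketch under stated assumptions: the 256 MB nominal RAM figure is the commenter's guess rather than a confirmed spec, and the function names `mem_total_kb` and `unavailable_mb` are invented for illustration.

```python
import re


def mem_total_kb(meminfo_text: str) -> int:
    """Extract the MemTotal value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1))


def unavailable_mb(meminfo_text: str, nominal_ram_mb: int = 256) -> float:
    """Memory (in MB) not reported as MemTotal, given an assumed nominal RAM size.

    nominal_ram_mb=256 is an assumption taken from the comment above.
    """
    return nominal_ram_mb - mem_total_kb(meminfo_text) / 1024


sample = "MemTotal: 184768 kB\n"
print(unavailable_mb(sample))  # 256 - 184768/1024 = 75.5625
```

With the MemTotal value quoted above, roughly 75 MB of an assumed 256 MB is not visible to userspace, which is in line with the "70mb or so" estimate.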
Flags: needinfo?(dflanagan)
I am going to assign this to Tapas so it doesn't sit unassigned. Please try to provide STR or unblock if you can't reproduce it. Also setting 1.3+ to track this properly.
Assignee: nobody → tkundu
blocking-b2g: 1.3? → 1.3+
(In reply to Michael Vines [:m1] [:evilmachines] from comment #9)
> (In reply to David Flanagan [:djf] from comment #3)
> > Note also that the test image does not have a suitable EXIF preview image.
> > This wouldn't reproduce with 100 photos from the camera.
>
> Looking at our internal bug report I don't see any details on the exact
> gallery images used to originally reproduce this bug. Tapas, can you please
> confirm those details.
With the test team's SD card contents, I can see this only in JB-based FFOS 1.3. ICS 1.3 works fine with the test team's SD card contents. The reason seems to be high memory usage at 800x400 resolution, and Mozilla does not have such a device to work on this locally :( . I will debug more and update here soon.
Flags: needinfo?(tkundu)
Flags: needinfo?(tkundu)
Assignee: tkundu → nobody
Flags: needinfo?(tkundu)
I am running out of ideas, other than optimizing gallery further for an 800x400 256MB device, so I have unassigned myself as owner of this bug.
(We aren't going to track this issue anymore. This bug could be reproduced with one of the new OEM devices that Moz has access to, Moz and the OEMs can decide if this is still v1.3+)
blocking-b2g: 1.3+ → 1.3?
blocking-b2g: 1.3? → backlog
blocking-b2g: backlog → -
Due to lack of STR at this time, this bug is unactionable. Moving this off the blocker list.
(In reply to bhavana bajaj [:bajaj] from comment #17)
> Due to lack of STR at this time, this bug is unactionable .
Well not really :)
Unable to repro on buri on the v1.3 build from the 19th, or on the latest v1.3.
Environmental Variables:
Device: buri v1.3 moz ril
BuildID: 20140219040204
Gaia: ac06cfbd2baf6494ffbb668cc599e3892cd5e17b
Gecko: bf0e76f2a7d4
Version: 30.0a1
Firmware Version: v1.2-device.cfg
Environmental Variables:
Device: buri v1.3 moz ril
BuildID: 20140226004002
Gaia: da50b2b62ed96993cf4df22e3d69c435b28506bd
Gecko: 3c39d1e487d9
Version: 28.0
Firmware Version: v1.2-device.cfg
Keywords: qawanted
One of the gallery bugs blocking bug 949748 will probably help here.
Whiteboard: [CR 619861][MemShrink] → [CR 619861][MemShrink:P3]
Depends on: 949748
Just as a note re: realism, I frequently take event pictures with a high-zoom travel point and shoot, and might take 300 pics or more over the course of a couple of hours (I snap everything several times, since half turn out blurry). I could absolutely see loading those into an SD card and shoving them into the phone for triage, editing and posting. I do similar things with my tablet all the time. As for the flow, in other galleries as soon as thumbnails start generating I usually dive into the first pic and then start moving through them to triage, assuming the gallery will keep up with my motion. Then I go back and edit and post. Upshot is I think the flow is actually relatively plausible. What I don't know is whether a typical camera pregenerates the EXIF previews. Otherwise, this doesn't seem unrealistic to me.
Whiteboard: [CR 619861][MemShrink:P3] → [CR 619861][MemShrink:P3] [priority]
Hema, does your team have a ZTE Open C? You should be able to use that to troubleshoot this.
Severity: blocker → normal
Flags: needinfo?(hkoka)
Priority: P1 → P3
Whiteboard: [CR 619861][MemShrink:P3] [priority] → [c=memory p= s= u=] [CR 619861][MemShrink:P3] [priority]
The Gallery guys are mostly using Buri/Hamachi and Nexus 4.
Flags: needinfo?(hkoka)
I think we ended up getting this fixed. But the bug is obsolete anyway since it was specific to hardware that no one is testing on anymore. So closing.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME