Closed Bug 928614 Opened 11 years ago Closed 11 years ago

[META][B2G][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly

Categories

(Firefox OS Graveyard :: Gaia::Camera, defect, P1)

defect

Tracking

(blocking-b2g:hd+, b2g-v1.1hd fixed, b2g-v1.2 fixed)

RESOLVED FIXED
1.3 Sprint 4 - 11/8
blocking-b2g hd+
Tracking Status
b2g-v1.1hd --- fixed
b2g-v1.2 --- fixed

People

(Reporter: lecky.wanglei, Assigned: dmarcos)

References

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1

Steps to reproduce:

1. open the gallery 、FM、Message app
2. open the camera and take snapshot for two pictures
3. Alternating  clicking on the two thumbnails quickly


Actual results:

1. the camera is close automatically and the pid of gallery, FM, Message are all killed


Expected results:

1. I can see the two pictures
Severity: normal → critical
blocking-b2g: --- → hd?
Flags: needinfo?(wchang)
Priority: -- → P1
Attached file log.rar
(1) According to the log, the app gallery,Setting,Marketplace, camera .etc are all killed because of low memory.
(2) when I check the media_frame.js https://github.com/mozilla-b2g/gaia/blob/v1.1.0hd/shared/js/media/media_frame.js#L88
I find that the  bigEnough(preview) is added compare to the  previous version.  bigEnough(preview) is always return false becase preview.width and preview.height is alway undefined.
  So https://github.com/mozilla-b2g/gaia/blob/v1.1.0hd/shared/js/media/media_frame.js#L109-110 is always execute.
Evelyn,

Can you or something check out comment 2's analysis? seems kind of odd, but i am not sure if it is in relevant to the described bug here?
Flags: needinfo?(wchang) → needinfo?(ehung)
Is it always reproducible? I tested on partner's build 10/15, but can't reproduce it.

and I think (2) in comment 2 is not the root cause of this problem, but (1) is. Low memory is much more like to cause all apps killed.

Regarding (2), if you happen to be able to reproduce the issue again, I'd like to know why "preview.width" and "preview.height" are undefined. Can you trace the code and dump values on some points of the whole flow?
Flags: needinfo?(ehung) → needinfo?(lecky.wanglei)
Hi Eyelyn:

  (1)It is always reproducible in our 1.1.0hd devices.
  
  (2)Yes, I enter adb shell b2g-info, and find that If I do the reproduce steps, the memory become fewer and fewer 
  
 (3)I have add the log as follows:
     function bigEnough(preview) {
        if (!preview.width || !preview.height){
          dump("camera: preview.width = " + preview.width +" preview.height " + preview.height);
          return false
         }
      ...
     the preview.width and preview.height is always undefined

     so https://github.com/mozilla-b2g/gaia/blob/v1.1.0hd/shared/js/media/media_frame.js#L109-110 is always execute
Flags: needinfo?(lecky.wanglei)
Hi Eyelyn:

  You have said you cann't reproduce it, Maybe It relates to the image resolution. 
  I test two pictures of 640*480 which is get from front camera, I also cann't reproduce it. 
  But if I test two pictures of 2596*1944 which is get from back camera, It is always reproducible, but if I disenable bigEnough(preview). I also can't reproduce it
From the kernel log:

01-06 00:41:40 <4>[448, HTML5 Parser] [  516.382308] do_ex01-06 00:41:43 <4>[276, Gecko_IOThread] [  519.559363] select 1263 (Camera), adj 2, size 76768, to kill
01-06 00:41:43 <4>[276, Gecko_IOThread] [  519.559383] send sigkill to 1263 (Camera), adj 2, size 76768

The 'size' field is the count of 4KiB pages[1], so for some reason, the Camera app is holding on to almost 300MiB of data!

1. https://wiki.mozilla.org/B2G/Debugging_OOMs#Step_1:_Verify_that_it.27s_actually_an_OOM
Summary: [B2G][Helix][Camera][dingyu]All the running apps will be killed when i click the thumbnails quickly → [B2G][Helix][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly
I can confirm that this happens on Inari as well. Some additional observations:
1. Inari doesn't have enough memory to load all of the apps in comment 0 all at once, but it doesn't matter
2. Take two pictures, then tap --REALLY QUICKLY-- back and forth on the two thumbnails in the filmstrip, fast enough that the full image never displays completely
3. If you have |adb shell cat /proc/kmsg| open in a window, one by one you'll see the background apps get killed, e.g.:

<4>[10-21 22:19:34.561] [25: kswapd0]select 529 (FM Radio), adj 10, size 5902, to kill
<4>[10-21 22:19:34.561] [25: kswapd0]send sigkill to 529 (FM Radio), adj 10, size 5902
<4>[10-21 22:19:37.884] [25: kswapd0]select 828 ((Preallocated a), adj 10, size 4827, to kill
<4>[10-21 22:19:37.884] [25: kswapd0]send sigkill to 828 ((Preallocated a), adj 10, size 4827
<4>[10-21 22:19:40.127] [25: kswapd0]select 382 (Homescreen), adj 8, size 5663, to kill
<4>[10-21 22:19:40.127] [25: kswapd0]send sigkill to 382 (Homescreen), adj 8, size 5663

4. Once the Camera app is the only app left, it takes some time for it to die, but it does eventually:

<4>[10-21 22:21:40.184] [804: Camera]select 804 (Camera), adj 2, size 27030, to kill
<4>[10-21 22:21:40.184] [804: Camera]send sigkill to 804 (Camera), adj 2, size 27030

In this case, it was using >105MiB when it was killed.

djf, could this be due to us keeping too many copies of the fully-decoded images in memory at the same time?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(dflanagan)
This is obviously a OOM issue.

And if the bigEnough() function is returning false, that means that we're decoding the full-size image instead of the preview. 

Get bigEnough() to return true, and that will fix the OOM.

Note that on an HD device, we want to be sure that the camera is creating photos with EXIF preview images at least as big as the screen size.  That may be unusually big for an EXIF preview, but if we don't do that, we run into OOM issues like this.  The camera app may need tweaking to be sure it is requesting large enough previews.
Flags: needinfo?(dflanagan)
HD+'ing this with David's comment 9.

Mike, would you be able to work on this in before the end of October? If your plate is full or if you know someone else to be better suitable for this please let me know.
blocking-b2g: hd? → hd+
Flags: needinfo?(mhabicher)
Assignee: nobody → mhabicher
Flags: needinfo?(mhabicher)
Summary: [B2G][Helix][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly → [B2G][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly
Follow-up to comment 9 and comment 11, Inari's display is 320x480, so bigEnough() _should_ be returning true.
Looks like the 'preview' object passed bigEnough() has neither .width nor .height defined:

    if (preview.width) {
      dump("preview.width=" + preview.width);
    } else {
      dump("preview.width is missing");
    }
    if (preview.height) {
      dump("preview.height=" + preview.height);
    } else {
      dump("preview.height is missing");
    }

10-22 11:46:08.217   618   618 I GeckoDump: preview.width is missing
10-22 11:46:08.217   618   618 I GeckoDump: preview.height is missing
parseJPEGMetadata() returns 'metadata.preview' without height and width properties:

        dump("mikeh: metadata.preview=" + metadata.preview.toSource());

10-22 12:25:27.269  1651  1651 I GeckoDump: mikeh: metadata.preview=({start:450, end:16280})
I'm rewriting the way we parse meta data. I think It makes sense for me to take this bug
Assignee: mhabicher → dmarcos
It looks like the correct solution would be for filmstrip.js to call metadataParser() instead of parseJPEGMetadata(); the former properly takes the extracted thumbnail and extracts the height and width from it, setting preview.height/.width; the latter (in all its incarnations) doesn't.

Looks like this will be fixed in bug 928612. Handing off to dmarcos.
Depends on: 928612
Thanks Mike.


Diego,
We'll need this bug fixed on v1.1HD prior to bug 928612 though, can we get something suitable fixed here first?

Thank you
Flags: needinfo?(dmarcos)
Wayne,

I found a quick fix for this bug. I will try to get a patch done for tomorrow. I'm confident we can get this fixed on v1.1HD
As David Flanagan suggested the problem comes from the bigEnough function. It's not actually a bug since the function works as expected:

When the user clicks on a filmstrip thumbnail the function bigEnough checks if the thumbnail is large enough to cover the whole area where we display the image. If the image is smaller than the area we render the complete image. In the case of the phone I tested (hamachi) the thumbnail is always smaller than the display area so the complete image is always rendered resulting in crashes when the user quickly navigates between pictures.

I'm trying to come up with a short term solution for this. I need some input from David Flanagan:

As a quick solution: Do you think is reasonable to always display the thumbnail in the filmstrip? This way we avoid the crashes but we always render a black frame around the images since the thumbnail is not big enough to cover the whole screen. If the user decides to zoom in the images is also going to inspect a low quality version of the taken image.
Flags: needinfo?(dflanagan)
Flags: needinfo?(dmarcos)
Mike was saying on IRC yesterday that he thought that we were only querying the size of the full image, and never actually getting the size of the preview image, and that those values were undefined, and that is why bigEnough() was failing.  I think Mike is right, and we need to fix this.

But there may be a bigger, deeper bug: Diego is finding that when he attempts to display the preview image (even without knowing its size) it is smaller than the screen.  If that is true -- if the camera hardware is returning a jpeg file whose EXIF preview image is smaller than the screen size -- then we are in big, big trouble: I can guarantee you that if the camera returns photos like that Gallery will be slow and crashy. We absolutely must fix this, especially for hd devices.

Bug 807058 added a gecko API so that the camera app could control the size of the embedded preview. That patch never landed because the hardware on the unagi didn't honor the setting.  Mike is trying it again on our newer hardware.  If his patch works now, then we'll need hd+ on bug 807058 so we can land it.

If it still doesn't work, then someone will need to fix this lower down, presumably at the vendor device driver level.  Or we will need to restrict the helix (and other affected devices) to taking 2mp pictures.  

If we ever get a fix for bug 854795, then we can rewrite gallery so that embedded preview size does not matter.  But for now, we NEED the big embedded previews.
Flags: needinfo?(dflanagan)
(In reply to Diego Marcos from comment #20)
 
> As a quick solution: Do you think is reasonable to always display the
> thumbnail in the filmstrip? This way we avoid the crashes but we always
> render a black frame around the images since the thumbnail is not big enough
> to cover the whole screen. If the user decides to zoom in the images is also
> going to inspect a low quality version of the taken image.

I think that UX would hate us if we did that. But, as noted above, if we can't fix the preview size, we'll have much bigger problems, like Camera and Gallery crashing all the time.  We really need to get the preview size to be big enough.
I confirm my observations with numbers. Tested on hamachi device: 

Screen Size -> width: 320 height: 480 
Thumbnail Size -> width: 240 height: 320

If the device is in landscape (horizontal) orientation we can fill the width of the screen (320px) with the height of the thumbnail (320px). We end up with empty space on the left and right sides of the image. This is expected since we are dealing with different aspect ratios.

If the device is in portrait (vertical) orientation we cannot completely fill the screen. 480px (screen height) vs 320px (thumbnail height) and 320px (screen width) vs 240px (thumbnail width). We end up with empty space on the sides, top and bottom of the image. 

With the code we have today the bigEnough check returns false and the full image is going to be always rendered.

David, Do we implement a quick fix to always display the thumbnail even if the UX is not optimal? I think a crash is worse than displaying an image smaller than the screen. I think in the future we will be always displaying the thumbnail anyway, right? We just have to make sure that the thumbnail has the same size than the screen.

I have a patch ready to go if you give thumbs up to the quick fix.
Flags: needinfo?(dflanagan)
No, I don't think we can go for the quick fix here.  We need to fix the underlying issue, or Gallery will be really sucky because scanning will be slow and it will crash often with OOMs.

Mike reports in bug 807058 that his Hamachi uses a default thumbnail size that is big enough.  It is possible, I suppose that one of you is using out-of-date firmware.

I think we need to make bug 807058 be hd+, so we can land it and uplift it.  Then we need to modify camera.js to explicitly set the thumbnail size to something bigger than the screen.  (But to work on HD, it has to be bigger than the screen times the CSS device pixel ratio).  Something similar to the code we use to pick a preview size would work.  Or, if the config.js file has been uplifted to 1.1hd, then we could just add another config variable to that file and set the thumbnail size based on that.

But also, we need to fix the underlying issue that we call parseJPEGMetadata() to get the preview image, but then never call parseJPEGMetadata() on the preview to get the size of the preview. That's the immediate cause of the bigEnough() failing.
Flags: needinfo?(dflanagan)
Clearing the dependency on 928612. The patch in that bug is going to be really big and not suitable for uplift on an hd+ bug.  There is a much simpler fix we can use to just get the size of the image.
No longer depends on: 928612
Given the importance of having a big enough EXIF preview, we should probably also modify filmstrip.js in this patch so that if bigEnough() fails, we use console.error() to put a big ugly error message in the logcat.
As noted in comment 13, preview.width and .height are definitely not getting set.

Further to this and bug 807058, it looks like although hamachi (e.g.) defaults to a JPEG thumbnail of 512x384, we're selecting a smaller one due to bug 808030.
Depends on: 931054
From my discussions with Mike and Diego, here is my understanding of the current plan:

1) Diego will fix the metadata parsing issue in 931054 so we're testing the preview size correctly. That bug is hd+ and there is a patch with an r+ already.

2) Mike will land 807058 to give us a JS API for setting the embedded EXIF preview size from the camera app. He has a patch ready to go.  This may also involve backing out bug 808030, and if so, Mike will presumably file a blocking bug for that.

3) Diego will modify the camera app to request a EXIF preview that is big enough.  He'll do this work in 931093. That bug is hd? and needs to be hd+ so it can be uplifted.

Fixing the the three things above should get us previews that are big enough to prevent the OOM. However, I expect that they will also cause photos to start looking blurry on hd devices. So I think there is a fourth step required here as well:

4) Diego will modify Camera, Gallery and shared/js/media/media_frame.js to make them sensitive to devicePixelRatio, so that the minimum preview size isn't the screen size in CSS pixels but the screen size in device pixels.  This bug has not been filed yet, but when it is, it should block this one, and it should be made hd+.
Depends on: 931125
David, Mike, Diego, Thanks for your work! If you are ready to give the patch from Comment 28, I can do the test and verify task in our hd? device.
Lecky, the patch for 931054 has landed on master. Feel free to test it.
Hi Diego: The patch seem base on v1.2, But we are used is v1.1hd? Can you help to merge the change to v1.1hd? thank you!
Flags: needinfo?(dmarcos)
I attached a patch on bug 931054 that applies on v1.1hd
I have set the flags accordingly here:

(In reply to David Flanagan [:djf] from comment #28)
> From my discussions with Mike and Diego, here is my understanding of the
> current plan:
> 
> 1) Diego will fix the metadata parsing issue in 931054 so we're testing the
> preview size correctly. That bug is hd+ and there is a patch with an r+
> already.
> 
> 2) Mike will land 807058 to give us a JS API for setting the embedded EXIF
> preview size from the camera app. He has a patch ready to go.  This may also
> involve backing out bug 808030, and if so, Mike will presumably file a
> blocking bug for that.

Patch available, HD+'ed here, if required please also land it on v1.2 for future releases.

> 
> 3) Diego will modify the camera app to request a EXIF preview that is big
> enough.  He'll do this work in 931093. That bug is hd? and needs to be hd+
> so it can be uplifted.


Patch available, HD+'ed here, if required please also land it on v1.2 for future releases.

> 
> Fixing the the three things above should get us previews that are big enough
> to prevent the OOM. However, I expect that they will also cause photos to
> start looking blurry on hd devices. So I think there is a fourth step
> required here as well:
> 
> 4) Diego will modify Camera, Gallery and shared/js/media/media_frame.js to
> make them sensitive to devicePixelRatio, so that the minimum preview size
> isn't the screen size in CSS pixels but the screen size in device pixels. 
> This bug has not been filed yet, but when it is, it should block this one,
> and it should be made hd+.
Diego is on this, HD+'ed here, if required please also land it on v1.2 for future releases.


David, anything specific to be done this bug 928614 itself? Should we turn this into a meta instead if nothing to do here?
Hi Wayne. I don't think there's anything specific that has to be done for this bug once the dependencies are resolved.
Flags: needinfo?(dmarcos)
(In reply to Diego Marcos from comment #34)
> Hi Wayne. I don't think there's anything specific that has to be done for
> this bug once the dependencies are resolved.

Thanks Diego, marking this META :)
Summary: [B2G][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly → [META][B2G][Camera][dingyu] All running apps get killed when clicking the filmstrip thumbnails quickly
Blocks: 931125
No longer depends on: 931125
No longer blocks: 931125
Depends on: 931125
Target Milestone: --- → 1.3 Sprint 4 - 11/8
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: