Closed Bug 861965 Opened 11 years ago Closed 11 years ago

[Gallery] fix metadata scanning peformance regression

Categories

(Firefox OS Graveyard :: Gaia::Gallery, defect)

x86
macOS
defect
Not set
normal

Tracking

(blocking-b2g:tef+, b2g18 verified, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 verified)

VERIFIED FIXED
blocking-b2g tef+
Tracking Status
b2g18 --- verified
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- verified

People

(Reporter: djf, Assigned: djf)

References

Details

(Whiteboard: [status: patch ready for review, gecko bug 865929 must land first])

Attachments

(2 files, 1 obsolete file)

This bug is a followup to bugs 847060 and bugs 854799.

The first bug fixed a crash, but in order to do so, it had to make the Gallery very very slow to scan metadata for large images (~5 seconds per image for large images: it should be about 10 times faster than that).

The second bug is an underlying gecko fix that enables me to fix the performance regression.

Both of those bugs were tef+, so I need this one to be as well, so that I can fix the issue.
Cc'ing Milan and Daniel

Milan: since you closed the open tef+ bug I was going to use to fix this, do you have the power to tef+ this one instead?

Daniel: you gave tef+ to the gecko fix that will enable me to fix this, so maybe you'll give tef+ on this one?
blocking-b2g: --- → tef?
Assignee: nobody → dflanagan
Blocks: 847060
Depends on: 854799
blocking-b2g: tef? → tef+
It looks like bug 854799 is not the fix I need.  I'm still getting OOMs when scanning 5mp images using nightly builds that include that bug fix.
(adding comment that ended up in the wrong bug when I originally wrote it).

Per discussion in triage session today we need to block on this, but we don't necessarily need to be able to show thumbnails for the large images that are causing problems here. IOW, we could show an image with an x through it or what not and avoid generating a thumbnail for such images. That's just one idea, but we need something to here to keep the gallery app working with sd cards with large images on it.
Whiteboard: [status: needs workaround, real fix post 1.0.1]
David, can you provide an update here please?
Whiteboard: [status: needs workaround, real fix post 1.0.1] → [status: needs workaround, real fix post 1.0.1, still waiting for update]
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #4)
> David, can you provide an update here please?

Completely blocked on 854799. That bug is marked Resolved/Fixed, but the fix does not work for me.  Kyle is investigating, I think.
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #3)
> (adding comment that ended up in the wrong bug when I originally wrote it).
> 
> Per discussion in triage session today we need to block on this, but we
> don't necessarily need to be able to show thumbnails for the large images
> that are causing problems here. IOW, we could show an image with an x
> through it or what not and avoid generating a thumbnail for such images.
> That's just one idea, but we need something to here to keep the gallery app
> working with sd cards with large images on it.

The issue isn't so much about thumbnails. When our partners started testing with 5 megapixel images that don't have embedded previews the size of the screen, everything fell apart and we discovered that gecko isn't up to the task on a device this constrained.

Because gecko doesn't free its image memory right away (that's what bug 854799 is about) we would get frequent OOMs while the user was browsing large previewless images.  Just swipe back and forth through a few of them and it was a guaranteed crash.

So to avoid the memory issue, I decided that in addition to creating thumbnails, I also need to create screen-sized preview images for images that do not have their own embedded previews. I'm trading off extra time during metadata parsing for not crashing when the user tries to view photos.

So, not displaying a thumbnail is one thing. But then when you tap on the non-thumbnail, I have to display a big X for the image itself and require the user to zoom in to see their actual photo. Clearly that is broken.

So if we can't fix 854799, I think we just have to lower the size limit for the photos we'll display.  Maybe a 3 megapixel limit would solve the OOM issues.

The ultimate solution to my gallery memory woes would be a fix for bug 854795, but in the discussion of that bug, a fix has been categorically ruled out for b2g18.
Note also that creating preview images for 5mp images is inherently time consuming on this little device. So even if we fix this bug, the scanning process for those large images is still going to be terribly slow (until we have a fix for 854795).  It just won't be quite as slow.

That is, if I can resolve this bug, then scanning preview-less 5mp images will take 2 or 3 seconds instead of the current 5 seconds.

I asked for tef+ on this bug because I thought we had a fix for bug 854799 in hand and I wanted permission to uplift a trivial change that would give us that performance boost.

But I don't actually believe that we *need* to block on this fix.

Perhaps bug 854799 should be reopened so that it's tef+ shows up in triage again.
David, if we only show images taken by the device camera within the gallery but relegate (not sure what is involved here) all other images to another folder, does that mitigate the issue or just transfer the problem to this other folder?  (I realize that this could have UI work associated with it)
Flags: needinfo?(dflanagan)
(In reply to Peter Dolanjski [:pdol] from comment #8)
> David, if we only show images taken by the device camera within the gallery
> but relegate (not sure what is involved here) all other images to another
> folder, does that mitigate the issue or just transfer the problem to this
> other folder?  (I realize that this could have UI work associated with it)

I think we'd still want to scan all known photos at app startup time. So even if we separated camera photos from others (which would be a good thing to do for v1.2, I think) we'd still have the really slow scanning speed for those large images.

For some reason, our partners seem to keep testing the gallery app with an sdcard loaded with hundreds of 5mp images.  Are we actually expecting customers to do that?  If not, it doesn't seem like a very realistic test case, and I'm not too concerned about this bug.
Flags: needinfo?(dflanagan)
(In reply to David Flanagan [:djf] from comment #9)
> Are we actually expecting customers to do that?  If not, it doesn't seem like a 
> very realistic test case, and I'm not too concerned about this bug.

I'm checking amongst the product team on this.  If we do put a restriction in place, would the effort be the same to restrict to only show images taken from the device's camera as opposed to doing it based on the image resolution/file size?  (it may be confusing if a user can see some offline loaded SD card images but not others)
Actually I think I have a fix for this in bug 862970. In that bug I no longer lock decoded images so the low memory notification should immediately clear the decoded image surface. Can we test that?
(In reply to Peter Dolanjski [:pdol] from comment #10)
> 
> I'm checking amongst the product team on this.  If we do put a restriction
> in place, would the effort be the same to restrict to only show images taken
> from the device's camera as opposed to doing it based on the image
> resolution/file size?  (it may be confusing if a user can see some offline
> loaded SD card images but not others)

Those two options would require completely different patches. Changing the existing resolution limit would be trivial patch. Changing the app to only display photos from the camera would be very simple, but not trivial like the other.

Meanwhile, I'm building Gecko with Andreas's patch from comment 11 and will report whether that fixes things for this bug.
Thanks Peter, make sure you use the latest patch, on bug 862970, there have been updates since 4/19.
Andreas's patch from bug 862970 does not fix the OOM for me. The app still crashes while scanning large images if I don't insert a long setTimeout() between each image.

Also, with the patch I sometimes find that when switching from fullscreen image display back to the thumbanail display that the thumbnails have disappeared.  I'll attach a screenshot.
I see now that there is a new version of the bug 862970 patch. Andreas must have updated that while I was building the old version....
David, will you be trying again with the newer patch?
Flags: needinfo?(dflanagan)
(In reply to Peter Dolanjski [:pdol] from comment #17)
> David, will you be trying again with the newer patch?

I'd like to, but I can't get the latest version of the patch to apply cleanly to my b2g-18 tree.
Flags: needinfo?(dflanagan)
I've built m-c with the patch from bug 862970 applied.

When I take out my setTimeout() calls to remove the performance regression, the phone crashes and reboots when I try to scan 5mp images.

I see a flash of white screen, as if the gallery is closing with an OOM right before the screen goes blank and the phone reboots.
Attached file patch (obsolete) —
Assignee: dflanagan → gal
Attachment #742101 - Attachment is obsolete: true
Attachment #742101 - Attachment is patch: false
Assignee: gal → dflanagan
Comment on attachment 742101 [details]
patch

>@@ -140,6 +183,10 @@ CanvasImageCache::NotifyDrawImage(Element* aImage,
>     entry->mData->mSurface = aSurface;
>     entry->mData->mSize = aSize;
>   }
>+
>+  // Expire the image cache early if its larger than we want it to be.

it's

Looks good to me.
Attachment #742101 - Flags: review+
> Looks good to me.

(Aside from a missing zero-initialization of mSize, per irc.)
We fixed the gecko bug in bug 865929. djf will do the gaia fix here.
Depends on: 865929
No longer depends on: 854799
With the patch fro bug 865929, I can take out the artificial delay while scanning images.

Now instead of taking 4+ seconds for each large previewless image, we take only 1.8 seconds. With 55 copies of the 5 megapixel images from bug 847060 on the sdcard, the gallery scans them all in 100s, and does not crash, even if I browse the images while scanning.

Note that this metadata parsing speed for large images with small (or no) previews is still longer than it used to be before I fixed bug 847060. That is because I now create, as part of the metadata parsing process, a preview for all images that do not have a sufficiently large embedded preview.  This means that we can always display images quickly and avoids OOM crashes when the user is quickly swiping through large images.

In the normal use case, the user will be viewing photos from the camera. These have embedded previews and the metadata parsing process is very fast.
Dominic,

This is a minor change to Gallery's metadata parsing logic, to remove an artificial setTimeout() delay I had to introduce in my fix for a previous bug.  It should be pretty easy to review.  

I've tested it with large images with small previews, using a custom build with the patch from bug 865929 applied. You won't be able to test it without that patch applied, but given that this is a long-standing tef+ bug, I'm asking for review now so I can uplift the patch as soon as the gecko patch is landed.
Attachment #742204 - Flags: review?(dkuo)
Whiteboard: [status: needs workaround, real fix post 1.0.1, still waiting for update] → [status: patch ready for review, gecko bug 865929 must land first]
Attachment #742204 - Flags: review?(dkuo) → review+
Landed on master: https://github.com/mozilla-b2g/gaia/commit/3463dc01096e5423be6d08dc2f71f62c12ffbb45
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Uplifted 3463dc01096e5423be6d08dc2f71f62c12ffbb45 to:
v1-train: d3e8745ff0471d12d475c98998cb0098b6c969e6
v1.0.1: 898a0fad9e620c04c85881fae745c7fd74028c10
After downloading 5 large images, the average size is 2-3 mb, the Gallery is scanned these images for less than 5 sec, one image loaded for a less than a 1 sec.
Start up load time is 557 milliseconds.
All images are loaded with thumbnails
Tested on Leo and Inari devices

Environmental  Variables:
Leo Build ID: 20130506070204
Kernel Date: Apr 25
Gecko: http://hg.mozilla.org/releases/mozilla-b2g18/rev/00c554abfc17
Gaia: 9377636cee5ac55b9f1d68f598afc7aadfbb2b00

Environmental  Variables:
Inari Build ID: 20130506070205
Kernel Date: Feb 21
Gecko: http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/ce67220b877d
Gaia: 1e598d8842920d9e0b1743dc6fe9390bd5f6e2df
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: