Closed Bug 1164164 Opened 10 years ago Closed 9 years ago

FirefoxOS Gallery app can't always create thumbnails when scanning the sdcard

Categories

(Firefox OS Graveyard :: Gaia::Gallery, defect)

defect
Not set
normal

Tracking

(blocking-b2g:2.2+, b2g-v2.2 verified, b2g-master fixed)

VERIFIED FIXED
blocking-b2g 2.2+
Tracking Status
b2g-v2.2 --- verified
b2g-master --- fixed

People

(Reporter: djf, Assigned: djf)

References

Details

(Keywords: regression, Whiteboard: gfx-noted)

Attachments

(3 files)

I have about 250 images on my sdcard. When I flash today's FirefoxOS 2.2 nightly build onto my 319mb Flame and start the Gallery app, it scans the sdcard and creates thumbnails for the images, but fails to create any thumbnail for 52 images, most or all of which are png images (mostly screenshots) or gif images. Photos and other jpeg images did work. Gallery uses the embedded EXIF preview and/or uses #-moz-samplesize to create thumbnails for jpegs with less memory, but those optimizations are not available for png images. Gallery will intentionally ignore pngs that are too large too decode, but most of the failed pngs in this case are just 480x854 screenshots. This is repeatable. I tried it twice and got exactly the same 52 failures in both cases. I'm attaching a logcat showing the failures. The 52 failures that this bug are about are the ones that include 'createThumbnailFromElement' in the message. There are other error messages in the log, and at least some of those are expected failures, because some of the images on my phone are from apps/gallery/test/images/ which includes a number of images designed to challenge Gallery's metadata parsing and thumbnail generation code. (Some of the other failures in the log may actually also be bugs, but I haven't investigated enough yet.) These createThumbnailFromElement() error messages are exactly the same symptom as seen in bug 1163102. This means that uplifting bug 1162282 will make those error messages go away. But that won't solve this bug because then instead of having missing thumbnails, we'll just have black squares instead of thumbnails and the user will not really be any better off. I suspect that the fix to this bug is going to be to allow the gallery app to use more image memory than it is currently allowed to. Somehow it is able to create these thumbnails in 2.1, so we know it is possible.
This is a regression. If the gallery app can't create thumbnails for images, the user will never be able to see them, so this is almost a data loss issue. I think we need to block on it.
blocking-b2g: --- → 2.2?
Flags: needinfo?(seth)
Keywords: regression
I'm working to find a simple way to reproduce this without requiring all 250 images from my sdcard
Here are simple STR that demonstrate this bug with jpeg images: STR: 0) Make sure your phone is using 319mb of memory 1) remove all images from your sdcard 2) flash 2.2 nightly 3) use the settings app to set the screen timeout to 10 minutes, then kill the settings app 4) Run this command to push 125 test images to your phone: $ cd gaia/test_media/reference-workload $ ./generateImages 125 5) Launch the gallery app and wait untill it is done scanning 6) Count the number of thumbnails you see. 7) Look at the logcat Expected result: there should be one thumbnail for each image, so 125 thumbnails and no serious error messages in the logcat. Actual result: there are fewer than 125 thumbanils (I got 109) and logcat errors like this for each missing thumbnail: W/Gallery (12312): Content JS WARN: MediaDB: error parsing metadata for /sdcard/DCIM/100MZLLA/IMG_0110.jpg : createThumbnailFromElement:
Editing the bug description to remove the png/gif stuff, since these STR demonstrate the bug with jpeg files only. (Those test jpegs are 1600x1200 with no EXIF preview) This bug does not occur for me when I try it with only 100 photos. I'm assuming we have some kind of memory leak here where gecko is not releasing the images memory for the offscreen thumbnails, perhaps? Note that if you want to try reproducing this multiple times, you can blow away the gallery database (forcing it to scan again) like this: $ adb shell rm -r /data/local/storage/default/*gallery* In the past, when memory was low, the gallery would sometimes crash with an OOM while trying to create thumbnails. When the user restarted it, it would resume creating the thumbnails. This bug is worse because it fails without crashing, so the thumbnails are never created. Killing and restarting the app does not make them appear. To the user, it seems like the photos are just gone.
Summary: FirefoxOS Gallery app can't create thumbnails for png and gif images → FirefoxOS Gallery app can't always create thumbnails when scanning the sdcard
I just tried this again with the latest 2.2 sources where bug 1163187 had been uplifted (removing the -moz-crisp-edges stuff) and it did not make a difference. My phone still only scanned 108 or 109 of the 125 images.
No-Jun: I think this is going to need some QA attention. I'm not setting regressionwindow-wanted until Seth weighs in because he may already have a good idea where the regression is.
Flags: needinfo?(npark)
Is this related to bug 1148696? That got uplifted to 2.2 a day ago; did this just start happening within the past day?
Flags: needinfo?(seth)
Flags: needinfo?(dflanagan)
I don't know if it is related to that bug. Note that it is not happening on master, and I think you've got image locking turned on there, too. Let's ask for a regression window
Flags: needinfo?(dflanagan)
Clearing the regressionwindow-wanted flag because I tried it myself. I flashed /pvt/mozilla.org/b2gotoro/nightly/mozilla-b2g37_v2_2-flame-kk/2015/05/2015-05-09-16-25-00 The bug does not occur when I follow the STR in comment 3 with that build. (It was a production build instaed of an engineering build, but I assume that the relevant difference is the image locking)
Flags: needinfo?(seth)
125 images is not enough to cause the flashing thumbnails bug. I tried again with 300 images. All the thumbnails were created successfully. But once I start scrolling I get flashing images galore. Could it be that there is something wrong with the image locking code and that images are not being released when they are offscreen? Note that during thumbnail generation the user isn't scrolling. Changes to the DOM can make images that were previously onscreen move offscreen, however. Could it be that Gecko is not handling that case effectively? Or maybe we just need more image memory? You've got new prefs, don't you, that specify how much of system memory can be used for images? Is that in 2.2? Can we tweak those numbers? Any chance that you can make it so that offscreen images (those that are never even inserted into the document) are exempt from that image memory quota? In general, Gallery depends on the ability to do image processing (thumbnail generation and image editing) using <canvas> and offscreen images. Obviously gecko should decode images that are onscreen in preference to images that are in the document but offscreen. But I think maybe it should give the highest priority to images that are not in the document at all. Those are the images where failure causes data loss rather than just visual issues.
(In reply to David Flanagan [:djf] from comment #10) > Could it be that there is something wrong with the image locking code and > that images are not being released when they are offscreen? That's definitely it, yeah. That much was clear from the about:memory reports you posted the other day. Somehow the Gallery app is tricking the image locking code into thinking that all the thumbnails are visible. I haven't had a chance to figure out why that is yet. > Or maybe we just need more image memory? You've got new prefs, don't you, > that specify how much of system memory can be used for images? Is that in > 2.2? Can we tweak those numbers? Any chance that you can make it so that > offscreen images (those that are never even inserted into the document) are > exempt from that image memory quota? I'd prefer to keep tweaking those numbers off the table until we've eliminated all the bugs that are increasing the amount of memory the Gallery app is using unnecessarily. I suspect we'll find that tweaking the numbers isn't necessary. > In general, Gallery depends on the ability to do image processing (thumbnail > generation and image editing) using <canvas> and offscreen images. > Obviously gecko should decode images that are onscreen in preference to > images that are in the document but offscreen. But I think maybe it should > give the highest priority to images that are not in the document at all. > Those are the images where failure causes data loss rather than just visual > issues. We can't do this yet, but it's coming. The support we need for this will get added in bug 1123976. I don't think that bug needs to block this one, but I'll link your comment 10 in that bug, and we'll take it into account when the code for bug 1123976 gets written.
Flags: needinfo?(seth)
blocking-b2g: 2.2? → 2.2+
adding verifyme flags so once this is resolved, qa can take a look at it.
Flags: needinfo?(npark)
Keywords: verifyme
Depends on: 1166134
Depends on: 1166136
Hi David, Hi Seth, Thank you for helping on this bug. The 2.2 Code Complete date is 6/8. Do we have chance to fix this one and other Gallery Thumbnail bugs before then?
Flags: needinfo?(dflanagan)
Flags: needinfo?(seth)
I will be looking for a gaia-based workaround (or at least a reduced test case) for this bug, so I'll go ahead and take it. I assume that Seth will be looking for a Gecko-based fix in bug 1166136. Note that this bug and 1163225 are basically the same bug, but the symptoms are different in 2.2 and 3.0. I had actually forgotten that I filed this one. But since it is now a blocker, I'll started working on it here. The first step is to try the things I listed in https://bugzilla.mozilla.org/show_bug.cgi?id=1163225#c10...
Assignee: nobody → dflanagan
Flags: needinfo?(dflanagan)
I've been doing some investigation in bug 1163225 about the underlying gecko issue. In 2.2 we still get an exception thrown by the drawImage() call, and we can use that to detect when something is going wrong. If we can be sure that the exception only happens when memory is low (and not for corrupt images) then one possible workaround would be to basically reboot the gallery app with a window.reload(). It looks like sorting the new files by date before creating thumbnails will also help prevent the memory issue.
(In reply to Josh Cheng [:josh] from comment #13) > Hi David, Hi Seth, > Thank you for helping on this bug. > The 2.2 Code Complete date is 6/8. Do we have chance to fix this one and > other Gallery Thumbnail bugs before then? I'm working on the Gecko side, and David is working on workarounds on the Gaia side. We'll do our best. My work is taking place in bug 1166134 and bug 1166136. Only bug 1166136 is probably relevant for 2.2.
Flags: needinfo?(seth)
Whiteboard: gfx-noted
Hi David, Hi Seth, Thank you very much and appreciate for your help!
I've got a WIP patch for mediadb that does a complete device storage scan, then sorts the files into newest-to-oldest order and processes them that way. This ensures that most new thumbnails are created offscreen, and with the patch, I can scan an sdcard with hundreds of test images on it without getting failures if I just let the device sit and scan. Unfortunately, if I wait for the scan to complete about halfway then start scrolling, then memory use spikes and I almost immediately start seeing errors creating the thumbnails. So my mediadb fix can go part way to solving the problem of large initial scans, but it will only be part of the solution. Hopefully when we have solved 1166136 that will solve the rest of this problem.
Is the resolution of this bug blocked on both "depends on" bug 1166134 and bug 1166136? If so, we should probably 2.2? those two so that they get tracked.
Comment on attachment 8613123 [details] [review] [gaia] davidflanagan:bug1164164-master > mozilla-b2g:master Jim, this patch modifies MediaDB so that when it scans for new files, it waits until it has all results, sorts them, and then does metadata parsing in chronological order from newest to oldest This gives a better startup experience with less jumping around. It does mean that instead of starting to process the first file in about 500ms, it doesn't start processing files for about 1500ms, but this never happens until the db is fully enumerated anyway, so it shouldn't really make much of a difference to the user experience. The reason that this patch is attached to this 2.2+ bug is that it also works around bug 1168985 which was causing the gallery app to run out of image memory and fail to create thumbnails for some images. The two versions of the patch are almost identical. There were no merge conflicts in mediadb.js, only in the other affected files.
Attachment #8613123 - Flags: review?(squibblyflabbetydoo)
The patch I've just attached is a partial solution that addresses the case of a large initial scan where the user doesn't scroll. We also need to get bug 1166136 fixed, I think. And to be safe, I should probably land a 2.2-only patch where if drawImage() throws an exception we don't mark the image as invalid, so that we can try scanning again the next time we start up.
I've just confirmed that even with this patch, if I scroll until I see missing thumbnails and then take a screenshot, the screenshot will not get a thumbnail. Hopefully a fix for 1166136 will make that bug go away. But if it does not, then I need to handle the exception thrown by drawImage() differently so the screenshot is not permanently marked as invalid.
Switching this back to FirefoxOS:Gallery (from Core:ImageLib) since I'm doing gaia fixes here, and we've sort of established that bug 1166136 is the gecko bug.
Component: ImageLib → Gaia::Gallery
Product: Core → Firefox OS
Version: Trunk → unspecified
Jim: Seth thinks that he has fixed the gecko issues, so this patch may no longer be needed for 2.2. I suspect it is probably a good thing to land on master though, since it should give a better scanning experience for all the media apps.
Flags: needinfo?(squibblyflabbetydoo)
Today's nightly includes Seth's patches for bugs 1169879, 1169880 and 1169881, and using that nightly I can no longer reproduce this bug. It is probably still worth landing the mediadb patch to master, though. Perhaps in a new bug.
Update, if I turn the DDD pref off on master (at Seth's suggestion to better simulate 2.2) and scroll aggressively up and down while the gallery is scanning images, I can get the black thumbnail issue to reproduce. But I've only seen this once, and it only affected a single thumbnail, so the issue is much, much better than it was. Apparently during scrolling all images are locked, so it is apparently possible to use up available image memory by scrolling quickly through a long list of images. I should consider a 2.2 patch that handles the exception on canvas.drawImage() so that the image is not marked as permanently invalid and can be scanned on the next startup.
Comment on attachment 8613123 [details] [review] [gaia] davidflanagan:bug1164164-master > mozilla-b2g:master This looks good to me, although I'm not sure how much it matters now...
Flags: needinfo?(squibblyflabbetydoo)
Attachment #8613123 - Flags: review?(squibblyflabbetydoo) → review+
Hi David, Do you like to raise 2.2 uplift approval request? Thanks
Flags: needinfo?(dflanagan)
Josh: see comments #26 and #27. Gecko changes seem to be sufficient (or mostly sufficient) to avoid this issue in 2.2, so I don't think the patch will help at all anymore and is not worth the risk to uplift. I do want to land it on master, however. The symptom on master is a black thumbnail, and that is bad enough that we can never let it happen. On 2.2, the symptom is that no thumbnail ever shows up. We currently do that for any image that is too big to decode, so in a sense, if this bug occurs, it will just be because we've encountered another image that is too big to decode. Josh: I think this no longer needs to be a 2.2+ bug. Setting qawanted to get help verifying that it is no longer an issue on 2.2. QA: see comment #3 for the STR. We need to verify that this no longer occurs in 2.2
Flags: needinfo?(dflanagan) → needinfo?(jocheng)
Keywords: qawanted
Even though this patch is no longer directly relevant to this bug, it is still a good patch, and I've landed it on master: https://github.com/mozilla-b2g/gaia/commit/e38e8fd1a758778058f4d4b6f83c17599fe16bf7 I am not requesting uplift because I think it is too risky for 2.2
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I've filed bug 1174814 as a followup so we don't forget about the underlying issue on master.
Ugh. I broke lint by landing above. I've landed a fix to master: https://github.com/mozilla-b2g/gaia/commit/62ba52866f4e5ca9120dad5bfe62fc5df981dc39
This issue is verified fixed on the latest 2.2 Flame. Following STR on comment 3, all thumbnails are now showing and there are no errors parsing metada in the logcat. Device: Flame (full flashed, 319MB, KK) BuildID: 20150615002503 Gaia: e7a0c6d5f4df04d45fb3f726efb9e8223600cb79 Gecko: f278c675d51f Gonk: bd9cb3af2a0354577a6903917bc826489050b40d Version: 37.0 (2.2) Firmware Version: v18D-1 User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0 ------- I'm not sure whether this was ever affecting 3.0. If it was, feel free to tag qawanted to verify on 3.0.
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: qawanted, verifyme
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
(In reply to David Flanagan [:djf] from comment #31) > Josh: see comments #26 and #27. Gecko changes seem to be sufficient (or > mostly sufficient) to avoid this issue in 2.2, so I don't think the patch > will help at all anymore and is not worth the risk to uplift. I do want to > land it on master, however. > > The symptom on master is a black thumbnail, and that is bad enough that we > can never let it happen. On 2.2, the symptom is that no thumbnail ever shows > up. We currently do that for any image that is too big to decode, so in a > sense, if this bug occurs, it will just be because we've encountered another > image that is too big to decode. > > Josh: I think this no longer needs to be a 2.2+ bug. Setting qawanted to get > help verifying that it is no longer an issue on 2.2. > > QA: see comment #3 for the STR. We need to verify that this no longer occurs > in 2.2 Hi David, Thanks for the info!
Flags: needinfo?(jocheng)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: