Closed Bug 1171356 Opened 4 years ago Closed 4 years ago
On B2G, retry image decodes that fail because allocation of the first frame failed
Right now, we allocate the first frame of an image on the main thread, before sending the decoder off to the DecodePool where it'll sit in a queue until we're ready to run it. The first frame has to be locked during this time. (There's no point in not locking it; we'll just end up OOM'ing.) This causes a problem when you scroll very quickly in the Gallery app, because we can rapidly request decoding for a lot of images, and those decode jobs sit in the queue, keeping their first frames locked until they actually get a chance to run. We can easily run out of SurfaceCache capacity, which causes us to start failing to allocate the first frame of new images, resulting in decode failures and gray rectangles where images should be. The *correct* fix for this is bug 1117607 - just don't allocate the first frame until the decoder actually runs. However, there's no way we can uplift that to B2G 2.2. So, in this bug, let's implement a B2G-only hack for now. If allocating the first frame fails, we'll just send off a runnable to attempt decoding again later. In practice this works very well, and seems to solve the remaining issues we have with decoding failures while scrolling in the Gallery app.
Summary: On B2G, retry images decodes that fail because allocation of the first frame failed → On B2G, retry image decodes that fail because allocation of the first frame failed
Here's the patch. I tried to keep it extremely simple here (given that this needs uplift to 2.2 and code freeze is Friday), so I didn't try to do anything clever with timers or merging runnables or anything like that. Scrolling pretty abusively on the Gallery way, I just wasn't able to cause enough retries to make any of that seem worth it, in any case. One thing that's particularly encouraging is that virtually every time we retry decoding an image, we only need to retry for that image once. I saw a very small number of cases where we retried twice, and not a single instance of trying more than twice, even with very abusive scrolling. I do want to be sure that we don't keep spawning these runnables eternally in the case where the image is just too large for us to allocate, though, so I have added RasterImage::mRetryCount, which tracks the number of times we retry decoding for an image. If we retry 10 times in a row and fail, we just give up.
Attachment #8615177 - Flags: review?(tnikkel)
Attachment #8615177 - Flags: review?(tnikkel) → review+
Comment on attachment 8615177 [details] [diff] [review] On B2G, retry image decodes that fail because allocation of the first frame failed Please approve this one!!! It's really, really important. It makes a huge difference. [Approval Request Comment] Bug caused by (feature/regressing bug #): Not sure, but it was exposed by the fact that we enabled image locking on B2G in bug 1148696. User impact if declined: Images may fail to decode due to low memory, causing blank thumbnails in the gallery which may persist across reboots (making this effectively a data loss issue). This patch is one of several that together fix bug 1166136, which is nominated for blocking 2.2, and bug 1164164, which is a 2.2 blocker. Testing completed: Extensively tested locally. On mozilla-inbound and anticipated to land on mozilla-central as of now. Risk to taking this patch (and alternatives if risky): This is a low risk patch that only affects a small section of code. String or UUID changes made by this patch: None.
Attachment #8615177 - Flags: approval-mozilla-b2g37?
Attachment #8615177 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Bustage follow-up. https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/69a622553765
You need to log in before you can comment on or make changes to this bug.