Compute the size of animated image frames correctly in the SurfaceCache

RESOLVED FIXED in Firefox 38

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: seth, Assigned: seth)

Tracking

unspecified
mozilla40
Points:
---

Firefox Tracking Flags

(firefox38+ verified, firefox38.0.5+ verified, firefox39+ fixed, firefox40 fixed, firefox-esr3838+ verified, relnote-firefox 38+)

Details

(Whiteboard: [MemShrink])

Attachments

(1 attachment)

(Assignee)

Description

4 years ago
Wowzers. We have been computing the size of animated image frames *completely wrong* all this time, causing us to think we were on the verge of OOM'ing when we were really nowhere close. I noticed this thanks to the about:memory changes in bug 1157954. (Though I think we could have noticed it before, it's just that after that patch I started using about:memory differently.)

Here's what went wrong:

- ComputeCost() in SurfaceCache.cpp assumes that surfaces are RBGA/RBGX format and have 4 bytes per pixel. However, animated image frames are often paletted and thus have only 1 byte per pixel. That's a factor of 4, right there.

- Even worse, ComputeCost() used the width and height from the SurfaceKey to do its computation, but the SurfaceKey uses the size of the overall image, _not_ the size of the individual frame. For single-frame images these are the same, but animated images often only have image data for a small subregion of the image on each frame. This caused a tremendous overestimate of memory usage, because if an animated image was 1000x1000, we'd treat every frame as 1000x1000, even if some frames only had a 1x1 surface.

In one test case, we treated an animated image as if it occupied over 1.5GB when decoded. After these changes, we only viewed it as occupying _50MB_.
(Assignee)

Comment 1

4 years ago
Here's the patch. Basically we just need to always use the imgFrame's actual
size (imgFrame::GetSize(), which is determined by the size of aFrameRect in
Decoder::InternalAddFrame) and take bytes-per-pixel into account when calling
SurfaceCache::ComputeCost.

External to SurfaceCache, this change only affects callers of
SurfaceCache::CanHold(), and since nobody but Decoder ever allocates frames with
sizes different than their associated images, or non-RGBA/RGBX frames, we only
need to update Decoder.
Attachment #8601847 - Flags: review?(dholbert)
(Assignee)

Comment 3

4 years ago
Comment on attachment 8601847 [details] [diff] [review]
Compute the size of animated image frames correctly in the SurfaceCache

Approval Request Comment
[Feature/regressing bug #]: Bug 1116719.
[User impact if declined]: Some animated GIFs won't play back. They can even cause the browser to think it's near OOM, causing other images to fail to decode and making users restart their browser to recover. This is pretty serious; I'd really like to get this in 38 if possible.
[Describe test coverage new/current, TreeHerder]: Test coming up.
[Risks and why]: Low risk. The patch is pretty straightforward and only affects our estimates of the memory usage of image surfaces; it does not directly affect image rendering or decoding at all.
[String/UUID change made/needed]: None.
Attachment #8601847 - Flags: approval-mozilla-beta?
Attachment #8601847 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 4

4 years ago
(In reply to Seth Fowler [:seth] from comment #3)
> [Describe test coverage new/current, TreeHerder]: Test coming up.

Hmm, it's actually impossible to write a test for this right now, because we need to check that we never manage to decode frame N, and there's no way to check that condition except for an inevitably-intermittent timeout.

After bug 11553322 lands, we will be able to reliably detect OOM because it will cause the image's |onerror| notification to fire. So it's better to wait. I'll file a followup to add a test.
(Assignee)

Comment 5

4 years ago
(In reply to Seth Fowler [:seth] from comment #4)
> After bug 11553322 lands, we will be able to reliably detect OOM because it
> will cause the image's |onerror| notification to fire. So it's better to
> wait. I'll file a followup to add a test.

Ack, no, that's not spec compliant. Drat.

I think we may need to do some work to support testing this. I'll file a followup about it.
(Assignee)

Updated

4 years ago
See Also: → 1161878
(Assignee)

Comment 6

4 years ago
I filed bug 1161878 about the work necessary to test this.
Comment on attachment 8601847 [details] [diff] [review]
Compute the size of animated image frames correctly in the SurfaceCache

r=me
Attachment #8601847 - Flags: review?(dholbert) → review+
(Assignee)

Comment 8

4 years ago
Thanks for the quick review! That try job doesn't seem to contain any failures introduced by this patch, so I'm going to go ahead and push.
This is unfortunately probably too late for 38 but we could take it for 38.0.5 and 38.1.0esr.
https://hg.mozilla.org/mozilla-central/rev/bbe25ecbebda
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
(Assignee)

Comment 12

4 years ago
(In reply to Sylvestre Ledru [:sylvestre] from comment #10)
> This is unfortunately probably too late for 38 but we could take it for
> 38.0.5 and 38.1.0esr.

Yeah, 38.1esr sounds good. Just hoping to get it in an ESR release if possible.
(Assignee)

Comment 13

4 years ago
Comment on attachment 8601847 [details] [diff] [review]
Compute the size of animated image frames correctly in the SurfaceCache

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is a fix for a serious regression in our ability to play back longer animated images. Without this fix, not only will the images freeze midway through playback, but the browser will also be stuck in a low-memory state, forcing users to restart the browser. I'd really prefer to have this fix in 38.1 if possible.
User impact if declined:  ^ See above.
Fix Landed on Version: 40. Uplifting everywhere else ASAP.
Risk to taking this patch (and alternatives if risky): This is pretty low risk, as it just corrects our algorithm for estimating the size of an image surface. It doesn't affect higher-risk pathways like decoding or drawing.
String or UUID changes made by this patch: None.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8601847 - Flags: approval-mozilla-esr38?
Whiteboard: [MemShrink]
Seth I see your comments on the other bug and writing tests, but is there a way we could manually test this?
Flags: needinfo?(seth)
Comment on attachment 8601847 [details] [diff] [review]
Compute the size of animated image frames correctly in the SurfaceCache

Approved for uplift to aurora. Let's see if this helps with OOM and animations.
Attachment #8601847 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 16

4 years ago
(In reply to Liz Henry (:lizzard) from comment #14)
> Seth I see your comments on the other bug and writing tests, but is there a
> way we could manually test this?

Yeah, a manual test is easy. Load the GIF attached to bug 1150089, and make sure that it (1) plays back at all, and (2) loops, rather than freezing part way through.
Flags: needinfo?(seth)
Flags: qe-verify+
Comment on attachment 8601847 [details] [diff] [review]
Compute the size of animated image frames correctly in the SurfaceCache

After reviewing with Seth we're going to take this fix in 38.0.1 for both mainline and ESR. This has noticeable user impact (Seth will provide more details with bug refs) and does not have much risk of regression. The patch has been on m-c for a week and m-a for 5 days at this point as well.

Note that this patch needs to land on:
38.0 relbranch
38.0ESR relbranch
m-r
m-esr38
m-b
Attachment #8601847 - Flags: approval-mozilla-release+
Attachment #8601847 - Flags: approval-mozilla-esr38?
Attachment #8601847 - Flags: approval-mozilla-esr38+
Attachment #8601847 - Flags: approval-mozilla-beta?
Attachment #8601847 - Flags: approval-mozilla-beta+
Release Note Request (optional, but appreciated)
[Why is this notable]: Issues with image loading
[Suggested wording]: Fixed: Images may fail to load
[Links (documentation, blog post, etc)]:
(Assignee)

Comment 20

4 years ago
> <lmandel> seth: OK. Let's take this. What I need from you is:
> <lmandel> 1.  some dependent bugs that show that users are reporting this

See bug 1150089 for this. I think they're mostly duped against bug 1150089.

> <lmandel> 2. steps to reproduce so that SV can test overnight tonight

See comment 16.

> <lmandel> 3. uplift request for 38 and ESR38

Done!
(Assignee)

Comment 21

4 years ago
It's probably worth noting that this affects all platforms based on Gecko 38, both desktop and mobile.

Here's a revised release note request:

Release Note Request (optional, but appreciated)
[Why is this notable]: Firefox 37 introduced a bug where large animated images may fail to decode completely. Even worse, once this has happened other images will also start failing to decode, forcing the user to restart their browser to recover. This is fixed in Firefox 38.
[Suggested wording]: Fixed: Large animated images may fail to play and may stop other images from loading
[Links (documentation, blog post, etc)]:
[Tracking Requested - why for this release]: Needs to change to 38+ AFAICT.

This was uplifted to Aurora prior to Monday's uplift, so Fx39 already has the fix.

Fx 38.0.5 (default):
https://hg.mozilla.org/releases/mozilla-release/rev/aa3a683fd335

Fx 38.0.1 (GECKO380_2015050320_RELBRANCH)
https://hg.mozilla.org/releases/mozilla-release/rev/570b63d791b9

ESR 38.1 (default):
https://hg.mozilla.org/releases/mozilla-esr38/rev/af674949ec09

ESR 38.0.1 (GECKO380esr_2015050513_RELBRANCH):
https://hg.mozilla.org/releases/mozilla-esr38/rev/117ad2bfcf6c
This push to mozilla-release (default) broke lots of builds.
Flags: needinfo?(ryanvm)
The error includes "no known conversion for argument 1 from 'nsIntSize' to 'const IntSize&".

(I think this is fallout from the recent nsIntSize --> IntSize mass-renaming that happened on trunk.)

I think we might just need s/aFrameRect.Size()/aFrameRect.Size().ToIntSize()/, here (in the mozilla-release cset):
  https://hg.mozilla.org/releases/mozilla-release/rev/aa3a683fd335#l1.12
(Assignee)

Comment 25

4 years ago
(In reply to Daniel Holbert [:dholbert] from comment #24)
> The error includes "no known conversion for argument 1 from 'nsIntSize' to
> 'const IntSize&".
> 
> (I think this is fallout from the recent nsIntSize --> IntSize mass-renaming
> that happened on trunk.)
> 
> I think we might just need
> s/aFrameRect.Size()/aFrameRect.Size().ToIntSize()/, here (in the
> mozilla-release cset):
>   https://hg.mozilla.org/releases/mozilla-release/rev/aa3a683fd335#l1.12

Yep, that looks right. I'm cloning mozilla-release right now to verify that that's the only problem. Will upload a backported version shortly.
Ah dammit, I was worried about that hunk :(. Thanks for looking into it, Seth.
Flags: needinfo?(ryanvm)
I can push that to esr38 as well once m-r gets it.
Thanks, it'll need to land on default and the relbranches noted in comment 22.
Looks perfect, thanks!
Reproduced the initial issue on Firefox 38.0 ESR, verified that the issues if fixed on Windows 7 64-bit, Ubuntu 14.04 32-bit, Mac OS X 10.9.5, Windows 8.1 32-bit using Firefox 38.0.1 ESR and Firefox 38.0.1 RC.
Using the steps from comment #16 I'm able to reproduce on Fennec 38.0
Verifying as fixed on Fennec 38.0.1
Verifying as fixed on Fennec 38.0.5b4
You need to log in before you can comment on or make changes to this bug.