Closed Bug 1104622 Opened 5 years ago Closed 5 years ago

Remove DiscardTracker

Categories

(Core :: ImageLib, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: seth, Assigned: seth)

References

(Depends on 2 open bugs)

Details

Attachments

(2 files, 3 obsolete files)

With bug 1060869 landed, all discardable images (i.e., non-animated images) have their decoded surfaces stored in the SurfaceCache. Since the SurfaceCache manages its own memory and serves the same role that the DiscardTracker did for surfaces stored within it, the DiscardTracker is no longer needed.

In this bug I'll remove the DiscardTracker and simplify the discarding-related code in RasterImage. (But not remove all of it; there is still discarding happening, but it's only due to redecoding with different decode flags or an explicit request via imgIContainer::RequestDiscard.)
Attached patch Remove DiscardTracker (obsolete) — Splinter Review
Here we go. This could benefit from being broken up into smaller patches, but I'm too tired to do it tonight.
Attachment #8528273 - Flags: review?(tnikkel)
Attachment #8528273 - Flags: review?(tnikkel) → review+
Attached patch Remove DiscardTracker (obsolete) — Splinter Review
This updated version of the patch fixes the failures seen so far:

(1) The unused 'nsresult rv' was removed, which eliminated the build failure due to warnings-as-errors.
(2) On B2G, the SurfaceCache minimum expiration time is 24 hours, which makes 'test_bug399925.html' time out waiting for the DISCARD notification. Unfortunately, the pref that sets that is not live, and making it live is not feasible because nsExpirationTimer doesn't allow you to change the timeout dynamically. It would be nice to support that by fixing nsExpirationTimer, but in the short term I've modified the test so it manually requests a discard if it detects that the SurfaceCache minimum expiration time is really long. (So the previous behavior is used everywhere but B2G.)

Here's a new try run:

https://tbpl.mozilla.org/?tree=Try&rev=dd4cf8fb15c7
Attachment #8528273 - Attachment is obsolete: true
I'm guessing that failure on 'Linux debug' is from breaking non-unified builds. I've added an additional header to try to straighten that out. Since most of the platforms in that push worked, it seems like it's not worth another try push at this point.
Attachment #8529391 - Attachment is obsolete: true
Here's a second part that makes us not discard for redecoding at all. If this doesn't work out, I'm OK with it. Let's see how the try job turns out. (Local tests worked very well!)

Try job:

https://tbpl.mozilla.org/?tree=Try&rev=f246f2c7a13a
Attachment #8529444 - Flags: review?(tnikkel)
OK, I ended up having an hour of free time today, so I fixed this patch. This version resolves all the failures above locally. (Try is closed so I can't push another job now, but I'll of course verify before pushing this in.) I think this is good to go now though.

The issue was basically that SyncDecode will bail out early if it sees that |mDecoded = true|. So even though we don't discard in LookupFrame, we still need to clear mDecoded (and mHasFirstFrame).

Debugging this made me realize, in addition, that nobody should be calling ApplyDecodeFlags but LookupFrame anymore, so I removed the other redundant calls and inlined it. ApplyDecodeFlags had a nice optimization that checked if the image is opaque, and if so will let us use frames decoded with FLAG_DECODE_NO_PREMULTIPLY_ALPHA set differently. I realized that this needed to work differently in the SurfaceCache world - we need to do this check in LookupFrame. I moved it there as part of inlining ApplyDecodeFlags, and now we also avoid doing extra decoding in some cases where we were doing it before!

I'm actually pretty excited about this patch now.
Attachment #8529954 - Flags: review?(tnikkel)
Attachment #8529444 - Attachment is obsolete: true
Attachment #8529444 - Flags: review?(tnikkel)
Attachment #8529954 - Flags: review?(tnikkel) → review+
Thanks for the review! Try looks green. (It hasn't completely finished, but given how close the merge is, I don't have time to wait.) Pushed:

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/503c9052d399
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/471b25bef3ec
https://hg.mozilla.org/mozilla-central/rev/503c9052d399
https://hg.mozilla.org/mozilla-central/rev/471b25bef3ec
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
So due to the holiday, I can't really deal with this until next week, but given the emails I received last night I predict that somebody is going to come here and file a blocker bug about AWSY memory usage regressions from this patch.

Almost certainly these regressions are an artifact of how we're measuring. Currently the SurfaceCache prefs are set to release surfaces for unlocked images after 60 seconds on desktop platforms; the DiscardTracker, which was the previous mechanism we had for this job, was set to release surfaces for unlocked images after 10 seconds.

I strongly suspect that that pref difference is what caused these regressions. For evidence, look at AWSY, which has a substantial regression in "After TP5" and related measurements but no regression in "After TP5, tabs closed [+30s]" and related measurements. (In fact, maybe it's just noise, but I think those latter measurements may have *improved* a little. That'd make sense, since SurfaceCache is more aggressive about releasing memory than DiscardTracker was.) I think what makes the difference here is how long we're waiting to take the measurement.

Next week I'll run some tests to verify this, and we can decide where to go from there. In the meantime, please do not back this out. It'll be trivial to uplift whatever pref changes we may decide to make to Aurora.
(Making sure :njn sees this.)
(In reply to Seth Fowler [:seth] from comment #10)
> I predict that somebody is going to
> come here and file a blocker bug about AWSY memory usage regressions from
> this patch.

I was just about to do that :-D
Maybe someone can add an annotation that the regression is expected? And maybe update AWSY to increase the settling time?
Depends on: 1107052
Depends on: 1120607
Depends on: 1163367
You need to log in before you can comment on or make changes to this bug.