Closed Bug 1465619 Opened Last year Closed 11 months ago

Improve performance of BlendAnimationFilter

Categories

(Core :: ImageLib, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: aosmond, Assigned: aosmond)

References

(Blocks 1 open bug)

Details

(Whiteboard: [gfx-noted])

Attachments

(14 files, 22 obsolete files)

46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
46 bytes, text/x-phabricator-request
Details | Review
There are a few things we can do to improve the performance of BlendAnimationFilter in bug 1337111.

1) If we produced full frames, finished decoding, and did not trigger discarding, we should consider optimizing all the surfaces. This will require some rejiggering to access the underlying imgFrame instead of the RawAccessFrameRef via DrawableSurface. Then we can get the RawAccessFrameRef if it turns out we need to do blending too. Bug 1465496 suggests this might be a good win on Windows.

2) When we are discarding frames, we should recycle the frame buffers instead of discarding them. This has a two-fold benefit -- we avoid the alloc/free and memory churning, and we can attempt to reuse the already decoded contents of the buffer. By accumulating the dirty rects between the recycled frame index and the frame we are decoding, we can only copied the bits that changed to get to the desired blending state.

We need to be careful that something else isn't still using the surfaces after we decide to recycle. This could happen if we fall behind the current transaction in the refresh driver (i.e. when we are advancing the animation, we don't know if the paint thread is still using the surface), or if something pulls the surface out from imagelib directly (e.g. canvas using GetFrame(AtSize)). If we are falling behind, then during animation advancing, we should discard the old current frame instead of recycling it; we will incur the cost for an extra alloc/free, but it will resume recycling frames once the transactions catch up. If canvas or blob images (WebRender) use the image, we should disable recycling entirely since we cannot easily synchronize with them.
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Depends on: 1337111
Priority: -- → P3
Whiteboard: [gfx-noted]
Another area to explore is how we memset our initial buffers. We currently must memset our buffers because a caller might access pixels we haven't written to yet. However for animated images, we never advance unless the frame is finished, thus granting us more freedom in this regard. We could choose to skip the memset and only do so if the last frame is incomplete. Regardless this is something we have to keep in mind because recycling doesn't memset the buffer (we want to reuse the contents!), and currently an incomplete last frame will assume unwritten pixels are memset to 0x00 or 0xFF.
I decided to break up the bug, so that the frame optimization will be done in a separate follow up bug. This does lay the foundation for it however, since we need to be store/access frames which are not RawAccessFrameRef (a requirement to optimize!).

Additionally my original approach for handling the necessary synchronization between a recycled frame and the decoder did not work very well. We would often discard a recycled frame in non-trivial web pages. As a result, I no longer depend on the transaction ID and instead have a wrapping surface that keeps a lock on the underlying surface, preventing it from being recycled as long as it is alive. WebRender will need to do some special magic to keep the shared surface alive since the consumer lives out of process, but I think we can handle that as part of the async pipeline work. Thus for now, this will only work with on and off main thread painting in the content process.
Also, there no longer needs to be an exception for canvas since the code makes fewer assumptions. I'll count that as a win :).
Comment on attachment 8983130 [details] [diff] [review]
0007-Bug-1465619-Part-7.-Add-support-for-recycling-to-ima.patch, v1

Review of attachment 8983130 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/Decoder.cpp
@@ +368,3 @@
>  
> +      if (!blocked) {
> +        nsresult rv = ref->InitForDecoderRecycle(aAnimParams.ref());

Doing my pre-review, I realized there is a bug here. I can't overwrite the restore frame parameters, because I need them to do blending properly. This is really a corner case optimization I thought of while fixing the bug of overwriting the restore frame. I'll take it out because I don't think it is worth it unless it is easy to do :).
Blocks: 1467161
Blocks: 1458387
When blending full frames off the main thread, FrameAnimator no longer
requires access to the raw data of the frame to advance the animation.
Now we only request a RawAccessFrameRef for the current/next frames when
we have discovered that we need to do blending on the main thread.

In addition to avoiding the mutex overhead of RawAccessFrameRef, this
will also facilitate potentially optimizing the surfaces for the
DrawTarget for individual animated image frames.

Depends on D7505
Given an invalidation rect, called the recycle rect, which represents
the area which may have changed between the current frame and the frame
we are recycling, we can not only reuse the buffer itself to avoid an
allocation and free, we can also avoid copying pixel data from the
restore frame which is already set.

Depends on D7506
The clear rect and the recycle rect can overlap, and depending on the
size of the clear rect, it could be a significant amount of data that
needs to be copied from the restore frame. This patch minimizes the
copying for a row which contains both the recycle rect and the clear
rect.

Depends on D7507
If we discard a frame during decoding, e.g. due to an error, then we
don't want to take that frame into account for the first frame refresh
area. We should also be handling partial frames here (the dirty rect
needs to encompass the rows that did not get written with actual pixel
data). The only place that can have the necessary information is at the
end rather than at the beginning.

Depends on D7508
Since imgFrame::Draw will limit the drawing to only look at pixels that
we have written to and posted an invalidation for, there is no need to
hold the monitor while doing so. By taking the most expensive operation
outside the lock, we will minimize our chances of hitting contention
with the decoder thread.

A later part in this series will require that a surface be freed outside
the lock because it may end up reacquiring it. In addition to the
contention win, this change should facilitate that.

Depends on D7509
Beyond the necessary reinitialization methods, we need to protect
ourselves from recycling a frame that some other entity in the browser
is still using. Generally speaking the animated surface will only be
used in imgFrame::Draw since we don't layerize animated images, which
will be safe. However with OMTP or blob recordings, we could retain a
reference to the surface outside the current stack context. Additional
if something calls RasterImage::GetImageContainer(AtSize) or
RasterImage::GetFrame(AtSize), it may also have a reference to the
surface for an indetermine period of time.

As such, if an imgFrame is a candidate for recycling, it will wrap
imgFrame::mLockedSurface in a RecyclingSourceSurface. Its job is to
track how many consumers there are still of the surface, so that after
we advance the animation, the decoder will know if there are still
outstanding consumers.

If the surface is still in use, it will block for a finite period of
time (the refresh interval) before giving up on reclaiming the surface,
and will allocate a new surface. The old surface can then remain in
circulation for as long as necessary without further blocking the
animation progression, since we stop recycling that surface/imgFrame.

Depends on D7510
The owner for the decoder may implement IDecoderFrameRecycler to allow
the decoder to request a recycled frame instead of allocating a new one.
If none are available, it will fallback to allocating a new frame.

Not only may the IDecoderFrameRecycler not have any frames available for
recycling, the recycled frame itself may still be in use by other
entities outside of imagelib. Additionally it may still be required by
BlendAnimationFilter to restore the previous frame's data. It may even
be the same frame as to restore. In the worst case, we will simply
choose to allocate an entirely new frame, just like before.

When we allocate a new frame, that means the old frame we tried to
recycle will be taken out of circulation and not reused again,
regardless of why it failed.

Depends on D7511
In the next few patches, AnimationFrameBuffer will be reworked to split
the discarding behaviour from the retaining behaviour. Once implemented
as separate classes, it will allow easier reuse of the discarding code
for recycling.

Depends on D7512
This patch makes AnimationSurfaceProvider use the new abstractions
provided by AnimationFrameBuffer and AnimationFrameRetainedQueue to
provide storage and lifetime management of decoders and the produced
frames. We initially start out with an implementation that will just
keep every frame forever, like our historical behaviour. The next patch
will add support for discarding.

Depends on D7513
AnimatedFrameDiscardingQueue subclasses AnimationFrameBuffer to allow a
cleaner abstraction over the behaviour change when we cross the
threshold of too high a memory footprint for an animated image. The next
patch will build on top of this to provide an abstraction to reuse the
discarded frames.

Depends on D7514
This is what we have been working towards in all of the previous parts
in the series. This subclasses AnimationFrameDiscardingQueue to save the
discarded frames for recycling by the decoder, if the frame is marked as
supporting recycling.

Depends on D7515
Attached file dummy.log (obsolete) —
Attachment #8983796 - Attachment is obsolete: true
Attachment #8983797 - Attachment is obsolete: true
Attachment #8983798 - Attachment is obsolete: true
Attachment #8983800 - Attachment is obsolete: true
Attachment #8983801 - Attachment is obsolete: true
Attachment #8983802 - Attachment is obsolete: true
Attachment #8983803 - Attachment is obsolete: true
Attachment #8983804 - Attachment is obsolete: true
Attachment #8983805 - Attachment is obsolete: true
Attachment #8983806 - Attachment is obsolete: true
Attachment #8983807 - Attachment is obsolete: true
Attachment #8983808 - Attachment is obsolete: true
Attachment #8983810 - Attachment is obsolete: true
Attachment #8983811 - Attachment is obsolete: true
Comment on attachment 9013775 [details]
dummy.log

>dummy
Attachment #9013775 - Attachment is obsolete: true
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b2a31a31fba6
Part 1. Use imgFrame directly instead of RawAccessFrameRef in FrameAnimator. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/847e82346693
Part 2. Add basic support for recycling a frame buffer to BlendAnimationFilter. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc76b59e51b4
Part 3. Improve the clear rect calculations to take into account recycling in BlendAnimationFilter. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/edeea0a49a33
Part 4. Move the first frame refresh area calculation to frame commit. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/ddaa812c31e4
Part 5. Move actual drawing in imgFrame::Draw outside the monitor. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/7a2954981481
Part 6. Add support for recycling to imgFrame. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ef29bb72930
Part 7. Add support for recycling to image::Decoder. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef6369b9a1f9
Part 8. Remove the old AnimationFrameBuffer implementation. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd9168e1e48c
Part 9. Use redesigned AnimationFrameBuffer interface and retaining buffer. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/453d21db2a30
Part 10. Re-add support for discarding animated image frames. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c8e6aaa0176
Part 11. Add support for recycling animated image frames. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/0bacfc8c286f
Part 12. Add gtests for AnimationFrameRetainedBuffer. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/6bf7e20ce8fc
Part 13. Add gtests for AnimationFrameDiscardingQueue. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ce8f6d1a64e
Part 14. Add gtests for AnimationFrameRecyclingQueue. r=tnikkel
Depends on: 1501120
Depends on: 1501923
Depends on: 1502275
See Also: → 1510601
You need to log in before you can comment on or make changes to this bug.