Closed
Bug 1083672
Opened 10 years ago
Closed 10 years ago
drawImage where the source and destination canvas are the same is 100x slower than copying through a temporary canvas
Categories
(Core :: Graphics: Canvas2D, defect)
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: mbx, Assigned: mstange)
References
Details
(Keywords: perf)
Attachments
(2 files)
2.70 KB,
text/html
|
Details | |
3.17 KB,
patch
|
jrmuizel
:
review+
|
Details | Diff | Splinter Review |
Using drawImage to copy a region of a canvas to another region within the same canvas is extremely slow. We're talking 100X slower than copying the region to a temporary canvas and then copying it back. I have only tested on OSX.
Comment 1•10 years ago
|
||
The A<->A copy is also slower in Chrome (though Chrome is faster than Firefox.)
The A<->B<->A copy is considerably faster in both browsers.
It looks like a typical memory/speed tradeoff.
Comment 2•10 years ago
|
||
Markus: can you profile this one for us? It does seem like we're still several times slower than Chrome. Thanks!
Flags: needinfo?(mstange)
Updated•10 years ago
|
Keywords: perf
Summary: drawImage where the source and destination are the same is painfully slow. → drawImage where the source and destination canvas are the same is 100x slower than copying through a temporary canvas
Assignee | ||
Comment 3•10 years ago
|
||
This happens because internally we copy out the whole canvas instead of just the subrect we need.
Assignee: nobody → mstange
Status: NEW → ASSIGNED
Flags: needinfo?(mstange)
Comment 4•10 years ago
|
||
dupe of bug 1001069?
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Jet Villegas (:jet) from comment #4)
> dupe of bug 1001069?
Similar issue but needs to be fixed separately.
Assignee | ||
Comment 6•10 years ago
|
||
Together with bug 1074842 this patch gets us most of the way there. The remaining overhead comes from allocating, zeroing-out and de-allocating the temporary DrawTarget. The manual CanvasA->CanvasB->CanvasA copy doesn't have that overhead because CanvasB stays alive across all drawImage the calls.
Attachment #8507949 -
Flags: review?(bas)
Comment 7•10 years ago
|
||
Comment on attachment 8507949 [details] [diff] [review]
v1
Stealing the review from Bas to Jeff.
Attachment #8507949 -
Flags: review?(bas) → review?(jmuizelaar)
Reporter | ||
Comment 8•10 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #6)
> Created attachment 8507949 [details] [diff] [review]
> v1
>
> Together with bug 1074842 this patch gets us most of the way there. The
> remaining overhead comes from allocating, zeroing-out and de-allocating the
> temporary DrawTarget. The manual CanvasA->CanvasB->CanvasA copy doesn't have
> that overhead because CanvasB stays alive across all drawImage the calls.
Can we create and keep a temporary DrawTarget around for each canvas that needs this? We can resize it as needed, or free it if we're under memory pressure.
Is the zeroing-out step necessary? We are reading back exactly from the same area we wrote to.
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Michael Bebenita [:mbx] from comment #8)
> (In reply to Markus Stange [:mstange] from comment #6)
> > Created attachment 8507949 [details] [diff] [review]
> > v1
> >
> > Together with bug 1074842 this patch gets us most of the way there. The
> > remaining overhead comes from allocating, zeroing-out and de-allocating the
> > temporary DrawTarget. The manual CanvasA->CanvasB->CanvasA copy doesn't have
> > that overhead because CanvasB stays alive across all drawImage the calls.
>
> Can we create and keep a temporary DrawTarget around for each canvas that
> needs this? We can resize it as needed, or free it if we're under memory
> pressure.
Yes, we could do that. With my patches there's still a difference of around 23% (~25500 Op/s vs ~33000 Op/s) which should be enough to justify the added complexity. Though it would be better if we could demonstrate the perf difference on a real world application instead of a microbenchmark. Do you have one you can point me to?
> Is the zeroing-out step necessary? We are reading back exactly from the same
> area we wrote to.
You're right, the zeroing-out step is not necessary. But at the moment Factory::CreateDrawTarget always creates an empty DrawTarget. We could add a fast path for creating uninitialized DrawTargets, but if we have to keep around the intermediate DrawTarget anyway we probably won't need it.
Reporter | ||
Comment 10•10 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #9)
> (In reply to Michael Bebenita [:mbx] from comment #8)
> > (In reply to Markus Stange [:mstange] from comment #6)
> > > Created attachment 8507949 [details] [diff] [review]
> > > v1
> > >
> > > Together with bug 1074842 this patch gets us most of the way there. The
> > > remaining overhead comes from allocating, zeroing-out and de-allocating the
> > > temporary DrawTarget. The manual CanvasA->CanvasB->CanvasA copy doesn't have
> > > that overhead because CanvasB stays alive across all drawImage the calls.
> >
> > Can we create and keep a temporary DrawTarget around for each canvas that
> > needs this? We can resize it as needed, or free it if we're under memory
> > pressure.
>
> Yes, we could do that. With my patches there's still a difference of around
> 23% (~25500 Op/s vs ~33000 Op/s) which should be enough to justify the added
> complexity. Though it would be better if we could demonstrate the perf
> difference on a real world application instead of a microbenchmark. Do you
> have one you can point me to?
The use case here is JS developers needing to allocate "DrawTargets" to hold temporary results, usually for blending or masking operations. They can either create lots of Canvas elements, or reuse a pool of large Canvas elements and manage their own allocation of subregions within those. This then leads to the problem where you read and write to the same canvas. They can work around it by using a temporary canvas like I do in this example, but I doubt most will be aware of the problem.
> > Is the zeroing-out step necessary? We are reading back exactly from the same
> > area we wrote to.
>
> You're right, the zeroing-out step is not necessary. But at the moment
> Factory::CreateDrawTarget always creates an empty DrawTarget. We could add a
> fast path for creating uninitialized DrawTargets, but if we have to keep
> around the intermediate DrawTarget anyway we probably won't need it.
Comment 11•10 years ago
|
||
Comment on attachment 8507949 [details] [diff] [review]
v1
Review of attachment 8507949 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/canvas/CanvasRenderingContext2D.cpp
@@ +3858,5 @@
> +// Returns a surface that contains only the part needed to draw aSourceRect.
> +// On entry, aSourceRect is relative to aSurface, and on return aSourceRect is
> +// relative to the returned surface.
> +static TemporaryRef<SourceSurface>
> +ExtractSubrect(SourceSurface* aSurface, mgfx::Rect* aSourceRect, DrawTarget* aTargetDT)
I don't really like the fact that the rounding happens here and source rect is modified, but I have no suggestions for better.
Attachment #8507949 -
Flags: review?(jmuizelaar) → review+
Assignee | ||
Comment 12•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/853612ddc686
I'll open a new bug for the other optimization we talked about.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Michael Bebenita [:mbx] from comment #8)
> (In reply to Markus Stange [:mstange] from comment #6)
> > Created attachment 8507949 [details] [diff] [review]
> > v1
> >
> > Together with bug 1074842 this patch gets us most of the way there. The
> > remaining overhead comes from allocating, zeroing-out and de-allocating the
> > temporary DrawTarget. The manual CanvasA->CanvasB->CanvasA copy doesn't have
> > that overhead because CanvasB stays alive across all drawImage the calls.
>
> Can we create and keep a temporary DrawTarget around for each canvas that
> needs this? We can resize it as needed, or free it if we're under memory
> pressure.
I put up a patch for this in bug 1090323.
You need to log in
before you can comment on or make changes to this bug.
Description
•