Closed Bug 600410 Opened 15 years ago Closed 8 years ago

Slow canvas drawImage (SpeedReading)

Categories

(Core :: Graphics, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
blocking2.0 --- -

People

(Reporter: vlad, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: ietestdrive)

Attachments

(3 files, 1 obsolete file)

The speedreading benchmark is essentially a benchmark of drawImage speed. The attached testcase is basically that particular demo, but in quantifiable form.
Attached image source image
The source image (same one from speed reading)
Attached file benchmark (obsolete)
The benchmark. Does a bunch of drawImage calls with different scales to non-aligned destinations.
On my Intel GPU laptop, we get around 2500ms regardless of scale, even for scale 1 with things forced to be pixel-aligned (which the attached testcase doesn't do).
IE9 gets around 440ms; however, once it gets to scale 2.856 that jumps to 750ms, and then it increases with the scale after that. That's interesting, as ours doesn't change even at higher scales (going up to 5+ with my exponential scale).
Chrome's values are directly dependent on the scale at all times, and range from 316ms (1:1) to 7840ms (2.85). Chrome with GL canvas accel, which should eat up drawImage, hovers around 2000ms for all scales. Opera behaves similarly to unaccelerated Chrome, going from 631ms to 13913ms.
So, how is IE9 doing this 5-6x faster with D2D? With this particular image, there are 51,000 draws happening per scale. So we average 49µs per draw, IE9 around 8.6µs.
I'm going to look into some of the other work that happens during DrawImage, like the security checks, and see what the impact is...
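For reference, the per-draw figures quoted above follow directly from the totals and the 51,000 draws per scale. A minimal sketch of that arithmetic (the function and variable names are illustrative, not taken from the attached benchmark):

```python
# Per-draw cost derived from the per-scale totals quoted in this comment.
DRAWS_PER_SCALE = 51_000  # draws per scale, as reported for this image


def per_draw_us(total_ms: float, draws: int = DRAWS_PER_SCALE) -> float:
    """Average time per drawImage call, in microseconds."""
    return total_ms * 1000.0 / draws


firefox_us = per_draw_us(2500)  # Firefox nightly: ~2500 ms per scale
ie9_us = per_draw_us(440)       # IE9: ~440 ms per scale

print(f"Firefox: {firefox_us:.1f} us/draw")
print(f"IE9:     {ie9_us:.1f} us/draw")
print(f"Ratio:   {firefox_us / ie9_us:.1f}x")
```

The ratio lands between 5x and 6x, consistent with the "5-6x faster" estimate.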
If we do everything in DrawImage except for the actual call to Fill(), we get around 200ms consistently; so the overhead here isn't in the rest of the work DrawImage does, though we can probably get that 200ms lower.
blocking2.0: --- → ?
More data: the 2500ms number was with a 20100928 nightly. With a local build that has my fix for bug 599698, our numbers start to vary a lot:
Scale: 0.269 -- 1473 ms (51000 draws)
Scale: 0.35 -- 2491 ms (51000 draws)
Scale: 0.455 -- 1849 ms (51000 draws)
Scale: 0.592 -- 1985 ms (51000 draws)
Scale: 0.769 -- 1769 ms (51000 draws)
Scale: 1 -- 436 ms (51000 draws) (pixel-aligned)
Scale: 1.3 -- 1966 ms (51000 draws)
Scale: 1.69 -- 1998 ms (51000 draws)
Scale: 2.197 -- 2974 ms (51000 draws)
Scale: 2.856 -- 2473 ms (51000 draws)
Scale: 3.713 -- 4409 ms (51000 draws)
Scale: 4.827 -- 3621 ms (51000 draws)
So going to 436ms for the pixel-aligned case is great, and the other drops are good, but I don't understand at all why we get -increases-. For comparison, here's what the run from the nightly looks like:
Scale: 0.269 -- 2667 ms (51000 draws)
Scale: 0.35 -- 2653 ms (51000 draws)
Scale: 0.455 -- 2656 ms (51000 draws)
Scale: 0.592 -- 2658 ms (51000 draws)
Scale: 0.769 -- 2666 ms (51000 draws)
Scale: 1 -- 2659 ms (51000 draws) (pixel-aligned)
Scale: 1.3 -- 2633 ms (51000 draws)
Scale: 1.69 -- 2642 ms (51000 draws)
Scale: 2.197 -- 2636 ms (51000 draws)
Scale: 2.856 -- 2649 ms (51000 draws)
Scale: 3.713 -- 2653 ms (51000 draws)
Scale: 4.827 -- 2653 ms (51000 draws)
Attached file updated benchmark
Updated benchmark, goes to higher scales and also tests 1.0x scale with both aligned and unaligned destinations.
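The scale values reported throughout this bug (0.269, 0.35, ..., 4.827) match successive powers of 1.3. The sketch below reproduces that series and shows what an aligned versus unaligned destination rectangle could look like. The base, the exponent range, and the `dest_rect` helper are assumptions inferred from the reported numbers, not read from the attached benchmark.

```python
# Assumption: base 1.3 and exponents -5..6 are inferred from the scale
# values reported in this bug, not taken from the attachment itself.
BASE = 1.3


def scales(lo: int = -5, hi: int = 6) -> list[float]:
    """Exponential scale series, rounded to 3 decimals as in the reports."""
    return [round(BASE ** n, 3) for n in range(lo, hi + 1)]


def dest_rect(x, y, src_w, src_h, scale, align=False):
    """Hypothetical helper: destination rectangle for one scaled draw.

    With align=True the rectangle is snapped to whole pixels, which is
    the "pixel-aligned dest rect" case; otherwise it is left fractional.
    """
    w, h = src_w * scale, src_h * scale
    if align:
        return (round(x), round(y), round(w), round(h))
    return (x, y, w, h)


print(scales())
print(dest_rect(10.4, 20.6, 32, 32, 1.0))              # unaligned
print(dest_rect(10.4, 20.6, 32, 32, 1.0, align=True))  # aligned
```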
Attachment #479241 - Attachment is obsolete: true
So, after some looking into this: it is much better with D3D10 layers -and- Vlad's fix to bug 599698. We get within 50% of IE9; however, at FishIE we're actually faster than IE9 with both those fixes! So there's some odd performance discrepancy there.
Pixel alignment differences, perhaps?
Let's re-evaluate once D3D10 and bug 599698 land.
blocking2.0: ? → ---
How are you going to reevaluate if you just removed the blocking nom (and we're more or less ignoring non-blocker-and-non-nominated bugs)?
OK.
blocking2.0: --- → betaN+
With d3d10 patches -- I modified the test to run each scale twice, on a hunch:
Scale: 0.269 -- 1619 ms (51000 draws)
Scale: 0.269 -- 1708 ms (51000 draws)
Scale: 0.35 -- 2497 ms (51000 draws)
Scale: 0.35 -- 2498 ms (51000 draws)
Scale: 0.455 -- 1803 ms (51000 draws)
Scale: 0.455 -- 1800 ms (51000 draws)
Scale: 0.592 -- 2033 ms (51000 draws)
Scale: 0.592 -- 2001 ms (51000 draws)
Scale: 0.769 -- 1787 ms (51000 draws)
Scale: 0.769 -- 1775 ms (51000 draws)
Scale: 1 -- 2038 ms (51000 draws)
Scale: 1 -- 1928 ms (51000 draws)
Scale: 1.3 -- 1960 ms (51000 draws)
Scale: 1.3 -- 1927 ms (51000 draws)
Scale: 1.69 -- 2004 ms (51000 draws)
Scale: 1.69 -- 1997 ms (51000 draws)
Scale: 2.197 -- 2754 ms (51000 draws)
Scale: 2.197 -- 2792 ms (51000 draws)
Scale: 2.856 -- 2541 ms (51000 draws)
Scale: 2.856 -- 2406 ms (51000 draws)
Scale: 3.713 -- 4018 ms (51000 draws)
Scale: 3.713 -- 3837 ms (51000 draws)
Scale: 4.827 -- 3771 ms (51000 draws)
Scale: 4.827 -- 3588 ms (51000 draws)
Scale: 1 -- 444 ms (51000 draws) (pixel-aligned dest rect)
Scale: 1 -- 441 ms (51000 draws) (pixel-aligned dest rect)
No real change. Note some weirdnesses -- the spike at 0.35 scale is recurring, not a fluke of this particular run.
Compare IE9 though:
Scale: 0.269 -- 624 ms (51000 draws)
Scale: 0.269 -- 454 ms (51000 draws)
Scale: 0.35 -- 442 ms (51000 draws)
Scale: 0.35 -- 463 ms (51000 draws)
Scale: 0.455 -- 449 ms (51000 draws)
Scale: 0.455 -- 444 ms (51000 draws)
Scale: 0.592 -- 440 ms (51000 draws)
Scale: 0.592 -- 443 ms (51000 draws)
Scale: 0.769 -- 442 ms (51000 draws)
Scale: 0.769 -- 443 ms (51000 draws)
Scale: 1 -- 420 ms (51000 draws)
Scale: 1 -- 425 ms (51000 draws)
Scale: 1.3 -- 444 ms (51000 draws)
Scale: 1.3 -- 444 ms (51000 draws)
Scale: 1.69 -- 443 ms (51000 draws)
Scale: 1.69 -- 445 ms (51000 draws)
Scale: 2.197 -- 442 ms (51000 draws)
Scale: 2.197 -- 622 ms (51000 draws)
Scale: 2.856 -- 579 ms (51000 draws)
Scale: 2.856 -- 990 ms (51000 draws)
Scale: 3.713 -- 1126 ms (51000 draws)
Scale: 3.713 -- 1520 ms (51000 draws)
Scale: 4.827 -- 1671 ms (51000 draws)
Scale: 4.827 -- 2344 ms (51000 draws)
Scale: 1 -- 1200 ms (51000 draws) (pixel-aligned dest rect)
Scale: 1 -- 406 ms (51000 draws) (pixel-aligned dest rect)
I don't really understand these numbers, especially at the end, where the first scale=1 aligned run is 1200ms and the second drops to 406ms. The only thing I can think of is that there were still some of the previous draws in the pipe that weren't flushed?
You could presumably get these numbers if you happened to cache the source image at the appropriate scale. I'll try to construct a test that would thrash such a cache and see what happens.
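One way to make the run-to-run oddities in the IE9 numbers explicit is to flag every scale whose two runs disagree by more than some tolerance. This is only a sketch: the timings are copied from the comment above, while the 25% threshold and the `unstable` helper are arbitrary illustrative choices.

```python
# IE9 timings in ms for the two consecutive runs at each scale, copied
# from the comment above. Labels are strings because scale 1 appears
# twice (unaligned and pixel-aligned).
IE9_RUNS = [
    ("0.269", 624, 454),
    ("0.35", 442, 463),
    ("0.455", 449, 444),
    ("0.592", 440, 443),
    ("0.769", 442, 443),
    ("1", 420, 425),
    ("1.3", 444, 444),
    ("1.69", 443, 445),
    ("2.197", 442, 622),
    ("2.856", 579, 990),
    ("3.713", 1126, 1520),
    ("4.827", 1671, 2344),
    ("1 (aligned)", 1200, 406),
]


def unstable(runs, threshold=0.25):
    """Labels whose two runs differ by more than `threshold`,
    measured relative to the faster of the two runs."""
    flagged = []
    for label, first, second in runs:
        fast, slow = sorted((first, second))
        if (slow - fast) / fast > threshold:
            flagged.append(label)
    return flagged


print(unstable(IE9_RUNS))
```

With this threshold the flagged entries are the first scale, every scale from 2.197 upward, and the aligned scale-1 pair. The paired Firefox runs in the previous comment all stay within about 6% of each other, so none of them would be flagged.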
(Nope, even at random scales per draw we get around 2400ms, IE9 around 540ms.)
I take the above numbers back -- I had two additional patches applied, one which optimizes images to d2d surfaces, and another that uses EXTEND_PAD for drawImage. If I don't do the first, I get around 440ms at all scales! Not sure if the second affects it or not; looking into it. However, Speed Reading doesn't seem to be affected much.
As we discussed on IRC, the cause seems to largely be that we're hitting the 'slow' Clip()/Paint() path rather than the Fill() path. I've optimized this case a bit in the D2D backend in bug 600760. But the fill path is still somewhat faster.
The Demo's performance is still horrible for me.
18 FPS, AVG Draw Duration: 80 ms
Same with D3D10 and D3D9.
D2D, JM+TM ON.
Win7, HD4330.
Fishtank is OK.
On my other system: Win7 8800GT
Speed reading: 30 FPS, 28 ms
Still not good.
IE9 produces 60FPS, 15ms on both systems.
Looks good on my end:
Scale: 0.269 -- 453 ms
Scale: 0.35 -- 434 ms
Scale: 0.455 -- 432 ms
Scale: 0.592 -- 434 ms
Scale: 0.769 -- 428 ms
Scale: 1 -- 423 ms
Scale: 1.3 -- 428 ms
Scale: 1.69 -- 436 ms
Scale: 2.197 -- 412 ms
Scale: 2.856 -- 424 ms
Scale: 3.713 -- 413 ms
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20101001 Firefox/4.0b7pre - Build ID: 20101001104515
Win7, HD4330:
IE9:
Scale: 0.269 -- 857 ms (51000 draws)
Scale: 0.35 -- 705 ms (51000 draws)
Scale: 0.455 -- 680 ms (51000 draws)
Scale: 0.592 -- 681 ms (51000 draws)
Scale: 0.769 -- 682 ms (51000 draws)
Scale: 1 -- 659 ms (51000 draws)
Scale: 1.3 -- 681 ms (51000 draws)
Scale: 1.69 -- 720 ms (51000 draws)
Scale: 2.197 -- 720 ms (51000 draws)
Scale: 2.856 -- 723 ms (51000 draws)
Scale: 3.713 -- 731 ms (51000 draws)
Scale: 4.827 -- 730 ms (51000 draws)
Scale: 1 -- 682 ms (51000 draws) (pixel-aligned dest rect)
Fx4:
Scale: 0.269 -- 943 ms (51000 draws)
Scale: 0.35 -- 917 ms (51000 draws)
Scale: 0.455 -- 915 ms (51000 draws)
Scale: 0.592 -- 914 ms (51000 draws)
Scale: 0.769 -- 917 ms (51000 draws)
Scale: 1 -- 919 ms (51000 draws)
Scale: 1.3 -- 913 ms (51000 draws)
Scale: 1.69 -- 916 ms (51000 draws)
Scale: 2.197 -- 918 ms (51000 draws)
Scale: 2.856 -- 915 ms (51000 draws)
Scale: 3.713 -- 927 ms (51000 draws)
Scale: 4.827 -- 945 ms (51000 draws)
Scale: 1 -- 906 ms (51000 draws) (pixel-aligned dest rect)
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20101001 BuildID = 20101001104515
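Since results in this bug are pasted as runs of "Scale: X -- N ms" entries, a small parser makes them easier to compare. The sketch below parses the two HD4330 result sets from the comment above and computes the mean time per scale for each browser. The regular expression and helper names are illustrative assumptions; the numbers are copied from the comment.

```python
import re

# Matches one "Scale: 0.269 -- 857 ms" entry; works on one-per-line
# output as well as on entries pasted onto a single line.
ENTRY = re.compile(r"Scale:\s*([\d.]+)\s*--\s*(\d+)\s*ms")


def parse(text: str) -> list[tuple[float, int]]:
    """Extract (scale, milliseconds) pairs from pasted benchmark output."""
    return [(float(scale), int(ms)) for scale, ms in ENTRY.findall(text)]


def mean_ms(rows: list[tuple[float, int]]) -> float:
    """Mean time per scale in milliseconds."""
    return sum(ms for _, ms in rows) / len(rows)


IE9_TEXT = """
Scale: 0.269 -- 857 ms Scale: 0.35 -- 705 ms Scale: 0.455 -- 680 ms
Scale: 0.592 -- 681 ms Scale: 0.769 -- 682 ms Scale: 1 -- 659 ms
Scale: 1.3 -- 681 ms Scale: 1.69 -- 720 ms Scale: 2.197 -- 720 ms
Scale: 2.856 -- 723 ms Scale: 3.713 -- 731 ms Scale: 4.827 -- 730 ms
Scale: 1 -- 682 ms
"""
FX4_TEXT = """
Scale: 0.269 -- 943 ms Scale: 0.35 -- 917 ms Scale: 0.455 -- 915 ms
Scale: 0.592 -- 914 ms Scale: 0.769 -- 917 ms Scale: 1 -- 919 ms
Scale: 1.3 -- 913 ms Scale: 1.69 -- 916 ms Scale: 2.197 -- 918 ms
Scale: 2.856 -- 915 ms Scale: 3.713 -- 927 ms Scale: 4.827 -- 945 ms
Scale: 1 -- 906 ms
"""

ie9_rows = parse(IE9_TEXT)
fx4_rows = parse(FX4_TEXT)
ratio = mean_ms(fx4_rows) / mean_ms(ie9_rows)
print(f"IE9 mean: {mean_ms(ie9_rows):.0f} ms, Fx4 mean: {mean_ms(fx4_rows):.0f} ms")
print(f"Fx4 / IE9: {ratio:.2f}x")
```

On this machine the gap is roughly 1.3x, far smaller than the 5-6x seen in the original report.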
Assignee: nobody → vladimir
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101007 Firefox/4.0b8pre
Radeon HD 4650, Catalyst 10.9a driver
layers.use-d3d10 true
Speedreading score 479 seconds. With IE 19 seconds.
Minefield:
Scale: 0.269 -- 725 ms (51000 draws)
Scale: 0.35 -- 720 ms (51000 draws)
Scale: 0.455 -- 710 ms (51000 draws)
Scale: 0.592 -- 719 ms (51000 draws)
Scale: 0.769 -- 705 ms (51000 draws)
Scale: 1 -- 703 ms (51000 draws)
Scale: 1.3 -- 709 ms (51000 draws)
Scale: 1.69 -- 735 ms (51000 draws)
Scale: 2.197 -- 715 ms (51000 draws)
Scale: 2.856 -- 706 ms (51000 draws)
Scale: 3.713 -- 728 ms (51000 draws)
Scale: 4.827 -- 734 ms (51000 draws)
Scale: 1 -- 683 ms (51000 draws) (pixel-aligned dest rect)
IE9:
Scale: 0.269 -- 721 ms (51000 draws)
Scale: 0.35 -- 579 ms (51000 draws)
Scale: 0.455 -- 556 ms (51000 draws)
Scale: 0.592 -- 589 ms (51000 draws)
Scale: 0.769 -- 566 ms (51000 draws)
Scale: 1 -- 513 ms (51000 draws)
Scale: 1.3 -- 568 ms (51000 draws)
Scale: 1.69 -- 583 ms (51000 draws)
Scale: 2.197 -- 579 ms (51000 draws)
Scale: 2.856 -- 580 ms (51000 draws)
Scale: 3.713 -- 582 ms (51000 draws)
Scale: 4.827 -- 598 ms (51000 draws)
Scale: 1 -- 563 ms (51000 draws) (pixel-aligned dest rect)
layers.use-d3d10 false
Scale: 0.269 -- 727 ms (51000 draws)
Scale: 0.35 -- 723 ms (51000 draws)
Scale: 0.455 -- 761 ms (51000 draws)
Scale: 0.592 -- 731 ms (51000 draws)
Scale: 0.769 -- 723 ms (51000 draws)
Scale: 1 -- 718 ms (51000 draws)
Scale: 1.3 -- 752 ms (51000 draws)
Scale: 1.69 -- 745 ms (51000 draws)
Scale: 2.197 -- 709 ms (51000 draws)
Scale: 2.856 -- 748 ms (51000 draws)
Scale: 3.713 -- 727 ms (51000 draws)
Scale: 4.827 -- 748 ms (51000 draws)
Scale: 1 -- 687 ms (51000 draws) (pixel-aligned dest rect)
We don't need to block Firefox 4 on this benchmark.
blocking2.0: betaN+ → -
With the latest nightly, FPS: 29, Average duration: 34ms. That sounds good to me. Vladimir, does it work for you now?
The drawing duration is okay for me, but somehow the actual drawing is really slow.
IE9: score: 19 sec, avg draw duration: 21ms
Fx4: score: 487 sec, avg draw duration: 21ms
I noticed that recent nightlies (e.g. Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b11pre) Gecko/20110131 Firefox/4.0b11pre ID:20110131030335) give a very slow "Speed Reading" result if bookmarks are displayed in the sidebar. With the bookmarks sidebar displayed, the Speed Reading score was ~700 seconds. With the sidebar hidden, the Speed Reading score was 30 seconds. This data is from a mid-range Windows 7 machine. On Mac OS X, having the bookmarks sidebar displayed also slows the Speed Reading time.
Switching the bookmarks sidebar on and off while Speed Reading is running causes the frame rate and redraw speed to slow down and speed up, respectively. The bookmarks sidebar is interfering with Speed Reading.
David, I filed bug 630446 on that issue.
Assignee: vladimir → nobody
Here are recent scores for me:
* 18.0a1/20120916: 48 sec
* IE 9: 34 sec
* Chrome 21: 55 sec
Using Mozilla Firefox 19 Nightly (20121103030715)
FPS: 7
Total Billboard Draws: 16870
Average Draw Duration: 128ms
Window Size: 1600x1097
Seconds: 2120
(tested also with "gfx.direct2d.force-enabled;true" and "gfx.font_rendering.directwrite.enabled;true")
Using Google Chrome 25 canary (25.0.1316.0)
FPS: 60
Total Billboard Draws: 782
Average Draw Duration: 3ms
Window Size: 1600x1099
Seconds: 7
Using Internet Explorer 9 (9.0.8112.16421 Update 10)
FPS: 47
Total Billboard Draws: 1100
Average Draw Duration: 21ms
Window Size: 1600x1105
Seconds: 21
More information: http://forums.mozillazine.org/viewtopic.php?f=23&t=2603873
(In reply to MrX1980 from comment #30)
> Using Mozilla Firefox 19 Nightly (20121103030715)
> FPS: 7
> Total Billboard Draws: 16870
> Average Draw Duration: 128ms
> Window Size: 1600x1097
> Seconds: 2120
> (tested also with "gfx.direct2d.force-enabled;true" and
> "gfx.font_rendering.directwrite.enabled;true")
>
> Using Google Chrome 25 canary (25.0.1316.0)
> FPS: 60
> Total Billboard Draws: 782
> Average Draw Duration: 3ms
> Window Size: 1600x1099
> Seconds: 7
>
> Using Internet Explorer 9 (9.0.8112.16421 Update 10)
> FPS: 47
> Total Billboard Draws: 1100
> Average Draw Duration: 21ms
> Window Size: 1600x1105
> Seconds: 21
>
> More information:
> http://forums.mozillazine.org/viewtopic.php?f=23&t=2603873
Could you post the graphics section from your about:support? I suspect you're getting blacklisted.
(In reply to Bas Schouten (:bas.schouten) from comment #31)
> (In reply to MrX1980 from comment #30)
>
> Could you post the graphics section from your about:support? I suspect you're
> getting blacklisted.
Even though force-enabled is true, I mean, as you'd also have to force-enable gfx.layers.force-enable.
Here are my system / Firefox / Google Chrome specs.
gfx.layers (or similar) does not exist in about:config.
Was it renamed to "layers.acceleration.force-enabled"?
I have now tested:
layers.acceleration.force-enabled;true
layers.prefer-d3d9;true/false or layers.prefer-opengl;true/false
webgl.force-layers-readback;true/false
Still the same slow average of ~128ms.
(In reply to MrX1980 from comment #33)
> Created attachment 678117 [details]
> about:support, dxdiag, aida64, gpu-internals, ...
>
> Here are my system / Firefox / Google Chrome specs.
>
> gfx.layers (or similar) not exist in about:config
> Was it renamed to "layers.acceleration.force-enabled" ?
>
> I have now tested:
> layers.acceleration.force-enabled;true
> layers.prefer-d3d9;true/false or layers.prefer-opengl;true/false
> webgl.force-layers-readback;true/false
>
> Still the same slow average ~128ms
Ah, that's still a DirectX 9 card; sadly, we do not support canvas acceleration on pre-DirectX 10 hardware. That would explain the slow result on your machine.
I found a compromise. Based on the blog post below, I have now set "gfx.canvas.azure.backends;direct2d,cg,skia,cairo". So skia is now active for me and faster than cairo.
Here are the skia results:
Using Mozilla Firefox 19 Nightly (20121104030714)
FPS: 12
Total Billboard Draws: 16886
Average Draw Duration: 77ms
Window Size: 1600x1097
Seconds: 1299
http://featherweightmusings.blogspot.de/2012/08/azure-canvas-in-firefox-nightlies.html
Nightly 27: 10 seconds
FPS: 60
Total Billboard Draws: 924
Average Draw Duration: 11ms
Window Size: 1366x618
Chrome 29: 9 seconds
FPS: 60
Total Billboard Draws: 792
Average Draw Duration: 08ms
Window Size: 1366x667
IE 10: 7 seconds
FPS: 60
Total Billboard Draws: 787
Average Draw Duration: 08ms
Window Size: 1366x673
Setting gfx.canvas.azure.backends to skia on Mac OS X speeds up not only drawImage but everything. Why not use it by default? The default 'cg' setting is slow and choppy.
On Windows 7, Nightly is faster than IE 11, and for some reason Chrome 59 is very slow. Bug 932958 and bug 1150944 enabled skia on Mac OS X. I'm gonna mark this exact issue as fixed.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME