Open
Bug 598202
Opened 14 years ago
Updated 2 years ago
Significant Zimbra perf regression due to d2d
Categories
(Core :: Graphics, defect)
Tracking
()
NEW
Tracking | Status | |
---|---|---|
blocking2.0 | --- | .x+ |
People
(Reporter: jst, Unassigned)
References
(Blocks 1 open bug, )
Details
(Whiteboard: [softblocker][final?][approved-patches-landed])
Attachments
(2 files)
2.61 KB,
patch
|
Details | Diff | Splinter Review | |
2.11 KB,
patch
|
vlad
:
review+
vlad
:
approval2.0+
|
Details | Diff | Splinter Review |
We've taken a major hit (2x) on a decent number of the tests in the zimbra perf test suite (non-public, available internally at Mozilla), and if I disable d2d (by setting "gfx.direct2d.disabled" to true) the performance goes back to what I would expect it to be. Joe tells me that this is likely due to how d2d interoperates d3d, with Bas is working on, so this bug is primarily to track that Bas' work really does fix the regression as seen here as well.
For anyone looking at the graphs in the url field, ignore the first test, it's very unpredictable, but the rest are very repeatable. If anyone wants to run these tests themselves, let me know and I can tell you where they can be loaded.
Reporter | ||
Comment 1•14 years ago
|
||
Blocking betaN for now, but blocking anything else is fine with me, as long as it blocks 2.0 :)
blocking2.0: --- → betaN+
Comment 2•14 years ago
|
||
I'm not so sure if this is caused by interop (it's possible, I just see no evidence of that). It all depends on how Zimbra does it's drawing, it's certainly trivial for me to come up with some code that draws 5 times slower on D2D than it does on GDI.
If Zimbra is triggering a lot of those scenarios, that would certainly be a problem. What for example could currently be very slow is any code that uses 'special' CSS borders. It's possible that that has to do with it, I'm not sure how much border drawing paths get drawn, we should just profile these tests.
![]() |
||
Comment 3•14 years ago
|
||
Border drawing is certainly one of the things I remember seeing in the painting parts of the Zimbra profiles.
What border styles are problematic for d2d? I can try to write a patch to disable those somehow so Johnny can test this theory.
Comment 4•14 years ago
|
||
(In reply to comment #3)
> Border drawing is certainly one of the things I remember seeing in the painting
> parts of the Zimbra profiles.
>
> What border styles are problematic for d2d? I can try to write a patch to
> disable those somehow so Johnny can test this theory.
Anything that uses clip or OPERATOR_ADD (which is a -lot- of the border code), we should try just commenting all those clips and OPERATOR_ADDs out. And then see how performance improves, and use that as a guideline. This might actually be the perfect test case giving us some factual evidence our border drawing code is harmful for Direct2D.
We should simply stop jumping through all kinds of nasty hoops to do anti-aliasing on border corners for example, at least when using D2D. Jeff tells me no other browser actually does this.
The clips should be avoidable mostly just by doing the drawing differently, that should greatly improve the performance for D2D but it -might- regress the performance for GDI so we need to look at that.
There's bug 588271 to track this issue.
![]() |
||
Comment 5•14 years ago
|
||
I can comment out some clips and test. What does it mean to comment out OPERATOR_ADD? Just replace with OVER or SOURCE?
Comment 6•14 years ago
|
||
(In reply to comment #5)
> I can comment out some clips and test. What does it mean to comment out
> OPERATOR_ADD? Just replace with OVER or SOURCE?
Over would do fine!
![]() |
||
Comment 7•14 years ago
|
||
I _think_ this kills the relevant OPERATOR_ADDs and clips. jst, do you want to give this a spin?
Reporter | ||
Comment 8•14 years ago
|
||
I'm afraid that patch made things worse:
https://spreadsheets.google.com/ccc?key=0AiY-NtfakhpidHlSeHdIWnJXOE1xY193ZnBuSGJWSEE&hl=en
I think the Clip() removals there potentially hurt. FWIW, I'm perfectly OK with changing ADD to OVER, and living with the artifacts at intersections. ADD/OVER has nothing to do with rounded borders; the reason why ADD is used is so that the intersection between two border edges that have different colors is handled correctly. The artifacts can be seen here: https://bug368247.bugzilla.mozilla.org/attachment.cgi?id=254651 . Compare our rendering and Chrome's (ignore that ours have border radius and theirs don't, the testcase just uses -moz-border-radius). Specifically, the third testcase -- note the dark colors at the edges.
![]() |
||
Comment 10•14 years ago
|
||
Yeah, I was about to say that. What happens if you only make the s/ADD/OVER/ change and keep the Clip() calls?
Comment 11•14 years ago
|
||
We could certainly try that, I'm surprised removing the clips wouldn't help though, they helped a lot in my local test. But let's with all the clips in and just ADD replaced by OVER.
Reporter | ||
Comment 12•14 years ago
|
||
With the patch being only the change from OPERATOR_ADD to OPERATOR_OVER we do get a speedup on the zimbra tests, but only a very small one. Results in the graph linked to in this bug...
Comment 13•14 years ago
|
||
Johnny, did you ever end up doing a test with D2D only and software only?
![]() |
||
Comment 14•14 years ago
|
||
We should also retest with the patches in bug 599118 (possibly just once it lands?)
Depends on: 599118
Comment 15•14 years ago
|
||
(In reply to comment #14)
> We should also retest with the patches in bug 599118 (possibly just once it
> lands?)
It landed :).
Comment 16•14 years ago
|
||
Is the standard deviation on some tests rather large? ADD will -never- be faster than OVER, yet on the go to 'Tasks' tests has the OVER fix show slower than the original ADD measurement.
![]() |
||
Comment 17•14 years ago
|
||
> Is the standard deviation on some tests rather large?
Depends on how you measure "large". Not compared to the size of the regression we're seeing. It does look like the difference between OVER and ADD for the "go to Tasks" test is just noise.
Reporter | ||
Comment 18•14 years ago
|
||
Test results now up that show the performance right before Bas' fix for bug 599118. With that fix we're doing better, but still far worse than with d2d disabled. Those numbers are w/o changing ADD to OVER. Check the url in this bug, look at the sheet named "2010-09-24 bas' d2d fix".
Reporter | ||
Comment 19•14 years ago
|
||
(In reply to comment #18)
> Test results now up that show the performance right before Bas' fix for bug
> 599118.
... and right after, and then also with d2d disabled in that latter build.
Comment 20•14 years ago
|
||
(In reply to comment #19)
> (In reply to comment #18)
> > Test results now up that show the performance right before Bas' fix for bug
> > 599118.
>
> ... and right after, and then also with d2d disabled in that latter build.
I might be looking at the wrong place, but I can't see the results anywhere? :)
![]() |
||
Comment 21•14 years ago
|
||
You need to switch to the second tab from the left on the bottom of the spreadsheet. The one labeled "2010-09-24 bas' d2d fix".
Comment 22•14 years ago
|
||
(In reply to comment #21)
> You need to switch to the second tab from the left on the bottom of the
> spreadsheet. The one labeled "2010-09-24 bas' d2d fix".
Wow, this is fascinating, these results suggests my fix made some test cases worse.. that's really.. well, unexpected ... impossible almost. Although I guess the noise could be a bit higher than expected. Well we should try this with D3D10 layers, (I don't expect that to fix it though!) and then profile it to see what's taking the time.
Comment 23•14 years ago
|
||
We really do need this run with d2d only (accelerated layers off) and software only (hw accel disabled).
layers.accelerate-none;true
Options dialog > Advanced > General > Use hardware acceleration where available.
Comment 24•14 years ago
|
||
Any progress on getting us numbers as I requested in comment 23? Also, testing with the d3d10 backend would be instructive.
Assignee: nobody → jst
Comment 25•14 years ago
|
||
Under D2D, we seem to be spending roughly 75% of our painting time drawing borders. Painting accounts for roughly 30% of our processing time.
30% of our processing time seems to be spent doing reflows.
![]() |
||
Comment 26•14 years ago
|
||
Note that the testcase is measuring wall-clock time, not processing time, and it does so across running the event loop. So anything that causes us to block will also show up in the times it measures.
30% CPU time under reflow sounds about right to me based on profiling this on Mac, for what it's worth...
Comment 27•14 years ago
|
||
This patch takes time taken by border drawing rendering to practically 0. It seems they decided to, in -addition- to specifying transparent border colors, specify moz-border-*-colors as transparent.
Our comment here stated we don't check for that, because it's kind of silly, but I guess it's better if we do, it gives Zimbra a huge painting perf boost. This patch adds that check and updates the comment.
Attachment #486506 -
Flags: review?(vladimir)
Attachment #486506 -
Flags: approval2.0?
Comment on attachment 486506 [details] [diff] [review]
Avoid more transparent border drawing
Nice catch!
Attachment #486506 -
Flags: review?(vladimir)
Attachment #486506 -
Flags: review+
Attachment #486506 -
Flags: approval2.0?
Attachment #486506 -
Flags: approval2.0+
Comment 29•14 years ago
|
||
With the landing of http://hg.mozilla.org/mozilla-central/rev/83e5fdf07535 this should have improved a lot! It would be great if we can retest with that patch in.
Comment 30•14 years ago
|
||
(In reply to comment #27)
> This patch takes time taken by border drawing rendering to practically 0. It
> seems they decided to, in -addition- to specifying transparent border colors,
> specify moz-border-*-colors as transparent.
Probably because of bug 482692?
Reporter | ||
Comment 31•14 years ago
|
||
Ok, good news, the patch that Bas landed (comment 29) made a huge difference here. I updated the google spreadsheet linked to in this bug with charts showing how we perform with no changes, with hw accel disabled (in the prefs UI), with d2d disabled, and with layer acceleration disabled. I ran these tests with a build built from 20101105, which means the fix wasn't included, once I realized that I re-ran with layer acceleration disabled in a build and included those results in the spreadsheet too and the graph speaks for itself. The last set of data in that graph (gray) shows that the last 10 tests now perform as well as they used to, but there's possibly something still going on with the "go to'*'" tests, I'll need to repeat this with the new build with a clean profile to know...
The results are in the sheet named "hw accel".
Comment 32•14 years ago
|
||
(In reply to comment #31)
> Ok, good news, the patch that Bas landed (comment 29) made a huge difference
> here. I updated the google spreadsheet linked to in this bug with charts
> showing how we perform with no changes, with hw accel disabled (in the prefs
> UI), with d2d disabled, and with layer acceleration disabled. I ran these tests
> with a build built from 20101105, which means the fix wasn't included, once I
> realized that I re-ran with layer acceleration disabled in a build and included
> those results in the spreadsheet too and the graph speaks for itself. The last
> set of data in that graph (gray) shows that the last 10 tests now perform as
> well as they used to, but there's possibly something still going on with the
> "go to'*'" tests, I'll need to repeat this with the new build with a clean
> profile to know...
>
> The results are in the sheet named "hw accel".
Yeah, the 'go to' results are interesting! I'd much like to see the results with layer accel on as well (like you say, clean profile will do it!). It's good to see we're making considerable progress it seems.
Reporter | ||
Comment 33•14 years ago
|
||
I re-ran the numbers with todays nightly (so pgo optimized build vs not in my previous results) with a clean profile and with d2d disabled. Something's definitely going on with some of the "go to '*'" tests, but other than that this is looking good! I added the results to the sheet "hw accel 20101112" in the same spreadsheet.
Comment 34•14 years ago
|
||
(In reply to comment #33)
> I re-ran the numbers with todays nightly (so pgo optimized build vs not in my
> previous results) with a clean profile and with d2d disabled. Something's
> definitely going on with some of the "go to '*'" tests, but other than that
> this is looking good! I added the results to the sheet "hw accel 20101112" in
> the same spreadsheet.
I'm also happy to see that it seems this patch gives us some improvements on no-accel builds as far as I can see (as expected!)
Comment 35•14 years ago
|
||
With the landing of 588271, this has hopefully improved further.
Reporter | ||
Updated•14 years ago
|
Whiteboard: softblocker
Comment 36•14 years ago
|
||
Does this bug still exist? The improved border rendering code should have landed and it shouldn't. (but it might still)
Updated•14 years ago
|
Whiteboard: softblocker → [soft blocker]
Updated•14 years ago
|
Whiteboard: [soft blocker] → [softblocker]
Comment 37•14 years ago
|
||
Is there a good reason that this blocks betaN? If not, it should be moved over to final+. (Note: comment 1!)
Whiteboard: [softblocker] → [softblocker][final?]
Comment 38•14 years ago
|
||
Has this patch landed? Should it? Has it bitrotten, entirely?
Comment 39•14 years ago
|
||
It landed.
Comment 40•14 years ago
|
||
I believe this bug does not exist anymore after the improvements we made, but I don't have access to the test suite so I can't check.
Comment 41•14 years ago
|
||
jst: can you retest this please?
Comment 42•14 years ago
|
||
** PRODUCT DRIVERS PLEASE NOTE **
This bug is one of 7 automatically changed from blocking2.0:betaN+ to blocking2.0:.x during the endgame of Firefox 4 for the following reasons:
- it was marked as a soft blocking issue for a beta
- the whiteboard indicated that it might be appropriate for a .x release
blocking2.0: betaN+ → .x+
Updated•14 years ago
|
Whiteboard: [softblocker][final?] → [softblocker][final?][approved-patches-landed]
Comment 43•3 years ago
|
||
The bug assignee didn't login in Bugzilla in the last 7 months.
:bhood, could you have a look please?
For more information, please visit auto_nag documentation.
Assignee: jstenback+bmo → nobody
Flags: needinfo?(bhood)
Updated•3 years ago
|
Flags: needinfo?(bhood)
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•