Closed
Bug 1344131
Opened 8 years ago
Closed 7 years ago
stylo: Setting inline style can be a lot slower than Gecko
Categories
(Core :: CSS Parsing and Computation, defect, P4)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: bzbarsky, Assigned: SimonSapin)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
Attachments
(1 file)
59 bytes,
text/x-review-board-request
|
Details |
Take this microbenchmark:
<script>
var div = document.createElement("div");
var s = div.style;
var count = 1000000;
var start = new Date;
for (var i = 0; i < count; ++i) {
s.top = "10px";
}
var stop = new Date;
document.write(stop - start);
</script>
In Gecko I get numbers on the order of 600ms. In Stylo I get numbers on the order of 900ms.
For scale, by the way, on that testcase Chrome is at ~300ms and Safari at ~130ms, so even the non-stylo number is not amazing.
![]() |
Reporter | |
Comment 1•8 years ago
|
||
> In Gecko I get numbers on the order of 600ms.
Correction, closer to 300ms if I do an apples to apples comparison (same profile, no random other tabs running, etc).
![]() |
Reporter | |
Comment 2•8 years ago
|
||
So I tried running this testcase (with a larger iteration count) for about 14s in both Gecko an stylo.
In Gecko, I see time break down like so:
1s: nsDOMCSSAttributeDeclaration::GetCSSDeclaration
1s: under nsDocument::BeginUpdate
1.5s: under nsDocument::EndUpdate (NS_IsMainThread shows up under RemoveScriptBlocker here....)
1.5s: under nsDOMCSSAttributeDeclaration::GetCSSParsingEnvironment
7s: under nsCSSParser::ParseProperty
and then some long-tail stuff.
In stylo, it looks like this:
0.5s: nsDOMCSSAttributeDeclaration::GetCSSDeclaration
0.5s: AppendUTF16toUTF8 (converting the property value to UTF-8)
0.5s: under nsDocument::BeginUpdate
0.6s: under nsDocument::EndUpdate
0.7s: under nsDOMCSSAttributeDeclaration::GetCSSParsingEnvironment
3.5s: under nsStyledElement::SetInlineStyleDeclaration.
7s: ServoDeclarationBlock_SetPropertyById.
Rescaling that to the Gecko times by using the overal timings, or the BeginUpdate/EndUpdate stuff, that means that for the same number of iterations:
1) ServoDeclarationBlock_SetPropertyById takes 14+s vs 7s for nsCSSParser::ParseProperty.
2) There's an extra cost of 7s for nsStyledElement::SetInlineStyleDeclaration.
First, the simple bit. Why are we ending up under SetInlineStyleDeclaration? That means that "changed" was true near the tail end of nsDOMCSSDeclaration::ParsePropertyValue. And then we go to do a bunch of ~nsAttrValue, freeing misc containers, SetAttrAndNotify, constructing nsAttrValues and allocating their misc containers, etc.
But why is "changed" true? It shouldn't be! But it looks like servo just returns whether the value parsed successfully. Whereas Gecko ends up in http://searchfox.org/mozilla-central/rev/546a05fec017cb785fca62a5199d42812a6a1fd3/layout/style/nsCSSDataBlock.cpp#37 and actually checks whether the value changed.
Now the more complicated bit: why is ServoDeclarationBlock_SetPropertyById slow? Profile says a few things about the servo side:
1) Directly under glue::set_property we're doing malloc/free traffic. That accounts for about
1/6 of the time ServoDeclarationBlock_SetPropertyById takes (a bit over 1.1s out of 7s).
2) We're doing some spinlock stuff directly under glue::set_property. That takes 100-200ms (it's
pretty scattered in the profile, hard to add it all up).
3) drop::hd805e44d693ee661 directly under glue::set_property is 150ms.
4) Under declaration_block::parse_one_declaration we have more malloc/free and spinlock traffic.
That takes up about 2s.
5) The actual call to PropertyDeclaration::parse is about 2.7s, which is comparable to the time
Gecko's parser bits take, or a bit less. It's hard to tell what the right apples to apples
comparison is here.
Anyway, I filed bug 1344135 on the boolean return value being wrong and bug 1344136 on nearly half the time under ServoDeclarationBlock_SetPropertyById being spent on malloc, free, and spinlocks.
Comment 3•8 years ago
|
||
I'm more curious about the spinlock... why do we need spinlock there? We are on the main thread, and there shouldn't be any retention.
![]() |
Reporter | |
Comment 4•8 years ago
|
||
The total spinlock time in my profile is order of 400ms out of 14s, so about 3%. A single set takes about 900ns. So about 26ns of spinlocking per call, at a guess, on 1.5-year-old laptop hardware. That does sound a little high.
Updated•8 years ago
|
Updated•8 years ago
|
Priority: -- → P2
Updated•8 years ago
|
Assignee: nobody → simon.sapin
Comment 5•8 years ago
|
||
It seems several optimizations have been landed since this bug filed. Probably we should profile again and see what's going on now?
BTW, we probably should open a bug for having IDL binding pass in raw JS string for CSSOM, so that we can avoid double conversion in many cases.
Assignee | ||
Comment 6•8 years ago
|
||
Bug 1355106 is relevant to the double conversion.
![]() |
Reporter | |
Comment 7•8 years ago
|
||
Passing raw JS string is rather annoying in various ways. But yes, we could temporarily do something like that.
Bug 1355106 is a bit hard to do until we drop non-stylo or stop caring about its performance... So it's not going to be the bullet that saves us. And note that in the common case bindings don't really do "conversion" on a JS string: they either do a memcpy if it's non-latin1 or a fast byte-expanding copy if it is. So you will still end up doing just as much conversion as now even if you're coming from a JS string directly. Things will be a bit faster because you touch less memory, of course.
![]() |
Reporter | |
Comment 8•8 years ago
|
||
Oh, and now that I have builds I can report on the remeasurement and reprofiling.
In Gecko nightly, I get times of about 400ms on the microbenchmark from comment 0. Yes, I know comment 0 says 600ms. In Firefox 53 I get about 600ms. So we made some serious progress here at some point even for gecko. Looks like it happened between 53 and 54; I get 400ms in 54.
In stylo nightly I get times of about 430ms. This is still a bit worse than (new) Gecko, but a good bit better than 53 was.
For stylo, time breakdown under nsDOMCSSDeclaration::ParsePropertyValue looks like this now:
8% EndUpdate notifications.
8% AppendUTF16toUTF8
8% BeginUpdate notifiactions
6% under nsDOMCSSAttributeDeclaration::GetCSSDeclaration; about half
is AttributeWillChange, 1% is nsContentUtils::HasMutationListeners,
the rest is Element::GetInlineStyleDeclaration.
4% under nsDOMCSSAttributeDeclaration::GetServoCSSParsingEnvironment. This is
1% GetDefaultStyleAttrURLData, 1% GetBindingParent, 1% GetURLDataForStyleAttr,
1% self time.
57% is under Servo_DeclaraionBlock_SetPropertyById, breaking down as:
1% getting PropertyIds from nsCSSPropertyId.
8% under PropertyDeclarationBlock::extend_common. There's various vec::reserve
stuff there, comparisons under push_common, etc. I dunno why the reserve()
shows up at all; we should not be changing the length here.
The rest of the time is pretty much under parse_one_declaration_into. The only weird bit there is that I see _two_ codepaths that land us in tokenizer::consume_numeric. One is:
PropertyDeclaration::parse_into calls
LengthOrPercentageOrAuto::parse_internal calls
Parser::next_including_whitespace_and_comments calls
tokenizer::next_token calls
tokenizer::consume_numeric.
The other one is:
PropertyDeclaration::parse_into calls
style..properties..CSSWideKeyword$u20$as$u20$style..parser..Parse$GT$::parse::heb90e588fd20a8e6 calls
Parser::next_including_whitespace_and_comments calls
tokenizer::next_token calls
tokenizer::consume_numeric.
I'm not sure what that symbol is that's being called from parse_into here, and hence whether we're really tokenizing the number twice or the profiler is just being confused. If there's a way to get better information about what that symbol is, I'd love to know!
Comment 9•8 years ago
|
||
Comment 10•8 years ago
|
||
Also note that, in Gecko, we are still spending time on computing xml:base, which should be much expensive than GetServoCSSParsingEnvironment. It's worth checking how long would Gecko take if we change nsIContent::GetBaseURIForStyleAttr() to just return GetBaseURIWithoutXMLBase().
Comment 11•8 years ago
|
||
We could give Parser a lookahead buffer to store any single token that we unget (when a try() function does a reset() upon detecting failure). But Tokens look pretty big, so I'm not sure we want to be cloning them out of that lookahead buffer all the time.
Parser::expect_ident and friends return any unexpected token in the error part of their Result. Maybe we should have versions of these functions that don't need to know what token it did unexpectedly fine. Then we could have them do a quick check of the leading bytes to see what token we have next, so that when trying to parse a an "inherit" when the next byte is a digit, we could return quickly. (We would have to look past white space and comments when doing that.)
Assignee | ||
Comment 12•8 years ago
|
||
> The only weird bit there is that I see _two_ codepaths that land us in tokenizer::consume_numeric.
Yes. cssparser’s `try` method works by rewinding the tokenizer, so we end up doing the same work twice.
> We could give Parser a lookahead buffer to store any single token that we unget (when a try() function does a reset() upon detecting failure). But Tokens look pretty big, so I'm not sure we want to be cloning them out of that lookahead buffer all the time.
We’ve been talking about adding that buffer and reducing the size of tokens. I’ll work on that next. Checking leading bytes also sounds interesting. It’ll break through the nice abstraction between tokenizer and parser, but it may be worth it.
![]() |
Reporter | |
Comment 13•8 years ago
|
||
> It's worth checking how long would Gecko take if we change nsIContent::GetBaseURIForStyleAttr() to just return GetBaseURIWithoutXMLBase().
Then the Gecko time drops to about 360ms.
> Yes. cssparser’s `try` method works by rewinding the tokenizer, so we end up doing the same work twice.
That seems like something we should really fix; some tokens are pretty expensive to tokenize. Is there an existing bug/issue tracking this?
(Gecko solves this by handing out a borrow to an internal token storage, but I'm pretty sure the borrow checker would correctly point out that this is insane in terms of guaranteeing safety statically.)
Flags: needinfo?(simon.sapin)
Assignee | ||
Comment 14•8 years ago
|
||
There is bug 1347408.
Yes, separate "borrow the current token" and "advance to the next token" APIs would help eliminate this overhead. The current API gives ownership of the token to the caller. Token contains Cow<str> for ident-like things, which borrows the input when possible and has an allocated String when backslash-escapes are involves. Some callers take advantage of this ownership to avoid re-allocating. Although maybe backslash-escape are rare enough that we shouldn’t care about optimizing them?
Another idea is to replace Cow<str> with something that always borrows the input and does unescaping on demand. This would eliminate String allocations in Token, making it cheap to clone.
Flags: needinfo?(simon.sapin)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Comment 16•8 years ago
|
||
mozreview-review |
Comment on attachment 8887360 [details]
Bug 1344131 - Add a gtest bench for Servo_DeclarationBlock_SetPropertyById.
https://reviewboard.mozilla.org/r/158194/#review163370
::: layout/style/test/gtest/StyloParsingBench.cpp:71
(Diff revision 1)
> + RefPtr<RawServoDeclarationBlock> block = Servo_DeclarationBlock_CreateEmpty().Consume();
> + RefPtr<URLExtraData> data = new URLExtraData(
> + NullPrincipalURI::Create(), nullptr, NullPrincipal::Create());
> +
> + NS_NAMED_LITERAL_CSTRING(css_, " 10px");
> + const nsACString& css = css_;
You should be able to do:
```
const nsAutoCString css = NS_LITERAL_CSTRING(" 10px");
```
Or something like that, and the `&css` above should still work.
::: layout/style/test/gtest/StyloParsingBench.cpp:83
(Diff revision 1)
> + &css,
> + /* is_important = */ false,
> + data,
> + ParsingMode::Default,
> + eCompatibility_FullStandards,
> + NULL
nit: nullptr.
Attachment #8887360 -
Flags: review+
Comment 17•8 years ago
|
||
mozreview-review |
Comment on attachment 8887360 [details]
Bug 1344131 - Add a gtest bench for Servo_DeclarationBlock_SetPropertyById.
https://reviewboard.mozilla.org/r/158194/#review163380
![]() |
Reporter | |
Comment 18•8 years ago
|
||
I'm a bit confused. Why can't you just do:
NS_NAMED_LITERAL_CSTRING(css, " 10px");
And then use &css as needed?
Comment hidden (mozreview-request) |
Assignee | ||
Comment 20•8 years ago
|
||
Changed NULL to nullptr.
I’m quite confused with ns*String APIs myself. This bit copy/pasted from the other micro-benchmark in that file, which I originally wrote pretty much by trying things at random until something compiled.
Boris, when I try what you suggest I get this error message: "error: cannot convert ‘const nsLiteralCString*’ to ‘const nsACString*’". My (limited) understanding is that that conversion happens implicitly in an assignment to a new local variable, but not in a argument to a function call, or at least not "through" the & operator.
Emilio, your suggestion gives: "error: conversion from 'const nsLiteralCString' to non-scalar type 'const nsAutoCString' requested".
![]() |
Reporter | |
Comment 21•8 years ago
|
||
Oh, nsLiteralCString does not inherit (any more?) from nsACString? That's annoying, ok.
Comment hidden (mozreview-request) |
Comment 23•8 years ago
|
||
Pushed by simon.sapin@exyr.org:
https://hg.mozilla.org/integration/autoland/rev/d1f7d9c6394e
Add a gtest bench for Servo_DeclarationBlock_SetPropertyById. r=emilio
Comment 24•8 years ago
|
||
bugherder |
Assignee | ||
Comment 25•8 years ago
|
||
https://github.com/servo/servo/pull/17820 should help with this.
Updated•8 years ago
|
Priority: P2 → --
![]() |
Reporter | |
Updated•8 years ago
|
Priority: -- → P3
Comment 26•8 years ago
|
||
In the microbenchmark in comment 0, I get about 530 gecko vs 630 stylo. This is probably fine for shipping.
Priority: P3 → P4
Comment 27•8 years ago
|
||
status-firefox57=wontfix unless someone thinks this bug should block 57
status-firefox57:
--- → wontfix
Comment 28•7 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:SimonSapin, maybe it's time to close this bug?
Flags: needinfo?(simon.sapin)
Comment 29•7 years ago
|
||
I don't think this bug needs to be open at this point. We can file a new one if we want to improve the situation further.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(simon.sapin)
Resolution: --- → WORKSFORME
Updated•7 years ago
|
Keywords: leave-open
You need to log in
before you can comment on or make changes to this bug.
Description
•