Closed
Bug 1428285
Opened 6 years ago
Closed 6 years ago
stylo: PropertyDeclaration::id() shows up in profile
Categories
(Core :: CSS Parsing and Computation, enhancement, P3)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox59 | --- | affected |
People
(Reporter: xidorn, Unassigned)
References
Details
(Keywords: perf)
In a profile for tpaint, I see PropertyDeclaration::id() shows up in it. It takes 3ms (30 samples) which consists 0.8% of the measured range. The profile I'm looking at is https://perfht.ml/2CsDCaI It's a surprising one, and thus I checked the assembly of that method, it is something like: > 00007FF816417202 movsxd rax,dword ptr [rdx+rax*4] > 00007FF816417206 add rax,rdx > 00007FF816417209 jmp rax > 00007FF81641720B mov ax,1 > 00007FF81641720F jmp style::properties::PropertyDeclaration::id+0B9Bh (07FF816417D3Bh) > 00007FF816417214 xor eax,eax > 00007FF816417216 jmp style::properties::PropertyDeclaration::id+0B9Bh (07FF816417D3Bh) > 00007FF81641721B mov ax,2 > 00007FF81641721F jmp style::properties::PropertyDeclaration::id+0B9Bh (07FF816417D3Bh) > 00007FF816417224 mov ax,3 > 00007FF816417228 jmp style::properties::PropertyDeclaration::id+0B9Bh (07FF816417D3Bh) > 00007FF81641722D mov ax,4 > 00007FF816417231 jmp style::properties::PropertyDeclaration::id+0B9Bh (07FF816417D3Bh) > 00007FF816417236 mov ax,5 > ... each property takes 9 bytes, and we have 326 properties, so this branch table would take ~2.9KB (actually the last few branches take only 6 bytes, and the total size is ~2.8KB), which doesn't look very friendly to the cache for this kind of hot function. A branch table is probably the best thing the compiler can optimize it to at the moment, but it isn't the best form for this kind of function. Since PropertyDeclaration and LonghandId are both use the same list to generate, they should actually have matched tag value. Ideally, the compiler should recognize this pattern and optimize it to a single conditional jump and a single move (without the whole table). That should behave much better on a modern CPU than this branch table. (I'm not familiar with compiler implementation, but based on what I learned about Rust before, I suppose this can probably be done with a optimization pass in MIR?)
Reporter | ||
Comment 1•6 years ago
|
||
Before the compiler becomes able to recognize this pattern and apply the optimization, I guess we can probably use mem::discriminant and mem::transmute to simulate that, and use test to catch compiler change which would break the invariant we need.
Comment 2•6 years ago
|
||
One difficulty is that this enum has not only on variant per property, but also a few like Custom(…) that behave differently. I did rewrite PropertyDeclaration::id to use two separate `match` expressions, only one with many "branches" and make it as "uniform" as possible. I did see it compile to a lookup table (indexing into a compact array of static data) rather than a jump table at some point, maybe some compiler change regressed this particular case. Maybe another approach could be to use https://github.com/rust-lang/rfcs/pull/2195 (when it’s accepted/implemented) to manipulate the discriminant directly. Transmuting the result of mem::discriminant is not OK, it is made opaque specifically because we cannot make assumptions about its exact value.
Reporter | ||
Comment 3•6 years ago
|
||
(In reply to Simon Sapin (:SimonSapin) from comment #2) > One difficulty is that this enum has not only on variant per property, but > also a few like Custom(…) that behave differently. Ideally compiler can detect the common pattern and use a single if for those pattern and handle the special cases separately... that's probably hard to implement perfectly, though. > I did rewrite PropertyDeclaration::id to use two separate `match` > expressions, only one with many "branches" and make it as "uniform" as > possible. I did see it compile to a lookup table (indexing into a compact > array of static data) rather than a jump table at some point, maybe some > compiler change regressed this particular case. Look up table would take 600+ bytes, which still not that good, but should be more or less better than a jump table, I guess. > Maybe another approach could be to use > https://github.com/rust-lang/rfcs/pull/2195 (when it’s accepted/implemented) > to manipulate the discriminant directly. Transmuting the result of > mem::discriminant is not OK, it is made opaque specifically because we > cannot make assumptions about its exact value. It would be great to have this. Can we push Rust team for having that? I know it is made opaque intentionally, but as far as we have a test to check everything works fine... I guess we can rely on it until the compiler changes... Ah, well, probably you wouldn't be happy because that would potentially make it easier for Rust changes to break Servo build...
Reporter | ||
Comment 4•6 years ago
|
||
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #3) > (In reply to Simon Sapin (:SimonSapin) from comment #2) > > One difficulty is that this enum has not only on variant per property, but > > also a few like Custom(…) that behave differently. > > Ideally compiler can detect the common pattern and use a single if for those > pattern and handle the special cases separately... that's probably hard to > implement perfectly, though. By this, I mean the compiler can check that if there is a pattern of > match xxx { > 1 => 1 + A, > 2 => 2 + A, > 3 => 3 + A, > ... > 100 => 100 + A, > 101 => { /* something different */ } > } where A is a constant, than optimize it to something like > match xxx { > n if n <= 100 => n + A, > 101 => { /* something different */ } > }
Reporter | ||
Comment 5•6 years ago
|
||
It seems nox is fixing this in servo/servo#19956.
Comment 6•6 years ago
|
||
Indeed, my patch more or less makes PropertyDeclaration::id a noop.
Comment 7•6 years ago
|
||
The PR landed.
Comment 8•6 years ago
|
||
This seems to have improved perf! See the PR in comment 5.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•