Closed Bug 1372591 Opened 7 years ago Closed 7 years ago

Stylo: Remove atomic refcount bumping from style::rule_tree::StrongRuleNode::ensure_child

Categories

(Core :: CSS Parsing and Computation, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED

People

(Reporter: jseward, Assigned: emilio)

References

Details

MESI profiling of the Stylo pass only, when running the Obama testcase, with 4 worker threads, shows that style::rule_tree::StrongRuleNode::ensure_child is the biggest single source of RFOs, generating 1/3 of all RFOs in the styling pass. Examining the annotated assembly shows they all come from an atomic add instructtion in ensure_child (post inlining). The relevant source code contains the following interesting comment: // TODO(emilio): We could avoid all the refcount churn here. Emilio kindly hacked up the change he had in mind, which is now at https://github.com/servo/servo/pull/17294 This seems to be a win. Numbers are: BEFORE | best | | PAR SPEEDUP 1p 115.16 113.96 115.81 | 113.96 | | 1.00 x 2 60.21 60.68 60.44 | 60.21 | | 1.89 x 3 42.32 41.94 41.85 | 41.85 | | 2.72 x 4 33.17 33.23 33.31 | 33.17 | | 3.43 x 6 29.53 30.13 29.70 | 29.53 | | 3.85 x 8 27.17 26.78 27.11 | 26.78 | | 4.25 x AFTER best | %BEFORE | 1p 111.75 111.45 112.57 | 111.45 | 97.8% | 1.00 x 2 59.02 58.44 58.52 | 58.44 | 97.1% | 1.90 x 3 40.33 40.16 40.80 | 40.16 | 96.0% | 2.78 x 4 31.17 31.10 31.14 | 31.10 | 93.7% | 3.58 x 6 28.23 28.27 28.23 | 28.23 | 95.6% | 3.94 x 8 25.66 25.70 25.81 | 25.66 | 95.8% | 4.34 x Which I interpret to mean: * an absolute speedup of between 2% (1 core) and 6% (4 cores) * parallel speedup seems modestly improved, at 4 cores from around 3.5 x to 3.6 x. Given that the 6 and 8 thread numbers necessarily involve hyperthreading, I would not pay much attention to them. The average number of instructions per RFO increased from 1196 to 1871 (viz, less interference in the memory system). Setup: Quiet F25 machine, Skylake Xeon, 2.6 GHz, runs of the form: STYLO_ITERS=500 STYLO_ITER_THRESHOLD=10 STYLO_THREADS={1,2,3,4,6,8} \ FORCE_STYLO_THREAD_POOL=1 DISPLAY=:2.0 \ ./gcc-Og-nondebug-stylo/dist/bin/firefox-bin -P dev -no-remote \ /home/sewardj/MOZ/STYLO/Barack_Obama_Wikipedia.html
Nice find + fix! This is exactly the kind of stuff I was hoping we'd find with your investigations. Something we may have known about, but didn't know was hot.
Assignee: nobody → emilio+bugs
Priority: -- → P1
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.