27.97 - 5.09% perf_reftest_singletons id-getter-2.html / perf_reftest_singletons id-getter-3.html + 2 more (Linux, OSX, Windows) regression on Wed September 27 2023
Categories
(Core :: XPCOM, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr115 | --- | unaffected |
| firefox118 | --- | unaffected |
| firefox119 | --- | unaffected |
| firefox120 | --- | fixed |
| firefox121 | --- | wontfix |
People
(Reporter: afinder, Assigned: emilio)
References
(Regression)
Details
(4 keywords)
Attachments
(1 obsolete file)
Perfherder has detected a talos performance regression from push 84e0feee4ceb4f753e7ac5883ab6601c668d7da3. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 28% | perf_reftest_singletons id-getter-2.html | windows10-64-shippable-qr | e10s fission stylo webrender | 594.89 -> 761.27 |
| 26% | perf_reftest_singletons id-getter-2.html | macosx1015-64-shippable-qr | e10s fission stylo webrender | 656.11 -> 823.51 |
| 12% | perf_reftest_singletons id-getter-2.html | linux1804-64-shippable-qr | e10s fission stylo webrender | 896.50 -> 1,000.35 |
| 5% | perf_reftest_singletons id-getter-3.html | windows10-64-shippable-qr | e10s fission stylo webrender | 318.66 -> 334.88 |
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 2% | tp5o_webext | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 280.35 -> 273.99 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
For more information on performance sheriffing please see our FAQ.
Comment 1•2 years ago
|
||
Set release status flags based on info from the regressing bug 1854446
| Assignee | ||
Comment 2•2 years ago
|
||
| Assignee | ||
Updated•2 years ago
|
| Assignee | ||
Comment 3•2 years ago
|
||
Updated•2 years ago
|
| Assignee | ||
Updated•2 years ago
|
| Assignee | ||
Comment 4•2 years ago
|
||
So ok, I did a bit more digging. The new setup was pretty much the one we had before bug 1447951. Bug 1447951 has a bit of measurements of memory improvements that the old setup had. I don't think that accounted for all the copies this avoids tho (when passing strings from the html parser and atomizing them, etc), or at least I don't see that seen anywhere.
In particular, bug 1447951 comment 0 mentions how many atoms are shared at a particular point in time, but not about whether we avoided copies (temporary or not) by handing out references to those atoms or by saving the copy while atomizing.
The performance regression seems to me like mostly about cache misses. Or at least I can't find another explanation for it. comment 3 helps removing a bit of code but that's not enough to recover the regression. My hypothesis is that the reason why this becomes slower and other microbenchmarks that use larger strings not so, is because NewMaybeExternalString actually looks at the character buffer (twice) in order to create an inline string in this case. That comes from bug 1038099.
In any case, I don't think this benchmark is super-realistic in the sense that we're always passing the same atom that's hot in cache before the patch. So my patch would end up paying for two cache misses instead of one.
So the question is, what do we want to do? I see two options:
Back out bug 1854446
...going to the previous setup.
It presumably would give us some of the memory improvements from bug 1447951 (though we saw no AWSY regression). Some cons:
- Atomizing strings from the HTML parser etc now require a string copy.
- Atomizing UTF-8 strings from e.g. the CSS parser requires two copies instead of one.
- Turning an atom into an
nsAStringrequires a string copy.
We could file follow-ups to investigate other potential changes to the string representation to address those, but my understanding is that making nsAString understand nsAtom* more deeply didn't seem palatable to either Nika, nor Nathan at the time.
That said, the useless string copies that bug 1854446 was about have been removed with bug 1855074, so for atoms that are small (<= 16 code units), there's no big performance difference between the new setup and backing out bug 1854446.
Take this regression
...and simplifications like comment 3. Note that bug 1854446 was also an improvement in something that was not a micro-benchmark.
I think the new set up is a bit nicer because:
-
It avoids all the copies mentioned above.
-
It makes the performance characteristics of atoms more predictable: Before bug 1854446, if you use
DOMStringandSetKnownLiveAtom, you can get fast large atoms passing to JS. If you usensAString, andnsAtom::ToStringwhich is a lot more common, then you were out of luck, but that now just works. -
I have other potential ideas to be able to share
nsDependentAtomStringswith JS more easily, but in general the less bespoke string representations we have, the better.
I think I lean towards (2), because it makes the performance characteristics of atoms more predictable. But I'm ok with (1) and going back to the drawing board, if you don't agree, of course :)
Comment 5•2 years ago
|
||
But usually atomizing string from HTML parser doesn't need a copy. Normally there should be already an atom around.
It makes the performance characteristics of atoms more predictable
Well, now that atoms don't have the data inline, there is unpredictable cost accessing the data because of the indirect access.
I'd still rather investigate making nsAtoms nsStringBuffers. We don't even know what the overhead there is.
(and back out the changes for now)
| Assignee | ||
Comment 6•2 years ago
|
||
(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #5)
Well, now that atoms don't have the data inline, there is unpredictable cost accessing the data because of the indirect access.
Data has always been out of line for static atoms tho, right?
But usually atomizing string from HTML parser doesn't need a copy. Normally there should be already an atom around.
How so? We atomize a bunch of stuff both from HTML and CSS parsing (e.g., all different ids, classnames, and names in a document). That seems not-insignificant to me.
I'd still rather investigate making nsAtoms nsStringBuffers. We don't even know what the overhead there is.
It's a bit unclear to me how that would look... Can you confirm that would mean moving length / hash (lazily probably, or atom-only) / ascii-lowercase-bit (lazily too, probably, or atom-only) to nsStringBuffer, is that right?
If so, I guess the length at least would be duplicated with nsTStringRepr, and we'd need to keep that somehow in sync, which seems a bit unfortunate / error-prone...
If you confirm that's what you are thinking of, and Nika agrees it's worth exploring, I'm happy to give it a poke...
(and back out the changes for now)
Not sure I agree it's worth backing out, but okay.
Comment 7•2 years ago
|
||
(In reply to Emilio Cobos Álvarez (:emilio) from comment #6)
(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #5)
Well, now that atoms don't have the data inline, there is unpredictable cost accessing the data because of the indirect access.
Data has always been out of line for static atoms tho, right?
Well, we're talking about dynamic atoms here :)
But usually atomizing string from HTML parser doesn't need a copy. Normally there should be already an atom around.
How so? We atomize a bunch of stuff both from HTML and CSS parsing (e.g., all different ids, classnames, and names in a document). That seems not-insignificant to me.
Sure, but often we can reuse an atom which is already in mrucache, when using main thread only atomization.
I'd still rather investigate making nsAtoms nsStringBuffers. We don't even know what the overhead there is.
It's a bit unclear to me how that would look... Can you confirm that would mean moving length / hash (lazily probably, or atom-only) / ascii-lowercase-bit (lazily too, probably, or atom-only) to
nsStringBuffer, is that right?
StringBuffer has already size, that could be use as length for atoms. But yes.
If so, I guess the length at least would be duplicated with
nsTStringRepr, and we'd need to keep that somehow in sync, which seems a bit unfortunate / error-prone...
I don't quite understand what this means. An atom StringBuffer could be readonly.
Comment 8•2 years ago
|
||
What was the final decision? Should we back out or not? I'm asking because the soft freeze for 120 is tomorrow, if we don't back out by then we'll have to uplift it.
| Assignee | ||
Comment 9•2 years ago
|
||
No decision yet, but we could back out from beta without much issue. Also this is a very targetted micro-benchmark, so it's not a huge deal IMO.
Comment 10•2 years ago
|
||
Set release status flags based on info from the regressing bug 1854446
Comment 11•2 years ago
|
||
do we need to backout bug 1854446 out of beta? This week is the last week of early beta.
| Assignee | ||
Comment 12•2 years ago
|
||
Yeah, let's back out of beta for now, can you do that?
Comment 14•2 years ago
|
||
Yes i can do that! should we also back this out from central fx121 ( i can request that) or will the fix be ready in time for mergeday?
Comment 15•2 years ago
|
||
120 fixed by backout of bug 1854446
| Assignee | ||
Comment 16•2 years ago
|
||
I'm hoping to have a decision on this made by then.
Comment 17•1 year ago
|
||
Only a week left before Fx121 goes to RC. Can we please try to make a decision soon? FWIW, it looks like macOS is the only platform where we're still seeing consistently higher numbers at this point - it looks like bug 1857237 clawed back a lot of the regression already on Linux & Windows.
| Assignee | ||
Comment 18•1 year ago
|
||
Yeah it makes sense that bug 1857237 would address this regression, effectively.
I think given that, I'd rather take this simplification and call it a day. The less weird string representations we have, the better, IMO. But I'm also fine backing out the regressor in central if you want, Olli.
Comment 19•1 year ago
|
||
If the regression has been fixed in some manner, no need to back out, IMO.
Comment 20•1 year ago
|
||
It has been partially fixed on Linux and Windows, and it is still not on Mac.
Closing as WONTFIX given recent comments.
Comment 21•1 year ago
|
||
Comment on attachment 9356911 [details]
Bug 1856854 - Remove DOMString atom state now that atoms hold an nsStringBuffer. r=smaug
Revision D190218 was moved to bug 1870608. Setting attachment 9356911 [details] to obsolete.
Description
•