RegExp.prototype[Symbol.replace] needs to call replacer function after collecting all matches

RESOLVED FIXED in Firefox 52

Status

()

defect
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: anba, Assigned: arai)

Tracking

({regression})

Trunk
mozilla52
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox47 unaffected, firefox48 unaffected, firefox49 fix-optional, firefox-esr45 unaffected, firefox50 fix-optional, firefox51 fix-optional, firefox52 fixed)

Details

Attachments

(1 attachment, 3 obsolete attachments)

Test case:
---
rx = RegExp("a", "g");
r = rx[Symbol.replace]("abba", function() {
    rx.compile("b", "g");
    return "?";
});
print(r);
---

Expected: Prints "?bb?"
Actual: Prints "???a"



Also reproducible with String.prototype.replace:
---
rx = RegExp("a", "g");
r = "abba".replace(rx, function() {
    rx.compile("b", "g");
    return "?";
});
print(r);
---

Expected: Prints "?bb?"
Actual: Prints "???a"

That means, strictly speaking, this is a regression compared to Fx47.
uhh....
we cannot check if the function may invoke RegExp#compile, and once RegExp#compile is called, bailing out won't help :/

So we will have to hold a reference to the compiled RegExp code/flags at the point of entering @@replace function...
now wondering how that could be done without introducing performance regression...
separating the match loop and accumulation loop regresses sunspider string-unpack-code from 33.7 to 40.3
that would be the simplest fix tho
draft patch that separates the loop for match and accumulation.
this has so much code duplication for functional and non-functional cases, but not sure if merging them again with more #ifdef is readable.
maybe we could merge functional case and slow path.

anycase, this has performance regression :/
Here's benchmark results with WIP patch

SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&ytitle=Values&categories=math-spectral-norm%2Cbitops-3bit-bits-in-byte%2Cdate-format-xparb%2C3d-morph%2Ccontrolflow-recursive%2Cstring-base64%2C3d-raytrace%2Caccess-nsieve%2Ccrypto-md5%2Cbitops-nsieve-bits%2Caccess-binary-trees%2Cbitops-bitwise-and%2Cstring-validate-input%2Caccess-fannkuch%2Cstring-fasta%2Cregexp-dna%2Cmath-partial-sums%2Caccess-nbody%2Cstring-tagcloud%2Cdate-format-tofte%2Cstring-unpack-code%2Ccrypto-aes%2Ccrypto-sha1%2Cbitops-bits-in-byte%2Cmath-cordic%2C3d-cube&values=before%201.7%201%2011.6%204.7%202.1%205.5%2010.9%202.7%203.8%203.4%202.4%202%207.8%205.4%206.3%2014.4%206.5%202.6%2020.2%2012.2%2033.5%2011.4%203.5%202.1%202.3%2011.9%0Aafter%201.7%201%2011.6%204.7%202.1%205.5%2010.8%202.7%203.7%203.4%202.4%202%207.8%205.2%206.3%2014.3%206.5%202.8%2020.1%2012.2%2040%2011.6%203.4%202.2%202.4%2012

Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&ytitle=Values&categories=audio-fft%2Cstanford-crypto-pbkdf2%2Caudio-beat-detection%2Cstanford-crypto-ccm%2Cimaging-darkroom%2Cjson-parse-financial%2Caudio-oscillator%2Cai-astar%2Caudio-dft%2Cstanford-crypto-sha256-iterative%2Cjson-stringify-tinderbox%2Cimaging-gaussian-blur%2Cstanford-crypto-aes%2Cimaging-desaturate&values=before%2048.6%2096.4%2081.4%2084%2081%2032%2059%2064.9%20109.1%2039.5%2039.3%2064.5%2052.6%2052.9%0Aafter%2048.5%2096.9%2080.9%2083.9%2080.7%2031.5%2058.7%2064.8%20107.8%2039.4%2039.1%2064.3%2052.4%2052.8

Octane: http://bnjbvr.github.io/simplegraph/#title=Octane%20result%20comparison&ytitle=Values&categories=Richards%2CDeltaBlue%2CCrypto%2CRayTrace%2CEarleyBoyer%2CRegExp%2CSplay%2CSplayLatency%2CNavierStokes%2CPdfJS%2CMandreel%2CMandreelLatency%2CGameboy%2CCodeLoad%2CBox2D%2Czlib%2CTypescript%2CScore%20(version%209)&values=before%2035919%2068981%2029926%20112848%2031571%204294%2017440%2020407%2038073%2013271%2032729%2032150%2053558%2016433%2060672%2078184%2030659%2031335%0Aafter%2035920%2069486%2029896%20113449%2031640%204347%2017373%2020197%2038016%2012910%2032475%2033335%2053801%2016795%2060169%2077964%2030726%2031386

Notable changes is only SunSpider string-unpack-code: 33.5 => 40
will check other micro benchmarks
Posted image Micro benchmark comparison (obsolete) —
with WIP patch, 13.5% regression happens in both functional replacement and packer's optimization case.
till, can I have feedback on this?

Currently it duplicates RegExpGlobalReplaceOpt.h.js into RegExpGlobalReplaceOptFunc.h.js, and separates the loop for RegExpMatcher and accumulation.
There are 2 concerns, one is the amount of the code duplication, another one is the performance regression, especially for SunSpider string-unpack-code.
Attachment #8776211 - Attachment is obsolete: true
Attachment #8776448 - Flags: feedback?(till)
jmaher here is a perf regression for you. Do you think this is a problem in 49? Thanks!
Flags: needinfo?(jmaher)
looking at fx.48 vs fx.49, I have data from kraken as we run that on Talos:
linux64: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-beta&originalRevision=84c59d136a9e&newProject=mozilla-aurora&newRevision=709652b674013b136627e45fd90b4e9063906a19&originalSignature=907466aea86e9e134379a1ef963a81fb7053d4f9&newSignature=907466aea86e9e134379a1ef963a81fb7053d4f9&framework=1

win7: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-beta&originalRevision=84c59d136a9e&newProject=mozilla-aurora&newRevision=709652b674013b136627e45fd90b4e9063906a19&originalSignature=8ea80bd907bdf2f68afa98b12c02182d50e7dc67&newSignature=8ea80bd907bdf2f68afa98b12c02182d50e7dc67&framework=1

win8: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-beta&originalRevision=84c59d136a9e&newProject=mozilla-aurora&newRevision=709652b674013b136627e45fd90b4e9063906a19&originalSignature=1dc70f253d06daa5b7e0a05f02a5b4b5e86a58b1&newSignature=1dc70f253d06daa5b7e0a05f02a5b4b5e86a58b1&framework=1

as you can see we have a few wins, and for stanford-crypto-* we have some of those with notable regressions.  

We also collect the AWFY data, in this case we have regexp for Octane on linux64 and win10:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=7776000&series=%5Bmozilla-inbound,ae45f2ef6fdab7bdeb4d54f5daa28524f19045f5,1,5%5D&series=%5Bmozilla-inbound,d3a3b86b2e2c38bf2cbe3ac45a3ea1961f346661,1,5%5D&series=%5Bmozilla-inbound,7d502fd04298a034f36e554512c8150a27dc58e8,1,5%5D

We only have data from Mid July- so probably after the time window we are looking at- the data in the last 8 weeks doesn't show any issues.

let me know if I can help dig up more data.
Flags: needinfo?(jmaher)
Joel I'm not sure if the first three links are testing 48 v 49? (Currently 49 is beta and 50 is aurora)
Flags: needinfo?(jmaher)
sorry, what's going on here now?
this bug is about wrong behavior, not performance.
I was testing performance of possible patch for it, but it's not yet landed,
and it's not related to older versions.
I was looking at 48 vs 49 because if this was an issue in 49, it wouldn't be an issue in 48- at least that was my impression.  Let me know if I am comparing things incorrectly.

Since it looks like :arai is working on fixing the conformance, not performance, maybe we don't need to be so technical on the perf side of this.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #12)
> I was looking at 48 vs 49 because if this was an issue in 49, it wouldn't be
> an issue in 48- at least that was my impression.  Let me know if I am
> comparing things incorrectly.

I was looking at the output and it shows:
"Base - 84c59d136a9e (mozilla-beta) ..."

"New - 709652b67401 (mozilla-aurora) ..."

I assume mozilla-beta is 49 and mozilla-aurora is 50?
Flags: needinfo?(jmaher)
this was from a few weeks ago prior to the last merge, I believe July 14th.
Flags: needinfo?(jmaher)
Flags: needinfo?(jdemooij)
As arai said, this is a correctness bug, not a performance regression. arai is just looking at the performance impact of the patch that still has to land.

arai, till: can we get this landed?
Flags: needinfo?(jdemooij)
We should fix this, but I don't think we are going to backport the patch.
Hi :jandem,
Do you have any updates here? Is this worth uplifting to 51 aurora?
Flags: needinfo?(jdemooij)
sorry for being late.
current patch's performance regression isn't acceptable I think.
we need another way, or optimize more.
Hi :arai, this bug is assigned to nobody and we need someone to work on this. Since it seems that you are involved in this, feel free to reassign it to someone else if you disagree.
Assignee: nobody → arai.unmht
Comment on attachment 8776448 [details] [diff] [review]
Do not call replaceValue function in the same loop as match in RegExp.prototype[@@replace] optimized path.

h4writer, how do you think about this approach and regression?
Attachment #8776448 - Flags: feedback?(hv1989)
arai is on it :)
Flags: needinfo?(jdemooij)
we might be able to clone RegExp to pass to RegExpMatcher.
will try implementing and check performance.
very interesting.
it doesn't cause any notable regression.

Octane: http://bnjbvr.github.io/simplegraph/#title=Octane%20result%20comparison&ytitle=Values&categories=Richards%2CDeltaBlue%2CCrypto%2CRayTrace%2CEarleyBoyer%2CRegExp%2CSplay%2CSplayLatency%2CNavierStokes%2CPdfJS%2CMandreel%2CMandreelLatency%2CGameboy%2CCodeLoad%2CBox2D%2Czlib%2CTypescript%2CScore%20(version%209)&values=before%2034845%2065143%2030047%20101958%2029320%204102%2015052%2018090%2037995%2012816%2031082%2036336%2049219%2016063%2059711%2078218%2026455%2029896%0Aafter%2034899%2065169%2030006%20101329%2029359%204144%2015972%2018718%2038088%2012415%2031633%2037030%2048755%2016264%2060134%2078433%2026593%2030115

SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&ytitle=Values&categories=math-spectral-norm%2Cbitops-3bit-bits-in-byte%2Cdate-format-xparb%2C3d-morph%2Ccontrolflow-recursive%2Cstring-base64%2C3d-raytrace%2Caccess-nsieve%2Ccrypto-md5%2Cbitops-nsieve-bits%2Caccess-binary-trees%2Cbitops-bitwise-and%2Cstring-validate-input%2Caccess-fannkuch%2Cstring-fasta%2Cregexp-dna%2Cmath-partial-sums%2Caccess-nbody%2Cstring-tagcloud%2Cdate-format-tofte%2Cstring-unpack-code%2Ccrypto-aes%2Ccrypto-sha1%2Cbitops-bits-in-byte%2Cmath-cordic%2C3d-cube&values=before%201.9%201.1%2013%204.9%202.1%205.7%2011.2%202.5%203.9%203.5%202.5%201.9%208%205.5%206.4%2014.2%206.4%202.8%2021%2010.3%2034.7%2011.8%203.6%202.2%202.4%2012.2%0Aafter%201.8%201.1%2013.1%204.9%202.1%205.8%2011.2%202.5%203.9%203.5%202.5%202%207.8%205.5%206.4%2014.2%206.4%202.8%2020.9%2010.2%2034.7%2012.2%203.6%202.3%202.4%2012.4

Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&ytitle=Values&categories=audio-fft%2Cstanford-crypto-pbkdf2%2Caudio-beat-detection%2Cstanford-crypto-ccm%2Cimaging-darkroom%2Cjson-parse-financial%2Caudio-oscillator%2Cai-astar%2Caudio-dft%2Cstanford-crypto-sha256-iterative%2Cjson-stringify-tinderbox%2Cimaging-gaussian-blur%2Cstanford-crypto-aes%2Cimaging-desaturate&values=before%2047.3%20101.4%2083.8%2086.8%2075.4%2034.8%2056.3%2065.1%20106.7%2041.6%2039.2%2064.4%2053.8%2052.6%0Aafter%2047.6%20101.4%2084%2087.2%2075.1%2034.6%2056.3%2065.5%20106.6%2041.8%2040.2%2064.2%2054.1%2052.7
Comment on attachment 8776448 [details] [diff] [review]
Do not call replaceValue function in the same loop as match in RegExp.prototype[@@replace] optimized path.

I'll go with clone way.
will post after try run.
Attachment #8776448 - Attachment is obsolete: true
Attachment #8776448 - Flags: feedback?(till)
Attachment #8776448 - Flags: feedback?(hv1989)
Attachment #8776267 - Attachment is obsolete: true
The issue here is that, RegExp#compile called in `replaceValue` function can change `rx`, and it affects the RegExpMatcher result.
To avoid that, I cloned ``rx` before the loop and use it in RegExpMatcher.
so that any change to original `rx` doesn't affect RegExpMatcher result.
Attachment #8803719 - Flags: review?(till)
Comment on attachment 8803719 [details] [diff] [review]
Clone RegExp object at the top of RegExpGlobalReplaceOptFunc and RegExpGlobalReplaceOptElemBase to avoid the effect of RegExp#compile.

Review of attachment 8803719 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

I'm very sorry for not giving you feedback on this :( I think after having looked at it a number of times I probably didn't have too much useful feedback, but I should've at least done what you did in the end, and forwarded the f? to Hannes.
Attachment #8803719 - Flags: review?(till) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/26fcee410e9f006d1561535ad91906ca590aa507
Bug 1290506 - Clone RegExp object at the top of RegExpGlobalReplaceOptFunc and RegExpGlobalReplaceOptElemBase to avoid the effect of RegExp#compile. r=till
https://hg.mozilla.org/mozilla-central/rev/26fcee410e9f
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Jandem mentioned we likely don't want to backport/uplift.
You need to log in before you can comment on or make changes to this bug.