Closed Bug 965712 Opened 11 years ago Closed 11 years ago

Create path for regexp that can be represented by easy atom

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla32

People

(Reporter: h4writer, Assigned: h4writer)

References

Details

Attachments

(3 files, 3 obsolete files)

WIP 11 years ago Hannes Verschore [:h4writer] 5.42 KB, patch		Details \| Diff \| Splinter Review
testcase 11 years ago Hannes Verschore [:h4writer] 4.40 KB, text/plain		Details
WIP2 11 years ago Hannes Verschore [:h4writer] 6.38 KB, patch		Details \| Diff \| Splinter Review
Part 2: use memchr 11 years ago Hannes Verschore [:h4writer] 4.67 KB, patch		Details \| Diff \| Splinter Review
Part1: Use our string matching logic for regexps when possible 11 years ago Hannes Verschore [:h4writer] 9.64 KB, patch	luke : review+	Details \| Diff \| Splinter Review
Part2: Replace the unrolling matcher with memchr 11 years ago Hannes Verschore [:h4writer] 4.57 KB, patch	luke : review+	Details \| Diff \| Splinter Review

Hannes Verschore [:h4writer]

Assignee

Description

•

11 years ago

When measuring regexp, I noticed that easy regexp are faster on chrome. Upon digging I noticed they have a separate path for "easy" regexp. I.e. regexp that are actually strings instead of regexps. In that case they don't use the regexp engine but string matching. (What makes sense). I don't think this will win us much on octane-regexp, but might still be important to do for real life code.

Hannes Verschore [:h4writer]

Assignee

Updated

•

11 years ago

Blocks: 806646

Hannes Verschore [:h4writer]

Assignee

Comment 1

•

11 years ago

Attached patch WIP (obsolete) — Details — Splinter Review

(In reply to Hannes Verschore [:h4writer] from comment #0) > I don't think this will win us much on octane-regexp, but might still be > important to do for real life code. Lets discard that comment: Improves octane-regexp2.0 from 2900 to 3900

Assignee: nobody → hv1989

Hannes Verschore [:h4writer]

Assignee

Comment 2

•

11 years ago

Comment on attachment 8411385 [details] [diff] [review] WIP Review of attachment 8411385 [details] [diff] [review]: ----------------------------------------------------------------- This removes the speed-up. Back to square one for fixing regexp issues. ::: js/src/vm/RegExpObject.cpp @@ +490,5 @@ > RegExpShared::compile(JSContext *cx, JSLinearString &pattern, bool matchOnly) > { > + if (!HasRegExpMetaChars(pattern.chars(), pattern.length())) { > + canStringMatch = true; > + } Bbouvier pointed out this should set canStringMatch to false

Hannes Verschore [:h4writer]

Assignee

Comment 3

•

11 years ago

Attached file testcase — Details

Hannes Verschore [:h4writer]

Assignee

Comment 4

•

11 years ago

Attached patch WIP2 (obsolete) — Details — Splinter Review

So after fixing the issues and adding the optimization to executeMatchOnly I see: On the testcase Before patch: 51ms 97ms 149ms After patch: 37ms 68ms 113ms I see octane-regexp2.0 go from 2900 to 3000

Attachment #8411385 - Attachment is obsolete: true

Hannes Verschore [:h4writer]

Assignee

Comment 5

•

11 years ago

Attached patch Part 2: use memchr (obsolete) — Details — Splinter Review

So while I'm at it, I thought I could improve our string match algorithm. We were 2x slower than v8 for this particular benchmark. With both patches: 26 46 81 Octane-regexp score now improved to 3100 and I think this will decrease/remove the visible blue VM time in the graph: https://tl-beta.paas.allizom.org/tracelogger.html?data=data-Octane-x86-regexp.json

Hannes Verschore [:h4writer]

Assignee

Comment 6

•

11 years ago

Attached patch Part1: Use our string matching logic for regexps when possible — Details — Splinter Review

Attachment #8411793 - Attachment is obsolete: true

Attachment #8412504 - Flags: review?(luke)

Hannes Verschore [:h4writer]

Assignee

Comment 7

•

11 years ago

Attached patch Part2: Replace the unrolling matcher with memchr — Details — Splinter Review

Attachment #8412252 - Attachment is obsolete: true

Attachment #8412508 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 8

•

11 years ago

Comment on attachment 8412504 [details] [diff] [review] Part1: Use our string matching logic for regexps when possible Review of attachment 8412504 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/RegExpObject.cpp @@ +569,5 @@ > + > + matches.displace(displacement); > + matches.checkAgainst(origLength); > + *lastIndex = matches[0].limit; > + return RegExpRunStatus_Success; To factor out these final 4 lines which are duplicated below, can you put the subsequent Yarr-calling code in the 'else' block of this iff. @@ +632,5 @@ > + > + match = MatchPair(res + start, res + start + source->length()); > + match.displace(displacement); > + *lastIndex = match.limit; > + return RegExpRunStatus_Success; Same here

Attachment #8412504 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 9

•

11 years ago

Comment on attachment 8412508 [details] [diff] [review] Part2: Replace the unrolling matcher with memchr Sorry for the delay; the power went out as I was reviewing earlier. Great patch, simpler and faster! I suppose libc is using some rep-prefixed instruction?

Attachment #8412508 - Flags: review?(luke) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 10

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/7796f1b42487 https://hg.mozilla.org/integration/mozilla-inbound/rev/00edef0582f1

Ryan VanderMeulen [:RyanVM]

Comment 11

•

11 years ago

This was backed out for making the mochitest-dt debugger tests on OSX more timeout prone, mostly in browser_dbg_pretty-print-11.js. https://hg.mozilla.org/integration/mozilla-inbound/rev/44e189f0fd67 https://tbpl.mozilla.org/php/getParsedLog.php?id=38824075&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=38821153&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=38817107&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=38815334&tree=Mozilla-Inbound

Ryan VanderMeulen [:RyanVM]

Comment 12

•

11 years ago

*sigh* Still hitting those failures. This can re-land whenever. Sorry for the inconvenience :(

Joel Maher ( :jmaher ) (UTC -8)

Comment 13

•

11 years ago

when this was on the tree we saw a 3.12% v8 regression on osx 10.6. it might be good check into that before we land this again.

Hannes Verschore [:h4writer]

Assignee

Comment 14

•

11 years ago

(In reply to Joel Maher (:jmaher) from comment #13) > when this was on the tree we saw a 3.12% v8 regression on osx 10.6. it > might be good check into that before we land this again. Yeah, mac was a regression, windows didn't change and linux was a nice improvement. That's the reason I have relanded it. I am looking into it ;)

Hannes Verschore [:h4writer]

Assignee

Comment 15

•

11 years ago

The slowdown is in part 2. So it seems like mac's memchr function is ultra slow: memchr (char): 447.0us UnrolledMatch (char): 180.0us So that's the reason for the slowdown on mac. I haven't looked into the licence issues, but we could import the glibc memchr version. this would give us the fast memchr on all platforms. That gives me: glibc_memchr (char): 82.0us Note: I tried the source copied glibc version on the jschar (so 16bit chars) and couldn't get an improvement. The extra checks probably don't get optimized enough in clang and kill the difference we see between UnrolledMatch and glibc_memchr. MacOS: octane-regexp Unpatched: 2277 octane-regexp Part 1: 2379 octane-regexp Part 2 - memchr: 2036 octane-regexp Part 2 - glibc_memchr: 2300 I could special case memchr for linux (and maybe windows) and still use the UnrolledLoop for mac? What do you think?

Flags: needinfo?(luke)

Hannes Verschore [:h4writer]

Assignee

Comment 16

•

11 years ago

Part 1: https://hg.mozilla.org/integration/mozilla-inbound/rev/7e5dc2e26dd3

Luke Wagner [:luke]

Comment 17

•

11 years ago

Adding a special case just for Linux doesn't seem great from a cost (extra code) vs. benefit (tiny fraction of users) analysis. Can you test glibc_memchr on Windows?

Flags: needinfo?(luke)

Luke Wagner [:luke]

Comment 18

•

11 years ago

Also, is it possible that glibc_memchr contains optimized (I assume, assembly-coded) paths and a fallback C path (which I assume would be somewhat similar to UnrolledMatch) and you are getting the latter in your tests?

Hannes Verschore [:h4writer]

Assignee

Comment 19

•

11 years ago

(In reply to Luke Wagner [:luke] (on partial PTO) from comment #18) > Also, is it possible that glibc_memchr contains optimized (I assume, > assembly-coded) paths and a fallback C path (which I assume would be > somewhat similar to UnrolledMatch) and you are getting the latter in your > tests? I used the benchmark and implemenation mentioned in (glibc_memchr): https://gist.github.com/nico/2624336

Phil Ringnalda (:philor)

Comment 20

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/7e5dc2e26dd3

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla32

Jared Wein [:jaws] (please needinfo? me)

Comment 21

•

11 years ago

Driving by, it seems that part 2 hasn't landed yet, so this was resolved prematurely ([leave open] should have been placed in the whiteboard. I didn't place it now since 'part 2' when it lands should resolve the bug). Please correct me if I'm wrong.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Hannes Verschore [:h4writer]

Assignee

Comment 23

•

11 years ago

Part 2 was a regression, so it was decided to not take it.

Status: REOPENED → RESOLVED

Closed: 11 years ago → 11 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.