Last Comment Bug 584595 - TM: speed up scanning three ways
: TM: speed up scanning three ways
Status: RESOLVED FIXED
fixed-in-tracemonkey
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: ---
Assigned To: Nicholas Nethercote [:njn]
:
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-08-04 19:49 PDT by Nicholas Nethercote [:njn]
Modified: 2010-08-23 14:49 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch 1 (against TM 48641:8cae6d5cd69c) (3.43 KB, patch)
2010-08-04 19:49 PDT, Nicholas Nethercote [:njn]
no flags Details | Diff | Splinter Review
patch 2 (against TM 48641:8cae6d5cd69c) (3.80 KB, patch)
2010-08-04 19:49 PDT, Nicholas Nethercote [:njn]
no flags Details | Diff | Splinter Review
Parsemark patch comparison. (1.78 KB, text/plain)
2010-08-06 18:08 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details
TM patch (against TM 49115:f516ec208e64) (17.89 KB, patch)
2010-08-08 18:34 PDT, Nicholas Nethercote [:njn]
cdleary: review+
Details | Diff | Splinter Review

Description Nicholas Nethercote [:njn] 2010-08-04 19:49:07 PDT
Created attachment 463041 [details] [diff] [review]
patch 1 (against TM 48641:8cae6d5cd69c)

getCharFillLinebuf() gets a lot of mispredicted branches when looking for EOL sequences.  Based on my rule of thumb, which is that mispredicted branches always cost more than you think, I've tried to eliminate them.

Attached are two patches.  The first one is simpler and reduces the mispredictions by about 2/3, and also uses fewer instructions overall.  The second one is more complex;  it reduces the mispredictions almost completely, but incurs more instructions due to some shifting and masking.

I'm possibly getting up to 5ms speedup on SS with either patch, but it's hard to tell with the usual SS noise.

cdleary, can you try both patches with parsemark and see if any clear results emerge?  Thanks.
Comment 1 Nicholas Nethercote [:njn] 2010-08-04 19:49:41 PDT
Created attachment 463042 [details] [diff] [review]
patch 2 (against TM 48641:8cae6d5cd69c)
Comment 2 Chris Leary [:cdleary] (not checking bugmail) 2010-08-06 18:06:36 PDT
I see 2.30% for p1, 2.77% for p2. Note that you can also run parsemark on your local machine -- js/src/tests/parsemark.py and compare_bench.py, with the suite of tests located in bug 548621.
Comment 3 Chris Leary [:cdleary] (not checking bugmail) 2010-08-06 18:08:23 PDT
Created attachment 463727 [details]
Parsemark patch comparison.

Whoops, those aren't weighted results, d'oh. Looks like p1 does a little better in general. Attached the output.
Comment 4 Brendan Eich [:brendan] 2010-08-06 18:16:40 PDT
Awesome -- keep this bug going. I wrote the original scanner in the days of 386 and MIPS (my fave!).

/be
Comment 5 Nicholas Nethercote [:njn] 2010-08-08 18:34:36 PDT
Created attachment 463985 [details] [diff] [review]
TM patch (against TM 49115:f516ec208e64)

This patch has three improvements to the scanner.

- First, it adds the maybeEOL table (as described above) to avoid
  mispredictions when checking for newlines in getCharFillLinebuf().  I went
  with the first of the above two patches, it was simpler and gave better
  results when parsing a really big file (see below for more).

- Second, it adds the maybeStrSpecial table.  When scanning inside a string,
  we currently have to do 4 tests in the common case, against: \n, \' or ",
  EOF, \\.  This table allows these to be combined into a single test.  

- Third, it rearranges getChar().  Currently, in the common case, for every
  char we do an ungetbuf test, an end-of-linebuf test, and an is-it-newline
  test.  With my change we just do an end-of-currbuf test.  This makes 
  getChar() small enough to inline.

  I did this by (a) changing ungetbuf to be a TokenBuf like linebuf and
  introducing 'currbuf' to point to the TokenBuf we're currently reading 
  from;  and (b) taking advantage of the fact that \n can only occur as the
  last char in linebuf/ungetbuf.

Results on SS and parsemark are inconclusive, due to noise.  But I also
tried a simpler benchmark, which consisted of 22627 lines that looked like
this:

  x = "<string>";

where <string> was 2348 chars long.  This is just over 50MB.  On this test I
run 1.20--1.25x faster with the patch applied, so I figure it's probably 
worth landing, as it may help big scripts (eg. string-tagcloud, 
string-unpack-code, regexp-dna, v8-earley-boyer, v8-regexp).
Comment 6 Chris Leary [:cdleary] (not checking bugmail) 2010-08-13 23:42:11 PDT
Comment on attachment 463985 [details] [diff] [review]
TM patch (against TM 49115:f516ec208e64)

Good stuff! I like the maybeEOL check better than the (d & 0xdfd0 == 0) one we were doing before.

I don't think it's easy to create language-level tests for our scanner trickery... making the 1024th char a '\r' and 1025th a '\n' outside of a string in a way that would demonstrate bad behavior at a language level is hard to pull off the top of my head. I guess ideally we would have C++ tests, but that's not common SM practice.

+    ungetbuf.limit = ungetbuf.ptr = buf + UNGET_MAX;
+    linebuf.base = linebuf.limit = linebuf.ptr = buf + UNGET_MAX;

Methinks UNGET_MAX should be called UNGET_LIMIT by convention -- max suffix is supposed to indicate the <= for-loop termination condition, whereas limit indicates <; i.e. the first char of linebuf is *(buf + UNGET_LIMIT).

+    maybeStrSpecial[uint32_t(EOF) & 0xff] = true;

Why the uint32 cast? I think SM style also prefers uint32(EOF) -- people seem to dislike the _t's.

+        JS_ASSERT(currbuf = &linebuf);

Double equals.

+                if (c == '\n' || c == EOF) {
+                    ungetChar(c);
+                    ReportCompileErrorNumber(cx, this, NULL, JSREPORT_ERROR,
+                                             JSMSG_UNTERMINATED_STRING);
+                    goto error;
+                }

Personally I would shove this under the more likely '\\' case, but it's not going to make one lick of difference perf wise. :-)

Might be nice in a followup patch for us to make a TokenBuf class with some helper methods to indicate  common patterns (like full() and pop()) and a ReverseTokenBuf to do similar for ungetbuf. It's a very isolated bit of code, though, so clearly low priority.
Comment 7 Nicholas Nethercote [:njn] 2010-08-17 18:05:51 PDT
Pushed with nits fixed:

http://hg.mozilla.org/tracemonkey/rev/c2c09af14293

Note You need to log in before you can comment on or make changes to this bug.