Closed Bug 588648 Opened 15 years ago Closed 15 years ago

Don't copy chars when scanning

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 5 obsolete files)

patch 1 (against TM 50897:45a893397e30) 15 years ago Nicholas Nethercote [inactive] 41.04 KB, patch		Details \| Diff \| Splinter Review
patch v2 (against TM 50897:45a893397e30) 15 years ago Nicholas Nethercote [inactive] 43.17 KB, patch		Details \| Diff \| Splinter Review
patch v3 (against TM 57130:bc000c1509ac) 15 years ago Nicholas Nethercote [inactive] 43.82 KB, patch		Details \| Diff \| Splinter Review
patch v4 (against TM 57130:bc000c1509ac) 15 years ago Nicholas Nethercote [inactive] 45.09 KB, patch		Details \| Diff \| Splinter Review
patch, v5 (against 57572:eea1f443ea15) 15 years ago Nicholas Nethercote [inactive] 50.31 KB, patch		Details \| Diff \| Splinter Review
patch, v6 (against TM 57572:eea1f443ea15) 15 years ago Nicholas Nethercote [inactive] 45.07 KB, patch	brendan : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

15 years ago

Here's how scanning works: - The common mode of operation is that we already have a jschar buffer entirely in memory that we're scanning from, called 'userbuf'. - Each char read from 'userbuf' is copied twice in the most common case: (1) In getCharFillLinebuf(), it's copied from 'userbuf' to 'linebuf', and if it's an EOL char, it's normalized to '\n' in the process. (2) In getTokenInternal(), it's probably appended to a JSCharBuffer, which is a js::Vector. This is a bit slow because for every char copied the jsvector has to check if it has enough space. - There's a less common mode of operation where we are reading from file instead of directly from a buffer. In that case, we'll read from file into 'userbuf' in fillUserbuf(); the file gets read in chunks of size 1024. - The copying of chars from 'userbuf' to 'linebuf' doesn't seem necessary, though the EOL normalization does simplify things a bit. - The copying of chars into the JSCharBuffer is necessary only in the second mode of operation -- because 'userbuf' can be chunked when reading from file, it's possible a single token is split across two (or more, for long strings) chunks. If we only ever scanned directly from 'userbuf' we wouldn't need to copy to a JSCharBuffer. AFAICT, we read from file in the following places: - Some API functions (js_CompileFile, JS_CompileFileHandle, JS_CompileFileHandleForPrincipals) - These are used in the JS shell and xpcshell, which aren't important. - JS_CompileFileHandleForPrincipals is used in xpconnect/loader/mozJSComponentLoader.cpp only as a fallback if HAVE_PR_MEMMAP isn't available, which seems like an unusual case. I asked Brendan, he thinks in these case that reading the entire file into memory before parsing is reasonable. SS wanted win: 3ms? Hard to estimate, that's a guess based on some data in bug 578216. string-tagcloud looks to benefit the most; string-unpack-code and regexp-dna also some.

Chris Leary [:cdleary] (not checking bugmail)

Comment 1

•

15 years ago

Yeah, this is definitely a sweet win (and reduction in complexity). I didn't realize it was kosher to read whole files into memory!

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

15 years ago

I've hit a snag in avoiding the userbuf-to-JSCharBuffer copy. The problem is multi-char characters. Eg. if we have '\' followed by 'n' in a string we need to convert that into a '\n' before we atomize it. There's a similar issue with \uXXXX sequences. Maybe the atomizer could do the conversions? Hmm.

Chris Leary [:cdleary] (not checking bugmail)

Comment 3

•

15 years ago

(In reply to comment #2) > Maybe the atomizer could do the conversions? Hmm. Sounds right! Thoughts on this: - The atomizer has to scan the contents of the string to perform a hash calculation. - The |JSString *| that winds up representing the atom needs to own its buffer, so we'll at least need to copy from the userbuf to heap space once. - The length of the string in userbuf isn't known until you scan for the endquot. - Are escapes uncommon enough in interned strings that we could slow path them? If the atomizer used a heaped vector to interpret escapes into, hashing the interpreted chars along the way, and made a |JSString *| with that as its backing store, you only have to do one (copying) pass on the string data in userbuf.

Brendan Eich [:brendan]

Comment 4

•

15 years ago

We don't want to lex while interning strings passed from C++, so any such path would be lexer-specific, not jsatom-generic. Also, \uXXXX can occur in other places than string literals. See the ES5 lexical grammar. /be

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

15 years ago

Attached patch patch 1 (against TM 50897:45a893397e30) (obsolete) — Details — Splinter Review

Work-in-progress patch. Removes the scan-from-file option, always reads in the whole file before scanning. Scanning from 'stdin' isn't yet handled. Removes linebuf and getbuf, scans directly from userbuf. Doesn't do anything about userbuf-to-JSCharBuffer copying, I'm a bit leery about futzing with js_AtomizeString().

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

15 years ago

Attached patch patch v2 (against TM 50897:45a893397e30) (obsolete) — Details — Splinter Review

In this patch: - Input is now always read fully into 'userbuf'; 'linebuf' and 'ungetbuf' were removed as a result. This simplifies scanning quite a bit. - Improves scanning of strings. Previously if a string included an EOL, we'd detect it in getChar() and then detect it again in getTokenInternal(). Now we have getCharIgnoreEOL() which avoids the first of these two checks. - Adds some big comments to help break up getTokenInternal() -- those multi-page if-statements have been really bugging me. I'm happy to fiddle with the exact formatting but I'd like to keep something like that in there. - Avoids creating a Flagger for a lot of tokens by introducing a flagless version of getToken(). - Things that don't yet work or I'm unsure about are marked with 'njn' comments. Instruction counts on 64-bit linux (32-bit linux results were not quite as good): --------------------------------------------------------------- | millions of instructions executed | | total | on-trace (may overestimate) | --------------------------------------------------------------- | 98.702 98.643 (1.001x) | 21.727 21.727 (------) | 3d-cube | 80.989 80.957 (------) | 19.256 19.256 (------) | 3d-morph | 135.688 135.315 (1.003x) | 23.025 23.025 (------) | 3d-raytrace | 66.276 66.238 (1.001x) | 17.993 17.993 (------) | access-binary- | 155.527 155.493 (------) | 94.894 94.894 (------) | access-fannkuc | 33.923 33.829 (1.003x) | 18.627 18.627 (------) | access-nbody | 49.819 49.788 (1.001x) | 27.665 27.665 (------) | access-nsieve | 8.227 8.216 (1.001x) | 3.486 3.486 (------) | bitops-3bit-bi | 40.012 40.005 (------) | 35.101 35.101 (------) | bitops-bits-in | 15.880 15.868 (1.001x) | 11.404 11.404 (------) | bitops-bitwise | 62.546 62.529 (------) | 38.601 38.601 (------) | bitops-nsieve- | 38.958 38.955 (------) | 31.397 31.397 (------) | controlflow-re | 39.471 39.237 (1.006x) | 5.570 5.570 (------) | crypto-md5 | 23.112 22.975 (1.006x) | 7.414 7.414 (------) | crypto-sha1 | 99.551 99.326 (1.002x) | 12.201 12.201 (------) | date-format-to | 75.279 75.010 (1.004x) | 10.156 10.156 (------) | date-format-xp | 58.876 58.837 (1.001x) | 29.129 29.129 (------) | math-cordic | 24.813 24.793 (1.001x) | 5.894 5.894 (------) | math-partial-s | 20.761 20.735 (1.001x) | 10.955 10.955 (------) | math-spectral- | 49.951 48.217 (1.036x) | 34.501 34.501 (------) | regexp-dna | 31.759 31.673 (1.003x) | 10.515 10.515 (------) | string-base64 | 105.522 104.888 (1.006x) | 23.351 23.351 (------) | string-fasta | 159.016 153.664 (1.035x) | 9.832 9.832 (------) | string-tagclou | 164.104 162.481 (1.010x) | 9.326 9.326 (------) | string-unpack- | 37.458 37.405 (1.001x) | 8.211 8.211 (------) | string-validat ------- | 1676.234 1665.085 (1.007x) | 520.243 520.243 (------) | all Time improvements may be 2 or 3 ms, but it's hard to tell through the usual noise. It'll also help slightly with the 1/2 million or so chars of JS that get read at browser startup.

Attachment #467653 - Attachment is obsolete: true

Attachment #467700 - Flags: feedback?(brendan)

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

15 years ago

I thought of a way to avoid the (userbuf.ptr < userbuf.limit) test in getChar(). It would involve having a jschar that represents end-of-file; this would work because all the callers of getChar() check for end-of-file. The normal value of EOF is inappropriate because it's a 32-bit -1, which is outside the range of a jschar. So the question is: is there a jschar value that is unused and so suitable for this use? I've been reading about UCS-2 and I think any UTF-16 surrogate value such as 0xD800 would be suitable. From http://en.wikipedia.org/wiki/UTF-16/UCS-2: "Unicode and ISO/IEC 10646 do not, and will never, assign characters to any of the code points in the U+D800–U+DFFF range, so an individual code unit from a surrogate pair does not ever represent a character." Does this sound plausible? I fully admit to being a Unicode neophyte. If it is valid, getChar()'s return type could change from int32 to jschar.

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

15 years ago

One tricky thing with this idea: TSF_EOF could no longer be set in getChar(). Looks like it could instead be set in the error case at the end of getTokenInternal() when the token kind is set to TOK_ERROR, if c==EOF.

Brendan Eich [:brendan]

Comment 9

•

15 years ago

Re: comment 7 -- JS does not combine surrogates, and it's easy on the web to put them in script content. Better to use int for the return value, as stdio getc did ages ago. Re: comment 8 -- yes, folding error and eof cases together is plausible. See the existing tt <= TOK_EOF tests. Nick, want feedback still? 2-3ms is not nothing on SS. Maybe rebase and go for a reviewable patch? /be

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

15 years ago

(In reply to comment #9) > > Nick, want feedback still? 2-3ms is not nothing on SS. Maybe rebase and go for > a reviewable patch? My impression from what Sayre said recently is that parsing time is not counted in browser runs of SunSpider and V8, which would make this lower priority. I certainly won't get to this patch until platform work week starts, at the earliest.

Nicholas Nethercote [inactive]

Assignee

Comment 11

•

15 years ago

Attached patch patch v3 (against TM 57130:bc000c1509ac) (obsolete) — Details — Splinter Review

Rebased patch. - Reading from stdin now works. Indeed, previously, on Linux at least, I had to hit Ctrl-d three times to exit from |load('-')| in the shell. Now it works immediately. - The patch fixes bug 352970, too. - Adds ungetCharIgnoreEOL() which was needed to fix a jstest failure.

Attachment #467700 - Attachment is obsolete: true

Attachment #490794 - Flags: review?(brendan)

Attachment #467700 - Flags: feedback?(brendan)

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Attachment #490794 - Attachment description: patch (against TM 57130:bc000c1509ac) → patch v3 (against TM 57130:bc000c1509ac)

Nicholas Nethercote [inactive]

Assignee

Comment 12

•

15 years ago

Attached patch patch v4 (against TM 57130:bc000c1509ac) (obsolete) — Details — Splinter Review

Reinstate the Flagger-less version of getToken(), which was lost somehow.

Attachment #490794 - Attachment is obsolete: true

Attachment #490797 - Flags: review?(brendan)

Attachment #490794 - Flags: review?(brendan)

Brendan Eich [:brendan]

Comment 13

•

15 years ago

I'm at a TC39 meeting, so a bit slow. Would value cdleary's review too if time allows. Nick, one question: is <sys/stat.h> cross-platform? /be

Nicholas Nethercote [inactive]

Assignee

Comment 14

•

15 years ago

Attached patch patch, v5 (against 57572:eea1f443ea15) (obsolete) — Details — Splinter Review

This fixes a couple of jsapi-test failures. (I didn't even realize we had jsapi tests as I don't use the 'make check' target. Woo!) Problem was that I was returning a NULL JSScript pointer for an empty file. Now it reads in the file and does nothing, which matches the old behaviour. I also ran a try server run, it built fine on all platforms, so sys/stat.h looks to be portable.

Attachment #490797 - Attachment is obsolete: true

Attachment #491744 - Flags: review?(brendan)

Attachment #491744 - Flags: feedback?(cdleary)

Attachment #490797 - Flags: review?(brendan)

Nicholas Nethercote [inactive]

Assignee

Comment 15

•

15 years ago

Attached patch patch, v6 (against TM 57572:eea1f443ea15) — Details — Splinter Review

This patch removes some debugging guff that was accidentally included in v5.

Attachment #491744 - Attachment is obsolete: true

Attachment #491745 - Flags: review?(brendan)

Attachment #491745 - Flags: feedback?(cdleary)

Attachment #491744 - Flags: review?(brendan)

Attachment #491744 - Flags: feedback?(cdleary)

Brendan Eich [:brendan]

Comment 16

•

15 years ago

Comment on attachment 491745 [details] [diff] [review] patch, v6 (against TM 57572:eea1f443ea15) >+static JSScript * >+compileFileHelper(JSContext *cx, JSObject *obj, JSPrincipals *principals, uint32 tcflags, Static helpers have StudlyCaps not camelCaps names. >+ const char* filename, FILE *fp) >+{ >+ struct stat st; >+ int ok = fstat(fileno(fp), &st); >+ if (ok != 0) >+ return NULL; In the old days this would not do for Windows. Progress! >+ >+ jschar *buf = NULL; >+ size_t len = st.st_size; >+ JSScript *script; script is odd in that it is the only decl in this group not initialized. It's also the return expression. For these reasons make it come last in the group of four. >+ size_t i = 0; Blank line here please. >+ /* Read in the whole file, then compile it. */ >+ if (fp == stdin) { >+ JS_ASSERT(len == 0); >+ len = 8; /* start with a small buffer, expand as necessary */ >+ int c; >+ bool hitEOF = false; >+ while (!hitEOF) { >+ len *= 2; >+ jschar* tmpbuf = (jschar *) cx->realloc(buf, len * sizeof(jschar)); >+ if (!tmpbuf) { >+ cx->free(buf); >+ return NULL; >+ } >+ buf = tmpbuf; >+ >+ while (true) { >+ if (i == len) >+ break; This loop should start while (i < len) (both to avoid wasting columns and lines and reader brainwaves on true and if/break, and to use < instead of !=). >+ int c; >+ while ((c = fast_getc(fp)) != EOF) >+ buf[i++] = (jschar) (unsigned char) c; >+ } >+ JS_ASSERT(i <= len); >+ len = i; >+ script = Compiler::compileScript(cx, obj, NULL, principals, tcflags, buf, len, filename, 1); >+ cx->free(buf); >+ >+ return script; Move the blank line up to come right after the right brace. >+ int32 c; >+ if (JS_LIKELY(userbuf.ptr < userbuf.limit)) { >+ c = *userbuf.ptr++; > /* Blank line before comment taking one or more lines, unless preceding line ends in left brace. >+ if (JS_UNLIKELY(maybeEOL[c & 0xff])) { These JS_UNLIKELY uses helped, per valgrind? > /* >- * This gets the next char, normalizing all EOL sequences to '\n' as it goes. >+ * This gets the next char. It does nothing special with EOL sequences, not >+ * even updating the line counters. http://twitter.com/#!/jonniker/status/10534301980430336 Plus, prevailing style. Last such whine from me on this topic. + index = (tp->begin.lineno == tp->end.lineno) + ? tp->begin.index /* the column number of the start of the bad token */ + : 0; Uber-nit: ? and : underhang the ( not the = or whatever is two chars to the left of the (. > #if JS_HAS_XML_SUPPORT >+ /* >+ *----------------------------------------------------------------------- These hyphen-rules don't seem beneficial. Prevailing style does use major comment style even with one-line comment, for emphasis. That ought to be enough. .. if (c == qc) { .. break; .. } else if (c == '\\') { Pre-existing else after break: unbrace and lose the else. >+ } else if (c == '\n' || c == '\r' || c == LINE_SEPARATOR || c == PARA_SEPARATOR || >+ c == EOF) >+ { >+ ungetCharIgnoreEOL(c); > ReportCompileErrorNumber(cx, this, NULL, JSREPORT_ERROR, > JSMSG_UNTERMINATED_STRING); > goto error; > } Does it make sense to move this case up to come before the if (c == '\\') clause, since most string chars won't be special? Would help to lose the else (which would now come after goto) before if (c == '\\') and unindent the backslash interpretation code by one level. PGO will handle any further reordering. Looks great otherwise. r=me with nits picked. Thanks for doing this, /be

Attachment #491745 - Flags: review?(brendan) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 17

•

15 years ago

(In reply to comment #16) > > > #if JS_HAS_XML_SUPPORT > >+ /* > >+ *----------------------------------------------------------------------- > > These hyphen-rules don't seem beneficial. Prevailing style does use major > comment style even with one-line comment, for emphasis. That ought to be > enough. I've done as you asked, but reluctantly. I humbly posit that a 1,014 line function -- that's not a typo, it's really 1,014 lines -- needs more formatting/sectioning than a single style of comment can provide. > >+ } else if (c == '\n' || c == '\r' || c == LINE_SEPARATOR || c == PARA_SEPARATOR || > >+ c == EOF) > >+ { > >+ ungetCharIgnoreEOL(c); > > ReportCompileErrorNumber(cx, this, NULL, JSREPORT_ERROR, > > JSMSG_UNTERMINATED_STRING); > > goto error; > > } > > Does it make sense to move this case up to come before the if (c == '\\') > clause, since most string chars won't be special? Would help to lose the else > (which would now come after goto) before if (c == '\\') and unindent the > backslash interpretation code by one level. PGO will handle any further > reordering. Backslashes are common in long strings, EOLs are very uncommon, so the ordering helps speed. Or are you saying that PGO would realize that and change the ordering of the branches? I'm not comfortable relying on that... do we even have PGO on all platforms?

Nicholas Nethercote [inactive]

Assignee

Comment 18

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/2b9d805d77a1

Whiteboard: fixed-in-tracemonkey

Chris Leary [:cdleary] (not checking bugmail)

Comment 19

•

15 years ago

(In reply to comment #17) > I humbly posit that a 1,014 line > function -- that's not a typo, it's really 1,014 lines -- needs more > formatting/sectioning than a single style of comment can provide. That delimiter should be easily understandable functions with JS_ALWAYS_INLINE. That'll be a nice cleanup patch somewhere down the road.

Chris Leary [:cdleary] (not checking bugmail)

Updated

•

15 years ago

Attachment #491745 - Flags: feedback?(cdleary)

Chris Leary [:cdleary] (not checking bugmail)

Comment 20

•

15 years ago

(Btw, sorry I didn't get around to giving feedback!)

Nicholas Nethercote [inactive]

Assignee

Comment 21

•

15 years ago

(In reply to comment #19) > > That delimiter should be easily understandable functions with JS_ALWAYS_INLINE. > That'll be a nice cleanup patch somewhere down the road. I considered that but getTokenInternal() is riddled with gotos and I couldn't see how to do it in a way that didn't hurt performance.

Robert Sayre

Comment 22

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/2b9d805d77a1

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Jan de Mooij [:jandem]

Updated

•

15 years ago

Blocks: 618572

Jan de Mooij [:jandem]

Updated

•

15 years ago

No longer blocks: 618572

Depends on: 618572

Henrik Skupin [:whimboo][⌚️UTC+1]

Updated

•

14 years ago

Depends on: 634444

Alice0775 White

Updated

•

14 years ago

Depends on: 653890

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

No longer depends on: 653890

Alice0775 White

Updated

•

14 years ago

Depends on: 653890

You need to log in before you can comment on or make changes to this bug.