Closed Bug 588061 Opened 14 years ago Closed 13 years ago

bogus initial tokenizer position

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla8

People

(Reporter: dherman, Assigned: brendan)

References

Details

(Whiteboard: [inbound])

Attachments

(1 file, 1 obsolete file)

The TokenStream zeroes out its tokens buffer and initially points to the first token. So if the parser creates a node before advancing the token stream, the node gets the initial position 0:0, which is an invalid position.

This comes up if you try to use js::Parser::statements, because it calls ListNode::create before touching the token stream. I don't think this is currently used anywhere except JS_BufferIsCompilableUnit, which ignores token positions anyway. But it affects the upcoming parser API (see bug 533874) by producing an incorrect start position for the root node.

Dave
Blocks: 568142
cdleary's suggestion: js::Parser::parse() should not be calling js::Parser::statements(), and the latter should be private since it depends on being called only in a context where the token stream is advanced past the first token. js::Parser::parse() should create a ListNode properly up front.

Dave
Attached patch proposed fix (obsolete) — Splinter Review
Assignee: general → brendan
Status: NEW → ASSIGNED
Attachment #546042 - Flags: review?(cdleary)
OS: Mac OS X → All
Hardware: x86 → All
js::Parser::statements is private. We could have the TokenStream constructor always get a token and then unget. That seems like the best way to roll but it might cost noticeably on empty function bodies or programs. Comments?

/be
Attachment #546042 - Attachment is obsolete: true
Attachment #546060 - Flags: review?(cdleary)
Attachment #546042 - Flags: review?(cdleary)
Comment on attachment 546060 [details] [diff] [review]
alterna-fix, better

Review of attachment 546060 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscan.cpp
@@ +234,5 @@
> +     * future) can create parse nodes with good source coordinates before they
> +     * explicitly get any tokens.
> +     *
> +     * FIXME: reopen bug 556486 and prime the pump here, leaving cursor at the
> +     * next token for the parser to consider.

I don't think this is a legitimate FIXME.
Attachment #546060 - Flags: review?(cdleary) → review+
(In reply to comment #5)
> Comment on attachment 546060 [details] [diff] [review] [review]
> alterna-fix, better
> 
> Review of attachment 546060 [details] [diff] [review] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jsscan.cpp
> @@ +234,5 @@
> > +     * future) can create parse nodes with good source coordinates before they
> > +     * explicitly get any tokens.
> > +     *
> > +     * FIXME: reopen bug 556486 and prime the pump here, leaving cursor at the
> > +     * next token for the parser to consider.
> 
> I don't think this is a legitimate FIXME.

Right, cuz that bug is resolved WONTFIX -- is there a better FIXME comment, then? Should that bug be reopened based on stuff njn learned since?

/be
(In reply to comment #6)
> Right, cuz that bug is resolved WONTFIX -- is there a better FIXME comment,
> then? Should that bug be reopened based on stuff njn learned since?

Yeah, I think if njn could describe the new course of action (and how it differs from the potential of the failed approach) in a new bug, that'd be for the best. FIXMEs + open and actionable bugs FTW.
Um, I don't claim any special knowledge here.  Could TokenStream be initialized so that the first token has a valid source location?  But that seems hacky.

The premature ListNode creation seems to be the root of the problem.  What happens if there is no more input -- you've created a ListNode unnecessarily.  (Does the ListNode leak in that case, BTW?)  Could js::Parser::statements avoid creating the ListNode until a token has been parsed?  Alternatively, could it call peekToken() before creating the ListNode, and bail early if peekToken() fails?
(In reply to comment #8)
> Um, I don't claim any special knowledge here.  Could TokenStream be
> initialized so that the first token has a valid source location?  But that
> seems hacky.

The idea of pump-priming is non-hacky. The lexer always gets the next token and the parser consults it as if by peekToken(). That's all that is being proposed here. Is that a non-starter?

> The premature ListNode creation seems to be the root of the problem.

No, the parse tree always makes a list node even if it has no kids. This is a trivial memory cost for an uncommon empty-block/body/program source form, in order to avoid a non-uniform basis case.

> What
> happens if there is no more input -- you've created a ListNode
> unnecessarily.  (Does the ListNode leak in that case, BTW?)

Nothing leaks, parse nodes are arena-allocated with recycling, and freed en masse.

> Could
> js::Parser::statements avoid creating the ListNode until a token has been
> parsed?  Alternatively, could it call peekToken() before creating the
> ListNode, and bail early if peekToken() fails?

This was the less obviously good fix, which we discussed here and on IRC, since statements is called recursively and also from compileScript where the token stream has a valid next-token. Adding an extra peek just spends some cycles to prime the token position for the called-from-API-entry-point case that are a waste in all other cases.

Hence the desire to take the first-token cost only where needed: up front in TokenStream::init.

/be
/be
Nick the reason you were summoned here was your bug 556486 comment 6.

/be
(In reply to comment #10)
> Nick the reason you were summoned here was your bug 556486 comment 6.

Now I'm more confused.  That bug was resolved WONTFIX, so why is it relevant?  Maybe don't bother answering that, it sounds like you understand this much better than I do and have it under control.
Sorry for the confusing bug comment citing. The upshot: cdleary and I are still considering a pump-priming model. Your work in bug 637549 reduced the need for such a change, but it still might be the fastest and smallest code footprint way to parse. Not sure.

Anyway, we can revisit at leisure. I'm removing the FIXME.

/be
http://hg.mozilla.org/integration/mozilla-inbound/rev/32ab46a3803b

/be
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Backed out on inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/47b309008c4e

This caused failures in jsreftest on Linux debug builds:
tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1311365688.1311367284.20312.gz

REFTEST TEST-START | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/misc/future-reserved-words.js
++DOMWINDOW == 22 (0x32921c8) [serial = 2968] [outer = 0x1d762a0]
497869: Implement FutureReservedWords per-spec
Assertion failure: tp->begin.lineno == tp->end.lineno, at /builds/slave/m-in-lnx64-dbg/build/js/src/jsscan.cpp:525
WARNING: An event was posted to a thread that will never run it (rejected): file /builds/slave/m-in-lnx64-dbg/build/xpcom/threads/nsThread.cpp, line 374
WARNING: leaking reference to nsTimerImpl: file /builds/slave/m-in-lnx64-dbg/build/xpcom/threads/nsTimerImpl.cpp, line 494
TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/misc/future-reserved-words.js | Exited with code 1 during test run
INFO | automation.py | Application ran for: 0:07:16.379330
INFO | automation.py | Reading PID log: /tmp/tmpvP1wfgpidlog
PROCESS-CRASH | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/misc/future-reserved-words.js | application crashed (minidump found)
Crash dump filename: /tmp/tmpyRWUUj/minidumps/3c81c007-fc10-2978-1c76f6fd-26f3e91d.dmp
Operating system: Linux
                  0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64
CPU: amd64
     family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGABRT
Crash address: 0x1f400000843

Thread 0 (crashed)
 0  libpthread-2.11.so + 0xee6b
    rbx = 0x0000020d   r12 = 0x00000001   r13 = 0x8b69b6b8   r14 = 0xffffffff
    r15 = 0x8b62ca00   rip = 0xd360ee6b   rsp = 0x126cc838   rbp = 0x126cc870
    Found by: given as instruction pointer in context
 1  libxul.so!JS_Assert [jsutil.cpp:32ab46a3803b : 89 + 0x9]
    rip = 0xbe812cfc   rsp = 0x126cc840
    Found by: stack scanning
 2  libxul.so!js_ReportValueErrorFlags [jscntxt.cpp:32ab46a3803b : 1115 + 0x6]
    rip = 0xbe6d537c   rsp = 0x126cc870
    Found by: stack scanning
 3  0x7fff126cc94f
    rbx = 0x00000002   rip = 0x126cc950   rsp = 0x126cc878   rbp = 0xbe6d537c
    Found by: call frame info
 4  libxul.so!js::TokenStream::reportCompileErrorNumberVA [jsscan.cpp:32ab46a3803b : 525 + 0x29]
    rip = 0xbe7d6d2e   rsp = 0x126cc880
    Found by: stack scanning
Rats, ran the tests on my opt shell build instead of dbg.

http://hg.mozilla.org/integration/mozilla-inbound/rev/99f8219d08d3

/be
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
(In reply to comment #15)

Just a heads up -- the procedure for inbound is like tracemonkey: mark as [inbound] in the whiteboard and the inbound-merger marks as RESOLVED/FIXED.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [inbound]
Status: REOPENED → ASSIGNED
http://hg.mozilla.org/mozilla-central/rev/32ab46a3803b
Status: ASSIGNED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
this was backed out by mbrubeck and then pushed again.
This is the final changeset, sorry for bugspam:
http://hg.mozilla.org/mozilla-central/rev/99f8219d08d3
http://hg.mozilla.org/integration/mozilla-inbound/rev/735e1dc41b44

Forgot the test when transplanting from my tm tree to mozilla-inbound, d'oh.

/be
Ay caramba. Comment 19 was for bug 671947.

/be
brendan: since evalcx is shell only, shouldn't you just skip the test cross-global-implicit-this.js in the browser rather than mark it as a fails?
(In reply to comment #21)
> brendan: since evalcx is shell only, shouldn't you just skip the test
> cross-global-implicit-this.js in the browser rather than mark it as a fails?

I am a jstests.list copy-paste monkey. Fixed:

http://hg.mozilla.org/integration/mozilla-inbound/rev/14e1153b706c

/be
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: