368520 - Rhino parser inspires fear, awe

Reporter

Description

•

18 years ago

As I've been wading through the E4X bugs and testing them against the nearly-complete non-XMLBeans E4X support, I've been able to catalog those bugs that can't be fixed without altering the underlying parser code.  I'm creating this issue to address the parser situation (and marking it as part of Core, although we could revisit that) and setting it up as a blocker for those other issues.  See the "Blocks" list for details.

There has been some discussion on the newsgroup as of late regarding the parser; see

David Caldwell

Reporter

Comment 1

•

18 years ago

Apologies, submitted too early.  See: http://groups.google.com/group/mozilla.dev.tech.js-engine/browse_thread/thread/773c5cc56bfad5f3/# for the discussion referenced above.

David Caldwell

Reporter

Comment 2

•

18 years ago

I am generalizing this from E4X issues to a general parser issue, as there are non-E4X bugs that might want to mark this as a blocker.

Summary: Rhino parser cannot handle many E4X constructs → Rhino parser inspires fear, awe

David Caldwell

Reporter

Updated

•

18 years ago

Blocks: 368455

David Caldwell

Reporter

Comment 3

•

18 years ago

Changing to "enhancement" since of course the parser works.  It just inspires a sense of impotence and unworthiness in some of those who might seek to master it.

Severity: normal → enhancement

David Caldwell

Reporter

Updated

•

18 years ago

Blocks: 351991

David Caldwell

Reporter

Updated

•

18 years ago

Blocks: 323890

Bob Jervis

Comment 4

•

17 years ago

Since I've been making changes to the parser for months now, I don't feel either fear or awe at it.  I think it's kind of clean, as parsers tend to go.

Maybe I'm just strange, but I'm not concerned about the state of the parser.  In any event, I don't see anything actionable about this bug.  

What exactly could we do to make it less fearsome?

David Caldwell

Reporter

Comment 5

•

17 years ago

Well, I created this as a blocker for the E4X stuff based on Attila's comments (and those of others) in this thread:

http://groups.google.com/group/mozilla.dev.tech.js-engine/browse_thread/thread/773c5cc56bfad5f3/a6fa53f5870f1029?lnk=gst&q=parser&rnum=2#a6fa53f5870f1029

Perhaps I scared too easily.  But anyway, it seemed at the time that some of the approaches described in the thread might be the "actionable" stuff here.  I thought if that discussion resulted in a solution then when the solution was implemented this bug would be closed.  I am not so good with parser generators, etc. -- do you have any thoughts about the discussion in the thread above?

Bob Jervis

Comment 6

•

17 years ago

My comment on the above thread is that whenever anobody offers you a parser generator run away.  I have never known a self-respecting production compiler writer who has used one.  If you think the existing parser is scary, then parser generators are far, far, far worse.

I have used them and I abandoned them 27 years ago in favor of the recursive descent parsers that I have written ever since.

If I were writing a parser from scratch the most significant difference with what is there is that I would use a precedence table for binary expression parsing.  The result is a single recursive function that handles all binary operators instead of the cascade of very similar functions we have today (that are modeled on the grammar).

Otherwise, I think the Rhino parser is a pretty clean recursive descent parser that has few problems.  I found it effortless to patch in changes (adding the get/set syntax took less than a day, const took about the same).  You haven't seen my type checker changes that involve parsing JSDoc comments and attaching type annotations to the parse tree.  Those were just as easy to plumb through.

Bob Jervis

Comment 7

•

17 years ago

Oh, one thing: I haven't had any reason to mess around with the E4X parsing, but I have written about 5 XML parsers and they are even easier to write than a Javascript parser (if your XML parser takes more than about 300 lines of code, total, you're doing something wrong).

David Caldwell

Reporter

Comment 8

•

17 years ago

Regarding comment 7:  We tokenize the XML, though, as well as tokenizing some of the ordinary text nodes (e.g., if they contain curly braces), so it's a little bit more complex than your ordinary XML case, because within our text nodes we can have XML inside curly braces, etc.  So there are more possible states than in a vanilla XML parser.

As for the parser generator debate (or more generally the debate on whether the parser could be simplified) ... I am mostly a bystander (at least for now, unless I dive in a little more).

But if we decide that doing nothing is the best choice, then I have no problem closing this bug.

Norris Boyd

Comment 9

•

17 years ago

Of course I'm biased, but I'd agree with Bob. Staying with the existing recursive descent parser is the way to go. The Rhino parser was heavily informed by the SpiderMonkey parser; starting over with a different approach is likely to lead to subtle incompatibilities with the de facto standard on tricky things like semicolon insertion.

Attila Szegedi

Comment 10

•

17 years ago

(In reply to comment #9)
> Of course I'm biased, but I'd agree with Bob. Staying with the existing
> recursive descent parser is the way to go. The Rhino parser was heavily
> informed by the SpiderMonkey parser; starting over with a different approach is
> likely to lead to subtle incompatibilities with the de facto standard on tricky
> things like semicolon insertion.
> 

I did embark on a pet project to write a generated parser for JavaScript few months back, and it was the inability of all available Java parser generators to cope with the semicolon insertion that made me throw up my hands :-)

On the bright side, I started looking into that with the assumption that you won't be around Rhino much in the future. Seeing how this changed now, I'm 100% happy to drop the subject :-)

David Caldwell

Reporter

Comment 11

•

17 years ago

In that case, it seems like there's a consensus sufficient for closing this.  If I don't hear an objection in the next few days, I'll do that.

David Caldwell

Reporter

Comment 12

•

17 years ago

Cleaning up the bug list, as it seems that there is widespread agreement here.

Status: NEW → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

David Caldwell

Reporter

Updated

•

17 years ago

Resolution: FIXED → INVALID

Bugzilla

Quick Search

Rhino parser inspires fear, awe

Categories

(Rhino Graveyard :: Core, enhancement)

Tracking

(Not tracked)

People

(Reporter: inonit, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Updated

Comment 3

Updated

Updated

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Updated