Closed Bug 636654 Opened 14 years ago Closed 14 years ago

Extract large cold chunks (e4x, @lines) from getTokenInternal()

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 1 obsolete file)

patch 14 years ago Nicholas Nethercote [inactive] 25.50 KB, patch		Details \| Diff \| Splinter Review
patch (against TM 63113:40092c96120b) 14 years ago Nicholas Nethercote [inactive] 32.89 KB, patch	brendan : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

14 years ago

jsscan.cpp has various big chunks of XML-handling code, esp. in getTokenInternal(). It would be nice to factor this as much as possible so it doesn't get in the way of the main show.

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Depends on: 638034

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

14 years ago

Note to self... in this code: #if JS_HAS_SHARP_VARS || JS_HAS_XML_SUPPORT badchar: #endif The "|| JS_HAS_XML_SUPPORT" can be removed -- badchars is only jumped to from JS_HAS_SHARP_VARS code.

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Summary: Separate XML lexing code in jsscan.cpp → Separate the XML lexing code in jsscan.cpp

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

14 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Here's the hacky-but-easy way of doing this. There are two big chunks of XML-specific code in getTokenInternal(). This patch just moves them into a new file, jsscanxml.cpp, and #includes them in the right spots (using some #ifdefs to #include the appropriate chunk in the appropriate place). I came to this approach after first trying to move the chunks into separate functions, which was a bit of a pain due to the gotos in the chunks and the variables (tt, tp) modified by the chunks. I suspect you're going to hate this approach... or if you think it's ok but don't like the #ifdefs I could split jsscanxml.cpp into jsscanxml[12].cpp. There are a bunch of small XML-specific chunks still in getTokenInternal(), I didn't do anything with them because they're small enough to be easy to ignore. I also combined the TSF_XMLTEXTMODE and TSF_XMLTAGMODE tests into a single test. This reduced the instruction count on parsemark-njn by 1.007x, which is better than a kick in the teeth.

Brendan Eich [:brendan]

Comment 3

•

14 years ago

I don't think we should #include fragments of function code. Better to factor into helpers, a la TokenStream::getXMLEntity. All this is doing is trying to sweep the dirt under a corner cabinet. But we'll have to move the cabinet or even maintain the dirt at some point. The copyright date and attribution in the new file are wrong; see jsxml.cpp for what to use, if this patch were to be approved. Really, we have inline helpers and residual #if JS_HAS_XML hunks, so creating a new file to #include is a mismatch. But since the death of jsrecursion.h, I had thought we had seen the last of #include'ing source fragments. Just say no. You can quote me in the style guide. At the least we need a really good reason, like performance or a compiler bug workaround. I don't see it here. I'll take this bug and factor into inline helpers if you don't want to do it. /be

Brendan Eich [:brendan]

Comment 4

•

14 years ago

(In reply to comment #3) > I don't think we should #include fragments of function code. Better to factor > into helpers, a la TokenStream::getXMLEntity. All this is doing is trying to > sweep the dirt under a corner cabinet. But we'll have to move the cabinet or > even maintain the dirt at some point. Again, not all of the dirt. And, the cabinet is on three points and everyone knows it. It rocks in the breeze (I mean, jsscanxml.cpp, new file, grep hits it -- no secrecy in this dirtpile). /be

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

14 years ago

Fair enough. The handling of @line comments is a good candidate for moving into a separate function, as is the handling of sharps. I guess there's an argument for moving more of the big chunks (eg. the handling of identifiers, numbers, strings, regexps) into separate functions. I'll see how I'm feeling when I do this; I've gotten quite used to those ones being in getTokenInternal(). (In reply to comment #1) > > #if JS_HAS_SHARP_VARS || JS_HAS_XML_SUPPORT > badchar: > #endif > > The "|| JS_HAS_XML_SUPPORT" can be removed -- badchars is only jumped to from > JS_HAS_SHARP_VARS code. And the badchar label can be moved into the immediately preceding JS_HAS_SHARP_VARS block.

Summary: Separate the XML lexing code in jsscan.cpp → Extract cold bits of code (e4x, @lines, sharps) from getTokenInternal()

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

14 years ago

Attached patch patch (against TM 63113:40092c96120b) — Details — Splinter Review

This patch: - Factors out the two big chunks of XML-specific code in getTokenInternal() into getXMLTextOrTag() and getXMLMarkup(). - Factors out the @line-handling code into getAtLine(). - Adds some comments to the '/'-scanning case. - Changes peekChars(), matchChars() and getXMLEntity() to return bool instead of JSBool. All methods in TokenStream now use bool exclusively. I didn't factor any more chunks out, mostly because the remaining chunks are all one screen's worth of text or less, and so not to hard to navigate, and so I'm pretty happy with what's left in getTokenInternal(). 'hg diff' has done a really bad job with this patch, sorry about that.

Attachment #518654 - Attachment is obsolete: true

Attachment #519835 - Flags: review?(brendan)

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Summary: Extract cold bits of code (e4x, @lines, sharps) from getTokenInternal() → Extract large cold chunks (e4x, @lines) from getTokenInternal()

Brendan Eich [:brendan]

Comment 7

•

14 years ago

Comment on attachment 519835 [details] [diff] [review] patch (against TM 63113:40092c96120b) >+bool >+TokenStream::getXMLMarkup(TokenKind *ttp, Token **tpp) >+{ >+ TokenKind tt; >+ int c; >+ Token *tp = *tpp; >+ JSAtom *atom; >+ JSBool inTarget; >+ size_t targetLength; >+ ptrdiff_t contentIndex; >+ >+ /* >+ * After much testing, it's clear that Postel's advice to protocol >+ * designers ("be liberal in what you accept, and conservative in what you This comment seems better before the function, and looks like it will rewrap in a prettier fashion there to boot. Slightly rubbery r=me otherwise, thanks! /be

Attachment #519835 - Flags: review?(brendan) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

14 years ago

(In reply to comment #7) > > Slightly rubbery r=me otherwise, thanks! Rubbery? I'm not sure what you mean. Hesitant?

Josh Matthews [:jdm]

Comment 9

•

14 years ago

Rubber stamp, I presume.

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

14 years ago

http://hg.mozilla.org/tracemonkey/rev/37e5a7fe49ef

Whiteboard: fixed-in-tracemonkey

Chris Leary [:cdleary] (not checking bugmail)

Comment 11

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/37e5a7fe49ef

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Extract large cold chunks (e4x, @lines) from getTokenInternal()

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Updated

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Updated

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Attachment

General

Description

File Name

Content Type