Closed
Bug 256728
Opened 21 years ago
Closed 21 years ago
Suggestion: eliminating TSF_REGEXP
Categories
(Core :: JavaScript Engine, enhancement, P2)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla1.8alpha4
People
(Reporter: igor, Assigned: brendan)
Details
(Keywords: js1.5)
Attachments
(1 file, 1 obsolete file)
35.65 KB,
patch
|
shaver
:
review+
|
Details | Diff | Splinter Review |
To distinguish between / denoting division and regexp start jsparse.c instructs
the scanner via special TSF_REGEXP flag to mark the regexp context for the
scanner. The scanner then uses this flag during scanning of /.
But this leads to numerous flag set/clear: currently there are 15 such pairs
which are executed even if there are no regexp in the source.
To eliminate this flag set/clear I suggest to always scan / as devision and then
during parsing of primary expression treat / as regexp start in the literal
content and read the rest of regexp there.
Reporter | ||
Comment 1•21 years ago
|
||
The patch moves regexp scanning to the separated js_ReadRegExpToken function
and then changes PrimaryExpr to call the function after it gets /. The only
complication here is that the scanner will /= read as a special token and in
that case js_ReadRegExpToken adds the initial = to the buffer.
The the patch removes 15 pairs of TSF_REGEXP set/clear.
Assignee | ||
Comment 2•21 years ago
|
||
Sorry, this is going to be WONTFIXED. The TSF_REGEXP flag is being generalized
into a TSF_OPERAND flag, and used for E4X and possibly operator precedence unary
+/- disambiguation _a la_ Narcissus (js/narcissus/jsparse.js) changes that are
coming. It's true that it need not be set and cleared so often. That could be
what this bug could stand for, but I don't need this bug to make that change.
Igor, are you ok with WONTFIX?
/be
Reporter | ||
Comment 3•21 years ago
|
||
WONTFIX is OK but it was exactly E4X implementation in Rhino that triggered that.
Rhino also had TSF_REGEXP but after merging E4X patch from Bea/AgileDelta folks
which always scanned < as TOK_LT and then changed that to XML literal in the
unary expression context I realized that doing the same trick with / would save
a few lines of code. And it did.
Assignee | ||
Comment 4•21 years ago
|
||
The reason I prefer a scanner flag (or several, in the case of E4X -- the work
in progress patch in bug 256691 adds TSF_XMLTAGMODE, TSF_XMLTEXTMODE, and
TSF_XMLONLYMODE -- the last is for pure XML 1.0 parsing without E4X's embedded
{expressions}) is that it means you get a whole token from the scanner, and you
can continue to combine DIVOPs (distinguished by pn_op; *, / and % are divops).
Another reason is that <, <?, </, and <! (or perhaps <!-- and <![CDATA[) are all
distinct tokens, with spaces not allowed between parts, so either the scanner
must expose its GetChar primitive to the parser, or the parser must call into
alternative entry points than the js_GetToken one -- or just use flag bits, as
I'm doing.
For the operator precedence parser in Narcissus, there are five places that must
set t.scanOperand = false, and five corresponding t.scanOperand = true settings.
The savings in stack space and runtime (if optimized fully, e.g. by hand in a C
translation in SpiderMonkey) motivate this approach. Operator precedence
parsing requires scanning unary +/- as distinct tokens, so requires an operand
vs. operator mode-flag.
I'll keep this bug open to minimize the bit-twiddling in a patch that can be
landed ahead of the E4X patch, to make shaver's life easier.
/be
Assignee: general → brendan
Assignee | ||
Comment 5•21 years ago
|
||
> The reason I prefer a scanner flag (or several, in the case of E4X -- the work
> in progress patch in bug 256691 adds TSF_XMLTAGMODE, TSF_XMLTEXTMODE, and ....
Oops, I meant bug 246441.
/be
Status: NEW → ASSIGNED
Keywords: js1.5
OS: Linux → All
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → mozilla1.8alpha4
Assignee | ||
Comment 6•21 years ago
|
||
> can continue to combine DIVOPs (distinguished by pn_op; *, / and % are divops).
* is TOK_STAR, of course. The point is not specific to TOK_DIVOP -- TOK_UNARYOP
and other cases combine tokens that have the same precedence and associativity,
to reduce the parser size.
/be
Assignee | ||
Comment 7•21 years ago
|
||
So this patch actually increases net code size:
28c28
< 23427 8 0 23435 5b8b Linux_All_OPT.OBJ/jsparse.o
---
> 23419 8 0 23427 5b83 Linux_All_OPT.OBJ/jsparse.o
31c31
< 11029 960 0 11989 2ed5 Linux_All_OPT.OBJ/jsscan.o
---
> 11189 960 0 12149 2f75 Linux_All_OPT.OBJ/jsscan.o
This is the standalone, optimized (gcc 3.3.2 -O2) build, unpatched trunk on left
and with patch on right.
The parser got slightly smaller, but the scanner got a bit bigger from the
alternative entry point. Although the bit-twiddling may seem ugly (at first
;-), code sizes trump aesthetics if the two are in conflict (not sure they are,
here). I'll put the TSF_OPERAND patch here next.
/be
Reporter | ||
Comment 8•21 years ago
|
||
In Java the case was the opposite: the overhead of new function was buried under
saving from eliminations of byte on/off.
BTW, if js_GetToken would be replaced by the pair:
JSTokenType
GetTokenImpl(JSContext *cx, JSTokenStream *ts)
{
Current code with RETURN(x) replaced by return x
}
JSTokenType
js_GetToken(JSContext *cx, JSTokenStream *ts)
{
JSTokenType tt;
JSToken *tp;
tt = GetTokenImpl(cx, ts);
tp = &CURRENT_TOKEN(ts);
Expansion of the current RETURN
if (tt == TOK_ERROR) ts->flags |= TSF_ERROR;
tp->pos.end.index = ts->linepos
+ (ts->linebuf.ptr - ts->linebuf.base)
- ts->ungetpos;
return (tp->type = tt);
}
then jsscan.o build with gcc 3.3.3 -O2 would shrink by 396 bytes from 17604 to
17208 or 3.5% ...
Assignee | ||
Comment 9•21 years ago
|
||
Igor, see the e4x patch in progress in bug 246441 -- it gets rid of the
over-aggressive use of macros in js_GetToken. Bogus stuff especially on modern
super-scalar architectures, based at one point in 1997 or so, on profile data
gathered on an SGI Indy(!).
/be
Assignee | ||
Comment 10•21 years ago
|
||
And saves 628 bytes:
31c31
< 11029 960 0 11989 2ed5 Linux_All_OPT.OBJ/jsscan.o
---
> 10401 960 0 11361 2c61 Linux_All_OPT.OBJ/jsscan.o
/be
Assignee | ||
Updated•21 years ago
|
Attachment #156894 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #156936 -
Flags: review?(shaver)
Comment 11•21 years ago
|
||
Comment on attachment 156936 [details] [diff] [review]
proposed fix, paves the way for e4x patch
OK, anything to reduce the review-pain of E4X!
Attachment #156936 -
Flags: review?(shaver) → review+
Assignee | ||
Comment 12•21 years ago
|
||
Patch checked in.
/be
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•