Bug 626932 (Closed): Shrink JSParseNode
Opened 14 years ago • Closed 11 years ago
Categories: Core :: JavaScript Engine, defect
Status: RESOLVED WORKSFORME
People: (Reporter: n.nethercote, Unassigned)
Whiteboard: [MemShrink:P3]
Attachments: (1 obsolete file)
Massif tells me that allocations done by NewOrRecycledNode() can easily take 10MB, 20MB or more (this is on 64-bit; it would be a bit less on 32-bit). Can we shrink JSParseNode?
The recycling is somewhat helpful. Proportions of handed-out nodes that are recycled:
- Sunspider: ~20%
- V8: ~30%
- Kraken: ~ 3%
We could do variable-width JSParseNodes (a bit like variable-width LIR instructions in Nanojit), maybe by using some C++ sub-classing. That wouldn't play nicely with the recycle list, however; we'd have to have several recycle lists.
I wonder why TokenPos has to record the line number at the beginning and end of the token -- are multi-line tokens possible? Maybe strings with escaped newlines?
Do we even need to record line numbers? They're used relatively infrequently. Maybe we could compute them lazily by scanning back through the text?
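In sketch form, computing line numbers lazily could be as simple as the following (hypothetical helper, not actual SpiderMonkey code; assumes we keep only a character offset per token and that the source buffer is still available when asked):

  #include <cstddef>
  #include <cstdint>

  // Hypothetical: recover a 1-based line number from a character offset
  // by counting newlines from the start of the source buffer. The cost
  // is paid only when an error report or debugger actually needs the
  // line number, not on every token.
  static uint32_t
  LineNumberForOffset(const char16_t *src, size_t srcLength, size_t offset)
  {
      uint32_t line = 1;
      size_t limit = offset < srcLength ? offset : srcLength;
      for (size_t i = 0; i < limit; i++) {
          if (src[i] == '\n')
              line++;
      }
      return line;
  }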
Comment 1•14 years ago
TokenPos records both the line and the column. It could record everything using character numbers, and let someone else deal with turning that into line and column numbers. This might be a good step towards recording column numbers in the src notes for bytecode, too.
pn_next could be moved out of the element nodes and into the parent node; how many nodes are list members?
Only PN_NAME and PN_FUNC nodes need pn_link.
Binary and unary nodes certainly don't need the whole thing.
Couldn't namesets be moved into the function box?
Dealing with different size classes would complicate the allocator quite a bit. We wouldn't want to forget lessons learned by jemalloc regarding fragmentation.
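To make the size-class concern concrete, a per-size-class recycler might look roughly like this (hypothetical types and names, just a sketch, not a proposed design):

  #include <cstddef>
  #include <new>

  // Hypothetical: one freelist per node size class. Recycled nodes are
  // threaded through their first word; allocation checks the matching
  // freelist before falling back to the (here simplified) allocator.
  enum NodeSizeClass { NSC_SMALL, NSC_MEDIUM, NSC_LARGE, NSC_LIMIT };

  struct FreeNode { FreeNode *next; };

  struct NodeRecycler {
      FreeNode *freeLists[NSC_LIMIT] = {};

      void *allocate(NodeSizeClass sc, size_t bytes) {
          if (FreeNode *n = freeLists[sc]) {
              freeLists[sc] = n->next;      // reuse a recycled node of this class
              return n;
          }
          return ::operator new(bytes);     // stand-in for arena allocation
      }

      void recycle(void *node, NodeSizeClass sc) {
          FreeNode *n = static_cast<FreeNode *>(node);
          n->next = freeLists[sc];          // push onto this class's freelist
          freeLists[sc] = n;
      }
  };

Every extra class is another list to keep filled and balanced, which is where the fragmentation lessons from jemalloc would come in.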
Reporter
Comment 2•14 years ago
Just to make the numbers in comment 0 more rigorous:
SS (shell):
19023 numbers:
( 1) 15606 (82.0%, 82.0%): new
( 2) 3417 (18.0%,100.0%): recycle
V8 (shell):
51674 numbers:
( 1) 36207 (70.1%, 70.1%): new
( 2) 15467 (29.9%,100.0%): recycle
Kraken (shell):
1617986 numbers:
( 1) 1570461 (97.1%, 97.1%): new
( 2) 47525 ( 2.9%,100.0%): recycle
Opening http://www.blogtechnical.com/412/browser-benchmark-wars-how-does-firefox-4-beta-9-stand-up.bt (browser):
867812 numbers:
( 1) 693351 (79.9%, 79.9%): new
( 2) 174461 (20.1%,100.0%): recycle
Kraken has lots of huge array literals, which probably explains why it recycles such a small fraction of nodes.
Comment 3•14 years ago
What do the first and second percentages mean?
Reporter
Comment 4•14 years ago
The first is the proportion of the total; the second is the cumulative proportion.
Comment 5•14 years ago
On IRC we talked about the possibility of replacing the line-and-column numbers in TokenPtr with a single character index, and using an off-to-the-side table mapping indices to (line, column) pairs.
I'd like to eventually have column numbers as well as line numbers available at run-time. (At the moment, the src notes only record line numbers, which is unsatisfying for a number of reasons.) So it would be nice if the data structure were designed to eventually be attached to JSScripts, not just used at compile-time.
Another issue: We want to help people compile other languages to JavaScript. It would be very helpful if they could have their own index -> (line,column) tables, mapping JavaScript source positions to original source positions. Perhaps the mapping objects and useful functions on them could be accessible from JavaScript.
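A sketch of such a side table (hypothetical names, not an actual API proposal): record the character offset of each line start while scanning, then map any offset to a 1-based (line, column) pair with a binary search. A structure like this could later hang off a JSScript, and a compile-to-JS language could in principle carry its own table mapping generated offsets back to original source positions.

  #include <algorithm>
  #include <cstddef>
  #include <cstdint>
  #include <utility>
  #include <vector>

  // Hypothetical: lineStarts[i] holds the character offset of the first
  // character of line i (0-based); lineStarts[0] is always 0.
  struct SourcePositionTable {
      std::vector<size_t> lineStarts;

      // Map a character offset to a 1-based (line, column) pair.
      std::pair<uint32_t, uint32_t> lookup(size_t offset) const {
          auto it = std::upper_bound(lineStarts.begin(), lineStarts.end(), offset);
          size_t line = it - lineStarts.begin();             // 1-based line
          size_t column = offset - lineStarts[line - 1] + 1; // 1-based column
          return std::make_pair(uint32_t(line), uint32_t(column));
      }
  };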
Reporter
Comment 6•14 years ago
Somewhat related: I'm surprised that the recycling numbers are so low. From talking to Jim it sounds like the recycling that can be done after each statement (i.e. in the main loop in compileScript()) can be more aggressive than the recycling done elsewhere, i.e. the end-of-statement recycling should be able to reclaim all nodes.
Reporter
Comment 7•14 years ago
Brendan suggested I remeasure the recycling rate prior to bug 501908 landing, to see if something regressed.
Reporter
Comment 8•14 years ago
(In reply to comment #7)
> Brendan suggested I remeasure the recycling rate prior to bug 501908 landing,
> to see if something regressed.
I did that. The recycle rate regressed only a tiny bit: for SS, from 18.4% to 18.0%; for V8, from 30.4% to 29.9%. Not worth worrying about.
Comment 9•14 years ago
I did enlarge the class of nodes that are not recycled, to match (my understanding of) the requirements of method lists, the funbox tree, and the top-level decls table, so some increase in unrecycled nodes is to be expected.
If the top-level-form-processing loop in compileScript cleared the top-level decls, then we could really recycle everything. I tried doing this, and it introduced some regressions that I didn't look into, but in principle I think it should be fine.
Comment 10•14 years ago
The recycle confirmation is good; I wasn't pointing fingers, just looking for a sanity check on the bytes allocated to JSParseNodes. Shrinking them makes sense. If we can defer compiling, that's another bug, but it could save even more space for cold code. JSC does this.
/be
Comment 11•14 years ago
(In reply to comment #10)
> The recycle confirmation is good, I wasn't pointing fingers
Nothing wrong with bringing in the likely suspects for interrogation.
Reporter
Comment 12•14 years ago
FWIW, the memory used for the parse tree is transient, so it's arguably less important to shrink it than other stuff that hangs around longer.
Reporter
Comment 13•14 years ago
Here are the current offsets of all the fields in all the JSParseNode kinds:
-----------------------------------------------------------------------------
32-bit
-----------------------------------------------------------------------------
sizeof(JSParseNode) = 48
-- common
pn_pos: 4
pn_offset: 20
pn_next: 24
pn_link: 28
-- list
pn_u.list.head: 32
pn_u.list.tail: 36
pn_u.list.count: 40
-- ternary
pn_u.ternary.kid1: 32
pn_u.ternary.kid2: 36
pn_u.ternary.kid3: 40
-- binary
pn_u.binary.left: 32
pn_u.binary.right: 36
pn_u.binary.pval: 40
pn_u.binary.iflags: 44
-- unary
pn_u.unary.kid: 32
pn_u.unary.num: 36
pn_u.unary.hidden: 40
-- name
-- nameset
pn_u.nameset.names: 32
pn_u.nameset.tree: 44
-- apair
pn_u.apair.atom: 32
pn_u.apair.atom2: 36
-- dval
pn_u.dval: 32
-----------------------------------------------------------------------------
64-bit
-----------------------------------------------------------------------------
sizeof(JSParseNode) = 72
-- common
pn_pos: 4
pn_offset: 20
pn_next: 24
pn_link: 32
-- list
pn_u.list.head: 40
pn_u.list.tail: 48
pn_u.list.count: 56
-- ternary
pn_u.ternary.kid1: 40
pn_u.ternary.kid2: 48
pn_u.ternary.kid3: 56
-- binary
pn_u.binary.left: 40
pn_u.binary.right: 48
pn_u.binary.pval: 56
pn_u.binary.iflags: 64
-- unary
pn_u.unary.kid: 40
pn_u.unary.num: 48
pn_u.unary.hidden: 52
-- name
-- nameset
pn_u.nameset.names: 40
pn_u.nameset.tree: 64
-- apair
pn_u.apair.atom: 40
pn_u.apair.atom2: 48
-- dval
pn_u.dval: 40
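(Numbers like these can be reproduced with offsetof; a toy example on a hypothetical cut-down node, not the real JSParseNode definition:)

  #include <cstddef>
  #include <cstdio>

  // Hypothetical cut-down node, just to show how offsets like the ones
  // above can be measured; the real JSParseNode has more fields and a
  // larger pn_u union.
  struct ToyNode {
      unsigned char pn_type;
      struct { size_t begin, end; } pn_pos;
      ToyNode *pn_next;
      union {
          struct { ToyNode *left, *right; } binary;
          double dval;
      } pn_u;
  };

  int main() {
      printf("sizeof(ToyNode) = %zu\n", sizeof(ToyNode));
      printf("pn_pos:  %zu\n", offsetof(ToyNode, pn_pos));
      printf("pn_next: %zu\n", offsetof(ToyNode, pn_next));
      printf("pn_u:    %zu\n", offsetof(ToyNode, pn_u));
      return 0;
  }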
Reporter
Comment 14•14 years ago
Here's a partial patch for keeping the index-to-line-number lookups in a separate table. It has all the plumbing in place, but doesn't yet implement the table itself.
Comment 15•14 years ago
re: isEscapeFreeStringLiteral: it's a pity that this becomes more expensive. I wonder if there's a way the scanner could simply flag such string literals --- perhaps re-use some PND_ bits?
(Attaching flags to string literals would allow us to get some obscure ES5 strict mode corners right, too; see bug 624766.)
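In sketch form (hypothetical flag and fields, not the actual scanner code), the scanner sees every character of the literal anyway, so it could record whether it saw a backslash or line terminator and the parser could copy that into a spare node-flag bit:

  #include <cstdint>

  // Hypothetical flag in the node's flag word, set when a string literal
  // contains no escapes and no line terminators; isEscapeFreeStringLiteral()
  // then becomes a bit test instead of a rescan of the characters.
  static const uint16_t PND_ESCAPE_FREE_STRING = 0x8000;

  struct ToyToken {
      bool sawEscapeOrNewline;   // set by the scanner while reading the literal
  };

  struct ToyStringNode {
      uint16_t pn_dflags;

      void initFromStringToken(const ToyToken &tok) {
          if (!tok.sawEscapeOrNewline)
              pn_dflags |= PND_ESCAPE_FREE_STRING;
      }
      bool isEscapeFreeStringLiteral() const {
          return (pn_dflags & PND_ESCAPE_FREE_STRING) != 0;
      }
  };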
Comment 16•14 years ago
I take it the renaming of the TokenPtr members was just to get the compiler to help you validate your patch, right?
Reporter
Comment 17•14 years ago
(In reply to comment #16)
> I take it the renaming of the TokenPtr members was just to get the compiler to
> help you validate your patch, right?
Yes. That'll change, assuming I ever finish the patch. It's very rough; I put it here to free up a workspace for myself, not really for other people to look at :)
Comment 18•14 years ago
The plumbing looks pretty much as expected.
The line number comparisons in the error-reporting stuff really just want to see if the error is on the current line, where it can easily snarf the text to say stuff like:
typein:3: foo bar baz
typein:3: ....^
So you don't need to do a full line lookup to answer that question; you can switch to comparing the buffer offset against that of the current line's start.
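Roughly (hypothetical names, assuming the scanner keeps the offset of the start of the line it is currently on):

  #include <cstddef>

  // Hypothetical check: the error is on the line currently being scanned
  // iff its offset falls between the current line's start offset and the
  // scanner's current position -- no line-number lookup needed.
  static bool
  ErrorIsOnCurrentLine(size_t errorOffset, size_t currentLineStart,
                       size_t currentOffset)
  {
      return errorOffset >= currentLineStart && errorOffset <= currentOffset;
  }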
Comment 19•14 years ago
(In reply to comment #17)
> Yes. That'll change, assuming I ever finish the patch. It's very rough, I put
> it here in order to free up a workspace for myself, not really for other people
> to look at :)
Just trying to help! :)
Reporter
Updated•14 years ago
Blocks: mslim-fx5+
Reporter
Updated•13 years ago
Whiteboard: [MemShrink:P3]
Comment 20•13 years ago
(In reply to Jim Blandy :jimb from comment #1)
> TokenPos records both the line and the column. It could record everything
> using character numbers, and let someone else deal with turning that into
> line and column numbers. This might be a good step towards recording column
> numbers in the src notes for bytecode, too.
I have a patch queue in progress for this. I'll file a blocker of this bug and attach there.
/be
Reporter
Updated•12 years ago
Attachment #512616 - Attachment is obsolete: true
Reporter
Comment 21•11 years ago
> Massif tells me that allocations done by NewOrRecycledNode() can easily take
> 10MB, 20MB or more (this on 64-bit, it'd be a bit less on 32-bit).
This isn't true any more. about:memory has a "js-main-runtime-temporary-peak" measurement, and I haven't seen it go above ~3 MiB for ages. Nothing else to do here.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME