Closed
Bug 758509
Opened 12 years ago
Closed 12 years ago
Create JSScript before doing bytecode generation
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla16
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: [js:t])
Attachments
(9 files, 1 obsolete file)
6.90 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
26.80 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
14.02 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
13.88 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
13.98 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
16.31 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
14.36 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
33.59 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
14.71 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
This draft patch takes a step towards splitting JSScript into two parts, one (larger) holding bytecode and related things, and the other holding everything else. This should make lazy bytecode generation easier. JSScript::NewScript() is a new function that creates a JSScript and initializes some of the non-bytecode-related fields -- principals, noScriptRval, version, staticLevel. There are some more fields that can receive the same treatment. The old JSScript::NewScript() is now called JSScript::init, and JSScript:NewScriptFromEmitter is now called JSScript::initFromEmitter. They're both no longer static. Good things: - BytecodeEmitter (BCE) now has a pointer to the JSScript being created. (Previously the JSScript was only created at the end.) Because we can initialize some fields in JSScript up-front, we don't need to carry around that data in BCE or elsewhere. For example, I've removed BCE::noScriptRval and Parser::{principals,originPrincipals}. - initFromEmitter() is simpler. Bad things: - Initializing JSScript in two steps could lead to problems (e.g. forgetting to initialize a field). - The patch hasn't been updated for the s/RootedVar/Rooted/ renaming. I'm also not confident with my use of the new rooting APIs. - It feels a bit like I've pushed around a lot of code without that much effect. Ultimately it would make sense to split JSScript into two -- say JSScript and js::CompiledScript -- but I haven't got that far yet. Just looking for feedback at the moment. It's obviously incomplete, but does it feel like it's heading in the right direction?
Attachment #627104 -
Flags: feedback?(luke)
Attachment #627104 -
Flags: feedback?(bhackett1024)
Assignee | ||
Updated•12 years ago
|
Whiteboard: [js:t]
Comment 1•12 years ago
|
||
Comment on attachment 627104 [details] [diff] [review] draft patch I didn't fully appreciate the benefits of splitting JSScript before (I blame it on jetlag), but this actually looks pretty nice. Having the JSScript created before emitting seems righteous in the way you described. Going further, I wonder if we could avoid jsscript.cpp poking directly at BytecodeEmitter.
Attachment #627104 -
Flags: feedback?(luke) → feedback+
Assignee | ||
Comment 2•12 years ago
|
||
> Going
> further, I wonder if we could avoid jsscript.cpp poking directly at
> BytecodeEmitter.
Thinking superficially: I think NewScriptFromEmitter() should probably be in frontend/BytecodeEmitter.cpp, but it hasn't been worth the effort to change.
Thinking deeper: BytecodeEmitter (BCE) duplicates a lot of the stuff in JSScript. That data gets copied into JSScript in NewScriptFromEmitter(). Ideally, we'd get rid of all this duplication and just generate all the bytecode and related stuff directly into the JSScript -- BCE would then be a thin wrapper around JSScript, basically.
However, this is difficult because JSScript::data is packed so tightly. You can't create it until you know exactly how many consts, objects, regexps, bytecodes, srcnotes, etc, you have. Some other JSScript fields are just scalars and so this problem doesn't occur (nTypeSets comes to mind), but generating scalar values directly into JSScript and variable-length data indirectly via BCE would be a bit confusing.
Comment 3•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #2) > However, this is difficult because JSScript::data is packed so tightly. You > can't create it until you know exactly how many consts, objects, regexps, > bytecodes, srcnotes, etc, you have. Some other JSScript fields are just > scalars and so this problem doesn't occur (nTypeSets comes to mind), but > generating scalar values directly into JSScript and variable-length data > indirectly via BCE would be a bit confusing. FWIW, JITScript and the various JM ICs have the same problem, and have parallel structures in mjit::Compiler which get filled in and then copied over as compilation wraps up. I don't think this design is terrible, but it's best if the correspondence between the temporary and final data is clear (ideally with a single shared definition), which is not really the case for JM.
Comment 4•12 years ago
|
||
Comment on attachment 627104 [details] [diff] [review] draft patch Looks good. I would suggest though, that starting with the structural changes (splitting JSScript up) and then working through how that affects the places where those fields are accessed will make the change easier and cleaner, e.g. this troublesome stuff with partially initialized scripts floating around shouldn't exist. That's the approach I took doing objshrink and it worked pretty well.
Attachment #627104 -
Flags: feedback?(bhackett1024) → feedback+
Assignee | ||
Comment 5•12 years ago
|
||
> Looks good. I would suggest though, that starting with the structural
> changes (splitting JSScript up) and then working through how that affects
> the places where those fields are accessed will make the change easier and
> cleaner, e.g. this troublesome stuff with partially initialized scripts
> floating around shouldn't exist. That's the approach I took doing objshrink
> and it worked pretty well.
I have a different patch that does that :) It's really boring, though, so I worked on this one for a bit because it at least does the create-JSScript-earlier thing, which had an element of risk/interest. I'll combine the two.
bhackett, do you think a single big patch is the way to go? I can't really see how to naturally break this up into smaller patches.
Comment 6•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #5) > bhackett, do you think a single big patch is the way to go? I can't really > see how to naturally break this up into smaller patches. Sure, that's fine.
Assignee | ||
Updated•12 years ago
|
Summary: Split JSScript in two → Create JSScript before doing bytecode generation
Assignee | ||
Comment 7•12 years ago
|
||
This just removes some dead things.
Attachment #627104 -
Attachment is obsolete: true
Attachment #630854 -
Flags: review?(luke)
Assignee | ||
Comment 8•12 years ago
|
||
Currently you can't create a JSScript without also filling in heaps of bytecode-related fields (via NewScript or NewScriptFromEmitter). This patch changes that. It: - Adds CreateScript(), which just creates the JSScript and PodZeroes it. BytecodeEmitter can now get a pointer to the JSScript. - Renames NewScript() as partiallyInit() and NewScriptFromEmitter() as fullyInitFromEmitter() and makes them non-static methods. - Adds fullyInitTrivial() for the special trivial script created for Function.prototype -- this is nice because it moves the JSScript-initialization details from vm/GlobalObject.cpp to jsscript.cpp. This is a step towards lazy bytecode generation. This will also allow moving as much initialization of JSScript as possible from partiallyInit/fullInitFromEmitter/fullInitTrivial into CreateScript(). The following patches do lots of this (and CreateScript will end up with lots of arguments). Benefits of this: - It allows some state to be removed from BytecodeEmitter. - It makes much of the JSScript initialization more explicit, which makes the characteristics of each JSScript more obvious at its creation point.
Attachment #630855 -
Flags: review?(luke)
Assignee | ||
Comment 9•12 years ago
|
||
This patch initializes JSScript::noScriptVal in CreateScript(). This removes the need for BytecodeEmitter::noScriptVal.
Attachment #630856 -
Flags: review?(luke)
Assignee | ||
Comment 10•12 years ago
|
||
This patch initializes JSScript::version in CreateScript(). This allows version getters to be removed from Parser and BytecodeEmitter.
Attachment #630857 -
Flags: review?(luke)
Assignee | ||
Comment 11•12 years ago
|
||
This patch initializes JSScript::{principals,originPrincipals} in CreateScript(). This removes the need for storing them in Parser, and also allows us to do the "principals implies originPrincipals" manipulation in a single place instead of two.
Attachment #630858 -
Flags: review?(luke)
Assignee | ||
Comment 12•12 years ago
|
||
This patch initializes JSScript::compileAndGo in CreateScript(). This allows Parser::compileAndGo to become private.
Attachment #630859 -
Flags: review?(luke)
Assignee | ||
Comment 13•12 years ago
|
||
This patch initializes JSScript::savedCallerFun in CreateScript(). The only callsite where it can be true is the one in CompileScript(), and so I moved the code that inspects compileAndGo/isFunctionFrame() there, which removes the need for some subsequent assertions.
Attachment #630860 -
Flags: review?(luke)
Assignee | ||
Comment 14•12 years ago
|
||
> This patch initializes JSScript::savedCallerFun in CreateScript(). The only
> callsite where it can be true is the one in CompileScript(), and so I moved
> the code that inspects compileAndGo/isFunctionFrame() there, which removes
> the need for some subsequent assertions.
Hmm. This patch preserves the existing behaviour, which is that if a script has savedCallerFun==true (which only happens for eval scripts), then every nested script within it will also have savedCallerFun==true. But that suggests that script->objects[0] is set to the enclosing function, which seems entirely wrong.
If it is wrong, then it's easy to fix in this patch -- I can just change |parent->savedCallerFun| to |false| in BytecodeEmitter.cpp. Without this patch (and those preceding it) it's a bit more work to fix.
AFAICT, that change would also allow Parser::callerFrame to be removed.
Assignee | ||
Comment 15•12 years ago
|
||
eserves the existing behaviour, which is that if a script
> has savedCallerFun==true (which only happens for eval scripts), then every
> nested script within it will also have savedCallerFun==true. But that
> suggests that script->objects[0] is set to the enclosing function, which
> seems entirely wrong.
Just to clarify: it's correct for the eval script, but seems wrong for scripts nested within the eval script.
Assignee | ||
Comment 16•12 years ago
|
||
SharedContext::staticLevel is shared by Parser and BytecodeEmitter. But JSScript also has a staticLevel, so this patch moves SharedContext::staticLevel into TreeContext, which the Parser uses, and the JSScript's staticLevel is used by BytecodeEmitter. This required replacing SharedContext* arguments to several functions with TreeContext* parameters.
Attachment #631291 -
Flags: review?(luke)
Assignee | ||
Comment 17•12 years ago
|
||
This patch initializes JSScript::globaObject in CreateScript(). This allows BytecodeEmitter::needScriptGlobal to be removed.
Attachment #631292 -
Flags: review?(luke)
Comment 18•12 years ago
|
||
Comment on attachment 631292 [details] [diff] [review] Part 8: Initialize JSScript::globalObject in CreateScript() Review of attachment 631292 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/BytecodeCompiler.cpp @@ +105,5 @@ > if (!tc.init()) > return NULL; > > bool savedCallerFun = compileAndGo && callerFrame && callerFrame->isFunctionFrame(); > + js::GlobalObject *globalObject = needScriptGlobal ? GetCurrentGlobal(cx) : NULL; Why the 'js::'?
Comment 19•12 years ago
|
||
Comment on attachment 630854 [details] [diff] [review] Part 0: Remove some dead things Appetizers
Attachment #630854 -
Flags: review?(luke) → review+
Comment 20•12 years ago
|
||
Comment on attachment 630855 [details] [diff] [review] Part 1: Create JSScript before starting bytecode generation Review of attachment 630855 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/frontend/BytecodeEmitter.cpp @@ +2652,5 @@ > return false; > bce->switchToMain(); > } > > + if (!EmitTree(cx, bce, body) || Emit1(cx, bce, JSOP_STOP) < 0) Since you are already breaking this up, could you give EmitTree and Emit1 their own separate 'if (!...) return false' statements? ::: js/src/jsscript.cpp @@ +1229,5 @@ > + > + script->noScriptRval = true; > + > + script->code[0] = JSOP_STOP; > + script->notes()[0] = SRC_NULL; Could this reuse partiallyInit (for symmetry with fullyInitFromEmitter)? @@ +1490,5 @@ > > + if (data) { > + JS_POISON(data, 0xdb, computedSizeOfData()); > + fop->free_(data); > + } So one fundamental change is that the GC can observe allocated-but-not-fully-initialized scripts. Mostly these functions check "if has X" before doing anything, so I don't see any immediate problems. However, I would add a comment to the top of JSScript::markChildren and JSScript::finalize explaining that callers should expect a script in the state left by JSScript::CreateScript (allocate :) before being fully initialized. Also, could you add a test (using gczeal or other) that causes GC/finalization of such a script so that way we don't find out about bugs (in this patch or future) via browser failures. ::: js/src/jsscript.h @@ +575,5 @@ > + uint32_t nobjects, uint32_t nregexps, uint32_t ntrynotes, uint32_t nconsts, > + uint16_t nClosedArgs, uint16_t nClosedVars, uint32_t nTypeSets, > + JSVersion version); > + bool fullyInitTrivial(JSContext *cx, JSVersion version); // inits a JSOP_STOP-only script > + bool fullyInitFromEmitter(JSContext *cx, js::BytecodeEmitter *bce); JSScript::CreateScript sounds a bit redundant. I would suggest 'JSScript::construct' but this usually implies that the resulting object has finished initialization which is not the case here. How about JSScript::allocate?
Attachment #630855 -
Flags: review?(luke) → review+
Updated•12 years ago
|
Attachment #630856 -
Flags: review?(luke) → review+
Updated•12 years ago
|
Attachment #630857 -
Flags: review?(luke) → review+
Comment 21•12 years ago
|
||
Comment on attachment 630858 [details] [diff] [review] Part 4: Initialize JSScript::{principals,originPrincipals} in CreateScript() Nice
Attachment #630858 -
Flags: review?(luke) → review+
Updated•12 years ago
|
Attachment #630859 -
Flags: review?(luke) → review+
Comment 22•12 years ago
|
||
Comment on attachment 631291 [details] [diff] [review] Part 7: Initialize JSScript::staticLevel in CreateScript() Review of attachment 631291 [details] [diff] [review]: ----------------------------------------------------------------- On the subject of TreeContext, could you remove the #include "TreeContext-inl.h from BytecodeEmitter.cpp? ::: js/src/frontend/Parser.cpp @@ +1064,4 @@ > { > + /* Initialize non-default members of funtc->sc. */ > + funtc->sc->blockidGen = outertc->sc->blockidGen; > + if (!GenerateBlockId(funtc->sc, funtc->sc->bodyid)) I don't see why this change is necessary; seems worse. ::: js/src/jsscript.cpp @@ +585,5 @@ > script = JSScript::CreateScript(cx, !!(scriptBits & (1 << SavedCallerFun)), > /* principals = */ NULL, /* originPrincipals = */ NULL, > /* compileAndGo = */ false, > + !!(scriptBits & (1 << NoScriptRval)), version_, > + /* staticLevel = */ 0); As the height of JSScript::CreateScript increases, I think it would be more aesthetically pleasing to put a single parameter on each line rather.
Attachment #631291 -
Flags: review?(luke) → review+
Updated•12 years ago
|
Attachment #631292 -
Flags: review?(luke) → review+
Updated•12 years ago
|
Attachment #630860 -
Flags: review?(luke) → review+
Assignee | ||
Comment 23•12 years ago
|
||
> > + if (data) { > > + JS_POISON(data, 0xdb, computedSizeOfData()); > > + fop->free_(data); > > + } > > So one fundamental change is that the GC can observe > allocated-but-not-fully-initialized scripts. Mostly these functions check > "if has X" before doing anything, so I don't see any immediate problems. > However, I would add a comment to the top of JSScript::markChildren and > JSScript::finalize explaining that callers should expect a script in the > state left by JSScript::CreateScript (allocate :) before being fully > initialized. Good idea. > Also, could you add a test (using gczeal or other) that causes > GC/finalization of such a script so that way we don't find out about bugs > (in this patch or future) via browser failures. I'm not sure how to do this -- just do |gczeal(2,1)| and run a bunch of eval scripts in a loop and hope the case is hit? Or is there a more targeted way to do it? I can't work out how to deliberately do it from C++ using the JSAPI, let alone in JS code... FWIW, if I add |JS_ASSERT(data)| to JSScript::finalize(), I get 10 failures during jit_test.py and 430 failures during jstests.py. In other words, it becomes very crashy. Knowing that, do you still think it's worth having a standalone test for this?
Comment 24•12 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #23) > FWIW, if I add |JS_ASSERT(data)| to JSScript::finalize(), I get 10 failures > during jit_test.py and 430 failures during jstests.py. Ah good. It makes sense that this would be frequent: any failed compilation should leave a script in such a state. No additional test needed there. What if you put the same assert in JSScript::markChildren; any hits? If not, one easy way should be gczeal(2,1) and then make a bunch of let blocks (each one allocates a StaticBlockObject).
Assignee | ||
Comment 25•12 years ago
|
||
> What if you put the same assert in JSScript::markChildren; any hits? If
> not, one easy way should be gczeal(2,1) and then make a bunch of let blocks
> (each one allocates a StaticBlockObject).
No hits in jit-tests or jstests, but it was easy to write a test case with let blocks that hit that case. Thanks for the suggestion!
Assignee | ||
Comment 26•12 years ago
|
||
> On the subject of TreeContext, could you remove the #include
> "TreeContext-inl.h from BytecodeEmitter.cpp?
Unfortunately not; it defines the SharedContext constructor, which BytecodeEmitter.cpp uses.
There's definitely a case for renaming TreeContext{.cpp,.h,-inl.h} as SharedContext{.cpp,.h,-inl.h} and moving the TreeContext class into Parser.h (and also renaming TreeContext as ParseContext), but that's beyond the scope of this bug.
Assignee | ||
Comment 27•12 years ago
|
||
Try server run: https://tbpl.mozilla.org/?tree=Try&rev=ed9885614052 Landing on inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/9c1ec3a821e8 https://hg.mozilla.org/integration/mozilla-inbound/rev/c872b3e2b25b https://hg.mozilla.org/integration/mozilla-inbound/rev/d9b5986f2c61 https://hg.mozilla.org/integration/mozilla-inbound/rev/9742f1d93641 https://hg.mozilla.org/integration/mozilla-inbound/rev/3b184d56cbeb https://hg.mozilla.org/integration/mozilla-inbound/rev/ec6da0036263 https://hg.mozilla.org/integration/mozilla-inbound/rev/f58124ce7e30 https://hg.mozilla.org/integration/mozilla-inbound/rev/7af450131eb1 https://hg.mozilla.org/integration/mozilla-inbound/rev/36e3e2913c3d
Comment 28•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/9c1ec3a821e8 https://hg.mozilla.org/mozilla-central/rev/c872b3e2b25b https://hg.mozilla.org/mozilla-central/rev/d9b5986f2c61 https://hg.mozilla.org/mozilla-central/rev/9742f1d93641 https://hg.mozilla.org/mozilla-central/rev/3b184d56cbeb https://hg.mozilla.org/mozilla-central/rev/ec6da0036263 https://hg.mozilla.org/mozilla-central/rev/f58124ce7e30 https://hg.mozilla.org/mozilla-central/rev/7af450131eb1 https://hg.mozilla.org/mozilla-central/rev/36e3e2913c3d
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
You need to log in
before you can comment on or make changes to this bug.
Description
•