Created attachment 551151 [details] [diff] [review]
patch for m-c
There's currently some twitchiness involved with pushing the dummy frame when entering a compartment (see the comments in AutoCompartment::enter and ContextStack::saveFrameChain). The root evil here is that its not "right" to set the compartment before calling pushDummyFrame and vice versa. So the obvious fix is to make pushDummyFrame do the compartment change itself, atomically. This will also remove some hackage that bhackett had to do for TI.
Created attachment 551154 [details] [diff] [review]
patch for TI
This patch is the m-c patch rebased and merged for TI and with the abovementioned hacks removed.
Brian, so I was able to find a single test case (testStackIter.js) that caused the problems (I was just running the tests incorrectly at first). The attached patch fixes them.
Oops, forgot to qref: pretend there is no "cx->compartment = dest" in saveFrameChain (as its strictly redundant given the cx->resetCompartment).
Comment on attachment 551151 [details] [diff] [review]
patch for m-c
This is pretty awesome.
>diff --git a/js/src/vm/Stack.h b/js/src/vm/Stack.h
>+ * ensureSpace distinguishes between a destination compartment (which is
>+ * used to guard access to the trusted buffer space) and the cx's current
>+ * compartment (which is used to report errors). These are usually the same
>+ * ensureSpace defaults to CX_COMPARTMENT which means "use cx->compartment".
>+ static const size_t CX_COMPARTMENT = 1;
> inline bool ensureSpace(JSContext *cx, MaybeReportError report,
>- Value *from, ptrdiff_t nvals) const;
>+ Value *from, ptrdiff_t nvals,
>+ JSCompartment *dest = (JSCompartment *)CX_COMPARTMENT) const;
Instead of this crazy CX_COMPARTMENT stuff, why not overload ensureSpace with an ensureSpace that doesn't take the |dest| parameter at all and just calls ensureSpace with cx->compartment as the last parameter?
Looks awesome otherwise. r=me.
(In reply to Blake Kaplan (:mrbkap) from comment #3)
> Instead of this crazy CX_COMPARTMENT stuff, why not overload ensureSpace
> with an ensureSpace that doesn't take the |dest| parameter at all
Hah, of course, thanks!
Oh wait, I did try this and it isn't so easy: the issue isn't just the default value but that we need to propagate this "use the cx's compartment" choice from ensureOnTop to ensureSpace (which is hard to do it with overloads without duplicating code).
yeah, mrbkap was right, no CX_COMPARTMENT
Merged in m-c:
Backing out to test a significant Windows-only (NT5.1 and NT6.1) Dromaeo(SunSpider) regression:
Confirmed, the patch caused the regression.
I can repro the perf regression just by passing the dest parameter (but otherwise keeping everything unmodified). Also interesting is that ONLY Dromaeo(SunSpider) was affected, not other Dromaeo tests, not SunSpider, not any other Talos measurement. Perhaps this breaks some critical PGO optimization? I'm going to see if I can repro the regression on a non-PGO local opt build and play with it more there.
PGO is currently disabled for JS, no?
(In reply to Ryan VanderMeulen from comment #13)
> PGO is currently disabled for JS, no?
So I tested locally (simple optimized non-PGO build) and I see no speed difference. Le sigh. I'll start experimenting with try.
Heh, if nothing else, this can land when TI lands since TI did not experience any dip:
Or, more accurately, since the patch hasn't been backed out of TI, it will just landed with the merge.
Hah! As if I couldn't flip-flop any more on the CX_COMPARTMENT issue, using CX_COMPARTMENT does indeed fix the talos regression on try!
The reasoning in the original patch, which was later decided to be premature optimization, is that the 'dest' parameter to the (inlined) ensureSpace is only needed for the out-of-line slow path so it should be a simple constant, not a load (cx->compartment). Now, one would hope that a smart compiler would notice that the load was only needed on the branch so it would push the load down so it only happened on the slow path. But, if you are the compiler, maybe this is risky since it is beneficial for the load to be early (so it can have more time to complete before needed). (Maybe with profiling (which RyanVM ensures me is not happening for JS) the compiler would be able to make this judgment.) A constant, OTOH, is trivially constant-propagated down to the call-site.
Still, 8% on Dromaeo(SunSpider) is an extraordinarily huge hit for such a small optimization! Perhaps someone who knows how to profile on Windows can see what we are doing in Dromaeo(SunSpider) there that puts so much pressure on VM stack functions which, in theory, shouldn't be so hot. Maybe bz cares about Dromaeo....... :)
> Maybe bz cares about Dromaeo....
It's interesting. The JS team has consistently been not caring about Dromaeo for some reason.... even though a bunch of stuff it measures is in fact JS performance relevant to actual websites (and I'm not talking just the pure-JS tests, which are _not_ relevant to actual websites).
I do care, I guess, since no one else is willing to, but I'm also spread very very thin right now, and have not in fact done any Windows profiling yet. :(