Open Bug 919095 Opened 11 years ago Updated 2 years ago

Let's be consistent about whether we lazily initialize standard classes in all globals.

Categories

(Core :: XPConnect, defect)

defect

Tracking

()

REOPENED

People

(Reporter: till, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

As discussed
Attached patch the patch (obsolete) — Splinter Review
Builds and runs locally. Try run here: https://tbpl.mozilla.org/?tree=Try&rev=b7c9083ce2c6
Attachment #808063 - Flags: review?(bobbyholley+bmo)
Comment on attachment 808063 [details] [diff] [review]
the patch

Review of attachment 808063 [details] [diff] [review]:
-----------------------------------------------------------------

Great! Thanks for doing that. r=bholley
Attachment #808063 - Flags: review?(bobbyholley+bmo) → review+
I'm sure glad I try-servered this: https://tbpl.mozilla.org/?tree=Try&rev=b7c9083ce2c6

Now I only have to understand why the browser runs just fine, most Talos tests do, too, but lots of other things fail because of not finding standard classes. Oh, and then there are actual crashes to throw into the mix. Those I should probably look into independently of the rest.
So, the Special Powers addon creates a chrome window, and in that chrome window, standard classes aren't resolved anymore. I tried wading through how this all works, but it's a bit much to take in all at once.

bholley, maybe you have an idea on why this happens, or can give me pointers for the direction to look in?
Flags: needinfo?(bobbyholley+bmo)
(In reply to Till Schneidereit [:till] from comment #3)
> I'm sure glad I try-servered this:
> https://tbpl.mozilla.org/?tree=Try&rev=b7c9083ce2c6

Yeah. Any change like this _definitely_ needs a try run. The r+ here just means "this is conceptually fine, and doesn't break anything I know about". But with deep platform changes, there's always a heavy risk of breaking something that the reviewer doesn't know about.

(In reply to Till Schneidereit [:till] from comment #4)
> So, the Special Powers addon creates a chrome window, and in that chrome
> window, standard classes aren't resolved anymore.

Is it really a chrome window? I'd think it's be a TabChildGlobal, which is where the in-tab SpecialPowers code runs (SpecialPowers was originally designed to be e10s-safe).

> bholley, maybe you have an idea on why this happens, or can give me pointers
> for the direction to look in?

I would insert a Math.sin() call right before the first place where we fail to find a standard class, gdb break on js::math_sin, and then see why the lazy resolution code isn't being invoked.
Flags: needinfo?(bobbyholley+bmo)
(In reply to Bobby Holley (:bholley) from comment #5) 
> (In reply to Till Schneidereit [:till] from comment #4)
> > So, the Special Powers addon creates a chrome window, and in that chrome
> > window, standard classes aren't resolved anymore.
> 
> Is it really a chrome window? I'd think it's be a TabChildGlobal, which is
> where the in-tab SpecialPowers code runs (SpecialPowers was originally
> designed to be e10s-safe).

I think you mean nsInProcessTabChildGlobal.
(In reply to Josh Matthews [:jdm] from comment #6)
> (In reply to Bobby Holley (:bholley) from comment #5) 
> > (In reply to Till Schneidereit [:till] from comment #4)
> > > So, the Special Powers addon creates a chrome window, and in that chrome
> > > window, standard classes aren't resolved anymore.
> > 
> > Is it really a chrome window? I'd think it's be a TabChildGlobal, which is
> > where the in-tab SpecialPowers code runs (SpecialPowers was originally
> > designed to be e10s-safe).
> 
> I think you mean nsInProcessTabChildGlobal.

Er, yes.
Blocks: 933681
I changed the patch to always eagerly initialize, instead of never doing so.

Try-servering here:
https://tbpl.mozilla.org/?tree=Try&rev=0d69ab677667
Attachment #825842 - Flags: review?(bobbyholley+bmo)
Attachment #808063 - Attachment is obsolete: true
Comment on attachment 825842 [details] [diff] [review]
Eagerly initialize standard classes in all globals

Review of attachment 825842 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCWrappedNative.cpp
@@ +316,5 @@
>      // create ends up there.
>      JSAutoCompartment ac(cx, global);
>  
> +    // Initialize the standard classes on the global.
> +    if (!JS_InitStandardClasses(cx, global))

If the plan is to eventually move this work into JS_NewGlobalObject, this is not isomorphic, since there are a number of paths that don't go through WrapNewGlobal.

How about we just do it directly in JS_NewGlobalObject? Or, barring that, xpc::CreateGlobalObject?
Attachment #825842 - Flags: review?(bobbyholley+bmo) → review-
Do we have ballpark numbers for how long eager standard class init takes?  I'm not saying we shouldn't do this; just trying to understand the impact.
(In reply to Boris Zbarsky [:bz] from comment #10)
> Do we have ballpark numbers for how long eager standard class init takes? 
> I'm not saying we shouldn't do this; just trying to understand the impact.

I don't remember precise numbers, but when I measured the impact Intl had, that was about 100m instructions during global creation and caused several % of Tpaint regression. Standard class init without Intl (and with the pattern that caused the 100M instructions fixed), otoh, was maybe 10k or 20k instructions. Based on that, I expect any Talos influence to be neglegible.
I'm a lot more interested in real web pages (which can easily have hundreds to thousands of iframes in them, one per like button) as opposed to the out-of-date talos pageset.  So I'd really like to know what the actual wall-clock numbers are for eager standard class init on some sort of sane hardware.
Note that I expect this to still be OK: 20k instructions ought to be about 20us, so even scaled across 1000 iframes is not terrible.  But would be good to know for sure.
The try run shows that eager initialization of standard builtins causes some issues.

One is that Math.__proto__ changes from Object.prototype to Math. The reason is that the resolve hook pushes an AutoResolving for "Math" onto the stack before calling js_InitMathClass. This causes js_GetClassObject (invoked by GlobalObject::getConstructor during proto lookup) to return a nullptr instead of looking up the constructor[1].

Now, I guess I could just stick an AutoResolving into GlobalObject::initStandardClasses, but that seems highly suspect to me. And it looks like we're fulfill the requirement that Math.__proto__ should be Object.prototype quite accidentally. In fact, we don't fulfill it in window globals at all (because for those, the builtins are initialized eagerly). Try executing `Math._proto__` in the console.

Waldo, do you know how this is all supposed to work, and what would be the best course of action for making it do so consistently?


[1]: http://mxr.mozilla.org/mozilla-central/source/js/src/jsobj.cpp?rev=48d02e98287b#3067
Flags: needinfo?(jwalden+bmo)
(In reply to Boris Zbarsky [:bz] from comment #13)
> Note that I expect this to still be OK: 20k instructions ought to be about
> 20us, so even scaled across 1000 iframes is not terrible.  But would be good
> to know for sure.

I'll do measurements of the exact effect on global object init time and on memory usage once I figured out the correctness issues.
> Try executing `Math._proto__` in the console.

Math.__proto__ == Object.prototype returns true for me in a nightly (2013-10-24) both in the web console and in an actual web page.  What makes you say that builtins are initialized eagerly in "window globals"?
> Math.__proto__ == Object.prototype returns true for me in a nightly
> (2013-10-24) both in the web console and in an actual web page.

Curious. When I open a new tab, then the console of that tab, and enter Math.__proto__, I get "[Object Math]". Doing the same in a bmo tab and some other pages I just tried gives the right answer. But then, http://people.mozilla.org/~jorendorff/es6-draft.html gives the wrong one, too.

> What makes
> you say that builtins are initialized eagerly in "window globals"?

I discovered that they are when I debugged the Intl-caused Talos regressions. See bug 853301, comment 85. That comment also definitely makes it clear that I have to do more measurements before anything can land here.
(In reply to Till Schneidereit [:till] from comment #14)
> One is that Math.__proto__ changes from Object.prototype to Math.

You mean we end up with a cyclic proto? Or is that a typo?

(In reply to Till Schneidereit [:till] from comment #17)
> > What makes
> > you say that builtins are initialized eagerly in "window globals"?
> 
> I discovered that they are when I debugged the Intl-caused Talos
> regressions. See bug 853301, comment 85.

Hm, what code actually causes us to init these classes? According to http://mxr.mozilla.org/mozilla-central/search?string=INIT_JS_STANDARD_CLASSES , we only do this for FrameMessageManager globals.
> Curious.

Indeed.  I can confirm the difference in behavior on a bmo page and the ES6 draft page; I just happened to test the console on a bmo page.  My actual web page was a minimal page from file://.  What the heck is going on here?
So, while I still don't know what's going on with the inconsistency discussed in the last few comments, I realized that I can easily do some measurements by just creating lots of globals in the shell with `newGlobal()`.

The results are a bummer and call the entire "get rid of lazy builtins" into question.

Consider this benchmark:

var t = dateNow();
var globals = [];
for (var i = 1000; i--;)
	globals.push(newGlobal());

print(dateNow() - t);

With lazy standard classes, this finishes in about 420ms on my machine. With eager standard classes, that goes up to about 1170ms. Thus, eagerly initializing the builtins takes about 0.75ms per global.

I suppose that this kills the idea already, but sadly, there's more.

Crudely measuring the memory usage by looking at top's MEM measurement, a shell started with this script and then put into interactive mode uses about 20MB of memory with lazy builtins, and 38MB with eager ones. Thus, eagerly initializing the builtins costs about 180KB per global.

Both measurements don't take into account that most globals won't just sit there and do nothing, but even if we assume that 50% or more of the builtins are initialized soon after a global is created, that still leaves considerable regressions.

I guess I'll work on making lazy builtins work for all cases, after all.
(In reply to Till Schneidereit [:till] from comment #20)
> I guess I'll work on making lazy builtins work for all cases, after all.

Yep, sounds like it. Back to the first version of the patch. :-)

Very importantly though, we need to investigate your claim that we're currently always doing eager initialization. I don't see the code that's doing that explicitly, and if we happen to be doing it accidentally, it's (apparently) costing us a lot. Can you look at that ASAP?
So, this only happens for TabChildGlobals, because we invoke InitClassesWithNewWrappedGlobal with INIT_JS_STANDARD_CLASSES set in InitTabChildGlobalInternal[1].

Changing that broke some things, so I'll have to do the debugging described from comment 3 onwards.


[1]: http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameMessageManager.cpp?rev=05c31c03a8dd#1415
0.75ms per global is probably survivable.  The memory bit is a lot more concerning, esp. given the way session restore works in Firefox and b2g's memory constraints.  :(
The best course of action for Object.getPrototypeOf(Math) is to change the NewObjectWithClassProto (no one should ever use this!) in js_InitMathClass to NewObjectWithGivenProto, passing in obj->as<GlobalObject>().getOrCreateObjectPrototype(cx) as the proto.  I'm kind of surprised nobody's ever noticed the difference here before between eager/lazy, actually.  Easy fix.

I'm pretty sure anyone who's gated by the perf of creating new globals is doing it *horribly* wrong.

I'd be interested in a more detailed breakdown of the memory consumption for a fresh, initialized global object.  Just saying 180K doesn't say whether that could be trimmed, trivially or with more work, and feels way too much like just giving up without even trying, to me.  And was 180K a 32-bit or a 64-bit figure?  (And I assume non-debug, with optimization enabled?)
Flags: needinfo?(jwalden+bmo)
njn might have throughts on the memory consumption of about:blank documents; he's measured it before.
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → INACTIVE
Status: RESOLVED → REOPENED
Resolution: INACTIVE → ---
Assignee: till → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: