Closed Bug 877658 Opened 7 years ago Closed 6 years ago

Gecko profiler has lots of undetected JS rooting hazards

Categories

(Core :: Gecko Profiler, defect, major)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: bzbarsky, Assigned: terrence)

References

Details

Attachments

(1 file, 1 obsolete file)

And the reason they're undetected is that the static analysis works with the type information available but the profiler code has things like reinterpret_cast from JSCustomObject* to JSObject*, and the static analysis has no way to know that the value of a JSCustomObject* can go stale on gc.

Can we just template this stuff on the actual types or something?
The profiler doesn't hold any reference to JSObject. I imagine the call of concern is |jsval getProfileData();| which creates a JSObject and returns it. I'm not familiar with rooting but it is required if we're not holding a reference when GC/CC runs?
Let's take a concrete example.

  184 JSCustomObject* BuildJavaThreadJSObject(JSAObjectBuilder& b)

has this code:

  195     JSCustomObject *sample = nullptr;
...
  207         sample = b.CreateObject();
...
  210         b.ArrayPush(samples, sample);

This call can trigger garbage collection.  With a moving gc, after that call any unrooted gcthing pointers can be stale.  So at this point "sample" (and also "samples" and "frames" and so forth) can all be pointing to lala-land.  And then the code does:

  213         b.DefineProperty(sample, "time", sampleTime);

which is clearly bad if sample is pointing to lala-land.

Or another example.  In JSObjectBuilder::ArrayPush we have (ignoring error-checking and the like):

90   mOk = JS_GetArrayLength(mCx, (JSObject*)aArray, &length);
96   mOk = JS_SetElement(mCx, (JSObject*)aArray, length, &objval);

JS_GetArrayLength can gc, which will make the aArray pointer stale.  But the rooting static analysis has no idea that this is actually a JSObject* that lives across the JS_GetArrayLength call.

This code is all fine now while we have a conservative stack scanner, but the JS engine folks are trying to move to an exact rooting setup and moving gc, and then this code is definitely not fine.
Understood. The reason why the profiler code is designed this way is we need to either return a JSObject or return a JSON c string when the JSEngine is no longer around (shutdown profiles). So we use JSCustomObject and an abstract builder to achieve this.

Perhaps I can modify the code to always build a JSON c string and then just have nsProfiler.cpp stringify that string before returning it? The performance would probably be much worse however for large profiles.
I guess the question is whether the builder can become a template on the relevant types....
(In reply to Boris Zbarsky (:bz) from comment #4)
> I guess the question is whether the builder can become a template on the
> relevant types....

I don't see why that would not be possible.
Attached patch hazard_profiler-v0.diff (obsolete) — Splinter Review
This builds and can load a few websites locally. Lets see what try thinks.

https://tbpl.mozilla.org/?tree=Try&rev=891b3c3e8fbb
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Severity: normal → major
OS: Mac OS X → All
Hardware: x86 → All
Version: unspecified → Trunk
Seems some compilers want an explicit template instantiation for ThreadProfile::BuildJSObject, probably because there are no local callers of the local instantiations.

https://tbpl.mozilla.org/?tree=Try&rev=37cded44f67f
Appears to be green on all platforms with the right template instantiations in place.

This is actually not too bad a patch: the behavior was already fully static, so converting it to templates was straightforward. I personally find the new version considerably clearer (modulo generic template ugliness) than the previous virtual calls and gratuitous casting. It should also be a bit faster, which may be important for profiling code.
Attachment #800530 - Attachment is obsolete: true
Attachment #800945 - Flags: review?(bzbarsky)
Comment on attachment 800945 [details] [diff] [review]
hazard_profiler-v1.diff

Benoit, are you comfortable reviewing this?
Attachment #800945 - Flags: review?(bzbarsky) → review?(bgirard)
I'll take a look. Have you checked if saving a shutdown profile works via the custom JSON code still works? Try the 'Firefox restart' button which will open two profiles, one of which is the shutdown.

(In reply to Terrence Cole [:terrence] from comment #8)
> It should also be a bit
> faster, which may be important for profiling code.

That's always nice. For large profiles it takes a while to pass it to cleopatra but I never checked which part (JSON building, or passing the profile between the worker->addon->cleopatra page) was the bottleneck.
(In reply to Benoit Girard (:BenWa) from comment #10)
> I'll take a look. Have you checked if saving a shutdown profile works via
> the custom JSON code still works? Try the 'Firefox restart' button which
> will open two profiles, one of which is the shutdown.

The restart works fine, however, the shutdown profile only has a single entry: Startup::XRE_Main. Is this normal?
Just tried on my stock nightly and got exactly the same behavior. Consider this tested.
Comment on attachment 800945 [details] [diff] [review]
hazard_profiler-v1.diff

Review of attachment 800945 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the slow review. The templetization looks fine but I can't review the rootings bit so I'll trust you on that :).

Thanks for doing this!
Attachment #800945 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/4bdb0462804b

https://tbpl.mozilla.org/?tree=Try&rev=891b3c3e8fbb (mostly green)
https://tbpl.mozilla.org/?tree=Try&rev=37cded44f67f (+ green on previous orange)

Good timing, I was actually going to bug you about it tomorrow (-: the tighter typing may expose some new rooting hazards and I'd like a bit of lead time to address those before we're ready to land the analysis on TBPL.
https://hg.mozilla.org/mozilla-central/rev/4bdb0462804b
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.