Closed Bug 761261 Opened 12 years ago Closed 12 years ago

Instrument JS function pro/epilogue to push profile labels

Categories

(Core :: Gecko Profiler, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16

People

(Reporter: BenWa, Assigned: u443197)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 19 obsolete files)

470.15 KB, text/plain
Details
60.90 KB, patch
luke
: review+
Details | Diff | Splinter Review
I spoke with Luke and instrumenting the Js function pro/epilogue in profiling builds should be simple to implement and an overhead low enough to be useful (we're guessing 5% overhead). So far this looks like the best short term solution.

In the profiler module 'tools/profiler/*' we have the following relevant functions:
inline void* mozilla_sampler_call_enter(const char *aInfo);
inline void  mozilla_sampler_call_exit(void* handle);

If the profiler can be efficiently notified when relevant (functions that are long running) then the profiler should be able to show JS execution when 'Stackwalking' is disabled. With an additional patch it will be able to interleave JS with the C++ unwalking stack.
Ah, thanks for filing the bug.  I was just talking to Alex about this.
Depends on: 761277
Assignee: nobody → acrichton
Feel free to let me know if you have any question. I'm willing to make necessary changes to tools/profiler (it should be minimal anyways).
Attached patch v1 diff (obsolete) — Splinter Review
Here's a patch for JaegerMonkey with an API for the outside world to register some callbacks through JSRuntime. I haven't looked too much into the overhead just yet, and currently it only pushes the filename of of the script instead of "filename:line".

One thing I'm assuming is that the lifetime of the script is at least that of the JIT code, but that may be wrong?
Attached patch actual v1, removed cruft (obsolete) — Splinter Review
Attachment #630352 - Attachment is obsolete: true
(In reply to Alex Crichton from comment #3)
> One thing I'm assuming is that the lifetime of the script is at least that
> of the JIT code, but that may be wrong?

A JSScript will live strictly longer than the jit code.
Do you want me to write a patch to hook in the sample label functions when the profiler modules is loaded based on that patch?
I don't mind, I just wanted to make sure that this wasn't too horribly wrong and then I was gonna go hunt down where to find the JSRuntime for when the sampler starts.
Alright, very excited to try this out :)
So things are turning out a bit more complicated than I thought:

1. Turns out there's already functions to instrument entry/exits to functions in JS, so long as the --enable-trace-jscalls option is specified. If used, this callback is scoped under a JSContext*. Furthermore, this callback looks to be more accurate than the implementation I had above.

2. After this, somehow we need access to all JSContext* pointers, and conveniently there's a callback on each runtime which will be invoked whenever a context is created.

3. Then we just need a callback for every runtime which is created, which is easy enough to add, and I've done so far.

Putting all this together, however, results in a fair amount of woe. Turns out there's some xpconnect file which sets a ContextCallback, but doesn't invoke the old ContextCallback (such as the one we added when the JSRuntime* was created). If modified to do so, the browser either trips various assertions throughout the code base or brings up a popup window saying there's "no XBL binding for browser" and the refuses to bring up a window.

I'm not quite sure what interaction is going on, or why that error is popping up. Otherwise everything is *almost* in place to have some JS frames show up in SPS. Benoit do you have any idea what's going on? I can upload a really rough patch if necessary.

I'm not 100% sure how the interaction between the profiler and everything else is intended, but here's a quick summary of some changes I implemented:

* There's one thread-local TableTicker which now has a reference count. All JSContext* instances pipe through the same TableTicker.
* mozilla_sampler_start is now scoped to a JSContext* and so is mozilla_sampler_stop
* nsProfiler.cpp has been modified to take a JSContext* as an argument, but now that I think about it I'm not quite sure if I changed any invocations of StartProfiler() anywhere...
(In reply to Alex Crichton from comment #9)browser" and the refuses to bring up a window.
> 
> I'm not quite sure what interaction is going on, or why that error is
> popping up. Otherwise everything is *almost* in place to have some JS frames
> show up in SPS. Benoit do you have any idea what's going on? I can upload a
> really rough patch if necessary.
> 

I don't follow why the profiler needs any JSContext and I would REALLY love to avoid that. I'll take a look at your patch and if you can ping me on IRC maybe we can clear up the design.

> I'm not 100% sure how the interaction between the profiler and everything
> else is intended, but here's a quick summary of some changes I implemented:
> 
> * There's one thread-local TableTicker which now has a reference count. All
> JSContext* instances pipe through the same TableTicker.
> * mozilla_sampler_start is now scoped to a JSContext* and so is
> mozilla_sampler_stop
> * nsProfiler.cpp has been modified to take a JSContext* as an argument, but
> now that I think about it I'm not quite sure if I changed any invocations of
> StartProfiler() anywhere...

Currently StartProfiler() is called from my extension https://github.com/bgirard/Gecko-Profiler-Addon.git and Jeff's about:jank extension.
Attached patch v2 - wip (obsolete) — Splinter Review
Attachment #630353 - Attachment is obsolete: true
Attachment #631159 - Attachment is patch: true
Attached patch crash (obsolete) — Splinter Review
This is a bare minimum patch which is causing an XBL error at startup (at least most of the time). I can't seem to track down any reason why this is happening, any ideas?
Attached patch v3 (obsolete) — Splinter Review
Another work in progress, but a lot of the infrastructure is there. Crashes in a debug build due to an assertion on jsgc.cpp:3117. The comment above this assertion say that it's OK to happen, so not sure what's going on there...

In a release build, the JIT entry/exit isn't being tracked well so it crashes in some cases. I've seen the data in SPS though and it's got the whole stack all nice and clean.

Still trying to figure out how to track all entries/exits to the JIT and balance pushes/pops as necessary.
Attachment #631159 - Attachment is obsolete: true
Attachment #631441 - Attachment is obsolete: true
Depends on: 707308
Don't worry about leaking the labels. This will be fixed by bug 707308. I have a patch under way just need to fix an issue with it.
Attached patch v4 patch (obsolete) — Splinter Review
There's a set of still-failing tests in debug mode, but no crashes are happening in release mode (the relevant assertions are all disabled). Otherwise some data is actually being captured in profiling.

For some reason my SPS hates copy/paste, so I'm going to upload the downloaded version of the profile so you can see it. It at least functionally works now, but I want to track down the rest of the bugs before review.
Attachment #633338 - Attachment is obsolete: true
Attached file sps profile
Depends on: 766579
Attached patch v5 patch (obsolete) — Splinter Review
Woo I think I finally tracked down the last of the bugs with the tests.
Attachment #635012 - Attachment is obsolete: true
Attachment #635151 - Flags: review?(luke)
Attachment #635151 - Flags: feedback?(bgirard)
Comment on attachment 635151 [details] [diff] [review]
v5 patch

Review of attachment 635151 [details] [diff] [review]:
-----------------------------------------------------------------

The profiler works without MOZ_PROFILING so it would be nice to support this feature without MOZ_PROFILING. Note that firefox can be built without the profiler (non 1st tier platform can disable it) so we need to support that. You can check MOZ_ENABLE_PROFILER_SPS. When not running with MOZ_PROFILING the stack will only show SAMPLE_LABEL (like in the test file you included) so we can include JS stacks in there.

It's not clear from looking at the patch how you handle enabling/disabling this when the profiler is started/stopped at runtime. Also ideally the js support should be passed as an optional feature when starting the profiler. This is useful in case the user is interested in profile but doesn't want to incur the JS performance regression this may incur.

The format of ProfileStack is going to change once bug 707308 (part 1) lands. It will now include a slot for the frame address and a bit if the string must be copied when sampling. It would be great if you could rebase your patch on that change since you manipulate the stack directly. That will be all you need to support stack intertwining. If you're interested in testing out intertwining you can also rebase with bug 766579 (only tested on mac).
Attachment #635151 - Flags: feedback?(bgirard)
Alex: I gave a high-level glance over the whole thing and overall it looks great; nice work digging in and understanding JM, frame prologue/epilogue, and HashPolicies.

I have two high-level comments for the next iteration of the patch:

1. I think it would be good if you looked at https://wiki.mozilla.org/JavaScript:SpiderMonkey:C%2B%2B_Coding_Style (particularly bracing/indentation) and gave a styling pass over the whole patch.

2. To land this I think we'll need to actually deal with the "leak all strings" shortcut I advised you to take initially.  Perhaps you can talk to BenWa on the best way to handle this?
Attachment #635151 - Flags: review?(luke)
> The format of ProfileStack is going to change once bug 707308 (part 1)
> lands. It will now include a slot for the frame address and a bit if the
> string must be copied when sampling. It would be great if you could rebase
> your patch on that change since you manipulate the stack directly.

Is it not a bad thing if this is incredibly SPS-specific? A generic char* stack is quite easy to use elsewhere, but once we start having a very specific stack of data structures with a particular layout tuned for SPS, the generality is lost.

I don't really have a problem with it, but I was wondering if this possibly isn't the right direction to go in?

One possible approach would be to have two stacks, one for a char* and one for a void* frame and then the bit in the void* for copy would be a negatively-asserted kind of thing. Other code would have to be careful to explicitly set it to NULL on a pop(), but the JS integration would work seamlessly with using a char* stack only.
(In reply to Alex Crichton from comment #20)
> > The format of ProfileStack is going to change once bug 707308 (part 1)
> > lands. It will now include a slot for the frame address and a bit if the
> > string must be copied when sampling. It would be great if you could rebase
> > your patch on that change since you manipulate the stack directly.
> 
> Is it not a bad thing if this is incredibly SPS-specific? A generic char*
> stack is quite easy to use elsewhere, but once we start having a very
> specific stack of data structures with a particular layout tuned for SPS,
> the generality is lost.
> 
> I don't really have a problem with it, but I was wondering if this possibly
> isn't the right direction to go in?
> 
> One possible approach would be to have two stacks, one for a char* and one
> for a void* frame and then the bit in the void* for copy would be a
> negatively-asserted kind of thing. Other code would have to be careful to
> explicitly set it to NULL on a pop(), but the JS integration would work
> seamlessly with using a char* stack only.

If something that is flexible the best way to do it is passing a string to a callback and let the implementation decide on what it should do. JS integration needs to set the stack address correctly otherwise we wont interleave with the c++ stack correctly which is very important.
> If something that is flexible the best way to do it is passing a string to a
> callback and let the implementation decide on what it should do. JS
> integration needs to set the stack address correctly otherwise we wont
> interleave with the c++ stack correctly which is very important.

Well that seems pretty reasonable to have a stack for both a string description of and where the frame lies. I don't really have a problem with that, although it might be best to have them in two separate stacks for ease? That way the hook into the js engine would specify a string stack, a void* stack, a size pointer, and a maximum size.

You'd still probably want to change the copy flag to a negatively asserted thing so all SPS code knows that it should set the flight while the JS code blindly "doesn't set the flag" so that it knows that it should be copied out.
Attached patch v6 patch (obsolete) — Splinter Review
* MOZ_PROFILING no longer needed
* enable/disabling at runtime added (hopefully in the right place)
* the profile stack format has changed (discussed between benwa and me)
* Attempted to incorporate Mozilla style
* Profile strings are deallocated when their corresponding scripts are deallocated
Attachment #635151 - Attachment is obsolete: true
Attachment #635523 - Flags: review?(luke)
Attachment #635523 - Flags: feedback?(bgirard)
Attached file crash (obsolete) —
I got the attached crash while building with the changes from bug 707308 and bug 766579. Building with --enable-profiling on mac64.
Attached patch v7 patch (obsolete) — Splinter Review
Tracked down another problem, seeing if working with SPS.
Attachment #635523 - Attachment is obsolete: true
Attachment #635523 - Flags: review?(luke)
Attachment #635523 - Flags: feedback?(bgirard)
Attachment #636448 - Flags: feedback?(bgirard)
Attached patch v8 patch (obsolete) — Splinter Review
Benoit, this works for me up to the point that it looks like string copying is the problem. We talked about making the 'copy' bit in the stack entries be 0=copy, 1=don't copy, and with that change this looks like it should be working.
Attachment #635570 - Attachment is obsolete: true
Attachment #636448 - Attachment is obsolete: true
Attachment #636448 - Flags: feedback?(bgirard)
Attachment #636833 - Flags: feedback?(bgirard)
Attached patch v8 patch (obsolete) — Splinter Review
Attachment #636927 - Flags: review?(luke)
Attached patch patch (obsolete) — Splinter Review
Benoit and I have verified that this appears to be mostly working now. There will be some minor issues in interleaving stacks, but that will most likely be more of an SPS problem instead of a JS engine problem.
Attachment #636833 - Attachment is obsolete: true
Attachment #636927 - Attachment is obsolete: true
Attachment #636833 - Flags: feedback?(bgirard)
Attachment #636927 - Flags: review?(luke)
Attachment #636928 - Flags: review?(luke)
Attached patch JS Profiler Feature (obsolete) — Splinter Review
This will let the profiler addon decide at runtime if it wants JS profiling or not.
Attachment #636946 - Flags: review?(ehsan)
Attached patch JS Profiler Feature (obsolete) — Splinter Review
Attachment #636946 - Attachment is obsolete: true
Attachment #636946 - Flags: review?(ehsan)
Attachment #636947 - Flags: review?(ehsan)
Comment on attachment 636947 [details] [diff] [review]
JS Profiler Feature

Review of attachment 636947 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/TableTicker.cpp
@@ +393,5 @@
>    }
>  
>    JSObject *ToJSObject(JSContext *aCx);
>  
> +  bool ProfileJS() { return mProfileJS; }

Nit: make this const please.

@@ +959,5 @@
>    tlsTicker.set(NULL);
>    ProfileStack *stack = tlsStack.get();
>    ASSERT(stack != NULL);
> +
> +  if (uninstallJS)

Nit: Please call ProfileJS inline here as well, in the interest of keeping the start/stop function similar.
Attachment #636947 - Flags: review?(ehsan) → review+
(In reply to Ehsan Akhgari [:ehsan] from comment #32)
> Comment on attachment 636947 [details] [diff] [review]
> JS Profiler Feature
> 
> Review of attachment 636947 [details] [diff] [review]: 
> @@ +959,5 @@
> >    tlsTicker.set(NULL);
> >    ProfileStack *stack = tlsStack.get();
> >    ASSERT(stack != NULL);
> > +
> > +  if (uninstallJS)
> 
> Nit: Please call ProfileJS inline here as well, in the interest of keeping
> the start/stop function similar.

I can't, we delete 't'.
Attached patch JS Profiler Feature (obsolete) — Splinter Review
Carrying forward r+
Attachment #636947 - Attachment is obsolete: true
Attachment #636964 - Flags: review+
Attached patch JS Profiler Feature (obsolete) — Splinter Review
Attachment #636964 - Attachment is obsolete: true
Attachment #636965 - Flags: review+
(In reply to Benoit Girard (:BenWa) from comment #33)
> (In reply to Ehsan Akhgari [:ehsan] from comment #32)
> > Comment on attachment 636947 [details] [diff] [review]
> > JS Profiler Feature
> > 
> > Review of attachment 636947 [details] [diff] [review]: 
> > @@ +959,5 @@
> > >    tlsTicker.set(NULL);
> > >    ProfileStack *stack = tlsStack.get();
> > >    ASSERT(stack != NULL);
> > > +
> > > +  if (uninstallJS)
> > 
> > Nit: Please call ProfileJS inline here as well, in the interest of keeping
> > the start/stop function similar.
> 
> I can't, we delete 't'.

Ah, indeed we do!
Bug 761277 is not needed based on the approach we're taking. Part 1 of bug 707308 landed which was the only prerequisite from that bug, and bug 766579 is an optional improvement.

This patch should be good to land once reviewed.
No longer depends on: 761277
Attached patch patch (obsolete) — Splinter Review
Minor changes to what's pushed on the SPS stack (NULL instead of a stack pointer) as what Benoit and I discussed on IRC. I also merged Benoit's patch of the JS feature into this one as I changed the names around a bit.

This was also rebased on the current inbound with Benoit's most recent changes.
Attachment #636928 - Attachment is obsolete: true
Attachment #636965 - Attachment is obsolete: true
Attachment #636928 - Flags: review?(luke)
Attachment #637649 - Flags: review?(luke)
Comment on attachment 637649 [details] [diff] [review]
patch

Review of attachment 637649 [details] [diff] [review]:
-----------------------------------------------------------------

Again, this is looking great, nice work.  I still need to think about the interp/jit changes to see if it covers all the spidery paths, but I have some nits that I'll dump now.

::: js/src/jscntxt.cpp
@@ +1309,5 @@
> + * Serialize this entry into its string representation. This function can
> + * fail, and should be dealt with accordingly
> + */
> +const char*
> +ProfileStringify(JSScript *script, JSFunction *function)

This definition could be moved to vm/SPSProfiler.cpp.

@@ +1319,5 @@
> +    if (script->filename)
> +        len += strlen(script->filename);
> +    /* Account for ':', 6 digits for a line number, and trailing 0 */
> +    len += 8;
> +    char *ptr = (char*) malloc(len);

js_malloc

@@ +1320,5 @@
> +        len += strlen(script->filename);
> +    /* Account for ':', 6 digits for a line number, and trailing 0 */
> +    len += 8;
> +    char *ptr = (char*) malloc(len);
> +    if (ptr == NULL) return ptr;

'return ptr' on new line

::: js/src/jscntxt.h
@@ +348,5 @@
> +    const JSScript *script;
> +    const char *string;
> +};
> +
> +const char *ProfileStringify(JSScript *script, JSFunction *function);

Perhaps AllocProfileString (to emphasize the fact that the return value must be freed)?

@@ +709,5 @@
> +    /* When profiling, store for strings and stack we're instrumenting */
> +    js::ProfileStringSet profileStrings;
> +    js::ProfileEntry     *profilingStack;
> +    uint32_t             *profilingStackSize;
> +    uint32_t             profilingStackMax;

JSRuntime is kinda sprawling so we've been incrementally grouping related members into classes and then just storing a single instance in the JSRuntime.  Since we have so many forms of "profiling", perhaps we could include SPS in the name (say, class SPSProfiler) and create a new file (vm/SPSProfiler.h) that includes the SPSProfiler definition, it's members, and any related helpers, and a big honkin' comment at the top (say, like String.h, Stack.h, jsscope.h...) that explains overall what you are doing here: maintaining a runtime stack of what is active that can be used by the mozilla SPS profiler (perhaps the url https://developer.mozilla.org/en/Performance/Profiling_with_the_Built-in_Profiler).

Also, in the destructor, could you assert that stackSize is zero?

@@ +718,5 @@
> +        js::ProfileStringSet::AddPtr s = profileStrings.lookupForAdd(script);
> +        if (s)
> +            return s->string;
> +        const char *str = js::ProfileStringify(script, fun);
> +        if (str == NULL) return NULL;

'return NULL' on newline

@@ +719,5 @@
> +        if (s)
> +            return s->string;
> +        const char *str = js::ProfileStringify(script, fun);
> +        if (str == NULL) return NULL;
> +        js::ProfileString entry = {script, str};

We generally avoid aggregate initialization as much as possible; could you give ProfileString a binary constructor?

@@ +721,5 @@
> +        const char *str = js::ProfileStringify(script, fun);
> +        if (str == NULL) return NULL;
> +        js::ProfileString entry = {script, str};
> +        if (!profileStrings.add(s, entry)) {
> +            free((void*) str);

IIRC, We Are Never To Use malloc()/free().  Apparently js_malloc/js_free is the way to go.

@@ +729,5 @@
> +    }
> +
> +    inline void removeProfileString(JSScript *script) {
> +        if (!profileStrings.initialized())
> +            return;

How about JS_ASSERT it is initialized?

@@ +731,5 @@
> +    inline void removeProfileString(JSScript *script) {
> +        if (!profileStrings.initialized())
> +            return;
> +        js::ProfileStringSet::Ptr entry = profileStrings.lookup(script);
> +        if (entry) {

if (ProfileStringSet::Ptr entry = ...) {
  ...
}

@@ +734,5 @@
> +        js::ProfileStringSet::Ptr entry = profileStrings.lookup(script);
> +        if (entry) {
> +            const char *tofree = entry->string;
> +            profileStrings.remove(entry);
> +            free((void*) tofree);

ditto on free()

::: js/src/jsfriendapi.h
@@ +528,5 @@
> + * A call stack can be specified to the JS engine such that all JS entry/exits
> + * to functions push/pop an entry to/from the specified stack. The stack can be
> + * thought of as this prototype:
> + *
> + *    struct { char *name; void *sp; } stack[max];

I'd like it if we could statically check that the caller passes a proper two-word struct.  Perhaps you could (1) move the definition of the entry here, (2) include this type in the signature and (3) JS_STATIC_ASSERT at the callsite that sizeof(*stack) == sizeof(js::ProfileEntry).

@@ +533,5 @@
> + *
> + * Each entry in the specified stack will be filled in on a push. The stack
> + * pointer pushed for all js function calls is NULL, but each group of js
> + * function calls are preceded with an entry named 'js::RunScript' with the
> + * relevant stack pointer. Hence, If an entry on the stack has a NULL stack

s/If/if/

::: js/src/jsgc.cpp
@@ +4765,5 @@
>      rt->scriptAndCountsVector = vec;
>  }
>  
>  JS_FRIEND_API(void)
> +SetRuntimeProfilingStack(JSRuntime *rt, void *stack, uint32_t *size,

could you move this def to jsfriendapi.cpp?

::: js/src/jsprobes.h
@@ +396,5 @@
>  #endif
>  
> +    JS_ASSERT(fp->hasPrologueCompleted());
> +    JSRuntime *rt = cx->runtime;
> +    if (rt->profilingStack != NULL) {

Could the new SPSProfiler struct keep its members private and add accessors like isActive() ?

@@ +403,5 @@
> +            ok = false;
> +        } else if (*rt->profilingStackSize < rt->profilingStackMax) {
> +            /* The anchor ensures that the compiler won't reorder the increment
> +               below before this store into memory */
> +            JS::Anchor<JSScript *> anchor(script);

I don't think Anchor is the right hammer for this nail.  While 'volatile' has almost no meaning, the definition of a program via the C spec is its I/O and reads/writes to volatile memory so I think we can ensure proper ordering if you include 'volatile' on all the relevant fields of profilingStack as well as profilingStackSize.  (If that doesn't work, let's talk about it :)

@@ +438,5 @@
> +        uint32_t *size = rt->profilingStackSize;
> +        (*size)--;
> +        JS_ASSERT(*(int*)size >= 0);
> +#ifdef DEBUG
> +        /* Sanity check to make sure push/pop balanced */

good idea!

::: js/src/jsscript.cpp
@@ +1452,5 @@
>      // JSScript::Create(), but not yet finished initializing it with
>      // fullyInitFromEmitter() or fullyInitTrivial().
>  
>      CallDestroyScriptHook(fop, this);
> +    fop->runtime()->removeProfileString(this);

Does SPS make a copy (and thus, since scripts are not finalized while they are active on the stack, we can know that noone else is pointing to this script's string?)

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +1953,5 @@
>  #endif
> +#ifdef MOZ_ENABLE_PROFILER_SPS
> +    ProfileStack *stack = mozilla_profile_stack();
> +    if (stack != NULL)
> +        stack->sampleRuntime(mJSRuntime);

if (ProfileStack *stack = ...)
    stack->sampleRuntime(mJSRuntime);
Attachment #637649 - Flags: review?(luke)
Comment on attachment 637649 [details] [diff] [review]
patch

Review of attachment 637649 [details] [diff] [review]:
-----------------------------------------------------------------

One overall comment: I think the nightly profiling builds pass --enable-profiling (double check with BenWa first) which #defines MOZ_PROFILING which would allow us to #ifdef off the dynamic profiling checks in the Probes and RunScript.

::: js/src/jsinterp.cpp
@@ +1600,5 @@
>      {
> +        if (regs.fp()->hasPrologueCompleted() || regs.fp()->isYielding()) {
> +            Probes::exitJSFun(cx, script->function(), script, regs.fp());
> +            if (cx->compartment->debugMode())
> +                interpReturnOK = ScriptDebugEpilogue(cx, regs.fp(), interpReturnOK);

Perhaps it would be best to keep the ScriptDebugEpilogue as it was before, outside the branch.  In the future, I think we should unify the two things.

::: js/src/vm/Stack.cpp
@@ -265,5 @@
>              return false;
>          functionThis() = ObjectValue(*obj);
>      }
>  
> -    Probes::enterJSFun(cx, fun(), script());

It seems best to keep Probes::enterJSFun in StackFrame::prologue if at all possible and avoid adding it in all the other places: otherwise we are just asking to regress the next time we tweak how calls happen (which is not uncommon).  Perhaps you could write:

Probes::enterJSFun(cx, fun(), script());
flags_ |= HAS_CALLED_ENTRY_PROBE;

and then in the epilogue write:

if (flags_ & HAS_CALLED_ENTRY_PROBE)
    Probes::exitJSFun(cx, fun(), script());

?

::: js/src/vm/Stack.h
@@ +898,5 @@
>          flags_ |= HAS_HOOK_DATA;
>      }
>  
> +    bool hasInformedEntry() {
> +        return !!(flags_ & INFORMED_ENTERED);

Perhaps we could make this profile-specific and call it hasCalledEntryProbe?

@@ +911,5 @@
> +    }
> +
> +    void setPrologueCompleted() {
> +        JS_ASSERT(!hasPrologueCompleted());
> +        flags_ |= PROLOGUE_COMPLETED;

I can't help but think that PROLOGUE_COMPLETED is redundant; it seems like all we'd need to know is when we've called Probes::enterJSFun on the frame.
(In reply to Luke Wagner [:luke] from comment #41)
> Comment on attachment 637649 [details] [diff] [review]
> patch
> 
> Review of attachment 637649 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> One overall comment: I think the nightly profiling builds pass
> --enable-profiling (double check with BenWa first) which #defines
> MOZ_PROFILING which would allow us to #ifdef off the dynamic profiling
> checks in the Probes and RunScript.

Nightly profiling build do pass that option which let the profiler work with additional features (full native unwinding). However release builds based on Gecko 16 will also supporting profiling without full native unwinding. So if the JS engine responds to installing/uninstalling probes then we will be able to profile JS on release builds in Gecko 16. Of course all of this depends on how expensive it is to keep these checks.
(In reply to Benoit Girard (:BenWa) from comment #42)
> However release builds based on
> Gecko 16 will also supporting profiling without full native unwinding. So if
> the JS engine responds to installing/uninstalling probes then we will be
> able to profile JS on release builds in Gecko 16.

Hot hot hot!  Thanks for explaining.  Alex: disregard my MOZ_PROFILING request, then; the dynamic overhead is in C++ that shouldn't be hot enough to really notice.
(In reply to Luke Wagner [:luke] from comment #40)

> Also, in the destructor, could you assert that stackSize is zero?

Sadly no because the profilingStack isn't only connected to JS, but also the rest of the engine. It's always zero when running from the shell, but it's never 0 when running in the browser.

> > +    inline void removeProfileString(JSScript *script) {
> > +        if (!profileStrings.initialized())
> > +            return;
> 
> How about JS_ASSERT it is initialized?

I want to, but dynamically turning on and off makes it impossible to assert.

> >  JS_FRIEND_API(void)
> > +SetRuntimeProfilingStack(JSRuntime *rt, void *stack, uint32_t *size,
> 
> could you move this def to jsfriendapi.cpp?

I would have, but I needed the ReleaseAllJITCode function which is only defined in this file. Should I make that a function visible to others so I can put this in jsfriendapi.cpp?

> Does SPS make a copy (and thus, since scripts are not finalized while they
> are active on the stack, we can know that noone else is pointing to this
> script's string?)

It does indeed. That was recently added to SPS


(In reply to Luke Wagner [:luke] from comment #41)

> >      {
> > +        if (regs.fp()->hasPrologueCompleted() || regs.fp()->isYielding()) {
> > +            Probes::exitJSFun(cx, script->function(), script, regs.fp());
> > +            if (cx->compartment->debugMode())
> > +                interpReturnOK = ScriptDebugEpilogue(cx, regs.fp(), interpReturnOK);
> 
> Perhaps it would be best to keep the ScriptDebugEpilogue as it was before,
> outside the branch.  In the future, I think we should unify the two things.

I'm not sure if debug mode changes the runtime of scripts, but if regs.fp()->hasCompleted() returns false, then it's impossible for ScriptDebugPrologue to have completed. For now though, I'll restore to where it was.

> 
> ::: js/src/vm/Stack.cpp
> @@ -265,5 @@
> >              return false;
> >          functionThis() = ObjectValue(*obj);
> >      }
> >  
> > -    Probes::enterJSFun(cx, fun(), script());
> 
> It seems best to keep Probes::enterJSFun in StackFrame::prologue if at all
> possible and avoid adding it in all the other places: otherwise we are just
> asking to regress the next time we tweak how calls happen (which is not
> uncommon).  Perhaps you could write:
> 
> Probes::enterJSFun(cx, fun(), script());
> flags_ |= HAS_CALLED_ENTRY_PROBE;
> 
> and then in the epilogue write:
> 
> if (flags_ & HAS_CALLED_ENTRY_PROBE)
>     Probes::exitJSFun(cx, fun(), script());
> 

I wanted to do that, but generators threw a wrench into my plans. It turns out that when using generators the StackFrame::prologue() method is invoked, but then the function leaves without calling StackFrame::epilogue() because it can eventually re-enter the frame. Worse even, each time the function comes back into the StackFrame instance it won't call prologue().

There were a couple spots where the prologue was conditionally run, and I basically always ran enterJSFun right below it regardless of whether the prologue ran.

I do think that a regression problem is a big thing here. I found the assertions in enter/exit to be incredibly helpful in tracking down problems, but they're not much use unless there's a profiling stack installed. When I was testing I had a static stack which was installed to the one runtime created. This worked for debugging purposes, but wouldn't want to be anywhere in release at all. Perhaps this could be integrated into the shell to install a stack whenever it starts when running tests?




Other comments shall all be addressed, thanks for the review!
(In reply to Alex Crichton from comment #44)
> I would have, but I needed the ReleaseAllJITCode function which is only
> defined in this file.

Well, for that, I'd take of the 'static' in front of ReleaseAllJITCode, put an extern decl in jsgc.h.

> regs.fp()->hasCompleted() returns false, then it's impossible for
> ScriptDebugPrologue to have completed. For now though, I'll restore to where
> it was.

You're right, but ScriptDebug* are meant to handle this.  If they want to stop one day, then probably we'd add a HAS_CALLED_DEBUG_PROLOGUE bit or something to stack frame.

> I wanted to do that, but generators threw a wrench into my plans.

You're right about generators being a problematic case.  For that, though, it seems like we can still keep the general case using prologue/epilogue and add a specific case for generators: I think the minimal change is tweak the Interpret prologue to be:

  if (fp->isGeneratorFrame()) {
    if (!fp->prologue(...))
       goto exit;
  } else {
    Probes::enterJSFun(...);
  }

and then, similarly for the Interpret exit path:

  if (regs.fp()->isYielding())
    Probes::exitJSFun(...);
  else
    regs.fp()->epilogue(cx);

Already the jits keep away from generators and, when we do tackle generators, we'll undoubtedly need to factor our these two blogs into some "generator resume" and "generator yield" StackFrame functions.
Attached patch patch2 (obsolete) — Splinter Review
Attachment #637649 - Attachment is obsolete: true
Attachment #638514 - Flags: review?(luke)
Comment on attachment 638514 [details] [diff] [review]
patch2

Review of attachment 638514 [details] [diff] [review]:
-----------------------------------------------------------------

Great work!  The comments from the previous patches were addressed very well, thanks.  I'm almost ready to r+, assuming positive answers to some of the questions below.

Even though this profiling should be dynamically off in all the average release browser, it would be good to push this to try server to confirm.  Do you have access to try?

Second, could you file a bug to add a shell function that turns on the profiling so we could let the fuzzers go to town?  (cc :gkw and :decoder)  That should flush out any remaining bits.

::: js/src/jsfriendapi.h
@@ +541,5 @@
> +     * If the size increment were re-ordered before the store of the string,
> +     * then if sampling occurred there would be a bogus entry on the stack.
> +     */
> +    volatile const char *string;
> +    volatile void *sp;

As is 'volatile' describes the memory location of the chars when I think we want the memory location of the two pointers, so I think you want:

  const char * volatile string;
  void * volatile sp;

::: js/src/jsprobes.h
@@ +91,5 @@
>   */
>  bool wantNativeAddressInfo(JSContext *);
>  
>  /* Entering a JS function */
> +bool enterJSFun(JSContext *, JSFunction *, JSScript *, StackFrame *fp);

Given that script can be NULL and this is always JS, could this be Probes::enterScript/exitScript?

::: js/src/methodjit/Compiler.cpp
@@ +3759,5 @@
>      }
>  
>      if (a != outer) {
> +        JS_ASSERT(!debugMode());
> +        if (Probes::callTrackingActive(cx) || 1) {

'|| 1' probably needs to go, here and below

@@ +3764,5 @@
> +            prepareStubCall(Uses(0));
> +            INLINE_STUBCALL(stubs::ScriptProbeOnlyEpilogue, REJOIN_RESUME);
> +        } else {
> +            profilingPopHelper();
> +        }

\n after this line

@@ +3928,5 @@
> +        return;
> +    /* If allocation fails, make sure no PopHelper() is emitted */
> +    const char *str = p->profileString(cx, script->function(), script);
> +    if (str == NULL) {
> +        profilePushFailed++;

For this, just give profilingPushHelper a CompileStatus return error, return Compile_Error (and make sure to fix up the callers).  This should allow you to remove profilePushFailed.

@@ +3934,5 @@
> +    }
> +    /* Check if there's still space on the stack */
> +    RegisterID size = frame.allocReg();
> +    RegisterID base = frame.allocReg();
> +    masm.move(ImmPtr((void*) inlining()), size);

I think this is dead

::: js/src/methodjit/Retcon.cpp
@@ +269,5 @@
>      }
> +    if (fp->hasCalledEntryProbe())
> +        innerfp->setCalledEntryProbe();
> +    if (fp->hasPushedSPSFrame())
> +        innerfp->setPushedSPSFrame();

IIUC, mjit::Compiler::inlineScriptedFunction calls generateMethod directly (thereby skipping generatePrologue) and so it would seem to me that we would not have pushed an entry for these expanded frames.  Is this still necessary?  If something breaks when you take it out, perhaps we could look at it together?

(Also, if it was necessary (the inlined frames were getting pushed somehow), we'd have to do it for the N expanded frames, not just the innerfp; probably the best place to do this would be in StackFrame::initInlineFrame.)

::: js/src/vm/SPSProfiler.cpp
@@ +103,5 @@
> +}
> +
> +/*
> + * Serialize this entry into its string representation. This function can
> + * fail, and should be dealt with accordingly

The caller of allocProfileString assumes that it can't cause a GC (b/c that would allow script finalization which would allow HT resizing, which would invalidate the AddPtr).  I think that is valid since allocProfileString only does malloc which we ensure does not cause GC.  Could you add a comment stating this assumed contract?

@@ +106,5 @@
> + * Serialize this entry into its string representation. This function can
> + * fail, and should be dealt with accordingly
> + */
> +const char*
> +SPSProfiler::AllocProfileString(JSContext *cx, JSScript *script, JSFunction *function)

Could you change 'function' to 'maybeFunction' in this member and all its transitive callers?

@@ +108,5 @@
> + */
> +const char*
> +SPSProfiler::AllocProfileString(JSContext *cx, JSScript *script, JSFunction *function)
> +{
> +    StringBuffer buf(cx);

Good idea

@@ +130,5 @@
> +    if (hasAtom && !buf.append(")"))
> +        return NULL;
> +
> +    size_t len = buf.length();
> +    char *cstr = (char*) js_malloc(len);

I don't think 'len' will include the null char (finishString adds it at the end).

::: js/src/vm/SPSProfiler.h
@@ +18,5 @@
> + * https://developer.mozilla.org/en/Performance/Profiling_with_the_Built-in_Profiler
> + *
> + * The SPS profiler (found in tools/profiler) is an implementation of a profiler
> + * which has the ability to walk the C++ stack as well as use instrumentation to
> + * gather information. When dealing with JS, however, it's needs integration

s/it's/SPS/

@@ +22,5 @@
> + * gather information. When dealing with JS, however, it's needs integration
> + * with the engine because otherwise it is very difficult to figure out what
> + * javascript is executing.
> + *
> + * The chosen method of integration with SPS is a form of instrumentation. With

Perhaps s/chosen/current/

s/. With this instrumentation,/:/

@@ +26,5 @@
> + * The chosen method of integration with SPS is a form of instrumentation. With
> + * this instrumentation, every time a JS function is entered, a bit of
> + * information is pushed onto a stack that SPS owns and maintains. This
> + * information is then popped at the end of the JS function. SPS informs the JS
> + * engine of this stack to instrument at runtime, and it can by turned on/off

s/to instrument//

@@ +29,5 @@
> + * information is then popped at the end of the JS function. SPS informs the JS
> + * engine of this stack to instrument at runtime, and it can by turned on/off
> + * dynamically.
> + *
> + * When SPS talks to the JS engine, it tells it three parameters: a stack, a

Perhaps un-anthropomorphize to: "The SPS stack has three parameters: a base pointer, a size, and a maximum size."

@@ +56,5 @@
> + * maintained, and ideally at a very low overhead cost. For this reason, the JS
> + * engine maintains a mapping of all known profile strings. These strings are
> + * keyed in lookup by a JSScript*, but are serialized with a JSFunction*,
> + * JSScript* pair. A JSScript* outlives the its corresponding ProfileString and
> + * will destroy it when the JSScript* is destroyed.

Grammar error.  How about just "A ProfileString is freed when the corresponding JSScript is finalized."

@@ +111,5 @@
> +        ProfileStringSet;
> +
> +class SPSEntryMarker;
> +
> +class SPSProfiler {

{ on newline please

@@ +119,5 @@
> +    ProfileEntry         *stack_;
> +    uint32_t             *size_;
> +    uint32_t             max_;
> +
> +    static const char *AllocProfileString(JSContext *cx, JSScript *script,

allocProfileString

@@ +134,5 @@
> +    bool enter(JSContext *cx, JSFunction *function, JSScript *script);
> +    void exit(JSContext *cx, JSFunction *function, JSScript *script);
> +    void setProfilingStack(ProfileEntry *stack, uint32_t *size, uint32_t max);
> +    const char *profileString(JSContext *cx, JSFunction *function, JSScript *script);
> +    void removeProfileString(JSScript *script);

How about onScriptFinalized since that is, from an external perspective, SPSProfiler is really asking to know?

@@ +146,5 @@
> + * This class is used in RunScript() to push the marker onto the sampling stack
> + * that we're about to enter JS function calls. This is the only time in which a
> + * valid stack pointer is pushed to the sampling stack.
> + */
> +class SPSEntryMarker {

{ on newline please

::: js/src/vm/Stack.cpp
@@ +236,5 @@
>                  return false;
>              pushOnScopeChain(*callobj);
>              flags_ |= HAS_CALL_OBJ;
>          }
> +        Probes::enterJSFun(cx, script()->function(), script(), this);

I think script()->function() should be NULL here and in the isGlobalFrame case; can you just pass NULL?

::: js/src/vm/Stack.h
@@ +359,5 @@
>          /* Debugger state */
> +        PREV_UP_TO_DATE    =   0x400000,  /* see DebugScopes::updateLiveScopes */
> +
> +        /* Used in tracking calls and profiling (see vm/SPSProfiler.cpp) */
> +        HAS_CALLED_ENTRY_PROBE =  0x800000, /* Probes::enterJSFun was called */

I'm still not certain we need this flag.  Can we just call Probes::exitJSFun unconditionally in StackFrame::epilogue()? hasPushedSPSFrame should guard us where we need it and the other probes I don't think have such exacting requirements (and if they do, then it seems like they'd add their own specific flag).
Blocks: 770663
(In reply to Luke Wagner [:luke] from comment #47)

> Even though this profiling should be dynamically off in all the average
> release browser, it would be good to push this to try server to confirm.  Do
> you have access to try?

I just pushed to try, and there's failing tests and builds on other platforms. Will post an updated patch once I have those all fixed.

> Second, could you file a bug to add a shell function that turns on the
> profiling so we could let the fuzzers go to town?

bug 770663

> As is 'volatile' describes the memory location of the chars when I think we
> want the memory location of the two pointers, so I think you want:

I had no idea the location of 'volatile' affected how it was treated. I apparently need to read up on this.

> '|| 1' probably needs to go, here and below

This needs to be refactored for the fuzzing bug because otherwise no assertions would run when jit code exits. For this it will be removed, however.


Everything else should have been addressed.
Attached patch patch3Splinter Review
Fixes from try:

* Don't assert string map size is 0 on exit, this was tripped a fair amount and
  it's just as easy to just free all remaining entries.
* Fix assertion in SPSStackEntry because sometimes it would be accessing indices
  outside the bounds of the stack.
* Fix builds on windows because max() is a macro
* Fix builds on QT because jsfriendapi.h was now included elsewhere and QT
  redefines the word 'slots' to something which something uses as a variable.
* Fixed debug JIT behavior where debug's prologue was called before constructThis(),
  but it should have been called after

I'm going to push to try again to weed out any stragglers if so, but it takes forever to find them, so I thought I'd post again.
Attachment #638514 - Attachment is obsolete: true
Attachment #638514 - Flags: review?(luke)
Attachment #639490 - Flags: review?(luke)
Apparently results from try aren't visible for more than 24 hrs, but from what I could tell there weren't errors directly related to this, they were all sporadic things here and there and I couldn't see any references to these assertions or segfaults in the logs.
(In reply to Alex Crichton from comment #50)
> Apparently results from try aren't visible for more than 24 hrs, but from
> what I could tell there weren't errors directly related to this, they were
> all sporadic things here and there and I couldn't see any references to
> these assertions or segfaults in the logs.

It should go back much further. Check the email you got from your submission and you should get a link in the form of:
https://tbpl.mozilla.org/?tree=Try&rev=<CHANGESET>
I guess it's just a bug with tinderbox then?  When I go to those URLs it loads until 9% and then just hits ERROR: https://tbpl.mozilla.org/?tree=Try&rev=388e942745c2
That is a mercurial issue (bug 767657) that they are trying to work on.  What does work, though, just just watching your push from the general view (?tree=Try).
Comment on attachment 639490 [details] [diff] [review]
patch3

Awesome work, r+.
Attachment #639490 - Flags: review?(luke) → review+
Would you like for me to land it for you?
Sure, you could also get bug 770663 while you're at it
Excellent work here. Now that this is landed I'll press harder for review on bug  766579 but until then users will have to choose either JS or Stackwalking feature. I'll update the extension to make this clear.

We should do a bit of testing on all platforms and if there are no issues we should write up a blog post and/or announce this. If you're not interesting in doing this I can take care of it.
Benoit, Alex: I was just playing with this with the profiling extension and I see what are, I believe, blank labels for all the JS.  Is that a bug or just work needed on the extension?
(In reply to Luke Wagner [:luke] from comment #58)
> Benoit, Alex: I was just playing with this with the profiling extension and
> I see what are, I believe, blank labels for all the JS.  Is that a bug or
> just work needed on the extension?

I've just pushed an update to the extension now so go to Tools->Addons and update it. The profiler will now disable itself during Private Browsing as well now that we collect script URLs that can identify browsing patterns.

You'll want to stop the profiler, turn off 'Stackwalking' and turn on 'JS Profiling' and restart it.
https://hg.mozilla.org/mozilla-central/rev/69225f71e538
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Depends on: 772078
Blocks: 788495
No longer blocks: 788495
Depends on: 788495
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: