Closed Bug 645887 Opened 13 years ago Closed 13 years ago

Generic interface for recording JIT-generated code

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla10

People

(Reporter: sfink, Assigned: sfink)

References

(Blocks 3 open bugs)

Details

Attachments

(2 files, 5 obsolete files)

The JS engine should report method JIT-generated code ranges to various tools:
 - gdb
 - breakpad
 - opagent
 - valgrind

I'll file a separate bug for tracer-generated JIT code.
Just the configure.in changes. The only thing needed is a default setting for the granularity (detail) level that will be reported for JIT code. This should probably be expanded in the future to include various ICs.
Tack a JITWatcher object onto JM JIT scripts. For now, it doesn't do anything with the information given. Useful implementations will go into separate bugs.

The interface here is just wide enough to handle the oprofile and gdb consumers. Hopefully, it will be enough for other consumers, but it'll probably need to be expanded in the future.

dmandelin, I know you had something similar for the profiler stuff you did recently. I haven't looked to see how well it fits your use case.
Attachment #522525 - Flags: review?(dmandelin)
Blocks: 599499, 642320
Attachment #522524 - Flags: review?(dmandelin)
Oops, I had an "optimization" in there that resulted in all code being stuffed into line 1 (unless you happened to be compiling into single-step mode.)
Attachment #522525 - Attachment is obsolete: true
Attachment #522525 - Flags: review?(dmandelin)
Attachment #522730 - Flags: review?(dmandelin)
Attachment #522524 - Flags: review?(dmandelin) → review+
Comment on attachment 522730 [details] [diff] [review]
Framework for reporting JIT code ranges

Initial comments/questions:

0. This would be great for the profiler!

1. The per-line information is redundant with the nmap. Are you planning to replace the nmap with something computed by this system?

2. I think "jscodemap"/js::NativeCodeMap would be better names. (I'm assuming that this interface will be used primarily to map addresses to the JS code that generated them.)

3. I think the interface needs more design iteration. First, there should be a comment for each function saying clearly how it is to be called. I think the .cpp file is the usual place we would put that. Second, the set of functions here isn't the "obvious" set and seems like it might tied to our compilation process or some detail like that. Finally, I'm not sure this can be designed "right" without having the implementation that builds the code map (maps?) in place--we should probably have at least one application in place as we land the first one.

On the interface, to me the "obvious" interface would be:

  noteScript(JSContext *cx, JSScript *script, void *code, size_t size);
  noteLine(JSContext *cx, JSScript *script, unsigned lineno, void *code, size_t size);
  noteOp(JSContext *cx, JSScript *script, unsigned offset, void *code, size_t size);
  discardScript(JSContext *cx, JSScript *script);

But that won't actually work. (1) We compile scripts separately for the ctor and non-ctor case, so we'd have to add that flag to the key, or separate those in some other way. (2) We don't know the actual addresses of the code lines or ops until we are done compiling. There seem to be a lot of ways of solving that problem.

If you don't want to treat all these issues now, it seems fine to start just noting the script code, and fill in the other stuff later. Do remember that we generate PIC stubs later, and those should be accounted for as well.

4. I think the jitwatcher/CodeMap should go on the compartment. JM scripts are managed that way, and also that would mean the JITScript doesn't need to keep the watcher pointer. Apparently we will soon be in a place where it could go on JSRuntime, but for now compartment seems like the right place.
Attachment #522730 - Flags: review?(dmandelin)
(In reply to comment #4)
> Comment on attachment 522730 [details] [diff] [review]
> Framework for reporting JIT code ranges
> 
> Initial comments/questions:
> 
> 0. This would be great for the profiler!
> 
> 1. The per-line information is redundant with the nmap. Are you planning to
> replace the nmap with something computed by this system?

Yes. No. Wait!

The clients of this interface often need to store the per-line information in a consumer-specific way. For example, opagent (oprofile's JIT agent) has a particular format that it accepts line number information in. The gdb/jit integration puts line number info into a compressed ELF object section.

So we could either feed the nmap to this interface, or we could generate the nmap via this interface. Or leave them redundant. But I need to look at how nmap is generated and used before I can talk intelligibly about it.

> 2. I think "jscodemap"/js::NativeCodeMap would be better names. (I'm assuming
> that this interface will be used primarily to map addresses to the JS code that
> generated them.)

Better names than nmap, you mean? Sorry, I'm a little confused. Oh! Or do you mean instead of JITWatcher? I guess I was thinking of this as something that fed on JIT code-creation events, not necessarily as a data structure that holds the information. For example, for valgrind we would just invoke the magic nop sequences to inform valgrind that a range of memory is JIT code, and then forget all about it.

> 3. I think the interface needs more design iteration. First, there should be a
> comment for each function saying clearly how it is to be called. I think the
> .cpp file is the usual place we would put that. Second, the set of functions
> here isn't the "obvious" set and seems like it might tied to our compilation
> process or some detail like that. Finally, I'm not sure this can be designed
> "right" without having the implementation that builds the code map (maps?) in
> place--we should probably have at least one application in place as we land the
> first one.

Oh, sorry. I actually implemented 2 specific clients first, adding just what I needed for them, and then went backwards and separated out the general interface from the consumers. See bug 642320 and bug 599499 for the currently-implemented consumers.

> On the interface, to me the "obvious" interface would be:
> 
>   noteScript(JSContext *cx, JSScript *script, void *code, size_t size);
>   noteLine(JSContext *cx, JSScript *script, unsigned lineno, void *code, size_t
> size);
>   noteOp(JSContext *cx, JSScript *script, unsigned offset, void *code, size_t
> size);
>   discardScript(JSContext *cx, JSScript *script);

That naming style does look better, although I'll have to see how well it holds up to what I ended up needing for the 2 example clients.

> But that won't actually work. (1) We compile scripts separately for the ctor
> and non-ctor case, so we'd have to add that flag to the key, or separate those
> in some other way. (2) We don't know the actual addresses of the code lines or
> ops until we are done compiling. There seem to be a lot of ways of solving that
> problem.

Yes, #2 for gdb/jit was mostly how I arrived at these entry points. I think I'm oblivious to #1 right now, since nothing needs to be uniquely keyed yet.

> If you don't want to treat all these issues now, it seems fine to start just
> noting the script code, and fill in the other stuff later. Do remember that we
> generate PIC stubs later, and those should be accounted for as well.

Yes, I haven't dug into those yet. I've mostly just been getting the gdb/jit stuff off the ground, and leaving various "remember the ICs!" comments lying around.

> 4. I think the jitwatcher/CodeMap should go on the compartment. JM scripts are
> managed that way, and also that would mean the JITScript doesn't need to keep
> the watcher pointer. Apparently we will soon be in a place where it could go on
> JSRuntime, but for now compartment seems like the right place.

Hm. The jitwatcher often has some consumer-specific data structures that are associated with a JITted method -- for example, for gdb/jit there's a file descriptor for spewing an ELF object into, and there's a different file descriptor for each JIT object. So there needs to be *something* tacked onto the JITScript. (I originally had it on the JSScript, but lifetime management was uglier plus the ctor vs non-ctor cases were harder to deal with.)

I can go through a round of commenting and possibly renaming, but I'll need your feedback on the above before I can give you a real 2nd iteration.

Thanks!
For what it's worth, there's no need to do an AC_SUBST on a variable unless you want to use it in a Makefile. The AC_DEFINE_UNQUOTED is enough if you're just going to use it in C++ code.
Since there are different consumers of the information passed through this interface, should the member functions be virtual, so that we can switch implementations at run-time?
(In reply to comment #6)
> For what it's worth, there's no need to do an AC_SUBST on a variable unless you
> want to use it in a Makefile. The AC_DEFINE_UNQUOTED is enough if you're just
> going to use it in C++ code.

Why *did* I AC_SUBST JS_DEFAULT_JITREPORT_GRANULARITY? I have no idea what I was thinking.

(In reply to comment #7)
> Since there are different consumers of the information passed through this
> interface, should the member functions be virtual, so that we can switch
> implementations at run-time?

I did it that way first, and perhaps that's the right way. But because this is really an observer pattern, it means making an abstract base class JITWatcher, as you say, but then also maintaining a set of JITWatchers and looping through them on every event of interest. And doing a loop of virtual function calls for every single opcode compiled felt pretty heavyweight. (I know, premature optimization.) A single non-virtual function call that inlined the handling doesn't seem awful to do for every op.

Perhaps the best thing would be to go back to that original implementation, but throw out the per-op and per-line invocations. Instead, construct a single mapping table like nmap with all needed information and pass it into a noteMethodCompileComplete() or whatever.
(CC'ing dmandelin because I realize he never saw my response to his review. No action required, though, because I'm going to redo this nowish.)

Two other consumers that I'm working on: profiler (bug 642054) and vtune (gal put me in touch with an Intel guy who is interested in getting this working again.)
Status: NEW → ASSIGNED
Assignee: general → sphink
Blocks: 675093
Blocks: 675096
Blocks: 675098
This is the second iteration.

I made an attempt to merge this information into the nmap, but then reverted it because none of my consumer applications can use it in that form anyway, and it bloats up scripts.

Instead, I'm cramming everything into the bytecode analysis information, passing it to all consumers, and then throwing it out. Some consumers may then keep their own redundant copies in their own formats, but I don't see a way around that.

This API is working for me right now with both oprofile and GDB consumers, but only for consuming the main mjitted code. IC code is being reported, but it's much harder to deal with in the consumers (for different, consumer-specific reasons.) The API is a little annoying in that it gives a notification for each IC code block, but only gives a single notification when a whole range of ICs are released. If it turns out that I have to reimplement the scanning of a memory block to find all contained IC code regions, I may hoist that up into the general API as well. But as I said, I haven't implemented the IC part in any of the backends yet.

The patch uses an array of JIT watchers with virtual methods instead of my previous everything-in-the-header approach, because I now have a single notification point after the mjitted code is completely compiled.
Attachment #522730 - Attachment is obsolete: true
Attachment #559015 - Flags: review?(dmandelin)
Attachment #559015 - Flags: review?(dmandelin) → review?(bhackett1024)
Tying things to a runtime wasn't really correct, so I switched to JS_CallOnce from bug 686230 to allow JIT watchers to be associated with the entire process (if they want to be, but so far they all do.)
Attachment #559015 - Attachment is obsolete: true
Attachment #559015 - Flags: review?(bhackett1024)
Attachment #559787 - Flags: review?(bhackett1024)
Comment on attachment 559015 [details] [diff] [review]
Framework for reporting JIT code ranges

Review of attachment 559015 [details] [diff] [review]:
-----------------------------------------------------------------

Can you rebase and post an updated patch?  This patch reflects how things looked before the TI merge, and many of the structural changes and assumptions you've made here change in the presence of frame inlining.

::: js/src/jsanalyze.h
@@ +87,5 @@
> +    /* Line number that this instruction belongs to */
> +    uint32 lineno;
> +
> +    /* Offset of the native code for the beginning of this bytecode */
> +    off_t ncode;

Where do these fields (and the two bitfields above) get used?  If we know ahead of time whether some tool needs this information, then we should only compute/store it in such circumstances (in e.g. an array attached to the analysis or JITScript).

Information about native code should not be attached to ScriptAnalysis (the safePoint needs a better abstraction to distance itself from the mjit and describe its underlying meaning).   ->ncode and ->recordNativeAddr do not mean much when we can have multiple versions of the jitcode for a script at once (constructor/non-constructor versions, plus inlined frames).

::: js/src/methodjit/BaseCompiler.h
@@ +163,4 @@
>          masm.finalize(*this);
> +        JSC::CodeLocationLabel label = finalizeCodeAddendum();
> +        Probes::registerICCode(cx, jscr, cx->fp()->script(), cx->regs().pc,
> +                               label.executableAddress(), masm.size());

After rebasing on TI, this should use a VMFrame f and pass f.script() and f.pc() instead.  This indicates the innermost script/pc if we are in an inlined frame, whereas cx->fp()->script() and cx->regs().pc indicate the position of the outer call in such cases.
Attachment #559015 - Attachment is obsolete: false
Comment on attachment 559787 [details] [diff] [review]
Framework for reporting JIT code ranges

Just to be clear, this updated patch is *not* the rebase you requested; it has a different change. So don't bother reviewing it yet.

I could've sworn I had already rebased on top of TI, dammit... working on that now.
Attachment #559787 - Flags: review?(bhackett1024)
This doesn't quite work for the line-level information yet, and I haven't properly tested even the function-level info, but I'm wondering if you have time to take a look at the revised approach in this patch.

I haven't figured out yet what the consumers will need now that we have inlined frames -- is it better to have overlapping code ranges, or should the reported ranges be sliced up into contiguous subsequences, or somehow provide enough information for the consumer to infer either one? But I'll ignore that for now.

This patch reverts most of the modifications to the bytecode analysis; it looked to me like that was generated during the compile of a single script and discarded at the end, so it was a tempting place to wedge things. I don't know if it was ever correct, but it clearly isn't now.

Instead, I create my stuff (conditionally) in the ActiveFrames so that I can record each inlined frame as belonging to its original script. Does that seem like it (could be made to) work?
Attachment #559015 - Attachment is obsolete: true
Attachment #559787 - Attachment is obsolete: true
Attachment #559806 - Flags: feedback?(bhackett1024)
I think that putting things in the ActiveFrame should work good for function-level granularity stuff.  For tracking bytecode-level (and line-level) granularity you can look into extending the PCLengthEntry structure.  When -D is used, an array of these is allocated for every bytecode in a compiled script, including those from inlined frames.  Right now the indexing for this array is brittle, let me know if you need a better interface for accessing the entries after compilation.
Ok, here's an updated version. This is currently working for me for reporting JIT info to GDB (modulo gdb bugs in handling the info.) It uses a combination of ActiveFrame and pcLengths. See bug 642320 for an example consumer (which is no longer straightforward, since it must deal with inline frame and the separate stub/main code for each.) I'll post it in a second. (It's not review-ready, but might serve as an example.)
Attachment #559806 - Attachment is obsolete: true
Attachment #559806 - Flags: feedback?(bhackett1024)
Attachment #563610 - Flags: review?(bhackett1024)
Comment on attachment 563610 [details] [diff] [review]
Framework for reporting JIT code ranges

Review of attachment 563610 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay.

::: js/src/assembler/jit/ExecutableAllocator.h
@@ +165,5 @@
>  };
>  
>  class ExecutableAllocator {
>      enum ProtectionSetting { Writable, Executable };
> +    void (*destroyCallback)(void* addr, size_t size);

This signature shows up in setDestroyCallback too, can you make it a typedef?

::: js/src/methodjit/Compiler.cpp
@@ +1509,5 @@
> +    }
> +
> +    bool isLineHeader(ptrdiff_t relpc) const {
> +        return (relpc == offset) ? lineHeader : false;
> +    }

It seems like relpc must equal offset, can you remove the parameter?

::: js/src/methodjit/PolyIC.h
@@ +372,5 @@
>      }
>      void purge(Repatcher &repatcher);
> +    LookupStatus attachTypedArray(VMFrame &f, JSContext *cx, JSObject *obj, int32 key);
> +    LookupStatus attachHoleStub(VMFrame &f, JSContext *cx, JSObject *obj, int32 key);
> +    LookupStatus update(VMFrame &f, JSContext *cx, const Value &objval, const Value &idval);

Can you remove the JSContext parameter from these functions?  The cx will equal f.cx
Attachment #563610 - Flags: review?(bhackett1024) → review+
(In reply to Brian Hackett from comment #17)
> Comment on attachment 563610 [details] [diff] [review] [diff] [details] [review]
> Framework for reporting JIT code ranges
> 
> Review of attachment 563610 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> Sorry for the delay.
> 
> ::: js/src/assembler/jit/ExecutableAllocator.h
> @@ +165,5 @@
> >  };
> >  
> >  class ExecutableAllocator {
> >      enum ProtectionSetting { Writable, Executable };
> > +    void (*destroyCallback)(void* addr, size_t size);
> 
> This signature shows up in setDestroyCallback too, can you make it a typedef?

Done.

> ::: js/src/methodjit/Compiler.cpp
> @@ +1509,5 @@
> > +    }
> > +
> > +    bool isLineHeader(ptrdiff_t relpc) const {
> > +        return (relpc == offset) ? lineHeader : false;
> > +    }
> 
> It seems like relpc must equal offset, can you remove the parameter?

Oh, right. It's a remnant from when it used to be called with other values. Ok, gone.

> ::: js/src/methodjit/PolyIC.h
> @@ +372,5 @@
> >      }
> >      void purge(Repatcher &repatcher);
> > +    LookupStatus attachTypedArray(VMFrame &f, JSContext *cx, JSObject *obj, int32 key);
> > +    LookupStatus attachHoleStub(VMFrame &f, JSContext *cx, JSObject *obj, int32 key);
> > +    LookupStatus update(VMFrame &f, JSContext *cx, const Value &objval, const Value &idval);
> 
> Can you remove the JSContext parameter from these functions?  The cx will
> equal f.cx

Ok.
I may remove at least one of the fields I added to ActiveFrame in a later patch, but I'll land this for now.

It turns out that to handle inlined frames properly, you end up needing to recompute at least some of the stored information. But it's a small amount of very temporary additional memory, and it enables some useful assertions.
https://hg.mozilla.org/mozilla-central/rev/218e0d94de72
https://hg.mozilla.org/mozilla-central/rev/159202c12132
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: