Update VTune integration (2017.1)

RESOLVED FIXED in Firefox 54



3 years ago
2 years ago


(Reporter: sstangl, Assigned: sstangl)


(Blocks 1 bug)

53 Branch
Dependency tree / graph

Firefox Tracking Flags

(firefox53 affected, firefox54 fixed)



(1 attachment, 3 obsolete attachments)

For Bug 1330539, VTune integration would be helpful to identify patterns that induce high CPI rates.

The new version of VTune also seemingly supports code unloading through an undocumented interface, so JIT memory allocation can function using normal paths.
Priority: -- → P1
Posted patch VTune Integration WIP 1 (obsolete) — Splinter Review
This is a WIP of Vtune integration. Engines are hooked up and unload events seem to be handled. ICs are still missing, and basic block/inlining support inside IonMonkey isn't hooked up. Also every compilation counts as a separate function for the moment.

That said, this may be already in a useful state, since module-level boundaries are helpful, and you can easily bring up an ASM view.

Bas, would you like to try profiling with this enabled? It should work if you apply the patch and then configure with --enable-vtune.

VTune supports attaching to a process, but unfortunately doesn't provide hooks for us to recognize that we have been attached to. So if jitcode is generated before VTune is attached, no event will ever trigger for VTune to be made aware of that code. So it's probably best to attach right before loading the target page.
Flags: needinfo?(bas)
Posted patch VTune Integration (obsolete) — Splinter Review
This patch enables high-level VTune support. Most things you'd want are present. The missing pieces include:

- Tracking ICs.
- Tracking MBasicBlocks.
- Tracking inlined functions separately.
- Notifying VTune when write barrier guards are toggled.

But the code as present is good enough for most use cases.

It looks like Nightly should pass --enable-vtune by default, so this would be available for the team without recompilation.

VTune handles the case of missing system .dll/.so, so no change if you don't have VTune installed.
Attachment #8830970 - Attachment is obsolete: true
Attachment #8831355 - Flags: review?(sphink)
Comment on attachment 8831355 [details] [diff] [review]
VTune Integration

Review of attachment 8831355 [details] [diff] [review]:

Whoa, this update was a lot cleaner than I expected!

::: js/src/irregexp/NativeRegExpMacroAssembler.cpp
@@ +510,5 @@
> +
> +        // The unloading event fires in RegExpShared::discardJitCode().
> +        int ok = iJIT_NotifyEvent(iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED_V2, (void*)&method);
> +
> +        // Assertions aren't reported in VTune, so they can make very misleading profiles.

Say what? Oh. Maybe

  // Assertions silently abort VTune profiling, making for very misleading profiles.

or something?

::: js/src/jit/Ion.cpp
@@ +850,5 @@
>      // Releasing the pool may free it. Horrible hack: if we are using perf
>      // integration, we don't want to reuse code addresses, so we just leak the
>      // memory instead.
>      if (!PerfEnabled())
>          pool_->release(headerSize_ + bufferSize_, CodeKind(kind_));

Heh. It'd be nice to spew out a warning when this is detected, but never mind for now.

::: js/src/jsscript.h
@@ +1535,5 @@
>      const char* filename() const { return scriptSource()->filename(); }
>      const char* maybeForwardedFilename() const { return maybeForwardedScriptSource()->filename(); }
> +#ifdef MOZ_VTUNE
> +    uint32_t getVTuneMethodID() const { return vtuneMethodId_; }

I thought we normally used get* for fallible things? Should this be vtuneMethodID()? (It's shame about the capitalization.)

::: js/src/vm/RegExpObject.cpp
@@ +957,5 @@
> +    // no longer traceable. VTune still needs a generated event to mark
> +    // that the addresses associated with the jitcode may be repurposed.
> +#ifdef MOZ_VTUNE
> +    discardJitCode();
> +#endif

Hm. It might be cleaner to do this directly, especially since it seems like it can be really short, something like:

for (auto compilation : compilationArray)

(don't trust me on the syntax). When first reading this, it was surprising to be doing something effectful-sounding as "discardJitCode".

::: js/src/vtune/VTuneWrapper.cpp
@@ +32,5 @@
> +
> +    iJIT_Method_Load_V2 method = {0};
> +    method.method_id = iJIT_GetNewMethodID();
> +    method.method_name = const_cast<char*>(name);
> +    method.method_load_address = reinterpret_cast<void*>(code->raw());

Really? I though reinterpret_cast<void*>(pointer) was a nop.

@@ +54,5 @@
> +    method.module_name = const_cast<char*>(module);
> +
> +    // Line numbers begin at 1, but columns begin at 0.
> +    // Text editors start at 1,1 so fixup is performed to match.
> +    char namebuf[256];

Is 256 big enough? Seems like I've seen some pretty long script filenames.

@@ +70,5 @@
> +void
> +VTuneUnloadCode(const js::jit::JitCode* code)
> +{
> +    if (!IsVTuneProfilingActive())
> +        return;

Does this support pausing and resuming profiling? It's no problem if it doesn't, eg if you lose unregistrations/registrations while it is paused, but it should be documented somewhere what the constraints are. Maybe you have already; I'm just starting this review.

@@ +83,5 @@
> +    int ok = iJIT_NotifyEvent(iJVM_EVENT_TYPE_METHOD_UNLOAD_START, (void*)&method);
> +
> +    // Assertions aren't reported in VTune: instead, they immediately end profiling
> +    // with no warning that a crash occurred. This can generate misleading profiles.
> +    // So instead, print out a message to stdout (which VTune does not redirect).

Oh! That's what you meant in the other place. Ok.
Attachment #8831355 - Flags: review?(sphink) → review+
Thanks for the quick review. I confirmed that Nightly is built with MOZ_VTUNE.

Unfortunately, VTune doesn't seem to be able to profile Firefox. Attaching to a Firefox process immediately kills that process, producing the following error in VTune:

> Injector: caught unexpected signal 31 while running miniloader.

Moh, long time! Do you know if the VTune team has ever been able to profile Firefox, and could provide some advice? If they're located in the Bay Area, I could also stop by with a laptop.

We're looking to get JIT introspection on Windows machines, and VTune looks like it provides a suitable interface.
Flags: needinfo?(mohammad.r.haghighat)
(In reply to Sean Stangl [:sstangl] from comment #4)
> > Injector: caught unexpected signal 31 while running miniloader.

That's a SIGSYS, which is what you get if you try to do something our sandbox doesn't allow. I'm guessing that's what is getting in the way here, somehow.

Try building with ac_add_options --disable-sandbox or setting MOZ_DISABLE_CONTENT_SANDBOX to see if it makes a difference?
+ Joe, Intel Application Engineer with Mozilla.

Hi Joe, please investigate this problem.

In the past, we could profile Firefox with VTune for both its C/C++ source as well as its dynamic JavaScript JIT compiled code.
Flags: needinfo?(mohammad.r.haghighat)
Thanks, Moh!

If the sandbox is not the cause, I can get some help from the VTune team. Most of them are in Russia, so a quick trip to bring a laptop may not be convenient :)
Fixes an accidental null-deref in RegExpShared's unload code.
Attachment #8831355 - Attachment is obsolete: true
(In reply to Steve Fink [:sfink] [:s:] from comment #5)
> Try building with ac_add_options --disable-sandbox or setting
> MOZ_DISABLE_CONTENT_SANDBOX to see if it makes a difference?

Setting --disable-sandbox does indeed allow attaching VTune.

Unfortunately, none of the jitcode appears to be tracked, so more investigation.
Attaching VTune to an already-running process is not detected by the integration.

Launching firefox from the GUI doesn't work because a special option has to be passed to the collector, which the GUI doesn't surface. On the other hand, when launched directly, VTune is detected, but all the results are discarded because of signal errors.

I've been using this script to attach:

> #!/bin/bash
> if [ "`cat /proc/sys/kernel/watchdog`" -eq "1" ]; then
>         echo "Need sudo for disabling nmi_watchdog:"
>         echo 0 | sudo tee /proc/sys/kernel/watchdog
> fi
> source /opt/intel/vtune_amplifier_xe_2017.1.0.486011/amplxe-vars.sh
> LIKELYPID=`pidof firefox | cut -f 1 -d ' '`
> amplxe-cl -collect hotspots --run-pass-thru=--profiling-signal=33 -target-pid "${LIKELYPID}"

The "--run-pass-thru" option appears to be undocumented but I found it trawling through internet help forums, and it allows bypassing the signal error.
Integration works just fine when launching from the command line, for example via:

amplxe-cl -collect hotspots --run-pass-thru=--profiling-signal=33 -- /home/sstangl/dev/gecko-dev/obj-opt-x86_64-pc-linux-gnu-nosandbox/dist/bin/firefox -no-remote -profile /home/sstangl/dev/gecko-dev/obj-opt-x86_64-pc-linux-gnu-nosandbox/tmp/scratch_user

where everything after "--" is the command executed by "mach run".

This probably means that the integration will work just fine for Bas on Windows, so it should be OK to land.
Carrying sfink's r+ from Comment 3.
Attachment #8833107 - Flags: review+
Keywords: checkin-needed
Attachment #8831927 - Attachment is obsolete: true
There was also a massive amount of android test failures similar to https://treeherder.mozilla.org/logviewer.html#?job_id=74096236&repo=mozilla-inbound when this landed.
Steve, could you take a look at the hazard analysis log? I annotated the relevant functions with "JS::AutoAssertNoGC nogc", but the analysis still marks all the VTune* functions as CanGC, on account of an IndirectCall in the iJIT* functions, which use a dlopen()'d library.

The rest of the problems are fixed; just the hazard analysis is remaining to land.

Hazard analysis: https://queue.taskcluster.net/v1/task/LdAIvc1DQBemedUHfshRUw/runs/0/artifacts/public/build/hazards.txt.gz
Flags: needinfo?(sstangl) → needinfo?(sphink)
Doh! Sorry, if anything my suggestion would make it worse. AutoAssertNoGC is the opposite of what I wanted; it is still checked by the hazard analysis. You unfortunately need AutoSuppressGCAnalysis.

Looking at the code changes, I'm wondering if it would be better to either (1) put your vtune stuff into a js::vtune:: namespace and then programmatically annotated everything in that namespace to not GC, or (2) just programmatically say that iJIT_* won't GC. Oh, wait, or (3) say that any static function in jitprofiling.c won't GC, though I'm not sure whether (3) covers everything or not.

(3) would be implemented by adding a name.startsWith("jitprofiling.c:") test to http://searchfox.org/mozilla-central/source/js/src/devtools/rootAnalysis/annotations.js#36

(2) would be implemented by adding a fun.startsWith("iJIT_") test to http://searchfox.org/mozilla-central/source/js/src/devtools/rootAnalysis/annotations.js#276

(1) would be similar to (2).

Whatever makes the most sense to you. I'd probably lean towards one of these three rather than sprinkling AutoSuppressGCAnalysis around; it seems like a blanket annotation better represents the intent here. Sorry for the misdirection.
Flags: needinfo?(sphink)
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Depends on: 1340045
Duplicate of this bug: 675098
Duplicate of this bug: 766432
Depends on: 1339190
This is working for me now when I last checked, but it does make VTune a little more crashy it seems from time to time, I've disabled this on my profiling build but should probably re-enable it soon. There's more people using VTune now though also within the JS team, so you probably have better sources of feedback.
Flags: needinfo?(bas)
You need to log in before you can comment on or make changes to this bug.