Open Bug 1329923 Opened 5 years ago Updated 4 years ago

Emit profiler markers for discarding Ion code

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

REOPENED

People

(Reporter: sfink, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf, stale-bug, Whiteboard: [qf:p3])

Attachments

(2 files, 2 obsolete files)

We emit markers for when we discard ion code due to bailouts. We should do the same when we discard ion code due to GC. (Or any other reason.) Then we could generate a timeline for a given script and display every reason for transitioning out of ion.
Blocks: 1329161
Priority: -- → P2
Priority: P2 → P1
Flags: needinfo?(hv1989)
Fixes:
- We were not reporting a large portion of invalidation happening
- Report when we invalidate due to GC
- Add marker when code is ion compiled
- Add marker when code is is marked as not compilable

@sfink:
This won't give if a bailout caused an invalidation. Though it might be more apparent now:

> ...
> Compilation finished of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:1063
> Compilation finished of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:1217
> Compilation finished of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:502> 
> Bailout_TypeBarrierV after getprop on line 5850 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5848
> Invalidate /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5848
> Invalidate /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:6105
> Invalidate /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5988
> Bailout_DuringVMCall after call on line 6032 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5988
> Bailout_DuringVMCall after call on line 6032 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5988
> Bailout_DuringVMCall after call on line 6032 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5988
> Bailout_DuringVMCall after call on line 6032 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5988
> Invalidate (GC) <unknown>:0
> Invalidate (GC) self-hosted:1
> Invalidate (GC) self-hosted:4
> ...

-> Here the bailout of gbemu-part2.js:5848 caused an invalidation.
-> The bailout of gbemu-part2.js:5988 didn't cause an invalidation.
Assignee: nobody → hv1989
Flags: needinfo?(hv1989)
Attachment #8853913 - Flags: review?(jdemooij)
Attachment #8853913 - Flags: feedback?(sphink)
hg qref
Attachment #8853913 - Attachment is obsolete: true
Attachment #8853913 - Flags: review?(jdemooij)
Attachment #8853913 - Flags: feedback?(sphink)
Attachment #8853914 - Flags: review?(jdemooij)
Attachment #8853914 - Flags: feedback?(sphink)
Comment on attachment 8853914 [details] [diff] [review]
Make sure we see all "invalidate" events in sps

(In reply to Hannes Verschore [:h4writer] from comment #1)
> Created attachment 8853913 [details] [diff] [review]
> Make sure we see all "invalidate" events in sps
> 
> Fixes:
> - We were not reporting a large portion of invalidation happening
> - Report when we invalidate due to GC
> - Add marker when code is ion compiled

\o/

> - Add marker when code is is marked as not compilable

Oh, that seems really useful. Is there a reason associated with these? Like, the string used when setting AbortReason::Disable? I don't know how that is plumbed through.

> -> Here the bailout of gbemu-part2.js:5848 caused an invalidation.
> -> The bailout of gbemu-part2.js:5988 didn't cause an invalidation.

Just so I understand... you're referring to the series of 4 bailouts of gbemu-part2.js:5988 here, right? Because there's an earlier invalidation of that same script a line before those bailouts. Would I be able to find the cause of that invalidation by looking further back in the output, or are some invalidations for reasons that will not show up as markers?

Also:

> > Bailout_TypeBarrierV after getprop on line 5850 of /home/h4writer/Build/mozilla-inbound/js/src/octane/gbemu-part2.js:5848

Is the given script the innermost or outermost script? As in, if f() is inlined into g(), and we bailout while running g(), would the bailout marker be for f() or g()?

Thanks! f+ from me because everything in the patch seems like useful additional data.
Attachment #8853914 - Flags: feedback?(sphink) → feedback+
Comment on attachment 8853914 [details] [diff] [review]
Make sure we see all "invalidate" events in sps

Review of attachment 8853914 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but some nits below that would be good to address I think.

::: js/src/gc/Zone.cpp
@@ +206,5 @@
>  
>          /* Only mark OSI points if code is being discarded. */
>          jit::InvalidateAll(fop, this);
>  
> +        JSContext* cx = TlsContext.get();

We can get the rt and use that instead of the cx:

 JSRuntime* rt = runtimeFromActiveCooperatingThread();

@@ +211,4 @@
>          for (auto script = cellIter<JSScript>(); !script.done(); script.next())  {
>              jit::FinishInvalidation(fop, script);
>  
> +            if (cx->runtime()->geckoProfiler().enabled()) {

We should only do this when the script actually has JIT code that gets discarded right? Maybe FinishInvalidationOf is a good place?

::: js/src/jit/Ion.cpp
@@ +3241,5 @@
>  jit::Invalidate(TypeZone& types, FreeOp* fop,
>                  const RecompileInfoVector& invalid, bool resetUses,
>                  bool cancelOffThread)
>  {
> +    JSContext* cx = TlsContext.get();

Here we can get the rt from fop->runtime().

@@ +3280,5 @@
> +            // Ignore the event on allocation failure.
> +            if (buf) {
> +                cx->runtime()->geckoProfiler().markEvent(buf);
> +                JS_smprintf_free(buf);
> +            }

There are a lot of places where we have this code. Would it be possible to add something like:

  MarkJitProfilerEvent(JSRuntime* rt, JSScript* script, const char* event);

And then we can call MarkJitProfilerEvent(rt, script, "Invalidate"); etc.

@@ +3423,5 @@
> +        if (buf == nullptr) {
> +            ReportOutOfMemory(cx);
> +            return;
> +        }
> +        snprintf(buf, len, "Disabling compilation of %s:%" PRIuSIZE,

Nit: Ion compilation?
Attachment #8853914 - Flags: review?(jdemooij)
Attached patch PatchSplinter Review
Is indeed better this way. Thanks!
Attachment #8853914 - Attachment is obsolete: true
Attachment #8855243 - Flags: review?(jdemooij)
Comment on attachment 8855243 [details] [diff] [review]
Patch

Review of attachment 8855243 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, sorry for the delay.
Attachment #8855243 - Flags: review?(jdemooij) → review+
Pushed by hv1989@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1bd354390f12
IonMonkey - Emit profiler markers for finishing ion compilation and discarding code, r=jandem
I'm not sure what is going on. I just add some more markers to Gecko Profiler and I think the Gecko Profiler is somehow crashing because of the extra load?

https://treeherder.mozilla.org/logviewer.html#?job_id=90460856&repo=mozilla-inbound&lineNumber=1936

It seems related to locking. Not sure why though? Can you take a look njn?
Flags: needinfo?(n.nethercote)
profiler_stop and profiler_add_marker take the same lock, so profiler_add_marker must not be called while profiler_stop is holding the lock. But that's exactly what's happening here.
It looks like PseudoStack::pollJSSampling is not safe to call inside locked_profiler_stop.
> It looks like PseudoStack::pollJSSampling is not safe to call inside
> locked_profiler_stop.

It was prior to this patch! :)

Hannes, I suggest you just remove the "Invalidate (GC)" marker. I tried doing that locally and it fixes the deadlock. And I don't think it's a useful marker, because it's added right as we're stopping the profiler, so I don't think it'll show up anywhere useful.
Flags: needinfo?(n.nethercote)
Wait, I see how that marker can be added via paths other than profiler_stop(). I'll see if I can thread a bool parameter through to disable it for the profiler_stop() case, but allow it for other cases.
Attached patch fix-up patchSplinter Review
Hannes, here's a fix that skips adding the markers in the problem case. If you
are ok with it please fold it into your patch and land it. Thanks!
Attachment #8857835 - Flags: review?(hv1989)
Assignee: hv1989 → n.nethercote
Status: NEW → ASSIGNED
Assignee: n.nethercote → hv1989
Comment on attachment 8857835 [details] [diff] [review]
fix-up patch

Review of attachment 8857835 [details] [diff] [review]:
-----------------------------------------------------------------

Ah. Now I understand what is going on. Good find. Thanks!
Attachment #8857835 - Flags: review?(hv1989) → review+
Hannes, please fold and land the patches when you are ready. Thanks!
Sorry about not being able to land this. The patches seem to be correct, though I remember that my try build wasn't green.
If you want to move this forward I would suggest to create a new try push and see if it was maybe unrelated to the patches.
Grtz Hannes
Flags: needinfo?(hv1989) → needinfo?(sphink)
Flags: needinfo?(sphink)
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p3]
https://hg.mozilla.org/integration/mozilla-inbound/rev/72f2cb8b917ee2cc5f50f2e8423b695d1db969ed
Bug 1329923 - Emit profiler markers for discarding Ion code. r=h4writer,njn.  r=jandem,h4writer.
Pushed by nnethercote@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/72f2cb8b917e
Emit profiler markers for discarding Ion code. r=h4writer,njn.  r=jandem,h4writer.
https://hg.mozilla.org/mozilla-central/rev/72f2cb8b917e
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
I backed this out because it was causing crashes. (See bug 1377857.) This was Hannes's change, which I was just shepherding in. I don't have a great deal of insight into it, and what the cause of the crash was. I'm not planning on working on it further.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
also backout landed on m-c 
https://hg.mozilla.org/mozilla-central/rev/798b89d453fa
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Assignee: hv1989 → nobody
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla56 → ---
This is a P1 bug without an assignee. 

P1 are bugs which are being worked on for the current release cycle/iteration/sprint. 

If the bug is not assigned by Monday, 28 August, the bug's priority will be reset to '--'.
Keywords: stale-bug
Priority: P1 → P3
Keywords: perf
You need to log in before you can comment on or make changes to this bug.