Closed
Bug 911738
Opened 10 years ago
Closed 9 years ago
IonMonkey: inline richards :204
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: h4writer, Assigned: h4writer)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 8 obsolete files)
23.84 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
In richards function :204 isn't inlined. This is enormously short and gets called a lot. http://alasal.be/ionmonkey/index.php?subject=octane-richards-82d28bdf9317 :204 gets called 3462534 times :305 (parent) gets called 3213820 times There is not really a reason why this shouldn't get inlined. Reason: A -> B -> C A is hot enough to get compiled (usecount 1000). At that time B has only be called 100 times (no loops in B). But that's high enough to get inlined. Now C gets always called when B gets called. So they have the same hotness. But due to parent script heurstic we don't. Parent (i.e. B) needs to have usecount 125 before any of his functions get inlined. Now not much later B and C are hot enough to get inlined. But we don't recompile anymore. So we are stuck in not inlined C. Two possible solutions: 1) Enable usecount in ion again and add recompile for A when a callee is hot enough 2) Idea I have been thinking of to create a non-inlined version with high priority and a full inlined version with low priority and switch to improved version as soon as inlined version is done.
Assignee | ||
Comment 1•10 years ago
|
||
(In reply to Hannes Verschore [:h4writer] from comment #0) > http://alasal.be/ionmonkey/index.php?subject=octane-richards-82d28bdf9317 > :204 gets called 3462534 times > :305 (parent) gets called 3213820 times The parent is :374. Sorry about making this mistake. But the issue is the same. :374 calls :204 just once every execution. So :204/:305 have the same usecount.
Comment 2•10 years ago
|
||
I made a draft for the first solution you suggested. It does inline :204, but I get 3 recompile bailouts and performance is worse than the unpatched version.
Comment 3•10 years ago
|
||
With this change :204 gets inlined in the first attempt, and I get a ~500 point increase (~2.48%).
Assignee | ||
Comment 4•10 years ago
|
||
(In reply to Yaron Tausky from comment #3) > Created attachment 801239 [details] [diff] [review] > 911738_inline_small_funcs.patch > > With this change :204 gets inlined in the first attempt, and I get a ~500 > point increase (~2.48%). As explained on IRC I don't think this is a good way to solve it. This increases the amount of inlining, resulting in higher compilation times. There is a balance between inlining and compilation time. If we really want to inline everything we should go for the proposed second solution. But that needs some research and a lot of changes. Still doable though and hopefully I or somebody else can try that in the future. (In reply to Yaron Tausky from comment #2) > Created attachment 801235 [details] [diff] [review] > First approach draft v1 > > I made a draft for the first solution you suggested. It does inline :204, > but I get 3 recompile bailouts and performance is worse than the unpatched > version. I saw the following: unpatched: 26200 patched: 24500 Patched version is indeed slower, but why. I quickly disabled the improvement of recompilation by changing InliningStatus_MaybeLater in makeInliningDecision to InliningStatus_NotInlined. That way we would know if the recompilation is the real issue. patched + disable recompile: 23000 Interesting. Recompilation is indeed better, but we are still loosing time. But where. The only difference is MIncUseCount that gets added. So I removed that. patched + disable recompile + disable MIncUseCount: 26000 So the reason is that MIncUseCount is too slow. I don't really see a way to improve the IncUseCount that much, but there is another solution. We only need to decrease the number of times we add this. It only makes sense to add this when usecount is low. If it is already high (e.g. 1000). We won't inline more because it gets even higher. So I tested this with only adding MIncUseCount when usecount is low: patched + disable recompile + limited adding MIncUseCount: 25500 This seems better. We could possibly decrease the amount of MIncUseCounts it even more. I'll let this for you to research. Now I tried enabling recompile check. We had a 6% improvement with enabling it. Hopefully that works now too: patched + limited adding MIncUseCount: 27500 I think that is a nice win ;)
Assignee | ||
Comment 5•10 years ago
|
||
Sorry about the bad English. I should have reread my brain dump, before posting. If you also get this improvement, I think you can finish up this patch and make sure this doesn't decrease performance on SS/kraken/Octane and mark for review? I think it was mostly done, right?
Comment 6•10 years ago
|
||
Nice catch! I'll work on it over the weekend. I'd like to change the code generation a bit, I suspect that right now it only works on x64 (and maybe I can spare a register while I'm at it?).
Comment 7•10 years ago
|
||
Indeed I saw the same improvement. Some of the other tests have suffered, though. Here are my results for the unpatched version (averaged over 10 runs, with top and bottom results discarded): Richards 19803 DeltaBlue 19262 Crypto 21912 RayTrace 23870 EarleyBoyer 17528 RegExp 3438 Splay 13288 NavierStokes 26298 PdfJS 10541 Mandreel 4917 Gameboy 27500 CodeLoad 23811 Box2D 7096 Score (version 8) 14216 And the results for the patched version: Richards 21033 DeltaBlue 19128 Crypto 21947 RayTrace 23007 EarleyBoyer 17495 RegExp 3400 Splay 12902 NavierStokes 26426 PdfJS 10669 Mandreel 5043 Gameboy 27290 CodeLoad 23998 Box2D 7076 Score (version 8) 14248 So, a slight improvement of the overall score. I'll upload the SS and Kraken results as separate files.
Attachment #801235 -
Attachment is obsolete: true
Attachment #801239 -
Attachment is obsolete: true
Attachment #804986 -
Flags: review?(hv1989)
Comment 8•10 years ago
|
||
Tested with sunspider-1.0.1.
Comment 9•10 years ago
|
||
Tested with kraken-1.1. I also sent the patch to try: https://tbpl.mozilla.org/?tree=Try&rev=43f2337aa9e8 I don't have a way to test it myself on ARM, though.
Assignee | ||
Comment 10•10 years ago
|
||
The regression on kraken-ai-astar was increased due to this bug.
Depends on: 916846
Assignee | ||
Comment 11•10 years ago
|
||
Comment on attachment 804986 [details] [diff] [review] 911738_recompile.patch Review of attachment 804986 [details] [diff] [review]: ----------------------------------------------------------------- Most things look good. But can you report numbers on x86 (32bit) and with --parallel-compile=on as argument to js? That's our current configuration. For the later you need a thread-safe build. If you don't know how to create one, just ask me on irc ;)
Assignee | ||
Comment 12•10 years ago
|
||
Comment on attachment 804986 [details] [diff] [review] 911738_recompile.patch Review of attachment 804986 [details] [diff] [review]: ----------------------------------------------------------------- Removing review request. I reran the numbers on v8. We are hitting a regression in raytrace, caused because we are invalidating the script, instead of only requesting a recompile and upon ready insert the script. This should remove the raytrace regression. Atm. this improves richards with 11%, but decreases raytrace with 9%. When the depending bug gets fixed, I'll review this.
Attachment #804986 -
Flags: review?(hv1989)
Comment 13•10 years ago
|
||
With bug 918278 fixed, this should be doable, right?
Flags: needinfo?(hv1989)
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Till Schneidereit [:till] from comment #13) > With bug 918278 fixed, this should be doable, right? Correct. So now there is already a MRecompileCheck. So or we create a new specific MIR/LIR for this or we adjust MRecompileCheck to have two ways to work... Afterwards in jit/VMFunctions.cpp there is a function Recompile that should do this. The only small adjustment is that a new option to jit/Ion.cpp should get added to "force" a recompile. (This is not in my immediate todo list. So anybody feel free to pick it up. I can mentor this.)
Flags: needinfo?(hv1989)
Comment 15•10 years ago
|
||
Yaron, you wouldn't be interested in taking this up again by any chance?
Flags: needinfo?(yaron.tausky)
Comment 16•10 years ago
|
||
I kinda drifted to other things now, but I'll take a look and try it. Might take a few days, though.
Flags: needinfo?(yaron.tausky)
Comment 17•10 years ago
|
||
Awesome, thanks!
Comment 18•10 years ago
|
||
Yaron, are you still interested in working on this? Otherwise, we should probably turn it into a mentored bug.
Flags: needinfo?(yaron.tausky)
Comment 19•10 years ago
|
||
Sorry for the wait. I think it would be better to release this bug and let someone else have a go.
Flags: needinfo?(yaron.tausky)
Assignee | ||
Comment 21•10 years ago
|
||
Uses the infrastructure set up by optimization levels to force a recompile when we can potentially inline another function.
Attachment #804986 -
Attachment is obsolete: true
Attachment #804987 -
Attachment is obsolete: true
Attachment #804990 -
Attachment is obsolete: true
Attachment #8418668 -
Flags: review?(jdemooij)
Assignee | ||
Comment 22•10 years ago
|
||
Note: richards :204 already gets inlined due to another bugfix. So shouldn't make a big difference there.
Assignee | ||
Comment 23•10 years ago
|
||
Oh forgot to qref
Attachment #8418668 -
Attachment is obsolete: true
Attachment #8418668 -
Flags: review?(jdemooij)
Attachment #8418669 -
Flags: review?(jdemooij)
Comment 24•10 years ago
|
||
Comment on attachment 8418669 [details] [diff] [review] 8418668: Add check to recompile when hitting heurstic that disables inlining Review of attachment 8418669 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay. ::: js/src/jit/Ion.cpp @@ +2047,5 @@ > } > > static MethodStatus > Compile(JSContext *cx, HandleScript script, BaselineFrame *osrFrame, jsbytecode *osrPc, > + bool constructing, ExecutionMode executionMode, bool force = false) Nit: s/force/forceRecompile to make this argument a bit less ambiguous, also in the callers. @@ +2118,1 @@ > return Method_Compiled; Can we MOZ_ASSERT_IF(osrPc, osrPc == script->ionScript()->osrPc()); here? @@ +2179,5 @@ > > + // Return the compilation was skipped when the osr pc wasn't adjusted. > + scriptIon = GetIonScript(script, SequentialExecution); > + if (!scriptIon || pc != scriptIon->osrPc()) > + return Method_Skipped; If Compile returns Method_Compiled, there must be an IonScript right? ::: js/src/jit/IonBuilder.cpp @@ +5172,5 @@ > > + if (target && status == InliningStatus_UseCountToLow) { > + MRecompileCheck *check = MRecompileCheck::New(alloc(), target->nonLazyScript(), > + optimizationInfo().usesBeforeInlining(), > + /* force = */ true); This means we will bump the callee's use count twice right? Can we just turn it into a check without incrementing it in this case? ::: js/src/jit/IonBuilder.h @@ +629,5 @@ > enum InliningStatus > { > InliningStatus_Error, > InliningStatus_NotInlined, > + InliningStatus_UseCountToLow, Nit: s/ToLow/TooLow @@ +638,5 @@ > { > InliningDecision_Error, > InliningDecision_Inline, > + InliningDecision_DontInline, > + InliningDecision_UseCountToLow Same here.
Attachment #8418669 -
Flags: review?(jdemooij)
Assignee | ||
Comment 25•10 years ago
|
||
Comment on attachment 8418669 [details] [diff] [review] 8418668: Add check to recompile when hitting heurstic that disables inlining Review of attachment 8418669 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/Ion.cpp @@ +2118,1 @@ > return Method_Compiled; No we can't. Method_Compiled is returned when there is a IonScript. This can be an old IonScript with wrong osrPC(). Callers that want a specific pc (like enterAtBranch) need to check this. This makes code easier and removes the need of "failedState". @@ +2179,5 @@ > > + // Return the compilation was skipped when the osr pc wasn't adjusted. > + scriptIon = GetIonScript(script, SequentialExecution); > + if (!scriptIon || pc != scriptIon->osrPc()) > + return Method_Skipped; Yes, correct. ::: js/src/jit/IonBuilder.cpp @@ +5172,5 @@ > > + if (target && status == InliningStatus_UseCountToLow) { > + MRecompileCheck *check = MRecompileCheck::New(alloc(), target->nonLazyScript(), > + optimizationInfo().usesBeforeInlining(), > + /* force = */ true); Oh right. Since callee->useCount() < 1000 we will still be running in baseline/interpreter, which update the usecount themself.
Assignee | ||
Comment 26•10 years ago
|
||
Attachment #8418669 -
Attachment is obsolete: true
Attachment #8421672 -
Flags: review?(jdemooij)
Comment 27•10 years ago
|
||
Comment on attachment 8421672 [details] [diff] [review] Add check to recompile when hitting heurstic that disables inlining Review of attachment 8421672 [details] [diff] [review]: ----------------------------------------------------------------- r=me with comments below addressed. ::: js/src/jit/CodeGenerator.cpp @@ +8626,5 @@ > + AbsoluteAddress useCount = AbsoluteAddress(ins->mir()->script()->addressOfUseCount()); > + if (ins->mir()->increaseUseCount()) { > + masm.loadPtr(useCount, tmp); > + masm.add32(Imm32(1), tmp); > + masm.storePtr(tmp, useCount); These should be load32 and store32 ::: js/src/jit/Ion.cpp @@ +2047,5 @@ > } > > static MethodStatus > Compile(JSContext *cx, HandleScript script, BaselineFrame *osrFrame, jsbytecode *osrPc, > + bool constructing, ExecutionMode executionMode, bool force = false) s/force/forceRecompile @@ +2157,5 @@ > + > + // By default a recompilation doesn't happen on osr mismatch. > + // Decide if we want to force a recompilation if this happens too much. > + bool force = false; > + IonScript *scriptIon = GetIonScript(script, SequentialExecution); Nit: script->ionScript() @@ +2176,5 @@ > ForbidCompilation(cx, script); > return status; > } > > + // Return the compilation was skipped when the osr pc wasn't adjusted. When does Compile return Method_Compiled without the OSR pc matching? Can you add a line to this comment explaining when this happens? @@ +2178,5 @@ > } > > + // Return the compilation was skipped when the osr pc wasn't adjusted. > + scriptIon = GetIonScript(script, SequentialExecution); > + if (pc != scriptIon->osrPc()) Nit: if (pc != script->ionScript()->osrPc())
Attachment #8421672 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 28•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/66a520c18efe
Comment 29•10 years ago
|
||
This bug (or bug 1008590) appears to have caused a 14.3% regression on Octane-mandreel and a 7.1% regression on Octane-mandreelLatency on machine 11 on AWFY range from AWFY: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=10c6ee2c7524&tochange=66a520c18efe
Assignee | ||
Comment 30•10 years ago
|
||
It was this bug. Already under investigating. Probably gonna backout soonish.
Comment 31•10 years ago
|
||
One more datapoint, asmjs-apps-zlib-throughput (non-odin) also regressed (by 12%), I assume due to the same reasons as the others mentioned 2 comments back.
Comment 32•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/66a520c18efe
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Assignee | ||
Comment 33•10 years ago
|
||
Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/b729d9a17712
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 34•10 years ago
|
||
So the issue is that when recompiling during the linking fase, we are invalidating the old IonScript. As a result we go back to baseline and it takes until the function starts again that we can go back in ionmonkey. This is a big issue for mandreel apparently.
Comment 35•10 years ago
|
||
this caused a 2.33% kraken regression on osx 10.8 in talos, when we backed it out the regression went away.
Assignee | ||
Comment 36•9 years ago
|
||
Assignee | ||
Comment 37•9 years ago
|
||
Comment on attachment 8459316 [details] [diff] [review] Delay linking IonBuilder until execution WIP Took its own bug nummer for it: bug 1047346. If that lands everything has landed to try this again. Locally I don't see the regressions anymore :D
Attachment #8459316 -
Attachment is obsolete: true
Assignee | ||
Comment 38•9 years ago
|
||
No luck, backed out: https://hg.mozilla.org/integration/mozilla-inbound/rev/85a0a388a339
Assignee | ||
Comment 39•9 years ago
|
||
Since the error was from another patch. Back to try to confirm and try relanding: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=df6ccb2f0e2d
Assignee | ||
Comment 40•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/a797dcc3dafc
https://hg.mozilla.org/mozilla-central/rev/a797dcc3dafc
Status: REOPENED → RESOLVED
Closed: 10 years ago → 9 years ago
Resolution: --- → FIXED
Target Milestone: mozilla32 → mozilla36
You need to log in
before you can comment on or make changes to this bug.
Description
•