Last Comment Bug 875135 - IonMonkey: Constants with a single use are sometimes loaded into registers
: IonMonkey: Constants with a single use are sometimes loaded into registers
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla24
Assigned To: (PTO until 29th August) Benjamin Bouvier [:bbouvier]
:
Mentors:
Depends on: 876607
Blocks: 876057 876247
  Show dependency treegraph
 
Reported: 2013-05-22 16:18 PDT by (PTO until 29th August) Benjamin Bouvier [:bbouvier]
Modified: 2013-05-28 19:33 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Source code + generated assembly in comments (2.87 KB, application/javascript)
2013-05-22 16:18 PDT, (PTO until 29th August) Benjamin Bouvier [:bbouvier]
no flags Details
proposed fix (1.04 KB, patch)
2013-05-26 16:19 PDT, (PTO until 29th August) Benjamin Bouvier [:bbouvier]
sstangl: review+
Details | Diff | Splinter Review
proposed fix 2 (2.15 KB, patch)
2013-05-27 15:07 PDT, Hannes Verschore [:h4writer]
no flags Details | Diff | Splinter Review
proposed fix 2 (2.11 KB, patch)
2013-05-27 15:32 PDT, Hannes Verschore [:h4writer]
no flags Details | Diff | Splinter Review

Description (PTO until 29th August) Benjamin Bouvier [:bbouvier] 2013-05-22 16:18:46 PDT
Created attachment 753017 [details]
Source code + generated assembly in comments

On the given benchmark, using OdinMonkey, I have found that several moves could be avoided by just inverting the order of operands for destructive operations (like 'a += 4'). Also, there is a useless load / store of the same pair of variable (i.e. mov A,B and then mov B,A).
More details are provided in the attached file. I put a comment indicating when a mov could be avoided just after the concerned block.

This happens when using either the LSRA or the backtracking allocators (in x86).

In x64, the mov instructions related to the and / add instructions that clobbers the destination registers are also present. However, all the other moves are not useless anymore as the IP register is used as a base offset for datas.
Not sure if it happens in IonMonkey too.
Comment 1 (PTO until 29th August) Benjamin Bouvier [:bbouvier] 2013-05-26 16:19:32 PDT
Created attachment 754292 [details] [diff] [review]
proposed fix

This fixes the useless moves for constants, at least for constants.

Tests don't show any significant performance difference on my machine (which is not really stable regarding runtimes variance), but the generated code fits with what we would expect.

I propose to consider a follow-up bug for the remaining unnecessary moves.
Comment 2 Hannes Verschore [:h4writer] 2013-05-26 17:04:01 PDT
Comment on attachment 754292 [details] [diff] [review]
proposed fix

I saw you asked who should review this patch in IRC. This is actually rather easy to figure out (in most cases). Just do "hg annotate" on the file and look to the original lines you adjusted in a file. In this case you will see that this line has been adjusted in revision "131311".

hg log -r 131311 : 
Bug 870095 - Prefer clobbering binary operands with no further uses. r=h4writer

So sstangl would be a better reviewer ;)
Comment 3 Dan Gohman [:sunfish] 2013-05-27 10:06:19 PDT
Comment on attachment 754292 [details] [diff] [review]
proposed fix

This looks like an improvement to me.

Also, this patch does fix the missed add immediate in in bug 876057.
Comment 4 Sean Stangl [:sstangl] 2013-05-27 11:50:17 PDT
Comment on attachment 754292 [details] [diff] [review]
proposed fix

Review of attachment 754292 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. Thanks for catching that.
Comment 5 Hannes Verschore [:h4writer] 2013-05-27 15:07:18 PDT
Created attachment 754594 [details] [diff] [review]
proposed fix 2

Can I propose another fix?

I noticed "useCount" also counts the ResumePoints. That is wrong, since those don't alter the codegeneration. As a result useCount is in most cases way to high and incorrect to decide when to reorder. Therefore I introduced defUseCount that counts how many definitions there are.

Secondly I also keep the order when both operands have a defUseCount of 1.

Why?

There are 2 regression caused by this optimization on "misc-basic-fpops".
1) On the original landing (add(phi,cte) is transformed to add(cte,phi)
=> that's what the original proposal would have fixed
2) Uncovered by bug 875276, but caused because this heuristic isn't working properly. 
=> This patch also fixed this regression
Comment 6 Hannes Verschore [:h4writer] 2013-05-27 15:08:20 PDT
Removing checkin-needed, since it is possible the original patch doesn't need to get checked in. Needs information from requested people first.
Comment 7 Hannes Verschore [:h4writer] 2013-05-27 15:32:30 PDT
Created attachment 754597 [details] [diff] [review]
proposed fix 2

Oops, forgot to qref before putting up for review
Comment 8 Hannes Verschore [:h4writer] 2013-05-27 15:47:10 PDT
Comment on attachment 754597 [details] [diff] [review]
proposed fix 2

I somehow took the wrong person :S
Comment 9 Dan Gohman [:sunfish] 2013-05-27 17:44:23 PDT
Comment on attachment 754597 [details] [diff] [review]
proposed fix 2

>+    // Number of uses that are defitions of this instruction.

This comment makes it sound (to a newbie like me) like we aren't in SSA form or something. Could it be reworded to say something like: Number of uses of this instruction that are from MDefinitions (and not from ResumePoints) ?

Also, there are a few existing uses of useCount() in other parts of the code which look like they need to use defUseCount() instead.
Comment 10 Hannes Verschore [:h4writer] 2013-05-28 00:55:45 PDT
Comment on attachment 754597 [details] [diff] [review]
proposed fix 2

https://hg.mozilla.org/integration/mozilla-inbound/rev/5c9f8afb8412

Gz bbouvier with your first commit ;) and sorry to temporary hijack it. I'll create a follow-up bug for the issue I'm seeing. (And I'll also look at the other places we possibly use useCount wrong)
Comment 11 (PTO until 29th August) Benjamin Bouvier [:bbouvier] 2013-05-28 10:28:48 PDT
(In reply to Hannes Verschore [:h4writer] from comment #10)
Thanks, and thanks for the commit too :)
Comment 12 Alon Zakai (:azakai) 2013-05-28 15:07:41 PDT
Looks like this regressed asmjs-ubench-memops by 12% on awfy.
Comment 13 Hannes Verschore [:h4writer] 2013-05-28 15:08:51 PDT
(In reply to Alon Zakai (:azakai) from comment #12)
> Looks like this regressed asmjs-ubench-memops by 12% on awfy.

I hope this get's fixed by bug 876607 ;)
Comment 14 Ryan VanderMeulen [:RyanVM] 2013-05-28 18:48:00 PDT
https://hg.mozilla.org/mozilla-central/rev/5c9f8afb8412
Comment 15 (PTO until 29th August) Benjamin Bouvier [:bbouvier] 2013-05-28 19:33:25 PDT
(In reply to Alon Zakai (:azakai) from comment #12)
> Looks like this regressed asmjs-ubench-memops by 12% on awfy.

Only on the 32bits machine. 64 bits and arm machines don't show any regression.

To be sure, I compiled the code on my computer with and without the patch, in x86 and x64 modes. There is no significant average difference in results (difference is <= 0.5%), so I think this regression might be due to a difference of alignment in the generated codes. These micro benchmarks are quite sensitive to code alignment...

Note You need to log in before you can comment on or make changes to this bug.