Closed
Bug 859257
Opened 12 years ago
Closed 12 years ago
Clang miscompiles MConstant::computeRange / ToInt64, hurts performance
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla23
People
(Reporter: jandem, Assigned: jandem)
References
Details
Attachments
(3 files)
1.13 KB,
patch
|
Waldo
:
review+
|
Details | Diff | Splinter Review |
605.45 KB,
application/x-bzip2
|
Details | |
1.61 KB,
patch
|
Waldo
:
review+
|
Details | Diff | Splinter Review |
Below is a testcase reduced from a Marsaglia RNG written in JS. I get these numbers:
Clang 3.2 debug: 144 ms
Clang 3.2 opt : 2874 ms
So the opt build is a lot slower. Nightly builds seem to have the same problem.
It looks like Clang miscompiles this line in MConstant::computeRange():
int64_t integral = ToInt64(d);
Integral ends up being 0 instead of 36969 and this messes up range analysis. I think in ToIntWidth, we end up reading du.s.lo here instead of du.s.hi:
di_h = du.s.hi;
I tried to move the ToInt64 call into a JS_NEVER_INLINE function, but that didn't help.
function marsaglia(m_z) {
for (var i = 0; i < 100000000; ++i) {
m_z = (36969 * m_z)|0;
}
return m_z;
}
var t = new Date;
var res = marsaglia(2);
print(new Date - t);
print(res);
Assignee | ||
Updated•12 years ago
|
Summary: Clang miscompiles MConstant::computeRange / Int64, hurts performance → Clang miscompiles MConstant::computeRange / ToInt64, hurts performance
Assignee | ||
Comment 1•12 years ago
|
||
Marking DoublePun values as |volatile| seems to avoid the problem..
Comment 2•12 years ago
|
||
Nathan, do we need to upstream this to clang, or are we violating aliasing rules on our end somehow? I was told to ping you now that respindola is no longer around....
Flags: needinfo?(nfroyd)
Comment 3•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #2)
> Nathan, do we need to upstream this to clang, or are we violating aliasing
> rules on our end somehow? I was told to ping you now that respindola is no
> longer around....
I don't see any aliasing violations here; there's no pointer cast games, which usually is the source of aliasing woes.
Jan, what platform are you seeing the miscompilation on?
Flags: needinfo?(nfroyd) → needinfo?(jdemooij)
Comment 4•12 years ago
|
||
Also, Jan, if you could provide preprocessed source for RangeAnalysis.cpp, that would be helpful. Something like:
make -C js/src RangeAnalysis.i
should do the trick.
Assignee | ||
Comment 5•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #3)
> Jan, what platform are you seeing the miscompilation on?
OS X, 32-bit build, Clang 3.2:
$ clang++ --version
clang version 3.2 (tags/RELEASE_32/final)
Target: x86_64-apple-darwin11.4.2
Thread model: posix
Nightly builds seem to have the same problem though.
Flags: needinfo?(jdemooij)
Assignee | ||
Comment 6•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #4)
> Also, Jan, if you could provide preprocessed source for RangeAnalysis.cpp,
> that would be helpful.
Attaching RangeAnalysis.i.bz2
Comment 7•12 years ago
|
||
Thanks Jan. Can you tell me what the double that gets misconverted is?
Flags: needinfo?(jdemooij)
Assignee | ||
Comment 8•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #7)
> Thanks Jan. Can you tell me what the double that gets misconverted is?
It's 36969.0 (int32 stored as double), note that this value is from the JS testcase in comment 0.
Flags: needinfo?(jdemooij)
Comment 9•12 years ago
|
||
Comment on attachment 734552 [details] [diff] [review]
Patch
Review of attachment 734552 [details] [diff] [review]:
-----------------------------------------------------------------
Doesn't make me happy, but if it works, maybe it's enough for now. I'd like to see this random nuke-from-orbit volatility gone really quick, tho -- I'd be surprised if this doesn't hurt perf on To* operations in general, even if it makes them correct.
Attachment #734552 -
Flags: review?(jwalden+bmo) → review+
Assignee | ||
Comment 10•12 years ago
|
||
Comment 11•12 years ago
|
||
I've been debugging this and I don't see anything clang is doing that is
obviously wrong...but I did notice this in the preprocessed output:
union DoublePun {
struct {
uint32_t hi, lo;
} s;
uint64_t u64;
double d;
};
which comes from:union DoublePun {
struct {
uint32_t lo, hi;
uint32_t hi, lo;
} s;
uint64_t u64;
double d;
};
...and that looks like we're not picking up the right definitions from jscpucfg.h.
The breakage here is strictly limited to RangeAnalysis.cpp, since it
includes vm/NumericConversions.h without including other js headers
prior to it. On the plus (minus?) side, this means that it wasn't just
a clang issue: all platforms should benefit from this fix.
Testing with the fix on OS X shows the opt config to be slightly faster
than debug:
bld-lion-r5-054:~ cltbld$ ./build-js-debug/js test.js
176
869830658
bld-lion-r5-054:~ cltbld$ ./build-js-opt/js test.js
173
869830658
If you'd rather have a separate backout of Jan's patch followed by the
There are other issues here (FPU_IS_ARM_FPA seems to be undefined by
*anything*, including configury; we should be getting DoublePun
definitions from One True Source, etc. etc.), but those can be handled
in separate bugs. I'll file a followup for ditching FPU_IS_ARM_FPA;
there's already open bugs on enhancing FloatingPoint.h and consolidating
the DoublePun definitions.
Attachment #736918 -
Flags: review?(jwalden+bmo)
Comment 12•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Assignee | ||
Comment 13•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #11)
> ...and that looks like we're not picking up the right definitions from
> jscpucfg.h.
Very nice find, thanks a lot. I do wonder though why adding "volatile" avoids the problem, any idea?
Comment 14•12 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #13)
> (In reply to Nathan Froyd (:froydnj) from comment #11)
> > ...and that looks like we're not picking up the right definitions from
> > jscpucfg.h.
>
> Very nice find, thanks a lot. I do wonder though why adding "volatile"
> avoids the problem, any idea?
That's a good question. I'm not really sure. "volatile" shouldn't be changing anything, since we have the wrong structure definition in the first place!
My guess is that swapping the structure members around somehow enabled the compiler to do some aggressive-but-not-valid transformations. And that the results of those optimizations made RangeAnalysis exhibit particularly bad behavior. "volatile" probably prevented propagation of the facts the compiler needed to do those transformations. Which means that RangeAnalysis exhibited normal (if wrong?) behavior once again.
It'd be interesting to see where the opt build--before any of the fixes from this bug--was spending its time running that testcase. I think it's a good bet something in RangeAnalysis was going haywire.
Comment 15•12 years ago
|
||
Comment on attachment 736918 [details] [diff] [review]
include jscpucfg.h in NumericConversions.h so DoublePun definitions are correct
Review of attachment 736918 [details] [diff] [review]:
-----------------------------------------------------------------
Sigh. Good enough for now.
Attachment #736918 -
Flags: review?(jwalden+bmo) → review+
Comment 16•12 years ago
|
||
Comment 17•12 years ago
|
||
Comment 18•12 years ago
|
||
Jan pointed out this morning that the fix I committed was quite different from the patch Waldo reviewed: my commit simply added the #include without removing the conditionals for __clang__. I blame my poor rebasing skills.
I've fixed that up by backing out Jan's original commit:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f10f33fc2e93
Comment 19•12 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•