860867 - Baseline Compiler startup crash in local opt build

Assignee

Description

•

11 years ago

I just did a local build and baseline crashes during the startup for me every single time.  I demoed this to Kannan in person.

See https://gist.github.com/ehsan/5366294 for the full backtrace.

(no longer active)

Assignee

Comment 1

•

11 years ago

There's mention of <http://mxr.mozilla.org/mozilla-central/source/toolkit/components/osfile/osfile_shared_allthreads.jsm> on frame 19...

(no longer active)

Assignee

Comment 2

•

11 years ago

And disabling baseline using the pref fixes the crash.

(no longer active)

Assignee

Comment 3

•

11 years ago

I updated my tree to the latest m-c and the problem still exists.  This is sort of a pain because I can't run tests on my opt build :(

(no longer active)

Assignee

Comment 4

•

11 years ago

Still hitting this problem, can someone please try to reproduce?

Kannan Vijayan [:djvj]

Comment 5

•

11 years ago

I can try to reproduce this later today or tomorrow.

Christian Holler (:decoder)

Comment 6

•

11 years ago

Atte Kettunen is seeing this on ASan builds at startup too. I was not yet able to reproduce and for him it started happening with a newer clang version (might be that something is optimized differently).

Christian Holler (:decoder)

Updated

•

11 years ago

Blocks: asan-maintenance

(no longer active)

Assignee

Comment 7

•

11 years ago

I would be more than happy to build an optimized build with debug symbols and upload it somewhere if needed.

(no longer active)

Assignee

Comment 8

•

11 years ago

Uploaded my build here: <http://people.mozilla.org/~eakhgari/firefox-860867.dmg>

(no longer active)

Assignee

Comment 9

•

11 years ago

So here's the generated code:

0x102e5a920 <_ZN2js3ion16BaselineCompiler8emitCallEv+592>:	lea    (%r9,%rsi,1),%rcx
0x102e5a924 <_ZN2js3ion16BaselineCompiler8emitCallEv+596>:	movd   %rcx,%xmm7
0x102e5a929 <_ZN2js3ion16BaselineCompiler8emitCallEv+601>:	movlhps %xmm7,%xmm7
0x102e5a92c <_ZN2js3ion16BaselineCompiler8emitCallEv+604>:	movaps %xmm7,%xmm2
0x102e5a92f <_ZN2js3ion16BaselineCompiler8emitCallEv+607>:	paddq  %xmm8,%xmm2
0x102e5a934 <_ZN2js3ion16BaselineCompiler8emitCallEv+612>:	movq   %xmm2,%rdi
0x102e5a939 <_ZN2js3ion16BaselineCompiler8emitCallEv+617>:	punpckhqdq %xmm2,%xmm2
0x102e5a93d <_ZN2js3ion16BaselineCompiler8emitCallEv+621>:	paddq  %xmm9,%xmm7
0x102e5a942 <_ZN2js3ion16BaselineCompiler8emitCallEv+626>:	movq   %xmm7,%rbx
                                                            bug----------^
0x102e5a947 <_ZN2js3ion16BaselineCompiler8emitCallEv+631>:	punpckhqdq %xmm7,%xmm7
0x102e5a94b <_ZN2js3ion16BaselineCompiler8emitCallEv+635>:	lea    (%rdi,%rdi,2),%rdi
0x102e5a94f <_ZN2js3ion16BaselineCompiler8emitCallEv+639>:	movq   %xmm2,%rcx
0x102e5a954 <_ZN2js3ion16BaselineCompiler8emitCallEv+644>:	add    $0xfffffffffffffffc,%rsi
0x102e5a958 <_ZN2js3ion16BaselineCompiler8emitCallEv+648>:	lea    (%rcx,%rcx,2),%rcx
0x102e5a95c <_ZN2js3ion16BaselineCompiler8emitCallEv+652>:	movd   (%rax,%rcx,8),%xmm3
0x102e5a961 <_ZN2js3ion16BaselineCompiler8emitCallEv+657>:	movd   (%rax,%rdi,8),%xmm4
0x102e5a966 <_ZN2js3ion16BaselineCompiler8emitCallEv+662>:	movq   %xmm7,%rcx
0x102e5a96b <_ZN2js3ion16BaselineCompiler8emitCallEv+667>:	lea    (%rcx,%rcx,2),%rcx
0x102e5a96f <_ZN2js3ion16BaselineCompiler8emitCallEv+671>:	lea    (%rbx,%rbx,2),%rdi
0x102e5a973 <_ZN2js3ion16BaselineCompiler8emitCallEv+675>:	movd   (%rax,%rdi,8),%xmm7
0x102e5a978 <_ZN2js3ion16BaselineCompiler8emitCallEv+680>:	movd   (%rax,%rcx,8),%xmm2
0x102e5a97d <_ZN2js3ion16BaselineCompiler8emitCallEv+685>:	mov    %rdx,%rcx
0x102e5a980 <_ZN2js3ion16BaselineCompiler8emitCallEv+688>:	add    %rsi,%rcx
0x102e5a983 <_ZN2js3ion16BaselineCompiler8emitCallEv+691>:	punpckldq %xmm3,%xmm4
0x102e5a987 <_ZN2js3ion16BaselineCompiler8emitCallEv+695>:	pshufd $0x10,%xmm4,%xmm4
0x102e5a98c <_ZN2js3ion16BaselineCompiler8emitCallEv+700>:	pand   %xmm10,%xmm4
0x102e5a991 <_ZN2js3ion16BaselineCompiler8emitCallEv+705>:	pcmpeqd %xmm6,%xmm4
0x102e5a995 <_ZN2js3ion16BaselineCompiler8emitCallEv+709>:	pshufd $0xb1,%xmm4,%xmm3
0x102e5a99a <_ZN2js3ion16BaselineCompiler8emitCallEv+714>:	pand   %xmm4,%xmm3
0x102e5a99e <_ZN2js3ion16BaselineCompiler8emitCallEv+718>:	pand   %xmm5,%xmm3
0x102e5a9a2 <_ZN2js3ion16BaselineCompiler8emitCallEv+722>:	punpckldq %xmm2,%xmm7
0x102e5a9a6 <_ZN2js3ion16BaselineCompiler8emitCallEv+726>:	paddq  %xmm3,%xmm1
0x102e5a9aa <_ZN2js3ion16BaselineCompiler8emitCallEv+730>:	pshufd $0x10,%xmm7,%xmm2
0x102e5a9af <_ZN2js3ion16BaselineCompiler8emitCallEv+735>:	pand   %xmm10,%xmm2
0x102e5a9b4 <_ZN2js3ion16BaselineCompiler8emitCallEv+740>:	pcmpeqd %xmm6,%xmm2
0x102e5a9b8 <_ZN2js3ion16BaselineCompiler8emitCallEv+744>:	pshufd $0xb1,%xmm2,%xmm3
0x102e5a9bd <_ZN2js3ion16BaselineCompiler8emitCallEv+749>:	pand   %xmm2,%xmm3
0x102e5a9c1 <_ZN2js3ion16BaselineCompiler8emitCallEv+753>:	pand   %xmm5,%xmm3
0x102e5a9c5 <_ZN2js3ion16BaselineCompiler8emitCallEv+757>:	paddq  %xmm3,%xmm0
0x102e5a9c9 <_ZN2js3ion16BaselineCompiler8emitCallEv+761>:	jne    0x102e5a920 <_ZN2js3ion16BaselineCompiler8emitCallEv+592>

Here the compiler is moving the xmm7 register into a 64-bit register, which means that the carry over bit will not be clobbered, and later on when this variable is used as an offset it ends up being a really large value.

Kannan Vijayan [:djvj]

Comment 10

•

11 years ago

To expand a bit on ehsan's comment: the issue is that auto-vectorization of loops by Clang is misbehaving on a particular loop.

The auto-vectorization lifts some 32-bit operations into a 2*64-bit XMM register, performs operations which make the values in those register exceed the 32-bit range, and then move them back into a 64-bit register without truncating to a 32-bit range.

Solution is to turn off auto-vectorization on BaselineCompiler.cpp

Christian Holler (:decoder)

Comment 11

•

11 years ago

(In reply to Kannan Vijayan [:djvj] from comment #10)
> To expand a bit on ehsan's comment: the issue is that auto-vectorization of
> loops by Clang is misbehaving on a particular loop.

Is this a Clang bug?

Kannan Vijayan [:djvj]

Comment 12

•

11 years ago

(In reply to Christian Holler (:decoder) from comment #11)
> Is this a Clang bug?

Yes.

(no longer active)

Assignee

Comment 13

•

11 years ago

First things first, someone needs to look at the generated code for this function in one of our official Mac nightly binaries.  If the bug is present there, then this needs to be treated with a different priority.  FWIW I tried clang trunk and it generated the same bad code, so I think that we'll find the same bad code in our official binaries.

Jan de Mooij [:jandem]

Comment 14

•

11 years ago

With respindola gone, CC'ing nfroyd on this Clang/LLVM-related bug.

Nathan Froyd [:froydnj]

Comment 15

•

11 years ago

I don't see the buggy code in a current Mac Nightly.

The loop vectorizer is enabled by default at -O3 in pre-release versions of clang/llvm 3.3 and it was enabled only by explicit request in earlier releases.  Our current clang for nightlies is from the 3.2 release branch, so that'd explain why we're not seeing it in nightlies.

(no longer active)

Assignee

Comment 16

•

11 years ago

(In reply to comment #15)
> I don't see the buggy code in a current Mac Nightly.
> 
> The loop vectorizer is enabled by default at -O3 in pre-release versions of
> clang/llvm 3.3 and it was enabled only by explicit request in earlier releases.
>  Our current clang for nightlies is from the 3.2 release branch, so that'd
> explain why we're not seeing it in nightlies.

That doesn't explain why I'm seeing it in my local builds though, does it?  I'm building with --enable-optimize, which means -O2 I think.

This will be a time bomb when we want to upgrade to 3.3.  Is there a command line flag to turn it off everywhere?  Perhaps we should add that to the build system...

Christian Holler (:decoder)

Comment 17

•

11 years ago

(In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> 
> That doesn't explain why I'm seeing it in my local builds though, does it? 
> I'm building with --enable-optimize, which means -O2 I think.

I was told that the JS engine is built with -O3 by default.

Nathan Froyd [:froydnj]

Comment 18

•

11 years ago

(In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> (In reply to comment #15)
> > I don't see the buggy code in a current Mac Nightly.
> > 
> > The loop vectorizer is enabled by default at -O3 in pre-release versions of
> > clang/llvm 3.3 and it was enabled only by explicit request in earlier releases.
> >  Our current clang for nightlies is from the 3.2 release branch, so that'd
> > explain why we're not seeing it in nightlies.
> 
> That doesn't explain why I'm seeing it in my local builds though, does it? 
> I'm building with --enable-optimize, which means -O2 I think.

js/src builds with -O3:

[froydnj@cerebro src]$ pwd
/opt/build/froydnj/build-inbound/js/src
[froydnj@cerebro src]$ make echo-variable-MOZ_OPTIMIZE_FLAGS
-O3 -freorder-blocks
[froydnj@cerebro src]$ make -C ../../ echo-variable-MOZ_OPTIMIZE_FLAGS
make: Entering directory `/opt/build/froydnj/build-inbound'
-Os -freorder-blocks
make: Leaving directory `/opt/build/froydnj/build-inbound'
[froydnj@cerebro src]$ 

This is on Linux, but assuming I'm understanding configure.in right, there's no reason it should be any different on Mac.

I am assuming that you use a newer version of clang for your local builds.  Is that correct?

> This will be a time bomb when we want to upgrade to 3.3.  Is there a command
> line flag to turn it off everywhere?  Perhaps we should add that to the
> build system...

-fno-vectorize turns it off on a trunk build of clang.  The release notes for 3.2 say that -vectorize is the option to turn it on, so it's also possible that -no-vectorize works...or the option has been changed during the 3.3 cycle.

(no longer active)

Assignee

Comment 19

•

11 years ago

(In reply to comment #18)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> > (In reply to comment #15)
> > > I don't see the buggy code in a current Mac Nightly.
> > > 
> > > The loop vectorizer is enabled by default at -O3 in pre-release versions of
> > > clang/llvm 3.3 and it was enabled only by explicit request in earlier releases.
> > >  Our current clang for nightlies is from the 3.2 release branch, so that'd
> > > explain why we're not seeing it in nightlies.
> > 
> > That doesn't explain why I'm seeing it in my local builds though, does it? 
> > I'm building with --enable-optimize, which means -O2 I think.
> 
> js/src builds with -O3:
> 
> [froydnj@cerebro src]$ pwd
> /opt/build/froydnj/build-inbound/js/src
> [froydnj@cerebro src]$ make echo-variable-MOZ_OPTIMIZE_FLAGS
> -O3 -freorder-blocks
> [froydnj@cerebro src]$ make -C ../../ echo-variable-MOZ_OPTIMIZE_FLAGS
> make: Entering directory `/opt/build/froydnj/build-inbound'
> -Os -freorder-blocks
> make: Leaving directory `/opt/build/froydnj/build-inbound'
> [froydnj@cerebro src]$ 
> 
> This is on Linux, but assuming I'm understanding configure.in right, there's no
> reason it should be any different on Mac.
> 
> I am assuming that you use a newer version of clang for your local builds.  Is
> that correct?

Yes, I use tip of tree.

> > This will be a time bomb when we want to upgrade to 3.3.  Is there a command
> > line flag to turn it off everywhere?  Perhaps we should add that to the
> > build system...
> 
> -fno-vectorize turns it off on a trunk build of clang.  The release notes for
> 3.2 say that -vectorize is the option to turn it on, so it's also possible that
> -no-vectorize works...or the option has been changed during the 3.3 cycle.

glandium, where's the right place to add that to the build options?

(no longer active)

Assignee

Comment 20

•

11 years ago

Er, glandium, please see comment 19.

Flags: needinfo?(mh+mozilla)

Mike Hommey [:glandium]

Comment 21

•

11 years ago

(In reply to :Ehsan Akhgari (needinfo? me!) from comment #19)
> glandium, where's the right place to add that to the build options?

I'd say in MOZ_OPTIMIZE_FLAGS definitions in js/src/configure.in, where it contains -O3, conditional to the compiler being clang (test -n "$CLANG_CXX")

Flags: needinfo?(mh+mozilla)

(no longer active)

Assignee

Comment 22

•

11 years ago

Attached patch Patch (v1) — Details — Splinter Review

Assignee: general → ehsan

Status: NEW → ASSIGNED

Attachment #742080 - Flags: review?(mh+mozilla)

Mike Hommey [:glandium]

Updated

•

11 years ago

Attachment #742080 - Flags: review?(mh+mozilla) → review+

(no longer active)

Assignee

Comment 23

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/694322457a64

(no longer active)

Assignee

Comment 24

•

11 years ago

My attempts at converting this code into a smaller standalone test case which we can use to file an LLVM bug based on have failed.  It would be fantastic if somebody who knows this code takes on doing that.

Ryan VanderMeulen [:RyanVM][PTO June 24-28]

Comment 25

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/694322457a64

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

(no longer active)

Assignee

Comment 26

•

11 years ago

clang 3.3 has a fix for this bug now, so I'm backing out the work-around patch:

https://hg.mozilla.org/integration/mozilla-inbound/rev/02f1eeef2613

Resolution: FIXED → WONTFIX

Ryan VanderMeulen [:RyanVM][PTO June 24-28]

Comment 27

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/02f1eeef2613