EXCEPTION_ILLEGAL_INSTRUCTION crashes [@ mp4parse_new] starting in nightly 2016-02-27 (enabling rust in nightly)

VERIFIED FIXED in Firefox 47

Status: VERIFIED FIXED
Type: defect
Priority: P1
Severity: normal
Reporter: dbaron
Assigned: rillian
Keywords: crash, topcrash
Version: Trunk
Target Milestone: mozilla47
Firefox Tracking Flags: firefox47+ fixed
Attachments: 1 attachment, 2 obsolete attachments

Flags: needinfo?(giles)
Summary: crashes [@ mp4parse_new] starting in nightly 2016-02-27 → EXCEPTION_ILLEGAL_INSTRUCTION crashes [@ mp4parse_new] starting in nightly 2016-02-27 (enabling rust in nightly)
The CPU breakdown of the crashes so far is:

Rank 	Cpu info 	Count 	%
1 	AuthenticAMD family 6 model 8 stepping 1 | 1 	85 	75.89 %
2 	AuthenticAMD family 6 model 10 stepping 0 | 1 	21 	18.75 %
3 	AuthenticAMD family 6 model 8 stepping 0 | 1 	3 	2.68 %
4 	GenuineIntel family 6 model 8 stepping 6 | 1 	2 	1.79 %
5 	CentaurHauls family 6 model 9 stepping 10 | 1 	1 	0.89 %
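
The percentage column follows directly from the counts. As a quick check, here is a minimal Rust sketch that recomputes it; the `percent` helper is a name made up for this illustration, and the counts are copied from the table above.

```rust
/// Share of `count` in `total`, as a percentage rounded to two decimals.
/// (Hypothetical helper, not part of any crash-stats tooling.)
fn percent(count: u32, total: u32) -> f64 {
    let raw = f64::from(count) * 100.0 / f64::from(total);
    (raw * 100.0).round() / 100.0
}

fn main() {
    // Crash counts from the table above, in rank order.
    let counts = [85u32, 21, 3, 2, 1];
    let total: u32 = counts.iter().sum();

    for (rank, &count) in counts.iter().enumerate() {
        println!(
            "rank {}: {} crashes, {:.2} %",
            rank + 1,
            count,
            percent(count, total)
        );
    }

    // Ranks 1 to 3 are the AuthenticAMD rows.
    let amd: u32 = counts[..3].iter().sum();
    println!(
        "AuthenticAMD: {} of {} crashes ({:.2} %)",
        amd,
        total,
        percent(amd, total)
    );
}
```

The three AuthenticAMD rows together account for 109 of the 112 reports in this sample.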
Disassembling the minidump from https://crash-stats.mozilla.com/report/index/1870d0d1-fe33-4d1e-91cb-fad322160228 indicates that this is crashing calling an SSE instruction:

5f53c2ac:	f2 0f 10 05 28 a1 39 	movsd  0x6139a128,%xmm0
^^^^^^^^^	^^^^^^^^^^^^^^^^^^^^^	^^^^^^^^^^^^^^^^^^^^^^^^
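
To see why the leading bytes of that line identify the instruction, the sketch below classifies the four `0F 10` load encodings by their mandatory prefix and splits the ModRM byte. It is only an illustration: `classify`, `modrm`, and `Ext` are names invented here, and the sketch covers just these four encodings rather than being a general disassembler.

```rust
/// Instruction-set extension that introduced an encoding.
#[derive(Debug, PartialEq)]
enum Ext {
    Sse,
    Sse2,
}

/// Classify the four `0F 10` load encodings by their mandatory prefix.
/// Anything else is reported as unknown (None).
fn classify(bytes: &[u8]) -> Option<(&'static str, Ext)> {
    match bytes {
        [0xF2, 0x0F, 0x10, ..] => Some(("movsd", Ext::Sse2)),
        [0x66, 0x0F, 0x10, ..] => Some(("movupd", Ext::Sse2)),
        [0xF3, 0x0F, 0x10, ..] => Some(("movss", Ext::Sse)),
        [0x0F, 0x10, ..] => Some(("movups", Ext::Sse)),
        _ => None,
    }
}

/// Split a ModRM byte into its (mod, reg, rm) fields.
fn modrm(byte: u8) -> (u8, u8, u8) {
    (byte >> 6, (byte >> 3) & 0b111, byte & 0b111)
}

fn main() {
    // The seven bytes exactly as printed in the disassembly line above.
    let insn = [0xF2, 0x0F, 0x10, 0x05, 0x28, 0xA1, 0x39];

    let (mnemonic, ext) = classify(&insn).expect("encoding not covered by this sketch");
    let (md, reg, rm) = modrm(insn[3]);

    // mod=0 with rm=5 selects a 32-bit absolute address in 32-bit mode;
    // reg=0 names xmm0 as the destination register.
    println!("{} ({:?}): mod={} reg=xmm{} rm={}", mnemonic, ext, md, reg, rm);
}
```

The `F2` prefix is what separates the double-precision `movsd` form from the `F3`-prefixed single-precision `movss`.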

(In reply to David Baron [:dbaron] ⌚️UTC+8 from comment #1)
> The CPU breakdown of the crashes so far is:
> 
> Rank 	Cpu info 	Count 	%
> 1 	AuthenticAMD family 6 model 8 stepping 1 | 1 	85 	75.89 %

This is an "AMD Athlon XP2400" (released in 2002!)

> 4 	GenuineIntel family 6 model 8 stepping 6 | 1 	2 	1.79 %

This appears to be a Pentium 3 or Celeron of some variety.

I guess this was an easy way to answer the "how many of our Nightly users have SSE-capable processors" question! FTR, I don't actually think it's that many, the crash stats indicate ~20 unique users.
This is probably from llvm assuming sse2 is available. We can pass a flag to turn it off.
Assignee: nobody → giles
Flags: needinfo?(giles)
Priority: -- → P1
Not sure if this works; I don't see where RUSTFLAGS is exported.
Clean up the patch.

Add an export statement for RUSTFLAGS, which doesn't appear to propagate otherwise.

Builds: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5cc5d770cea5
Attachment #8724933 - Attachment is obsolete: true
Attachment #8724987 - Flags: review?(ted)
Comment on attachment 8724987 [details] [diff] [review]
Disable sse2 code generation on sse2 v2

Whoever gets to it first. :)
Attachment #8724987 - Flags: review?(mshal)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)

> Disassembling the minidump ... indicates that this is crashing calling an SSE instruction:
> 
> 5f53c2ac:	f2 0f 10 05 28 a1 39 	movsd  0x6139a128,%xmm0

Thanks for disassembling. That was very helpful!

Just for clarity, I believe movsd with an xmm register is SSE2, and not part of the original SSE extension set?
https://en.wikipedia.org/wiki/X86_instruction_listings#SSE2_instructions
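
For anyone wanting to confirm this on an affected machine, a minimal Rust sketch of a runtime SSE2 check follows. It uses the standard library's `is_x86_feature_detected!` macro; the function names `cpu_has_sse2` and `describe` are made up for this example. Note the assumption built in here: a check like this only reports what the CPU supports. It would not by itself prevent the crash, since the compiler emits SSE2 instructions throughout the generated code.

```rust
/// True when the running CPU reports SSE2 support (x86 and x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_has_sse2() -> bool {
    std::arch::is_x86_feature_detected!("sse2")
}

/// Other architectures have no SSE2, so report false there.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_has_sse2() -> bool {
    false
}

/// Human-readable summary of the detection result.
fn describe(has_sse2: bool) -> &'static str {
    if has_sse2 {
        "CPU reports SSE2: movsd with an xmm register can execute"
    } else {
        "no SSE2: movsd with an xmm register would raise an illegal-instruction fault"
    }
}

fn main() {
    println!("{}", describe(cpu_has_sse2()));
}
```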
Comment on attachment 8724987 [details] [diff] [review]
Disable sse2 code generation on sse2 v2

As discussed in IRC, we need an AC_SUBST as well.
Attachment #8724987 - Flags: review?(ted)
Attachment #8724987 - Flags: review?(mshal)
Attachment #8724987 - Flags: feedback+
Add an AC_SUBST line, per feedback. Thanks for the quick attention!

Pushed to try as https://treeherder.mozilla.org/#/jobs?repo=try&revision=972f4e35226e
Attachment #8724987 - Attachment is obsolete: true
Attachment #8725000 - Flags: review?(mshal)
Comment on attachment 8725000 [details] [diff] [review]
Disable sse2 code generation on sse2 v3

LGTM. Please double-check the try build when it finishes to make sure that your flag shows up on the command-line before landing.
Attachment #8725000 - Flags: review?(mshal) → review+
Will do. Thanks again.
Looks like the flag is propagating now.

16:06:42     INFO -  c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/rustc/bin/rustc --target=i686-pc-windows-msvc -C target-feature=-sse2 -O --crate-type staticlib --emit dep-info=.deps/liblib.lib.pp,link=liblib.lib c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/media/libstagefright/binding/mp4parse/lib.rs
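
The effect of `-C target-feature=-sse2` can also be observed from inside the compiled crate. The following is a small sketch, not part of the landed patch: `cfg!(target_feature = "sse2")` evaluates at compile time to whether SSE2 is enabled for the build, so it should report false when the flag shown in the log above is passed. The function name `compiled_with_sse2` is an assumption of this example.

```rust
/// True when this crate was compiled with SSE2 enabled as a target feature.
/// Passing `-C target-feature=-sse2` to rustc is expected to make this false.
fn compiled_with_sse2() -> bool {
    cfg!(target_feature = "sse2")
}

fn main() {
    if compiled_with_sse2() {
        println!("SSE2 enabled at compile time: the compiler may emit SSE2 instructions");
    } else {
        println!("SSE2 not enabled at compile time");
    }
}
```

Building this once with the default flags and once with `-C target-feature=-sse2` for an i686 target should print different lines, which is a cheap way to check that the flag reaches rustc.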
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> I guess this was an easy way to answer the "how many of our Nightly users
> have SSE-capable processors" question! FTR, I don't actually think it's that
> many, the crash stats indicate ~20 unique users.

How does crash stats end up making a crash that affects ~20 unique users the top #1 crasher?
(In reply to Mike Hommey [:glandium] from comment #14)
> How does crash stats end up making a crash that affects ~20 unique users the
> top #1 crasher?

When builds tend towards being more stable, it only requires 10-20 crashes per day to be the top crash in nightly.
Comment on attachment 8725000 [details] [diff] [review]
Disable sse2 code generation on sse2 v3

Review of attachment 8725000 [details] [diff] [review]:
-----------------------------------------------------------------

> By default llvm (and rustc) generate sse2 instructions on **x84**, but

What is "x84"? I suppose it's x86 :)
(In reply to David Baron [:dbaron] ⌚️UTC+8 from comment #15)

> When builds tend towards being more stable, it only requires 10-20 crashes
> per day to be the top crash in nightly.

Seems like good news, really.

> By default llvm (and rustc) generate sse2 instructions on **x84**, but

Oops! Yes, x86 (i686 by the target triple).
(In reply to David Baron [:dbaron] ⌚️UTC+8 from comment #15)
> When builds tend towards being more stable, it only requires 10-20 crashes
> per day to be the top crash in nightly.

FWIW, we currently have 210 crashes with this signature over 3 days of builds, but yeah, it's likely the same few people hitting this repeatedly.

Comment 19 (bugherder)
https://hg.mozilla.org/mozilla-central/rev/a5afedbecaad
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
This crash has not occurred since the fix landed. It stopped with the 2016-03-01 build, which was the last build without the fix. With that data, I consider this verified.
Status: RESOLVED → VERIFIED