Closed Bug 1143966 Opened 10 years ago Closed 10 years ago

jsshell startup crash with Ubuntu 12.04 LTS (gcc 4.7)

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla39

Tracking Flags:

Tracking

Status

firefox39

---

fixed

People

(Reporter: decoder, Assigned: ehoogeveen)

Details

(Keywords: crash, Whiteboard: [fuzzblocker])

Attachments

(1 file, 1 obsolete file)

Work around a compiler bug in GCC by enforcing a specific optimization level for js::gc::IsBackgroundFinalized(). 10 years ago Emanuel Hoogeveen [:ehoogeveen] 2.16 KB, patch		Details \| Diff \| Splinter Review
Remove type specifier from AllocKind to avoid miscompilations on GCC. 10 years ago Emanuel Hoogeveen [:ehoogeveen] 838 bytes, patch	terrence : review+	Details \| Diff \| Splinter Review

Christian Holler (:decoder)

Reporter

Description

•

10 years ago

I'm seeing a startup crash on mozilla-central since rev 2b9f5019abf1: Program received signal SIGSEGV, Segmentation fault. js::gc::GetBackgroundAllocKind (kind=js::gc::OBJECT16) at js/src/jsgc.h:219 219 MOZ_ASSERT(!IsBackgroundFinalized(kind)); (gdb) bt #0 js::gc::GetBackgroundAllocKind (kind=js::gc::OBJECT16) at js/src/jsgc.h:219 #1 0x086f6b2d in js::NewObjectWithGivenTaggedProto (cxArg=cxArg@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, proto=proto@entry=..., parentArg=..., allocKind=js::gc::OBJECT16, newKind=newKind@entry=js::SingletonObject) at js/src/jsobj.cpp:1180 #2 0x081537f8 in js::NewObjectWithGivenTaggedProto (cx=cx@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, proto=..., parent=..., newKind=newKind@entry=js::SingletonObject) at js/src/jsobjinlines.h:587 #3 0x08209a88 in NewObjectWithGivenProto (newKind=js::SingletonObject, parent=..., proto=..., clasp=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>, cx=0x96b7560) at js/src/jsobjinlines.h:611 #4 js::GlobalObject::createInternal (cx=cx@entry=0x96b7560, clasp=clasp@entry=0x9651cc0 <JSRuntime::createSelfHostingGlobal(JSContext*)::shgClass>) at js/src/vm/GlobalObject.cpp:241 #5 0x0829e35b in JSRuntime::createSelfHostingGlobal (cx=cx@entry=0x96b7560) at js/src/vm/SelfHosting.cpp:1129 #6 0x082a3022 in JSRuntime::initSelfHosting (this=this@entry=0x968aba0, cx=cx@entry=0x96b7560) at js/src/vm/SelfHosting.cpp:1161 #7 0x086a4b6c in js::NewContext (rt=0x968aba0, stackChunkSize=8192) at js/src/jscntxt.cpp:127 #8 0x086a4bd9 in JS_NewContext (rt=0x968aba0, stackChunkSize=stackChunkSize@entry=8192) at js/src/jsapi.cpp:674 #9 0x0804d9ba in NewContext (rt=<optimized out>) at js/src/shell/js.cpp:5590 #10 0x080684f5 in main (argc=1, argv=0xffffd4c4, envp=0xffffd4cc) at js/src/shell/js.cpp:6337 Bisection says: The first bad revision is: changeset: 233529:2b9f5019abf1 user: Emanuel Hoogeveen date: Fri Mar 13 02:13:00 2015 +0100 summary: Bug 1139552 - Convert js::gc::AllocKind to an enum class and eliminate non-AllocKind indexing. r=terrence The crash only reproduces with the following compiler/build flags combination: Compiler: gcc (Ubuntu/Linaro 4.7.3-2ubuntu1~12.04) 4.7.3 Build Flags: --enable-debug --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --target=i686-pc-linux-gnu

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 1

•

10 years ago

I can reproduce this with GCC 4.7 without any changes. I can also reproduce it with GCC 4.8 if I remove the 'inline' keyword from GetBackgroundAllocKind. So it seems the underlying bug is still there, but the inlining saves it. Here's the disassembly (Intel style, because AT&T makes my eyes glaze over): Dump of assembler code for function js::gc::GetBackgroundAllocKind(js::gc::AllocKind): 0x0868c5a0 <+0>: push ebp 0x0868c5a1 <+1>: mov ebp,esp 0x0868c5a3 <+3>: push ebx 0x0868c5a4 <+4>: sub esp,0x14 0x0868c5a7 <+7>: call 0x806902b <__x86.get_pc_thunk.bx> 0x0868c5ac <+12>: add ebx,0xf70a48 0x0868c5b2 <+18>: cmp al,0x16 0x0868c5b4 <+20>: ja 0x868c5cd <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+45> => 0x0868c5b6 <+22>: cmp BYTE PTR [ebx+eax*1-0xbef1fb],0x0 0x0868c5be <+30>: jne 0x868c5f9 <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+89> 0x0868c5c0 <+32>: cmp al,0xb 0x0868c5c2 <+34>: ja 0x868c5d2 <js::gc::GetBackgroundAllocKind(js::gc::AllocKind)+50> 0x0868c5c4 <+36>: add esp,0x14 0x0868c5c7 <+39>: add eax,0x1 0x0868c5ca <+42>: pop ebx 0x0868c5cb <+43>: pop ebp 0x0868c5cc <+44>: ret 0x0868c5cd <+45>: call 0x868c4b0 <js::gc::IsBackgroundFinalized(js::gc::AllocKind)> 0x0868c5d2 <+50>: mov DWORD PTR [esp],0xde 0x0868c5d9 <+57>: lea edx,[ebx-0xbf7348] 0x0868c5df <+63>: lea eax,[ebx-0xbf720c] 0x0868c5e5 <+69>: call 0x871bf42 <MOZ_ReportAssertionFailure(char const*, char const*, int)> 0x0868c5ea <+74>: mov DWORD PTR ds:0x0,0xde 0x0868c5f4 <+84>: call 0x804aa20 <abort@plt> 0x0868c5f9 <+89>: mov DWORD PTR [esp],0xdd 0x0868c600 <+96>: lea edx,[ebx-0xbf7348] 0x0868c606 <+102>: lea eax,[ebx-0xc932e1] 0x0868c60c <+108>: call 0x871bf42 <MOZ_ReportAssertionFailure(char const*, char const*, int)> 0x0868c611 <+113>: mov DWORD PTR ds:0x0,0xdd 0x0868c61b <+123>: call 0x804aa20 <abort@plt> From this, I think the problem is that it's using eax instead of al in the offending cmp. *($ebx+$al-0xbef1fb) == 0x100, which would indeed truncate down to 0 and pass the assertion - but *($ebx+$eax-0xbef1fb) cannot be accessed. Unfortunately I have no idea how to work around this except by locally disabling optimizations (I haven't tried this yet, but I imagine it would work). I also couldn't say why GCC is failing in this way, or if it might be doing the same thing in other places.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 2

•

10 years ago

Oh, interesting. If I wrap GetBackgroundAllocKind in a #pragma GCC optimize ("O0") block, I get a different assertion failure, with kind == 127 (!). So maybe the problematic situation is created higher up the stack.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 3

•

10 years ago

Disregard that. I think the calling conventions used by -O3 and -O0 just aren't compatible - the AllocKind was reading random data. Using -O1 does work, and |-O2 -fno-expensive-optimizations|. -O3 fails, but none of the exposed options are responsible - I can add all of them to |-O2 -fno-expensive-optimizations|, and it won't change the generated code at all.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 4

•

10 years ago

Attached patch Work around a compiler bug in GCC by enforcing a specific optimization level for js::gc::IsBackgroundFinalized(). (obsolete) — Details — Splinter Review

This patch fixes the problem for me locally. It might need more work or it might not be the best solution, and I'd like to see if I can get a reduced testcase, but could you confirm that this fixes the crashes for you as well?

Assignee: nobody → emanuel.hoogeveen

Attachment #8579042 - Flags: feedback?(choller)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 5

•

10 years ago

So what happens here is that it stores the AllocKind on the stack as a uint8_t: 0x08199dc3 <+67>: mov BYTE PTR [esp+0xc],cl Then it reads it back off the stack as a uint32_t, so the high bits are no longer zero: 0x08199ddc <+92>: call 0x86fb710 0x086fb710 <+0>: push ebp 0x086fb711 <+1>: mov ebp,esp 0x086fb719 <+9>: mov eax,DWORD PTR [ebp+0x14] It inlines CanBeFinalizedInBackground(), and does the right thing for the IsBackgroundFinalized() check in there: 0x086fb735 <+37>: movzx edx,al 0x086fb738 <+40>: cmp BYTE PTR [ebx+edx*1-0xc43c1b],0x0 But then it calls into GetBackgroundAllocKind() (not inlined in GCC 4.7), and does the wrong thing: 0x086d9eb6 <+22>: cmp BYTE PTR [ebx+eax*1-0xc43c1b],0x0 I'm not sure what's at fault here: is the second check 'forgetting' that the high bits might not be zero, or is the optimization transforming the load and first (inline) check but forgetting to update the second?

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 6

•

10 years ago

This still reproduces in GCC 4.8.3, but doesn't appear to in GCC 4.9.1. It might just be dodging the bug in a different way though. It'd be really nice to have a reduced testcase, but I'm not sure how to approach reduction (my first naive attempt failed).

Jan de Mooij [:jandem]

Comment 7

•

10 years ago

Could it be https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64037 ?

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 8

•

10 years ago

Well spotted! Yes, it seems to be fixed in 4.8 trunk. It just missed 4.8.4, so I'm trying that now to make sure. As for where that leaves us.. Terrence suggested just changing the underlying type to be word-sized, so this problem can't happen. That will make AllocKind arrays bigger again, but otherwise I don't think there's much of a disadvantage. I don't know whether x64 is sensitive to this, either. That is to say, would we need to make the underlying type intptr_t/uintptr_t/size_t or would int/unsigned suffice? Terrence, what would you prefer? I'm wondering if a signed type might be slightly nicer so we can do things like MOZ_ASSERT(kind >= AllocKind::OBJECT0) again*. * For context, with an unsigned type this is a tautological compare, which makes warn-as-err builds fail, so I removed these assertions in bug 1139552 and added a static_assert that size_t(AllocKind::OBJECT0) == 0.

Flags: needinfo?(terrence)

Terrence Cole [:terrence]

Comment 9

•

10 years ago

I'd just remove the type specifier for now and add a fixme referencing this bug. We can revisit in a few months when gcc4.9 is standard. Maybe by then the MSVC issues with range iteration will also be fixed.

Flags: needinfo?(terrence)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 10

•

10 years ago

Attached patch Remove type specifier from AllocKind to avoid miscompilations on GCC. — Details — Splinter Review

I confirmed that the problem still exists on GCC 4.8.4. From the bug jandem linked, I believe it also exists on GCC 4.9.2, though just removing the inline keyword from GetBackgroundAllocKind() wasn't enough to trigger it there. It should be fixed in 4.8.5 and 4.9.3, whenever those are released.

Attachment #8579042 - Attachment is obsolete: true

Attachment #8579042 - Flags: feedback?(choller)

Attachment #8580271 - Flags: review?(terrence)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Updated

•

10 years ago

Status: NEW → ASSIGNED

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 11

•

10 years ago

AllocKind is far from the only enum class with a less-than-word-sized underlying type in the tree by the way: https://dxr.mozilla.org/mozilla-central/search?q=%22enum+class%22&case=true&redirect=true I guess those aren't causing known problems though :\

Terrence Cole [:terrence]

Comment 12

•

10 years ago

O_O Good point! Some days I wonder how anything works at all, ever.

Terrence Cole [:terrence]

Comment 13

•

10 years ago

Comment on attachment 8580271 [details] [diff] [review] Remove type specifier from AllocKind to avoid miscompilations on GCC. Review of attachment 8580271 [details] [diff] [review]: ----------------------------------------------------------------- Great!

Attachment #8580271 - Flags: review?(terrence) → review+

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 14

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6858cbe49471

Keywords: checkin-needed

Ryan VanderMeulen [:RyanVM]

Comment 15

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/1d9012617957

Keywords: checkin-needed

Phil Ringnalda (:philor)

Comment 16

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/1d9012617957

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

status-firefox39: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla39

You need to log in before you can comment on or make changes to this bug.

Bugzilla

jsshell startup crash with Ubuntu 12.04 LTS (gcc 4.7)

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: decoder, Assigned: ehoogeveen)

References

Details

(Keywords: crash, Whiteboard: [fuzzblocker])

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Comment 16

Attachment

General

Description

File Name

Content Type