Closed
Bug 703135
Opened 12 years ago
Closed 11 years ago
Crash in oc_huff_tree_collapse()
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
mozilla21
People
(Reporter: gps, Assigned: cpearce)
References
Details
(Keywords: crash, dogfood, Whiteboard: [metro-mvp])
Attachments
(7 files, 1 obsolete file)
13.41 KB,
text/plain
|
Details | |
10.95 KB,
text/plain
|
Details | |
635 bytes,
text/html
|
Details | |
1.58 KB,
patch
|
mbrubeck
:
checkin+
|
Details | Diff | Splinter Review |
8.99 KB,
text/plain
|
Details | |
4.34 KB,
patch
|
derf
:
review+
|
Details | Diff | Splinter Review |
1.67 KB,
patch
|
Details | Diff | Splinter Review |
I built Nightly on Windows 8 and Visual Studio 2011 and was treated to a crash a few seconds after application start. Unhandled exception at 0x7444A42A (xul.dll) in firefox.exe: 0xC00000005: Access violation writing location 08BBC1BF0F. The top of the stack is huffdec.c:346:oc_huff_tree_collapse(). It is failing on the "if(_tree!=NULL)_tree[node[l]++]=(ogg_int16_t)nbits;" line. First few frames are: oc_huff_tree_unpack() oc_dec_headerin() th_decode_headerin() nsTheoraState::DecodeHeader() Interestingly, the memory address of _tree is 0x052fc0b0 and node is 0x0a67f55c. This is pretty far away from where the write exception occurred, 0x08BBC1BF0F. The disassembly reveals the error occurred at instruction: inc word ptr ds:[8BC1BF0Fh] which is the first instruction after the annotation for the "if(_tree!=NULL)..." line. Steps to reproduce: 1) Build Firefox on Windows 8 and Visual Studio 2011 (using vanilla .mozconfig) 2) Load https://www.mozilla.org/projects/firefox/prerelease.html (default page on empty Nightly profile). I have a Visual Studio crash dump file and a VirtualBox snapshot that reliably produces the bug. This could be a MSVC2011 and/or Windows 8 bug. Both are developer previews, after all.
Reporter | ||
Updated•12 years ago
|
Summary: oc_huff_tree_collapse → Crash in oc_huff_tree_collapse()
Comment 1•12 years ago
|
||
Wow. Can you attach the full disassembly of oc_huff_tree_collapse so we can see the context?
Reporter | ||
Comment 2•12 years ago
|
||
Crash is at 7444A42A.
EAX = 00000100 EBX = 00000000 ECX = 00000000 EDX = 00000008 ESI = 00000000 EDI = 00000101 EIP = 7444A42A ESP = 0A67F4D0 EBP = 0A67F55C EFL = 00010202
+ _tree 0x052fc0b0 {27792} short *
+ _tokens 0x0a67f5c8 {4 '\x4', 4 '\x4'} unsigned char[2] *
_ntokens 92 int
+ node 0x0a67f55c {0, 295, 321, 2663, 0, 0, 25036, 29764, -2012, 2663, 32, 0, 91, 0, 0, -4096, 4, 0, -24046, ...} short[34]
+ depth 0x0a67f514 "" unsigned char[34]
+ last 0x0a67f538 "[IF" unsigned char[34]
> xul.dll!oc_huff_tree_collapse(short * _tree, unsigned char[2] * _tokens, int _ntokens) Line 346 C
xul.dll!oc_huff_trees_unpack(oc_pack_buf * _opb, short * * _nodes) Line 403 C
xul.dll!oc_dec_headerin(oc_pack_buf * _opb, th_info * _info, th_comment * _tc, th_setup_info * * _setup, ogg_packet * _op) Line 214 C
xul.dll!th_decode_headerin(th_info * _info, th_comment * _tc, th_setup_info * * _setup, ogg_packet * _op) Line 242 C
xul.dll!nsTheoraState::DecodeHeader(ogg_packet * aPacket) Line 267 C++
xul.dll!nsOggReader::ReadHeaders(nsOggCodecState * aState) Line 163 C++
xul.dll!nsOggReader::ReadMetadata(nsVideoInfo * aInfo) Line 248 C++
xul.dll!nsBuiltinDecoderStateMachine::DecodeMetadata() Line 1370 C++
xul.dll!nsBuiltinDecoderStateMachine::DecodeThreadRun() Line 428 C++
xul.dll!nsRunnableMethodImpl<void (__thiscall nsMediaChannelStream::*)(void),0>::Run() Line 346 C++
xul.dll!nsThread::ProcessNextEvent(bool mayWait, bool * result) Line 631 C++
xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, bool mayWait) Line 245 C++
xul.dll!nsThread::ThreadFunc(void * arg) Line 272 C++
nspr4.dll!_PR_NativeRunThread(void * arg) Line 448 C
nspr4.dll!pr_root(void * arg) Line 122 C
msvcr110.dll!_callthreadstartex() Line 308 C
msvcr110.dll!_threadstartex(void * ptd) Line 286 C
kernel32.dll!75629391() Unknown
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
ntdll.dll!77ed31b0() Unknown
ntdll.dll!77ed3183() Unknown
Comment 3•12 years ago
|
||
The crashing instruction inc word ptr ds:[8BC1BF0Fh] is an increment to a hard-coded constant address (qualified with the ds segment register). I am not at all surprised that this causes a crash, but I have no idea where the compiler is getting this from, as there are no increments of a global variable anywhere in this code.
Reporter | ||
Comment 4•12 years ago
|
||
FWIW, the value at that DS address is 2344730383. When stepping through things in the debugger, I noticed that the current line position was not progressing linearly. Starting at the nbits=oc_huff_tree_collapse_depth(...) line (343), it went: 343, 345, 344, 345, 344, 345, 346, crash Looking at the disassembly annotations, it does appear that lines are being evaluated out of order. I guess the optimizer was a little too aggressive? MSVC 2011 bug? Speaking of disassembly source annotations, they are different now! The assembly still looks the same, but the attributions of source lines are off by a few instructions! And, no, the binary didn't change. One would think Visual Studio would be deterministic here. Most interesting. You want me to perform a debug build (without optimizations)?
Reporter | ||
Comment 5•12 years ago
|
||
Ehsan: you got m-c building on MSVC 2011 on Windows 7, right? Could you see if you experience this bug with a regular/non-debug build? If not, I'd be interested in seeing what your assembly looks like for the infringing function.
Comment 6•12 years ago
|
||
Gregory: worth trying to reproduce with a less-optimized build. That would confirm it's an optimizer bug.
Reporter | ||
Comment 7•12 years ago
|
||
I built with --enable-debug --no-optimize and the crash went away. Disassembly attached. The same revision of the source code was used. I think we have an optimizer bug!
Reporter | ||
Comment 8•12 years ago
|
||
I meant --disable-optimize, duh. Sorry for the bugspam.
Comment 9•12 years ago
|
||
Great! So can we report it?
![]() |
||
Comment 10•11 years ago
|
||
This is showing up on our vs11 builders as well. Appears to be an issue with the use of jemalloc.
Blocks: 74734
![]() |
||
Updated•11 years ago
|
![]() |
||
Comment 11•11 years ago
|
||
![]() |
||
Comment 12•11 years ago
|
||
This is rather strange. In debugging I added a simple print statement to the end of oc_huff_tree_collapse to dump ntree when _tree is NULL. With this addition the crash goes away. Commenting out the print statement brings is back. Some here commented that they thought this might be a compiler bug which seemed somewhat unlikely to me, but with behavior like this I'm wondering if a compiler issue might be the case.
Comment 14•11 years ago
|
||
Does the offending inc ds:[-data ptr-] instruction go away with the printf?
![]() |
||
Comment 15•11 years ago
|
||
This quirky little patch seems to address the problem.
![]() |
||
Comment 16•11 years ago
|
||
minus the makefile change.
Attachment #635417 -
Attachment is obsolete: true
![]() |
||
Comment 17•11 years ago
|
||
We could land something like this or just wait to test the final I suppose. We landed another similar quirk in bug 761279. I'm going to hold off until our new vs2012RC elm builders are up and running. Then do some additional testing.
![]() |
||
Comment 18•11 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #16) > Created attachment 635418 [details] [diff] [review] > patch > > minus the makefile change. This didn't fix the problem on elm, although it did solve it in a local build. Which means this is going to be tricky to track down.
![]() |
||
Comment 19•11 years ago
|
||
Lat week I landed a few changes - I landed the patch here, I disabled jemalloc, and I disabled the first run start screen with the webm video. Today when I tested the latest builds by loading the video manually there's no crash. elm builds - https://bugzilla.mozilla.org/show_bug.cgi?id=703135 So I'm going enable the start screen again to see if the crash comes back. (When I first tested the jemalloc and patch changes, it didn't address the problem.) One other point, I don't think we are doing full clobber builds on elm currently which I think needs to change. If that's the case I can't be certain the changes I'm making are getting incorporated into new builds.
![]() |
||
Comment 20•11 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #19) > elm builds - > https://bugzilla.mozilla.org/show_bug.cgi?id=703135 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/elm-win32-metro/
![]() |
||
Updated•11 years ago
|
Whiteboard: completed-elm
![]() |
||
Comment 21•11 years ago
|
||
This appears to be fixed in the final release of vc 2012.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 22•11 years ago
|
||
Deeply saddened, but this is still happening on elm (pull from 2 days ago) with vs2012 on a samsung slate with win8 (release win8 not dev preview). Annoying, this means that nightly crashes promptly on startup, because the first page it tries to run is www.mozilla.org/projects/firefox/prerelease.html which contains a video. my stack trace is: > gkmedias.dll!oc_huff_tree_collapse(short * _tree, unsigned char[2] * _tokens, int _ntokens) Line 346 C gkmedias.dll!oc_huff_trees_unpack(oc_pack_buf * _opb, short * * _nodes) Line 403 C gkmedias.dll!oc_dec_headerin(oc_pack_buf * _opb, th_info * _info, th_comment * _tc, th_setup_info * * _setup, ogg_packet * _op) Line 214 C gkmedias.dll!th_decode_headerin(th_info * _info, th_comment * _tc, th_setup_info * * _setup, ogg_packet * _op) Line 242 C xul.dll!nsTheoraState::DecodeHeader(ogg_packet * aPacket) Line 312 C++ xul.dll!nsOggReader::ReadHeaders(nsOggCodecState * aState) Line 142 C++ xul.dll!nsOggReader::ReadMetadata(nsVideoInfo * aInfo, nsDataHashtable<nsCStringHashKey,nsCString> * * aTags) Line 258 C++ xul.dll!nsBuiltinDecoderStateMachine::DecodeMetadata() Line 1733 C++ xul.dll!nsBuiltinDecoderStateMachine::DecodeThreadRun() Line 490 C++ xul.dll!nsRunnableMethodImpl<enum tag_nsresult (__thiscall mozilla::places::`anonymous namespace'::VisitedQuery::*)(void),1>::Run() Line 351 C++ xul.dll!nsThread::ProcessNextEvent(bool mayWait, bool * result) Line 627 C++ xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, bool mayWait) Line 221 C++ xul.dll!nsThread::ThreadFunc(void * arg) Line 265 C++ nspr4.dll!_PR_NativeRunThread(void * arg) Line 417 C nspr4.dll!pr_root(void * arg) Line 90 C msvcr110.dll!_callthreadstartex() Line 354 C msvcr110.dll!_threadstartex(void * ptd) Line 332 C kernel32.dll!76f18543() Unknown [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll] ntdll.dll!77efac69() Unknown ntdll.dll!77efac3c() Unknown
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: completed-elm → [metro-mvp]
Comment 23•11 years ago
|
||
(In reply to :Ally Naaktgeboren from comment #22) > my stack trace is: > > gkmedias.dll!oc_huff_tree_collapse(short * _tree, unsigned > char[2] * _tokens, int _ntokens) Line 346 C Does the disassembly here look the same as before?
Updated•11 years ago
|
Attachment #635418 -
Flags: checkin+
Assignee | ||
Comment 24•11 years ago
|
||
I'm hitting this in 100% of the time locally in content/media/test/test_loop.html mochitest, with VS2012/Win8/Opt builds on my Lenovo W530. Here's the dissasembly. We're still crashing at the "inc word ptr ds:[8BC1BF0Fh]" instruction. As far as I can tell my VS2012 is fully updated to version "11.0.5110601 Update 1".
Assignee | ||
Comment 25•11 years ago
|
||
I reported against Visual Studio 2012 using the "Report a bug" feature of Visual Studio, it's filed here: https://connect.microsoft.com/VisualStudio/feedback/details/777087/c-compiler-optimization-bug
Assignee | ||
Comment 26•11 years ago
|
||
I uploaded a dump to the connect.ms bug, and the issue has been "routed to the appropriate VS development team for investigation."
Assignee | ||
Comment 27•11 years ago
|
||
This crash is annoying me. Let's disable compiler optimizations in that function when building with MSVC2012 and later until such a time as a fix is released for MSVC2012.
Attachment #708895 -
Flags: review?(tterribe)
Updated•11 years ago
|
Attachment #708895 -
Flags: review?(tterribe) → review+
Assignee | ||
Comment 28•11 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #27) > Created attachment 708895 [details] [diff] [review] > Patch: Disable optimize in oc_huff_tree_collapse() in MSVC2012 Landed this patch: https://hg.mozilla.org/integration/mozilla-inbound/rev/f24905861be5
Comment 29•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/f24905861be5
Assignee: nobody → cpearce
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Assignee | ||
Comment 31•10 years ago
|
||
I've heard back from Microsoft in the support issue I filed. They suggest we disable the specific optimization that fails only: Hi Chris, Thank you for taking the time to provide your feedback, this indeed is a bug in our product. After reviewing your reported issue, in the context of all the issues reported to us, we are considering fixing this bug in one of our future releases. In the meantime, could you please consider accepting this workaround until you get a product release from us with the fix? -> As an alternative to disabling all optimizations for the affected function, disable just `/Oy` optimization for the affected function, `oc_huff_tree_collapse`, using the following directive: #pragma optimize( "y", off ) // disable frame pointer omission // ... // function definition // ... #pragma optimize( "", on ) // reset optimizations to what was specified at the command line If this issue is severe, causing critical business situations or blocking your product development or deployment, please go to http://support.microsoft.com or call 1-800-MICROSOFT for assistance. For Microsoft premier customers, please contact your administrator, your Technical Account Manager, or your Microsoft premier account representative. I am closing this MSConnect item. Feel free to respond if you need anything else. Thanks, Daniel Podder VC++ Team
Comment 32•10 years ago
|
||
Comment 33•10 years ago
|
||
I don't have an windows build environment to verify the relaxed work-around, but here's a try push. https://tbpl.mozilla.org/?tree=Try&rev=81cfe3121394
You need to log in
before you can comment on or make changes to this bug.
Description
•