Closed Bug 1000537 Opened 11 years ago Closed 11 years ago

minidump_stackwalk "ERROR: ReadBytes: read 0/32 | ERROR: Minidump cannot read header"

Categories

(Toolkit :: Crash Reporting, defect)

x86
Windows 7
defect
Not set
normal

RESOLVED INCOMPLETE

People

(Reporter: karlt, Unassigned)

Details

(Keywords: intermittent-failure, sheriffing-P1)

+++ This bug was initially created as a clone of Bug #820362 +++

Windows XP 32-bit mozilla-inbound opt test crashtest on 2014-04-23 00:57:47 PDT for push 80dba24cc929
slave: t-xp32-ix-119
https://tbpl.mozilla.org/php/getParsedLog.php?id=38308247&tree=Mozilla-Inbound#error0
Blocks: 996231
This is just "we wrote an empty minidump".
This seems to be happening consistently in what is likely an off main thread crash in bug 996231. Is there a theory why the dump might be empty? The assertion failure doesn't directly imply out of memory.

There is at least one failure that is a little different, perhaps a truncated dump.
https://tbpl.mozilla.org/php/getParsedLog.php?id=38210678&tree=Mozilla-Inbound#error0

minidump.cc:3860: INFO: Minidump not byte-swapping minidump
minidump.cc:4226: INFO: GetStream: type 7 not present
minidump.cc:4226: INFO: GetStream: type 7 not present
minidump.cc:4226: INFO: GetStream: type 1197932545 not present
minidump.cc:4226: INFO: GetStream: type 6 not present
minidump.cc:4226: INFO: GetStream: type 1197932546 not present
minidump.cc:4226: INFO: GetStream: type 4 not present
minidump.cc:4226: INFO: GetStream: type 3 not present
minidump_processor.cc:112: ERROR: Minidump c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpciq26c.mozrunner\minidumps\0aa98f82-da7e-4cc5-bbf0-6e094ec701a3.dmp has no thread list
minidump.cc:3787: INFO: Minidump closing minidump
minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed
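For anyone decoding the stream numbers in that log, here is a small Python sketch that maps them to names. The type IDs are the ones I believe Breakpad's minidump_format.h defines (1197932545 is 0x47670001 and 1197932546 is 0x47670002); the helper name missing_streams is hypothetical and not part of Breakpad or minidump_stackwalk.

```python
import re

# Stream type IDs, assumed to match Breakpad's minidump_format.h.
STREAM_NAMES = {
    3: "MD_THREAD_LIST_STREAM",
    4: "MD_MODULE_LIST_STREAM",
    6: "MD_EXCEPTION_STREAM",
    7: "MD_SYSTEM_INFO_STREAM",
    0x47670001: "MD_BREAKPAD_INFO_STREAM",
    0x47670002: "MD_ASSERTION_INFO_STREAM",
}

# Matches the "GetStream: type N not present" lines in the stackwalk log.
MISSING_RE = re.compile(r"GetStream: type (\d+) not present")


def missing_streams(log_text):
    """Return the distinct missing stream names, in first-seen order."""
    seen = []
    for match in MISSING_RE.finditer(log_text):
        stream_type = int(match.group(1))
        name = STREAM_NAMES.get(stream_type, "unknown(%d)" % stream_type)
        if name not in seen:
            seen.append(name)
    return seen
```

Run against the log above, this should report the system info, Breakpad info, exception, assertion info, module list and thread list streams as missing, which is every stream the processor asks for.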
Right, that reads as "we managed to write a minidump header with a directory list, but not any of the streams". Unfortunately this is Windows, and on Windows we use Microsoft's MiniDumpWriteDump as a black box--we don't have any knowledge of its inner workings. We did try to do some mitigation for OOM/VM fragmentation scenarios in bug 837835/bug 943051, which had a noticeable impact on empty dumps in crash-stats. Short of having a reproducible testcase I'm not sure there's anything actionable we can do here. If you can reproduce this on try you could try bumping the memory reservation we use up a bit and see if that helps: http://hg.mozilla.org/mozilla-central/annotate/1d0496e30feb/toolkit/crashreporter/nsExceptionHandler.cpp#l373
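To tell the two failure shapes in this bug apart without running minidump_stackwalk, a rough Python sketch follows. It assumes the documented MINIDUMP_HEADER layout (32 bytes, which would be consistent with the "read 0/32" in the summary for a zero-length file) and 12-byte directory entries. The function name classify_minidump and its return strings are hypothetical, made up for illustration.

```python
import struct

MDMP_SIGNATURE = 0x504D444D   # "MDMP" read as a little-endian uint32
HEADER_FORMAT = "<IIIIIIQ"    # signature, version, stream count, directory RVA,
                              # checksum, timestamp, flags
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)        # 32 bytes
DIRECTORY_FORMAT = "<III"     # stream type, data size, data RVA
DIRECTORY_SIZE = struct.calcsize(DIRECTORY_FORMAT)  # 12 bytes


def classify_minidump(data):
    """Roughly classify the raw bytes of a .dmp file (illustrative only)."""
    if len(data) == 0:
        return "empty"
    if len(data) < HEADER_SIZE:
        return "truncated header"
    signature, _version, stream_count, directory_rva = struct.unpack_from(
        "<IIII", data, 0)
    if signature != MDMP_SIGNATURE:
        return "bad signature"
    usable = 0
    for index in range(stream_count):
        offset = directory_rva + index * DIRECTORY_SIZE
        if offset + DIRECTORY_SIZE > len(data):
            return "truncated directory"
        stream_type, size, rva = struct.unpack_from(
            DIRECTORY_FORMAT, data, offset)
        # Type 0 is the unused-stream marker; also require the stream's
        # bytes to lie entirely inside the file.
        if stream_type != 0 and size > 0 and rva + size <= len(data):
            usable += 1
    return "ok" if usable else "no streams"
```

Usage would be something like classify_minidump(open(path, "rb").read()). An "empty" result corresponds to the original failure here, and "no streams" to the header-plus-directory case in the previous comment.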
Out of a couple dozen crashtest runs, I got two empty dumps and one valid one:
https://tbpl.mozilla.org/?tree=Try&rev=7f1d28874fcf

The run with the intact dump didn't indicate any memory problems:

SystemMemoryUsePercentage=29
TotalVirtualMemory=2147352576
AvailableVirtualMemory=1523236864
AvailablePageFile=4348211200
AvailablePhysicalMemory=2252025856

I've got another try build going with higher breakpad reservation (+ commit), but I don't know if that will change anything.
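As a quick sanity check on those numbers, here is a small Python sketch that parses the key=value annotations and computes the share of virtual address space still available. The helper names are hypothetical, and the field names are simply the ones shown in this comment.

```python
def parse_annotations(text):
    """Parse whitespace-separated KEY=VALUE pairs with integer values."""
    values = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and value.isdigit():
            values[key] = int(value)
    return values


def available_virtual_fraction(values):
    """Fraction of the process's virtual address space still available."""
    return values["AvailableVirtualMemory"] / values["TotalVirtualMemory"]


annotations = parse_annotations(
    "SystemMemoryUsePercentage=29 TotalVirtualMemory=2147352576 "
    "AvailableVirtualMemory=1523236864 AvailablePageFile=4348211200 "
    "AvailablePhysicalMemory=2252025856"
)
fraction = available_virtual_fraction(annotations)
```

With the values above the fraction comes out at roughly 0.71, so about 1.4 GiB of the 2 GiB address space was free at crash time. That only describes the run that produced the intact dump, not the runs with empty dumps.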
> I've got another try build going with higher breakpad reservation (+
> commit), but I don't know if that will change anything.

Yeah, that didn't help at all.
For reference, the crash with the valid dump from that try run was:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38610536&tree=Try

22:47:05 INFO - Crash reason: EXCEPTION_ACCESS_VIOLATION_READ
22:47:05 INFO - Crash address: 0x0
22:47:05 INFO - Thread 82 (crashed)
22:47:05 INFO - 0 xul.dll!mozilla::AudioNodeExternalInputStream::TrackMapEntry::ResampleChannels(nsTArray<void const *> const &,unsigned int,mozilla::AudioSampleFormat,float) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 171 + 0x6]
22:47:05 INFO - eip = 0x0205faf6 esp = 0x19ede714 ebp = 0x19ede758 ebx = 0x19edf7c0
22:47:05 INFO - esi = 0x02b13690 edi = 0x00000001 eax = 0x00000000 ecx = 0x00000000
22:47:05 INFO - edx = 0x00000002 efl = 0x00010246
22:47:05 INFO - Found by: given as instruction pointer in context
22:47:05 INFO - 1 xul.dll!mozilla::AudioNodeExternalInputStream::TrackMapEntry::ResampleInputData(mozilla::AudioSegment *) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 198 + 0x13]
22:47:05 INFO - eip = 0x0205fc4c esp = 0x19ede760 ebp = 0x19edf7f4
22:47:05 INFO - Found by: call frame info
22:47:05 INFO - 2 xul.dll!mozilla::AudioNodeExternalInputStream::ProcessInput(__int64,__int64,unsigned int) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 412 + 0x10]
22:47:05 INFO - eip = 0x02060547 esp = 0x19edf7fc ebp = 0x19edfd70
22:47:05 INFO - Found by: call frame info
22:47:05 INFO - 3 xul.dll!mozilla::MediaStreamGraphImpl::ProduceDataForStreamsBlockByBlock(unsigned int,int,__int64,__int64) [MediaStreamGraph.cpp:7f1d28874fcf : 1178 + 0x2a]
22:47:05 INFO - eip = 0x020650df esp = 0x19edfd78 ebp = 0x19edfda8
22:47:05 INFO - Found by: previous frame's frame pointer
22:47:05 INFO - 4 xul.dll!mozilla::MediaStreamGraphImpl::RunThread() [MediaStreamGraph.cpp:7f1d28874fcf : 1342 + 0x1b]
22:47:05 INFO - eip = 0x02070c99 esp = 0x19edfdb0 ebp = 0x19edfe44
22:47:05 INFO - Found by: call frame info
22:47:05 INFO - 5 xul.dll!mozilla::`anonymous namespace'::MediaStreamGraphInitThreadRunnable::Run() [MediaStreamGraph.cpp:7f1d28874fcf : 1498 + 0xc]
22:47:05 INFO - eip = 0x02070f60 esp = 0x19edfe4c ebp = 0x19edfe50
22:47:05 INFO - Found by: call frame info

Looks like a null pointer deref here:
http://hg.mozilla.org/mozilla-central/annotate/fefcd48d8313/content/media/AudioNodeExternalInputStream.cpp#l171

Is this only happening on opt builds? Are we seeing a different symptom on debug builds? If there's heap corruption or something like that that could easily trip up dump writing.
> Looks like a null pointer deref here:
> http://hg.mozilla.org/mozilla-central/annotate/fefcd48d8313/content/media/AudioNodeExternalInputStream.cpp#l171
>
> Is this only happening on opt builds? Are we seeing a different symptom on
> debug builds? If there's heap corruption or something like that that could
> easily trip up dump writing.

Debug builds hit a size assertion on the array's operator[]. It looks like the orange itself is understood (bug 997152 comment 3, bug 996231 comment 26).
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> If there's heap corruption or something like that that could
> easily trip up dump writing.

Some heap corruption was likely from bug 998711 before hitting this assertion, so if MiniDumpWriteDump is used in the same process, then it could have been affected.
It is. Rearchitecting that is pretty difficult, unfortunately.
Thanks for looking at this. Seems there is a plausible explanation and not much can be done, so I'll close. Please feel free to reopen if useful.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE