Closed Bug 1000537 Opened 10 years ago Closed 10 years ago

minidump_stackwalk "ERROR: ReadBytes: read 0/32 | ERROR: Minidump cannot read header"

Categories

(Toolkit :: Crash Reporting, defect)

Platform: x86, Windows 7
Type: defect
Priority: Not set
Severity: normal

RESOLVED INCOMPLETE

People

(Reporter: karlt, Unassigned)

Details

(Keywords: intermittent-failure, sheriffing-P1)

+++ This bug was initially created as a clone of Bug #820362 +++

Windows XP 32-bit mozilla-inbound opt test crashtest on 2014-04-23 00:57:47 PDT for push 80dba24cc929

slave: t-xp32-ix-119

https://tbpl.mozilla.org/php/getParsedLog.php?id=38308247&tree=Mozilla-Inbound#error0
Blocks: 996231
This is just "we wrote an empty minidump".
This seems to be happening consistently in what is likely an off-main-thread crash in bug 996231.

Is there a theory why the dump might be empty?
The assertion failure doesn't directly imply out of memory.

There is at least one failure that is a little different, perhaps a truncated dump.

https://tbpl.mozilla.org/php/getParsedLog.php?id=38210678&tree=Mozilla-Inbound#error0
minidump.cc:3860: INFO: Minidump not byte-swapping minidump
minidump.cc:4226: INFO: GetStream: type 7 not present
minidump.cc:4226: INFO: GetStream: type 7 not present
minidump.cc:4226: INFO: GetStream: type 1197932545 not present
minidump.cc:4226: INFO: GetStream: type 6 not present
minidump.cc:4226: INFO: GetStream: type 1197932546 not present
minidump.cc:4226: INFO: GetStream: type 4 not present
minidump.cc:4226: INFO: GetStream: type 3 not present
minidump_processor.cc:112: ERROR: Minidump c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpciq26c.mozrunner\minidumps\0aa98f82-da7e-4cc5-bbf0-6e094ec701a3.dmp has no thread list
minidump.cc:3787: INFO: Minidump closing minidump
minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed
Right, that reads as "we managed to write a minidump header with a directory list, but not any of the streams".
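
For anyone triaging these by hand, the distinction between "empty dump" and "header and directory but no streams" can be checked without running minidump_stackwalk. The following is a minimal sketch, assuming the standard 32-byte MINIDUMP_HEADER and 12-byte MINIDUMP_DIRECTORY layouts; the function name and the category strings are hypothetical and are not part of Breakpad.

```python
import struct

# "MDMP" read as a little-endian 32-bit integer.
MDMP_SIGNATURE = 0x504D444D
# MINIDUMP_HEADER: Signature, Version, NumberOfStreams, StreamDirectoryRva,
# CheckSum, TimeDateStamp (32-bit each), then 64-bit Flags: 32 bytes total.
HEADER = struct.Struct("<IIIIIIQ")
# MINIDUMP_DIRECTORY: StreamType, then location (DataSize, Rva).
DIR_ENTRY = struct.Struct("<III")


def triage_minidump(data: bytes) -> str:
    """Classify a dump roughly (hypothetical helper, not a Breakpad API)."""
    if len(data) == 0:
        return "empty"                 # matches "ReadBytes: read 0/32"
    if len(data) < HEADER.size:
        return "truncated-header"
    signature, _version, n_streams, dir_rva, *_rest = HEADER.unpack_from(data, 0)
    if signature != MDMP_SIGNATURE:
        return "bad-signature"
    if dir_rva + n_streams * DIR_ENTRY.size > len(data):
        return "truncated-directory"
    present = 0
    for i in range(n_streams):
        offset = dir_rva + i * DIR_ENTRY.size
        stream_type, size, rva = DIR_ENTRY.unpack_from(data, offset)
        # Type 0 is the "unused" slot; also require the data to fit in the file.
        if stream_type != 0 and size > 0 and rva + size <= len(data):
            present += 1
    return "ok" if present else "no-streams"
```

Calling `triage_minidump(open(path, "rb").read())` on a .dmp file returns one of the strings above. The "no-streams" case is only my guess at what the second log corresponds to; it has not been checked against the actual dump.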

Unfortunately this is Windows, and on Windows we use Microsoft's MinidumpWriteDump as a black box; we don't have any knowledge of its inner workings. We did try some mitigation for OOM/VM fragmentation scenarios in bug 837835/bug 943051, which had a noticeable impact on empty dumps in crash-stats.

Short of having a reproducible testcase, I'm not sure there's anything actionable we can do here. If you can reproduce this on try, you could bump up the memory reservation we use a bit and see if that helps:
http://hg.mozilla.org/mozilla-central/annotate/1d0496e30feb/toolkit/crashreporter/nsExceptionHandler.cpp#l373
Out of a couple dozen crashtest runs, I got two empty dumps and one valid one: https://tbpl.mozilla.org/?tree=Try&rev=7f1d28874fcf

The run with the intact dump didn't indicate any memory problems:
SystemMemoryUsePercentage=29
TotalVirtualMemory=2147352576
AvailableVirtualMemory=1523236864
AvailablePageFile=4348211200
AvailablePhysicalMemory=2252025856
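
To put those numbers in perspective, here is a small sketch that parses the key=value annotations above and computes how much of the process's virtual address space was in use. The function names are hypothetical; only the annotation keys come from the crash report.

```python
def parse_annotations(text: str) -> dict:
    """Parse KEY=VALUE lines, keeping only the integer-valued ones."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def virtual_address_space_used(values: dict) -> float:
    """Fraction of the virtual address space that is not available."""
    total = values["TotalVirtualMemory"]
    available = values["AvailableVirtualMemory"]
    return (total - available) / total


annotations = """\
SystemMemoryUsePercentage=29
TotalVirtualMemory=2147352576
AvailableVirtualMemory=1523236864
AvailablePageFile=4348211200
AvailablePhysicalMemory=2252025856
"""

values = parse_annotations(annotations)
used = virtual_address_space_used(values)
free_mib = values["AvailableVirtualMemory"] / 2**20
print(f"address space used: {used:.1%}, free: {free_mib:.0f} MiB")
```

With the values from this run, that works out to about 29% of the address space in use and roughly 1450 MiB free, which is consistent with no obvious memory pressure at the time the intact dump was written.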

I've got another try build going with higher breakpad reservation (+ commit), but I don't know if that will change anything.
> I've got another try build going with higher breakpad reservation (+
> commit), but I don't know if that will change anything.
Yeah, that didn't help at all.
For reference, the crash with the valid dump from that try run was:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38610536&tree=Try

22:47:05     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
22:47:05     INFO -  Crash address: 0x0
22:47:05     INFO -  Thread 82 (crashed)
22:47:05     INFO -   0  xul.dll!mozilla::AudioNodeExternalInputStream::TrackMapEntry::ResampleChannels(nsTArray<void const *> const &,unsigned int,mozilla::AudioSampleFormat,float) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 171 + 0x6]
22:47:05     INFO -      eip = 0x0205faf6   esp = 0x19ede714   ebp = 0x19ede758   ebx = 0x19edf7c0
22:47:05     INFO -      esi = 0x02b13690   edi = 0x00000001   eax = 0x00000000   ecx = 0x00000000
22:47:05     INFO -      edx = 0x00000002   efl = 0x00010246
22:47:05     INFO -      Found by: given as instruction pointer in context
22:47:05     INFO -   1  xul.dll!mozilla::AudioNodeExternalInputStream::TrackMapEntry::ResampleInputData(mozilla::AudioSegment *) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 198 + 0x13]
22:47:05     INFO -      eip = 0x0205fc4c   esp = 0x19ede760   ebp = 0x19edf7f4
22:47:05     INFO -      Found by: call frame info
22:47:05     INFO -   2  xul.dll!mozilla::AudioNodeExternalInputStream::ProcessInput(__int64,__int64,unsigned int) [AudioNodeExternalInputStream.cpp:7f1d28874fcf : 412 + 0x10]
22:47:05     INFO -      eip = 0x02060547   esp = 0x19edf7fc   ebp = 0x19edfd70
22:47:05     INFO -      Found by: call frame info
22:47:05     INFO -   3  xul.dll!mozilla::MediaStreamGraphImpl::ProduceDataForStreamsBlockByBlock(unsigned int,int,__int64,__int64) [MediaStreamGraph.cpp:7f1d28874fcf : 1178 + 0x2a]
22:47:05     INFO -      eip = 0x020650df   esp = 0x19edfd78   ebp = 0x19edfda8
22:47:05     INFO -      Found by: previous frame's frame pointer
22:47:05     INFO -   4  xul.dll!mozilla::MediaStreamGraphImpl::RunThread() [MediaStreamGraph.cpp:7f1d28874fcf : 1342 + 0x1b]
22:47:05     INFO -      eip = 0x02070c99   esp = 0x19edfdb0   ebp = 0x19edfe44
22:47:05     INFO -      Found by: call frame info
22:47:05     INFO -   5  xul.dll!mozilla::`anonymous namespace'::MediaStreamGraphInitThreadRunnable::Run() [MediaStreamGraph.cpp:7f1d28874fcf : 1498 + 0xc]
22:47:05     INFO -      eip = 0x02070f60   esp = 0x19edfe4c   ebp = 0x19edfe50
22:47:05     INFO -      Found by: call frame info

Looks like a null pointer deref here:
http://hg.mozilla.org/mozilla-central/annotate/fefcd48d8313/content/media/AudioNodeExternalInputStream.cpp#l171

Is this only happening on opt builds? Are we seeing a different symptom on debug builds? If there's heap corruption or something like that, it could easily trip up dump writing.
> Looks like a null pointer deref here:
> http://hg.mozilla.org/mozilla-central/annotate/fefcd48d8313/content/media/
> AudioNodeExternalInputStream.cpp#l171
> 
> Is this only happening on opt builds? Are we seeing a different symptom on
> debug builds? If there's heap corruption or something like that that could
> easily trip up dump writing.

Debug builds hit a size assertion on the array's operator[]. It looks like the orange itself is understood (bug 997152 comment 3, bug 996231 comment 26).
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #6)
> If there's heap corruption or something like that that could
> easily trip up dump writing.

Some heap corruption from bug 998711 had likely already occurred before this assertion was hit, so if MinidumpWriteDump runs in the same process, it could have been affected.
It is. Rearchitecting that is pretty difficult, unfortunately.
Thanks for looking at this. There seems to be a plausible explanation and not much can be done, so I'll close. Please feel free to reopen if useful.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → INCOMPLETE