Closed
Bug 1391979
Opened 8 years ago
Closed 7 years ago
Crash in RtlpFindNextActivationContextSection | RtlpFindFirstActivationContextSection | RtlFindActivationContextSectionGuid | FindActCtxSectionGuidWorker
Categories
(Core :: Audio/Video: cubeb, defect, P2)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: snorp, Assigned: padenot)
Details
(Keywords: crash)
Crash Data
This bug was filed from the Socorro interface and is
report bp-106ba238-9865-43a3-92f8-2cc0e1170817.
=============================================================
My brother is hitting this crash consistently on 55.
Updated•8 years ago
Rank: 15
Priority: -- → P1
| Assignee | ||
Comment 1•8 years ago
Jean-Yves, this is crashing in WMF.
Component: Audio/Video: cubeb → Audio/Video: Playback
Flags: needinfo?(jyavenard)
Comment 2•8 years ago
Am I looking at the same bug?
This is all in cubeb. Windows, yes, but still all in cubeb's WASAPI code.
Flags: needinfo?(jyavenard)
Comment 3•8 years ago
Yeah. From the crash report @comment 1, it crashes in cubeb.
Flags: needinfo?(padenot)
| Assignee | ||
Comment 4•8 years ago
The signature appears not to be specific enough: https://crash-stats.mozilla.com/report/index/707ff2e4-4d13-41af-95b0-9b94c0170823 is about WMF, for example.
That said, I'll take this one. NI jya so he knows about the other one.
Flags: needinfo?(padenot) → needinfo?(jyavenard)
| Assignee | ||
Updated•8 years ago
Assignee: nobody → padenot
Comment 5•8 years ago
Likely to be exactly the same issue as the cubeb one: a failure to instantiate some service.
Hopefully once :padenot figures out why, we will be able to apply the same fix in the decoder.
Flags: needinfo?(jyavenard)
Updated•8 years ago
Component: Audio/Video: Playback → Audio/Video: cubeb
| Assignee | ||
Comment 6•8 years ago
Dan, can you have a look at those stacks? It's crashing somewhere inside system libraries, with weird pointers. Have you ever seen something like this?
We know of two call sites where it eventually blows up: one related to system codecs, and one related to audio device enumeration.
Flags: needinfo?(dmajor)
| Assignee | ||
Comment 7•8 years ago
Not Dan, David, I confused myself.
Comment 8•8 years ago
When CoCreateInstance is on the stack, my first guess is that COM wasn't initialized on that thread, or was initialized with the wrong threading model.
cpearce observed in bug 1389980 that this bad-COM situation might be happening more often nowadays. I have a suspicion that his fix might address these crashes, but since the patch is only on nightly 57, and we don't see this crash on nightly, I can't give any data to confirm it.
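For reference, the "wrong threading model" case can be sketched with a toy model of the documented CoInitializeEx return values. This is a simplified simulation in Python, not real COM: `ThreadComState` and `co_initialize_ex` are hypothetical names, and only the HRESULT and COINIT constants are the real Windows values.

```python
# Toy model of per-thread COM initialization state. Not real COM: it only
# mirrors the documented CoInitializeEx HRESULTs so the cases can be compared.

S_OK = 0x00000000
S_FALSE = 0x00000001
RPC_E_CHANGED_MODE = 0x80010106

COINIT_MULTITHREADED = 0x0
COINIT_APARTMENTTHREADED = 0x2


class ThreadComState:
    """Hypothetical stand-in for the COM state of a single thread."""

    def __init__(self):
        self.mode = None      # apartment model chosen by the first init
        self.init_count = 0   # successful inits that still need an uninit

    def co_initialize_ex(self, mode):
        if self.mode is None:
            # First initialization on this thread fixes the apartment model.
            self.mode = mode
            self.init_count = 1
            return S_OK
        if mode != self.mode:
            # The apartment model of a thread cannot be changed afterwards.
            return RPC_E_CHANGED_MODE
        # Same model again: succeeds, but reports "already initialized".
        self.init_count += 1
        return S_FALSE


def failed(hr):
    """Equivalent of the FAILED() macro: the severity bit is bit 31."""
    return (hr & 0x80000000) != 0
```

In this model, a thread that some other code already put into an STA will hand RPC_E_CHANGED_MODE to a later caller asking for MTA, which is the situation worth ruling out here.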
Flags: needinfo?(dmajor)
| Assignee | ||
Comment 9•8 years ago
On this particular call path, we do initialize COM like so: https://github.com/kinetiknz/cubeb/blob/master/src/cubeb_wasapi.cpp#L139, in MTA, so it must be something different.
| Assignee | ||
Updated•8 years ago
Flags: needinfo?(dmajor)
Comment 10•8 years ago
Working backwards...
The crashing instruction is:
ntdll!RtlpFindNextActivationContextSection+0xf1:
4b2dd4c7 8b7604 mov esi,dword ptr [esi+4]
And esi in all the reports is 0xaa000080 (they are all 32-bit builds), leading to the crash address of 0xaa000084 (in crash-stats it is displayed with a sign-extend as 0xffffffffaa000084).
The only way to reach that instruction at +0xf1 is to have gone through these lines:
[...]
4b2dd3de 64a118000000 mov eax,dword ptr fs:[00000018h] ; Standard address of NT TEB [1]
4b2dd3f1 8945fc mov dword ptr [ebp-4],eax
[...]
4b2dd47b 8b45fc mov eax,dword ptr [ebp-4] ; eax = TEB
4b2dd47e 8b80a8010000 mov eax,dword ptr [eax+1A8h] ; eax = TEB->ActivationContextData
4b2dd488 8b30 mov esi,dword ptr [eax] ; esi = TEB->ActivationContextData->SomeField
[...]
4b2dd4c7 8b7604 mov esi,dword ptr [esi+4] ; SomeField->SomeOtherField;
This 0xaa000080 value is supposed to be a pointer inside an internal Windows data structure (my debugger says it's a member of struct _ACTIVATION_CONTEXT_DATA but I can't find any further documentation of that structure). The pointer is garbage though. Interestingly enough, 0xaa000084 comes up in a bunch of other signatures [2] where Windows is trying to access activation context information.
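The address arithmetic can be double-checked with a few lines of Python. This is only a sketch of the calculation described above; the helper names `crash_address` and `sign_extend` are made up for illustration.

```python
def crash_address(base, offset, bits=32):
    """Effective address of [base + offset], wrapped to the register width."""
    return (base + offset) & ((1 << bits) - 1)


def sign_extend(value, from_bits=32, to_bits=64):
    """Sign-extend an unsigned `from_bits` value to `to_bits` bits."""
    sign_bit = 1 << (from_bits - 1)
    signed = (value ^ sign_bit) - sign_bit   # reinterpret as a signed integer
    return signed & ((1 << to_bits) - 1)     # back to unsigned, at the new width


addr = crash_address(0xAA000080, 4)   # mov esi, dword ptr [esi+4]
print(hex(addr))                      # 0xaa000084
print(hex(sign_extend(addr)))         # 0xffffffffaa000084
```

The second value is the form to use when searching crash-stats by address, as in [2].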
So this is corruption inside a Windows data structure, but I don't want to dismiss this as "this is an OS bug" just yet. It's possible that we (or some other code) called Windows in some incorrect way to make it mess up its bookkeeping.
I'm still not fully convinced that we're initializing COM correctly. Are you sure that the auto_com didn't get an RPC_E_CHANGED_MODE error?
[1] https://en.wikipedia.org/wiki/Win32_Thread_Information_Block
[2] https://crash-stats.mozilla.com/search/?address=%3D0xffffffffaa000084&product=Firefox&date=%3E%3D2017-03-07T04%3A10%3A48.000Z&date=%3C2017-09-07T04%3A10%3A48.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports
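On the RPC_E_CHANGED_MODE question: one way to make it answerable is for the initialization guard to keep the HRESULT it received and to uninitialize only after a successful init. A minimal Python sketch of that idea follows. It assumes injected `init` and `uninit` callables in place of CoInitializeEx and CoUninitialize; `ComGuard` and `run_with` are hypothetical, and this is not cubeb's auto_com.

```python
# Sketch of an RAII-style guard that keeps the HRESULT from initialization.
# `init` and `uninit` are injected callables standing in for CoInitializeEx
# and CoUninitialize; nothing here calls into Windows.

S_OK = 0x00000000
S_FALSE = 0x00000001
RPC_E_CHANGED_MODE = 0x80010106


def succeeded(hr):
    """Equivalent of SUCCEEDED(): true when the severity bit (31) is clear."""
    return (hr & 0x80000000) == 0


class ComGuard:
    """Hypothetical guard: initialize on enter, uninitialize on exit."""

    def __init__(self, init, uninit):
        self._init = init
        self._uninit = uninit
        self.hr = None

    def __enter__(self):
        self.hr = self._init()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Only a successful init (S_OK or S_FALSE) is balanced by an uninit.
        if succeeded(self.hr):
            self._uninit()
        return False  # never swallow exceptions from the guarded region

    @property
    def changed_mode(self):
        return self.hr == RPC_E_CHANGED_MODE


def run_with(hr):
    """Run an empty guarded region with a fake init that returns `hr`."""
    calls = []
    with ComGuard(lambda: hr, lambda: calls.append("uninit")) as guard:
        pass
    return guard.changed_mode, len(calls)
```

With a fake init returning RPC_E_CHANGED_MODE, `run_with` reports the changed mode and makes no uninit call, so the condition could be logged or asserted on instead of passing silently.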
Flags: needinfo?(dmajor)
| Assignee | ||
Comment 11•8 years ago
(In reply to David Major [:dmajor] from comment #10)
> I'm still not fully convinced that we're initializing COM correctly. Are you
> sure that the auto_com didn't get an RPC_E_CHANGED_MODE error?
I'd believe it if it were the case. We can't change the apartment type of a thread, can we?
I also don't quite understand why it works the immense majority of the time, and why it crashes often on snorp's brother's computer.
Flags: needinfo?(dmajor)
Comment 12•8 years ago
(In reply to Paul Adenot (:padenot) from comment #11)
> (In reply to David Major [:dmajor] from comment #10)
> > I'm still not fully convinced that we're initializing COM correctly. Are you
> > sure that the auto_com didn't get an RPC_E_CHANGED_MODE error?
>
> I'd believe it if it were the case. We can't change the apartment type of a
> thread, can we?
Correct.
> I also don't quite understand why it works the immense majority of the time,
> and why it crashes often on snorp's brother's computer.
This would be an excellent use-case for the upcoming Microsoft time-travel debugger. :-)
Or maybe he could run with whatever setting causes those auto_com LOGs to be activated.
It would also be interesting to know if this happens on 64-bit builds. We don't see any 64-bit crash reports for these signatures -- maybe the structure offsets are sufficiently different that this "corruption" goes somewhere else and is benign, or maybe they still crash just with a different signature.
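To illustrate the offsets point only: the same field list laid out with 4-byte and 8-byte pointers puts its members at different byte offsets, so a stray write at a fixed offset hits different fields. The record below is entirely made up and is NOT the real activation context layout; `field_offsets` and `FIELDS` are hypothetical.

```python
def field_offsets(fields, pointer_size):
    """Lay out fields in order with natural alignment; "ptr" means a pointer."""
    offsets, cursor = {}, 0
    for name, kind in fields:
        size = pointer_size if kind == "ptr" else kind
        cursor = (cursor + size - 1) // size * size   # align up to field size
        offsets[name] = cursor
        cursor += size
    return offsets


# Hypothetical record, NOT the real Windows activation context layout.
FIELDS = [("flags", 4), ("first", "ptr"), ("count", 4), ("next", "ptr")]

print(field_offsets(FIELDS, 4))   # 32-bit pointers
print(field_offsets(FIELDS, 8))   # 64-bit pointers
```

With these made-up fields, byte offset 12 is `next` in the 32-bit layout but falls inside `first` in the 64-bit one.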
Flags: needinfo?(dmajor)
Comment 13•8 years ago
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
Comment 14•7 years ago
Closing because no crashes reported for 12 weeks.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX