Closed Bug 1274913 Opened 4 years ago Closed 3 years ago

Crash in nsTHashtable<T>::s_HashKey | PLDHashTable::Search | mozilla::LogModuleManager::CreateOrGetModule

Categories

(Core :: XPCOM, defect, critical)

x86
Windows 10
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox47 --- wontfix
firefox48 --- fixed
firefox49 --- fixed
firefox-esr45 --- wontfix
firefox50 --- fixed

People

(Reporter: njn, Assigned: erahm)

Details

(Keywords: crash)

Crash Data

Attachments

(3 files)

This bug was filed from the Socorro interface and is 
report bp-b702ed7d-b7f9-436d-bff2-60d1e2160517.
=============================================================

70 crashes with this signature in the past week, mostly in 48. The crashing address is 0x0 in almost all of them.

I looked at the minidump for https://crash-stats.mozilla.com/report/index/b702ed7d-b7f9-436d-bff2-60d1e2160517 but it left me none the wiser. It claims that |this| was null in PLDHashTable::Search(), but I don't see how that's possible given that LogModuleManager::mModules isn't a pointer.

It also said that |aKey| was null, which suggests that the log name is null. That seems a little more plausible. I'm a bit suspicious of |operator LogModule*()|.

erahm, any ideas?
Flags: needinfo?(erahm)
I found some reports like https://crash-stats.mozilla.com/report/index/f7768474-0aec-4143-981d-06c7c2160518 which have a different signature. The crash stack trace is this:

> 0 	xul.dll 	ResourceHashEntry::HashKey(void const*) 	rdf/base/nsRDFService.cpp:126
> 1 	xul.dll 	PLDHashTable::Search(void const*) 	xpcom/glue/PLDHashTable.cpp:522
> 2 	xul.dll 	mozilla::LogModuleManager::CreateOrGetModule(char const*) 	xpcom/base/Logging.cpp:173
> 3 	xul.dll 	mozilla::LazyLogModule::operator mozilla::LogModule*() 	obj-firefox/dist/include/mozilla/Logging.h:141
> 4 	xul.dll 	mozilla::WMFVideoMFTManager::InitInternal(bool) 	dom/media/platforms/wmf/WMFVideoMFTManager.cpp:291
> 5 	xul.dll 	mozilla::WMFVideoMFTManager::Init() 	dom/media/platforms/wmf/WMFVideoMFTManager.cpp:219

I'm not sure I trust that -- I don't see how insertion into the log module table can trigger a hash function in nsRDFService.cpp??
Crash Signature: [@ nsTHashtable<T>::s_HashKey | PLDHashTable::Search | mozilla::LogModuleManager::CreateOrGetModule] → [@ nsTHashtable<T>::s_HashKey | PLDHashTable::Search | mozilla::LogModuleManager::CreateOrGetModule] [@ ResourceHashEntry::HashKey]
A ton of identical hash functions are likely merged together by PGO.
So those are all using this weird pattern of referencing an extern'd static function that returns a pointer to a LogModule converted from a static function local LazyLogModule [1].

It seems like it's possible things aren't properly initialized yet on certain compilers? We could just declare the log module in each compilation unit, maybe just removing the 'extern' declaration would be enough or we could remove the intermediary function.

[1] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/dom/media/platforms/PlatformDecoderModule.cpp#9
Flags: needinfo?(erahm)
This is pretty straightforward, but let me know if you want someone else to review it.
Attachment #8756072 - Flags: review?(n.nethercote)
Assignee: nobody → erahm
Status: NEW → ASSIGNED
Also if it makes it easier to review, most of the changes were done via sed:

> sed -i -e '/extern mozilla::LogModule\* GetPDMLog();/d' \
>        -e 's/GetPDMLog()/sPDMLog/g'
Comment on attachment 8756072 [details] [diff] [review]
Move PDM log definition to header

Review of attachment 8756072 [details] [diff] [review]:
-----------------------------------------------------------------

Much nicer. r=me if it passes try.

::: dom/media/platforms/PlatformDecoderModule.h
@@ +30,5 @@
>  class MediaDataDecoderCallback;
>  class FlushableTaskQueue;
>  class CDMProxy;
>  
> +static LazyLogModule sPDMLog("PlatformDecoderModule");

Does this compile? I would have thought you'd have to put it in a .cpp file?
Attachment #8756072 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #7)
> Comment on attachment 8756072 [details] [diff] [review]
> Move PDM log definition to header
> 
> Review of attachment 8756072 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Much nicer. r=me if it passes try.
> 
> ::: dom/media/platforms/PlatformDecoderModule.h
> @@ +30,5 @@
> >  class MediaDataDecoderCallback;
> >  class FlushableTaskQueue;
> >  class CDMProxy;
> >  
> > +static LazyLogModule sPDMLog("PlatformDecoderModule");
> 
> Does this compile? I would have thought you'd have to put it in a .cpp file?

Yes it builds fine. It just means there will be a separate instance in each file that includes the header (which is okay to do with log modules).
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/a789163346d9
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
would this fix be upliftable to 48? - in early data for 48.0b1 this crash is at #15, making up 0.9% of all crashes.
Flags: needinfo?(erahm)
Attached patch Patch for 48Splinter Review
Updated patch for 48
Comment on attachment 8761695 [details] [diff] [review]
Patch for 48

Approval Request Comment
[Feature/regressing bug #]:
Unknown.
[User impact if declined]:
Crashes keep crashing.
[Describe test coverage new/current, TreeHerder]:
Been in 49 for a while.
[Risks and why]:
None. 
[String/UUID change made/needed]:
N/A
Flags: needinfo?(erahm)
Attachment #8761695 - Flags: approval-mozilla-beta?
We are not taking stability in esr after the two first releases.
Comment on attachment 8761695 [details] [diff] [review]
Patch for 48

I am not sure I agree with you about the risk evaluation ("None"), every change has risk.

Anyway, taking it because we should land that sooner in the cycle than later.

Should be in 48 beta 2.
Attachment #8761695 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
hi, though their volume has dropped after the patch has landed, crashes with this signature are continuing to happen in 48 beta builds - could you take a look again?
Flags: needinfo?(erahm)
It looks like the remaining crashes a referencing |GetDirectShowLog| [1] which again uses an odd pattern of referencing an externed function that has a local static |LazyLogModule| the gets converted to a |LogModule| in the return statement.

I'll go through a few more crash reports to verify that's the case.

[1] https://dxr.mozilla.org/mozilla-central/rev/63cc31d6cc1c8089590461016ce0b4a2fb77ecbc/dom/media/directshow/DirectShowReader.cpp#20-24
Flags: needinfo?(erahm)
This should help fix unitialized statics crashes on Windows in the DirectShow logging code.

Additional changes are included to fix non-unified build issues.
Attachment #8769310 - Flags: review?(giles)
Not quite fixed, lets reopen.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8769310 [details] [diff] [review]
Use LazyLogModule directly for DirectShow logging

Review of attachment 8769310 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for cleaning this up!

::: dom/media/directshow/AudioSinkFilter.cpp
@@ +23,5 @@
>  
>  namespace mozilla {
>  
> +static LazyLogModule sDirectShowLog("DirectShowDecoder");
> +#define LOG(...) MOZ_LOG(sDirectShowLog, mozilla::LogLevel::Debug, (__VA_ARGS__))

I think 'g' is winning over 's' as a LazyLogModule prefix in dom/media. I slightly prefer it for compilation-unit scope but don't feel strongly.

::: dom/media/directshow/moz.build
@@ +21,5 @@
>  SOURCES += [
>      'AudioSinkFilter.cpp',
> +    'AudioSinkInputPin.cpp',
> +    'DirectShowReader.cpp',
> +    'SampleSink.cpp',

Did you mean to leave this change in? Fixing the non-unified build issues it good, but not the same as disabling the unified build.
Attachment #8769310 - Flags: review?(giles) → review+
Comment on attachment 8769310 [details] [diff] [review]
Use LazyLogModule directly for DirectShow logging

Review of attachment 8769310 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/directshow/AudioSinkFilter.cpp
@@ +23,5 @@
>  
>  namespace mozilla {
>  
> +static LazyLogModule sDirectShowLog("DirectShowDecoder");
> +#define LOG(...) MOZ_LOG(sDirectShowLog, mozilla::LogLevel::Debug, (__VA_ARGS__))

Sure, I'll go with 'g'.

::: dom/media/directshow/moz.build
@@ +21,5 @@
>  SOURCES += [
>      'AudioSinkFilter.cpp',
> +    'AudioSinkInputPin.cpp',
> +    'DirectShowReader.cpp',
> +    'SampleSink.cpp',

Sorry that wasn't clear, I had to make them non-unified due to symbol clashes for |sDirectShowLog|.
https://hg.mozilla.org/integration/mozilla-inbound/rev/0fadabf932c25817f2b8f218532171b82ac96280
Bug 1274913 - Use LazyLogModule directly for DirectShow logging. r=rillian
Comment on attachment 8769310 [details] [diff] [review]
Use LazyLogModule directly for DirectShow logging

Nominating early, I'd like to see this land first on m-c of course. We might need rebased patches for aurora/beta, I'll deal with that after approval.

Approval Request Comment
[Feature/regressing bug #]: Unknown
[User impact if declined]: Crash remains for windows users.
[Describe test coverage new/current, TreeHerder]: Covered by existing media tests.
[Risks and why]: Low, previous patch in this bug covering the same issue hasn't seen negative side effects.
[String/UUID change made/needed]: N/A
Attachment #8769310 - Flags: approval-mozilla-beta?
Attachment #8769310 - Flags: approval-mozilla-aurora?
Hi Eric,
We are going to approve it when it's in m-c.
Next time please open a new bug to follow up because this will make sheriff and release management life easier.
Flags: needinfo?(erahm)
https://hg.mozilla.org/mozilla-central/rev/0fadabf932c2
Status: REOPENED → RESOLVED
Closed: 4 years ago3 years ago
Resolution: --- → FIXED
Comment on attachment 8769310 [details] [diff] [review]
Use LazyLogModule directly for DirectShow logging

Review of attachment 8769310 [details] [diff] [review]:
-----------------------------------------------------------------

This patch fixes a crash. Take it in 48 beta 10 and aurora.
Attachment #8769310 - Flags: approval-mozilla-beta?
Attachment #8769310 - Flags: approval-mozilla-beta+
Attachment #8769310 - Flags: approval-mozilla-aurora?
Attachment #8769310 - Flags: approval-mozilla-aurora+
Clearing ni, we landed this.
Flags: needinfo?(erahm)
You need to log in before you can comment on or make changes to this bug.