Closed
Bug 1277626
Opened 8 years ago
Closed 2 years ago
Crashes in CreateDecodeDeviceLH
Categories
(Core :: Audio/Video: Playback, defect, P3)
Tracking
()
RESOLVED
WORKSFORME
mozilla50
People
(Reporter: milan, Assigned: mattwoodrow)
References
Details
(Keywords: crash, stale-bug)
Crash Data
Attachments
(4 files)
31.39 KB,
image/png
|
Details | |
6.89 KB,
patch
|
mattwoodrow
:
review-
|
Details | Diff | Splinter Review |
8.26 KB,
patch
|
Details | Diff | Splinter Review | |
4.75 KB,
patch
|
jya
:
review+
dvander
:
review+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Because these usually crash in the graphics driver, all at different addresses for different versions, we don't aggregate them enough, and we fix things one driver version at a time (e.g., bug 1266220 where we ended up blocking driver 10.18.10.3947.) Some work was already done in bug 1274132, so this may be a duplicate, but I wanted to record things. Here's the crash query when creating DXVA decoding device. With Intel graphics: https://crash-stats.mozilla.com/search/?product=Firefox&signature=~igdumd32.dll&proto_signature=~CreateDecodeDeviceLH&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature With AMD graphics: https://crash-stats.mozilla.com/search/?product=Firefox&proto_signature=~CreateDecodeDeviceLH&signature=~atiumdva&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature With Nvidia graphics: https://crash-stats.mozilla.com/search/?product=Firefox&proto_signature=~CreateDecodeDeviceLH&signature=~nvd3dum.dll&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Reporter | ||
Comment 1•8 years ago
|
||
These crashes represent ~43% of all Intel driver crashes. The top two signatures ~17% of all Intel driver crashes are in this group. They also account for 42% of all ATI crashes in the said DLL, and 9% of all the Nvidia crashes. I'd say we take a hard look at these ones. Let me know how graphics can help.
Flags: needinfo?(bugs)
Flags: needinfo?(ajones)
Reporter | ||
Comment 2•8 years ago
|
||
Matt, can we wrap try/catch around the mDecoderService->CreateVideoDecoder call in here https://dxr.mozilla.org/mozilla-central/source/dom/media/platforms/wmf/DXVA2Manager.cpp#223 and would it catch the crash and help us recover?
Flags: needinfo?(matt.woodrow)
Reporter | ||
Comment 3•8 years ago
|
||
I can randomly see some of these are D3D9 DXVA, but I can't tell if it's all of them.
Reporter | ||
Comment 4•8 years ago
|
||
From http://ashughes1.github.io/metrics-graphics-gfx/, put CreateDecodeDeviceLH in the input box, press Graph if nothing shows up. Is the spike because of the release of 44?
Assignee | ||
Comment 5•8 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #2) > Matt, can we wrap try/catch around the mDecoderService->CreateVideoDecoder > call in here > https://dxr.mozilla.org/mozilla-central/source/dom/media/platforms/wmf/ > DXVA2Manager.cpp#223 and would it catch the crash and help us recover? This callsite is outside of outside of our startup crash guards (only the ::Init function is covered). I think we should definitely fix that, so that these crashes only happen once, rather than repeatedly. I'm a little scared about try/catch, since we don't know if any memory has been corrupted, so we might just crash later on in places that make it harder to track. Ultimately, I think the solution is to get accelerated video decoding running out of process (we'll be discussing this more in London), and to have crash guards so that we don't repeatedly crash. How about we start with the crash guard, and then consider adding try/catch or blacklisting if we decide we want to prevent the one-off crash too.
Flags: needinfo?(matt.woodrow)
Reporter | ||
Comment 6•8 years ago
|
||
Crash guard would be awesome, with the annotations, maybe even with telemetry?
Comment 7•8 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #0) > With Intel graphics: > > https://crash-stats.mozilla.com/search/?product=Firefox&signature=~igdumd32. > dll&proto_signature=~CreateDecodeDeviceLH&_facets=signature&_columns=date&_co > lumns=signature&_columns=product&_columns=version&_columns=build_id&_columns= > platform#facet-signature > > With AMD graphics: > > https://crash-stats.mozilla.com/search/ > ?product=Firefox&proto_signature=~CreateDecodeDeviceLH&signature=~atiumdva&_f > acets=signature&_columns=date&_columns=signature&_columns=product&_columns=ve > rsion&_columns=build_id&_columns=platform#facet-signature > > With Nvidia graphics: > > https://crash-stats.mozilla.com/search/ > ?product=Firefox&proto_signature=~CreateDecodeDeviceLH&signature=~nvd3dum. > dll&_facets=signature&_columns=date&_columns=signature&_columns=product&_colu > mns=version&_columns=build_id&_columns=platform#facet-signature Why do these queries not return crashes in 48 and 49? Did we blacklist some of these drivers in 48/49? We blacklisted a bunch of ATI drivers for D3D9 in 49 in bug 1265281, and igdumd64.dll in bug 1271525. But those are only in 49 thus far; they've not been uplift yet.
Assignee | ||
Comment 8•8 years ago
|
||
David, can you add a crash guard for this call site too please?
Flags: needinfo?(dvander)
Reporter | ||
Comment 9•8 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #7) > ... > Why do these queries not return crashes in 48 and 49? Did we blacklist some > of these drivers in 48/49? We blacklisted a bunch of ATI drivers for D3D9 in > 49 in bug 1265281, and igdumd64.dll in bug 1271525. But those are only in 49 > thus far; they've not been uplift yet. Good question. I don't know enough about video changes that went into 48, but it's also possible something else (e.g. in graphics) changed that moved or masked these crashes. I would suggest we add telemetry for how many actually get through these crash guards. We probably don't want to end up in a situation where we block everything.
OS: Unspecified → Windows
Reporter | ||
Comment 10•8 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #8) > David, can you add a crash guard for this call site too please? This may be a good first project for Ryan?
Chris Pearce is aware of this so I'll remove the needinfo.
Component: Audio/Video → Audio/Video: Playback
Flags: needinfo?(ajones)
Priority: -- → P1
Assignee: nobody → rhunt
Status: NEW → ASSIGNED
Flags: needinfo?(dvander)
Comment 12•8 years ago
|
||
Attachment #8762189 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 13•8 years ago
|
||
Comment on attachment 8762189 [details] [diff] [review] Added crash guards in SupportsConfig, telemetry in D3D*VideoCrashGuard. Review of attachment 8762189 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/src/DriverCrashGuard.cpp @@ +506,5 @@ > > +void > +D3D9VideoCrashGuard::Initialize() > +{ > + if (!XRE_IsParentProcess()) { We basically only run videos in the child process, so this is going to stop this being effective for most cases.
Attachment #8762189 -
Flags: review?(matt.woodrow) → review-
Comment 14•8 years ago
|
||
Attachment #8763269 -
Flags: review?(matt.woodrow)
Comment on attachment 8763269 [details] [diff] [review] bug1277626.patch Talked about this offline, but: this still won't work because NS_IsMainThread() will be false. The easiest (or maybe only sane way) to fix this would be to have a new PDriverCrashGuard top-level protocol that PContentParent opens on the main thread and PContentChild bridges on the video thread. The other solution would be doing some hack to block the video thread and post a message to the main event loop, and have that event wake up the video thread when we've received a crash guard response from the parent process. With either of these solutions we'd have to make sure we're not going to run into deadlocks (i.e. cpow to Parent -> Child -> Video and guard Video -> Parent). Milan, is this crash severe enough that we want a fix in the next release or two, or is it something that can wait for out-of-process video?
Flags: needinfo?(milan)
Attachment #8763269 -
Flags: review?(matt.woodrow)
Reporter | ||
Comment 16•8 years ago
|
||
More than 4000 crashes in a week, and this is with the blocklisting we are doing on a number of configurations, so I would like to pursue this more, and see what we can come up with that doesn't need OOP video.
Flags: needinfo?(milan)
I think the best approach here would be to defer CreateDecoder calls until the driver crash guard has been confirmed. The way it would work is like this: 1. Decoder thread dispatches to the main thread: "initiate a driver crash guard with the parent process". 2. Decoder thread then goes idle. 3. Main thread notifies the decoder thread whether or not it is okay to proceed. 4. If okay, decoder thread calls CreateDecoder. 5. If the call does not crash, decoder dispatches to the main thread "tell the parent process that the decoder did not crash". Media folks, how hard would it be to factor out this initialization step so it can be interrupted and resumed?
Assignee: rhunt → dvander
Flags: needinfo?(cpearce)
Flags: needinfo?(jyavenard)
Comment 18•8 years ago
|
||
(In reply to David Anderson [:dvander] from comment #17) > Media folks, how hard would it be to factor out this initialization step so > it can be interrupted and resumed? I think jya is the best person to answer (and implement) this. I expect it would involve changing MediaFormatReader::EnsureDecoderCreated() to return a MozPromise, and then changing all PlatformDecoderModules' CreateXXDecoder() functions to also return MozPromises. jya may be able to come up with a better solution here.
Flags: needinfo?(cpearce)
Comment 19•8 years ago
|
||
we should simply move the WMF creation to be performed during the initialisation stage rather than in the constructor like it does now. All other platforms (linux, mac, android) are already doing it that way. Unfortunately not on WMF. One issue remain however is how to determine if the platform supports the codec or not. Right now this is done by creating a decoder and simply checking if the creation succeeded. It can't be promise based.
Assignee: dvander → jyavenard
Flags: needinfo?(jyavenard)
Reporter | ||
Comment 20•8 years ago
|
||
[Tracking Requested - why for this release]: 3700+ crashes in the past week.
tracking-firefox48:
--- → ?
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ @0x0 | igdumd32.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ atiumdva.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ]
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ @0x0 | igdumd32.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ atiumdva.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ] → [@ @0x0 | igdumd32.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ atiumdva.dll | atiu9pag.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumdim32.dll | CreateDecodeDeviceLH ]
[@ nvd3dum.dll | CreateDecodeDev…
Updated•8 years ago
|
status-firefox48:
--- → affected
status-firefox49:
--- → affected
Comment 23•8 years ago
|
||
(In reply to Gerry Chang [:gchang] from comment #22) > Hi Blake, > Can you help on this? Anthony has talked with jya about this via IRC, jya will have a go at it on Monday.
Flags: needinfo?(bwu)
Updated•8 years ago
|
Flags: needinfo?(jyavenard)
Comment 24•8 years ago
|
||
Was those crashes present in 47 and earlier? There's been no much changes in the way decoders are created and initialised since 41. because if it's something spiking in 48 it may be something else... (or maybe just dxva11 crashing quicker)
Flags: needinfo?(jyavenard)
Updated•8 years ago
|
Flags: needinfo?(jyavenard)
Updated•8 years ago
|
Assignee: jyavenard → matt.woodrow
Flags: needinfo?(jyavenard)
Assignee | ||
Comment 25•8 years ago
|
||
Flags: needinfo?(bugs)
Attachment #8771836 -
Flags: review?(jyavenard)
Assignee | ||
Updated•8 years ago
|
Attachment #8771836 -
Flags: review?(dvander)
Comment 26•8 years ago
|
||
Comment on attachment 8771836 [details] [diff] [review] main-thread-crash-guard-video Review of attachment 8771836 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp @@ +508,5 @@ > + , mSupportsConfig(false) > + {} > + > + NS_IMETHOD Run() { > + NS_ASSERTION(NS_IsMainThread(), "Must be on main thread."); Strong assert would be better no? @@ +553,5 @@ > + new SupportsConfigEvent(mDXVA2Manager, aType, framerate); > + > + if (NS_IsMainThread()) { > + event->Run(); > + } } else {
Attachment #8771836 -
Flags: review?(jyavenard) → review+
Comment on attachment 8771836 [details] [diff] [review] main-thread-crash-guard-video Review of attachment 8771836 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp @@ +555,5 @@ > + if (NS_IsMainThread()) { > + event->Run(); > + } > + else { > + NS_DispatchToMainThread(event, NS_DISPATCH_SYNC); And this won't deadlock, right?
Attachment #8771836 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 28•8 years ago
|
||
(In reply to David Anderson [:dvander] from comment #27) > Comment on attachment 8771836 [details] [diff] [review] > main-thread-crash-guard-video > > Review of attachment 8771836 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp > @@ +555,5 @@ > > + if (NS_IsMainThread()) { > > + event->Run(); > > + } > > + else { > > + NS_DispatchToMainThread(event, NS_DISPATCH_SYNC); > > And this won't deadlock, right? I *believe* so, but it's hard to be entirely sure. We do the same thing in Init (from the same thread) without issues, so that's a good sign.
Comment 29•8 years ago
|
||
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/6047635058ed Create test DXVA decoders on the main thread so that we can protect them with a crash guard. r=jya,dvander
Comment 30•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6047635058ed
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox50:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment 31•8 years ago
|
||
Crash volume for signature '@0x0 | igdumd32.dll | atiu9pag.dll | CreateDecodeDeviceLH': - Ni. (version 51): 0 crashes from 2016-08-01. - Aur. (version 50): 0 crashes from 2016-08-01. - Beta (version 49): 8 crashes from 2016-08-02. - Rel. (version 48): 2 crashes from 2016-07-25. - Esr (version 45): 0 crashes from 2016-04-20. Crash volume on the last weeks (Week N is from 08-08 to 08-14): W. N-1 W. N-2 - Beta (49) 4 - Rel. (48) 0 0 Affected platform: Windows Crash volume for signature 'igdumd32.dll | RtlConsoleMultiByteToUnicodeN | igdumd32.dll | CreateDecodeDeviceLH': - Ni. (version 51): 0 crashes from 2016-08-01. - Aur. (version 50): 0 crashes from 2016-08-01. - Beta (version 49): 46 crashes from 2016-08-02. - Rel. (version 48): 30 crashes from 2016-07-25. - Esr (version 45): 3 crashes from 2016-04-20. Crash volume on the last weeks (Week N is from 08-08 to 08-14): W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - Beta (49) 18 - Rel. (48) 15 0 - Esr (45) 0 0 0 2 0 0 0 Affected platform: Windows Crash volume for signature 'nvd3dum.dll | CreateDecodeDeviceLH': - Ni. (version 51): 0 crashes from 2016-08-01. - Aur. (version 50): 0 crashes from 2016-08-01. - Beta (version 49): 40 crashes from 2016-08-02. - Rel. (version 48): 51 crashes from 2016-07-25. - Esr (version 45): 98 crashes from 2016-04-20. Crash volume on the last weeks (Week N is from 08-08 to 08-14): W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - Beta (49) 11 - Rel. (48) 12 0 - Esr (45) 12 9 5 10 17 17 4 Affected platform: Windows Crash volume for signature 'atiumdva.dll | atiu9pag.dll | CreateDecodeDeviceLH': - Ni. (version 51): 0 crashes from 2016-08-01. - Aur. (version 50): 0 crashes from 2016-08-01. - Beta (version 49): 104 crashes from 2016-08-02. - Rel. (version 48): 173 crashes from 2016-07-25. - Esr (version 45): 61 crashes from 2016-04-20. Crash volume on the last weeks (Week N is from 08-08 to 08-14): W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - Beta (49) 47 - Rel. (48) 74 0 - Esr (45) 5 9 11 12 6 9 2 Affected platform: Windows Crash volume for signature '@0x0 | igdumdim32.dll | CreateDecodeDeviceLH': - Ni. (version 51): 0 crashes from 2016-08-01. - Aur. (version 50): 2 crashes from 2016-08-01. - Beta (version 49): 181 crashes from 2016-08-02. - Rel. (version 48): 55 crashes from 2016-07-25. - Esr (version 45): 436 crashes from 2016-04-20. Crash volume on the last weeks (Week N is from 08-08 to 08-14): W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - Aur. (50) 1 - Beta (49) 57 - Rel. (48) 26 0 - Esr (45) 80 74 55 69 51 41 17 Affected platform: Windows
Comment 32•8 years ago
|
||
According to comment #31, the volume is not that big. However, Liz would be probably happy to take a patch
Assignee | ||
Comment 33•8 years ago
|
||
Comment on attachment 8771836 [details] [diff] [review] main-thread-crash-guard-video Approval Request Comment [Feature/regressing bug #]: Unknown, long standing bug [User impact if declined]: Buggy graphics drivers can repeatedly crash firefox/content process. This stops us from attempting the buggy functionality more than once. [Describe test coverage new/current, TreeHerder]: Minimal. [Risks and why]: Very low risk, expands the existing crash guard mechanism to another callsite. [String/UUID change made/needed]: None.
Flags: needinfo?(matt.woodrow)
Attachment #8771836 -
Flags: approval-mozilla-beta?
Comment 34•8 years ago
|
||
Comment on attachment 8771836 [details] [diff] [review] main-thread-crash-guard-video Stopping repeated crashes sounds good, let's take this for beta 6 (or 7 depending when we can build today)
Attachment #8771836 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 35•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/8bee137bfeac
Reporter | ||
Comment 37•8 years ago
|
||
Doesn't look like this is fixed, based on the crash reports?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•8 years ago
|
Comment 38•7 years ago
|
||
signatures like this are still appearing in current nightly builds. on 51.0b [@ @0x0 | igdumd32.dll | CreateDecodeDeviceLH] is accounting for 0.3% of browser crashes...
Crash Signature: CreateDecodeDeviceLH ]
[@ igdumd32.dll | RtlConsoleMultiByteToUnicodeN | igdumd32.dll | CreateDecodeDeviceLH ] → CreateDecodeDeviceLH ]
[@ igdumd32.dll | RtlConsoleMultiByteToUnicodeN | igdumd32.dll | CreateDecodeDeviceLH ]
[@ @0x0 | igdumd32.dll | CreateDecodeDeviceLH ]
[@ igdumd32.dll | CreateDecodeDeviceLH ]
status-firefox51:
--- → affected
status-firefox52:
--- → affected
status-firefox53:
--- → affected
Comment 39•7 years ago
|
||
Mass wontfix for bugs affecting firefox 52.
Comment hidden (obsolete) |
Comment 41•7 years ago
|
||
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
Updated•7 years ago
|
Severity: normal → critical
Updated•7 years ago
|
Priority: P2 → P3
Comment 42•2 years ago
|
||
Closing because no crashes reported for 12 weeks.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 2 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•