This bug was filed from the Socorro interface and is report bp-264f1d35-b95a-436e-9711-65d542141011. ============================================================= Null-deref crash in D3D9SurfaceImage::EnsureSynchronized() (typically from GetTextureClient() from UpdateImage/UpdateImageInternal()). Thread 0 was in gdi, called from mozilla::gfx::DataSourceSurfaceD2DTarget::EnsureMapped(). About 200 crash reports per week Changes ready to land in bug 879717 require a new test, but our test shows about 50% failure rate even without the changes we're landing, so this would block us landing an important change. See https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=00462507ee04 (note about 50% of the oranges end in a crash - it's permaorange since we don't have the bug 879717 changes in the Try).
Bas, isn't this your field of expertise? Do you have the time to take a look?
I can reproduce the crash by going to http://html5demos.com/video-canvas 1 press play then on 4,6,8 seconds pause and unpause, then change speed of video to ludicous and wait till video has played then press play again it should crash
This is from the video code. Chris Pearce is the right person here.
Chris, will you have time to look at this anytime soon?
Chris is pretty occupied at the moment. Milan, could take a look at this and prioritize/assign as appropriate?
Looks like a nullptr deref, but only a few lines after we check for nullptr. I guess we're calling EnsureSynchronized from both the main-thread and image bridge thread, and that's causing us to get mQuery cleared while we're using it. We can probably just store mQuery in a local RefPtr variable to stop this happening, assuming d3d9 queries are threadsafe. Nical might want to look at this, since it appears to be async-video related.
Crash, but not topcrash, what's the urgency for this? It doesn't show up in the current "urgent stability issues to consider" that QA maintains for the graphics team.
My only concern is that it's a blocker for bug 879717 - a test there reproduces it at a ~50% rate. Since you are busy in other areas though, I can follow up on Matt's comment and see what results that yields us.
Created attachment 8529536 [details] [diff] [review] Avoid using null query in D3D9SurfaceImage::EnsureSynchronized Per comment 6. (Thanks for the analysis Matt!) Seems to work. try before: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=0e55ba93595a try after: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=10a081c5129a The log parsing is behind at the moment but checking the build dirs there were 4 of 5 crashes on the runs before the patch. All other test runs are taking significantly longer so at least they have not crashed. nical, are you the man for review here?
Not sure what's up with treeherder today, but tbpl have parsed the results: before patch: https://tbpl.mozilla.org/?tree=Try&rev=0e55ba93595a after patch: https://tbpl.mozilla.org/?tree=Try&rev=10a081c5129a They're all orange but the crash is fixed. The other oranges are due to the other patches (needed in order to trigger the crash), so we should be good here.
Created attachment 8529989 [details] [diff] [review] Avoid using null query in D3D9SurfaceImage::EnsureSynchronized (carries r=nical) Commit msg fix.
I am sorry if I did the wrong thing to reply to this bug. Why is this bug going to be fixed in 36 and not 35? The reason why I ask is I get this bug mainly 1 to 4 times a day and it is driving me want to go back in a few version before this bug.
nical/Milan: Is this a reasonable candidate for uplift to beta? (I would have nominated it for Aurora after it landed if I'd though about it.) Simple patch. well-baked
(In reply to Randell Jesup [:jesup] from comment #15) > nical/Milan: Is this a reasonable candidate for uplift to beta? (I would > have nominated it for Aurora after it landed if I'd though about it.) > > Simple patch. well-baked Looks reasonable to me.
Comment on attachment 8529989 [details] [diff] [review] Avoid using null query in D3D9SurfaceImage::EnsureSynchronized (carries r=nical) Approval Request Comment [Feature/regressing bug #]: unknown [User impact if declined]: Crashes on D3D9 devices randomly until 36 goes to release. [Describe test coverage new/current, TBPL]: Not testable in automation. Baking on nightly/aurora for considerable time now [Risks and why]: Low risk - holds a reference while using a resource instead of a raw pointer [String/UUID change made/needed]: none
Socorro  shows zero crashes over the past two weeks for 35 Beta 6 & 8, 36 Aurora and 37 Nightly.  - https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=14&signature=mozilla%3A%3Alayers%3A%3AD3D9SurfaceImage%3A%3AEnsureSynchronized%28%29#tab-sigsummary