Closed Bug 1145106 Opened 5 years ago Closed 4 years ago

crash in atidxx32.dll@0x196f7

Categories

(Core :: Graphics, defect, critical)

x86
Windows NT


RESOLVED WORKSFORME

People

(Reporter: bogdan_maris, Unassigned)

Details

(Keywords: crash)

Crash Data

Attachments

(3 files)

This bug was filed from the Socorro interface and is report bp-003957b4-d7d5-4adc-91d1-4b2302150319.
=============================================================

STR:
1. Start Aurora
2. Change the loop.server pref to https://loop-dev.stage.mozaws.net/v0
3. Click the Hello icon
4. Click 'Start a conversation'
5. Copy the conversation link and paste it in the same tab
6. Select 'Share my tabs' from the 'Share your screen' button
7. Click 'Got it' in the infobar

Notes:
1. Only reproduced on Windows platform
2. It may not crash from the first try.

Another crash from a different machine:
bp-286443d4-97be-40e4-9f67-81de82150319
Flags: needinfo?(mreavy)
These crashes seem to be deep in graphics drivers, with no obvious connection to Hello/WebRTC (though perhaps this was indirectly triggered by the tab-sharing's "render page to canvas" request). I think this needs to go over to graphics/layers (though I'm keeping it in WebRTC:audio/video for now). The other possibility is that there's some ownership issue with the data being manipulated. Possibly related to Bug 1145102.

Milan -- I'd like someone with the right knowledge (I believe it'd be someone on your team) to look at this bug and Bug 1145102 together to help analyze how concerning these are and what we can do to fix them.  They are impacting the stability of tab sharing, which is very important to Hello and WebRTC.  (Tab sharing for Hello is scheduled to ship to release in Fx38.)  Thanks.
Rank: 10
Component: WebRTC → WebRTC: Audio/Video
Flags: needinfo?(mreavy) → needinfo?(milan)
Priority: -- → P1
Start with Bas, just because of the Windows expertise.
Flags: needinfo?(milan) → needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #2)
> Start with Bas, just because of the Windows expertise.

There does not appear to be any obvious connection between the two crashes mentioned in comment 0, judging from the crashes themselves. (If I understand correctly, both were triggered with the same STR?) Other than that, both are invalid memory reads at a seemingly not-unreasonable address on the heap.

The first crash, however, does have an interesting stack as well as an interesting crash address. It appears to me that the crash address may very well be related to the image data being uploaded. I'll attempt to catch this in a debugger, but in case someone beats me to it, it'd be interesting to know whether the address where the read access violation occurs has any relation to the address of the image data that we're trying to upload.

That would explain this crashing in a variety of locations as depending on the driver version the upload would occur in different locations inside of the DLL. If my suspicion is correct this would not be a driver bug but possibly more of an issue in our image bridge code. Nical, what do you think?
Flags: needinfo?(bas) → needinfo?(nical.bugzilla)
In the first report, an ImageContainer contains a TextureClient that has been used on the main thread and has a D2D texture which, I assume, is using the main thread's D2D device. In this stack we are on the ImageBridge thread, which uses a different D2D device for its texture uploads (as of recently).

So my guess is that this crash comes from trying to copy from a texture into another with the textures not belonging to the same D2D device.

I remember reviewing the patch that introduced the separate D2D device for the ImageBridge thread but I don't have it handy. We could either try to back the patch out, or read the texture back on the CPU and reupload it. The latter would be sorta horrible but safe. The risk with the former is that we landed the patch to avoid a deadlock inside of D2D if you do certain things with a D2D device from several threads. IIRC we landed that patch and another one which was attempting to work around the deadlock in a different way, so maybe we can just back it out. Matt, what do you think (as you wrote both the patches for the deadlock issue)?
Flags: needinfo?(nical.bugzilla) → needinfo?(matt.woodrow)
My changes only affected uploads for D3D9SurfaceImage, so this crash will still be using the 'content' d3d device for uploading.

It looks like the main thread is within D2D as well, so we're probably crashing because both threads are using the same device at the same time.

We could try using SetMultithreadProtected on the d3d device (see the gfxWindowsPlatform changes in bug 875247).

Alternatively we could create and upload the TextureClient when we construct the Image (on the main-thread) rather than waiting for the ImageBridge thread to do it.
Flags: needinfo?(matt.woodrow)
Here are some more crashes with a slightly different signature. 

Attached is a gif showing how Firefox behaves before crashing (ugly sight...). It's quite easy for me to crash now.

bp-f9f1fb7c-2428-4022-bfef-efb892150423
bp-e280c757-710d-4817-9c09-d01d92150423
Is there a try build we can produce with some of these suggestions, perhaps more instrumentation (I'd like to know if any TDRs are happening, for instance), and pass it to Bogdan to test with?
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(bas)
Sure, here's a patch for the first suggestion.

Try builds: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5b9f787a0bb1

The second suggestion I had doesn't actually make much sense, because we allocate the CairoImage off the main thread as well.

The other solution would be to have a separate d3d device for the image bridge thread (we could probably use the existing 'media' one), and have TextureClientD3D11::AllocateSurface choose the appropriate one given the current thread.

That might be preferable, since setting MultiThreadProtected on the content device might hurt performance.
Flags: needinfo?(matt.woodrow)
Working try builds:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b1b6324cbf4a

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e11d0a5bebbe

Bogdan, could you please try those builds and see if the crashes are fixed?
Flags: needinfo?(bogdan.maris)
Flags: needinfo?(nical.bugzilla)
1) Build https://treeherder.mozilla.org/#/jobs?repo=try&revision=e11d0a5bebbe:

 - Windows 7 64-bit (AMD Radeon HD6450) - machine where I originally crashed with this signature
bp-b7eae424-d5ef-40c0-a2fc-01a3d2150424

We also tested on a second machine:
 - Windows 7 64-bit (NVidia Geforce 210) - machine where it originally crashed with signature from bug 1154271
bp-41721af3-829b-45c5-9536-8817f2150424

2) Build https://treeherder.mozilla.org/#/jobs?repo=try&revision=b1b6324cbf4a:

Looks good with both machines from above, was unable to crash the build. 

Could this build contain the fix for bug 1154271 as well, and bug 1154771? (I don't have the setup to verify that crash.)
Flags: needinfo?(bogdan.maris) → needinfo?(matt.woodrow)
Bogdan, could you clarify comment 11 a bit? Did you mean that build 1) still crashes for you and build 2) doesn't?
Flags: needinfo?(bogdan.maris)
Yes

Build 1) - I can still crash Firefox
Build 2) - Works fine, unable to crash Firefox
Flags: needinfo?(bogdan.maris)
Bogdan, do you have any idea why the stacks changed between comment #1 and comment #6?

Bug 1154271 and bug 1154771 both have very similar stacks to comment #1, where the crash is coming from a thread with CreateLayeredChild on the stack.

The stacks in comment #6 and comment #11 seem different and I'm not sure why.

Bas: The newer crashes aren't obviously on the ImageBridge thread (unlike the earlier ones), and avoiding using the content d3d device isn't fixing the crash. I guess that means that yet another thread has access to our content device, but I can't find anywhere that would happen.
Flags: needinfo?(matt.woodrow) → needinfo?(bogdan.maris)
It's possible that a gecko code change caused the different stacks, if so, getting a regression range would be really helpful.
(In reply to Matt Woodrow (:mattwoodrow) from comment #14)
> Bogdan, do you have any idea why the stacks changed between comment #1 and
> comment #6?
> 
> Bug 1154271 and bug 1154771 both have very similar stacks to comment #1,
> where the crash is coming from a thread with CreateLayeredChild on the stack.
> 
> The stacks in comment #6 and comment #11 seem different and I'm not sure why.

Using the initial build where I got the signature from comment 1 ([@ atidxx32.dll@0x196f7]), now I get the signature from comment 6 ([@ atidxx32.dll@0x49299e ]).
bp-20eba869-da76-4309-8ec4-2a2a42150428

Since mozregression was of no help here, I did a manual regression and tested builds from the first time I got this crash until nightly from 2015.04.23.

Here are the results:

2015.03.19 -  [@ atidxx32.dll@0x49299e ]
bp-38d4d3af-5c0b-4d3b-9c21-ae3b52150428

2015.03.20 - [@ atidxx32.dll@0x49299e ]
bp-af376f30-9bde-493f-9182-9cb712150428

2015.03.28 - [@ atidxx32.dll@0x49299e ]
bp-97dacb01-c60d-4698-a020-277402150428

2015.04.06 - [@ atidxx64.dll@0x57d7bb ]
bp-9223dfa9-7853-43fa-9eed-6305f2150428

2015.04.23 - [@ atidxx32.dll@0x49299e ]

Latest Nightly:
2015.04.28 - [@ atidxx32.dll@0x49299e ]
bp-20825854-6c12-4af9-9fc5-c08bd2150428

Note: 
- I did not actually test every build from 2015.03.18 to 2015.04.28; I bisected just as mozregression would.

I don't know what has changed, but I can only get the signature from comment 6 on almost all builds, see above. I hope this helps...
Flags: needinfo?(bogdan.maris) → needinfo?(matt.woodrow)
I encountered the first signature again using latest Nightly and 38.0.5beta1 build 2:
- [@ atidxx32.dll@0x196f7 ]
bp-17baf221-d1f7-4122-a5c9-76d582150512
bp-88a80823-de88-4d1f-bf20-8332d2150512
bp-40a112bb-d067-46fb-90a9-fae702150512

Sometimes I get other signatures:
- [@ afhook32.dll@0x54de ] bp-3c07bf93-f9b4-45de-8e07-16db32150512
- [@ atidxx32.dll@0x509db0 ] bp-aaef48b2-87fc-4d57-8a01-c29c72150512

Steps:
1. Make a tab sharing call.
2. Send the link and open it on another PC.
3. Resize Firefox on the first PC.
It looks like our D2DFactory is created with multithreading, so I'm not sure what else could be causing this.

I guess we should just go ahead with marking the D3D device as multithreaded unless Bas has a better idea.
Flags: needinfo?(matt.woodrow)
backlog: --- → webRTC+
-> GFX, because it is.  Waiting on bas to reply to matt I guess.
backlog: webRTC+ → ---
Rank: 10
Component: WebRTC: Audio/Video → Graphics
Priority: P1 → --
Comment on attachment 8596962 [details] [diff] [review]
multi-thread-protect

Review of attachment 8596962 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +2015,5 @@
>  
>      mD3D11ContentDevice->SetExceptionMode(0);
>  
> +    nsRefPtr<ID3D10Multithread> multi;
> +    mD3D1ContentDevice->QueryInterface(__uuidof(ID3D10Multithread), getter_AddRefs(multi));

I'm assuming this is a typo, and it should be mD3D11ContentDevice?
Attachment #8596962 - Flags: feedback?(bas)
Yes, it's spelt correctly in the try push :)
Comment on attachment 8596969 [details] [diff] [review]
Choose d3d11 device based on thread

This patch didn't fix the crash. I'm not sure why.
Attachment #8596969 - Flags: review?(bas)
Comment on attachment 8596962 [details] [diff] [review]
multi-thread-protect

Review of attachment 8596962 [details] [diff] [review]:
-----------------------------------------------------------------

If this helps, let's do it, although we really need to fix the fact that we have this bug in the first place; there's no way this will totally fix things.
Attachment #8596962 - Flags: feedback?(bas) → feedback+
Is the patch still needed? It looks like there have been zero reports of this crash in at least 28 days.
Flags: needinfo?(bas)
I suspect this was actually fixed by that bug.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME