Closed
Bug 950235
Opened 11 years ago
Closed 11 years ago
Crashes at mozilla::layers::TextureParent::RecvInit() or mozilla::layers::TextureHost::Create(mozilla::layers::SurfaceDescriptor const&, mozilla::layers::ISurfaceAllocator*, unsigned int)
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
VERIFIED
FIXED
mozilla30
Tracking | Status | |
---|---|---|
firefox28 | --- | unaffected |
firefox29 | + | verified |
firefox30 | --- | verified |
People
(Reporter: smichaud, Assigned: nical)
References
Details
(Keywords: reproducible, topcrash-mac)
Crash Data
Attachments
(1 file)
2.65 KB,
patch
|
mattwoodrow
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
These started with today's mozilla-central nightly, but they're already the #9 Mac topcrasher on the 29 branch:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Alayers%3A%3ATextureParent%3A%3ARecvInit%28mozilla%3A%3Alayers%3A%3ASurfaceDescriptor+const%26%2C+unsigned+int+const%26%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-12-13+22%3A00%3A00&range_value=4#reports
In my next comment I'll post the implied regression range.
Reporter | ||
Updated•11 years ago
|
Crash Signature: [@ mozilla::layers::TextureParent::RecvInit(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
Reporter | ||
Comment 1•11 years ago
|
||
Oops, sorry for the mis-classification.
Component: JavaScript Engine → Graphics: Layers
Reporter | ||
Comment 2•11 years ago
|
||
Here's the implied regression range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1ad9af3a2ab8&tochange=8b5875dc7e31
I'd guess the trigger is one of the patches for bug 897452.
Blocks: PTexture
Comment 3•11 years ago
|
||
I get this crash reproducibly (on 10.9) when I visit:
http://www.ifixit.com/Teardown/Mac+Pro+Late+2013+Teardown/20778
Updated•11 years ago
|
tracking-firefox29:
--- → ?
Keywords: reproducible
Comment 4•11 years ago
|
||
Nical, these crashes all seem to be this assertion:
http://hg.mozilla.org/mozilla-central/annotate/8b5875dc7e31/gfx/layers/composite/TextureHost.cpp#l173
MOZ_CRASH("Couldn't create texture host");
They're hitting it on Mac; I can't reproduce on Linux. Do you have an idea of what needs to be done?
Flags: needinfo?(nical.bugzilla)
Comment 5•11 years ago
|
||
I'm not seeing this on 29.0a1 (2013-12-31) on 10.9.1.
Updated•11 years ago
|
Keywords: steps-wanted
Comment 6•11 years ago
|
||
I can reproduce this on my 2010-era MacBook Pro, but not my 2013 one. Both are on 10.9.1. One difference between the two is that I've disabled hardware acceleration on my 2010 one (I unchecked "Use hardware acceleration when available" because I've been getting GPU panics.)
Comment 7•11 years ago
|
||
That sort of makes sense, then. It looks like the switch statement that has this assertion needs to be updated to account for the kind of TextureHost involved when disabling GL layers on Mac.
Comment 8•11 years ago
|
||
When I re-enable hardware acceleration on my 2010 MBP and then restart the browser, I don't get the crash any more on the URL in comment 3. Likewise, when I disable hardware acceleration on my 2013 MBP, I get the crash on that URL
Updated•11 years ago
|
Keywords: steps-wanted
Assignee | ||
Comment 9•11 years ago
|
||
Is basic layers + OMTC a supported config (or somewhat semi-supported) my understanding is that at the moment basic-OMTC is not shippable quality yet or wasn't not too long ago. We had the same problem on windows very recently and the fix was to make sure we don't get OMTC+basic (by disabling OMTC when layers acceleration is off iirc) on windows.
Flags: needinfo?(nical.bugzilla)
Comment 10•11 years ago
|
||
(In reply to Nicolas Silva [:nical] from comment #9)
> Is basic layers + OMTC a supported config (or somewhat semi-supported) my
> understanding is that at the moment basic-OMTC is not shippable quality yet
> or wasn't not too long ago. We had the same problem on windows very recently
> and the fix was to make sure we don't get OMTC+basic (by disabling OMTC when
> layers acceleration is off iirc) on windows.
You are right, OMTC/basic is not shipping yet and should not get turned on anywhere unless it is overridden with a pref or we are using e10s.
Comment 11•11 years ago
|
||
Another page that causes this crash is: http://www.nytimes.com/redesign/
Comment 12•11 years ago
|
||
100% reproducible, OSX running latest nightly, from here:
http://www.visagetechnologies.com/HTML5/Samples/VirtualEyewearTryOn/eyewear.html
Comment 13•11 years ago
|
||
Mine's running macosx 10.8.5, so Mavericks isn't necessary. I have hardware acceleration turned off, due to a bug in rendering BMC Remedy pages (which is currently a problem with or without hardware acceleration, but BMC's support suggested turning it off which helped on a previous version). http://cloud.feedly.com/#latest reliably crashes (presumably with my login cookies needed).
Updated•11 years ago
|
Summary: Crashes at mozilla::layers::TextureParent::RecvInit() on OS X 10.9 → Crashes at mozilla::layers::TextureParent::RecvInit() on OS X with hardware acceleration disabled
Comment 14•11 years ago
|
||
Hey, I would really like this bug fixed too, but here is likely why it hasn't been taken care of yet: the two 'core' people who worked on the PTexture patches which caused this bug are :nical and me, and neither of us have a Mac dev environment ready. This bug could likely get fixed very fast if someone with a Mac worked on it with input from :nical. Milan, I believe that it would be worth finding someone to take care of this.
Flags: needinfo?(milan)
Comment 15•11 years ago
|
||
Sold! to Jeff, and we'll all blame Benoit if it turns out to be a complicated mess.
Assignee: nobody → jmuizelaar
Flags: needinfo?(milan)
Updated•11 years ago
|
Updated•11 years ago
|
status-firefox28:
--- → unaffected
Comment 16•11 years ago
|
||
Also reproducible just by opening flickr.com - BOOM!
Updated•11 years ago
|
Crash Signature: [@ mozilla::layers::TextureParent::RecvInit(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ] → [@ mozilla::layers::TextureParent::RecvInit(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
[@ mozilla::layers::TextureParent::Init(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
Assignee | ||
Comment 17•11 years ago
|
||
Jeff, this is caused by our assumption that when we get to create a TextureHost, we have already created a Compositor. Creating the compositor is what sets the sBackend global on the compositor side. Somehow we run into the case where ImageBridge is setup and starts sending frames before we created a compositor.
My suggestion is that in TextureParent::RecvInit we just select the backend depending on the SurfaceDescriptor type. This used to not be possible when each backend had a shmem texture, but now this is better decoupled.
Except in the case of SurfaceDescriptorD3D10 for which both D3D9 and D3D11 will have an implementation after Matt lands one of his patches. To handle that case we can probably just check gfxPlatform::GetPrefLayersPreferD3D9.
Assignee | ||
Updated•11 years ago
|
Assignee: jmuizelaar → nical.bugzilla
Assignee | ||
Comment 18•11 years ago
|
||
This should solve the problem in most cases, although MacIOSurface is getting in the way because it has both a basic TextureHost and a GL TextureHost.
I'll look into that. Either we have only one TextureHost for the two cases and move the backend specific bits into the TextureSource as we did for shmem, or we create have a separate descriptor type for the two cases (provided we know on the client side which backend we are using on the compositor side)
At the moment I prefer the first solution.
Attachment #8368641 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 19•11 years ago
|
||
Comment 20•11 years ago
|
||
Comment on attachment 8368641 [details] [diff] [review]
Pick the backend in function of the descriptor type
Review of attachment 8368641 [details] [diff] [review]:
-----------------------------------------------------------------
I don't particularly like this. As you said, there are already multiple surface descriptor types that have more than one possible TextureHost. I suspect this number is going to grow too, for things like SurfaceStream.
I really think we need to catch this on the client side, setting up IPDL objects when they'll never be used is a waste.
Attachment #8368641 -
Flags: review?(matt.woodrow) → review-
Assignee | ||
Comment 21•11 years ago
|
||
I share your concern, although with ImageBridge being independent of the main thread, it's hard to know if a Compositor has already been created. I'll try to think of a way. There should not be descriptor types with several TextureHosts though. The separation between TextureHost and TextureSource lets us decouple the TextureHost and the backend when the descriptor type is usable on several backends.
Assignee | ||
Comment 22•11 years ago
|
||
Matt, considering that this is a top crasher and already made it into aurora, could we revisit the review decision?
I don't think this can be efficiently caught on the client side, because ImageBridge is by design independent of the main thread so we can't efficiently get it to wait for the first layer transaction to have been complete before sending messages.
The IPDL object we are sending before we have a compositor here is most likely to end up on the screen because it arrives just before the creation of the Compositor, so it will be in the next composition. If we are creating a video element and the layer tree at the same time there isn't much we can do to prevent ImageBridge to be faster than main thread transaction, and actually we kind want it to be, otherwise we may composite the first screen before we get a video frame which is not correct.
Since we do have the information about the backend in the surface descriptor type 90% of the time there is no harm in using it, even if we may need the Compositor::GetBackend() to disambiguate some edge cases that should not be there (I still think we should have a 1-1 relation between TextureHost and SurfaceDescriptor).
Flags: needinfo?(matt.woodrow)
Comment 23•11 years ago
|
||
Ok, in the interest of fixing a topcrasher on aurora I think we can probably take this.
So we're making the assertion that TextureHosts cannot rely on anything backend specific. While this patch fixes this for now (or at least the case that we hit for now), it still seems rather non-obvious and fragile.
What I think we should do then, is make TextureHosts entirely independent of backend code. That would involve moving them all out of opengl/, d3d11/ etc (and dropping the backend specific suffixes). Then we get a very clear division of responsibilities. TextureHost only manages the liftetime of a shared object, and has nothing backend dependent. When we first render one, we lazily allocate the correct TextureSource for the Compositor type we have.
I guess we could have a problem where we require the compositor in order to do resource deallocation (needing a device/gl context etc). I think we already have this problem though, if we start async video but don't have OMTC enabled (or close the window before the compositor is initialised). This change just make the issue much more visible (which seems like a good thing).
Flags: needinfo?(matt.woodrow)
Comment 24•11 years ago
|
||
Comment on attachment 8368641 [details] [diff] [review]
Pick the backend in function of the descriptor type
Review of attachment 8368641 [details] [diff] [review]:
-----------------------------------------------------------------
I don't particularly like this. As you said, there are already multiple surface descriptor types that have more than one possible TextureHost. I suspect this number is going to grow too, for things like SurfaceStream.
I really think we need to catch this on the client side, setting up IPDL objects when they'll never be used is a waste.
Attachment #8368641 -
Flags: review- → review+
Assignee | ||
Comment 25•11 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #23)
> Ok, in the interest of fixing a topcrasher on aurora I think we can probably
> take this.
Thanks!
>
> So we're making the assertion that TextureHosts cannot rely on anything
> backend specific. While this patch fixes this for now (or at least the case
> that we hit for now), it still seems rather non-obvious and fragile.
Most of the surface descriptor types imply a backend, and for shmem and memory which are backend-agnostic, you don't need a compositor right away (you will need one to composite eventually). So the information that is encoded in the descriptor type is something we can use.
>
> What I think we should do then, is make TextureHosts entirely independent of
> backend code.
Well, i'd say make them all dependent on the shared memory type: ShmemTextureClient and MemoryTextureClient are backend-agostic which is great since all or most backends need them at least as a fallback.
> That would involve moving them all out of opengl/, d3d11/ etc
> (and dropping the backend specific suffixes).
I like having the backend suffix because "SharedTextureHost" could be seen as a D3D or GL thing by the name whereas with "SharedTextureHostOGL" you get what you expect.
> Then we get a very clear
> division of responsibilities. TextureHost only manages the liftetime of a
> shared object, and has nothing backend dependent.
Yeah, the backend specific stuff for drawing really needs to go into TextureSource (it's tru for most existing TextureClients). There can be a bit of backend specificities in TextureHost when needed, for instance when using D3D APIs to lock the texture's memory, but the rest should be in the TextureSource.
> When we first render one,
> we lazily allocate the correct TextureSource for the Compositor type we have.
In some cases like shmem, creating the TextureSource means doing the texture upload so we may want to control doing it in the transaction to avoid double buffering
>
> I guess we could have a problem where we require the compositor in order to
> do resource deallocation (needing a device/gl context etc). I think we
> already have this problem though, if we start async video but don't have
> OMTC enabled (or close the window before the compositor is initialised).
> This change just make the issue much more visible (which seems like a good
> thing).
I don't know if it's doable to destroy a window before it gets its first layer transaction. The code that kicks the transaction is the same that caused the first frame to be sent through async-video, the latter was just faster at getting to the compositor side and I don't think we have anything that will interrupt the former.
Comment 26•11 years ago
|
||
TextureHosts often need access to platform headers (gl.h, d3d.h, gfxWindowsPlatform.h etc), but they should never need access to our backend headers (TextureHostD3D11.h, CompositorD3D11.h etc).
That's why I'd prefer to have them moved outside of the platform specific folders, because we don't export the headers, and it guarantees we don't break that.
As for the naming, I think your example still needs GL in the name, because the underlying object is a GL texture. Not because it has anything to do with our CompositorOGL code. But, for example, MacIOSurfaceTextureHostOGL, could drop the suffix and move out of opengl/.
Comment 27•11 years ago
|
||
I can also reproduce by setting:
user_pref("layers.acceleration.disabled", true);
and loading:
layout/reftests/ogg-video/black100x100-aspect3to2.ogv
Comment 28•11 years ago
|
||
(on mozilla-central)
Comment 29•11 years ago
|
||
Backed out because bug 947045 was backed out and it appeared that these landings were intertwined a bit.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a150cbbc959b
I relanded this:
https://hg.mozilla.org/integration/mozilla-inbound/rev/61c1351849d8
Comment 31•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Assignee | ||
Comment 32•11 years ago
|
||
Comment on attachment 8368641 [details] [diff] [review]
Pick the backend in function of the descriptor type
[Approval Request Comment]
Bug caused by (feature/regressing bug #):
User impact if declined: crashes
Testing completed (on m-c, etc.): a few days
Risk to taking this patch (and alternatives if risky): low risk
String or IDL/UUID changes made by this patch:
Attachment #8368641 -
Flags: approval-mozilla-aurora?
Updated•11 years ago
|
Attachment #8368641 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 33•11 years ago
|
||
Updated•11 years ago
|
status-firefox29:
--- → affected
Updated•11 years ago
|
status-firefox30:
--- → fixed
Updated•11 years ago
|
Crash Signature: [@ mozilla::layers::TextureParent::RecvInit(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
[@ mozilla::layers::TextureParent::Init(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ] → [@ mozilla::layers::TextureParent::RecvInit(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
[@ mozilla::layers::TextureParent::Init(mozilla::layers::SurfaceDescriptor const&, unsigned int const&) ]
[@ mozilla::layers::TextureHost::Crea…
OS: Mac OS X → All
Summary: Crashes at mozilla::layers::TextureParent::RecvInit() on OS X with hardware acceleration disabled → Crashes at mozilla::layers::TextureParent::RecvInit() or mozilla::layers::TextureHost::Create(mozilla::layers::SurfaceDescriptor const&, mozilla::layers::ISurfaceAllocator*, unsigned int)
Comment 35•11 years ago
|
||
Reproduced in nightly 2014-02-03, Mac OS X 10.8.5, HWA disabled, by opening:
http://www.ifixit.com/Teardown/Mac+Pro+Late+2013+Teardown/20778
http://www.nytimes.com/redesign/
http://www.visagetechnologies.com/HTML5/Samples/VirtualEyewearTryOn/eyewear.html
http://www.flickr.com/
Verified fixed in FF 29.0a2(2014-02-11), 30.0a1(2014-02-11), Mac OS X 10.8.5.
You need to log in
before you can comment on or make changes to this bug.
Description
•