crash in mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)

RESOLVED FIXED in Firefox 41

Status

()

defect
--
critical
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: dmajor, Assigned: bas.schouten)

Tracking

(Depends on 1 bug, {crash, topcrash-win})

39 Branch
mozilla43
x86
Windows NT
Points:
---

Firefox Tracking Flags

(firefox39 wontfix, firefox40+ wontfix, firefox41+ verified, firefox42+ fixed, firefox43+ fixed)

Details

(crash signature)

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

4 years ago
This bug was filed from the Socorro interface and is 
report bp-5b639072-3c30-48dd-88d6-ba7772150705.
=============================================================

mozalloc!mozalloc_abort
xul!NS_DebugBreak
xul!mozilla::layers::BasicCompositor::DrawQuad
xul!mozilla::layers::ImageHost::Composite
xul!mozilla::layers::ImageLayerComposite::RenderLayer
xul!mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>
xul!mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>
xul!mozilla::layers::ContainerLayerComposite::RenderLayer
xul!mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>
xul!mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>
xul!mozilla::layers::ContainerLayerComposite::RenderLayer
xul!mozilla::layers::LayerManagerComposite::Render
xul!mozilla::layers::LayerManagerComposite::EndTransaction
xul!mozilla::layers::LayerManagerComposite::EndEmptyTransaction
xul!mozilla::layers::CompositorParent::CompositeToTarget
xul!mozilla::layers::CompositorParent::CompositeCallback
(Reporter)

Comment 1

4 years ago
[Tracking Requested - why for this release]: Moderately high crash on 39 release, all current versions affected.
(Reporter)

Comment 2

4 years ago
   419     case EffectTypes::YCBCR: {
   420       NS_RUNTIMEABORT("Can't (easily) support component alpha with BasicCompositor!");
   421       break;
   422     }

NS_RUNTIMEABORT strikes me as a pretty bad way to say we don't support something. Can we do any better?
Flags: needinfo?(matt.woodrow)
It's not meant to be possible to get to that line, so RUNTIMEABORT is the right behaviour. It's sad that this made it to release without being seen though.
Flags: needinfo?(matt.woodrow)
Nical probably knows the most about this.

One interesting from the crash report is "D3D11 Layers? D3D11 Layers+ D3D11 Layers-"

Do we know what causes that? Is it possible that we're reporting to content that we have D3D11 layers even though it failed the second time (?) and we now have BasicCompositor?

IMFYCbCrImage will create a PTexture with format YCbCr if it thinks we're using a D3D11 compositor, which won't go through the mCompositor->SupportsEffect() checks we have for BufferTextureHost.
Flags: needinfo?(nical.bugzilla)
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> Nical probably knows the most about this.
> 
> One interesting from the crash report is "D3D11 Layers? D3D11 Layers+ D3D11
> Layers-"


huh, fun. 

> Do we know what causes that? Is it possible that we're reporting to content
> that we have D3D11 layers even though it failed the second time (?) and we
> now have BasicCompositor?

I guess so?
We've had other examples of D3D11+ followed by D3D11-, and sometimes, more worryingly, the other way around.
Flags: needinfo?(bas)
Doesn't look like there was a device reset, at least there are no messages in the crash.  Let's take that assumption, and figure out how we ended up in this situation.  We already know (e.g., see https://bugzilla.mozilla.org/show_bug.cgi?id=1172014#c22) that we get into D3D code even when the acceleration is disabled.  We can't afford to assume this is an unreported device reset, need to rule out all the other options first.
Assignee: nobody → bas
Depends on: 1183789
Flags: needinfo?(nical.bugzilla)
(In reply to Milan Sreckovic [:milan] from comment #8)
> Doesn't look like there was a device reset, at least there are no messages
> in the crash.  Let's take that assumption, and figure out how we ended up in
> this situation.  We already know (e.g., see
> https://bugzilla.mozilla.org/show_bug.cgi?id=1172014#c22) that we get into
> D3D code even when the acceleration is disabled.  We can't afford to assume
> this is an unreported device reset, need to rule out all the other options
> first.

How are you determining this? I can't see anywhere that would log device resets to the crash report, but I'm likely missing something.
My best guess at the moment:

The user is playing video, so ImageBridgeChild is producing d3d specific TextureClients and sending them to the compositor.

We get a TDR, recreate our LayerManager/Compositor pipeline, and d3d fails, so we end up with BasicCompositor. We also notify ImageBridge of the new compositor backend so it won't produce any more d3d specific TextureClients.

BasicCompositor does a composite and uses the existing TextureClients owned by ImageBridge, and then we hit the NS_RUNTIMEABORT since this won't ever work.

We might need to make sure that we recreate ImageBridge as well, or at least flush all TextureClients out of it before recreating compositors.
Another option is that we add a way to function to TextureHost (or TextureSource) that checks if it is compatible with the current compositor.

We could then discard/ignore any that fail that test, and assume that we're in a race condition and that all will be well soon.

That's not an alternative to trying to fix the races, but it might be a useful solution in the interim to stop crashing.

What do you think nical?
Flags: needinfo?(nical.bugzilla)
Actually, fixing the race might be as easy as having Image::GetTextureClient confirm that the current cached TextureClient is compatible with the compositor backend we're about to send it to.
(In reply to Matt Woodrow (:mattwoodrow) from comment #11)
> Another option is that we add a way to function to TextureHost (or
> TextureSource) that checks if it is compatible with the current compositor.
>
> What do you think nical?

Ok
Flags: needinfo?(nical.bugzilla)
(Assignee)

Comment 15

4 years ago
I'm hoping that now that we've made TextureClient do its multithreading properly this might be much less likely to occur as well.
Flags: needinfo?(bas)
The crash rate for 40 is in an acceptable range. I'm marking this as wontfix for 40.
(Reporter)

Updated

4 years ago
Duplicate of this bug: 1196693
Bas, I do not see this crash being hit in Beta41 at all. Should we set FF41 status to fixed based on that?
Flags: needinfo?(bas)

Comment 19

4 years ago
as dmajor has marked bug 1196693 as duplicate of this one, probably just the signature has shifted a bit.
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)] → [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)] [@ mozall…

Updated

4 years ago
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)] [@ mozall… → [@ mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::layers::BasicCompositor::DrawQuad(mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)] [@ mozall…
Bas, let's get a patch for the wallpaper (see comment 13), then work on the correct fix.
Flags: needinfo?(nical.bugzilla)
Assignee: bas → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
Attachment #8655961 - Flags: review?(matt.woodrow)
(Assignee)

Comment 22

4 years ago
I'm about to put a patch up here that I believe is safer for an uplift to beta.
Flags: needinfo?(bas)
(Assignee)

Comment 23

4 years ago
This seems to be the safest choice, this replicates exactly the conditions in which DrawQuad might fail and should be very safe for uplift to beta.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b92d5c8d6c18
Attachment #8655971 - Flags: review?(nical.bugzilla)
(Reporter)

Comment 24

4 years ago
Comment on attachment 8655971 [details] [diff] [review]
Do not allow incompatible effects

Review of attachment 8655971 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/basic/BasicCompositor.h
@@ +121,5 @@
>    }
>  
>    virtual nsIWidget* GetWidget() const override { return mWidget; }
>  
> +  virtual bool SupportsEffect(const Effect* aEffect);

Does static analysis complain if you don't mark this as override?
Comment on attachment 8655971 [details] [diff] [review]
Do not allow incompatible effects

Review of attachment 8655971 [details] [diff] [review]:
-----------------------------------------------------------------

We already have Compositor::SupportsEffect. It takes an EffectType rather than an effect but otherwise it is the same.
Attachment #8655971 - Flags: review?(nical.bugzilla) → review-
Comment on attachment 8655961 [details] [diff] [review]
Check that the compositor is compatible in TextureHost::SetCompositor

Bas will have a patch that's simpler to uplift, so let's take Bas's patch first and see if we need this one. If the issue is that we are using a d3d texture with the basic compositor, my patch will fix more potential cases (like if we send rgb video frames rather than ycbcr, which is not as common but possible) so we may want to take in addition to Bas's eventually.
Attachment #8655961 - Flags: review?(matt.woodrow)
Assignee: nical.bugzilla → bas
(Assignee)

Comment 28

4 years ago
Attachment #8655971 - Attachment is obsolete: true
Attachment #8656667 - Flags: review?(nical.bugzilla)
Attachment #8656667 - Flags: review?(nical.bugzilla) → review+
(Assignee)

Comment 29

4 years ago
Comment on attachment 8656667 [details] [diff] [review]
Do not allow incompatible effects v2

Approval Request Comment
[Feature/regressing bug #]: Unknown
[User impact if declined]: Top crasher
[Describe test coverage new/current, TreeHerder]: Some try coverage, beta doesn't run clean on try
[Risks and why]: Very low, early abort only
[String/UUID change made/needed]: None
Attachment #8656667 - Flags: approval-mozilla-beta?
Comment on attachment 8656667 [details] [diff] [review]
Do not allow incompatible effects v2

Approved for uplift to Beta as it appears to be a simple fix for the crash.
Attachment #8656667 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Ritu Kothari (:ritu) from comment #30)
> Comment on attachment 8656667 [details] [diff] [review]
> Do not allow incompatible effects v2
> 
> Approved for uplift to Beta as it appears to be a simple fix for the crash.

this doesn't apply cleanly to beta

renamed 1182147 -> bug1182147
applying bug1182147
patching file gfx/layers/basic/BasicCompositor.cpp
Hunk #1 FAILED at 142
1 out of 1 hunks FAILED -- saving rejects to file gfx/layers/basic/BasicCompositor.cpp.rej
patching file gfx/layers/composite/ImageHost.cpp
Hunk #1 FAILED at 105
1 out of 1 hunks FAILED -- saving rejects to file gfx/layers/composite/ImageHost.cpp.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working directory
errors during apply, please fix and refresh bug1182147

Bas, could you take a look, thanks!
Flags: needinfo?(bas)
(Assignee)

Comment 32

4 years ago
(In reply to Carsten Book [:Tomcat] from comment #31)
> (In reply to Ritu Kothari (:ritu) from comment #30)
> > Comment on attachment 8656667 [details] [diff] [review]
> > Do not allow incompatible effects v2
> > 
> > Approved for uplift to Beta as it appears to be a simple fix for the crash.
> 
> this doesn't apply cleanly to beta
> 
> renamed 1182147 -> bug1182147
> applying bug1182147
> patching file gfx/layers/basic/BasicCompositor.cpp
> Hunk #1 FAILED at 142
> 1 out of 1 hunks FAILED -- saving rejects to file
> gfx/layers/basic/BasicCompositor.cpp.rej
> patching file gfx/layers/composite/ImageHost.cpp
> Hunk #1 FAILED at 105
> 1 out of 1 hunks FAILED -- saving rejects to file
> gfx/layers/composite/ImageHost.cpp.rej
> patch failed, unable to continue (try -v)
> patch failed, rejects left in working directory
> errors during apply, please fix and refresh bug1182147
> 
> Bas, could you take a look, thanks!

I'll have a look, I created this for Beta though so something else must've been uplifted to conflict.
Flags: needinfo?(bas)
Bas, tomorrow we gtb last Beta (41.0b9), so we really need a beta patch uplift request soon/today?
Flags: needinfo?(bas)
Milan, I approved the beta uplift on this one almost a week ago and still waiting for a patch to land. Any help here would be appreciated.
Flags: needinfo?(milan)
(Assignee)

Comment 35

4 years ago
Not to worry, the reason this failed was because I landed this last week, just forgot to post the changeset here:

http://hg.mozilla.org/releases/mozilla-beta/rev/853b4b315d5f
Flags: needinfo?(milan)
Flags: needinfo?(bas)
No more occurrences of this bug in 41.0b8.

Bas, Nical, what about uplift this patch also to 42 before it goes to beta? Thanks
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(bas)
Comment on attachment 8656667 [details] [diff] [review]
Do not allow incompatible effects v2

Review of attachment 8656667 [details] [diff] [review]:
-----------------------------------------------------------------

Already landed in beta, we want it in aurora for the same reasons.
Attachment #8656667 - Flags: approval-mozilla-aurora?
Comment on attachment 8656667 [details] [diff] [review]
Do not allow incompatible effects v2

Thanks
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(bas)
Attachment #8656667 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/05202b078d10
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
You need to log in before you can comment on or make changes to this bug.