Closed Bug 1247977 Opened 4 years ago Closed 3 years ago

crash in mozilla::WebGLContext::FakeBlackTexture::FakeBlackTexture regressing in 45

Categories

(Core :: Canvas: WebGL, defect, critical)

45 Branch
x86
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox45 --- wontfix
firefox46 + wontfix
firefox47 + wontfix
firefox48 --- wontfix
firefox49 + fixed
firefox-esr45 --- wontfix
firefox50 + fixed

People

(Reporter: philipp, Assigned: mtseng)

References

()

Details

(Keywords: crash, regression, Whiteboard: gfx-noted)

Crash Data

Attachments

(5 files, 5 obsolete files)

3.71 KB, text/plain
Details
58 bytes, text/x-review-board-request
jgilbert
: review+
Details
2.65 KB, patch
jgilbert
: review+
Details | Diff | Splinter Review
8.98 KB, patch
mtseng
: review+
Details | Diff | Splinter Review
9.01 KB, patch
Details | Diff | Splinter Review
This bug was filed from the Socorro interface and is 
report bp-13c9bb23-4113-4bb4-bb60-7be612160205.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::WebGLContext::FakeBlackTexture::FakeBlackTexture(mozilla::gl::GLContext*, StrongGLenum<TexTargetDetails>, mozilla::FakeBlackType) 	dom/canvas/WebGLContextDraw.cpp
1 	xul.dll 	mozilla::WebGLContext::BindFakeBlack(unsigned int, StrongGLenum<TexTargetDetails>, mozilla::FakeBlackType) 	dom/canvas/WebGLContextDraw.cpp
2 	xul.dll 	<lambda_7b06d7e9b9eb5e3e25ced0db7be34b0f>::operator() 	dom/canvas/WebGLContextDraw.cpp
3 	xul.dll 	mozilla::ScopedResolveTexturesForDraw::ScopedResolveTexturesForDraw(mozilla::WebGLContext*, char const*, bool* const) 	dom/canvas/WebGLContextDraw.cpp
4 	xul.dll 	js::GetIterator(JSContext*, JS::Handle<JSObject*>, unsigned int, JS::MutableHandle<JSObject*>) 	js/src/jsiter.cpp
5 	xul.dll 	mozilla::dom::WebGLRenderingContextBinding::drawElements 	obj-firefox/dom/bindings/WebGLRenderingContextBinding.cpp
6 	xul.dll 	mozilla::dom::GenericBindingMethod(JSContext*, unsigned int, JS::Value*) 	dom/bindings/BindingUtils.cpp
7 		@0x146b4ad8 	
8 		@0xffffff80

this crash signature is regressing in 45 and is happening in a MOZ_CRASH section introduced in bug 1221822.
Whiteboard: gfx-noted
I have a permafail in 46 with this signature.  Attaching info
Added the permafail URL
NI milan for triage
Flags: needinfo?(milan)
Morris, could this be from some of your recent work?  Let's see if we can take care of it before 46 goes to Beta.
Flags: needinfo?(mtseng)
Flags: needinfo?(milan)
Flags: needinfo?(jgilbert)
I can't reproduce this. Reproduce this in a debugger and you'll be able to pull out the contents of `dui` and the value of the error produced, and/or step into ANGLE to see what it's complaining about.
Flags: needinfo?(jgilbert)
Randell, any chance of a regression range?

Jeff, you tried this with D3D9 ANGLE and basic compositor?
Flags: needinfo?(rjesup)
Flags: needinfo?(jgilbert)
(In reply to Milan Sreckovic [:milan] from comment #6)
> Randell, any chance of a regression range?
> 
> Jeff, you tried this with D3D9 ANGLE and basic compositor?

I tried with d3d9 ANGLE, but didn't try the basic compositor. I can try that too.
(In reply to Jeff Gilbert [:jgilbert] from comment #7)
> (In reply to Milan Sreckovic [:milan] from comment #6)
> > Randell, any chance of a regression range?
> > 
> > Jeff, you tried this with D3D9 ANGLE and basic compositor?
> 
> I tried with d3d9 ANGLE, but didn't try the basic compositor. I can try that
> too.

I cannot reproduce this.
Flags: needinfo?(jgilbert)
Is it possible this is a duplicate of bug 1247977?  Randall, could you check with the patch that's on that bug?
Hi,

I have tested this issue with the provided URL on the latest Nightly (47.0a1) build and I have managed to crash the browser, but the signature is different from the provided one (https://crash-stats.mozilla.com/report/index/bp-41f0e3b1-9b41-49bd-9702-20f732160304).  

Firefox: 47.0a1, Build ID: 20160303030253
User Agent: Mozilla/5.0 (Windows NT 6.3; rv:47.0) Gecko/20100101 Firefox/47.0

The browser crashed following this steps:
1. Open in a new tab this link http://collapse-thedivisiongame.ubi.com/en/#
2. Open in another tab YouTube and play a video
3. Open again this link http://collapse-thedivisiongame.ubi.com/en/# in another tab
4. Refresh the tab from the step 3.
Is it still related with this bug or should I log another? 

I have performed a regression and this is the obtained push log:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=56f5ffd44b7f13d5db5bd497018f428dcbb28fc4&tochange=b77b0bda050313973e5f24b165528ef94079bf05. 

The first build that Firefox crashed with this signature is form 2015-09-07. I am not sure if this is a good pushlog, I will investigate more next week.


I have also tested on the latest Beta (45.0 - Build ID: 20160301003640 ) and the browser crashed with this signature: https://crash-stats.mozilla.com/report/index/bp-0b4159fa-1cb1-41be-994e-635972160304

Thanks,
Cosmin.
(In reply to Milan Sreckovic [:milan] from comment #9)
> Is it possible this is a duplicate of bug 1247977?  Randall, could you check
> with the patch that's on that bug?

Which bug did you have in mind? This is the same bug number.
Flags: needinfo?(milan)
Right.  Sorry.  Bug 1228687.
Flags: needinfo?(milan) → needinfo?(jgilbert)
Tracking for 46 and 47, looks like it may be a recent regression and we see crashes on those channels.
Bug 1228687 landed on central on 3/18, and I don't see crashes in nightly/48 since 3/16 build, so this could be handled.  Randell, it'd still be good to know if it fixes it for you.
Assignee: nobody → milan
Flags: needinfo?(mtseng)
Flags: needinfo?(jgilbert)
It works for me in Nightly 3/23 on the same system
Flags: needinfo?(rjesup)
Making this a duplicate of bug 1228687, all indications are this is the case.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1228687
Just found a crash from 3/26 build (https://crash-stats.mozilla.com/report/index/fa28d3a0-5223-4e60-b6fd-475cc2160331) and the same signature.  Randell, does the 3/26 or later build crash for you?
Flags: needinfo?(rjesup)
The crash above is on WGL, and we don't actually want to allow that (handled in another bug), but I will reopen this for now.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to Milan Sreckovic [:milan] from comment #18)
> The crash above is on WGL, and we don't actually want to allow that (handled
> in another bug), but I will reopen this for now.

Bug 1261179 for "don't do WGL" as a fallback on Windows.
Comment on attachment 8736954 [details]
MozReview Request: Bug 1247977: More information when we hit the OpenGL error in FakeBlackTexture. r?jgilbert

https://reviewboard.mozilla.org/r/43603/#review40227

::: dom/canvas/WebGLContextDraw.cpp:851
(Diff revision 1)
>          for (int i = 0; i < 6; ++i) {
>              const TexImageTarget curTarget = LOCAL_GL_TEXTURE_CUBE_MAP_POSITIVE_X + i;
>              const GLenum error = DoTexImage(mGL, curTarget.get(), 0, &dui, 1, 1, 1,
>                                              zeros.get());
> -            if (error)
> -                MOZ_CRASH("Unexpected error during FakeBlack creation.");
> +            if (error) {
> +                gfxCriticalError() << "Unexpected error " << (int) error << " during FakeBlack creation on index " << i << " with format " << (int)texFormat;

We want `curTarget` and `dui`.

Cleaner as:
const nsPrintfCString text("DoTexImage failed with `error`: 0x%04x, for `curTarget`: 0x%04x, `dui`: {0x%04x, 0x%04x, 0x%04x}.", error, curTarget.get(), dui.internalFormat, dui.unpackFormat, dui.unpackType);
gfxCriticalError() << text.BeginReading();

::: dom/canvas/WebGLContextDraw.cpp:859
(Diff revision 1)
>          }
>      } else {
>          const GLenum error = DoTexImage(mGL, target.get(), 0, &dui, 1, 1, 1,
>                                          zeros.get());
> -        if (error)
> +        if (error) {
> +            gfxCriticalError() << "Unexpected error " << (int) error << " during FakeBlack creation with format " << (int)texFormat;

Same thing here, but `target` instead of `curTarget`.
Attachment #8736954 - Flags: review?(jgilbert)
(In reply to Milan Sreckovic [:milan] from comment #17)
> Just found a crash from 3/26 build
> (https://crash-stats.mozilla.com/report/index/fa28d3a0-5223-4e60-b6fd-
> 475cc2160331) and the same signature.  Randell, does the 3/26 or later build
> crash for you?

3/26 Nightly - no crash
Flags: needinfo?(rjesup)
Comment on attachment 8736954 [details]
MozReview Request: Bug 1247977: More information when we hit the OpenGL error in FakeBlackTexture. r?jgilbert

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/43603/diff/1-2/
Attachment #8736954 - Flags: review?(jgilbert)
Comment on attachment 8736954 [details]
MozReview Request: Bug 1247977: More information when we hit the OpenGL error in FakeBlackTexture. r?jgilbert

https://reviewboard.mozilla.org/r/43603/#review41161
Attachment #8736954 - Flags: review?(jgilbert) → review+
Still getting lots of crashes.  Out of >2000 crashes last week, only two are under WGL.  One is in the cube map, the rest are 2D texture creation.  We do get this across all vendors.

Here's a recent (48) crash, not under WGL:

https://crash-stats.mozilla.com/report/index/68249e55-0622-4b1f-bc69-1893f2160506

DoTexImage fails (and then we MOZ_CRASH):

Error is 0x0505, which is "out of memory".
Target is 0x0de1 (3553) - I don't know if this number is out of the ordinary.
Internal format is 0x1907, which is RGB (as opposed to RGBA.)
Unpack format is 0x1907, which is RGB (it would always match the internal format in this call.)
Unpack type is 0x1401, which is unsigned byte (all of them are in this call.)

This one: https://crash-stats.mozilla.com/report/index/296eab9a-9165-4698-bc68-c3fc22160509 eventually fails similary, but initially complains a lot about DXVA. Probably not related.

Haven't found anything in the nightly, but perhaps that is a usage pattern.

Here's the newest build I found; nothing much different than the rest of the 48s:
https://crash-stats.mozilla.com/report/index/5e33f4b7-f2ff-4f5a-9937-325012160510

Jeff, is this really out of memory?  On a 1x1 RGB texture?  Anything we can do at this point, other than crash - context reset?
Flags: needinfo?(jgilbert)
(In reply to Milan Sreckovic [:milan] from comment #27)
> Still getting lots of crashes.  Out of >2000 crashes last week, only two are
> under WGL.  One is in the cube map, the rest are 2D texture creation.  We do
> get this across all vendors.
> 
> Here's a recent (48) crash, not under WGL:
> 
> https://crash-stats.mozilla.com/report/index/68249e55-0622-4b1f-bc69-
> 1893f2160506
> 
> DoTexImage fails (and then we MOZ_CRASH):
> 
> Error is 0x0505, which is "out of memory".
> Target is 0x0de1 (3553) - I don't know if this number is out of the ordinary.
> Internal format is 0x1907, which is RGB (as opposed to RGBA.)
> Unpack format is 0x1907, which is RGB (it would always match the internal
> format in this call.)
> Unpack type is 0x1401, which is unsigned byte (all of them are in this call.)
> 
> This one:
> https://crash-stats.mozilla.com/report/index/296eab9a-9165-4698-bc68-
> c3fc22160509 eventually fails similary, but initially complains a lot about
> DXVA. Probably not related.
> 
> Haven't found anything in the nightly, but perhaps that is a usage pattern.
> 
> Here's the newest build I found; nothing much different than the rest of the
> 48s:
> https://crash-stats.mozilla.com/report/index/5e33f4b7-f2ff-4f5a-9937-
> 325012160510
> 
> Jeff, is this really out of memory?  On a 1x1 RGB texture?  Anything we can
> do at this point, other than crash - context reset?

I'll take another look. IIRC this OOM was not actually OOM, but rather ANGLE saying using GL_OOM to bail on doing an operation it doesn't like.
Is there a way to get to the "extra" error message text, instead of just the error enum?
Assignee: milan → jgilbert
(In reply to Milan Sreckovic [:milan] from comment #29)
> Is there a way to get to the "extra" error message text, instead of just the
> error enum?

There's a GL extension (that we recognize and support) but I don't think ANGLE supports this yet.
Flags: needinfo?(jgilbert)
OS: Windows NT → Windows
This is at >1500 crashes, and no bandwidth from Jeff - Peter, can you find somebody to take a look?
Assignee: jgilbert → howareyou322
Morris, any idea for comment 27?
Flags: needinfo?(mtseng)
This would log detailed message when ANGLE failure. We might get more useful info instead of just GL_OUT_OF_MEMORY.
Attachment #8764527 - Flags: review?(jgilbert)
Checking.
Flags: needinfo?(mtseng)
Assignee: howareyou322 → mtseng
Attachment #8764527 - Flags: review?(jgilbert) → review+
Pushed by mtseng@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/74ed3becf2d2
Log detailed error message when FakeBlackTexture creation fail. r=jgilbert
I got one report with ANGLE error message.
https://crash-stats.mozilla.com/report/index/7f72d31e-8514-4596-ba81-87f982160705

Which says "[C2454][GFX1-]: Failed to create staging texture, result: 0x887A0005. (t=38696.1)"
(In reply to Morris Tseng [:mtseng] [:Morris] from comment #37)
> I got one report with ANGLE error message.
> https://crash-stats.mozilla.com/report/index/7f72d31e-8514-4596-ba81-
> 87f982160705
> 
> Which says "[C2454][GFX1-]: Failed to create staging texture, result:
> 0x887A0005. (t=38696.1)"

0x887A0005 is DXGI_ERROR_DEVICE_REMOVED. So this crash is happened when device reset occurred. I found that there are other crashes are happened without device reset. So I'll wait more crash report to analysis.
Might it be worth uplifting this bug to Aurora so that we get more data?
Flags: needinfo?(mtseng)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #39)
> Might it be worth uplifting this bug to Aurora so that we get more data?

Great idea. Let's do this.
Flags: needinfo?(mtseng)
Comment on attachment 8764527 [details] [diff] [review]
Log detailed error message when FakeBlackTexture creation fail.

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: Don't impact on user since this patch only log additional message in crash report.
[Describe test coverage new/current, TreeHerder]: Test on nightly.
[Risks and why]: Low risk, just add some log.
[String/UUID change made/needed]: N/A
Attachment #8764527 - Flags: approval-mozilla-aurora?
Comment on attachment 8764527 [details] [diff] [review]
Log detailed error message when FakeBlackTexture creation fail.

This should help diagnose some crashes.  ok for aurora uplift.
Attachment #8764527 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Tracking this since it's still showing up strongly on release (1000 or so crashes in the last week). 
I would still take a patch for 49 in aurora or once it's on beta if you come up with a fix.
According to the crash stats, all fake black textures creation failures due to device reset. So caould we just ignore when the error occured?
Attachment #8774205 - Flags: review?(jgilbert)
Comment on attachment 8774205 [details] [diff] [review]
Handle FakeBlackTexture creation fail.

Review of attachment 8774205 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/WebGLContextDraw.cpp
@@ +202,5 @@
>      }
>      UniquePtr<FakeBlackTexture>& fakeBlackTex = *slot;
>  
>      if (!fakeBlackTex) {
>          fakeBlackTex.reset(new FakeBlackTexture(gl, target, fakeBlack));

Since this object is only ever heap-allocated, make the ctor trivial and protected, and expose object creation fallibly via a static Create function.
Attachment #8774205 - Flags: review?(jgilbert) → review-
Addressed comment.
Attachment #8774229 - Flags: review?(jgilbert)
Attachment #8774205 - Attachment is obsolete: true
Comment on attachment 8774229 [details] [diff] [review]
Handle FakeBlackTexture creation fail v2.

Review of attachment 8774229 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/WebGLContextDraw.cpp
@@ +887,5 @@
> +                                                       FakeBlackType type)
> +{
> +    UniquePtr<WebGLContext::FakeBlackTexture> result;
> +    result.reset(new WebGLContext::FakeBlackTexture(gl, target, type));
> +    if (!result->mIsValid) {

s/CreateFakeBlackTexture/Create/

So this works, but this isn't really better than before.

I want this done without mIsValid. Ideally all of the fallible logic that's currently in the ctor will be contained in Create. Usually this means that the ctor just sets its constant members and nothing else.
Attachment #8774229 - Flags: review?(jgilbert) → review-
Addressed comment.
Attachment #8774523 - Flags: review?(jgilbert)
Attachment #8774229 - Attachment is obsolete: true
Comment on attachment 8774523 [details] [diff] [review]
Handle FakeBlackTexture creation fail v3.

Review of attachment 8774523 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/WebGLContextDraw.cpp
@@ +899,5 @@
>      default:
>          MOZ_CRASH("GFX: bad type");
>      }
>  
> +    auto tex = CreateGLTexture(gl);

This will leak the texture on failure.
Generally what you want to do is:
UniquePtr<...> ret = new FakeBlackTexture(gl::GLContext* gl);
gl::ScopedBindTexture scopedBind(gl, ret->mGLName, target.get());
Attachment #8774523 - Flags: review?(jgilbert) → review-
Fix potential leak.
Attachment #8774534 - Flags: review?(jgilbert)
Attachment #8774523 - Attachment is obsolete: true
Comment on attachment 8774534 [details] [diff] [review]
Handle FakeBlackTexture creation fail v4.

Review of attachment 8774534 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8774534 - Flags: review?(jgilbert) → review+
Attachment #8774534 - Attachment is obsolete: true
Crash volume for signature 'mozilla::WebGLContext::FakeBlackTexture::FakeBlackTexture':
 - esr (version 45): 2610 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          6          3          2          5          1          1          2
 - aurora          11         12         12         12          9         14          1
 - beta           255        233        229        287        256        220         54
 - release       1691       1668       1586       1593       1567       1547        529
 - esr            291        288        319        307        283        303        163

Affected platform: Windows
Pushed by mtseng@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2da7497fc16b
Handle FakeBlackTexture creation fail. r=jgilbert
Should be fixed.
Status: REOPENED → RESOLVED
Closed: 4 years ago3 years ago
Resolution: --- → FIXED
Does something need uplifting to 49 here still?
Flags: needinfo?(mtseng)
Target Milestone: --- → mozilla50
Comment on attachment 8775425 [details] [diff] [review]
Handle FakeBlackTexture creation fail. (patch for uplifting to aurora)

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: May crash randomly when using webgl
[Describe test coverage new/current, TreeHerder]: test on locally. Also pass try.
[Risks and why]: Low
[String/UUID change made/needed]: N/A
Flags: needinfo?(mtseng)
Attachment #8775425 - Flags: approval-mozilla-aurora?
Comment on attachment 8775425 [details] [diff] [review]
Handle FakeBlackTexture creation fail. (patch for uplifting to aurora)

Fix a crash, taking it.
Attachment #8775425 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Previous patch has a typo. Upload a correct patch. Do I need request approval again?
Attachment #8775425 - Attachment is obsolete: true
Flags: needinfo?(mtseng) → needinfo?(ryanvm)
Nope, thanks for fixing it.
Flags: needinfo?(ryanvm)
You need to log in before you can comment on or make changes to this bug.