Closed Bug 1027604 Opened 10 years ago Closed 9 years ago

Intermittent base.py | application crashed [@ mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&)]

Categories

(Core :: Graphics: Layers, defect)

Hardware: ARM
OS: Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: critical

RESOLVED WORKSFORME

People

(Reporter: emorley, Unassigned)

Details

(Keywords: crash, intermittent-failure)

b2g_macosx64 b2g-inbound opt test gaia-ui-test on 2014-06-18 21:56:32 PDT for push b7b394f42062

slave: talos-mtnlion-r5-006

https://tbpl.mozilla.org/php/getParsedLog.php?id=42022963&tree=B2g-Inbound

{
22:09:45     INFO -  TEST-START test_keyboard.py
22:10:02     INFO -  test_keyboard_basic (test_keyboard.TestKeyboard) ... mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-macosx64_gecko/1403151453/en-US/b2g-33.0a1.en-US.mac64.crashreporter-symbols.zip
22:10:17    ERROR -  PROCESS-CRASH | base.py | application crashed [@ mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&)]
22:10:17     INFO -  Crash dump filename: /var/folders/sh/x3_cwy_x189_cj3v8yxlp9m000000w/T/tmp9GM0ev/minidumps/500A751D-CAF5-434B-AC21-1E2E48B96EF7.dmp
22:10:17     INFO -  Operating system: Mac OS X
22:10:17     INFO -                    10.8.0 12A269
22:10:17     INFO -  CPU: amd64
22:10:17     INFO -       family 6 model 42 stepping 7
22:10:17     INFO -       8 CPUs
22:10:17     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
22:10:17     INFO -  Crash address: 0x0
22:10:17     INFO -  Thread 20 (crashed)
22:10:17     INFO -   0  XUL!mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&) [GLBlitTextureImageHelper.cpp:b7b394f42062 : 76 + 0x0]
22:10:17     INFO -      rbx = 0x0000000114605ec0   r12 = 0x0000000000000000
22:10:17     INFO -      r13 = 0x0000000000000032   r14 = 0x00000000000000a9
22:10:17     INFO -      r15 = 0x0000000000000000   rip = 0x000000010197ce71
22:10:17     INFO -      rsp = 0x000000010d8022a0   rbp = 0x0000000000000140
22:10:17     INFO -      Found by: given as instruction pointer in context
22:10:17     INFO -   1  XUL!mozilla::layers::TextureImageTextureSourceOGL::CopyTo(nsIntRect const&, mozilla::layers::DataTextureSource*, nsIntRect const&) [TextureHostOGL.cpp:b7b394f42062 : 360 + 0xd]
22:10:17     INFO -      rbx = 0x0000000109427d00   r12 = 0x0000000109427b80
22:10:17     INFO -      r13 = 0x0000000000000000   r14 = 0x000000010d802628
22:10:17     INFO -      r15 = 0x000000010d802638   rip = 0x0000000101a6e857
22:10:17     INFO -      rsp = 0x000000010d802550   rbp = 0x00000001191217e0
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   2  XUL!mozilla::layers::ContentHostIncremental::TextureCreationRequest::Execute(mozilla::layers::ContentHostIncremental*) [ContentHost.cpp:b7b394f42062 : 570 + 0x7]
22:10:17     INFO -      rbx = 0x0000000109427d00   r12 = 0x0000000000000140
22:10:17     INFO -      r13 = 0x0000000000000000   r14 = 0x0000000000000000
22:10:17     INFO -      r15 = 0x0000000000000000   rip = 0x0000000101a3e169
22:10:17     INFO -      rsp = 0x000000010d802580   rbp = 0x00000001191217e0
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   3  XUL!mozilla::layers::ContentHostIncremental::ProcessTextureUpdates() [ContentHost.cpp:b7b394f42062 : 454 + 0x8]
22:10:17     INFO -      rbx = 0x0000000000000003   r12 = 0x0000000000000005
22:10:17     INFO -      r13 = 0x00000001191217e0   r14 = 0x0000000119121850
22:10:17     INFO -      r15 = 0x00000001191217e0   rip = 0x0000000101a3daf0
22:10:17     INFO -      rsp = 0x000000010d8026a0   rbp = 0x000000011470f500
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   4  XUL!mozilla::layers::ContentHostIncremental::Lock() [ContentHost.h:b7b394f42062 : 286 + 0x4]
22:10:17     INFO -      rbx = 0x00000001191217e0   r12 = 0x0000000000000000
22:10:17     INFO -      r13 = 0x00000001191217e0   r14 = 0x000000010d80296f
22:10:17     INFO -      r15 = 0x000000010d802988   rip = 0x0000000101a4fd19
22:10:17     INFO -      rsp = 0x000000010d8026e0   rbp = 0x0000000101a4f9a0
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   5  XUL!mozilla::layers::ContentHostBase::Composite(mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4 const&, mozilla::gfx::Filter const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, nsIntRegion const*, mozilla::layers::TiledLayerProperties*) [ContentHost.cpp:b7b394f42062 : 43 + 0x8]
22:10:17     INFO -      rbx = 0x00000001191217e0   r12 = 0x0000000000000000
22:10:17     INFO -      r13 = 0x00000001191217e0   r14 = 0x000000010d80296f
22:10:17     INFO -      r15 = 0x000000010d802988   rip = 0x0000000101a3c4a6
22:10:17     INFO -      rsp = 0x000000010d8026f0   rbp = 0x0000000101a4f9a0
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   6  XUL!mozilla::layers::ThebesLayerComposite::RenderLayer(nsIntRect const&) [ThebesLayerComposite.cpp:b7b394f42062 : 149 + 0x31]
22:10:17     INFO -      rbx = 0x0000000101a3c460   r12 = 0x0000000000000000
22:10:17     INFO -      r13 = 0x00000001191217e0   r14 = 0x000000011d7c0800
22:10:17     INFO -      r15 = 0x000000010d802988   rip = 0x0000000101a46043
22:10:17     INFO -      rsp = 0x000000010d802950   rbp = 0x0000000101a4f9a0
22:10:17     INFO -      Found by: call frame info
22:10:17     INFO -   7  XUL!void mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, nsIntRect const&) [ContainerLayerComposite.cpp:b7b394f42062 : 392 + 0x10]
22:10:17     INFO -      rbx = 0x0000000000000000   r12 = 0x0000000000000140
22:10:17     INFO -      r13 = 0x0000000000000000   r14 = 0x000000010cfef500
22:10:17     INFO -      r15 = 0x000000011d7c09f8   rip = 0x0000000101a4bd9d
22:10:17     INFO -      rsp = 0x000000010d802a30   rbp = 0x0000000000000001
22:10:17     INFO -      Found by: call frame info
}
All of these failures seem to be in test_keyboard.py.

Bug 1023730 touches lots of keyboard stuff. Any ideas, Tim?
Depends on: 1023730
Flags: needinfo?(timdream)
We need a fix or backout on this ASAP.
Flags: needinfo?(nical.bugzilla)
I'm on PTO for ~two weeks. From a quick glance at the stack and code, it looks like the crash can be avoided with a null check in TextureImageTextureSourceOGL::CopyTo, although it would be worth understanding why mTexImage appears to be null.
Matt knows this code.
Flags: needinfo?(nical.bugzilla) → needinfo?(matt.woodrow)
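
For illustration only, the null check suggested above could look roughly like the sketch below. The types `Rect`, `TextureImage`, `BlitHelper`, and `TextureSource` are simplified, hypothetical stand-ins for the Gecko classes named in the stack, not the real signatures; only the member name `mTexImage` comes from the comment above. The guard avoids the null dereference but does not explain why the image is missing, and skipping the blit may leave the destination texture with stale or empty content.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the Gecko types in the crash stack.
// The real classes carry far more state and have different signatures.
struct Rect { int x, y, width, height; };
struct TextureImage { int id; };

struct BlitHelper {
    int blits = 0;
    // Stand-in for frame 0 of the stack: dereferences both images,
    // so a null pointer here would crash at address 0x0.
    void BlitTextureImage(TextureImage* src, const Rect&,
                          TextureImage* dst, const Rect&) {
        dst->id = src->id;
        ++blits;
    }
};

struct TextureSource {
    TextureImage* mTexImage = nullptr;  // null until an image is attached

    // Returns false and skips the blit when either side has no image,
    // instead of passing a null pointer down to the helper.
    bool CopyTo(BlitHelper& helper, const Rect& srcRect,
                TextureSource* dest, const Rect& destRect) {
        if (!dest || !mTexImage || !dest->mTexImage) {
            std::fprintf(stderr, "CopyTo: missing texture image, skipping blit\n");
            return false;
        }
        helper.BlitTextureImage(mTexImage, srcRect, dest->mTexImage, destRect);
        return true;
    }
};
```

A caller in this sketch would treat a `false` return as "nothing was copied" and decide whether to repaint or leave the region untouched.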
The first instance of this that I see on b2g-inbound is with the b2g bumper bot's push that included bug 1023730:
https://tbpl.mozilla.org/?rev=b7b394f42062&tree=B2g-Inbound&jobname=b2g_macosx64%20b2g-inbound%20opt%20test%20gaia-ui-test

I'm retriggering a bunch of jobs on that push and the one before it to make sure, but I'm really tempted to just revert bug 1023730 to resolve this intermittent failure until the actual cause in that patch can be fixed.
(In reply to Wes Kocher (:KWierso) from comment #137)
> Reverted bug 1023730 in
> https://github.com/mozilla-b2g/gaia/commit/
> 32ffcf9b8e01b350ba26673e6cbd1f61572b3265 to see if it makes these failures
> go away.

It doesn't, so please reland the patch.

(In reply to Wes Kocher (:KWierso) from comment #68)
> All of these failures seem to be in test_keyboard.py.
> 
> Bug 1023730 touches lots of keyboard stuff. Any ideas, Tim?

Even if the patch did cause the issue, I am not technically equipped to help with a graphics bug that crashes because of timing changes in web content.
Flags: needinfo?(timdream)
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #143)
> It doesn't, so please reland the patch.

TBPL is only flagging trees that the revert commit has not reached yet, so my conclusion here was premature and we need to wait.
The failures seem to have stopped since that revert got merged around, though the trees have been closed a lot for various reasons, so we might have just not had anything running to catch the failure since then. I'm retriggering a bunch of Gu runs on a recent mozilla-central build to see if we get any more failures.

The retriggered jobs should show up here: https://tbpl.mozilla.org/?rev=789f505eaab7&jobname=b2g_macosx64%20mozilla-central%20opt%20test%20gaia-ui-test

Assuming the crashes continue to not happen with these retriggered jobs, any idea why that gaia change would be causing this crash?
(In reply to Wes Kocher (:KWierso) from comment #165)
> The failures seem to have stopped since that revert got merged around,
> though the trees have been closed a lot for various reasons, so we might
> have just not had anything running to catch the failure since then. I'm
> retriggering a bunch of Gu runs on a recent mozilla-central build to see if
> we get any more failures.
> 
> The retriggered jobs should show up here:
> https://tbpl.mozilla.org/?rev=789f505eaab7&jobname=b2g_macosx64%20mozilla-
> central%20opt%20test%20gaia-ui-test

This is great - thank you for helping out with this Wes :-)

There are 84 Gu jobs on the push linked above, with no crashes.
CJ, is anyone in Taipei available to help out with this? This blocks keyboard app feature work because the patch in the keyboard app somehow crashes Gecko, but all keyboard app work depends on that patch. Thanks.
Flags: needinfo?(cku)
Peter, do you or Chiajung have a chance to look into this bug?
Flags: needinfo?(cku) → needinfo?(pchang)
FYI, please consider this as feature-b2g: 2.1 precedence, so work on blockers first if there are any. Thanks.
I don't have any ideas about why this would happen.

I can see two possible failure spots that would have this effect, but neither seems particularly likely:

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/TextureHostOGL.cpp#261
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.cpp#929

As nical said, a null check at the crash site would work, but it might give broken rendering instead.
Flags: needinfo?(matt.woodrow)
(In reply to C.J. Ku[:cjku] from comment #168)
> Peter, do you or Chiajung have a chance to look into this bug?

Chiajung, please help to check.
Flags: needinfo?(pchang) → needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #171)
> (In reply to C.J. Ku[:cjku] from comment #168)
> > Peter, do you or Chiajung have a chance to look into this bug?
> 
> Chiajung, please help to check.

I tried to set up the same environment on my side but couldn't reproduce the crash.
I will try to add some logging to identify the problem.
I tried this before and could not run the test at all, because my b2g-desktop build on Mac does not show the 'Next' button of the FTU, and I cannot disable it.

And from the code, I cannot tell what is wrong here. A very simple solution would be to add some null checks there, but I think the real question is why they are null.
Flags: needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #175)
> I tried to set up the same environment on my side but couldn't reproduce
> the crash.
> I will try to add some logging to identify the problem.

Can we try to get these logs landed please?
Flags: needinfo?(pchang)
I tried to add some logging and pushed to the try server, but the log output does not show up in the report. (Not even when the checks are MOZ_RELEASE_ASSERT, though that does make the crash point move :) )

From the report, I tried adding a null check on aSrc, which seems to work around the problem (could not reproduce in 50 tries).

By the way, the bug is very strange:
The object relationship looks like: ContentHostIncremental->TextureImageTextureSourceOGL->TextureImage
Here the TextureImage is null but the TextureImageTextureSourceOGL is not (I will try to check this later). That means the ContentHost had received a TextureCreation before but no TextureUpdate, and then received a TextureCreation with a COPY request. This is a Client/Host out-of-sync condition, which seems impossible to me.
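
As a rough illustration of the sequence described above, the following toy model shows how a creation followed directly by a creation-with-copy can leave the source without an image. Everything here is a hypothetical simplification: `ContentHost`, `TextureSource`, `OnTextureCreation`, and `OnTextureUpdate` are invented for the sketch, and the assumption that the image is only allocated on the first update is a guess about the behaviour, not something confirmed from the Gecko source.

```cpp
#include <memory>
#include <utility>

// Toy model of the host side; names and behaviour are assumptions.
struct TextureImage {};

struct TextureSource {
    // Assumed to stay null until the first update arrives.
    std::unique_ptr<TextureImage> image;
};

struct ContentHost {
    std::unique_ptr<TextureSource> source;

    // Handles a creation message. With copyFromPrevious set, the new source
    // wants the pixels of the old one. Returns false when a copy was
    // requested but the old source never received an update.
    bool OnTextureCreation(bool copyFromPrevious) {
        std::unique_ptr<TextureSource> next(new TextureSource());
        bool ok = true;
        if (copyFromPrevious && source) {
            if (source->image) {
                next->image.reset(new TextureImage(*source->image));
            } else {
                ok = false;  // the out-of-sync case: nothing to copy from
            }
        }
        source = std::move(next);
        return ok;
    }

    // Handles an update message by allocating the image if needed.
    void OnTextureUpdate() {
        if (source && !source->image) {
            source->image.reset(new TextureImage());
        }
    }
};
```

In this model, the sequence creation, then creation with copy, reports the failure, while creation, update, then creation with copy succeeds. Whether the real client can ever send the first sequence is exactly the open question in this bug.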
How about locally on OS X? Do you get the log output there?
Flags: needinfo?(pchang) → needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #184)
> How about locally on OS X? Do you get the log output there?

No. I added a log statement as the first line of the crashing function:
printf_stderr("Enter BlitTextureImage");

Then I ran the keyboard test.

But no log output appeared in the console.
Flags: needinfo?(chung)
(In reply to Chiajung Hung [:chiajung] from comment #185)
> (In reply to peter chang[:pchang][:peter] from comment #184)
> > How about the local OSX? Does it have log?
> 
> No. I added a log statement as the first line of the crashing function:
> printf_stderr("Enter BlitTextureImage");
> 
> Then I ran the keyboard test.
> 
> But no log output appeared in the console.

It seems we can write logs to a file, but I don't know how to get the file from the try server. (I asked QA and others, but no one knows.)
I cannot retrigger this today...
Let's wait to see whether it occurs again.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME