Closed Bug 1027604 Opened 11 years ago Closed 10 years ago

Intermittent base.py | application crashed [@ mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&)]

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

RESOLVED WORKSFORME

People

(Reporter: emorley, Unassigned)

Details

(Keywords: crash, intermittent-failure)

Crash Data

b2g_macosx64 b2g-inbound opt test gaia-ui-test on 2014-06-18 21:56:32 PDT for push b7b394f42062
slave: talos-mtnlion-r5-006
https://tbpl.mozilla.org/php/getParsedLog.php?id=42022963&tree=B2g-Inbound
{
22:09:45 INFO - TEST-START test_keyboard.py
22:10:02 INFO - test_keyboard_basic (test_keyboard.TestKeyboard) ... mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-macosx64_gecko/1403151453/en-US/b2g-33.0a1.en-US.mac64.crashreporter-symbols.zip
22:10:17 ERROR - PROCESS-CRASH | base.py | application crashed [@ mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&)]
22:10:17 INFO - Crash dump filename: /var/folders/sh/x3_cwy_x189_cj3v8yxlp9m000000w/T/tmp9GM0ev/minidumps/500A751D-CAF5-434B-AC21-1E2E48B96EF7.dmp
22:10:17 INFO - Operating system: Mac OS X
22:10:17 INFO - 10.8.0 12A269
22:10:17 INFO - CPU: amd64
22:10:17 INFO - family 6 model 42 stepping 7
22:10:17 INFO - 8 CPUs
22:10:17 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
22:10:17 INFO - Crash address: 0x0
22:10:17 INFO - Thread 20 (crashed)
22:10:17 INFO - 0 XUL!mozilla::gl::GLBlitTextureImageHelper::BlitTextureImage(mozilla::gl::TextureImage*, nsIntRect const&, mozilla::gl::TextureImage*, nsIntRect const&) [GLBlitTextureImageHelper.cpp:b7b394f42062 : 76 + 0x0]
22:10:17 INFO - rbx = 0x0000000114605ec0 r12 = 0x0000000000000000
22:10:17 INFO - r13 = 0x0000000000000032 r14 = 0x00000000000000a9
22:10:17 INFO - r15 = 0x0000000000000000 rip = 0x000000010197ce71
22:10:17 INFO - rsp = 0x000000010d8022a0 rbp = 0x0000000000000140
22:10:17 INFO - Found by: given as instruction pointer in context
22:10:17 INFO - 1 XUL!mozilla::layers::TextureImageTextureSourceOGL::CopyTo(nsIntRect const&, mozilla::layers::DataTextureSource*, nsIntRect const&) [TextureHostOGL.cpp:b7b394f42062 : 360 + 0xd]
22:10:17 INFO - rbx = 0x0000000109427d00 r12 = 0x0000000109427b80
22:10:17 INFO - r13 = 0x0000000000000000 r14 = 0x000000010d802628
22:10:17 INFO - r15 = 0x000000010d802638 rip = 0x0000000101a6e857
22:10:17 INFO - rsp = 0x000000010d802550 rbp = 0x00000001191217e0
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 2 XUL!mozilla::layers::ContentHostIncremental::TextureCreationRequest::Execute(mozilla::layers::ContentHostIncremental*) [ContentHost.cpp:b7b394f42062 : 570 + 0x7]
22:10:17 INFO - rbx = 0x0000000109427d00 r12 = 0x0000000000000140
22:10:17 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000000000
22:10:17 INFO - r15 = 0x0000000000000000 rip = 0x0000000101a3e169
22:10:17 INFO - rsp = 0x000000010d802580 rbp = 0x00000001191217e0
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 3 XUL!mozilla::layers::ContentHostIncremental::ProcessTextureUpdates() [ContentHost.cpp:b7b394f42062 : 454 + 0x8]
22:10:17 INFO - rbx = 0x0000000000000003 r12 = 0x0000000000000005
22:10:17 INFO - r13 = 0x00000001191217e0 r14 = 0x0000000119121850
22:10:17 INFO - r15 = 0x00000001191217e0 rip = 0x0000000101a3daf0
22:10:17 INFO - rsp = 0x000000010d8026a0 rbp = 0x000000011470f500
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 4 XUL!mozilla::layers::ContentHostIncremental::Lock() [ContentHost.h:b7b394f42062 : 286 + 0x4]
22:10:17 INFO - rbx = 0x00000001191217e0 r12 = 0x0000000000000000
22:10:17 INFO - r13 = 0x00000001191217e0 r14 = 0x000000010d80296f
22:10:17 INFO - r15 = 0x000000010d802988 rip = 0x0000000101a4fd19
22:10:17 INFO - rsp = 0x000000010d8026e0 rbp = 0x0000000101a4f9a0
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 5 XUL!mozilla::layers::ContentHostBase::Composite(mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4 const&, mozilla::gfx::Filter const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, nsIntRegion const*, mozilla::layers::TiledLayerProperties*) [ContentHost.cpp:b7b394f42062 : 43 + 0x8]
22:10:17 INFO - rbx = 0x00000001191217e0 r12 = 0x0000000000000000
22:10:17 INFO - r13 = 0x00000001191217e0 r14 = 0x000000010d80296f
22:10:17 INFO - r15 = 0x000000010d802988 rip = 0x0000000101a3c4a6
22:10:17 INFO - rsp = 0x000000010d8026f0 rbp = 0x0000000101a4f9a0
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 6 XUL!mozilla::layers::ThebesLayerComposite::RenderLayer(nsIntRect const&) [ThebesLayerComposite.cpp:b7b394f42062 : 149 + 0x31]
22:10:17 INFO - rbx = 0x0000000101a3c460 r12 = 0x0000000000000000
22:10:17 INFO - r13 = 0x00000001191217e0 r14 = 0x000000011d7c0800
22:10:17 INFO - r15 = 0x000000010d802988 rip = 0x0000000101a46043
22:10:17 INFO - rsp = 0x000000010d802950 rbp = 0x0000000101a4f9a0
22:10:17 INFO - Found by: call frame info
22:10:17 INFO - 7 XUL!void mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, nsIntRect const&) [ContainerLayerComposite.cpp:b7b394f42062 : 392 + 0x10]
22:10:17 INFO - rbx = 0x0000000000000000 r12 = 0x0000000000000140
22:10:17 INFO - r13 = 0x0000000000000000 r14 = 0x000000010cfef500
22:10:17 INFO - r15 = 0x000000011d7c09f8 rip = 0x0000000101a4bd9d
22:10:17 INFO - rsp = 0x000000010d802a30 rbp = 0x0000000000000001
22:10:17 INFO - Found by: call frame info
}
All of these failures seem to be in test_keyboard.py. Bug 1023730 touches lots of keyboard stuff. Any ideas, Tim?
Depends on: 1023730
Flags: needinfo?(timdream)
We need a fix or backout on this ASAP.
Flags: needinfo?(nical.bugzilla)
I'm on PTO for ~two weeks. From a quick glance at the stack and code, it looks like the crash can be avoided with a null check in TextureImageTextureSourceOGL::CopyTo, although it'd be worth understanding why mTexImage appears to be null. Matt knows this code.
Flags: needinfo?(nical.bugzilla) → needinfo?(matt.woodrow)
The first instance of this that I see on b2g-inbound is with the b2g bumper bot's push that included bug 1023730: https://tbpl.mozilla.org/?rev=b7b394f42062&tree=B2g-Inbound&jobname=b2g_macosx64%20b2g-inbound%20opt%20test%20gaia-ui-test I'm retriggering a bunch of jobs on that push and the one before it to make sure, but I'm really tempted to just revert bug 1023730 to resolve this intermittent failure until the actual cause in that patch can be fixed.
(In reply to Wes Kocher (:KWierso) from comment #137) > Reverted bug 1023730 in > https://github.com/mozilla-b2g/gaia/commit/ > 32ffcf9b8e01b350ba26673e6cbd1f61572b3265 to see if it makes these failures > go away. It doesn't, so please reland the patch. (In reply to Wes Kocher (:KWierso) from comment #68) > All of these failures seem to be in test_keyboard.py. > > Bug 1023730 touches lots of keyboard stuff. Any ideas, Tim? Even if the patch did cause the issue, I am not technically equipped to help with a graphics bug when it crashes because of timing changes in web content.
Flags: needinfo?(timdream)
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #143) > It doesn't, so please reland the patch. TBPL is only flagging trees that the revert commit has not reached yet, so my conclusion here was premature and we need to wait.
The failures seem to have stopped since that revert got merged around, though the trees have been closed a lot for various reasons, so we might have just not had anything running to catch the failure since then. I'm retriggering a bunch of Gu runs on a recent mozilla-central build to see if we get any more failures. The retriggered jobs should show up here: https://tbpl.mozilla.org/?rev=789f505eaab7&jobname=b2g_macosx64%20mozilla-central%20opt%20test%20gaia-ui-test Assuming the crashes continue to not happen with these retriggered jobs, any idea why that gaia change would be causing this crash?
(In reply to Wes Kocher (:KWierso) from comment #165) > The failures seem to have stopped since that revert got merged around, > though the trees have been closed a lot for various reasons, so we might > have just not had anything running to catch the failure since then. I'm > retriggering a bunch of Gu runs on a recent mozilla-central build to see if > we get any more failures. > > The retriggered jobs should show up here: > https://tbpl.mozilla.org/?rev=789f505eaab7&jobname=b2g_macosx64%20mozilla- > central%20opt%20test%20gaia-ui-test This is great - thank you for helping out with this Wes :-) There are 84 Gu jobs on the push linked above, with no crashes.
CJ, is anyone in Taipei available to help out with this? This blocks keyboard app feature work because the patch in the keyboard app somehow crashes Gecko, but all keyboard app work depends on that patch. Thanks.
Flags: needinfo?(cku)
Peter, do you or Chiajung have a chance to look into this bug?
Flags: needinfo?(cku) → needinfo?(pchang)
FYI, please consider this as feature-b2g: 2.1 precedence. So work on blockers first if there are any. Thanks.
I don't have any ideas about why this would happen. I can see two possible failure spots that would have this effect, but neither seems particularly likely: http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/TextureHostOGL.cpp#261 http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.cpp#929 As nical said, a null check at the crash site would work, but it might give broken rendering instead.
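To make the trade-off concrete, here is a minimal sketch of what a guarded copy could look like. It is not the Gecko code: ToyRect, ToyTextureImage, ToyTextureSource and GuardedCopyTo are hypothetical stand-ins invented for illustration. The point is only that the guard turns a null dereference into a reported failure, and the caller still has to decide what to draw when the copy did not happen.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration; these are NOT the real Gecko types.
struct ToyRect {
  int x, y, width, height;
  bool IsEmpty() const { return width <= 0 || height <= 0; }
};

struct ToyTextureImage {
  int blitCount = 0;                // number of blits that targeted this image
  ToyRect lastDstRect{0, 0, 0, 0};  // destination rect of the most recent blit
};

struct ToyTextureSource {
  std::shared_ptr<ToyTextureImage> texImage;  // may be null, as in the crash
};

// Returns false instead of dereferencing a null image. A false result means
// nothing was copied, so the destination keeps whatever content it had.
bool GuardedCopyTo(const ToyTextureSource& src, const ToyRect& srcRect,
                   ToyTextureSource& dst, const ToyRect& dstRect) {
  if (!src.texImage || !dst.texImage) {
    return false;  // nothing to blit from or to
  }
  if (srcRect.IsEmpty() || dstRect.IsEmpty()) {
    return false;  // degenerate rect, nothing to do
  }
  dst.texImage->blitCount++;
  dst.texImage->lastDstRect = dstRect;
  return true;
}

// Example use: a source that never received its image fails cleanly.
inline void ExampleUsage() {
  ToyTextureSource empty;
  ToyTextureSource target;
  target.texImage = std::make_shared<ToyTextureImage>();
  const ToyRect rect{0, 0, 16, 16};
  assert(!GuardedCopyTo(empty, rect, target, rect));
}
```

Returning a bool rather than silently skipping the blit keeps the "broken rendering" case visible to the caller, which is the concern raised above.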
Flags: needinfo?(matt.woodrow)
(In reply to C.J. Ku[:cjku] from comment #168) > Peter, do you or Chiajung have a chance to look into this bug? Chiajung, please help to check.
Flags: needinfo?(pchang) → needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #171) > (In reply to C.J. Ku[:cjku] from comment #168) > > Peter, do you or Chiajung have a chance to look into this bug? > > Chiajung, please help to check. I tried to set up the same environment on my side but I couldn't reproduce the crash. Will try to add some logs to identify the problem.
I tried this before and could not run the test at all, because my b2g-desktop build on Mac does not show the 'Next' button of the FTU and I cannot disable it. From the code, I cannot tell what is wrong here. A very simple solution would be to add some null checks there, but I think the real question is why they are null.
Flags: needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #175) > I tried to set up the same environment on my side but I couldn't reproduce > the crash. > Will try to add some logs to identify the problem. Can we try to get these logs landed please?
Flags: needinfo?(pchang)
I tried to add some logging and pushed it to tryserver, but the log output does not show up in the report. (Not even when the checks are MOZ_RELEASE_ASSERT, though that does make the crash point move :) ) Going by the report, I tried a null check on aSrc, which seems to work around the problem (I could not reproduce the crash in 50 tries). By the way, the bug is very strange. The object relationship looks like: ContentHostIncremental->TextureImageTextureSourceOGL->TextureImage, where the TextureImage is null but the TextureImageTextureSourceOGL is not (I will try to check this later). That means the ContentHost had received a TextureCreation before but no TextureUpdate, and then received a TextureCreation with a COPY request. That is a Client/Host out-of-sync condition, which seems impossible to me.
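The sequence described above can be sketched as a toy model. Everything here is an assumption made for illustration: ToyContentHost, ToySource, ToyImage and the message handlers are hypothetical names, and the real ContentHostIncremental logic is more involved. The sketch (C++14 or later) only shows how "creation, no update, then creation with copy" leaves a non-null source whose image is still null at the moment the copy is requested.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy model only; names and message semantics are simplified assumptions.
struct ToyImage { int pixels = 0; };
struct ToySource { std::unique_ptr<ToyImage> image; };  // image set on update

enum class CopyResult { NotRequested, Copied, SkippedNullImage };

class ToyContentHost {
 public:
  // "Creation" message: allocates a fresh source and, if asked, copies the
  // previous source's image into it. The previous image may not exist yet.
  CopyResult OnTextureCreation(bool copyFromPrevious) {
    auto next = std::make_unique<ToySource>();
    CopyResult result = CopyResult::NotRequested;
    if (copyFromPrevious && source_) {
      if (source_->image) {
        next->image = std::make_unique<ToyImage>(*source_->image);
        result = CopyResult::Copied;
      } else {
        // Source exists but never received an update: the out-of-sync case.
        result = CopyResult::SkippedNullImage;
      }
    }
    source_ = std::move(next);
    return result;
  }

  // "Update" message: the first update is what creates the image.
  void OnTextureUpdate(int pixels) {
    assert(pixels >= 0);
    if (!source_) {
      return;  // update without a prior creation is ignored in this model
    }
    if (!source_->image) {
      source_->image = std::make_unique<ToyImage>();
    }
    source_->image->pixels = pixels;
  }

  bool HasImage() const { return source_ && source_->image; }

 private:
  std::unique_ptr<ToySource> source_;
};
```

In this model, calling OnTextureCreation(false) followed directly by OnTextureCreation(true) yields SkippedNullImage, while inserting an OnTextureUpdate call between them yields Copied.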
How about the local OSX? Does it have log?
Flags: needinfo?(pchang) → needinfo?(chung)
(In reply to peter chang[:pchang][:peter] from comment #184) > How about the local OSX? Does it have log? No, I added a log statement as the first line of the crashing function, printf_stderr("Enter BlitTextureImage");, and ran the keyboard test, but no log output was visible in the console.
Flags: needinfo?(chung)
(In reply to Chiajung Hung [:chiajung] from comment #185) > (In reply to peter chang[:pchang][:peter] from comment #184) > > How about the local OSX? Does it have log? > > No, I add log into the first line of the crashing function by > printf_stderr("Enter BlitTextureImage"); > > And run the keyboard test. > > But no log observable in the console. It seems we can write logs into a file, but I don't know how to get the file from tryserver. (I asked QA and others, but no one knows).
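As a rough sketch of the write-to-a-file idea, the helper below echoes a message to stderr and also appends it to a file, flushing both so the line survives a crash that follows shortly after. LogLine and the file path are hypothetical, this is plain standard C++ rather than a Gecko logging API, and it does not address how to retrieve the file from tryserver.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper, not a Gecko API: write one line to stderr and append
// the same line to a log file. Returns false if the file could not be written.
bool LogLine(const std::string& path, const std::string& message) {
  assert(!path.empty());

  // Echo to stderr and flush immediately so the line is not lost in a buffer.
  std::fprintf(stderr, "%s\n", message.c_str());
  std::fflush(stderr);

  // Append to the file; each call opens, writes, flushes and closes.
  std::ofstream out(path, std::ios::app);
  if (!out) {
    return false;
  }
  out << message << '\n';
  out.flush();
  return static_cast<bool>(out);
}
```

Opening and closing the file on every call is slow, but for a one-off diagnostic it avoids relying on buffered output being written before the process dies.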
I can not re-trigger this today... Let's wait to see whether it occurs again.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME