Closed
Bug 896054
Opened 11 years ago
Closed 11 years ago
Intermittent layout/reftests/bugs/579323-1.html | assertion count 1 is more than expected 0 assertions (Texture initialization failed! -- error 0x506, Source 0, Source format 0, RGBA Compat 1: 'Error')
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
FIXED
mozilla27
Tracking | Status | |
---|---|---|
firefox26 | --- | wontfix |
firefox27 | --- | fixed |
firefox-esr24 | --- | unaffected |
b2g-v1.2 | --- | wontfix |
People
(Reporter: RyanVM, Assigned: mstange)
References
Details
(Keywords: assertion, intermittent-failure)
https://tbpl.mozilla.org/php/getParsedLog.php?id=25503005&tree=Mozilla-Central Rev4 MacOSX Lion 10.7 mozilla-central debug test reftest on 2013-07-19 12:46:51 PDT for push b7fbc6770e37 slave: talos-r4-lion-049 13:03:55 INFO - REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1.html 13:03:55 INFO - REFTEST TEST-LOAD | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1.html | 3183 / 9443 (33%) 13:03:55 INFO - ++DOMWINDOW == 101 (0x11be0af58) [serial = 8725] [outer = 0x10c631058] 13:03:55 INFO - ###!!! ASSERTION: Texture initialization failed! -- error 0x506, Source 0, Source format 0, RGBA Compat 1: 'Error', file ../../../gfx/layers/opengl/CompositorOGL.cpp, line 904 13:03:55 INFO - mozilla::layers::CompositorOGL::CreateRenderTargetFromSource(mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::CompositingRenderTarget const*) [gfx/layers/opengl/CompositorOGL.cpp:697] 13:03:55 INFO - void mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, nsIntPoint const&, mozilla::layers::LayerManagerComposite*, nsIntRect const&) [obj-firefox/dist/include/mozilla/RefPtr.h:230] 13:03:55 INFO - void mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, nsIntPoint const&, mozilla::layers::LayerManagerComposite*, nsIntRect const&) [gfx/layers/composite/ContainerLayerComposite.cpp:123] 13:03:55 INFO - void mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, nsIntPoint const&, mozilla::layers::LayerManagerComposite*, nsIntRect const&) [gfx/layers/composite/ContainerLayerComposite.cpp:123] 13:03:55 INFO - mozilla::layers::LayerManagerComposite::Render() [obj-firefox/dist/include/mozilla/RefPtr.h:181] 13:03:55 INFO - mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) [gfx/layers/composite/LayerManagerComposite.cpp:197] 13:03:55 INFO - mozilla::layers::LayerManagerComposite::EndEmptyTransaction(mozilla::layers::LayerManager::EndTransactionFlags) [gfx/layers/composite/LayerManagerComposite.cpp:159] 13:03:55 INFO - mozilla::layers::CompositorParent::Composite() [obj-firefox/dist/include/mozilla/layers/AsyncCompositionManager.h:196] 13:03:55 INFO - MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) [ipc/chromium/src/base/message_loop.cc:337] 13:03:55 INFO - MessageLoop::DoWork() [ipc/chromium/src/base/message_loop.cc:445] 13:03:55 INFO - base::MessagePumpDefault::Run(base::MessagePump::Delegate*) [ipc/chromium/src/base/message_pump_default.cc:23] 13:03:55 INFO - MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:506] 13:03:55 INFO - base::Thread::ThreadMain() [ipc/chromium/src/base/thread.cc:163] 13:03:55 INFO - ThreadFunc [ipc/chromium/src/base/platform_thread_posix.cc:39] 13:03:55 INFO - libsystem_c.dylib + 0x4e8bf 13:03:55 INFO - REFTEST TEST-LOAD | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1-ref.html | 3183 / 9443 (33%) 13:03:55 INFO - ++DOMWINDOW == 102 (0x11be01178) [serial = 8726] [outer = 0x10c631058] 13:03:55 INFO - REFTEST TEST-PASS | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1.html | image comparison (==) 13:03:55 INFO - REFTEST INFO | Loading a blank page 13:03:55 INFO - ++DOMWINDOW == 103 (0x11be44c28) [serial = 8727] [outer = 0x10c631058] 13:03:55 INFO - REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1.html | assertion count 1 is more than expected 0 assertions 13:03:55 INFO - REFTEST TEST-END | file:///Users/cltbld/talos-slave/test/build/tests/reftest/tests/layout/reftests/bugs/579323-1.html
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 4•11 years ago
|
||
The patches in bug 887868 and bug 897655 both seem to tickle things just the right way to make this perma-orange on OS X 10.7 debug builds (in various tests). See also bug 898713.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 6•11 years ago
|
||
I can reproduce this assertion firing locally on my 10.7 Mac pretty easily, but apparently it doesn't show up on 10.8. Does anyone with a better idea of how to debug this have a 10.7 build environment handy?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 10•11 years ago
|
||
Once this is fixed, the workaround for this added in bug JS::CanCompileOffThread should be removed.
Whiteboard: Remove JS::CanCompileOffThread workaround when fixing
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 12•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=26754771&tree=Mozilla-Inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 14•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=26830592&tree=Fx-Team
Comment 15•11 years ago
|
||
Milan: this graphics issue is holding up the landing of a bunch of unrelated front-end patches (bug 887868/bug 896688/bug 785487) and caused development trouble for another unrelated gecko change (bug 897655). Can someone on your team investigate, or help me investigate (since I can reproduce locally)?
Flags: needinfo?(milan)
Reporter | ||
Comment 16•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=27018774&tree=Mozilla-Inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 24•11 years ago
|
||
See bug 859339 comment 29 :-)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 27•11 years ago
|
||
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #15) > Milan: this graphics issue is holding up the landing of a bunch of unrelated > front-end patches (bug 887868/bug 896688/bug 785487) and caused development > trouble for another unrelated gecko change (bug 897655). Can someone on your > team investigate, or help me investigate (since I can reproduce locally)? Gavin - do you have Intel graphics HD3000 on your 10.7 machine that reproduces the problem? If you have dual graphics and force the discrete one to be used (using something like gfxCardStatus), does the error persist?
Flags: needinfo?(gavin.sharp)
Comment 28•11 years ago
|
||
I'm going to verify it again, but it appears the page resulting from running this test is different than about a week ago.
Comment 29•11 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #28) > I'm going to verify it again, but it appears the page resulting from running > this test is different than about a week ago. Never mind. I had the layer borders turned on. Probably a bug that gives us different result though.
Comment 30•11 years ago
|
||
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #6) > I can reproduce this assertion firing locally on my 10.7 Mac pretty easily, > but apparently it doesn't show up on 10.8. Does anyone with a better idea of > how to debug this have a 10.7 build environment handy? You get this by running reftest locally on a debug build, or do you get it just by loading the page from the test?
Comment 31•11 years ago
|
||
I was getting it just starting up the build and loading a page or two, it didn't seem to be triggered by a particular page. But I now can't reproduce, so I'll need to try again. It is still easily reproducable on try, though: https://tbpl.mozilla.org/?tree=Try&rev=da490ac9a802
Flags: needinfo?(gavin.sharp)
Comment 32•11 years ago
|
||
I pushed a quick debug run to try just to get more info, let's see what it looks like. In the meantime, if the data above is to be trusted, either we don't have the layer manager created at the time this test hits this spot, or we end up somehow with the layer manager that is not OpenGL - this based on the format reported as GL_NONE, rather than expected GL_RGBA.
Flags: needinfo?(milan)
Comment 33•11 years ago
|
||
I'm thinking the layer manager isn't there, because calling GetLayerManager() before the offending call may have gotten rid of the problem. I'll retrigger the try.
Comment 34•11 years ago
|
||
Of course, the small debugging stuff I put in has changed the timing, so the tries runs are passing. See https://tbpl.mozilla.org/?tree=Try&rev=3511a5a5a554 for example.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 36•11 years ago
|
||
Benoit or Bas, can you interpret these findings? This could be a red herring, but I thought I'd share the info I got: * So far, this is a problem on OS X 10.7 alone * In CompositorOGL::CreateFBOWithTexture we call GetFrameBufferInternalFormat. ** At this point, aFrameBuffer (index) is zero, so we ask the widget itself for GetGLFrameBufferFormat * The widget is nsBaseWidget ** It has a valid layer manager ** The layer manager type is LAYER_CLIENT ** nsBaseWidget::GetGLFrameBufferFormat thus returns LOCAL_GL_NONE as the format * Back in CompositorOGL, we actually ignore the returned format value. We assume we can call CopyTexImage2D with LOCAL_GL_RGBA format all the time, except that we check if IsGLES2() is set to true - that's when we actually use that format value returned by the function above. So, is the result above as expected? We expect to have LAYER_CLIENT, we expect to return LOCAL_GL_NONE as the format, and we expect to be able to call CopyTexImage2D with LOCAL_GL_RGBA and not use the slow path?
Flags: needinfo?(bgirard)
Flags: needinfo?(bas)
Comment 37•11 years ago
|
||
IMO GetGLFrameBufferFormat() should return the right thing so we should fix that. With OMTC you used to be able to peek at the backend on the remote side so perhaps the right thing to do here is to ask the layer manager the remote backend and correctly return LOCAL_GL_RGBA then.
Flags: needinfo?(bgirard)
Comment 38•11 years ago
|
||
That makes sense. By the way, the situation I describe in comment 36 is exactly the same in the 10.8 runs that do succeed, so I'm not sure exactly what the above says.
Comment 39•11 years ago
|
||
I have nothing useful to add here beyond what Benoit already mentioned (I've never seriously touched the new compositorOGL code). I agree what it's currently doing seems a little hacky.
Flags: needinfo?(bas)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 42•11 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #36) > Benoit or Bas, can you interpret these findings? This could be a red > herring, but I thought I'd share the info I got: > * So far, this is a problem on OS X 10.7 alone > * In CompositorOGL::CreateFBOWithTexture we call > GetFrameBufferInternalFormat. > ** At this point, aFrameBuffer (index) is zero, so we ask the widget itself > for GetGLFrameBufferFormat > * The widget is nsBaseWidget > ** It has a valid layer manager > ** The layer manager type is LAYER_CLIENT > ** nsBaseWidget::GetGLFrameBufferFormat thus returns LOCAL_GL_NONE as the > format > * Back in CompositorOGL, we actually ignore the returned format value. We > assume we can call CopyTexImage2D with LOCAL_GL_RGBA format all the time, > except that we check if IsGLES2() is set to true - that's when we actually > use that format value returned by the function above. > > So, is the result above as expected? We expect to have LAYER_CLIENT, we > expect to return LOCAL_GL_NONE as the format, and we expect to be able to > call CopyTexImage2D with LOCAL_GL_RGBA and not use the slow path? These all seem expected, except for maybe it being 10.7 only. We're getting error 0x506, which is GL_INVALID_FRAMEBUFFER_OPERATION, from our call to glCopyTexImage2D. The docs for glCheckFramebufferStatus (http://www.khronos.org/opengles/sdk/docs/man/xhtml/glCheckFramebufferStatus.xml) say this: "This means that rendering commands (glClear, glDrawArrays, and glDrawElements) as well as commands that read the framebuffer (glReadPixels, glCopyTexImage2D, and glCopyTexSubImage2D) will generate the error GL_INVALID_FRAMEBUFFER_OPERATION if called while the framebuffer is not framebuffer complete". So it appears that the existing read framebuffer (which happens to be the default one, 0) is invalid. I think this code is actually incorrect, and the real bug lies in our cocoa/gl integration code, in widget/cocoa/nsChildView.mm (or possible gfx/gl/GLContextProviderCGL.mm). We have a bunch of code in there for this (one example: http://mxr.mozilla.org/mozilla-central/source/widget/cocoa/nsChildView.mm#2879). Someone probably needs to go through this and figure out what the right thing to do is, and add logging to figure out why it's not happening. One vague guess would be that we're opening a new window and trying to composite it before the native window is actually ready.
Comment 43•11 years ago
|
||
In bug 914437, I have a reliable way to reproduce the assertion on Mac OS X 10.8. It may or may not have the same underlying cause as this intermittent failure, but I hope it helps.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 47•11 years ago
|
||
This bug is blocking parallel reftests from landing (bug 813742). That change should reduce automation load by over 10% and make everyone's jobs finish faster. So, in case you were looking for a reason to prioritize this bug...
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 53•11 years ago
|
||
Markus, see comment 42, 43, 47 - any chance you can take a look at this and see if Matt's suggestion takes us somewhere?
Assignee: nobody → mstange
Flags: needinfo?(mstange)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 57•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28298106&tree=Mozilla-Inbound
Comment 58•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28327799&tree=Mozilla-Inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 63•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28578202&tree=Mozilla-Central
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 69•11 years ago
|
||
I used the reproduction steps in bug 914437 (thanks Jesse!) and found out that at least in that case, Matt's guess from comment 42 is correct: We do try to composite before the window is shown. I've put up a patch in bug 914437. I did a try run with a partial version of that patch plus bug 887868, and the mochitest-4 10.7 debug results look good: https://tbpl.mozilla.org/?tree=Try&rev=7bf665e9f02e (Ignore the 10.6 results, it's orange because I activated OMTC on 10.6 in order to check whether this fixes bug 890997).
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 72•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28723572&tree=Mozilla-Inbound
Comment 73•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28733126&full=1&branch=mozilla-inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 75•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=28807163&tree=Mozilla-Inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 80•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=29113855&tree=Mozilla-Central
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 83•11 years ago
|
||
This bug is blocking a BuildFaster:P1 bug that will significantly reduce infrastructure load in automation by making reftests significantly faster. What's the status? If it's expected to not progress for a while, can we consider disabling this test so we can move forward with automation improvements?
Flags: needinfo?(mstange)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 86•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=29386292&tree=Mozilla-Central
Assignee | ||
Comment 87•11 years ago
|
||
Sorry for the delay. I just landed https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=e197b100ba53 which should fix this. I was blocked on another intermittently-failing test that my patches made fail permanently, but it has now been disabled. (bug 924771)
Flags: needinfo?(mstange)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•11 years ago
|
Assignee | ||
Comment 91•11 years ago
|
||
Looks like bug 914437 fixed this.
Assignee | ||
Comment 92•11 years ago
|
||
I've filed bug 931747 for the JS::CanCompileOffThread workaround removal.
Assignee | ||
Updated•11 years ago
|
Whiteboard: Remove JS::CanCompileOffThread workaround when fixing
Reporter | ||
Updated•11 years ago
|
status-firefox26:
--- → affected
status-firefox27:
--- → fixed
status-firefox-esr24:
--- → unaffected
Target Milestone: --- → mozilla27
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 95•11 years ago
|
||
Added an expected assertion on OSX on beta. https://hg.mozilla.org/releases/mozilla-beta/rev/df4d9dc0b5de
Reporter | ||
Comment 96•11 years ago
|
||
https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/df4d9dc0b5de
Reporter | ||
Updated•11 years ago
|
status-b2g-v1.2:
--- → wontfix
You need to log in
before you can comment on or make changes to this bug.
Description
•