Closed Bug 763449 Opened 12 years ago Closed 7 years ago

Intermittent test_browserElement_oop_Close.html, test_browserElement_oop_BackForward.html | Test timed out, "ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: nsX11ErrorHandler.cpp, line 157"

Categories

(Core :: Graphics: Layers, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: emorley, Assigned: karlt)

References

Details

(Keywords: crash, intermittent-failure, qawanted, Whiteboard: [leave open])

Attachments

(3 files)

Rev3 Fedora 12 mozilla-inbound opt test mochitests-2/5 on 2012-06-10 22:22:19 PDT for push 61fd66629c4f slave: talos-r3-fed-070 https://tbpl.mozilla.org/php/getParsedLog.php?id=12543356&tree=Mozilla-Inbound { 821 INFO TEST-START | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html creating 1! [Child 2244] ###!!! ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157 [Child 2244] ###!!! ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157 822 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | Test timed out. XScreenSaver state: Disabled User input has been idle for 969 seconds args: ['/home/cltbld/talos-slave/test/build/bin/screentopng'] SCREENSHOT: <snip> 823 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | This test left crash dumps behind, but we weren't expecting it to! 824 INFO TEST-INFO | Found unexpected crash dump file /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.dmp. 825 INFO TEST-INFO | Found unexpected crash dump file /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.extra. 826 INFO TEST-END | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | finished in 305522ms } ... { 159428 INFO TEST-START | Shutdown 159429 INFO Passed: 142295 159430 INFO Failed: 2 159431 INFO Todo: 16488 159432 INFO SimpleTest FINISHED 159433 INFO TEST-INFO | Ran 0 Loops 159434 INFO SimpleTest FINISHED NOTE: child process received `Goodbye', closing down NOTE: child process received `Goodbye', closing down INFO | automation.py | Application ran for: 0:12:40.292715 INFO | automation.py | Reading PID log: /tmp/tmpNsOgW9pidlog ==> process 2192 launched child process 2244 ==> process 2192 launched child process 2257 ==> process 2192 launched child process 2285 ==> process 2192 launched child process 2288 ==> process 2192 launched child process 2290 ==> process 2192 launched child process 2293 INFO | automation.py | Checking for orphan process with PID: 2244 INFO | automation.py | Checking for orphan process with PID: 2257 INFO | automation.py | Checking for orphan process with PID: 2285 INFO | automation.py | Checking for orphan process with PID: 2288 INFO | automation.py | Checking for orphan process with PID: 2290 INFO | automation.py | Checking for orphan process with PID: 2293 Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/1339389994/firefox-16.0a1.en-US.linux-i686.crashreporter-symbols.zip PROCESS-CRASH | Main app process exited normally | application crashed (minidump found) Crash dump filename: /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.dmp Operating system: Linux 0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686 CPU: x86 GenuineIntel family 6 model 23 stepping 10 2 CPUs Crash reason: SIGSEGV Crash address: 0x0 Thread 0 (crashed) 0 libmozalloc.so!mozalloc_abort [mozalloc_abort.cpp : 19 + 0x0] eip = 0x00852eb2 esp = 0xbfa7f600 ebp = 0x00c59844 ebx = 0x00853118 esi = 0x00c59844 edi = 0xbfa7f648 eax = 0x0000000a ecx = 0x00000001 edx = 0x00c5a32c efl = 0x00210246 Found by: given as instruction pointer in context 1 libxul.so!NS_DebugBreak_P [nsDebugImpl.cpp : 382 + 0x5] eip = 0x01cca569 esp = 0xbfa7f620 ebp = 0x00c59844 ebx = 0x024efe20 esi = 0xbfa7fa34 edi = 0xbfa7f648 Found by: call frame info 2 libxul.so!X11Error [nsX11ErrorHandler.cpp : 157 + 0x1e] eip = 0x0118d161 esp = 0xbfa7fa60 ebp = 0x00000004 ebx = 0x024efe20 esi = 0x00000000 edi = 0xbfa802a8 Found by: call frame info 3 libX11.so.6.3.0 + 0x3c120 eip = 0x00661121 esp = 0xbfa80380 ebp = 0xbfa80438 ebx = 0x00759f44 esi = 0xb55f8490 edi = 0xbfa803bc Found by: call frame info 4 libX11.so.6.3.0 + 0x428e6 eip = 0x006678e7 esp = 0xbfa80440 ebp = 0xbfa804a8 Found by: previous frame's frame pointer 5 libX11.so.6.3.0 + 0x42f95 eip = 0x00667f96 esp = 0xbfa804b0 ebp = 0xbfa804f8 Found by: previous frame's frame pointer 6 libX11.so.6.3.0 + 0x36707 eip = 0x0065b708 esp = 0xbfa80500 ebp = 0xbfa80538 Found by: previous frame's frame pointer 7 libxul.so!mozilla::layers::ShadowLayerForwarder::PlatformSyncBeforeUpdate [ShadowLayerUtilsX11.cpp : 155 + 0x10] eip = 0x01d533df esp = 0xbfa80540 ebp = 0xb2e89e64 Found by: previous frame's frame pointer 8 libxul.so!mozilla::layers::ShadowLayerForwarder::EndTransaction [ShadowLayers.cpp : 321 + 0x4] eip = 0x01d51e01 esp = 0xbfa80560 ebp = 0xb2e89e64 ebx = 0x024efe20 Found by: call frame info 9 libxul.so!mozilla::layers::BasicShadowLayerManager::ForwardTransaction [BasicLayers.cpp : 3597 + 0x16] eip = 0x01d3b484 esp = 0xbfa81330 ebp = 0xb204bc40 ebx = 0x024efe20 esi = 0xbfa81368 edi = 0xb2e89e10 Found by: call frame info 10 libxul.so!mozilla::layers::BasicShadowLayerManager::EndTransaction [BasicLayers.cpp : 3569 + 0x7] eip = 0x01d3bb68 esp = 0xbfa81bc0 ebp = 0xb204bc40 ebx = 0x024efe20 esi = 0xb2e89e10 edi = 0xbfa82170 }
Would it be possible to run this test a few dozen times on Linux with the --sync command line arg? -> Widget:GTK until we get the real stack for the X11Error
Component: General → Widget: Gtk
Keywords: qawanted
QA Contact: general → gtk
Guessing a layers surface ownership/sync issue.
Component: Widget: Gtk → Graphics: Layers
QA Contact: gtk → graphics-layers
Or perhaps caused by whatever is causing bug 733323.
Summary: Intermittent abort during test_browserElement_oop_Close.html | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157") → Intermittent abort during test_browserElement_oop_Close.html, test_browserElement_oop_Back | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp line 157")
Summary: Intermittent abort during test_browserElement_oop_Close.html, test_browserElement_oop_Back | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp line 157") → Intermittent test_browserElement_oop_Close.html, test_browserElement_oop_BackForward.html | Test timed out, "ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: nsX11ErrorHandler.cpp, line 157"
Blocks: 782505
The issue is that both the Shadow and Shadowable layers can access the front buffer, but when the Shadow layer destroys the front buffer it hasn't told the Shadowable layer not to use it. This seems specific to BasicShadowThebesLayer, but I'm not making sense of the D3D10 code where LayerManagerD3D10 also calls ShadowLayerForwarder::PaintedThebesBuffer() yet I don't see any D3D10 layers code handling the OpThebesBufferSwap reply. Prior to the call from ShadowLayerParent::Destroy() to mLayer->Disconnect() being added for bug 623255, the Disconnect() would only be called on the Shadow layer when its Shadowable layer is being deleted. I wonder why that bug wasn't resolved by making the BasicTextureImage keep an owning ref to its mGLContext? I played with delaying destruction of the front buffer to the BasicShadowThebesLayer destructor, but mAllocator has been deleted at that point. I'm wondering whether having separate Destroy and Disconnect methods on Layers may be the way to go? Destroy could be the signal corresponding to widget destruction, while Disconnect is the signal corresponding to disconnection from the Shadowable. Or perhaps Destroy can somehow be called when GL layers are removed from their layer tree (resolving bug 623255 in a different way).
Assignee: nobody → karlt
Only BasicShadowThebesLayers and ShadowThebesLayerOGLs with OpenDescriptorForDirectTexturing return a read-only front buffer from Swap. I'm not familiar with PGrallocBuffer, but perhaps the system looks after ref counting and saves us there. (In reply to Karl Tomlinson (:karlt) from comment #19) > I wonder why that bug > wasn't resolved by making the BasicTextureImage keep an owning ref to its > mGLContext? Making TextureImages keep a reference to the GL context wouldn't be a complete solution because GL layers also delete textures, framebuffers, and buffers, (and ?programs) directly and this would all need to be adjusted to use a shared context. > Or perhaps Destroy can somehow be called when GL layers are removed from > their layer tree (resolving bug 623255 in a different way). Detecting when layers are removed, but won't be re-added looks like it would require changes into FrameLayerBuilder (and I don't know how extensive that would be). I think it would be possible to have separate Destroy and Disconnect methods for Shadow GL layers, but making sure to destroy GL resources before the context looks like an unnecessary burden. The option I'm preferring is probably to change the GL layer resource handling to use a shared context to release resources (or no cleanup is necessary if there is no shared context). That would remove the need to have these Destroy methods, that are almost specific to GL layers. Explicitly calling Destroy on window destruction rather than waiting for a __delete__ message may provide slightly faster release of resources, and could use the existing context, rather than having to switch to another context, but they seem minor advantages for the extra code and complexity involved.
This is not a full solution, but it is part of the solution, and finishing the SyncFrontBufferToBackBuffer earlier may reduce the frequency of these test failures.
Attachment #663893 - Flags: review?(jones.chris.g)
A Destroy method gives more control over when to destroy a context while still allowing clients to keep a reference, which it can use to detect when the context is destroyed.
Attachment #663949 - Flags: review?(bgirard)
Still WIP: TextureImages on child, including orphan, shadow layers have a bare pointer to this GLContext. Make this into a strong reference, so that TextureImages can determine whether they need to destroy resources with a shared context. LayerOGLs use the GLContext on the LayerManagerOGL to clean up resources. LayerManagerOGL can switch its mGLContext to the shared context on Destroy. Some systems don't use a shared context. For these, resources are destroyed automatically when the context is destroyed. We could null-check the context, but there is debug resource tracking on the GLContext methods. Therefore it may be best to keep a reference to the destroyed GLContext on the LayerManagerOGL, so that the GLContext methods can still be used to track resources, even though the methods will be essentially no-ops when the context is destroyed.
Attachment #663949 - Flags: review?(bgirard) → review+
Comment on attachment 663959 [details] [diff] [review] part 3: keep a reference to mGLContext from LayerManagerOGL, but Destroy GLContext on manager Destroy Review of attachment 663959 [details] [diff] [review]: ----------------------------------------------------------------- ::: widget/gtk2/nsWindow.cpp @@ -595,5 @@ > > /** Need to clean our LayerManager up while still alive */ > if (mLayerManager) { > - nsRefPtr<GLContext> gl = nullptr; > - if (mLayerManager->GetBackendType() == mozilla::layers::LAYERS_OPENGL) { Nice, much better
Attachment #663959 - Flags: review?(bgirard) → review+
Blocks: 788935
Comment on attachment 663893 [details] [diff] [review] Finish X requests on mROFrontBuffer before deleting shadow layer Nice fix!
Attachment #663893 - Flags: review?(jones.chris.g) → review+
Comment on attachment 663893 [details] [diff] [review] Finish X requests on mROFrontBuffer before deleting shadow layer https://hg.mozilla.org/integration/mozilla-inbound/rev/ef440fbe0d52
Attachment #663893 - Flags: checkin+
Attachment 663959 [details] [diff] causes shutdown crashes on Android mochitests (but still green - bug 794244) due to TiledLayerBufferOGL accessing the mGLContext after the LayerManagerOGL is Destroy()ed.
Whiteboard: [orange] → [orange][leave open]
Whiteboard: [orange][leave open] → [leave open]
No longer blocks: 788935
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Mass-closing old bugs I filed that have not had recent activity/no longer affect me.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: