Intermittent test_browserElement_oop_Close.html, test_browserElement_oop_BackForward.html | Test timed out, "ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: nsX11ErrorHandler.cpp, line 157"

NEW
Assigned to

Status

()

Core
Graphics: Layers
P3
critical
6 years ago
a year ago

People

(Reporter: emorley, Assigned: karlt)

Tracking

({crash, intermittent-failure, qawanted})

Trunk
x86
Linux
crash, intermittent-failure, qawanted
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [leave open])

Attachments

(3 attachments)

(Reporter)

Description

6 years ago
Rev3 Fedora 12 mozilla-inbound opt test mochitests-2/5 on 2012-06-10 22:22:19 PDT for push 61fd66629c4f

slave: talos-r3-fed-070

https://tbpl.mozilla.org/php/getParsedLog.php?id=12543356&tree=Mozilla-Inbound

{
821 INFO TEST-START | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html
creating 1!
[Child 2244] ###!!! ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157
[Child 2244] ###!!! ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157
822 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | Test timed out.
XScreenSaver state: Disabled
User input has been idle for 969 seconds
args: ['/home/cltbld/talos-slave/test/build/bin/screentopng']
SCREENSHOT: <snip>
823 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | This test left crash dumps behind, but we weren't expecting it to!
824 INFO TEST-INFO | Found unexpected crash dump file /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.dmp.
825 INFO TEST-INFO | Found unexpected crash dump file /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.extra.
826 INFO TEST-END | /tests/dom/browser-element/mochitest/test_browserElement_oop_Close.html | finished in 305522ms
}

...

{
159428 INFO TEST-START | Shutdown
159429 INFO Passed: 142295
159430 INFO Failed: 2
159431 INFO Todo:   16488
159432 INFO SimpleTest FINISHED
159433 INFO TEST-INFO | Ran 0 Loops
159434 INFO SimpleTest FINISHED
NOTE: child process received `Goodbye', closing down
NOTE: child process received `Goodbye', closing down
INFO | automation.py | Application ran for: 0:12:40.292715
INFO | automation.py | Reading PID log: /tmp/tmpNsOgW9pidlog
==> process 2192 launched child process 2244
==> process 2192 launched child process 2257
==> process 2192 launched child process 2285
==> process 2192 launched child process 2288
==> process 2192 launched child process 2290
==> process 2192 launched child process 2293
INFO | automation.py | Checking for orphan process with PID: 2244
INFO | automation.py | Checking for orphan process with PID: 2257
INFO | automation.py | Checking for orphan process with PID: 2285
INFO | automation.py | Checking for orphan process with PID: 2288
INFO | automation.py | Checking for orphan process with PID: 2290
INFO | automation.py | Checking for orphan process with PID: 2293
Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/1339389994/firefox-16.0a1.en-US.linux-i686.crashreporter-symbols.zip
PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)
Crash dump filename: /tmp/tmpQU2mnt/minidumps/141750e9-841d-1590-7bd98799-2dc599d5.dmp
Operating system: Linux
                  0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686
CPU: x86
     GenuineIntel family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGSEGV
Crash address: 0x0

Thread 0 (crashed)
 0  libmozalloc.so!mozalloc_abort [mozalloc_abort.cpp : 19 + 0x0]
    eip = 0x00852eb2   esp = 0xbfa7f600   ebp = 0x00c59844   ebx = 0x00853118
    esi = 0x00c59844   edi = 0xbfa7f648   eax = 0x0000000a   ecx = 0x00000001
    edx = 0x00c5a32c   efl = 0x00210246
    Found by: given as instruction pointer in context
 1  libxul.so!NS_DebugBreak_P [nsDebugImpl.cpp : 382 + 0x5]
    eip = 0x01cca569   esp = 0xbfa7f620   ebp = 0x00c59844   ebx = 0x024efe20
    esi = 0xbfa7fa34   edi = 0xbfa7f648
    Found by: call frame info
 2  libxul.so!X11Error [nsX11ErrorHandler.cpp : 157 + 0x1e]
    eip = 0x0118d161   esp = 0xbfa7fa60   ebp = 0x00000004   ebx = 0x024efe20
    esi = 0x00000000   edi = 0xbfa802a8
    Found by: call frame info
 3  libX11.so.6.3.0 + 0x3c120
    eip = 0x00661121   esp = 0xbfa80380   ebp = 0xbfa80438   ebx = 0x00759f44
    esi = 0xb55f8490   edi = 0xbfa803bc
    Found by: call frame info
 4  libX11.so.6.3.0 + 0x428e6
    eip = 0x006678e7   esp = 0xbfa80440   ebp = 0xbfa804a8
    Found by: previous frame's frame pointer
 5  libX11.so.6.3.0 + 0x42f95
    eip = 0x00667f96   esp = 0xbfa804b0   ebp = 0xbfa804f8
    Found by: previous frame's frame pointer
 6  libX11.so.6.3.0 + 0x36707
    eip = 0x0065b708   esp = 0xbfa80500   ebp = 0xbfa80538
    Found by: previous frame's frame pointer
 7  libxul.so!mozilla::layers::ShadowLayerForwarder::PlatformSyncBeforeUpdate [ShadowLayerUtilsX11.cpp : 155 + 0x10]
    eip = 0x01d533df   esp = 0xbfa80540   ebp = 0xb2e89e64
    Found by: previous frame's frame pointer
 8  libxul.so!mozilla::layers::ShadowLayerForwarder::EndTransaction [ShadowLayers.cpp : 321 + 0x4]
    eip = 0x01d51e01   esp = 0xbfa80560   ebp = 0xb2e89e64   ebx = 0x024efe20
    Found by: call frame info
 9  libxul.so!mozilla::layers::BasicShadowLayerManager::ForwardTransaction [BasicLayers.cpp : 3597 + 0x16]
    eip = 0x01d3b484   esp = 0xbfa81330   ebp = 0xb204bc40   ebx = 0x024efe20
    esi = 0xbfa81368   edi = 0xb2e89e10
    Found by: call frame info
10  libxul.so!mozilla::layers::BasicShadowLayerManager::EndTransaction [BasicLayers.cpp : 3569 + 0x7]
    eip = 0x01d3bb68   esp = 0xbfa81bc0   ebp = 0xb204bc40   ebx = 0x024efe20
    esi = 0xb2e89e10   edi = 0xbfa82170
}
Would it be possible to run this test a few dozen times on Linux with the
--sync command line arg?

-> Widget:GTK until we get the real stack for the X11Error
Component: General → Widget: Gtk
Keywords: qawanted
QA Contact: general → gtk
(Assignee)

Comment 2

6 years ago
Guessing a layers surface ownership/sync issue.
Component: Widget: Gtk → Graphics: Layers
QA Contact: gtk → graphics-layers
(Assignee)

Comment 3

6 years ago
Or perhaps caused by whatever is causing bug 733323.
(Reporter)

Comment 4

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=12977506&tree=Mozilla-Inbound
Summary: Intermittent abort during test_browserElement_oop_Close.html | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157") → Intermittent abort during test_browserElement_oop_Close.html, test_browserElement_oop_Back | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp line 157")
(Reporter)

Comment 5

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=12977043&full=1&branch=mozilla-inbound
Summary: Intermittent abort during test_browserElement_oop_Close.html, test_browserElement_oop_Back | Test timed out. ("ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp line 157") → Intermittent test_browserElement_oop_Close.html, test_browserElement_oop_BackForward.html | Test timed out, "ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 4 requests ago: nsX11ErrorHandler.cpp, line 157"
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Updated

6 years ago
Blocks: 782505
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 19

5 years ago
The issue is that both the Shadow and Shadowable layers can access the front
buffer, but when the Shadow layer destroys the front buffer it hasn't told the
Shadowable layer not to use it.

This seems specific to BasicShadowThebesLayer, but I'm not making sense of the
D3D10 code where LayerManagerD3D10 also calls
ShadowLayerForwarder::PaintedThebesBuffer() yet I don't see any D3D10 layers
code handling the OpThebesBufferSwap reply.

Prior to the call from ShadowLayerParent::Destroy() to mLayer->Disconnect()
being added for bug 623255, the Disconnect() would only be called on the Shadow
layer when its Shadowable layer is being deleted.  I wonder why that bug
wasn't resolved by making the BasicTextureImage keep an owning ref to its
mGLContext?

I played with delaying destruction of the front buffer to the
BasicShadowThebesLayer destructor, but mAllocator has been deleted at that
point.

I'm wondering whether having separate Destroy and Disconnect methods on Layers
may be the way to go?  Destroy could be the signal corresponding to widget
destruction, while Disconnect is the signal corresponding to disconnection
from the Shadowable.

Or perhaps Destroy can somehow be called when GL layers are removed from their
layer tree (resolving bug 623255 in a different way).
(Assignee)

Updated

5 years ago
Assignee: nobody → karlt
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 22

5 years ago
Only BasicShadowThebesLayers and ShadowThebesLayerOGLs with
OpenDescriptorForDirectTexturing return a read-only front buffer from Swap.
I'm not familiar with PGrallocBuffer, but perhaps the system looks after ref
counting and saves us there.

(In reply to Karl Tomlinson (:karlt) from comment #19)
> I wonder why that bug
> wasn't resolved by making the BasicTextureImage keep an owning ref to its
> mGLContext?

Making TextureImages keep a reference to the GL context wouldn't be a complete
solution because GL layers also delete textures, framebuffers, and buffers,
(and ?programs) directly and this would all need to be adjusted to use a
shared context.

> Or perhaps Destroy can somehow be called when GL layers are removed from
> their layer tree (resolving bug 623255 in a different way).

Detecting when layers are removed, but won't be re-added looks like it would
require changes into FrameLayerBuilder (and I don't know how extensive that
would be).

I think it would be possible to have separate Destroy and Disconnect methods
for Shadow GL layers, but making sure to destroy GL resources before the
context looks like an unnecessary burden.

The option I'm preferring is probably to change the GL layer resource handling
to use a shared context to release resources (or no cleanup is necessary if
there is no shared context).  That would remove the need to have these Destroy
methods, that are almost specific to GL layers.

Explicitly calling Destroy on window destruction rather than waiting for a
__delete__ message may provide slightly faster release of resources, and could
use the existing context, rather than having to switch to another context, but
they seem minor advantages for the extra code and complexity involved.
Comment hidden (Treeherder Robot)
(Assignee)

Comment 24

5 years ago
Created attachment 663893 [details] [diff] [review]
Finish X requests on mROFrontBuffer before deleting shadow layer

This is not a full solution, but it is part of the solution, and finishing the SyncFrontBufferToBackBuffer earlier may reduce the frequency of these test failures.
Attachment #663893 - Flags: review?(jones.chris.g)
(Assignee)

Comment 25

5 years ago
Created attachment 663949 [details] [diff] [review]
part 2: add a method to destroy a GLContext while references remain

A Destroy method gives more control over when to destroy a context while still allowing clients to keep a reference, which it can use to detect when the context is destroyed.
Attachment #663949 - Flags: review?(bgirard)
(Assignee)

Comment 26

5 years ago
Still WIP:

TextureImages on child, including orphan, shadow layers have a bare pointer
to this GLContext.  Make this into a strong reference, so that TextureImages
can determine whether they need to destroy resources with a shared context.

LayerOGLs use the GLContext on the LayerManagerOGL to clean up resources.
LayerManagerOGL can switch its mGLContext to the shared context on Destroy.

Some systems don't use a shared context.  For these, resources are destroyed
automatically when the context is destroyed.  We could null-check the context,
but there is debug resource tracking on the GLContext methods.  Therefore it may be
best to keep a reference to the destroyed GLContext on the LayerManagerOGL, so
that the GLContext methods can still be used to track resources, even though
the methods will be essentially no-ops when the context is destroyed.
(Assignee)

Comment 27

5 years ago
Created attachment 663959 [details] [diff] [review]
part 3: keep a reference to mGLContext from LayerManagerOGL, but Destroy GLContext on manager Destroy
Attachment #663959 - Flags: review?(bgirard)

Updated

5 years ago
Attachment #663949 - Flags: review?(bgirard) → review+
Comment on attachment 663959 [details] [diff] [review]
part 3: keep a reference to mGLContext from LayerManagerOGL, but Destroy GLContext on manager Destroy

Review of attachment 663959 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/gtk2/nsWindow.cpp
@@ -595,5 @@
>  
>      /** Need to clean our LayerManager up while still alive */
>      if (mLayerManager) {
> -        nsRefPtr<GLContext> gl = nullptr;
> -        if (mLayerManager->GetBackendType() == mozilla::layers::LAYERS_OPENGL) {

Nice, much better
Attachment #663959 - Flags: review?(bgirard) → review+
(Assignee)

Updated

5 years ago
Blocks: 788935
Comment on attachment 663893 [details] [diff] [review]
Finish X requests on mROFrontBuffer before deleting shadow layer

Nice fix!
Attachment #663893 - Flags: review?(jones.chris.g) → review+
Comment hidden (Treeherder Robot)
(Assignee)

Comment 31

5 years ago
Comment on attachment 663893 [details] [diff] [review]
Finish X requests on mROFrontBuffer before deleting shadow layer

https://hg.mozilla.org/integration/mozilla-inbound/rev/ef440fbe0d52
Attachment #663893 - Flags: checkin+
(Assignee)

Comment 32

5 years ago
Attachment 663959 [details] [diff] causes shutdown crashes on Android mochitests (but still green - bug 794244) due to TiledLayerBufferOGL accessing the mGLContext after the LayerManagerOGL is Destroy()ed.
Whiteboard: [orange] → [orange][leave open]
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Keywords: intermittent-failure
Whiteboard: [orange][leave open] → [leave open]
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Updated

5 years ago
No longer blocks: 788935
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)

Comment 48

a year ago
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.