Closed Bug 588187 Opened 14 years ago Closed 14 years ago

Strange crash in [@ WebGLContext::DestroyResourcesAndContext]

Categories

(Core :: Graphics: CanvasWebGL, defect)

x86_64
Linux
defect
Not set
critical

RESOLVED FIXED

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

(Keywords: crash)

Crash Data

Attachments

(2 files)

I am getting this crash only when running the whole test suite; I can't reproduce it on a single isolated test.

Every time, it occurs in glDeleteTextures in the GL library. The fun thing is that this is true with both OSMesa and the nvidia driver.

With OSMesa, I get this backtrace:

#0  0x000000381e0a6afd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000381e0a6970 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
#2  0x00007f944924c09c in ah_crap_handler (signum=11)
    at /home/bjacob/mozilla-central/toolkit/xre/nsSigHandlers.cpp:132
#3  0x00007f9449250e51 in nsProfileLock::FatalSignalHandler (signo=11, info=0x7fff642712f0, 
    context=0x7fff642711c0) at nsProfileLock.cpp:221
#4  <signal handler called>
#5  _mesa_HashLookup_unlocked (table=0x7f942d1bc000, key=4) at main/hash.c:138
#6  0x00007f943127f548 in _mesa_HashLookup (table=0x7f942d1bc000, key=<value optimized out>)
    at main/hash.c:161
#7  0x00007f94312a6045 in _mesa_lookup_texture (n=1, textures=<value optimized out>)
    at main/texobj.c:58
#8  _mesa_DeleteTextures (n=1, textures=<value optimized out>) at main/texobj.c:902
#9  0x00007f94499175b9 in mozilla::gl::GLContext::fDeleteTextures (this=0x7f941f1d1000, n=1, 
    names=0x7fff64271754) at ../../../dist/include/GLContext.h:1095
#10 0x00007f9449913a28 in DeleteTextureFunction (aKey=@0x7f94304b65e4, aValue=0x7f942e2cebe0, 
    aData=0x7f941f1d1000) at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:121
#11 0x00007f9449919202 in nsBaseHashtable<nsUint32HashKey, nsRefPtr<mozilla::WebGLTexture>, mozilla::WebGLTexture*>::s_EnumReadStub (table=0x7f9427e544e8, hdr=0x7f94304b65e0, number=1, arg=
    0x7fff64271880) at ../../../dist/include/nsBaseHashtable.h:345
#12 0x00007f944a7ee4c6 in PL_DHashTableEnumerate (table=0x7f9427e544e8, etor=
    0x7f944991919c <nsBaseHashtable<nsUint32HashKey, nsRefPtr<mozilla::WebGLTexture>, mozilla::WebGLTexture*>::s_EnumReadStub(PLDHashTable*, PLDHashEntryHdr*, PRUint32, void*)>, arg=
    0x7fff64271880) at pldhash.c:754
#13 0x00007f9449918791 in nsBaseHashtable<nsUint32HashKey, nsRefPtr<mozilla::WebGLTexture>, mozilla::WebGLTexture*>::EnumerateRead (this=0x7f9427e544e8, enumFunc=
    0x7f9449913998 <DeleteTextureFunction(PRUint32 const&, mozilla::WebGLTexture*, void*)>, 
    userArg=0x7f941f1d1000) at ../../../dist/include/nsBaseHashtable.h:206
#14 0x00007f9449913d83 in mozilla::WebGLContext::DestroyResourcesAndContext (this=0x7f9427e54400)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:190
#15 0x00007f94499137d7 in mozilla::WebGLContext::~WebGLContext (this=0x7f9427e54400, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:111
#16 0x00007f9449913982 in mozilla::WebGLContext::~WebGLContext (this=0x7f9427e54400, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:112
#17 0x00007f9449914ee1 in mozilla::WebGLContext::Release (this=0x7f9427e54400)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:560
#18 0x00007f944991128a in nsCOMPtr<nsICanvasRenderingContextInternal>::~nsCOMPtr (this=
    0x7f9430887660, __in_chrg=<value optimized out>) at ../../../dist/include/nsCOMPtr.h:533
#19 0x00007f944999e602 in nsHTMLCanvasElement::~nsHTMLCanvasElement (this=0x7f94308875e0, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/html/content/src/nsHTMLCanvasElement.cpp:73
#20 0x00007f944999e656 in nsHTMLCanvasElement::~nsHTMLCanvasElement (this=0x7f94308875e0, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/html/content/src/nsHTMLCanvasElement.cpp:73


With the NVidia driver, I get a similar backtrace: crash in fDeleteTextures() a few frames into the NVidia driver.

I then added static 'object count' variables to GLContext and WebGLContext and printed some debug info in the ctor/dtor of these classes. Here is the end of the output before the crash:



delete WebGLContext 7f93ed343c00, objcnt decreased to 71
DestroyResourcesAndContext called on 7f93ed343c00
--- WebGL context destroyed: 0x7f93ed3bf000
delete GLContext 0x7f93ed3bf000, objcnt decreased to 71
delete WebGLContext 7f93ee2ad000, objcnt decreased to 70
DestroyResourcesAndContext called on 7f93ee2ad000
--- WebGL context destroyed: 0x7f93ede0b000

 [... SNIP SNIP SNIP ...]

delete WebGLContext 7f9430431000, objcnt decreased to 25
DestroyResourcesAndContext called on 7f9430431000
--- WebGL context destroyed: 0x7f942dc1a800
delete GLContext 0x7f942dc1a800, objcnt decreased to 25
delete WebGLContext 7f9427e54400, objcnt decreased to 24
DestroyResourcesAndContext called on 7f9427e54400
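
For readers who want to try the same kind of instrumentation, here is a minimal sketch of a static object counter with ctor/dtor logging. CountedContext, sObjCount and LiveCount are hypothetical names chosen for illustration; the actual debug patch applied to GLContext and WebGLContext is not shown in this report and may have looked different. The counter is a plain int, so this sketch assumes all objects are created and destroyed on one thread.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a class such as GLContext or WebGLContext.
// It only demonstrates the counting technique, not any real Gecko code.
class CountedContext {
public:
    CountedContext() {
        ++sObjCount;
        std::printf("new CountedContext %p, objcnt increased to %d\n",
                    static_cast<void*>(this), sObjCount);
    }

    ~CountedContext() {
        // A count of zero here would mean more destructions than constructions.
        assert(sObjCount > 0);
        --sObjCount;
        std::printf("delete CountedContext %p, objcnt decreased to %d\n",
                    static_cast<void*>(this), sObjCount);
    }

    // Copies are disabled so every live object went through the counting ctor.
    CountedContext(const CountedContext&) = delete;
    CountedContext& operator=(const CountedContext&) = delete;

    // Number of objects currently alive.
    static int LiveCount() { return sObjCount; }

private:
    static int sObjCount;  // not thread-safe; assumes single-threaded use
};

int CountedContext::sObjCount = 0;
```

If the count printed by the destructor ever disagrees with the number of objects expected to be alive, that points at a double destruction or a leak rather than at the GL library.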

First of all:
 - I have tried calling glIsTexture to check it's a valid texture. Result: exact same crash in glIsTexture.
 - I have added redundant MakeCurrent() calls just before the crash point; that made no difference.


So, I have looked into the OSMesa source code:

static INLINE void *
_mesa_HashLookup_unlocked(struct _mesa_HashTable *table, GLuint key)
{
   GLuint pos;
   const struct HashEntry *entry;

   assert(table);
   assert(key);

   pos = HASH_FUNC(key);
   entry = table->Table[pos];
   while (entry) {
      if (entry->Key == key) {   //// <--- crash here
         return entry->Data;
      }
      entry = entry->Next;
   }
   return NULL;
}


Here's some info obtained by GDB:


(gdb) print entry
$1 = (const struct HashEntry *) 0x100000001
(gdb) print pos
$2 = 4
[...SNIP...]
(gdb) print table->Table[pos]
$5 = (struct HashEntry *) 0x7f080c9eb440
(gdb) print table->Table[pos]->Next
$6 = (struct HashEntry *) 0x100000001


So here we are: this code expects a null-terminated linked list, but our list instead contains a bogus node, 0x100000001, which is reached before our texture is actually found.
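
One cheap way to spot a link like this while debugging is to check each pointer's alignment before following it: 0x100000001 is odd, so it cannot be the address of a properly allocated node. The sketch below assumes a simplified, hypothetical HashEntry with the same three fields as in the snippet above; IsMisaligned and FirstBogusLink are made-up helper names and are not part of Mesa.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical node layout; only the field names follow the
// Mesa snippet quoted above.
struct HashEntry {
    unsigned Key;
    void* Data;
    HashEntry* Next;
};

// True if p cannot point at a properly allocated HashEntry because the
// address is not a multiple of the type's alignment.
inline bool IsMisaligned(const HashEntry* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(HashEntry) != 0;
}

// Walks a chain and returns the position of the first misaligned pointer
// (0 = the head itself), or -1 if none is found within maxLinks steps.
// A suspicious pointer is reported without being dereferenced.
inline int FirstBogusLink(const HashEntry* head, int maxLinks = 1024) {
    assert(maxLinks > 0);
    const HashEntry* entry = head;
    for (int i = 0; i < maxLinks && entry; ++i) {
        if (IsMisaligned(entry)) {
            return i;
        }
        entry = entry->Next;
    }
    return -1;
}
```

This only catches corruption that happens to produce a misaligned address; an aligned garbage pointer would pass the check, so a tool such as valgrind is still needed to find where the heap was actually overwritten.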
Very strange: can't reproduce the crash with the nvidia driver anymore.
OK, I can now reproduce with the nvidia driver. Backtrace:


#0  0x000000381e0a6afd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000381e0a6970 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
#2  0x00007fcc5d7a709c in ah_crap_handler (signum=11)
    at /home/bjacob/mozilla-central/toolkit/xre/nsSigHandlers.cpp:132
#3  0x00007fcc5d7abe51 in nsProfileLock::FatalSignalHandler (signo=11, info=0x7fff67a5e4f0, 
    context=0x7fff67a5e3c0) at nsProfileLock.cpp:221
#4  <signal handler called>
#5  0x0000000000416d8a in arena_dalloc (ptr=0x146a1ea80, offset=125568)
    at /home/bjacob/mozilla-central/memory/jemalloc/jemalloc.c:4222
#6  0x000000000041a576 in free (ptr=0x146a1ea80)
    at /home/bjacob/mozilla-central/memory/jemalloc/jemalloc.c:6064
#7  0x000000380a67d9ae in ?? () from /usr/lib64/nvidia/libGLcore.so.1
#8  0x000000380aaa7e68 in ?? () from /usr/lib64/nvidia/libGLcore.so.1
#9  0x000000380a728f7d in ?? () from /usr/lib64/nvidia/libGLcore.so.1
#10 0x000000380a711d14 in ?? () from /usr/lib64/nvidia/libGLcore.so.1
#11 0x00007fcc5de724f1 in mozilla::gl::GLContext::fDeleteTextures (this=0x7fcc4784d800, n=1, 
    names=0x7fcc46c76a18) at ../../../dist/include/GLContext.h:1091
#12 0x00007fcc5de6eec8 in mozilla::WebGLContext::DestroyResourcesAndContext (this=0x7fcc46c76800)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:204
#13 0x00007fcc5de6e78a in mozilla::WebGLContext::~WebGLContext (this=0x7fcc46c76800, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:106
#14 0x00007fcc5de6e936 in mozilla::WebGLContext::~WebGLContext (this=0x7fcc46c76800, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:107
#15 0x00007fcc5de6fe19 in mozilla::WebGLContext::Release (this=0x7fcc46c76800)
    at /home/bjacob/mozilla-central/content/canvas/src/WebGLContext.cpp:545
#16 0x00007fcc5de6c28a in nsCOMPtr<nsICanvasRenderingContextInternal>::~nsCOMPtr (this=
    0x7fcc46de7540, __in_chrg=<value optimized out>) at ../../../dist/include/nsCOMPtr.h:533
#17 0x00007fcc5def9526 in nsHTMLCanvasElement::~nsHTMLCanvasElement (this=0x7fcc46de74c0, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/html/content/src/nsHTMLCanvasElement.cpp:73
#18 0x00007fcc5def957a in nsHTMLCanvasElement::~nsHTMLCanvasElement (this=0x7fcc46de74c0, 
    __in_chrg=<value optimized out>)
    at /home/bjacob/mozilla-central/content/html/content/src/nsHTMLCanvasElement.cpp:73
#19 0x00007fcc5ddf3223 in nsNodeUtils::LastRelease (aNode=0x7fcc46de74c0)
    at /home/bjacob/mozilla-central/content/base/src/nsNodeUtils.cpp:294
#20 0x00007fcc5dddbbb5 in nsGenericElement::Release (this=0x7fcc46de74c0)
    at /home/bjacob/mozilla-central/content/base/src/nsGenericElement.cpp:4470
#21 0x00007fcc5def9712 in nsHTMLCanvasElement::Release (this=0x7fcc46de74c0)
    at /home/bjacob/mozilla-central/content/html/content/src/nsHTMLCanvasElement.cpp:82
#22 0x00007fcc5ed5535b in nsXPCOMCycleCollectionParticipant::Unroot (this=0x7fcc6003d578, p=
    0x7fcc46de74c0) at nsCycleCollectionParticipant.cpp:74
#23 0x00007fcc5ede01c3 in nsCycleCollector::CollectWhite (this=0x7fcc5b2fd000)
    at /home/bjacob/mozilla-central/xpcom/base/nsCycleCollector.cpp:1800
#24 0x00007fcc5ede0a1a in nsCycleCollector::FinishCollection (this=0x7fcc5b2fd000)
Severity: normal → critical
Keywords: crash
Summary: Strange crash in WebGLContext::DestroyResourcesAndContext → Strange crash in [@ WebGLContext::DestroyResourcesAndContext]
Blocks: 586811
A patch that reproduces the crash, at least for me with OSMesa, can be found here:
    https://bugzilla.mozilla.org/attachment.cgi?id=467179
No longer blocks: 586811
Blocks: 586811
Here's Valgrind output for a whole run of the webgl test suite, with OSMesa, with that crashy patch applied.

As you can see, valgrind doesn't report any crash, whereas when running without valgrind I do get the crash.

The best explanation that I can find is that many tests are timing out when run under valgrind, possibly skipping the crashy part.
Here's the output of a new valgrind run, this time with higher timeouts on the tests so that they don't time out.

It ends in "VALGRIND INTERNAL ERROR" which according to certain gurus means that I trashed the heap.
So, the crash was introduced by my patch in bug 586811. The new version of this patch fixes it.

Will file another bug about fixing the uninitialized-value errors.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
See bug 588918 about fixing the uninitialized-value errors.
Crash Signature: [@ WebGLContext::DestroyResourcesAndContext]
Assignee: nobody → bjacob