Closed Bug 654820 Opened 13 years ago Closed 11 years ago

Assertion failure: m_pools.empty(), at js/src/assembler/jit/ExecutableAllocator.h:180

Categories

(Core :: JavaScript Engine, defect)

x86_64
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28

People

(Reporter: smaug, Assigned: sunfish)

References

Details

(Whiteboard: [MemShrink:P3])

Attachments

(1 file)

#0  0x0000003ce74aa87d in nanosleep () from /lib64/libc.so.6
#1  0x0000003ce74aa70f in sleep () from /lib64/libc.so.6
#2  0x00007f7d9a18248b in ah_crap_handler (signum=6) at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsSigHandlers.cpp:119
#3  0x00007f7d9a1876fb in nsProfileLock::FatalSignalHandler (signo=6, info=0x7fff29634a70, context=0x7fff29634940) at /home/smaug/mozilla/hg/mozilla/ff_build/toolkit/profile/nsProfileLock.cpp:226
#4  <signal handler called>
#5  0x0000003ce780ed8b in raise () from /lib64/libpthread.so.0
#6  0x00007f7d9c206a90 in ~ExecutableAllocator (this=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/assembler/jit/ExecutableAllocator.h:180
#7  delete_<JSC::ExecutableAllocator> (this=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/jsutil.h:498
#8  js::mjit::JaegerCompartment::Finish (this=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/methodjit/MethodJIT.cpp:652
#9  0x00007f7d9bfa5342 in ~JaegerCompartment (this=0x7f7d8dcae000, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/methodjit/MethodJIT.h:229
#10 delete_<js::mjit::JaegerCompartment> (this=0x7f7d8dcae000, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/jsutil.h:498
#11 JSCompartment::~JSCompartment (this=0x7f7d8dcae000, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/jscompartment.cpp:113
#12 0x00007f7d9c008a89 in delete_<JSCompartment> (cx=<value optimized out>, comp=0x0, gckind=GC_LAST_CONTEXT) at /home/smaug/mozilla/hg/mozilla/js/src/jscntxt.h:1316
#13 SweepCompartments (cx=<value optimized out>, comp=0x0, gckind=GC_LAST_CONTEXT) at /home/smaug/mozilla/hg/mozilla/js/src/jsgc.cpp:2238
#14 MarkAndSweep (cx=<value optimized out>, comp=0x0, gckind=GC_LAST_CONTEXT) at /home/smaug/mozilla/hg/mozilla/js/src/jsgc.cpp:2438
#15 GCUntilDone (cx=<value optimized out>, comp=0x0, gckind=GC_LAST_CONTEXT) at /home/smaug/mozilla/hg/mozilla/js/src/jsgc.cpp:2684
#16 0x00007f7d9c008ec0 in js_GC (cx=0x7f7d7ed2fc00, comp=0x0, gckind=GC_LAST_CONTEXT) at /home/smaug/mozilla/hg/mozilla/js/src/jsgc.cpp:2765
#17 0x00007f7d9bfa1aaf in js_DestroyContext (cx=0x7f7d7ed2fc00, mode=JSDCM_FORCE_GC) at /home/smaug/mozilla/hg/mozilla/js/src/jscntxt.cpp:643
#18 0x00007f7d9b2d389d in nsXPConnect::~nsXPConnect (this=0x7f7d90348cc0, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/xpconnect/src/nsXPConnect.cpp:143
#19 0x00007f7d9b2d39a9 in nsXPConnect::~nsXPConnect (this=0x7f7d90348cc0, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/js/src/xpconnect/src/nsXPConnect.cpp:157
#20 0x00007f7d9b2d2c3d in nsXPConnect::Release (this=0x7f7d90348cc0) at /home/smaug/mozilla/hg/mozilla/js/src/xpconnect/src/nsXPConnect.cpp:70
#21 0x00007f7d9aed0b7f in nsScriptSecurityManager::Shutdown () at /home/smaug/mozilla/hg/mozilla/caps/src/nsScriptSecurityManager.cpp:3500
#22 0x00007f7d9bc0eb9e in ~KnownModule (this=0x7f7d98236320) at /home/smaug/mozilla/hg/mozilla/xpcom/components/nsComponentManager.h:204
#23 ~nsAutoPtr (this=0x7f7d98236320) at ../../dist/include/nsAutoPtr.h:104
#24 Destruct (this=0x7f7d98236320) at ../../dist/include/nsTArray.h:279
#25 DestructRange (this=0x7f7d98236320) at ../../dist/include/nsTArray.h:1106
#26 RemoveElementsAt (this=0x7f7d98236320) at ../../dist/include/nsTArray.h:834
#27 Clear (this=0x7f7d98236320) at ../../dist/include/nsTArray.h:845
#28 nsComponentManagerImpl::Shutdown (this=0x7f7d98236320) at /home/smaug/mozilla/hg/mozilla/xpcom/components/nsComponentManager.cpp:1008
#29 0x00007f7d9bb9ae77 in mozilla::ShutdownXPCOM (servMgr=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/xpcom/build/nsXPComInit.cpp:714
#30 0x00007f7d9a16cc13 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x7fff29635840, __in_chrg=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsAppRunner.cpp:1077
#31 0x00007f7d9a175ab6 in XRE_main (argc=<value optimized out>, argv=<value optimized out>, aAppData=<value optimized out>) at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsAppRunner.cpp:3434
#32 0x00000000004019d7 in main (argc=3, argv=0x7fff29635b18) at /home/smaug/mozilla/hg/mozilla/browser/app/nsBrowserApp.cpp:158
The relevant code:

    ~ExecutableAllocator()
    {
        for (size_t i = 0; i < m_smallPools.length(); i++)
            m_smallPools[i]->release(/* willDestroy = */true);
        JS_ASSERT(m_pools.empty());     // if this asserts we have a pool leak
    }

smaug, what were you doing when it triggered?
I was closing trunk debug Fx.
I'm also seeing this several times a day in my trunk debug Fx Linux64 build.
It always happens on exit, but it's not reproducible.
I added that assertion in bug 633653 added the assertion.  It also did some refactoring around that code that in theory shouldn't have changed its function, though it's possible that I did so by accident.

There are two likely possibilities: (a) we have a pool leak, one which pre-existed bug 633653, or (b) we don't have a leak but the assertion is invalid.  I'll try to reproduce and work out which of these is true.
Automation got this assertion on WinXP, Win7 on:

http://1001likes.free.fr/miss2.html
http://peliculasid.net/peliculas/battle-los-angeles.html
http://www.pwmania.com/
http://www.elyrics.net/read/c/chris-tomlin-lyrics/the-wonderful-cross-lyrics.html
http://www.mosfli.com/category/te-rinj/
http://www.streamiz.com/video.php?id=133448
http://www.y3.com/games/27093/Sonic-Nazo-Unleashed-Part-1

I've tried to reproduce locally the only luck I've had is with the 1001likes site shutting down after 15-20 popup ads. It is not reliable. Several of these sites appear to share ad networks which include imvu avatars and fake ipad winnings and use onunload to block leaving the ad pages. 1001likes also leaks urls and some mFreeCount and mAdoptFreeCount bytes.
I just hit this myself shutting down the browser:

#6  0x00007f3c09905598 in JS_Assert (s=0x7f3c0a048e3a "m_pools.empty()", 
    file=0x7f3c0a048dd8 "/home/njn/moz/ws7/js/src/assembler/jit/ExecutableAllocator.h", ln=180)
    at /home/njn/moz/ws7/js/src/jsutil.cpp:89
#7  0x00007f3c097cb487 in ~ExecutableAllocator (this=0x7f3bf75fa480, 
    __in_chrg=<value optimized out>)
    at /home/njn/moz/ws7/js/src/assembler/jit/ExecutableAllocator.h:180
#8  0x00007f3c097cb9f2 in js::Foreground::delete_<JSC::ExecutableAllocator> (p=0x7f3bf75fa480)
    at /home/njn/moz/ws7/js/src/jsutil.h:498
#9  0x00007f3c097c963d in ~JSCompartment (this=0x7f3bf49d1000, __in_chrg=<value optimized out>)
    at /home/njn/moz/ws7/js/src/jscompartment.cpp:105
#10 0x00007f3c0981f8d1 in JSContext::delete_<JSCompartment> (this=0x7f3be2c5b000, 
    p=0x7f3bf49d1000) at /home/njn/moz/ws7/js/src/jscntxt.h:1315
#11 0x00007f3c09814b7b in SweepCompartments (cx=0x7f3be2c5b000, gckind=GC_LAST_CONTEXT)
    at /home/njn/moz/ws7/js/src/jsgc.cpp:2238
#12 0x00007f3c098153b2 in MarkAndSweep (cx=0x7f3be2c5b000, comp=0x0, gckind=GC_LAST_CONTEXT)
    at /home/njn/moz/ws7/js/src/jsgc.cpp:2438
#13 0x00007f3c09815e17 in GCUntilDone (cx=0x7f3be2c5b000, comp=0x0, gckind=GC_LAST_CONTEXT)
    at /home/njn/moz/ws7/js/src/jsgc.cpp:2684
#14 0x00007f3c0981612e in js_GC (cx=0x7f3be2c5b000, comp=0x0, gckind=GC_LAST_CONTEXT)
    at /home/njn/moz/ws7/js/src/jsgc.cpp:2765
#15 0x00007f3c097c337e in js_DestroyContext (cx=0x7f3be2c5b000, mode=JSDCM_FORCE_GC)
    at /home/njn/moz/ws7/js/src/jscntxt.cpp:643
#16 0x00007f3c09783a65 in JS_DestroyContext (cx=0x7f3be2c5b000)
    at /home/njn/moz/ws7/js/src/jsapi.cpp:1024
#17 0x00007f3c08ae9dd9 in ~nsXPConnect (this=0x7f3bf756a2f0, __in_chrg=<value optimized out>)
    at /home/njn/moz/ws7/js/src/xpconnect/src/nsXPConnect.cpp:143
#18 0x00007f3c08ae9444 in nsXPConnect::Release (this=0x7f3bf756a2f0)
    at /home/njn/moz/ws7/js/src/xpconnect/src/nsXPConnect.cpp:70
#19 0x00007f3c0877773f in nsScriptSecurityManager::Shutdown ()
    at /home/njn/moz/ws7/caps/src/nsScriptSecurityManager.cpp:3500
#20 0x00007f3c07dffad9 in LayoutModuleDtor ()
    at /home/njn/moz/ws7/layout/build/nsLayoutModule.cpp:1210
#21 0x00007f3c0935b963 in ~KnownModule (this=0x7f3bf8a95dc0, __in_chrg=<value optimized out>)
    at /home/njn/moz/ws7/xpcom/components/nsComponentManager.h:204
#22 0x00007f3c0935f84d in ~nsAutoPtr (this=0x7f3bfe2b98b0, __in_chrg=<value optimized out>)
    at ../../dist/include/nsAutoPtr.h:104
#23 0x00007f3c0935f6cd in nsTArrayElementTraits<nsAutoPtr<nsComponentManagerImpl::KnownModule> >::Destruct (e=0x7f3bfe2b98b0) at ../../dist/include/nsTArray.h:279
#24 0x00007f3c0935f0d1 in nsTArray<nsAutoPtr<nsComponentManagerImpl::KnownModule>, nsTArrayDefaultAllocator>::DestructRange (this=0x7f3bf8a56288, start=0, count=52)
    at ../../dist/include/nsTArray.h:1106
#25 0x00007f3c0935e0e6 in nsTArray<nsAutoPtr<nsComponentManagerImpl::KnownModule>, nsTArrayDefaultAllocator>::RemoveElementsAt (this=0x7f3bf8a56288, start=0, count=52)
    at ../../dist/include/nsTArray.h:834
#26 0x00007f3c0935ca4d in nsTArray<nsAutoPtr<nsComponentManagerImpl::KnownModule>, nsTArrayDefaultAllocator>::Clear (this=0x7f3bf8a56288) at ../../dist/include/nsTArray.h:845
(dvander, cdleary:  a question for you is below!)

For the assertion in comment 6, there was one pool left in m_pools, looking like this:

  m_allocator = 0x7f3bf75fa480
  m_freePtr = 0x7f3bf31ee850
  m_end = 0x7f3bf31ef000
  m_allocation = {
    pages = 0x7f3bf31df000
    size = 65536
  }
  m_refCount = 3
  m_destroy = false
  m_gcNumber = 0

This pool was almost full.  The m_refCount of 3 is interesting -- it wasn't even close to being freed.

This line was in the stack trace (#9):

    Foreground::delete_(regExpAllocator);

So it's a pool from the regExpAllocator.  That line is followed shortly by this line:
 
   Foreground::delete_(jaegerCompartment);

I can imagine that there might be some references from jaegerCompartment to pools allocated by the regExpAllocator, and because jaegerCompartment is destroyed second, this might explain why we have reference hanging around.  dvander, cdleary, does that sound plausible?  If so, it's probably best just to remove this assertion.
(In reply to comment #7)
> 
> I can imagine that there might be some references from jaegerCompartment to
> pools allocated by the regExpAllocator

If that's the case, the two ExecutableAllocators should be merged.  That's slightly tricky because one is guarded by |#if ENABLE_YARR_JIT} and the other is guarded by |#ifdef JS_METHODJIT| but it should be doable.

That would make about:memory's mjit-code reporter more accurate too -- currently it doesn't take regExpAllocator into account.
I got exactly the same assertion on the TI branch in bug 656753. Test case is attached there.
(In reply to comment #8)
> > I can imagine that there might be some references from jaegerCompartment to
> > pools allocated by the regExpAllocator
> 
> If that's the case, the two ExecutableAllocators should be merged.

Yes! I added YARR before the methodjit landed, so we ended up with two executable allocator instances. I vote for unify!
(In reply to comment #7)
> I can imagine that there might be some references from jaegerCompartment to
> pools allocated by the regExpAllocator, and because jaegerCompartment is
> destroyed second, this might explain why we have reference hanging around. 
> dvander, cdleary, does that sound plausible?

Not sure, will take assignment and check it out, if you don't mind.
Assignee: general → cdleary
Status: NEW → ASSIGNED
(In reply to comment #9)
> I got exactly the same assertion on the TI branch in bug 656753. Test case
> is attached there.

Hmm, the backtrace from the TI branch doesn't fault on the regExpAllocator, so we're still looking for a reproducible test case for this crash. We should unify the executable allocators in general, but I don't think that's causing this bug. The regexp allocator only gets used by YARR internals to make the guts of regexp objects, so I can't imagine there being any cross-referencing going on here.

I'm going to un-assign myself since there's no reproducible test case ATM.
Assignee: cdleary → general
Status: ASSIGNED → NEW
Fixing bug 657164 should make this much more reproducible.  I fixed this in the TI branch last night and now all the tinderbox XPCShellTests are faulting when the regexpAllocator gets deleted.
cdleary, I'm reassigning this to you in light of comment 13.
Assignee: general → cdleary
(In reply to comment #13)

I actually created a patch this weekend that checked whether all regexps had refcounts that dropped to zero. It failed a bunch of XPCShell tests with global consts that were held in XPConnect via BackstagePasses. I'll post what I found when I get back to my laptop, which I'd left at home.
So it sounds like the issue is not that we have two ExecutableAllocators, but simply that some of the refs to pools from the regExpAllocator aren't being released, ie. just a simple pool leak?  Cool, looks like adding that assertion was worthwhile :)
I think bug 625600 is going to change the code involved with this bug.
Talked it over with Dave, bug 625600 is be independent of the refcounting issues we're seeing here.
With bug 657164 landing, we're also getting this assertion failure in xcpshell tests:

Assertion failure: m_refCount == 1, at /builds/slave/tm-lnx64-dbg/build/js/src/assembler/jit/ExecutableAllocator.h:117

The assertion has been temporarily disabled to avoid the test failures.
I suspect the two failures have the same root cause, so I'm filing it here.
No longer blocks: mlk-fx5+
Whiteboard: [MemShrink:P3]
How's this going?
I can no longer reproduce the assertion in crash automation with the urls in comment 5
I haven't seen this assertion in a long time.
> I haven't seen this assertion in a long time.

That's because the assertion was disabled -- see comment 19.
I just tried re-enabling the assertion and ran the xpcshell tests on try and they all passed.  So we've lost our reproducibility :(
Mass-reassigning cdleary's bugs to default. He won't work on any of them, anymore. I guess, at least.

@cdleary: shout if you take issue with this.
Assignee: cdleary → general
This is reproducible on trunk by uncommenting the asserts and running jit-tests.py with --no-baseline. Failures include basic/bug599854.js, basic/bug613399.js, and basic/627609.js.
(In reply to Dan Gohman [:sunfish] from comment #27)
> This is reproducible on trunk by uncommenting the asserts and running
> jit-tests.py with --no-baseline. Failures include basic/bug599854.js,
> basic/bug613399.js, and basic/627609.js.

What is the backtrace ?!  --no-baseline will disable all JIT from my understanding, as we expect to have compiled a function with baseline before going into Ion.
Oh, the new failure is not the assert itself, but the call to m_pools.empty() from within the assert. m_pools is a HashSet, which has an "uninitialized" state in which it's not even valid to call empty(). If no pools are created, it never gets initialized. Changing the assert to JS_ASSERT_IF(m_pools.initialized(), m_pools.empty()) may fix it.
Attached patch bug654820.patchSplinter Review
This patch re-enables the assert and updates it to check m_pools.initialized(). It works for me, for comment 25, and for some try runs. If there is still a bug here, re-enabling the assert is probably the best way to find it.
Attachment #8337737 - Flags: review?(n.nethercote)
Comment on attachment 8337737 [details] [diff] [review]
bug654820.patch

Review of attachment 8337737 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8337737 - Flags: review?(n.nethercote) → review+
https://hg.mozilla.org/mozilla-central/rev/887bcaf9de75
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Assignee: general → sunfish
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: