###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0'

RESOLVED FIXED in mozilla6

Status

()

Core
XPCOM
RESOLVED FIXED
7 years ago
5 years ago

People

(Reporter: azakai, Assigned: Ehsan)

Tracking

({intermittent-failure})

Trunk
mozilla6
All
Linux
intermittent-failure
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

7 years ago
Not sure where to put this bug. Suspect it might be related to bug 614472.


REFTEST TEST-PASS(EXPECTED RANDOM) | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/String/regress-305064.js | ZERO WIDTH SPACE (category Cf):"a\u200B\u200B\u200B".trimRight()  item 395
REFTEST TEST-PASS(EXPECTED RANDOM) | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/String/regress-305064.js | ZERO WIDTH SPACE (category Cf):"\u200B\u200B\u200Ba\u200B\u200B\u200B".trimRight()  item 396
REFTEST INFO | Loading a blank page
++DOMWINDOW == 79 (0xac380c04) [serial = 5685] [outer = 0xa2f7328]
REFTEST TEST-START | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/strict/8.7.2.js
++DOMWINDOW == 80 (0xafee4304) [serial = 5686] [outer = 0xa2f7328]
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012: file /builds/slave/mozilla-central-linux-debug/build/content/base/src/nsScriptLoader.cpp, line 340
###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
PL_DHashTableOperate [pldhash.c:707]
nsTHashtable<nsBaseHashtableET<nsIDHashKey, xptiInterfaceEntry*> >::GetEntry [nsTHashtable.h:170]
nsBaseHashtable<nsIDHashKey, xptiInterfaceEntry*, xptiInterfaceEntry*>::Get [nsBaseHashtable.h:148]
xptiInterfaceInfoManager::GetInterfaceEntryForIID [xpcom/reflect/xptinfo/src/xptiInterfaceInfoManager.cpp:316]
NS_GetXPTCallStub_P [xpcom/reflect/xptcall/src/xptcall.cpp:78]
nsAutoXPTCStub::InitStub [nsXPTCUtils.h:59]
nsProxyEventObject::nsProxyEventObject [xpcom/proxy/src/nsProxyEventObject.cpp:65]
nsProxyObject::LockedFind [xpcom/proxy/src/nsProxyEvent.cpp:459]
nsProxyObjectManager::GetProxyForObject [xpcom/proxy/src/nsProxyObjectManager.cpp:265]
NS_GetProxyForObject [xpcom/proxy/src/nsProxyObjectManager.cpp:353]
nsThreadPool::ShutdownThread [xpcom/threads/nsThreadPool.cpp:146]
nsThreadPool::Run [xpcom/threads/nsThreadPool.cpp:233]
nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:626]
NS_ProcessNextEvent_P [nsThreadUtils.cpp:250]
nsThread::ThreadFunc [xpcom/threads/nsThread.cpp:277]
_pt_root [nsprpub/pr/src/pthreads/ptthread.c:190]
libpthread.so.0 + 0x5ab5
(Reporter)

Updated

7 years ago
Whiteboard: [orange]

Updated

7 years ago
Blocks: 438871
It is entirely unrelated to bug 614472. This assertion usually means that we are modifying the XPTI hashtable (mIIDTable) in a thread-unsafe way. From the stack, it looks like we are shutting down a thread (not sure which one, though) and there is a simultaneous modification of the IID hash, which should only ever occur on the main thread, and only during startup.

Was the browser shutting down at the time?
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1290719952.1290721605.15582.gz&fulltext=1#err0

The browser shouldn't have been shutting down when the above jsreftest run hit it -- still much more of that test run to go.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1292813375.1292814573.13273.gz
Rev3 MacOSX Leopard 10.5.8 mozilla-central debug test xpcshell on 2010/12/19 18:49:35 

In http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/test_bug470716.js and near as I can tell partway through the 2nd of 13 rounds, not shutting down.
Brendan, what are the conditions where this assertion fires? The invariants I'm currently using are:

* the table is only ever modified on the main thread (during startup/registration)
* multiple threads may enumerate or get entries at any time (not simultaneous with startup/registration)

So it's possible that this GetEntry is happening simultaneously with an enumeration on the main thread, but I thought that was "safe".
dbaron added the recursion level checking, so cc'ing him in case he spots something more than this: it is not sound safe to have ADDs and Enumerates racing.

/be
(In reply to comment #4)
> Brendan, what are the conditions where this assertion fires? The invariants I'm
> currently using are:
> 
> * the table is only ever modified on the main thread (during
> startup/registration)
> * multiple threads may enumerate or get entries at any time (not simultaneous
> with startup/registration)

If it's always the line 707 assertion, that's consistent with PL_DHASH_LOOKUPs racing on multiple threads, and causing the recursion level of the table to become incorrect due to the races.

We added PL_DHashMarkTableImmutable for cases like this, I think.  Is it true that you never do multi-thread lookups until the table is complete?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296073318.1296074693.31531.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/01/26 12:21:58
s: talos-r3-fed64-028
Duplicate of this bug: 629861
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1296615984.1296619183.7644.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/01 19:06:24
s: talos-r3-fed64-048

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/RegExp/7.8.5-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1296630544.1296632283.26751.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/01 23:09:04
s: talos-r3-fed64-020

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Types/8.12.5-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296685727.1296687263.10127.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/02 14:28:47
s: talos-r3-fed64-023

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Array/length-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296717965.1296719324.3243.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/02 23:26:05
s: talos-r3-fed64-037

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Expressions/11.1.5-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296736174.1296737641.11505.gz
s: talos-r3-fed64-043
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296833453.1296835444.15418.gz
Rev3 WINNT 6.1 mozilla-central debug test jsreftest on 2011/02/04 07:30:53
s: talos-r3-w7-007

REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/decompilation/regress-352613-02.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296838422.1296839962.826.gzRev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/04 08:53:42
s: talos-r3-fed64-006

###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296852904.1296854426.32606.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296856270.1296857671.14649.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 13:51:10
s: talos-r3-fed64-006

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/extensions/10.1.3-2.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296860536.1296861958.1692.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest

REFTEST TEST-START | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js
###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296878265.1296879688.11191.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 19:57:45
s: talos-r3-fed64-046

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_5/Expressions/regress-192288.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296880756.1296882149.19305.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 20:39:16
s: talos-r3-fed64-053

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8_1/jit/math-jit-tests.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1297057170.1297058733.8856.gz
Rev3 Fedora 12 tryserver debug test xpcshell on 2011/02/06 21:39:30
s: talos-r3-fed-041

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297134899.1297136425.13742.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/07 19:14:59
s: talos-r3-fed64-034

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297156873.1297158345.11228.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 01:21:13
s: talos-r3-fed64-042

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297206496.1297207977.15036.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 15:08:16
s: talos-r3-fed64-022

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/misc/global-numeric-properties.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297208050.1297209516.22353.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 15:34:10
s: talos-r3-fed64-036

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297208052.1297209719.23637.gz
Rev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/08 15:34:12
s: talos-r3-fed64-026

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_login.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_login.js | application crashed (minidump found)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297210475.1297212169.3434.gz
Rev3 Fedora 12 mozilla-central debug test xpcshell on 2011/02/08 16:14:35
s: talos-r3-fed-020

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_corrupt_keys.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_corrupt_keys.js | application crashed (minidump found)
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297216452.1297218041.30245.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/08 17:54:12
s: talos-r3-fed64-019

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/Array/filter.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297286836.1297288239.22244.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/09 13:27:16
s: talos-r3-fed64-026

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8/decompilation/regress-260106.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297306574.1297308068.19960.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/09 18:56:14
s: talos-r3-fed64-034

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297323019.1297324471.19516.gz
s: talos-r3-fed64-045
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_1/regress/function-001.js | assertion count 1 is more than expected 0 assertions
I'm not sure it's always true, but I'm pretty sure that this is true:

* We read XPTs on the main thread, there are no other threads accessing the table
* We start the extension manager, which may do some networking, loading background threads and PSM which may use XPCOM proxies which reads the XPT table
* All the networking stops, and then we register extension XPTs (table is modified)
* After this point I believe the table is immutable. Technically calls to nsIComponentManager.autoRegister could cause mutations, but I think we can probably just say that you can't register XPTs after startup.
What I'd really like, of course, is for XPCOM proxies to die and this to be truly main-thread only, or at least for XPCOM proxies to prefetch the xptinfo they need on the main thread before continuing. But that's not realistic now.
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/10 09:08:00
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297357680.1297359135.24493.gz

Updated

6 years ago
Blocks: 627985
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297400718.1297402118.14349.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/10 21:05:18
s: talos-r3-fed64-038

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297524398.1297525851.18277.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/12 07:26:38
s: talos-r3-fed64-045

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Function/10.2.1.1.6.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297546154.1297547575.26507.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/12 13:29:14
s: talos-r3-fed64-033

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/String/15.5.4.2.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297793888.1297795108.8268.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/15 10:18:08
s: talos-r3-fed64-011

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8_5/extensions/typedarray.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297875556.1297876786.17164.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/16 08:59:16
s: talos-r3-fed64-053

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Object/15.2.3.3-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297927630.1297928774.26067.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/16 23:27:10
s: talos-r3-fed64-054

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298023268.1298024391.11095.gz
s: talos-r3-fed64-005
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298062670.1298063922.14759.gz&fulltext=1
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/18 12:57:50
s: talos-r3-fed64-014

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Date/15.9.1.2-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298151090.1298152440.12277.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298263016.1298264270.27220.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/20 20:36:56
s: talos-r3-fed64-005

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/Array/filter.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298305950.1298307130.10027.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/21 08:32:30
s: talos-r3-fed64-043

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Date/15.9.1.2-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Build-System/1298307978.1298309151.17421.gz
Rev3 Fedora 12x64 build-system debug test jsreftest on 2011/02/21 09:06:18
s: talos-r3-fed64-042

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Object/8.6.1-01.js | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298331952.1298335342.20410.gz
Rev3 WINNT 6.1 mozilla-central debug test xpcshell on 2011/02/21 15:45:52
s: talos-r3-w7-027
TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_load_modules.js | test failed (with xpcshell return code: -2147483645), see following log:
PROCESS-CRASH | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_load_modules.js | application crashed (minidump found)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298871248.1298872763.25723.gz
Rev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/27 21:34:08
s: talos-r3-fed64-024

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_detect_upgrade.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1299565906.1299567324.9710.gz&fulltext=1
Rev3 Fedora 12x64 mozilla-central debug test reftest on 2011/03/07 22:31:46
s: talos-r3-fed64-044

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/box-properties/CSS21-t100303.xhtml | assertion count 1 is more than expected 0 assertions
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1299700627.1299698642.18674.gz
(In reply to comment #32)
> * After this point I believe the table is immutable. Technically calls to
> nsIComponentManager.autoRegister could cause mutations, but I think we can
> probably just say that you can't register XPTs after startup.

Hmm, we do have a bunch of tests which either call into autoRegister directly after startup, or do other things (such as calling checkForNewChrome for example) which might trigger the modification of this table indirectly.  So is that a safe assumption to make?
Created attachment 522919 [details] [diff] [review]
protect the tables with a lock

Here's an attempt to fix this and bug 627985 by protecting the mNameTable and mIIDTable with a single lock.  This lock also protects what mInfoMonitor used to protect.

The basic idea of the patch is as follows:
 * add mTableLock
 * remove unused mAutoRegLock
 * replace mInfoMonitor with mTableLock
 * all access to mNameTable and mIIDTable, plus writes to the linkage between xptiInterfaceInfo and xptiInterfaceEntry (which is only changed when nobody on the outside has a reference to the object) must be protected by mTableLock.  No particular reason to use the same lock except that the two things tend to happen at the same time.
 * xptiInterfaceInfo::Release has to avoid holding the lock across its destructor (and release of mParent).  (Alternatively, could have used a Monitor like mInfoMonitor, but didn't seem worth it.)

This is all-green on try so far, though I'm still waiting on builds:
http://tbpl.mozilla.org/?tree=MozillaTry&rev=f5c8ffd82371

Thoughts?
Attachment #522919 - Flags: review?(benjamin)
Comment on attachment 522919 [details] [diff] [review]
protect the tables with a lock

This is so sad. I hope this doesn't affect startup time too much.
Attachment #522919 - Flags: review?(benjamin) → review+

Updated

6 years ago
Duplicate of this bug: 646395
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1301895856.1301897596.15113.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Cedar/1302117022.1302118201.27572.gz&fulltext=1
I tried to land this patch, but it fails to apply due to the monitor interface changes in a way which makes me very uncomfortable with trying to unbitrot it.

I think this should land on m-c, preferably as a single changeset in order for us to be able to watch the perf numbers...
Assignee: nobody → dbaron
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302142349.1302143653.7076.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test jsreftest on 2011/04/06 19:12:29
s: talos-r3-snow-028

REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=e4x/GC/regress-280844-1.js | assertion count 1 is more than expected 0 assertions
Merged version of the patch is:

http://hg.mozilla.org/users/dbaron_mozilla.com/patches/raw-file/d3c8915ffd44/xpt-lock
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1302398069.1302400960.30097.gz
Rev3 WINNT 6.1 tracemonkey debug test xpcshell on 2011/04/09 18:14:29
s: talos-r3-w7-038

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_service_verifyLogin.js | test failed (with xpcshell return code: -2147483645), see following log:
http://hg.mozilla.org/projects/cedar/rev/c509d8f8f423
Whiteboard: [orange] → [orange][fixed-in-cedar]
Seems like an oddly-specific failure mode, but Linux64 opt reftest hung during shutdown, so I retriggered it, and it hung on a different slave, so I retriggered it, and it hung on a third slave.
(In reply to comment #62)
> Seems like an oddly-specific failure mode, but Linux64 opt reftest hung during
> shutdown, so I retriggered it, and it hung on a different slave, so I
> retriggered it, and it hung on a third slave.

To verify my sanity, I pushed some other (unrelated) changes, and the timeout happened again, so I had to back this out: http://hg.mozilla.org/projects/cedar/rev/db98c4f46347)

http://tinderbox.mozilla.org/showlog.cgi?log=Cedar/1302475596.1302476652.6413.gz&fulltext=1
Whiteboard: [orange][fixed-in-cedar] → [orange]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302543476.1302545089.20225.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302678513.1302680057.17727.gz

s: talos-r3-fed64-020
TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
Thread 0 (crashed)

###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
Did we try making the Mutex a Monitor instead so we can reenter it?  /me notes that xptiInterfaceInfo's Release is awful ...
I tried to reproduce this locally, but both my local build, and the tinderbox builds I downloaded shut down cleanly at the end of the reftest suite...

(In reply to comment #66)
> Did we try making the Mutex a Monitor instead so we can reenter it?

What do you mean?  Isn't that what we're trying to move away from?
(In reply to comment #67)
> I tried to reproduce this locally, but both my local build, and the tinderbox
> builds I downloaded shut down cleanly at the end of the reftest suite...
> 
> (In reply to comment #66)
> > Did we try making the Mutex a Monitor instead so we can reenter it?
> 
> What do you mean?  Isn't that what we're trying to move away from?


(In reply to comment #52)
>  * xptiInterfaceInfo::Release has to avoid holding the lock across its
> destructor (and release of mParent).  (Alternatively, could have used a Monitor
> like mInfoMonitor, but didn't seem worth it.)

Perhaps it is worth it now?
(In reply to comment #68)
> (In reply to comment #52)
> >  * xptiInterfaceInfo::Release has to avoid holding the lock across its
> > destructor (and release of mParent).  (Alternatively, could have used a Monitor
> > like mInfoMonitor, but didn't seem worth it.)
> 
> Perhaps it is worth it now?

Ah, I see what you mean.  I will give that a shot.
Created attachment 526178 [details] [diff] [review]
Part 2: Use the info monitor for xptiInterfaceInfo::Release

Kyle's suggestion worked out really well, and the try server seems to be happy with this.  This patch basically restores the info monitor for xptiInterfaceInfo::Release, and it's supposed to be applied on top of the other patch landed as part of this bug.
Attachment #526178 - Flags: review?(dbaron)
Comment on attachment 526178 [details] [diff] [review]
Part 2: Use the info monitor for xptiInterfaceInfo::Release

Sorry for the delay -- meant to get to this sooner.

This doesn't make sense, since the patch above turned most of the other uses of the info monitor into uses of the table lock.  You can't split those uses from this one and lock the same data structure with two different locks.

You could, however, just turn the table lock into a monitor.  It probably makes the most sense to do that by revising the existing patch, though.
Attachment #526178 - Flags: review?(dbaron) → review-
Created attachment 526853 [details] [diff] [review]
Patch (v2)

Good point.  This one should make more sense, I guess.
Assignee: dbaron → ehsan
Attachment #522919 - Attachment is obsolete: true
Attachment #526178 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #526853 - Flags: review?(dbaron)
Comment on attachment 526853 [details] [diff] [review]
Patch (v2)

Looks good, mostly, except you can now drop the changes I made to xptiInterfaceInfo::Release and just replace them with the obvious conversion from one monitor to another.  r=dbaron with that
Attachment #526853 - Flags: review?(dbaron) → review+
Heh, sorry I missed that!

I've pushed an updated version of the patch to the try server.  I will land it tomorrow if everything goes fine.
Created attachment 526868 [details] [diff] [review]
For landing
Attachment #526853 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/0680c776e806
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla6
Hmm, I guess this isn't completely fixed :(

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303327169.1303327588.32081.gz
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #77)
> Hmm, I guess this isn't completely fixed :(
> 
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303327169.1303327588.32081.gz

That's another bug, sorry for the noise.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago6 years ago
Resolution: --- → FIXED
Keywords: intermittent-failure
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.