Last Comment Bug 614480 - ###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0'
: ###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0'
Status: RESOLVED FIXED
: intermittent-failure
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: All Linux
: -- normal (vote)
: mozilla6
Assigned To: :Ehsan Akhgari
:
: Nathan Froyd [:froydnj]
Mentors:
http://tinderbox.mozilla.org/showlog....
: 629861 646395 (view as bug list)
Depends on:
Blocks: 438871 627985
  Show dependency treegraph
 
Reported: 2010-11-23 21:09 PST by Alon Zakai (:azakai)
Modified: 2012-11-25 19:31 PST (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
protect the tables with a lock (17.50 KB, patch)
2011-03-29 21:49 PDT, David Baron :dbaron: ⌚️UTC-10
benjamin: review+
Details | Diff | Splinter Review
Part 2: Use the info monitor for xptiInterfaceInfo::Release (5.27 KB, patch)
2011-04-14 20:43 PDT, :Ehsan Akhgari
dbaron: review-
Details | Diff | Splinter Review
Patch (v2) (16.25 KB, patch)
2011-04-18 15:42 PDT, :Ehsan Akhgari
dbaron: review+
Details | Diff | Splinter Review
For landing (15.09 KB, patch)
2011-04-18 16:30 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review

Description Alon Zakai (:azakai) 2010-11-23 21:09:11 PST
Not sure where to put this bug. Suspect it might be related to bug 614472.


REFTEST TEST-PASS(EXPECTED RANDOM) | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/String/regress-305064.js | ZERO WIDTH SPACE (category Cf):"a\u200B\u200B\u200B".trimRight()  item 395
REFTEST TEST-PASS(EXPECTED RANDOM) | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/String/regress-305064.js | ZERO WIDTH SPACE (category Cf):"\u200B\u200B\u200Ba\u200B\u200B\u200B".trimRight()  item 396
REFTEST INFO | Loading a blank page
++DOMWINDOW == 79 (0xac380c04) [serial = 5685] [outer = 0xa2f7328]
REFTEST TEST-START | file:///home/cltbld/talos-slave/mozilla-central_fedora-debug_test-jsreftest/build/jsreftest/tests/jsreftest.html?test=js1_8_1/strict/8.7.2.js
++DOMWINDOW == 80 (0xafee4304) [serial = 5686] [outer = 0xa2f7328]
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012: file /builds/slave/mozilla-central-linux-debug/build/content/base/src/nsScriptLoader.cpp, line 340
###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
PL_DHashTableOperate [pldhash.c:707]
nsTHashtable<nsBaseHashtableET<nsIDHashKey, xptiInterfaceEntry*> >::GetEntry [nsTHashtable.h:170]
nsBaseHashtable<nsIDHashKey, xptiInterfaceEntry*, xptiInterfaceEntry*>::Get [nsBaseHashtable.h:148]
xptiInterfaceInfoManager::GetInterfaceEntryForIID [xpcom/reflect/xptinfo/src/xptiInterfaceInfoManager.cpp:316]
NS_GetXPTCallStub_P [xpcom/reflect/xptcall/src/xptcall.cpp:78]
nsAutoXPTCStub::InitStub [nsXPTCUtils.h:59]
nsProxyEventObject::nsProxyEventObject [xpcom/proxy/src/nsProxyEventObject.cpp:65]
nsProxyObject::LockedFind [xpcom/proxy/src/nsProxyEvent.cpp:459]
nsProxyObjectManager::GetProxyForObject [xpcom/proxy/src/nsProxyObjectManager.cpp:265]
NS_GetProxyForObject [xpcom/proxy/src/nsProxyObjectManager.cpp:353]
nsThreadPool::ShutdownThread [xpcom/threads/nsThreadPool.cpp:146]
nsThreadPool::Run [xpcom/threads/nsThreadPool.cpp:233]
nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:626]
NS_ProcessNextEvent_P [nsThreadUtils.cpp:250]
nsThread::ThreadFunc [xpcom/threads/nsThread.cpp:277]
_pt_root [nsprpub/pr/src/pthreads/ptthread.c:190]
libpthread.so.0 + 0x5ab5
Comment 1 Benjamin Smedberg [:bsmedberg] 2010-11-24 06:01:57 PST
It is entirely unrelated to bug 614472. This assertion usually means that we are modifying the XPTI hashtable (mIIDTable) in a thread-unsafe way. From the stack, it looks like we are shutting down a thread (not sure which one, though) and there is a simultaneous modification of the IID hash, which should only ever occur on the main thread, and only during startup.

Was the browser shutting down at the time?
Comment 2 Jeff Walden [:Waldo] (remove +bmo to email) 2010-11-25 14:38:19 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1290719952.1290721605.15582.gz&fulltext=1#err0

The browser shouldn't have been shutting down when the above jsreftest run hit it -- still much more of that test run to go.
Comment 3 Phil Ringnalda (:philor) 2010-12-19 19:27:34 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1292813375.1292814573.13273.gz
Rev3 MacOSX Leopard 10.5.8 mozilla-central debug test xpcshell on 2010/12/19 18:49:35 

In http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/test_bug470716.js and near as I can tell partway through the 2nd of 13 rounds, not shutting down.
Comment 4 Benjamin Smedberg [:bsmedberg] 2010-12-20 06:33:14 PST
Brendan, what are the conditions where this assertion fires? The invariants I'm currently using are:

* the table is only ever modified on the main thread (during startup/registration)
* multiple threads may enumerate or get entries at any time (not simultaneous with startup/registration)

So it's possible that this GetEntry is happening simultaneously with an enumeration on the main thread, but I thought that was "safe".
Comment 5 Brendan Eich [:brendan] 2010-12-20 16:48:53 PST
dbaron added the recursion level checking, so cc'ing him in case he spots something more than this: it is not sound safe to have ADDs and Enumerates racing.

/be
Comment 6 David Baron :dbaron: ⌚️UTC-10 2010-12-20 17:04:16 PST
(In reply to comment #4)
> Brendan, what are the conditions where this assertion fires? The invariants I'm
> currently using are:
> 
> * the table is only ever modified on the main thread (during
> startup/registration)
> * multiple threads may enumerate or get entries at any time (not simultaneous
> with startup/registration)

If it's always the line 707 assertion, that's consistent with PL_DHASH_LOOKUPs racing on multiple threads, and causing the recursion level of the table to become incorrect due to the races.

We added PL_DHashMarkTableImmutable for cases like this, I think.  Is it true that you never do multi-thread lookups until the table is complete?
Comment 7 Daniel Holbert [:dholbert] 2011-01-26 12:51:01 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296073318.1296074693.31531.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/01/26 12:21:58
s: talos-r3-fed64-028
Comment 8 Phil Ringnalda (:philor) 2011-02-01 20:48:08 PST
*** Bug 629861 has been marked as a duplicate of this bug. ***
Comment 9 Phil Ringnalda (:philor) 2011-02-01 20:48:51 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1296615984.1296619183.7644.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/01 19:06:24
s: talos-r3-fed64-048

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/RegExp/7.8.5-01.js | assertion count 1 is more than expected 0 assertions
Comment 10 Phil Ringnalda (:philor) 2011-02-02 07:40:06 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1296630544.1296632283.26751.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/01 23:09:04
s: talos-r3-fed64-020

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Types/8.12.5-01.js | assertion count 1 is more than expected 0 assertions
Comment 11 Phil Ringnalda (:philor) 2011-02-02 15:38:50 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296685727.1296687263.10127.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/02 14:28:47
s: talos-r3-fed64-023

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Array/length-01.js | assertion count 1 is more than expected 0 assertions
Comment 12 Phil Ringnalda (:philor) 2011-02-02 23:57:49 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296717965.1296719324.3243.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/02 23:26:05
s: talos-r3-fed64-037

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Expressions/11.1.5-01.js | assertion count 1 is more than expected 0 assertions
Comment 13 Jonathan Kew (:jfkthame) 2011-02-03 05:16:04 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296736174.1296737641.11505.gz
s: talos-r3-fed64-043
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
Comment 14 Phil Ringnalda (:philor) 2011-02-04 08:15:56 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296833453.1296835444.15418.gz
Rev3 WINNT 6.1 mozilla-central debug test jsreftest on 2011/02/04 07:30:53
s: talos-r3-w7-007

REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/decompilation/regress-352613-02.js | assertion count 1 is more than expected 0 assertions
Comment 15 Phil Ringnalda (:philor) 2011-02-04 09:56:57 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296838422.1296839962.826.gzRev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/04 08:53:42
s: talos-r3-fed64-006

###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
Comment 17 Phil Ringnalda (:philor) 2011-02-04 14:47:36 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296856270.1296857671.14649.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 13:51:10
s: talos-r3-fed64-006

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/extensions/10.1.3-2.js | assertion count 1 is more than expected 0 assertions
Comment 18 David Baron :dbaron: ⌚️UTC-10 2011-02-04 15:44:44 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296860536.1296861958.1692.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest

REFTEST TEST-START | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js
###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
Comment 19 Phil Ringnalda (:philor) 2011-02-04 20:27:36 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296878265.1296879688.11191.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 19:57:45
s: talos-r3-fed64-046

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_5/Expressions/regress-192288.js | assertion count 1 is more than expected 0 assertions
Comment 20 Phil Ringnalda (:philor) 2011-02-04 21:37:08 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296880756.1296882149.19305.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/04 20:39:16
s: talos-r3-fed64-053

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8_1/jit/math-jit-tests.js | assertion count 1 is more than expected 0 assertions
Comment 21 Phil Ringnalda (:philor) 2011-02-07 00:13:43 PST
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1297057170.1297058733.8856.gz
Rev3 Fedora 12 tryserver debug test xpcshell on 2011/02/06 21:39:30
s: talos-r3-fed-041

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
Comment 22 Phil Ringnalda (:philor) 2011-02-07 19:50:26 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297134899.1297136425.13742.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/07 19:14:59
s: talos-r3-fed64-034

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
Comment 23 Phil Ringnalda (:philor) 2011-02-08 07:29:04 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297156873.1297158345.11228.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 01:21:13
s: talos-r3-fed64-042

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
Comment 24 Phil Ringnalda (:philor) 2011-02-08 16:00:59 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297206496.1297207977.15036.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 15:08:16
s: talos-r3-fed64-022

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/misc/global-numeric-properties.js | assertion count 1 is more than expected 0 assertions
Comment 25 Phil Ringnalda (:philor) 2011-02-08 16:05:25 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297208050.1297209516.22353.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/08 15:34:10
s: talos-r3-fed64-036

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
Comment 26 Phil Ringnalda (:philor) 2011-02-08 16:07:21 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297208052.1297209719.23637.gz
Rev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/08 15:34:12
s: talos-r3-fed64-026

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_login.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_login.js | application crashed (minidump found)
Comment 27 Phil Ringnalda (:philor) 2011-02-08 16:50:38 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297210475.1297212169.3434.gz
Rev3 Fedora 12 mozilla-central debug test xpcshell on 2011/02/08 16:14:35
s: talos-r3-fed-020

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_corrupt_keys.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_corrupt_keys.js | application crashed (minidump found)
Comment 28 Phil Ringnalda (:philor) 2011-02-08 18:50:53 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297216452.1297218041.30245.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/08 17:54:12
s: talos-r3-fed64-019

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/Array/filter.js | assertion count 1 is more than expected 0 assertions
Comment 29 Phil Ringnalda (:philor) 2011-02-09 14:01:57 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297286836.1297288239.22244.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/09 13:27:16
s: talos-r3-fed64-026

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8/decompilation/regress-260106.js | assertion count 1 is more than expected 0 assertions
Comment 30 Phil Ringnalda (:philor) 2011-02-09 19:34:23 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297306574.1297308068.19960.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/09 18:56:14
s: talos-r3-fed64-034

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
Comment 31 Jonathan Kew (:jfkthame) 2011-02-10 00:07:20 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297323019.1297324471.19516.gz
s: talos-r3-fed64-045
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_1/regress/function-001.js | assertion count 1 is more than expected 0 assertions
Comment 32 Benjamin Smedberg [:bsmedberg] 2011-02-10 06:39:21 PST
I'm not sure it's always true, but I'm pretty sure that this is true:

* We read XPTs on the main thread, there are no other threads accessing the table
* We start the extension manager, which may do some networking, loading background threads and PSM which may use XPCOM proxies which reads the XPT table
* All the networking stops, and then we register extension XPTs (table is modified)
* After this point I believe the table is immutable. Technically calls to nsIComponentManager.autoRegister could cause mutations, but I think we can probably just say that you can't register XPTs after startup.
Comment 33 Benjamin Smedberg [:bsmedberg] 2011-02-10 06:40:18 PST
What I'd really like, of course, is for XPCOM proxies to die and this to be truly main-thread only, or at least for XPCOM proxies to prefetch the xptinfo they need on the main thread before continuing. But that's not realistic now.
Comment 34 Robert Strong [:rstrong] (use needinfo to contact me) 2011-02-10 10:16:14 PST
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/10 09:08:00
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297357680.1297359135.24493.gz
Comment 35 Phil Ringnalda (:philor) 2011-02-10 21:55:47 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297400718.1297402118.14349.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/10 21:05:18
s: talos-r3-fed64-038

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
Comment 36 Phil Ringnalda (:philor) 2011-02-12 09:51:53 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297524398.1297525851.18277.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/12 07:26:38
s: talos-r3-fed64-045

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Function/10.2.1.1.6.js | assertion count 1 is more than expected 0 assertions
Comment 37 Phil Ringnalda (:philor) 2011-02-12 13:56:24 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297546154.1297547575.26507.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/12 13:29:14
s: talos-r3-fed64-033

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/String/15.5.4.2.js | assertion count 1 is more than expected 0 assertions
Comment 38 Phil Ringnalda (:philor) 2011-02-15 10:43:55 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297793888.1297795108.8268.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/15 10:18:08
s: talos-r3-fed64-011

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_8_5/extensions/typedarray.js | assertion count 1 is more than expected 0 assertions
Comment 39 Phil Ringnalda (:philor) 2011-02-16 09:23:35 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297875556.1297876786.17164.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/16 08:59:16
s: talos-r3-fed64-053

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Object/15.2.3.3-01.js | assertion count 1 is more than expected 0 assertions
Comment 40 Daniel Holbert [:dholbert] 2011-02-17 00:42:50 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297927630.1297928774.26067.gz
Rev3 Fedora 12x64 mozilla-central debug test jsreftest on 2011/02/16 23:27:10
s: talos-r3-fed64-054

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_4/Eval/eval-001.js | assertion count 1 is more than expected 0 assertions
Comment 41 Jonathan Kew (:jfkthame) 2011-02-18 02:34:04 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298023268.1298024391.11095.gz
s: talos-r3-fed64-005
REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/Global/parseInt-01.js | assertion count 1 is more than expected 0 assertions
Comment 42 Phil Ringnalda (:philor) 2011-02-18 14:14:08 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298062670.1298063922.14759.gz&fulltext=1
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/18 12:57:50
s: talos-r3-fed64-014

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Date/15.9.1.2-01.js | assertion count 1 is more than expected 0 assertions
Comment 44 Phil Ringnalda (:philor) 2011-02-20 21:00:32 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298263016.1298264270.27220.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/20 20:36:56
s: talos-r3-fed64-005

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=js1_6/Array/filter.js | assertion count 1 is more than expected 0 assertions
Comment 45 Phil Ringnalda (:philor) 2011-02-21 09:27:28 PST
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298305950.1298307130.10027.gz
Rev3 Fedora 12x64 tracemonkey debug test jsreftest on 2011/02/21 08:32:30
s: talos-r3-fed64-043

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Date/15.9.1.2-01.js | assertion count 1 is more than expected 0 assertions
Comment 46 Phil Ringnalda (:philor) 2011-02-21 09:29:54 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Build-System/1298307978.1298309151.17421.gz
Rev3 Fedora 12x64 build-system debug test jsreftest on 2011/02/21 09:06:18
s: talos-r3-fed64-042

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_3/Object/8.6.1-01.js | assertion count 1 is more than expected 0 assertions
Comment 47 Phil Ringnalda (:philor) 2011-02-21 16:51:53 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298331952.1298335342.20410.gz
Rev3 WINNT 6.1 mozilla-central debug test xpcshell on 2011/02/21 15:45:52
s: talos-r3-w7-027
TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_load_modules.js | test failed (with xpcshell return code: -2147483645), see following log:
PROCESS-CRASH | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_load_modules.js | application crashed (minidump found)
Comment 48 Phil Ringnalda (:philor) 2011-02-27 22:03:21 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298871248.1298872763.25723.gz
Rev3 Fedora 12x64 mozilla-central debug test xpcshell on 2011/02/27 21:34:08
s: talos-r3-fed64-024

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_detect_upgrade.js | test failed (with xpcshell return code: 1), see following log:
Comment 49 Phil Ringnalda (:philor) 2011-03-07 23:20:51 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1299565906.1299567324.9710.gz&fulltext=1
Rev3 Fedora 12x64 mozilla-central debug test reftest on 2011/03/07 22:31:46
s: talos-r3-fed64-044

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/box-properties/CSS21-t100303.xhtml | assertion count 1 is more than expected 0 assertions
Comment 50 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-03-09 12:05:55 PST
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1299700627.1299698642.18674.gz
Comment 51 :Ehsan Akhgari 2011-03-26 12:09:41 PDT
(In reply to comment #32)
> * After this point I believe the table is immutable. Technically calls to
> nsIComponentManager.autoRegister could cause mutations, but I think we can
> probably just say that you can't register XPTs after startup.

Hmm, we do have a bunch of tests which either call into autoRegister directly after startup, or do other things (such as calling checkForNewChrome for example) which might trigger the modification of this table indirectly.  So is that a safe assumption to make?
Comment 52 David Baron :dbaron: ⌚️UTC-10 2011-03-29 21:49:02 PDT
Created attachment 522919 [details] [diff] [review]
protect the tables with a lock

Here's an attempt to fix this and bug 627985 by protecting the mNameTable and mIIDTable with a single lock.  This lock also protects what mInfoMonitor used to protect.

The basic idea of the patch is as follows:
 * add mTableLock
 * remove unused mAutoRegLock
 * replace mInfoMonitor with mTableLock
 * all access to mNameTable and mIIDTable, plus writes to the linkage between xptiInterfaceInfo and xptiInterfaceEntry (which is only changed when nobody on the outside has a reference to the object) must be protected by mTableLock.  No particular reason to use the same lock except that the two things tend to happen at the same time.
 * xptiInterfaceInfo::Release has to avoid holding the lock across its destructor (and release of mParent).  (Alternatively, could have used a Monitor like mInfoMonitor, but didn't seem worth it.)

This is all-green on try so far, though I'm still waiting on builds:
http://tbpl.mozilla.org/?tree=MozillaTry&rev=f5c8ffd82371

Thoughts?
Comment 53 Benjamin Smedberg [:bsmedberg] 2011-03-31 09:32:06 PDT
Comment on attachment 522919 [details] [diff] [review]
protect the tables with a lock

This is so sad. I hope this doesn't affect startup time too much.
Comment 54 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-04-02 11:32:21 PDT
*** Bug 646395 has been marked as a duplicate of this bug. ***
Comment 57 :Ehsan Akhgari 2011-04-06 13:45:59 PDT
I tried to land this patch, but it fails to apply due to the monitor interface changes in a way which makes me very uncomfortable with trying to unbitrot it.

I think this should land on m-c, preferably as a single changeset in order for us to be able to watch the perf numbers...
Comment 58 Phil Ringnalda (:philor) 2011-04-06 19:43:55 PDT
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302142349.1302143653.7076.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test jsreftest on 2011/04/06 19:12:29
s: talos-r3-snow-028

REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=e4x/GC/regress-280844-1.js | assertion count 1 is more than expected 0 assertions
Comment 59 David Baron :dbaron: ⌚️UTC-10 2011-04-06 22:08:24 PDT
Merged version of the patch is:

http://hg.mozilla.org/users/dbaron_mozilla.com/patches/raw-file/d3c8915ffd44/xpt-lock
Comment 60 Phil Ringnalda (:philor) 2011-04-09 19:17:33 PDT
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1302398069.1302400960.30097.gz
Rev3 WINNT 6.1 tracemonkey debug test xpcshell on 2011/04/09 18:14:29
s: talos-r3-w7-038

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\services\sync\tests\unit\test_service_verifyLogin.js | test failed (with xpcshell return code: -2147483645), see following log:
Comment 62 Phil Ringnalda (:philor) 2011-04-10 18:28:13 PDT
Seems like an oddly-specific failure mode, but Linux64 opt reftest hung during shutdown, so I retriggered it, and it hung on a different slave, so I retriggered it, and it hung on a third slave.
Comment 63 :Ehsan Akhgari 2011-04-10 23:08:43 PDT
(In reply to comment #62)
> Seems like an oddly-specific failure mode, but Linux64 opt reftest hung during
> shutdown, so I retriggered it, and it hung on a different slave, so I
> retriggered it, and it hung on a third slave.

To verify my sanity, I pushed some other (unrelated) changes, and the timeout happened again, so I had to back this out: http://hg.mozilla.org/projects/cedar/rev/db98c4f46347)

http://tinderbox.mozilla.org/showlog.cgi?log=Cedar/1302475596.1302476652.6413.gz&fulltext=1
Comment 65 Jonathan Kew (:jfkthame) 2011-04-13 00:50:55 PDT
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1302678513.1302680057.17727.gz

s: talos-r3-fed64-020
TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_sync_updateEnabledEngines.js | application crashed (minidump found)
Thread 0 (crashed)

###!!! ASSERTION: RECURSION_LEVEL(table_) > 0: 'RECURSION_LEVEL(table) > 0', file pldhash.c, line 707
Comment 66 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-04-13 21:26:58 PDT
Did we try making the Mutex a Monitor instead so we can reenter it?  /me notes that xptiInterfaceInfo's Release is awful ...
Comment 67 :Ehsan Akhgari 2011-04-14 12:16:54 PDT
I tried to reproduce this locally, but both my local build, and the tinderbox builds I downloaded shut down cleanly at the end of the reftest suite...

(In reply to comment #66)
> Did we try making the Mutex a Monitor instead so we can reenter it?

What do you mean?  Isn't that what we're trying to move away from?
Comment 68 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-04-14 14:07:00 PDT
(In reply to comment #67)
> I tried to reproduce this locally, but both my local build, and the tinderbox
> builds I downloaded shut down cleanly at the end of the reftest suite...
> 
> (In reply to comment #66)
> > Did we try making the Mutex a Monitor instead so we can reenter it?
> 
> What do you mean?  Isn't that what we're trying to move away from?


(In reply to comment #52)
>  * xptiInterfaceInfo::Release has to avoid holding the lock across its
> destructor (and release of mParent).  (Alternatively, could have used a Monitor
> like mInfoMonitor, but didn't seem worth it.)

Perhaps it is worth it now?
Comment 69 :Ehsan Akhgari 2011-04-14 16:39:28 PDT
(In reply to comment #68)
> (In reply to comment #52)
> >  * xptiInterfaceInfo::Release has to avoid holding the lock across its
> > destructor (and release of mParent).  (Alternatively, could have used a Monitor
> > like mInfoMonitor, but didn't seem worth it.)
> 
> Perhaps it is worth it now?

Ah, I see what you mean.  I will give that a shot.
Comment 70 :Ehsan Akhgari 2011-04-14 20:43:03 PDT
Created attachment 526178 [details] [diff] [review]
Part 2: Use the info monitor for xptiInterfaceInfo::Release

Kyle's suggestion worked out really well, and the try server seems to be happy with this.  This patch basically restores the info monitor for xptiInterfaceInfo::Release, and it's supposed to be applied on top of the other patch landed as part of this bug.
Comment 71 David Baron :dbaron: ⌚️UTC-10 2011-04-17 15:49:45 PDT
Comment on attachment 526178 [details] [diff] [review]
Part 2: Use the info monitor for xptiInterfaceInfo::Release

Sorry for the delay -- meant to get to this sooner.

This doesn't make sense, since the patch above turned most of the other uses of the info monitor into uses of the table lock.  You can't split those uses from this one and lock the same data structure with two different locks.

You could, however, just turn the table lock into a monitor.  It probably makes the most sense to do that by revising the existing patch, though.
Comment 72 :Ehsan Akhgari 2011-04-18 15:42:35 PDT
Created attachment 526853 [details] [diff] [review]
Patch (v2)

Good point.  This one should make more sense, I guess.
Comment 73 David Baron :dbaron: ⌚️UTC-10 2011-04-18 16:04:09 PDT
Comment on attachment 526853 [details] [diff] [review]
Patch (v2)

Looks good, mostly, except you can now drop the changes I made to xptiInterfaceInfo::Release and just replace them with the obvious conversion from one monitor to another.  r=dbaron with that
Comment 74 :Ehsan Akhgari 2011-04-18 16:26:48 PDT
Heh, sorry I missed that!

I've pushed an updated version of the patch to the try server.  I will land it tomorrow if everything goes fine.
Comment 75 :Ehsan Akhgari 2011-04-18 16:30:49 PDT
Created attachment 526868 [details] [diff] [review]
For landing
Comment 77 :Ehsan Akhgari 2011-04-20 13:31:10 PDT
Hmm, I guess this isn't completely fixed :(

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303327169.1303327588.32081.gz
Comment 78 :Ehsan Akhgari 2011-04-20 13:32:22 PDT
(In reply to comment #77)
> Hmm, I guess this isn't completely fixed :(
> 
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303327169.1303327588.32081.gz

That's another bug, sorry for the noise.

Note You need to log in before you can comment on or make changes to this bug.