Closed Bug 535585 Opened 15 years ago Closed 15 years ago

Windows: random tests failing with test failed (with xpcshell return code: 1)

Categories

(Core :: General, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1

People

(Reporter: benjamin, Assigned: ted)

References

Details

(Keywords: intermittent-failure)

On Windows we've been seeing increasing incidence of random testfail with "test failed (with xpcshell return code 1)". I think we should try to catch this in a record-and-replay debugger, because I don't think it's particularly related to the test that's actually being run, but it might be related to GC timing or something else.

We've seen it occasionally on 1.9.2 but more regularly on m-c and e10s.
Didn't realize quite how much more regularly: 14 of the last 20 Windows Eo runs failed this way (and the remaining 6 failed in different ways). And e10s does seem to have made it much much worse: looking at the Windows Eo column on http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox-Unittest&hours=168 (so last Thursday to Tuesday morning when they moved trees), there was one instance of this that I mis-starred on Saturday morning, but then starting with the e10s landing 8 of the next 20 had this.
Benjamin, can you or someone else on the E10S team get a replay setup going? I'm already spending a lot of time tracking down orange bugs.
(I'm more than happy to help anyone get that set up)
Yeah, I'll find somebody.
No idea whether this is just pure coincidence or not, but that "14 of 20 failed this way, the other 6 failed a different way, zero green" was followed by the push for bug 526216, and after that there have been 8 failures this way, 1 different failure, and 9 greens.
And conveniently, while all the others seem to have been opt, http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1261247247.1261254346.1507.gz&fulltext=1#err1 looks like it's this in a debug build.
Depends on: 536585
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1261738928.1261742741.1413.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2009/12/25 03:02:08
s: win32-slave36
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_dm\unit\test_removeDownloadsByTimeframe.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262272299.1262274075.15332.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2009/12/31 07:11:39
s: win32-slave15
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_dm\unit\test_bug_409179.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262307851.1262310028.27740.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2009/12/31 17:04:11  
s: win32-slave06
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_extensionmanager\unit\test_bug299716_2.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262343728.1262347121.2223.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/01 03:02:08
s: win32-slave14
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\sync\test_expire_on_timed_flush_obeys_date.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262355157.1262357338.29078.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/01 06:12:37
s: win32-slave02
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\unit\test_annotations.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262491404.1262493208.18545.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/02 20:03:24
s: win32-slave33
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262495594.1262497993.4842.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/02 21:13:14
s: win32-slave19
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_update\unit\test_0070_update_dir_cleanup.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262616126.1262618321.8493.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/04 06:42:06
s: win32-slave05
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\jsctypes-test\unit\test_jsctypes.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\bookmarks\test_448584.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_cookies\unit\test_bug468700.js | test failed (with xpcshell return code: 1), see following log:
Depends on: 538106
From the just-duped Bug 538106:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262759285.1262762775.3955.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/05 22:28:05
s: win32-slave38
{
TEST-UNEXPECTED-FAIL |
e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_extensionmanager\unit\test_bug436207.js
| test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262785110.1262786950.22861.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/06 05:38:30
s: win32-slave30
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\sync\test_database_sync_expireAllFavicons.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_extensionmanager\unit\test_bug257155.js | test failed (with xpcshell return code: 1), see following log:
Version: Other Branch → Trunk
No longer depends on: 538106
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262865054.1262867015.31838.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/07 03:50:54
s: win32-slave41
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\bookmarks\test_changeBookmarkURI.js | test failed (with xpcshell return code: 1), see following log:


http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262859078.1262861257.32667.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/07 02:11:18
s: win32-slave01
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\autocomplete\test_multi_word_search.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\unit\test_bookmark_catobs.js | test failed (with xpcshell return code: 1), see following log:
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/07 09:39:28  
s: win32-slave27
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_passwordmgr\unit\test_top_loginmgr_1.js | test failed (with xpcshell return code: 1), see following log:
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/07 10:00:25  
s: win32-slave17
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\json_test\unit\test_decode.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_extensionmanager\unit\test_bug384052.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262948525.1262954028.8387.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/08 03:02:05
s: win32-slave19
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\unit\test_tagging.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262958010.1262960725.24064.gz
WINNT 5.2 mozilla-central opt test everythingelse on 2010/01/08 05:40:10
s: win32-slave12
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_extensionmanager\unit\test_bug511091.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_update\unit\test_0062_manager.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263771037.1263772141.22915.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/17 15:30:37
s: win32-slave23
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\unit\test_markpageas.js | test failed (with xpcshell return code: 1), see following log:
Just in case anyone's fooled into thinking that the number of log-paste comments here in any way reflects the frequency, it doesn't: I star five or ten a day, every day, without commenting here, since there's no real content in the logs.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263950806.1263952399.6211.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/19 17:26:46  
s: win32-slave13
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_dm\schema_migration\test_migration_to_6.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_browser_places\unit\test_browserGlue_corrupt_nobackup.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264305313.1264306373.1041.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/23 19:55:13
s: win32-slave31
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug474988.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264358656.1264359960.24629.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/24 10:44:16
s: win32-slave11
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\unit\test_history_removeAllPages.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264355120.1264356280.17166.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/24 09:45:20
s: win32-slave28
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\test\test_setstatusline.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\test\test_sjs.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264347922.1264349326.2505.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/24 07:45:22
s: win32-slave12
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_url-classifier\unit\test_cleankeycache.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264339974.1264341096.1264.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/24 05:32:54
s: win32-slave17
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\bookmarks\test_405938_restore_queries.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_update\unit\test_0060_manager.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_intl_uconv\unit\test_bug381412.euc_jp.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264373642.1264374923.24049.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/24 14:54:02
s: win32-slave33
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_update\unit\test_0060_manager.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264443493.1264444527.30466.gz
s: win32-slave05TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\expiration\test_removeAllPages.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | e:/builds/moz2_slave/mozilla-central-win32-opt-unittest-xpcshell/build/xpcshell/tests/test_places/expiration/test_removeAllPages.js | 1 == 0 - See following stack:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\unit\test_399266.js | test failed (with xpcshell return code: 1), see following log:
(In reply to comment #39)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264443493.1264444527.30466.gz
> s: win32-slave05TEST-UNEXPECTED-FAIL |
> e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\expiration\test_removeAllPages.js
> | test failed (with xpcshell return code: 0), see following log:

Kyle: that first error is not the same bug (the second one is).
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264447900.1264449336.19989.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 11:31:40
s: win32-slave05
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_chrome\unit\test_bug380398.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\bookmarks\test_restore_guids.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264459013.1264460894.22278.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 14:36:53
s: win32-slave30
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_zipwriter\unit\test_bug419769_1.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\expiration\test_annos_expire_policy.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264461574.1264463381.18516.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 15:19:34
s: win32-slave20
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug371495.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug509194.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264467682.1264469207.19884.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 17:01:22
s: win32-slave29
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_dm\schema_migration\test_migration_to_2.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_timermanager\unit\test_0010_timermanager.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264468530.1264470067.29418.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 17:15:30
s: win32-slave09
TEST-UNEXPECTED-FAIL | E:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\unit\test_protocolproxyservice.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264468902.1264470271.31605.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 17:21:42
s: win32-slave35
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\unit\test_399264_string_to_query.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264470618.1264471695.14788.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/25 17:50:18
s: win32-slave13
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_passwordmgr\unit\test_storage_legacy_4.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_satchel\unit\test_db_corrupt.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\unit\test_adaptive.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug541420.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264522179.1264523392.15717.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/26 08:09:39
s: win32-slave43
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug257155.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264525989.1264527128.28735.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/26 09:13:09
s: win32-slave41
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\expiration\test_annos_expire_session.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\bookmarks\test_393498.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\unit\test_protocolproxyservice.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264530235.1264531410.13681.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/26 10:23:55
s: win32-slave05
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\queries\test_results-as-visit.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_extensionmanager\unit\test_bug394300.js | test failed (with xpcshell return code: 1), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264552483.1264553830.9036.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/26 16:34:43
s: win32-slave35
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_plugin\unit\test_bug455213.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\autocomplete\test_422277.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_places\expiration\test_pref_maxpages.js | test failed (with xpcshell return code: 1), see following log:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\test\test_sjs.js | test failed (with xpcshell return code: 1), see following log:
I captured an instance of this with VM record/replay.

It seems that the xpcshell process that we think returned exit code 1, in fact has NOT terminated when the python testrunner emits that message.
(I have a script that terminates recording when it spots a TEST-UNEXPECTED-FAIL, and the xpcshell process that runs the test that's reported to have returned exit code 1 is actually still running when recording was terminated.)
Hmm, those results are suspect. Disregard for now.
OK, I captured a new recording and the results make more sense now.

The xpcshell does exit before python reports an error.
The xpcshell is returning 0 from main().
python's call to GetExitCodeProcess is returning 1.

I'm scared. I'll keep digging.
I traced through the xpcshell process shutdown. All the way to NtTerminateProcess, the exit code is 0.
I confirmed that python is calling GetExitCodeProcess on the right process.

I have no idea what could possibly be happening here.
If anyone has even the wildest speculation about how the process exit code could be passed in as 0 to NtTerminateProcess and be returned as 1, I would really love to hear it. "evil OS bug" is one possibility. Right now I can't think of any others
Did you actually verify that the GetExitCodeProcess system call that Python is making returns 1? (just curious) This is indeed scary!

I wonder if we're hitting some weird Python bug? Maybe we could steal a build slave and try running with a newer version of Python. MozillaBuild currently ships 2.5, which is pretty old.
(In reply to comment #60)
> Did you actually verify that the GetExitCodeProcess system call that Python is
> making returns 1? (just curious) This is indeed scary!

Yes, it returns 1 as the exit code. It's actually not a system call, it calls NtQueryProcessInformation which is the actual system call.

> I wonder if we're hitting some weird Python bug? Maybe we could steal a build
> slave and try running with a newer version of Python. MozillaBuild currently
> ships 2.5, which is pretty old.

It doesn't look like a python bug to me. Running a new Python might be worth trying though.
It would be easy enough to install a new Python from python.org in your test VM and change your PATH in your MSYS shell to put it first. Alternately, if you're running from a packaged build+test combo, you can just use c:\path\to\python.exe runxpcshelltests.py, of course.
I'll try that at some point, but I seriously doubt it's a Python bug, given that I observe Windows returning 1 from GetExitCodeProcess, and I've checked that it's using the same process handle value that was returned by CreateProcess. I suppose in theory Python could have closed that handle and then reopened it for some other process, but that seems extremely far-fetched.

I checked that in the xpcshell process there are exactly two calls to NTTerminateProcess, both from the same invocation of ExitProcess. The first one seems to be some kind of preliminary shutdown signal. An exit code of 0 is passed to both processes. The process finally exists at the sysenter triggered by the second NTTerminateProcess. That rules out some random shutdown code reentering ExitProcess or calling NTTerminateProcess some other way, nor by an unhandled exception. (Unless, just possibly, there's an unhandled exception during the final system call, and VMWare's debugger integration doesn't report it for some reason? Seems utterly implausible. Likewise I suppose it's possible that someone calls NTTerminateProcess *from another process* passing 1 as the exit code, and it *just happens* to happen at the same moment that xpcshell itself does the NTTerminateProcess system call. Similarly implausible.)

Hmm, but that second scenario isn't so implausible, perhaps. What if xpcshell terminated itself and then python itself called TerminateProcess on the xpcshell, passing exit code 1? I could well believe that that would override the exit code, before Python returns it. I'll check.

One other little experiment I tried was resuming VM execution in non-replay mode after the xpcshell has started but before it has terminated. The test passed. That shows that the non-determinism that causes the bug happens *after* the xpcshell has launched.
OK, the python testrunner doesn't call NTTerminateProcess while the tests are running, so that theory fails. Trying python 2.6.4 now.
Exit code 1 still occurs with python 2.6.4, about 1 test in 1000.
Thought: can you break in xul.dll DllMain (nsDllMain.cpp) and see at what point that function is running relative to the calls to NTTerminateProcess?

Also, at the calls to NTTerminateProcess are there other threads still running? I expect at least the DNS thread will still be running, but I'm interested to see if the chromium I/O thread is running at that point, and whether it just goes away.
(In reply to comment #66)
> Thought: can you break in xul.dll DllMain (nsDllMain.cpp) and see at what point
> that function is running relative to the calls to NTTerminateProcess?

Sure. It gets a DLL_PROCESS_DETACH during LdrShutdownProcess, which is called by _ExitProcess after the first call to NTTerminateProcess and before the second call to NTTerminateProcess. It looks to me like the first call to NTTerminateProcess is a signal to the kernel that process shutdown has started, and then LdrShutdownProcess does all the DLL shutdown notifications, before the second NTTerminateProcess kills the process for good.

> Also, at the calls to NTTerminateProcess are there other threads still running?
> I expect at least the DNS thread will still be running, but I'm interested to
> see if the chromium I/O thread is running at that point, and whether it just
> goes away.

This is very interesting. When I get to the first NTTerminateProcess in the exit-code-1 xpcshell, I have 4 threads running: main thread, socket thread, Breakpad thread, and some other thread that is in an odd state. It's not suspended, according to Visual Studio, but it has no stack, it's just sitting at _threadstartex doing nothing.

But when I attach to a couple of other random xpcshells, at the first NTTerminateProcess there are only two threads: the main thread and the socket thread.

Do we expect to see a breakpad thread here? Is breakpad enabled for all xpcshell tests? Is that thread supposed to be shut down before we return from main?
http://hg.mozilla.org/mozilla-central/rev/edb10c83d7ce to see if this changes the symptom. We currently think that the TerminateThread isn't scheduling the termination until after the main thread exits, and Windows decides that the breakpad thread exit code is the process exit code.
Figured this one was worth spamming the bug with a copy-paste:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264739268.1264740581.9747.gz
WINNT 5.2 mozilla-central opt test xpcshell on 2010/01/28 20:27:48
s: win32-slave20
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_satchel\unit\test_db_update_v999a.js | test failed (with xpcshell return code: 44), see following log:
Woohoo!

We need to wait for the breakpad thread to shut down before proceeding. Either we stop calling TerminateThread (best) or we continue calling TerminateThread and call WaitForSingleObject to wait for the thread to terminate before proceeding.
Seems pretty obvious that bug 541853 is a dupe of this, then.
No longer depends on: 525167
No longer depends on: 529150
No longer depends on: 531169
No longer depends on: 535488
No longer depends on: 535491
No longer depends on: 536585
No longer blocks: 541853
No longer depends on: 535493
Well that was fun.
Assignee: nobody → ted.mielczarek
I think this likely started after I fixed bug 483062 (although that was May of 2009, and this bug wasn't filed until December, so maybe there's a mitigating factor here).

Also of interest is the Breakpad checkin comment that added that #ifdef BREAKPAD_NO_TERMINATE_THREAD:
http://code.google.com/p/google-breakpad/source/detail?r=309
No longer blocks: 535844
No longer blocks: 531328
I added a -DBREAKPAD_NO_TERMINATE_THREAD in that Makefile, let's see if it makes this go away:
http://hg.mozilla.org/mozilla-central/rev/17d06b3d8844

We can't hit the case that comment is warning about (deleting the exception handler in DllMain), so I don't think there's any downside to this.

Also, I think this bug wins the award for "most duped random orange".
There have been a few green runs in a row, I'm going to close this out. Reopen if it reoccurs.

Nice work, roc and bsmedberg!
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You might want to report this issue upstream.
Clearly someone considered this, since there's an ifdef there, but the choice to use TerminateThread as the default seems odd.
Target Milestone: --- → mozilla1.9.3a1
TerminateThread is clearly labeled as evil by MSDN, but it doesn't say anything about this possible race condition, so it's likely the person who wrote this code for Breakpad wasn't aware of it.

I suspect you could use TerminateThread safely by doing a WaitForSingleObject on the thread handle to ensure that the thread is dead before you carry on.
Uh, wow! I guess we killed a lot of random orange with this one.
(In reply to comment #85)
> You might want to report this issue upstream.

Already filed in November 2008: http://code.google.com/p/google-breakpad/issues/detail?id=280
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.