Closed Bug 605093 Opened 14 years ago Closed 14 years ago

Permanent Mac Orange due to test timeouts: TEST-UNEXPECTED-FAIL | asyncTestUtils.js | Timeout running test, and we want you to have the log

Categories

(MailNews Core :: Backend, defect)

All
macOS
defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: standard8, Assigned: asuth)

References

Details

(Keywords: intermittent-failure)

This has been pretty much permanent orange since 2010/10/08. A set of about four tests is timing out, with no obvious change that would have kicked this off. The tests are:

- test_corrupt_database.js
- test_folder_logic.js
- test_index_sweep_folder.js
- test_imapAutoSync.js

The failure modes appear the same - they all seem to complete the test within a second but hang on shutdown.

Example log:
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1287395748.1287400057.4981.gz
From tinderbox the possible regression range is:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5cd4558f8d45&tochange=59a74d2748ca

There were no changes in comm-central. This wasn't a clobber build, so the regression range shouldn't be any further out.

Note that since we've started running xpcshell-tests as separate items in joint, the tests don't appear to fail as badly - maybe just one per run rather than 2-4. So this could be some sort of issue on the builders.

John, gozer, do we know if any changes were made to the minis around 2010/10/08 14:46:21?
I will use the exciting mozperfish tooling against the tests. Thanks for the regression range; the startup cache change looks like it could be altering some ordering and will provide a good focus point for investigation.
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
According to the probes, it looks like we're getting a never-ending sequence of this call stack re-scheduling itself:

"nsThread::PutEvent(nsIRunnable*)",
"nsThread::Dispatch(nsIRunnable*, unsigned int)", 
"nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, int, unsigned int)", 
"nsThread::ProcessNextEvent(int, int*)", 
"NS_ProcessNextEvent_P(nsIThread*, int)", 
"nsThread::Shutdown()", 
"nsThreadManager::Shutdown()", 
"mozilla::ShutdownXPCOM(nsIServiceManager*)", 
"NS_ShutdownXPCOM_P", 
"NS_ShutdownXPCOM"


and it looks like the base case is:

"nsThread::PutEvent(nsIRunnable*)", 
"nsThread::Dispatch(nsIRunnable*, unsigned int)", 
"nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, int, unsigned int)", 
"nsThread::ProcessNextEvent(int, int*)", 
"NS_ProcessNextEvent_P(nsIThread*, int)", 
"nsThread::Shutdown()", 
"TimerThread::Shutdown()", 
"nsTimerImpl::Shutdown()", 
"mozilla::ShutdownXPCOM(nsIServiceManager*)", 
"NS_ShutdownXPCOM_P", 
"NS_ShutdownXPCOM"

Apologies for the quotes, but I'm just excising stuff from the JSON data structure.
Depends on: 605314
So, it wasn't re-scheduling itself so much as generating dummy events to keep the event loop from blocking. This was part of a greater emergent strategy to busy-wait, chewing up CPU and making the tests intermittently orange.

Bug 605314 has the details.
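
For anyone skimming, here is a rough toy model of the dummy-event pattern described above. It is only a sketch: ToyThread, keep_awake, drain_spins and max_spins are made-up names for illustration and are not the real nsThread or nsBaseAppShell code. It assumes a shutdown loop that keeps processing events until none are pending, plus a hook that posts a no-op event before each one is processed.

```python
from collections import deque


class ToyThread:
    """Hypothetical single-threaded event queue; not the real nsThread API."""

    def __init__(self, observer=None):
        self.queue = deque()
        self.observer = observer  # hook called before each event is processed

    def dispatch(self, event):
        self.queue.append(event)

    def process_next_event(self):
        # The hook runs first and may itself dispatch to this thread,
        # which is the shape seen in the stacks above.
        if self.observer is not None:
            self.observer(self)
        if self.queue:
            self.queue.popleft()()
            return True
        return False

    def shutdown(self, max_spins):
        """Process events until none are pending; return the spins used.

        max_spins is a safety cap that exists only so this demo terminates.
        """
        spins = 0
        while spins < max_spins and self.process_next_event():
            spins += 1
        return spins


def dummy_event():
    """No-op event, standing in for the dummy events in the comment above."""


def keep_awake(thread):
    # Posts a no-op event so the loop never finds the queue empty.
    thread.dispatch(dummy_event)


def drain_spins(observer, max_spins=1000):
    thread = ToyThread(observer)
    thread.dispatch(dummy_event)  # one genuinely pending event
    return thread.shutdown(max_spins)
```

Without the hook, drain_spins(None) finishes after processing the single pending event. With keep_awake installed, the queue is refilled on every pass, so the loop only stops when it hits the demo's max_spins cap, which is the busy-wait in miniature.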
This got fixed with bug 605314, IIRC.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [orange]