xpcshell-tests: test_compose/unit/test_sendMessageLater2.js hangs often

RESOLVED FIXED in Thunderbird 3.3a3

Status

defect
--
major
RESOLVED FIXED
10 years ago
4 years ago

People

(Reporter: sgautherie, Assigned: standard8)

Tracking

({hang, intermittent-failure})

Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

()

Attachments

(3 attachments)

[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090402 SeaMonkey/2.0b1pre] (experimental/_m-c_, home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/c4f6bb1db611
 +http://hg.mozilla.org/comm-central/rev/cdfcaddd6832)

Run in the whole suite or alone, this test (just) hangs rather often,
with xpcshell eating 100% cpu.

|env EXTRA_TEST_ARGS=--test=test_compose/unit/test_sendMessageLater2.js make -C objdir xpcshell-tests|
(NB: This EXTRA_TEST_ARGS feature is still a w-i-p patch of mine...)

PS:
Yet, this has not been seen/reported on SM and TB tinderboxes :-|
Timing issue on my "too fast" computer ??
I'll need to see a log of the test output, preferably with NSPR_LOG_MODULES=smtp:5 enabled.
Posted file Hang: test log (1)
|make XPCOM_MEM_LEAK_LOG=1 SOLO_FILE="test_sendMessageLater2.js" -C objdir/mailnews/compose/test check-one|

Can someone else reproduce?
Keywords: qawanted
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090530 SeaMonkey/2.0b1pre] (experimental/_m-c_, home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/e44d9c0f4805
 +http://hg.mozilla.org/comm-central/rev/62f2c362a9a4 + bug 493008 patches)

Still reproducible.


[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090711 SeaMonkey/2.1a1pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/47bfcd275ede
 +http://hg.mozilla.org/comm-central/rev/291cbe3374b9)

Can't reproduce anymore.
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Keywords: qawanted
Resolution: --- → WORKSFORME
Summary: (TUnit) xpcshell-tests test_compose/unit/test_sendMessageLater2.js hangs often → (TUnit) xpcshell-tests: test_compose/unit/test_sendMessageLater2.js hangs often
I have noticed this timeout lately...

http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1252278946.1252280503.30599.gz
Linux comm-central check on 2009/09/06 16:15:46
{
TEST-PASS | /buildbot/linux-comm-central-check/build/objdir/mozilla/_tests/xpcshell/test_compose/unit/test_sendMessageLater.js | test passed

command timed out: 300 seconds without output, killing pid 8033
}
Status: RESOLVED → UNCONFIRMED
Flags: wanted-thunderbird3?
Resolution: WORKSFORME → ---
Summary: (TUnit) xpcshell-tests: test_compose/unit/test_sendMessageLater2.js hangs often → xpcshell-tests: test_compose/unit/test_sendMessageLater2.js hangs often
Status: UNCONFIRMED → NEW
Ever confirmed: true
Blocks: 438871
Whiteboard: [orange]
Severity: normal → critical
Severity: critical → major
Andrew, I've seen this twice on Mac since we got the other fix in, any chance you could poke you analysis tools at it?
Flags: wanted-thunderbird3?
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1293653583.1293654998.23169.gz
MacOSX 10.6 comm-central test xpcshell on 2010/12/29 12:13:03
{
TEST-INFO | /builds/slave/comm-central-trunk-macosx64-opt-unittest-xpcshell/build/xpcshell/tests/mailnews/compose/test/unit/test_sendMessageLater2.js | running test ...

command timed out: 1200 seconds without output, killing pid 231
}
http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTry/1293793005.1293794420.11573.gz
OS X 10.6 tryserver opt test xpcshell on 2010/12/31 02:56:45

command timed out: 1200 seconds without output, killing pid 15159
process killed by signal 9
program finished with exit code -1
elapsedTime=1293.673643
TinderboxPrint: xpcshell<br/><em class="testfail">T-FAIL</em>
Posted patch The fixSplinter Review
This I believe should fix the issue (it fixed what I was seeing almost every time to not seeing it in somewhere over 100 runs with and without load).

The basic issue is that for some reason we can get into a place where message sending/responses sometimes happen async or sometimes sync. Rather than try to handle that with flags (which obviously don't work right), this patch makes it so that to restart the async_driver, we'll always be async hence getting around the issue.
Assignee: nobody → bugzilla
Status: NEW → ASSIGNED
Attachment #506199 - Flags: review?(bienvenu)
OS: Windows 2000 → All
Hardware: x86 → All
async_driver internally already uses do_execute_soon to postpone stuff happening to a future round.  If you really need the do_timeout, then there is probably still something unpleasant going on...
(In reply to comment #11)
> async_driver internally already uses do_execute_soon to postpone stuff
> happening to a future round.  If you really need the do_timeout, then there is
> probably still something unpleasant going on...

Well without it, the generator tends to explode in the synchronous case. My guess is that there's just not enough time/events to process in-between queuing the async_driver and running that call.
That still sounds orange-prone.  Presumably the issue is that server.performTest spins a nested event loop?  It might be advisable to stop using the fakeserver methods that spin nested event loops.  One could add methods on the fakeserver that are asyncTestUtils aware and so allow it to block the forward motion of the async stack without adding that extra confusion factor...

(server.stop() also does this.)
(or use wrapper functions / something / anything other than triggering more nested event loops)
Comment on attachment 506199 [details] [diff] [review]
The fix

there may still be an underlying issue per asuth, but I think it's worth seeing if this cuts down the orange.
Attachment #506199 - Flags: review?(bienvenu) → review+
Checked in: http://hg.mozilla.org/comm-central/rev/fd977cea54d2

Raised bug 628725 to follow-up on the fake server issues.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago9 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.3a3
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.