xpcshell: intermittent "test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH)"

RESOLVED FIXED in Thunderbird 5.0b1

Status

defect
--
major
RESOLVED FIXED
9 years ago
7 years ago

People

(Reporter: sgautherie, Assigned: BenB)

Tracking

({assertion})

Trunk
Thunderbird 5.0b1
x86
All
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fixed by bug 428611] [near perma-orange on (MacOSX) SeaMonkey] [cc-orange])

Attachments

(1 attachment, 5 obsolete attachments)

I had noticed this error report previously.
(But probably as an intermittent orange on Windows and/or Linux? Not sure and I didn't check older logs ftb.)

Atm, I notice a few MacOSX builds where it seems +/- perma-orange.
Looking at the details:

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1285405429.1285406836.32367.gz&fulltext=1
OS X 10.5 comm-central-trunk debug test xpcshell on 2010/09/25 02:03:49
{
WARNING: NS_ENSURE_SUCCESS(rv, result) failed with result 0x80520012: file /builds/slave/comm-central-trunk-macosx-debug/build/mailnews/local/src/nsPop3Protocol.cpp, line 214

TEST-UNEXPECTED-FAIL | /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js | 2152398868 == 0 - See following stack:
JS frame :: /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/head.js :: do_throw :: line 317
JS frame :: /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/head.js :: do_check_eq :: line 347
JS frame :: /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js :: anonymous :: line 46

TEST-UNEXPECTED-FAIL | /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js | 2147500036 - See following stack:
JS frame :: /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/head.js :: do_throw :: line 317
JS frame :: /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js :: anonymous :: line 58

###!!! ASSERTION: unknown error, but don't alert user.: 'errorID != UNKNOWN_ERROR', file /builds/slave/comm-central-trunk-macosx-debug/build/mailnews/base/util/nsMsgProtocol.cpp, line 467
nsMsgProtocol::OnStopRequest(nsIRequest*, nsISupports*, unsigned int)+0x00000234 [/builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/SeaMonkeyDebug.app/Contents/MacOS/XUL +0x01402646]
nsStopwatch::AddRef()+0x000FE3DF [/builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/SeaMonkeyDebug.app/Contents/MacOS/XUL +0x015113E1]
catch_exception_raise+0x0002EF32 [/builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/SeaMonkeyDebug.app/Contents/MacOS/XUL +0x00066434]
...

PROCESS-CRASH | /builds/slave/comm-central-trunk-macosx-debug-unittest-xpcshell/build/xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js | application crashed (minidump found)
...
Crash reason:  EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash address: 0x0
...
Thread 0 (crashed)
 0  libmozalloc.dylib!TouchBadMemory [mozalloc_abort.cpp:71e8b5aee972 : 64 + 0x5]
 1  libmozalloc.dylib!mozalloc_abort [mozalloc_abort.cpp:71e8b5aee972 : 85 + 0x4]
 2  XUL!Abort [nsDebugImpl.cpp:71e8b5aee972 : 379 + 0xa]
 3  XUL!NS_DebugBreak_P [nsDebugImpl.cpp:71e8b5aee972 : 366 + 0xd]
 4  XUL!nsMsgProtocol::OnStopRequest [nsMsgProtocol.cpp:a2d109f74af3 : 467 + 0x37]
...
}

(NB: I'm not sure whether the assertion stack is correct or not.)
(In reply to comment #0)
> (But probably as an intermittent orange on Windows and/or Linux? Not sure and I
> didn't check older logs ftb.)
> 
> Atm, I notice a few MacOSX builds where it seems +/- perma-orange.

Indeed:

http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/test_mailnewslocal/unit/test_pop3ServerBrokenCRAMDisconnect.js
First failure is "2010-05-08 02:30", which may just be the oldest record in the DB.

http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
Atm, this is 99%-perma on MacOSX.
(It's also random on Windows, but many tests fail at once, so the cause may be "unrelated".)
Summary: [Debug SeaMonkey, MacOSX!?] xpcshell: test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH) → [Debug SeaMonkey, MacOSX] xpcshell: test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH)
Whiteboard: [perma-orange][orange]
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1285622322.1285626097.2895.gz&fulltext=1
WINNT 5.2 comm-central check on 2010/09/27 14:18:42

Same Windows random orange, with fewer details, as it's an Opt build.
OS: Mac OS X → All
Summary: [Debug SeaMonkey, MacOSX] xpcshell: test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH) → xpcshell: test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH)
Summary: xpcshell: test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH) → xpcshell: intermittent "test_pop3ServerBrokenCRAMDisconnect.js, NS_ENSURE_SUCCESS() + ASSERTION (+ PROCESS-CRASH)"
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1290212542.1290220032.17520.gz&fulltext=1
WINNT 5.2 comm-central-trunk debug test xpcshell on 2010/11/19 16:22:22
{
xul!nsPop3Protocol::OnStopRequest+0x0000000000000024 (e:\builds\slave\comm-central-trunk-win32-debug\build\mailnews\local\src\nspop3protocol.cpp, line 956)
xul!nsInputStreamPump::OnStateStop+0x00000000000000DE (e:\builds\slave\comm-central-trunk-win32-debug\build\mozilla\netwerk\base\src\nsinputstreampump.cpp, line 579)
xul!nsInputStreamPump::OnInputStreamReady+0x0000000000000090 (e:\builds\slave\comm-central-trunk-win32-debug\build\mozilla\netwerk\base\src\nsinputstreampump.cpp, line 403)
xul!nsInputStreamReadyEvent::Run+0x000000000000004A (e:\builds\slave\comm-central-trunk-win32-debug\build\mozilla\xpcom\io\nsstreamutils.cpp, line 113)
xul!nsThread::ProcessNextEvent+0x00000000000002A4 (e:\builds\slave\comm-central-trunk-win32-debug\build\mozilla\xpcom\threads\nsthread.cpp, line 626)
...
}

***

This bug is still badly hurting SeaMonkey t-b.
No longer blocks: SmTestFail
According to topfails we've not seen anything on Thunderbird since August. Whilst I understand this may be hurting SeaMonkey, I can't prioritise something that Thunderbird just isn't seeing frequently.
(In reply to comment #4)

> According to topfails we've not seen anything on Thunderbird since August.

Let's see:

TB:
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/test_mailnewslocal/unit/test_pop3ServerBrokenCRAMDisconnect.js
2010-06-09 - 2010-08-14 : 18 times
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
2010-09-10 - 2010-10-08 : 5 times
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
2010-10-16 - 2010-11-12 : 10 times

So it does still happen on TB too.

> Whilst I understand this may be hurting SeaMonkey, I can't prioritise something
> that Thunderbird just isn't seeing frequently.

SM:
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/test_mailnewslocal/unit/test_pop3ServerBrokenCRAMDisconnect.js
2010-05-08 - 2010-09-07 : 669 times
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
2010-09-09 - 2010-11-19 : 369 times

Sad, as figures show that (mostly MacOSX) SeaMonkey does trigger this near permanently :-/

*****

If no one can investigate this bug atm,
I would want to disable this test for MacOSX+Windows SeaMonkey.
Karsten, would you agree with that?
Whiteboard: [perma-orange][orange] → [near perma-orange on (MacOSX) SeaMonkey][orange]
Just to help interpret:

> :: anonymous :: line 46

This is OnStopRunningUrl coming back with a result != 0. In fact, the TODO above that line says it *should* come back with an error. The TODO says that I was puzzled why I got "success" instead. I wrongly tested for that.

So, that check can be safely removed.

This seems to solve the WINNT orange.

> :: anonymous :: line 58

This is the direct consequence of the above, it's just a re-throw.

> PROCESS-CRASH |

I don't know what this is, but it seems to be Mac only.
This patch does the above: it removes the wrong check for success, allowing failure (because that's expected) and allowing success (because that's what we often get; I don't know why, as the TODO says).

I am sneaking in an unrelated comment-only change, too, if you don't mind.

This does not solve the Mac crash.
Attachment #492209 - Flags: superreview?(bugzilla)
Attachment #492209 - Flags: review?(sgautherie.bz)
(The client.py *really* wasn't intended to be there, though.)
Attachment #492209 - Attachment is obsolete: true
Attachment #492210 - Flags: superreview?(bugzilla)
Attachment #492210 - Flags: review?(sgautherie.bz)
Attachment #492209 - Flags: superreview?(bugzilla)
Attachment #492209 - Flags: review?(sgautherie.bz)
The unknown error on Mac is 0x804B0014 (NS_ERROR_NET_RESET, nsNetError.h says "The connection was established, but no data was ever received."), not caught in nsMsgProtocol::OnStopRequest.
FYI, the test is all about the server (intentionally) closing the connection *in the middle* of the chat, namely at auth state.
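
For reference, the decimal values in the TEST-UNEXPECTED-FAIL lines of comment 0 are nsresult codes printed as plain numbers. A minimal sketch of the conversion to hex (the helper name toHexResult is made up for illustration and is not part of head.js):

```javascript
// Hypothetical helper: format a decimal nsresult, as printed by do_check_eq
// in the log, as the usual 8-digit hex form.
function toHexResult(decimal) {
  // ">>> 0" keeps the value as an unsigned 32-bit integer.
  var hex = (decimal >>> 0).toString(16).toUpperCase();
  return "0x" + hex.padStart(8, "0");
}

// The two values reported in comment 0:
console.log(toHexResult(2152398868)); // → 0x804B0014
console.log(toHexResult(2147500036)); // → 0x80004004
```

So the value that failed the check at line 46 is the same 0x804B0014 as the unknown error above.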
Comment on attachment 492210 [details] [diff] [review]
Part 1: Remove wrong check for result == success

>   OnStopRunningUrl: function (url, result) {
>     try {
>       // TODO we should be getting an error here, if we couldn't log in, but we don't.
>-      do_check_eq(result, 0);

I would prefer:
+      // do_check_neq(result, 0);

There seems to be some serious "backend" issues wrt this test. (Any bug(s) filed yet?)
Yet, let's reduce oranges until these are fixed.
f+, with(out) my suggestion.
Attachment #492210 - Flags: review?(sgautherie.bz)
Attachment #492210 - Flags: review?(bugzilla)
Attachment #492210 - Flags: review+
Attachment #492210 - Flags: review+ → feedback+
Depends on: 554044
(In reply to comment #9)
> The unknown error on Mac is 0x804B0014 (NS_ERROR_NET_RESET, nsNetError.h says
> "The connection was established, but no data was ever received."), not caught
> in nsMsgProtocol::OnStopRequest.

Right, I reopened bug 554044.
(In reply to comment #5)
> (In reply to comment #4)
> 
> > According to topfails we've not seen anything on Thunderbird since August.
> 
> Let's see:
> 
> TB:
> http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/test_mailnewslocal/unit/test_pop3ServerBrokenCRAMDisconnect.js
> 2010-06-09 - 2010-08-14 : 18 times
> http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
> 2010-09-10 - 2010-10-08 : 5 times
> http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/local/test/unit/test_pop3ServerBrokenCRAMDisconnect.js
> 2010-10-16 - 2010-11-12 : 10 times
> 
> So it does still happen on TB too.

Relying on topfails isn't necessarily a good thing. We've had periods of general test brokenness, e.g. half of the last set you quote there aren't random failures but tree failures.

Unfortunately that seems to be a limitation of topfails.
Comment on attachment 492210 [details] [diff] [review]
Part 1: Remove wrong check for result == success

No sr required. r=Standard8.

The process crash is because assertions are fatal in xpcshell-tests, simple as that. Stop hitting the assertion and you'll stop crashing. The assertion is one we frequently see when we get this kind of failure in a test (i.e. it stops the test from finishing completely).

Not sure if you're planning a follow-up bug or to keep this one open, but it'd be good to get a clear summary of what we think is/isn't happening on a bug somewhere.
Attachment #492210 - Flags: superreview?(bugzilla)
Attachment #492210 - Flags: review?(bugzilla)
Attachment #492210 - Flags: review+
(In reply to comment #14)
> Not sure if you're planning a follow-up bug or to keep this one open, but it'd
> be good to get a clear summary of what we think is/isn't happening on a bug
> somewhere.

Yeah, based on the explanations given previously, it would be fine with me to morph this bug to cover this patch only (and probably backing it out when the underlying issue is fixed).
The assertion issue is bug 554044.
The other issues should be filed separately.
Looks like bug 428611 may actually help this, especially as it looks like we forgot to land the patch here.
Depends on: 428611
Whiteboard: [near perma-orange on (MacOSX) SeaMonkey][orange] → [near perma-orange on (MacOSX) SeaMonkey][cc-orange]
Attachment #492210 - Attachment is obsolete: true
Comment on attachment 492210 [details] [diff] [review]
Part 1: Remove wrong check for result == success

Mark, the line that this patch changes is simply wrong, as the comment above it clearly says.
If we don't use this patch, then we should enforce neq(0), but eq(0) is wrong and was just a workaround for the broken backend.
Attachment #492210 - Attachment is obsolete: false
(see comment 6/7 for more explanation)
Ben, maybe I wasn't clear enough. If you look at the patch in bug 428611 the changes there make the result OnStopRunningUrl always NS_ERROR_FAILURE. Hence we shouldn't have this problem of being a pass result most of the time, and a failure occasionally.
> the patch in bug 428611 the
> changes there make the result OnStopRunningUrl always NS_ERROR_FAILURE.

Yes, that's correct and what I always expected.

That's not what happened, though, so the test as-is has a
do_check_eq(result, 0);
which basically enforces the bug that bug 428611 fixed. The patch here removes that line, because after bug 428611, it would now always fail.

An even better patch would reverse the check and do
do_check_neq(result, 0);
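
Roughly, as a sketch only: the listener object below is a simplified stand-in for the one in the test, and do_check_neq is re-implemented here just so the snippet runs outside the xpcshell harness (in the real test it comes from head.js).

```javascript
// Stand-in for the head.js helper, an assumption so that the sketch is
// self-contained; the real one reports through the test harness.
function do_check_neq(left, right) {
  if (left == right) {
    throw new Error(left + " == " + right + ", expected them to differ");
  }
}

// Generic failure code, the result expected once bug 428611 has landed.
var NS_ERROR_FAILURE = 0x80004005;

// Simplified, hypothetical version of the test's URL listener.
var urlListener = {
  OnStartRunningUrl: function (url) {},
  OnStopRunningUrl: function (url, result) {
    // The server closes the connection during auth, so the login cannot
    // have succeeded and the result must be an error code.
    do_check_neq(result, 0);
  },
};

// A failure result passes the check; a result of 0 would throw.
urlListener.OnStopRunningUrl(null, NS_ERROR_FAILURE);
```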
> An even better patch would reverse the check and do
> do_check_neq(result, 0);

Attached.
Assignee: nobody → ben.bucksch
Attachment #492210 - Attachment is obsolete: true
Attachment #528375 - Flags: review?(mbanner)
Attachment #528375 - Attachment is obsolete: true
Attachment #528376 - Flags: review?(mbanner)
Attachment #528375 - Flags: review?(mbanner)
Attachment #528376 - Attachment is obsolete: true
Attachment #528382 - Flags: review?(mbanner)
Attachment #528376 - Flags: review?(mbanner)
Comment on attachment 528382 [details] [diff] [review]
v5: Revert wrong check and ensure failure

As I already have said, the patch on bug 428611 (attachment 526416 [details] [diff] [review] to be precise) does this at the end of that patch. There's nothing to do here.

If you want to do just the comment changes anyway, that's fine with rs=me
Attachment #528382 - Flags: review?(mbanner) → review-
> the patch on bug 428611 (attachment 526416 [details] [diff] [review] to be
> precise) does this at the end of that patch

Ah, OK, I really misunderstood you in that part.

Attaching new patch with comment changes only.
has rs=mbanner
Attachment #528382 - Attachment is obsolete: true
Attachment #528456 - Flags: review+
Patch v6 committed as <http://hg.mozilla.org/comm-central/rev/9d16abe15b8b>.

Bug FIXED in bug 428611.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Flags: in-testsuite-
Target Milestone: --- → Thunderbird 3.3a4
Depends on: 671965
Whiteboard: [near perma-orange on (MacOSX) SeaMonkey][cc-orange] → [fixed by bug 428611] [near perma-orange on (MacOSX) SeaMonkey] [cc-orange]