Last Comment Bug 554044 - xpcshell: test_smtpAuthMethods.js, test_smtpPasswordFailure1.js and others report intermittent "ASSERTION: unknown error, but don't alert user."
: xpcshell: test_smtpAuthMethods.js, test_smtpPasswordFailure1.js and others re...
Status: RESOLVED FIXED
: assertion, regression
Product: MailNews Core
Classification: Components
Component: Networking: SMTP (show other bugs)
: Trunk
: All Mac OS X
: -- major (vote)
: Thunderbird 14.0
Assigned To: Mark Banner (:standard8, limited time in Dec)
:
:
Mentors:
Depends on: 780124
Blocks: 739007 525238 599609
  Show dependency treegraph
 
Reported: 2010-03-22 07:59 PDT by Mark Banner (:standard8, limited time in Dec)
Modified: 2015-04-22 05:54 PDT (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
wontfix
wontfix
wontfix
wontfix


Attachments
test_smtpAuthMethods.js log (14.55 KB, text/plain)
2010-03-22 07:59 PDT, Mark Banner (:standard8, limited time in Dec)
no flags Details
test_smtpPasswordFailure1.js log (12.18 KB, text/plain)
2010-03-22 08:00 PDT, Mark Banner (:standard8, limited time in Dec)
no flags Details
Possible fix (2.63 KB, patch)
2012-02-12 21:06 PST, Hiroyuki Ikezoe (:hiro)
no flags Details | Diff | Splinter Review
The fix for nsSmtpProtocol (3.24 KB, patch)
2012-02-13 21:48 PST, Hiroyuki Ikezoe (:hiro)
no flags Details | Diff | Splinter Review
The fix for nsSmtpProtocol v2 [Checked in: Comment 25] (2.46 KB, patch)
2012-03-19 08:23 PDT, Mark Banner (:standard8, limited time in Dec)
standard8: review+
standard8: approval‑comm‑aurora-
standard8: approval‑comm‑beta-
Details | Diff | Splinter Review

Description Mark Banner (:standard8, limited time in Dec) 2010-03-22 07:59:53 PDT
Created attachment 433927 [details]
test_smtpAuthMethods.js log

I'm trying to run test_smtpAuthMethods.js in debug mode using trunk builds.

The test is failing part way through:

TEST-PASS | /Users/moztest/comm/main/tb/mozilla/_tests/xpcshell/test_compose/unit/head_compose.js | [do_check_transaction : 64] EHLO test == EHLO test
Passed test Encrypted password, with server only supporting AUTH PLAIN (must fail)
NEXT test: Any secure method, with server supporting AUTH PLAIN, LOGIN and CRAM
WARNING: NS_ENSURE_TRUE(aCallbacks) failed: file /Users/moztest/comm/main/src/mailnews/compose/src/nsSmtpUrl.cpp, line 815
WARNING: NS_ENSURE_TRUE(aUrlListener) failed: file /Users/moztest/comm/main/src/mailnews/base/util/nsMsgMailNewsUrl.cpp, line 144
###!!! ASSERTION: unknown error, but don't alert user.: 'errorID != UNKNOWN_ERROR', file /Users/moztest/comm/main/src/mailnews/base/util/nsMsgProtocol.cpp, line 439

See the attached file for the full log.

Tests shouldn't assert in debug mode as this makes it hard for developers to run them in their normal debugging builds.

In this case, we appear to have an unknown error because the error passed to OnStopRequest was NS_ERROR_NET_INTERRUPT.

test_smtpPasswordFailure1.js also fails in a similar manner. It never used to in debug mode, certainly not on 3.0 branch.
Comment 1 Mark Banner (:standard8, limited time in Dec) 2010-03-22 08:00:47 PDT
Created attachment 433928 [details]
test_smtpPasswordFailure1.js log
Comment 2 Ben Bucksch (:BenB) 2010-03-22 08:08:57 PDT
This is totally dodgy code in nsMsgProtocol.cpp, from what I see.
It takes a generic nserror, then checks it against a small number of known net errors (mostly unknown host, connection refused, net timeout), and if it's none of these known errors, it doesn't know what to do and asserts.

I'd need to investigate why we run in there in the first place. But from the existing errors checks, it seems legitimate to run into this code in error cases and should be adapted/fixed.
Comment 3 Mark Banner (:standard8, limited time in Dec) 2010-03-22 08:19:25 PDT
(In reply to comment #2)
> I'd need to investigate why we run in there in the first place. But from the
> existing errors checks, it seems legitimate to run into this code in error
> cases and should be adapted/fixed.

I can understand adapting it for error cases, however what I don't understand is why is test_smtpPasswordFailure1.js now getting a network interrupt where it wasn't before?

Is this a regression in the fake server or something? As it sounds to me like something is now dropping a connection on the floor unexpectedly.
Comment 4 Ben Bucksch (:BenB) 2010-03-22 08:34:36 PDT
> what I don't understand is why is test_smtpPasswordFailure1.js now getting
> a network interrupt where it wasn't before?

Neither do I, that's why I said:
> I'd need to investigate why we run in there in the first place.

> regression in the fake server

unlikely, as the fakeserver change went in weeks ago.
My backend code added a number of new error codes. Maybe it's running into one of them. Maybe it's setting a wrong auth method and failing because of that. I can only speculate atm...
Comment 5 Mark Banner (:standard8, limited time in Dec) 2010-03-22 08:42:20 PDT
(In reply to comment #4)
> > what I don't understand is why is test_smtpPasswordFailure1.js now getting
> > a network interrupt where it wasn't before?
> 
> Neither do I, that's why I said:
> > I'd need to investigate why we run in there in the first place.

Sorry, I misread that bit.
Comment 6 Ben Bucksch (:BenB) 2010-04-02 09:29:36 PDT
> I'm trying to run test_smtpAuthMethods.js in debug mode using trunk builds.
> The test is failing part way through:

Works for me, on Linux opt and debug.

---

This bug has 2 parts:
- nsMsgProtocol.cpp needs to be fixed, to either consider the new error codes,
  or be more generic (preferred).
- Find out why we run in there, and why in some cases and not others.
  Given that I cannot reproduce, and the msg in the log file doesn't even
  give the error code, much less descr or stack, I can't fix that.
Comment 7 Mark Banner (:standard8, limited time in Dec) 2010-04-08 06:00:46 PDT
Hmm, I've just been running these tests again, and I don't see the error. I guess either something got fixed (although I couldn't see anything) or my build was broken in some way.

Anyway I've tried both trunk and 3.1 and got no problems, so marking this as WFM.
Comment 8 Serge Gautherie (:sgautherie) 2010-11-22 08:16:09 PST
SM:
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/test_compose/unit/test_smtpPasswordFailure1.js
from 2010-05-08 to 2010-09-02
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-09-10 to 2010-11-12

TB:
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/test_compose/unit/test_smtpPasswordFailure1.js
from 2010-06-02 to 2010-08-14
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-09-10 to 2010-09-10
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-11-11 to 2010-11-12

I didn't check these histories, but here are some SM new ones:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1290407408.1290408743.13549.gz&fulltext=1
OS X 10.6 comm-central-trunk debug test xpcshell on 2010/11/21 22:30:08
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1290413345.1290414782.3889.gz&fulltext=1
OS X 10.6 comm-central-trunk debug test xpcshell on 2010/11/22 00:09:05


(In reply to comment #0)
> In this case, we appear to have an unknown error because the error passed to
> OnStopRequest was NS_ERROR_NET_INTERRUPT.

This was re-discussed in bug 599609 (test_pop3ServerBrokenCRAMDisconnect.js) too.
Comment 10 Ben Bucksch (:BenB) 2010-11-22 15:08:06 PST
> http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
> from 2010-11-11 to 2010-11-12

Serge, did you actually check them? In these 5 builds, *all* tests failed. That must have been bad builds.
Comment 11 Serge Gautherie (:sgautherie) 2012-01-12 12:16:14 PST
(In reply to Ben Bucksch (:BenB) from comment #10)
> Serge, did you actually check them? In these 5 builds, *all* tests failed.
> That must have been bad builds.

Those are too old now. Sorry.

*****

But this is still happening (intermittently):

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1326357109.1326359855.30935.gz
OS X 10.5 comm-central-trunk debug test xpcshell on 2012/01/12 00:31:49
{
...
TEST-PASS | /builds/slave/test/build/xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js | [run_test : 145] smtptest == smtptest

###!!! ASSERTION: unknown error, but don't alert user.: 'errorID != UNKNOWN_ERROR', file /builds/slave/comm-cen-trunk-osx-dbg/build/mailnews/base/util/nsMsgProtocol.cpp, line 469
nsMsgProtocol::OnStopRequest(nsIRequest*, nsISupports*, unsigned int)+0x0000023B [/builds/slave/test/build/SeaMonkeyDebug.app/Contents/MacOS/XUL +0x015A7AA1]
...


xpcshell: 1496/1
}
Comment 12 Hiroyuki Ikezoe (:hiro) 2012-02-12 21:06:23 PST
Created attachment 596562 [details] [diff] [review]
Possible fix

I can reproduce test_smtpAuthMethods.js failure on my local machine and this patch fixes the failure.

I can not reproduce test_smtpPasswordFailure1.js failure case, but I hope this patch also fixes it.
Comment 13 Serge Gautherie (:sgautherie) 2012-02-13 00:01:57 PST
Comment on attachment 596562 [details] [diff] [review]
Possible fix

I believe we will want to check/fix the 5 other tests too:
http://mxr.mozilla.org/comm-central/search?string=sendMailMessage(&case=1&find=%2Fmailnews%2Fcompose%2Ftest%2Funit%2F
"Found 8 matching lines in 7 files"
Comment 14 Ben Bucksch (:BenB) 2012-02-13 05:57:00 PST
Thanks, Hiroyuki!

It seems to me it would be better to fix sendMailMessage() or request itself, though, to fix all tests that use them. It's a bug in them apparently, not in the test.
Comment 15 David :Bienvenu 2012-02-13 07:45:11 PST
an other approach would be to use a request observer and wait for the onStopRequest before running the next test, instead of cancelling the previous request.
Comment 16 Hiroyuki Ikezoe (:hiro) 2012-02-13 20:34:59 PST
(In reply to David :Bienvenu from comment #15)
> an other approach would be to use a request observer and wait for the
> onStopRequest before running the next test, instead of cancelling the
> previous request.

Yes, I had tried it first, but I couldn't find the way how to set the request observer in this test.
Comment 17 Hiroyuki Ikezoe (:hiro) 2012-02-13 20:37:38 PST
(In reply to Ben Bucksch (:BenB) from comment #14)
> Thanks, Hiroyuki!
> 
> It seems to me it would be better to fix sendMailMessage() or request
> itself, though, to fix all tests that use them. It's a bug in them
> apparently, not in the test.

Ah! Now I got your point. You mean that calling sendMailMessage twice hits this assertion, right? Like this:

  smtpService.sendMailMessage(someFile, kTo, identity,
                              null, null, null, null,
                              false, {}, {});
  smtpService.sendMailMessage(anotherFile, kTo, identity,
                              null, null, null, null,
                              false, {}, {});
Comment 18 Hiroyuki Ikezoe (:hiro) 2012-02-13 20:58:56 PST
(In reply to Hiroyuki Ikezoe (:hiro) from comment #17)
> (In reply to Ben Bucksch (:BenB) from comment #14)
> > Thanks, Hiroyuki!
> > 
> > It seems to me it would be better to fix sendMailMessage() or request
> > itself, though, to fix all tests that use them. It's a bug in them
> > apparently, not in the test.
> 
> Ah! Now I got your point. You mean that calling sendMailMessage twice hits
> this assertion, right? Like this:
> 
>   smtpService.sendMailMessage(someFile, kTo, identity,
>                               null, null, null, null,
>                               false, {}, {});
>   smtpService.sendMailMessage(anotherFile, kTo, identity,
>                               null, null, null, null,
>                               false, {}, {});

Urgh, I have misunderstood..
Comment 19 David :Bienvenu 2012-02-13 21:10:49 PST
sendMailMessage takes an optional url listener - when onStopRunningUrl is called on the listener, it should be OK to call it again, or after a return to the event loop with a yield or set timeout of 0.
Comment 20 Hiroyuki Ikezoe (:hiro) 2012-02-13 21:48:38 PST
Created attachment 596922 [details] [diff] [review]
The fix for nsSmtpProtocol

Set m_sendDone true whenever sending "QUIT" command.

I am not really sure this patch has no side-effects. I just run unit test in mailnews/compose/test/unit. All the tests in the path passed.
Comment 21 Hiroyuki Ikezoe (:hiro) 2012-02-13 21:51:12 PST
(In reply to David :Bienvenu from comment #19)
> sendMailMessage takes an optional url listener - when onStopRunningUrl is
> called on the listener, it should be OK to call it again

I did actually try it, but it couldn't fix this assertion.

> or after a return to the event loop with a yield or set timeout of 0.

Maybe this way fixes the assertion, but I do not prefer timeout magic....
Comment 22 Mark Banner (:standard8, limited time in Dec) 2012-02-14 00:47:15 PST
Comment on attachment 596922 [details] [diff] [review]
The fix for nsSmtpProtocol

I'd certainly like to take a look at this as I've touched code in this area before.
Comment 23 David :Bienvenu 2012-02-14 08:24:58 PST
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> (In reply to David :Bienvenu from comment #19)
> > sendMailMessage takes an optional url listener - when onStopRunningUrl is
> > called on the listener, it should be OK to call it again
> 
> I did actually try it, but it couldn't fix this assertion.
> 
> > or after a return to the event loop with a yield or set timeout of 0.
> 
> Maybe this way fixes the assertion, but I do not prefer timeout magic....

returning to the event loop to let the stack unwind isn't magic - but I agree that things like settimeout(100) are magic and to be avoided.
Comment 24 Mark Banner (:standard8, limited time in Dec) 2012-03-19 08:23:30 PDT
Created attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

This is Hiroyuki's patch fixed for bitrot and one comment improvement.

Sorry for the delay in getting to this. As far as I can tell this is the right thing to do - OnStopRequest was trying to work out if the connection had been dropped whilst doing an auth, but also coupled with one of the steps returning an error.

My suspicion would be a race condition somewhere possibly being caused by the test infrastructure, but as far as I can tell this fixes the assertion.
Comment 25 Serge Gautherie (:sgautherie) 2012-03-19 11:05:09 PDT
Comment on attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

http://hg.mozilla.org/comm-central/rev/6a883737262a


[Approval Request Comment]
I assume that would be good for branches too...
Comment 26 Mark Banner (:standard8, limited time in Dec) 2012-04-05 13:26:47 PDT
Comment on attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

Not taking forward as this is a protocol level change which although simple, I'd still prefer to get tested, and according to the new bug created doesn't seem to fix the test issue fully anyway.

Note You need to log in before you can comment on or make changes to this bug.