xpcshell: test_smtpAuthMethods.js, test_smtpPasswordFailure1.js and others report intermittent "ASSERTION: unknown error, but don't alert user."

RESOLVED FIXED in Thunderbird 14.0

Status

MailNews Core
Networking: SMTP
--
major
RESOLVED FIXED
7 years ago
2 years ago

People

(Reporter: standard8, Assigned: standard8)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {assertion, regression})

Trunk
Thunderbird 14.0
All
Mac OS X
assertion, regression
Dependency tree / graph

Thunderbird Tracking Flags

(thunderbird11 wontfix, thunderbird12 wontfix, thunderbird13 wontfix, thunderbird-esr10 wontfix)

Details

Attachments

(3 attachments, 2 obsolete attachments)

(Assignee)

Description

7 years ago
Created attachment 433927 [details]
test_smtpAuthMethods.js log

I'm trying to run test_smtpAuthMethods.js in debug mode using trunk builds.

The test is failing part way through:

TEST-PASS | /Users/moztest/comm/main/tb/mozilla/_tests/xpcshell/test_compose/unit/head_compose.js | [do_check_transaction : 64] EHLO test == EHLO test
Passed test Encrypted password, with server only supporting AUTH PLAIN (must fail)
NEXT test: Any secure method, with server supporting AUTH PLAIN, LOGIN and CRAM
WARNING: NS_ENSURE_TRUE(aCallbacks) failed: file /Users/moztest/comm/main/src/mailnews/compose/src/nsSmtpUrl.cpp, line 815
WARNING: NS_ENSURE_TRUE(aUrlListener) failed: file /Users/moztest/comm/main/src/mailnews/base/util/nsMsgMailNewsUrl.cpp, line 144
###!!! ASSERTION: unknown error, but don't alert user.: 'errorID != UNKNOWN_ERROR', file /Users/moztest/comm/main/src/mailnews/base/util/nsMsgProtocol.cpp, line 439

See the attached file for the full log.

Tests shouldn't assert in debug mode as this makes it hard for developers to run them in their normal debugging builds.

In this case, we appear to have an unknown error because the error passed to OnStopRequest was NS_ERROR_NET_INTERRUPT.

test_smtpPasswordFailure1.js also fails in a similar manner. It never used to in debug mode, certainly not on 3.0 branch.
(Assignee)

Comment 1

7 years ago
Created attachment 433928 [details]
test_smtpPasswordFailure1.js log

Comment 2

7 years ago
This is totally dodgy code in nsMsgProtocol.cpp, from what I see.
It takes a generic nserror, then checks it against a small number of known net errors (mostly unknown host, connection refused, net timeout), and if it's none of these known errors, it doesn't know what to do and asserts.

I'd need to investigate why we run in there in the first place. But from the existing errors checks, it seems legitimate to run into this code in error cases and should be adapted/fixed.
(Assignee)

Comment 3

7 years ago
(In reply to comment #2)
> I'd need to investigate why we run in there in the first place. But from the
> existing errors checks, it seems legitimate to run into this code in error
> cases and should be adapted/fixed.

I can understand adapting it for error cases, however what I don't understand is why is test_smtpPasswordFailure1.js now getting a network interrupt where it wasn't before?

Is this a regression in the fake server or something? As it sounds to me like something is now dropping a connection on the floor unexpectedly.

Comment 4

7 years ago
> what I don't understand is why is test_smtpPasswordFailure1.js now getting
> a network interrupt where it wasn't before?

Neither do I, that's why I said:
> I'd need to investigate why we run in there in the first place.

> regression in the fake server

unlikely, as the fakeserver change went in weeks ago.
My backend code added a number of new error codes. Maybe it's running into one of them. Maybe it's setting a wrong auth method and failing because of that. I can only speculate atm...
(Assignee)

Comment 5

7 years ago
(In reply to comment #4)
> > what I don't understand is why is test_smtpPasswordFailure1.js now getting
> > a network interrupt where it wasn't before?
> 
> Neither do I, that's why I said:
> > I'd need to investigate why we run in there in the first place.

Sorry, I misread that bit.

Comment 6

7 years ago
> I'm trying to run test_smtpAuthMethods.js in debug mode using trunk builds.
> The test is failing part way through:

Works for me, on Linux opt and debug.

---

This bug has 2 parts:
- nsMsgProtocol.cpp needs to be fixed, to either consider the new error codes,
  or be more generic (preferred).
- Find out why we run in there, and why in some cases and not others.
  Given that I cannot reproduce, and the msg in the log file doesn't even
  give the error code, much less descr or stack, I can't fix that.
(Assignee)

Comment 7

7 years ago
Hmm, I've just been running these tests again, and I don't see the error. I guess either something got fixed (although I couldn't see anything) or my build was broken in some way.

Anyway I've tried both trunk and 3.1 and got no problems, so marking this as WFM.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → WORKSFORME
SM:
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/test_compose/unit/test_smtpPasswordFailure1.js
from 2010-05-08 to 2010-09-02
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-09-10 to 2010-11-12

TB:
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/test_compose/unit/test_smtpPasswordFailure1.js
from 2010-06-02 to 2010-08-14
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=objdir/mozilla/_tests/xpcshell/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-09-10 to 2010-09-10
http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
from 2010-11-11 to 2010-11-12

I didn't check these histories, but here are some SM new ones:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1290407408.1290408743.13549.gz&fulltext=1
OS X 10.6 comm-central-trunk debug test xpcshell on 2010/11/21 22:30:08
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1290413345.1290414782.3889.gz&fulltext=1
OS X 10.6 comm-central-trunk debug test xpcshell on 2010/11/22 00:09:05


(In reply to comment #0)
> In this case, we appear to have an unknown error because the error passed to
> OnStopRequest was NS_ERROR_NET_INTERRUPT.

This was re-discussed in bug 599609 (test_pop3ServerBrokenCRAMDisconnect.js) too.
Blocks: 599609
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Summary: test_smtpAuthMethods.js and test_smtpPasswordFailure1.js assert when run in debug mode: ###!!! ASSERTION: unknown error, but don't alert user. → xpcshell: test_smtpAuthMethods.js, test_smtpPasswordFailure1.js and others report intermittent "ASSERTION: unknown error, but don't alert user."
(In reply to comment #8)

That comment was about test_smtpPasswordFailure1.js.

test_smtpAuthMethods.js is still failing too, most notably:
http://brasstacks.mozilla.com/topfails/test/SeaMonkey?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpAuthMethods.js
from 2010-09-09 to 2010-11-21

***

http://mxr.mozilla.org/comm-central/source/mailnews/compose/test/unit/test_smtpAuthMethods.js
http://mxr.mozilla.org/comm-central/source/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
Severity: normal → major

Comment 10

7 years ago
> http://brasstacks.mozilla.com/topfails/test/Thunderbird?name=xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js
> from 2010-11-11 to 2010-11-12

Serge, did you actually check them? In these 5 builds, *all* tests failed. That must have been bad builds.
(In reply to Ben Bucksch (:BenB) from comment #10)
> Serge, did you actually check them? In these 5 builds, *all* tests failed.
> That must have been bad builds.

Those are too old now. Sorry.

*****

But this is still happening (intermittently):

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1326357109.1326359855.30935.gz
OS X 10.5 comm-central-trunk debug test xpcshell on 2012/01/12 00:31:49
{
...
TEST-PASS | /builds/slave/test/build/xpcshell/tests/mailnews/compose/test/unit/test_smtpPasswordFailure1.js | [run_test : 145] smtptest == smtptest

###!!! ASSERTION: unknown error, but don't alert user.: 'errorID != UNKNOWN_ERROR', file /builds/slave/comm-cen-trunk-osx-dbg/build/mailnews/base/util/nsMsgProtocol.cpp, line 469
nsMsgProtocol::OnStopRequest(nsIRequest*, nsISupports*, unsigned int)+0x0000023B [/builds/slave/test/build/SeaMonkeyDebug.app/Contents/MacOS/XUL +0x015A7AA1]
...


xpcshell: 1496/1
}
Keywords: assertion
Depends on: 712540
Created attachment 596562 [details] [diff] [review]
Possible fix

I can reproduce test_smtpAuthMethods.js failure on my local machine and this patch fixes the failure.

I can not reproduce test_smtpPasswordFailure1.js failure case, but I hope this patch also fixes it.
Attachment #596562 - Flags: review?(mbanner)
Comment on attachment 596562 [details] [diff] [review]
Possible fix

I believe we will want to check/fix the 5 other tests too:
http://mxr.mozilla.org/comm-central/search?string=sendMailMessage(&case=1&find=%2Fmailnews%2Fcompose%2Ftest%2Funit%2F
"Found 8 matching lines in 7 files"

Comment 14

5 years ago
Thanks, Hiroyuki!

It seems to me it would be better to fix sendMailMessage() or request itself, though, to fix all tests that use them. It's a bug in them apparently, not in the test.

Comment 15

5 years ago
an other approach would be to use a request observer and wait for the onStopRequest before running the next test, instead of cancelling the previous request.
(In reply to David :Bienvenu from comment #15)
> an other approach would be to use a request observer and wait for the
> onStopRequest before running the next test, instead of cancelling the
> previous request.

Yes, I had tried it first, but I couldn't find the way how to set the request observer in this test.
(In reply to Ben Bucksch (:BenB) from comment #14)
> Thanks, Hiroyuki!
> 
> It seems to me it would be better to fix sendMailMessage() or request
> itself, though, to fix all tests that use them. It's a bug in them
> apparently, not in the test.

Ah! Now I got your point. You mean that calling sendMailMessage twice hits this assertion, right? Like this:

  smtpService.sendMailMessage(someFile, kTo, identity,
                              null, null, null, null,
                              false, {}, {});
  smtpService.sendMailMessage(anotherFile, kTo, identity,
                              null, null, null, null,
                              false, {}, {});
(In reply to Hiroyuki Ikezoe (:hiro) from comment #17)
> (In reply to Ben Bucksch (:BenB) from comment #14)
> > Thanks, Hiroyuki!
> > 
> > It seems to me it would be better to fix sendMailMessage() or request
> > itself, though, to fix all tests that use them. It's a bug in them
> > apparently, not in the test.
> 
> Ah! Now I got your point. You mean that calling sendMailMessage twice hits
> this assertion, right? Like this:
> 
>   smtpService.sendMailMessage(someFile, kTo, identity,
>                               null, null, null, null,
>                               false, {}, {});
>   smtpService.sendMailMessage(anotherFile, kTo, identity,
>                               null, null, null, null,
>                               false, {}, {});

Urgh, I have misunderstood..

Comment 19

5 years ago
sendMailMessage takes an optional url listener - when onStopRunningUrl is called on the listener, it should be OK to call it again, or after a return to the event loop with a yield or set timeout of 0.
Created attachment 596922 [details] [diff] [review]
The fix for nsSmtpProtocol

Set m_sendDone true whenever sending "QUIT" command.

I am not really sure this patch has no side-effects. I just run unit test in mailnews/compose/test/unit. All the tests in the path passed.
Attachment #596562 - Attachment is obsolete: true
Attachment #596922 - Flags: review?(ben.bucksch)
Attachment #596562 - Flags: review?(mbanner)
(In reply to David :Bienvenu from comment #19)
> sendMailMessage takes an optional url listener - when onStopRunningUrl is
> called on the listener, it should be OK to call it again

I did actually try it, but it couldn't fix this assertion.

> or after a return to the event loop with a yield or set timeout of 0.

Maybe this way fixes the assertion, but I do not prefer timeout magic....
(Assignee)

Comment 22

5 years ago
Comment on attachment 596922 [details] [diff] [review]
The fix for nsSmtpProtocol

I'd certainly like to take a look at this as I've touched code in this area before.
Attachment #596922 - Flags: review?(mbanner)

Comment 23

5 years ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> (In reply to David :Bienvenu from comment #19)
> > sendMailMessage takes an optional url listener - when onStopRunningUrl is
> > called on the listener, it should be OK to call it again
> 
> I did actually try it, but it couldn't fix this assertion.
> 
> > or after a return to the event loop with a yield or set timeout of 0.
> 
> Maybe this way fixes the assertion, but I do not prefer timeout magic....

returning to the event loop to let the stack unwind isn't magic - but I agree that things like settimeout(100) are magic and to be avoided.
(Assignee)

Comment 24

5 years ago
Created attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

This is Hiroyuki's patch fixed for bitrot and one comment improvement.

Sorry for the delay in getting to this. As far as I can tell this is the right thing to do - OnStopRequest was trying to work out if the connection had been dropped whilst doing an auth, but also coupled with one of the steps returning an error.

My suspicion would be a race condition somewhere possibly being caused by the test infrastructure, but as far as I can tell this fixes the assertion.
Attachment #596922 - Attachment is obsolete: true
Attachment #607175 - Flags: review+
Attachment #596922 - Flags: review?(mbanner)
Attachment #596922 - Flags: review?(ben.bucksch)
Comment on attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

http://hg.mozilla.org/comm-central/rev/6a883737262a


[Approval Request Comment]
I assume that would be good for branches too...
Attachment #607175 - Attachment description: The fix for nsSmtpProtocol v2 → The fix for nsSmtpProtocol v2 [Checked in: Comment 25]
Attachment #607175 - Flags: approval-comm-beta?
Attachment #607175 - Flags: approval-comm-aurora?
(Assignee)

Updated

5 years ago
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago5 years ago
Resolution: --- → FIXED
Assignee: nobody → mbanner
Target Milestone: --- → Thunderbird 14.0
No longer depends on: 712540
Blocks: 739007
(Assignee)

Comment 26

5 years ago
Comment on attachment 607175 [details] [diff] [review]
The fix for nsSmtpProtocol v2
[Checked in: Comment 25]

Not taking forward as this is a protocol level change which although simple, I'd still prefer to get tested, and according to the new bug created doesn't seem to fix the test issue fully anyway.
Attachment #607175 - Flags: approval-comm-beta?
Attachment #607175 - Flags: approval-comm-beta-
Attachment #607175 - Flags: approval-comm-aurora?
Attachment #607175 - Flags: approval-comm-aurora-
status-thunderbird11: --- → wontfix
status-thunderbird12: --- → wontfix
status-thunderbird13: --- → wontfix
status-thunderbird-esr10: --- → wontfix
Depends on: 780124
You need to log in before you can comment on or make changes to this bug.