Intermittent comm/mailnews/compose/test/unit/test_sendMailAddressIDN.js | onStopSending - [onStopSending : 63] 0 == 2153066735
Categories
(MailNews Core :: Networking: SMTP, defect, P5)
Tracking
(thunderbird_esr78 unaffected)
Tracking | Status | |
---|---|---|
thunderbird_esr78 | --- | unaffected |
People
(Reporter: intermittent-bug-filer, Assigned: gds)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(1 file, 1 obsolete file)
11.54 KB,
patch
|
mkmelin
:
review+
gds
:
feedback+
|
Details | Diff | Splinter Review |
Filed by: mkmelin [at] iki.fi
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=315942866&repo=comm-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/EM4hZWy0TKuQqR2PsLw-eQ/runs/0/artifacts/public/logs/live_backing.log
Started after bug 1563891```
Assignee | ||
Comment 1•4 years ago
•
|
||
(In reply to Ping Chen (:rnons) from Bug 1563891 comment #54)
Don't know why
test_sendMailAddressIDN.js
becomes unstable on my local, if I run it ten times, about five times will fail. It seems there are two places that can triggeronStopSending
:
- https://searchfox.org/comm-central/source/mailnews/compose/src/nsSmtpProtocol.cpp#2113
- https://searchfox.org/comm-central/source/mailnews/base/src/nsMsgProtocol.cpp#393
When
nsMsgProtocol::OnStopRequest
is called first, theaStatus
will be 0, and the test will fail.
Is this only failing on OSX?
On linux I see (1.) occurring first with the correct error code so it doesn't fail with many runs. (2.) also occurs but with error 0x80004004 (NS_ERROR_ABORT). I never see this with a 0 status code. (I think the NS_ERROR_ABORT occurs because I call do_test_finish().)
It appears that the failure is always with the last test based on the referenced logs. Is that always true? If so, you should see this as the last passing test in the log: Passed test børken.to@invalid.foo.invalid
. But if the last test passes you will see this last: Passed test børken.to
. (The last test tests the case where there is no @ in the address.) Only these 2 tests cause a check at the failing line 63.
Note: There is one error in my comments added to nsSmtpProtocol.cpp but it doesn't affect the tests or how the code works AFAICT . When the last test attempts to send a message to an address without @, the address actually received by the protocol code is empty. This is based on observations with printfs added to the protocol code running on linux. I assume this is also true for Windows and OSX but I can't test with those locally.
Comment 2•4 years ago
|
||
I can reproduce on two of my Linux machines. Both the last two cases can fail, see the latest build https://treeherder.mozilla.org/#/jobs?repo=comm-central&selectedTaskRun=ZwA9AhHvTaG1piRN0CKQ0g.0.
Assignee | ||
Comment 3•4 years ago
|
||
(In reply to Ping Chen (:rnons) from comment #2)
I can reproduce on two of my Linux machines. Both the last two cases can fail, see the latest build https://treeherder.mozilla.org/#/jobs?repo=comm-central&selectedTaskRun=ZwA9AhHvTaG1piRN0CKQ0g.0.
I'm not real familiar with reading the test results. Is the failure always in the ones labeled "X2"? If so, I think I only see failure for optimized OSX.
I have only tested locally with linux debug build. Are you testing locally with optimized or debug on linux?
I have run the test like this many time with my debug build and never see an error:
gene@hplt:~/mozilla$ ./mach xpcshell-test comm/mailnews/compose/test/unit/test_sendMailAddressIDN.js
:
:
xpcshell
~~~~~~~~
Ran 27 checks (26 subtests, 1 tests)
Expected results: 27
Unexpected results: 0
OK
0:01.75 INFO Node moz-http2 server shutting down ...
0:01.75 INFO http3Server server shutting down ...
Assignee | ||
Comment 4•4 years ago
|
||
I rebuilt with optimization and without debug on another linux machine. Now I see the failure. Not sure if the failure is due to optimization or because this machine is a somewhat faster desktop rather than the slower laptop that always passes running the debug build.
Comment 5•4 years ago
|
||
Because it's an xpcshell test, it will always be in the X group. There is some script that divides all xpcshell tests to X1-X4.
I didn't enable optimization or debug. I only have ac_add_options --enable-application=comm/mail
in my mozconfig. Also, I learnt about --verify
option yesterday to run a test for 10 times, so you can use
./mach test --verify comm/mailnews/compose/test/unit/test_sendMailAddressIDN.js
.
Comment 6•4 years ago
|
||
Some modernization of the test, but it's still unreliable. Dunno why. I think it's like Ping reported...
Assignee | ||
Comment 7•4 years ago
|
||
I think I may have a fix. The problem is because the return of NS_ERROR_BUT_DONT_SHOW_ALERT here
https://searchfox.org/comm-central/rev/27cb9951c097a9b15f8952a79617921fdcc568da/mailnews/compose/src/nsSmtpProtocol.cpp#706
is causing SendQuit() to be called here
https://searchfox.org/comm-central/rev/27cb9951c097a9b15f8952a79617921fdcc568da/mailnews/compose/src/nsSmtpProtocol.cpp#2140
sooner than the SMTP_ERROR_DONE state, where the error state NS_ERROR_BUT_DONT_SHOW_ALERT is set here
https://searchfox.org/comm-central/rev/27cb9951c097a9b15f8952a79617921fdcc568da/mailnews/compose/src/nsSmtpProtocol.cpp#2113
After the QUIT and the server disconnects OnStopRequest() is triggered with status 0. As Ping pointed out, if this occurs before SMTP_ERROR_DONE is acted upon, the 0 status is returned to the test and it fails. But if the SMTP_ERROR_DONE state is reached and acts first, the test passes because NS_ERROR_BUT_DONT_SHOW_ALERT is returned to the test as expected.
So the change is to return NS_OK here instead of NS_ERROR_BUT_DONT_SHOW_ALERT:
https://searchfox.org/comm-central/rev/27cb9951c097a9b15f8952a79617921fdcc568da/mailnews/compose/src/nsSmtpProtocol.cpp#706
This will ensure SendQuit() occurs last.
This seems to fix the problem on the machine that fails.
I don't think this affects anything in the real world but I'll test some more to be sure. Another easy fix is just remove the offending check from the test but that just ignores the race issue.
P/S: I haven't yet tried Magnus' new test code.
Assignee | ||
Comment 8•4 years ago
|
||
Here's a patch that fixes the problem by changing the mentioned return value to NS_OK. I also updated a few comments that were not quite right. These are both in nsSmtpProtocol.cpp
I also tested with your modernized version of test_sendMailAddressIDN.js and included those changes in the patch.
Assignee | ||
Comment 9•4 years ago
|
||
Comment 10•4 years ago
|
||
Comment 11•4 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/5be6a8a9d8b1
Fix to avoid failure in and updates to test_sendMailAddressIDN.js r=mkmelin
Comment hidden (Intermittent Failures Robot) |
Updated•4 years ago
|
Description
•