Closed Bug 1516134 Opened Last year Closed 11 months ago

Account creation not working in TB 65 beta

Categories

(Thunderbird :: Account Manager, defect, blocker)

defect
Not set
blocker

Tracking

(thunderbird65 fixed, thunderbird66 fixed)

RESOLVED FIXED
Thunderbird 66.0
Tracking Status
thunderbird65 --- fixed
thunderbird66 --- fixed

People

(Reporter: jorgk-bmo, Assigned: aceman)

References

Details

(Keywords: regression, Whiteboard: See comment 4 for reproduction steps. Appears only 60% of the time.)

Attachments

(2 files)

I created a new profile with TB 65 beta and wanted to set up a normal IMAP account. Typed in all the details, then manual config.

In the end I did "Re-test", TB was all happy, but clicking "Done" got me:

Programming bug. Assertion failed, see log.

In the error console I got:

2018-12-22 20:35:28	mail.setup	ERROR	Error creating account.  ex=Programming bug. Assertion failed, see log., stack=Exception@chrome://messenger/content/accountcreation/util.js:122:3
NotReached@chrome://messenger/content/accountcreation/util.js:135:3
assert@chrome://messenger/content/accountcreation/util.js:19:11
needed@chrome://messenger/content/accountcreation/emailWizard.js:1830:5
open@chrome://messenger/content/accountcreation/emailWizard.js:1883:18
onCreate@chrome://messenger/content/accountcreation/emailWizard.js:1661:7
oncommand@chrome://messenger/content/accountcreation/emailWizard.xul:1:1

log4moz.js:680

CApp_append resource:///modules/gloda/log4moz.js:680
Logger_log resource:///modules/gloda/log4moz.js:374
Logger_error resource:///modules/gloda/log4moz.js:382
onCreate chrome://messenger/content/accountcreation/emailWizard.js:1668
oncommand chrome://messenger/content/accountcreation/emailWizard.xul:1

I got my account created in the end by clicking "Advanced config", but somehow I'm not impressed. I'm not shipping a beta with account creation broken.

Looks like we focused so much on the Exchange stuff that no one pay attention to whether regular account creation is still working :-(

Wayne, manual account creations is part of the test script, no?
Flags: needinfo?(wls220spring)
Flags: needinfo?(vseerror)
Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(ben.bucksch)
Flags: needinfo?(acelists)
Hmm, doesn't always fail, but this made it fail:

Entered jorgk@jorgk.com which is hosted at Hetzner. It found mail.jorgk.com and I corrected to mail.your-server.de. Hit "Re-test", didn't like the STARTTLS settings. Corrected to SSL/TLS, "Re-test", then it was happy. Done then gave an error.

If you want, I can give you a test account on that server, so you can see for yourself.

Don't make me regret uplifting all the Exchange stuff, since as quickly as I uplifted, I will back it out again.
> Wayne, manual account creations is part of the test script, no?

Yes - all four types iirc. And Walt is one of the testers. The call for testing hasn't gone out yet - I'll request to put it on hold on the assumption we will want to rebuild
Flags: needinfo?(vseerror)
I tried to reproduce using the steps in comment 1 and with an @jorgk.com test account that Jörg created for me (thanks).
Here's what I see:
1. Enter yourusername@jorgk.com and password
2. Click [Continue]
3. -> It uses guessing and finds imap.jorgk.com and smtp.jorgk.com, both with STARTTLS and Normal password
4. Click [Manual Config]
5. Replace both hostnames with "mail.your-server.de"
6. Click [Re-Test]
7. -> The dialog is happy, no error, [Done] is enabled (this is *not* what Jörg saw in comment 1)
8. Click [Done]
9. -> Account is set up correctly, and I see its folders, including clearly custom folders
10. I delete the IMAP account again. I cannot delete the SMTP account, TB doesn't let me (it's the only one). I restart TB.

Next attempt:
11. Follow steps 1-6 above.
12. Now it "failed to find the settings". This is different from step 7 above and the same that Jörg saw.
13. Change to SSL
14. Click [Re-Test]
15. -> No error
16. Click [Done]
17. Error dialog "Programming bug. Assertion failed, see log.". On the Error Console, I get the same stack
    as Jörg in comment 0.

Duh. I'll have to see why [Re-Test] sometimes fails.

The assertion thrown says (according to emailWizard.js line 1830) that the account configuration is incomplete. It shouldn't be at this point, so it assertion itself it correct. It is probably a consequence of [Re-Test] not working correctly.
Flags: needinfo?(wls220spring)
Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(ben.bucksch)
Flags: needinfo?(acelists)
What's failing here is the function onHalfManualTest(), which calls guessConfig(). Something in guessConfig() must be failing and creating an incomplete config.

This is complex code (and the guess config wasn't written by me), I'll need some time to debug what's going wrong here.

----

I've looked at the Exchange patch, and found nothing that could cause this. I've made only small changes to guessConfig(), and I checked that they do *not* affect onHalfManualTest(). So, I reverted the Exchange patch and the later patches, recompiled, and tried again, and I still see the same bug. So, neither Exchange patches or the later ones cause this.

This bug existed before and is independent of the Exchange patches.

@Jörg: Given that you already reverted the Exchange patches, could you please test with them reverted, whether you still see the bug? Note that reproduction is not 100% reliable, see comment 4 step 7 vs. step 12.
If you still see the bug, could you please put the Exchange patches back on beta? If we fix this bug here, the backport to beta will be easier, if beta matches trunk. And the Exchange patches really need beta testing.
Flags: needinfo?(jorgk)
As far as I understood the thunderbird-drivers emails, 65 b1 build4 contains the Exchange patches and 65 b1 build5 does not contain them. For verification, I've downloaded http://ftp.mozilla.org/pub/thunderbird/candidates/65.0b1-candidates/build5/linux-x86_64/de/thunderbird-65.0b1.tar.bz2 and tested with that build, and I still see the bug. This should prove that the Exchange patches are not at fault.

I noticed that the bug appears only in about 60% of the cases. If in step 6 (with STARTTLS), I repeatedly press [Re-Test], it sometimes finds a valid config and sometimes not. This could be caused by either by a race condition in the code, or by a round-robin of the ISP and some hosts have a different software revision and don't respond to IMAP CAPACITY or SMTP EHLO as we expect (in the latter case, this would be ISP specific). I'll have to look at the network trace.
I tried build 5 and it failed as well :-( - My apologies for the hasty backout, but TB 65 beta 1 is already delayed by weeks and I didn't want to take additional risk. We can reland the Exchange stuff in beta 2.

I have set up my account @ Hetzner a felt million times in various test profiles and I have never seen the failure. I've just tried again in TB 60.4 and it worked fine. Maybe I was always lucky, or the unlucky percentage has increased.

So just repeating the steps that fail on TB 65:
1) enter details
2) auto-detections finds mail.jorgk.com/smtp.jorgk.com and user jorgk and STARTTLS
3) fill in the correct server mail.your-server.de and user to be the full e-mail address.
4) hit "Re-test", that fails, no idea why since STARTTLS should be supported.
5) Change to SSL/TLS and "Re-test", that succeeds.
6) Hit "Done" and get an error.
Flags: needinfo?(jorgk)
(In reply to Ben Bucksch (:BenB) from comment #6)
> I noticed that the bug appears only in about 60% of the cases. If in step 6
> (with STARTTLS), I repeatedly press [Re-Test], it sometimes finds a valid
> config and sometimes not. This could be caused by either by a race condition
> in the code, or by a round-robin of the ISP and some hosts have a different
> software revision and don't respond to IMAP CAPACITY or SMTP EHLO as we
> expect (in the latter case, this would be ISP specific). I'll have to look
> at the network trace.
Well, regardless whether STARTTLS is detected as valid, changing the settings to SSL/TLS, getting a *successful* re-test should not cause the "done" to fail, whatever happened before, right? So whether the ISP is flaky is not relevant, the code should always accept a configuration whose test finally passed.
Basically this fails on:
emailWizard.js:1830: assert(configSchema.isComplete());

isComplete is this: accountConfig.js
  isComplete() {
    return (!!this.incoming.hostname && !!this.incoming.port &&
         !!this.incoming.socketType && !!this.incoming.auth &&
         !!this.incoming.username &&
         (!!this.outgoing.existingServerKey ||
          this.outgoing.useGlobalPreferredServer ||
          (!!this.outgoing.hostname && !!this.outgoing.port &&
           !!this.outgoing.socketType && !!this.outgoing.auth &&
           !!this.outgoing.username)));

When I was at the point of all the "Done" errors, I reselected "Normal password" again on the panel, ran the re-test and then "Done" worked. So looks like the something lost its value, maybe the "auth". Dumping out the values would help.
OK, I added this clumsy debug and saw:

    dump(`1 ${this.incoming.hostname} 
          2 ${this.incoming.port}
          3 ${this.incoming.socketType}
          4 ${this.incoming.auth}
          5 ${this.incoming.username}
          6 ${this.outgoing.existingServerKey}
          7 ${this.outgoing.useGlobalPreferredServer}
          8 ${this.outgoing.hostname}
          9 ${this.outgoing.port}
         10 ${this.outgoing.socketType}
         11 ${this.outgoing.auth}
         12 ${this.outgoing.username}`);


1 mail.your-server.de
          2 993
          3 2
          4 0
          5 jorgk@jorgk.com
          6 null
          7 false
          8 mail.your-server.de
          9 465
         10 2
         11 0
         12 jorgk@jorgk.com
Programming bug. Assertion failed, see log.

So clearly, items 4 and 11 went to zero, which is the auth.
Right, reselecting "Normal password" makes it work, items 4 and 11 go to value 3.
I am not aware of a change in the account creation dialog that could cause this, but I'm not always asked for review, so something might have slipped my radar. It's possible that this is caused by UI widget changes in toolkit. We had strange problems with event handlers (onblur or similar) in the last 2 months.

A regression window would be useful to determine where this comes from: whether it's a account creation dialog change, or a toolkit change, and which exact one.
If you want a regression windows, you need to ask Alice. I hope you can make sense of comment #7. I'll send you the details/password in a PM. Thanks in advance.
Flags: needinfo?(alice0775)
Comment 4 has very precise reproduction steps.
Whiteboard: See comment 4 for reproduction steps. Appears only 60% of the time.
I think the keys is changing to SSL. Then the auth method gets lost internally and it fails on "Done".
I could also reproduce the problem and get the errors in console as in comment 0.
Looking through the changes in autoconfig, here are some candidates:
* Bug 1500620, Bug 1486051, Bug 1459184 - menulist
* Bug 1490265
* Bug 1431954
That's all I can find in the last 1.5 years that's relevant. That doesn't mean that they are the culprit, just candidates.

But the change might just as well be caused by toolkit changes. If the event handlers change behavior in a subtle way, it would also cause this bug.
(In reply to Jorg K (GMT+1) from comment #13)
> If you want a regression windows, you need to ask Alice. I hope you can make
> sense of comment #7. I'll send you the details/password in a PM. Thanks in
> advance.

What should I do?
Flags: needinfo?(alice0775)
Hi Alice :-)

You set up tbtest@jorgk.com. I'll send you in a private message. Enter some name, e-mail address and password on the first screen, then switch to manual config on the second one. Enter the correct server name, re-test. It will get stuck at STARTTLS, or even if it doesn't change both to SSL. Re-test will succeed, Done will fail. You can cancel out of the session and try with the next binary. The same instructions only longer in comment #6.
Regression window(not 100% confidence):
fails when "Re-test"

https://hg.mozilla.org/comm-central/pushloghtml?fromchange=531dfcb2d29dcd8a6dab337faf4fd2685fcf9d40&tochange=d303f46daae419dda630fa997e292b4c3fbe41ba
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ddadc29de671917f66478e62a6accd4764892d25&tochange=3cc04ee79005058d817daf66da7963dfac3f0a3a


(Even if a good build, testing seems to fail intermittently. 
It is something like "Account Locking" by brute force attacks(Clicking the button repeatedly).)
Thanks Alice. The C-C range shows changes we made to compensate for regressions caused by bug 1437064, see bug 1437064 comment #5. M-C fixed the issues in bug 1501138 so we reverted all the changes, see here:
https://hg.mozilla.org/comm-central/rev/5f68c3830a75

So it is likely that something happened in that range, but then it should have become good around the 2018-11-09 again.

Never mind that the testing seems to fail intermittently. What is important is that changing the fields to SSL after the STARTTLS test failed or didn't fail for me at least always shows the programming error whereas in TB 60 this error doesn't happen.

Can you take another look please.
Regarding the error after clicking [Done]

STR
Type valid "Enter Your name", "Email address", "Password" field
Click on [Continie] and wait
Click on [Manual config]

Type valid Incomming/Outgoing Server hostname field
Click on [Re-Test] and wait for the detection of server settings

Chage SSL field to SSL/TLS for Incomming and Outgoing
Click on [Re-Test] and wait for the detection of server settings
Click on [Done]


Actual Results:
Error Creating Account
 Programming bug. Assertion failed, see log.

Expected Results:
Successfully created the folder for the account without error



Regression window:
https://hg.mozilla.org/comm-central/pushloghtml?fromchange=7ac42d333c2085e6df3ca99d470e2ed9937cdadd&tochange=fbd01a6e230fec49fd193f983ed9842747fe9c3c
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f6df375b86987b2772067a61873ebfe3a98c353a&tochange=84b9cfcef11a337689bfb15944cf480e3cec415a
Thank to so much Alice. We were afraid of this outcome. Ben suspected in comment #17 that some toolkit or menulist changes could be at fault here. I suspect bug 1500620.

Over to Aceman and Geoff.
Flags: needinfo?(geoff)
Flags: needinfo?(acelists)
Account creation works fine after changing to SSL after local backout of rev fbd01a6e230f - \o/ !!!
Good catch Jorg and Alice.
I think I see the problem.
When the menulist is already editable, we still run onChangedOutgoingDropdown and expect the onChangedManualEdit() to be run, but we don't because the binding is already attached so bindingattached event isn't fired so onChangedManualEdit() isn't run again.
Assignee: nobody → acelists
Blocks: 1500620
Status: NEW → ASSIGNED
Flags: needinfo?(acelists)
OS: Unspecified → All
Hardware: Unspecified → All
Attached patch 1516134.patchSplinter Review
Let's add more hacks onto a hack:) This seems to work for me and try run also looks good:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=0a8932722b5bf58e5ff545d702909442c7f5b8b7
Attachment #9034591 - Flags: review?(jorgk)
Attachment #9034591 - Flags: review?(ben.bucksch)
Comment on attachment 9034591 [details] [diff] [review]
1516134.patch

Looks plausible and works. Ben can look at it later, and so can Geoff.
Flags: needinfo?(geoff)
Attachment #9034591 - Flags: review?(jorgk)
Attachment #9034591 - Flags: review?(geoff)
Attachment #9034591 - Flags: review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/1968c2aef7ae
run onChangedManualEdit() in emailWizard even if menulist hasn't changed from non-editable to editable, so that settings are read properly. r=jorgk DONTBUILD
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 66.0
Comment on attachment 9034591 [details] [diff] [review]
1516134.patch

Seems alright to me. I really hate this two-bindings situation.
Attachment #9034591 - Flags: review?(geoff) → review+
Comment on attachment 9034591 [details] [diff] [review]
1516134.patch

All water down the bridge by now.
Attachment #9034591 - Flags: review?(ben.bucksch)
You need to log in before you can comment on or make changes to this bug.