Closed Bug 1185366 Opened 5 years ago Closed 5 months ago

[ISPDB] support office365 via generic solution (unfortunate MX overlap with outlook.com)

Categories

(Thunderbird :: Account Manager, defect)


Tracking

(thunderbird_esr68 71+ fixed, thunderbird71 fixed, thunderbird72 fixed)

VERIFIED FIXED
Thunderbird 72.0
Tracking Status
thunderbird_esr68 71+ fixed
thunderbird71 --- fixed
thunderbird72 --- fixed

People

(Reporter: anthony.bailey, Assigned: BenB)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

Attached file easystreet.net
No description provided.
This is an unfortunate case here.  As I mentioned in your bug 1185363 (iinet.com/pacifier.net/friends), we have MX lookup heuristics.  Unfortunately, Microsoft loves unified branding so much that we run into a problem where we get your MX record of "easystreet-net.mail.protection.outlook.com" and normalize that to "outlook.com" which also happens to be the same root domain for Microsoft's hotmail/live/outlook consumer service.
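
To make the collision concrete, here is a minimal sketch of the truncation step, assuming a naive "keep the last two labels" rule. `naiveBaseDomain` is a hypothetical helper for illustration only, not Thunderbird's or Gaia's code; a real client would consult the public suffix list rather than count labels.

```javascript
// Hypothetical helper: keep only the last two labels of a hostname.
// Assumption: the TLD is a single label (true for ".com", not for "ne.jp").
function naiveBaseDomain(hostname) {
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  return labels.slice(-2).join(".");
}

const office365Mx = "easystreet-net.mail.protection.outlook.com";
const consumerDomain = "outlook.com";

// Both values reduce to the same lookup key, which is the overlap described above.
console.log(naiveBaseDomain(office365Mx) === naiveBaseDomain(consumerDomain));
```

Because both inputs reduce to the same key, any lookup keyed on that value alone cannot tell the hosted Office 365 domain from the consumer service.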

I think Thunderbird has a mechanism by which it tries options in order that could work for this.  (Firefox OS email unfortunately took a short-cut for various reasonable-ish reasons, but I can address that.)  A big question would be how the login failure is interpreted...

:BenB, it seems like Office 365 hosting is likely to be an ongoing thing where we'd benefit from the MX lookup heuristic working, what are your thoughts on this case?  The possibilities that spring to mind for me are:

A) Have the outlook.com entry also list the Office 365 option as the second IMAP choice to try.  Assuming the try-in-order logic is real and works, the issue with this is we will make a failing attempt on the first try for consumer outlook.  As long as this doesn't terminate autoconfig, this doesn't seem all that terrible since Microsoft runs all services, so it's not like we're telling Microsoft a username/password they didn't already know.

B) Same as 'A' but also add some additional, optional gunk to the entry for outlook.com so that an enhanced email client could directly pick the office 365 entry based on the MX and avoid the failed login.  We'd annotate the entry with the MX suffix, and the rule for match picking would be picking the longest suffix match.

C) Add some support for the MX lookups not exclusively truncating to the base domain.  Since I'm only doing Firefox OS engineering work, Thunderbird would not benefit from this unless someone stepped up.  Variants:
C1) just add a hacky table in each application
C2) Allow for the autoconfig logic to also do a lookup against one additional domain subcomponent.  So we'd lookup "protection.outlook.com" and "outlook.com" and we'd pick the more specific "protection.outlook.com".  We would be incredibly strict about this, saving it for cases like this where outsourced mail hosting is in use and the mail host is very large.

D) Ask Microsoft to stand up an autoconfig server at https://autoconfig.outlook.com that would just serve the right settings.

E) Use Microsoft's autodiscover server that is already stood up at autodiscover.outlook.com to disambiguate in this case.  This would work for Gaia email which already has activesync support.  I mention this only for completeness.  Because Microsoft asserts patents over ActiveSync and Thunderbird currently isn't actually using ActiveSync, this seems inadvisable in general, although I should point out that there are open-source carve-outs that exempt open source software and it's possible the autodiscover pieces (which I don't think uses WBXML?  I forget, could check...) aren't part of that.

Note that some combination of strategies could work too.
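
For option B, the "longest suffix match" rule could look roughly like the sketch below. The entry shape (`id`, `mxSuffixes`) and the function names are assumptions made up for illustration; the ISPDB format has no such annotation today.

```javascript
// Hypothetical annotated entries: each config lists the MX suffixes it claims.
const entries = [
  { id: "outlook.com (consumer)", mxSuffixes: ["outlook.com"] },
  { id: "office365", mxSuffixes: ["mail.protection.outlook.com"] },
];

// True if `host` equals `suffix` or ends with "." + suffix (label-aligned match).
function hasDomainSuffix(host, suffix) {
  return host === suffix || host.endsWith("." + suffix);
}

// Return the entry whose matching suffix is longest, or null if none match.
function pickEntry(mxHostname, candidates) {
  let best = null;
  let bestLength = -1;
  for (const entry of candidates) {
    for (const suffix of entry.mxSuffixes) {
      if (hasDomainSuffix(mxHostname, suffix) && suffix.length > bestLength) {
        best = entry;
        bestLength = suffix.length;
      }
    }
  }
  return best;
}
```

With these sample entries, `pickEntry("easystreet-net.mail.protection.outlook.com", entries)` selects the Office 365 entry, while any other host under outlook.com falls back to the consumer entry. A client that ignores the annotation would keep its current behavior.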
Flags: needinfo?(ben.bucksch)
Summary: [ISPDB] easystreet.net → [ISPDB] support office365 via generic solution (unfortunate MX overlap with outlook.com); specific driving case of easystreet.net
Duplicate of this bug: 594664
A doesn't work: We don't try all IMAP servers (and we should not), nor do we offer them all in the UI (user would be confused, and the UI in the usecase for the vast majority of hotmail users would be degraded, so not advisable either). We only use the first IMAP and POP3 server.

C would seem to be the most reasonable solution, but I don't know how we should go about implementing this without degrading the other cases.

---

Given the MX hostname "easystreet-net.mail.protection.outlook.com", this seems to be a spam filter of sorts, which then redirects to another domain. We already have this case for other cases:
* psmtp.net (Google Postini, service) - Just SMTP incoming and forwarding, no mail store, cannot support
* messagelabs.com (Symantec, service) - can't support, same as psmtp.net 
There's nothing we can do for these cases.

Here, the filter service (outlook.com) also hosts the IMAP server (for easystreet.net), I presume? So, in theory, we could do something. But question is whether it's worth the effort. And I don't know what's the least invasive solution.
Flags: needinfo?(ben.bucksch)
So, my candidate solution for the C2 case would be that when we do the MX lookup, we fetch from the ISPDB only (as is currently done) TLD+1 (which we already try to do) as well as TLD+2, and we favor the TLD+2 result over the TLD+1 result.

An interesting side-effect is that the FxOS Gaia mail implementation which does not currently have the publicsuffix.org database built-in and so does not understand things like ne.jp being a valid TLD would end up doing the right thing in those cases.  (Although would not be able to do the right thing in edge cases involving TLD's of length 2.  Although I plan to address this.)

But I don't believe this would degrade existing cases.  Since the MX case is already our last-ditch fall-back and we only consult the ISPDB, we have total control over the set of exposed domains.  Since we normally only host entries for TLD+1 domains, any choice for TLD+2 coverage needs to be intentional, and we can keep this in mind.  And the only case where there would be an ambiguity problem is where the ambiguity-breaking part of the domains is at level of TLD+3 or deeper.

In that case, TLD+2 is insufficient and we would also need to add TLD+3.  Which begs the question of whether we should search for the deeper levels, and indeed do lookups across the entire hierarchy.  The trade-offs are latency and network usage.  In this Office365 case, the MX domains have a depth of TLD+4.  In the case of netsolmail.net on bug 594670 there's a depth of TLD+4 as well.  Which isn't that bad, but they offer no additional useful information to us.
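
The trade-off between lookup depth and number of requests can be seen in a small sketch. It assumes a single-label TLD (so TLD+1 is the last two labels) and uses hypothetical helper names; `knownDomains` stands in for whatever the ISPDB would answer.

```javascript
// Build the list of lookup keys from TLD+maxDepth down to TLD+1,
// most specific first. Assumption: single-label TLD, no public suffix list.
function lookupCandidates(mxHostname, maxDepth) {
  const labels = mxHostname.toLowerCase().replace(/\.$/, "").split(".");
  const deepest = Math.min(maxDepth, labels.length - 1);
  const candidates = [];
  for (let depth = deepest; depth >= 1; depth--) {
    // TLD+depth consists of the last (depth + 1) labels.
    candidates.push(labels.slice(-(depth + 1)).join("."));
  }
  return candidates;
}

// Pick the first (most specific) candidate that has an entry, or null.
function mostSpecificMatch(mxHostname, knownDomains, maxDepth = 2) {
  const hit = lookupCandidates(mxHostname, maxDepth).find(d => knownDomains.has(d));
  return hit || null;
}
```

Each extra level of depth adds one candidate, and therefore one more request, per account setup. With `maxDepth = 2` the Office 365 MX host yields `protection.outlook.com` and `outlook.com`, and the more specific key wins whenever an entry is deliberately published for it.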


In terms of filter domains, I agree, there's nothing we can do about those.  We have a similar problem with dreamhost where they have multiple mail clusters, but they all go through the fltr-inN.mail.dreamhost.com filters.


Given that the TLD+2 lookup solution works and it's also a nice band-aid for FxOS Gaia Mail, I'm going to pursue it for Gaia mail and send a mailing list message to tb-planning announcing the intent and ask that someone maybe implement it for Thunderbird.

Thanks for the feedback, :BenB!
Status: UNCONFIRMED → NEW
Ever confirmed: true
> my candidate solution for the C2 case would be that when we do the MX lookup,
> we fetch from the ISPDB only (as is currently done) TLD+1 (which we already try to do)
> as well as TLD+2, and we favor the TLD+2 result over the TLD+1 result.

That would mean one additional fetch for every domain set up. We tried to be sparing with them (there would have been tons of other checks we could have done). Personally, I don't think that's justified to fix one (small fraction of an) ISP.

Even though I hate per-ISP hacks, I think a code path that's executed only when outlook.com is the MX domain is better than a generic solution here. As you showed, the solution isn't really generic anyway. This is just a quirk.

Frankly, I think we should ask MS to fix this on their end.
You make a good point that it's arguably silly to add an additional lookup just for the Office365 case in the name of being generic.  However, I don't think the additional network traffic should be the primary concern.  Autoconfig can already have extremely high latencies due to some hosts having drop-packet firewall rules or the like which mean that our higher level connect/XHR timeouts come into play.  The ISPDB requests, in contrast, are unlikely to fail and if issued in parallel should have minimal additional latency regardless of whether pipelined or not.

A custom code-path for the Gaia mail app certainly is no worse, and we do have a lot of other one-off fixes (usually through local autoconfig).  However, it still is desirable for the entry to be hosted in the ISPDB, which raises the issue of which entry it should be hosted under.  Since the IMAP servers are under office365.com, I suppose that's the canonical answer.  But a question is whether it also makes sense to still advertise the entry under protection.outlook.com as well.

Note that, interestingly, the MX entries for outlook.com and live.com actually point at mxN.hotmail.com.  The only ambiguity is that we are using the same namespace for MX-domains as well as for email-domains.  (So like if we issued 


Re: advocacy, as mentioned on other bugs, since Thunderbird only performs MX indirection lookups against the ISPDB and not against the domain referenced, it's not clear there's an easy fix for Microsoft given the current Thunderbird implementation.  The most practical solution they could have would be to have office365 users add another CNAME for autoconfig given the instructions at https://support.office.com/en-za/article/Create-DNS-records-at-any-DNS-hosting-provider-for-Office-365-7b7b075d-79f9-4e37-8a9e-fb60c1d95166 but it would be entirely impractical for that domain to be able to serve valid https connections for the autoconfig.ORIGINALHOST.TLD, which is what Gaia mail uses and Thunderbird should move to.  Except for the bit where we can't/don't do SRV lookups, it would arguably be better if they just used those.
(In reply to Andrew Sutherland [:asuth] from comment #6)
> Note that, interestingly, the MX entries for outlook.com and live.com
> actually point at mxN.hotmail.com.  The only ambiguity is that we are using
> the same namespace for MX-domains as well as for email-domains.  (So like if
> we issued 

(unfinished thought continuation): So like if we issued a special request like MX.outlook.com, we could completely avoid the ambiguity at all, although it still gets into having the entries explicitly aware of this concept.  (Noting that MX. would probably not want to be the prefix; we'd need a delimiter that was illegal, etc.  And I think that really would be way too weird/awkward for the ISPDB.  Although it could make sense for our Gaia email hack if we just used a separate subdirectory from our autoconfig/ dir.)
Duplicate of this bug: 1067605
Duplicate of this bug: 1297957
Assignee: nobody → ben.bucksch
Status: NEW → ASSIGNED
Attachment #9106373 - Flags: review?(jorgk)
Attachment #9106373 - Flags: review?(neil)
Attachment #9106373 - Flags: review?(bugmail)
Comment on attachment 9106373 [details] [diff] [review]
Differentiate based on domain, v2

(not currently doing work related to this)
Attachment #9106373 - Flags: review?(bugmail)
Comment on attachment 9106373 [details] [diff] [review]
Differentiate based on domain, v2

Sorry, I really know nothing about this. You could ask Magnus.
Attachment #9106373 - Flags: review?(jorgk)
Comment on attachment 9106373 [details] [diff] [review]
Differentiate based on domain, v2

Given that outlook.com and Office365 are different services with slightly different implementations and very different policies (e.g. regarding IMAP, MFA etc.), we should really differentiate them with different config files in the ISPDB.
Attachment #9106373 - Flags: review?(acelists)
Comment on attachment 9106373 [details] [diff] [review]
Differentiate based on domain, v2

>+      // to differentiate between Hotmail/outlook.com and Office365 business domains.
Either "Outlook.com/Hotmail" or "hotmail.com/outlook.com" would be slightly less confusing.

>+      } catch (ex) { // e.g. hostname doesn't have a enough components
Grammar nit: unnecessary "a"

>+        console.error(ex); // not fatal
`logException`?

>+      if (sld != mxDomain && mxDomain) {
I would prefer `mxDomain && mxDomain != sld`.
Attachment #9106373 - Flags: review?(neil) → review+
Comment on attachment 9106373 [details] [diff] [review]
Differentiate based on domain, v2

This works for me. Via pref("mailnews.auto_config_url", "https://www.beonex.com/autoconfig/test/"); Ben has set up
https://www.beonex.com/autoconfig/test/office365.com where Exchange is the first option, and
https://www.beonex.com/autoconfig/test/outlook.com where IMAP is the first option.

Trying addresses at outlook.com and office365.com I get those first options selected, so as far as I can see, this is working.

Some changes went into the ISPDB today:
https://autoconfig.thunderbird.net/v1.1/hotmail.com
https://autoconfig.thunderbird.net/v1.1/office365.com
https://autoconfig.thunderbird.net/v1.1/outlook.com

With this patch, clients will be able to distinguish O365 from the rest. That paves the way for selecting the IMAP/Exchange preference to be configurable via the ISPDB. Once all clients have received this patch, file outlook.com can be removed again.

I'll land it with the nits addressed and the testing pref removed.
Attachment #9106373 - Flags: review?(acelists) → feedback+

Proper author, commit message, linting, nits addressed, testing hunk removed. I left the console.error since I don't think it's an interesting error and we'd need var { logException } = ChromeUtils.import("resource:///modules/errUtils.js");

Attachment #9107767 - Attachment is patch: true

@Jörg: I was about to update the patch, but thanks a lot. :-)

Attachment #9107767 - Attachment description: office365-domain-1592258-2.diff - Patch for landing → Differentiate based on domain, v3 - Patch for landing

It seems this bug as reported is not relevant anymore. In the four years since it was reported, Microsoft changed their strategy and are, AFAIK, pushing more or less every one of these services into using Office 365 in the cloud, i.e. the outlook.office365.com + smtp.office365.com servers. So for the driving case easystreet.com, Thunderbird already does the right thing. See https://support.easystreet.com/?page_id=39

I don't understand the comment. Are you saying this shouldn't be landed to untangle outlook.com and office365.com?

I'm not sure. outlook.com should be using outlook.office365.com for IMAP. https://support.office.com/en-us/article/pop-imap-and-smtp-settings-for-outlook-com-d088b986-291d-42b8-9564-9c414e2aa040 and so should (of course) office365.com

I had a long discussion with Magnus about bug 1185366, bug 1592258 and bug 1594366. Here's a summary for this bug:

Magnus said he would try the patch and look further into the details. Magnus and I would like to understand the patch better: which additional part of the autoconfig data is compared, and how does it differ between O365 and outlook/hotmail?

The comment says: "In addition to just the base domain, also check the full domain of the MX server". So which "bit" exactly is compared, and is that bit likely to remain different in the future if MS is moving services to the office365.com domain? Is that "bit" in the autoconfig files or obtained elsewhere?

Sorry about the ignorance, it's just a learning exercise for me.

You can check the MX with dig mx plus the email domain, and the ISP with https://autoconfig.thunderbird.net/v1.1/ plus the domain.

  1. All other ISPs:
    dig mx yahoo.com = mta6.am0.yahoodns.net

Here, we take the SLD (second level domain) and look up https://autoconfig.thunderbird.net/v1.1/yahoodns.net in the ISPDB. yahoodns.net is known there as secondary domain for Yahoo, so the lookup finds Yahoo.

This works for almost all ISPs in the world, apart from Office365.

  2. The old system even works for Hotmail:
    dig mx outlook.com = outlook-com.olc.protection.outlook.com
    dig mx outlook.de = eur.olc.protection.outlook.com
    dig mx live.com = live-com.olc.protection.outlook.com
    dig mx hotmail.com = hotmail-com.olc.protection.outlook.com
    dig mx hotmail.de = eur.olc.protection.outlook.com

Again, we took the SLD outlook.com.

(But, before we do the MX lookup, we do a direct email domain lookup, so for foo@hotmail.de, we first ask the ISPDB for https://autoconfig.thunderbird.net/v1.1/hotmail.de , and if that gives a result, we use this config instead of the config found via MX. This is the trick that I used last night to fix most domains right now.)

  3. The MX lookup fails to distinguish Office365:
    (beonex.onmicrosoft.com is our Office365 test account, but the same is true for all fully hosted Office365 domains that I know.)
    dig mx beonex.onmicrosoft.com = beonex.mail.protection.outlook.com.

So, here the SLD is again outlook.com and not office365.com as you would expect, given that they are different offers with different properties.
That's the bug we're trying to fix.

  4. Fix:
    So, what we do after the fix is we continue to query the SLD, because that works very well for almost all ISPs in the world. E.g. for Yahoo, the correct lookup is yahoodns.net and not am0.yahoodns.net . But to fix this, we also make a lookup for the complete MX domain, without hostname, in this case olc.protection.outlook.com for Hotmail, and mail.protection.outlook.com for Office365. These domains are added to the Hotmail and Office365 configurations respectively, so that the lookups work:

So, this fix gives us more flexibility, without breaking anything that works right now.
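
The lookup order described in the steps above can be sketched as follows. This is a standalone illustration, not the patch itself: the `ispdb` map, its labels, and all function names are assumptions, the helpers use plain string handling so the code runs outside Thunderbird, and preferring the more specific MX domain over the base domain is likewise an assumption.

```javascript
// Stand-in for the ISPDB: maps a lookup domain to an illustrative config label.
const ispdb = new Map([
  ["yahoodns.net", "yahoo"],
  ["outlook.com", "hotmail"],
  ["olc.protection.outlook.com", "hotmail"],
  ["mail.protection.outlook.com", "office365"],
]);

// Assumption: single-label TLD, so the base domain is the last two labels.
function secondLevelDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

// Drop the leftmost label (the host name) and keep the rest of the MX domain.
// Throws when the hostname does not have enough components.
function mxDomainWithoutHost(hostname) {
  const labels = hostname.split(".");
  if (labels.length < 3) {
    throw new Error("hostname does not have enough components");
  }
  return labels.slice(1).join(".");
}

function findConfigForMx(mxHostname) {
  const host = mxHostname.toLowerCase().replace(/\.$/, "");
  const sld = secondLevelDomain(host);
  let mxDomain = null;
  try {
    mxDomain = mxDomainWithoutHost(host);
  } catch (ex) {
    // Not fatal: fall back to the base-domain lookup only.
  }
  // Assumption: a match on the more specific MX domain wins.
  if (mxDomain && mxDomain !== sld && ispdb.has(mxDomain)) {
    return ispdb.get(mxDomain);
  }
  return ispdb.get(sld) || null;
}
```

For `mta6.am0.yahoodns.net` the more specific key `am0.yahoodns.net` has no entry, so the base-domain lookup still finds Yahoo. For `beonex.mail.protection.outlook.com.` the more specific key resolves to the Office365 configuration instead of the Hotmail one.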

Hmm, thanks for the explanation, this stuff is fiendishly complicated.

I added a dump here:

      let mxDomain;
      try {
        mxDomain = Services.eTLD.getNextSubDomain(mxHostname);
        dump(`=== ${mxDomain} =========================\n`);
      } catch (ex) {

and did some testing. Trying to set up jk@beonex.onmicrosoft.com, I indeed see mail.protection.outlook.com. Setting up jk@outlook.com doesn't run this code since outlook.com is directly available in the ISPDB.

I'm a little confused that this code is run when setting up jk@yahoo.com. yahoo.com is directly available in the ISPDB. I see am0.yahoodns.net.

BTW, testing was done using the live ISPDB, not https://www.beonex.com/autoconfig/test/.

I noticed something else when entering jk@office365.com. Of course the code again doesn't run since https://autoconfig.thunderbird.net/v1.1/office365.com exists and is used first. But I saw a considerable delay in the Exchange option being selected after the detection had finished. That was due to the entire MS welcome page being dumped out in the log. I suggest addressing that in a separate bug

client config xml = "\r\n\r\n\r\n<!DOCTYPE html>\r\n<html lang=\"de-de\" dir=\"l
tr\">\r\n<head data-info=\"{&quot;v&quot;:&quot;1.0.7243.34191&quot;,&quot;a&quo
t;:&quot;47528f8f-a5bf-4bea-b3e3-32e54a6bb4ad&quot;,&quot;cn&quot;:&quot;OneDepl
oyContainer&quot;,&quot;az&quot;:&quot;{did:92e7dc58ca2143cfb2c818b047cc5cd1, ri
d: OneDeployContainer, sn: marketingsites-prod-odnortheurope, dt: 2018-05-03T20:

and tons more.

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/8caa76852c2c
check the full domain of the MX server to differentiate between Outlook.com and Office365. r=Neil

Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Component: ISPDB Database Entries → Account Manager
Product: Webtools → Thunderbird
Target Milestone: --- → Thunderbird 72.0
Comment on attachment 9107767 [details] [diff] [review]
Differentiate based on domain, v3 - Patch for landing

This is key to fixing the Outlook/O365 dilemma. Heading into beta now to target TB 68.3.
Attachment #9107767 - Flags: approval-comm-esr68+
Attachment #9107767 - Flags: approval-comm-beta+

TB 71 beta 3:
https://hg.mozilla.org/releases/comm-beta/rev/8c3e92a5c1e90562fc5a3837a2a606031bd2cee8

I forgot to say, Magnus is not available for a day or so, so I advanced this. Ben, can you still reply to my comment #23?

Hi Jörg, thanks for landing this. This fixes a long-standing issue.

Yes, indeed, it would be good for this fix to land in TB 68.3.

> I'm a little confused that this code is run when setting up jk@yahoo.com. yahoo.com is directly available in the ISPDB. I see am0.yahoodns.net.

For speed, we fire the 2 requests at the same time. [corrected:] If the second is available, we abort or ignore the first request.

> the entire MS welcome page being dumped out in the log. I suggest addressing that in a separate bug
> client config xml = "\r\n\r\n\r\n<!DOCTYPE html>\r\n<html lang=\"de-de\" dir=\"ltr\">\r\n<head data-info

Yes, that must be because Microsoft, as usual, cannot read specifications, not even the most basic stuff like HTTP 404. They appear to send an HTTP 200 success and HTML page for a "not found". I think this is new. (Welcome to our world of coding against Microsoft servers! :-( )

Not entirely coincidentally, I ran into the same issue and fixed it while working on this bug, but didn't include it in the patch, because it's irrelevant here. I filed bug 1595991 with a patch to cut this down.
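
One defensive measure against a "200 OK with an HTML page" response is to check that the body looks like an autoconfig document before parsing or logging it. The sketch below is not the patch from bug 1595991; both helpers are hypothetical, and it assumes a valid config starts with an optional XML declaration followed directly by a `<clientConfig>` root element.

```javascript
// Hypothetical helper: decide whether an HTTP response plausibly carries
// an autoconfig document. Assumption: the root element is <clientConfig>
// and no comment precedes it.
function looksLikeClientConfig(status, body) {
  if (status !== 200 || typeof body !== "string") {
    return false;
  }
  // Skip a byte-order mark, leading whitespace, and an optional XML declaration.
  const text = body
    .replace(/^\uFEFF/, "")
    .trimStart()
    .replace(/^<\?xml[^>]*\?>\s*/, "");
  return /^<clientConfig[\s>]/.test(text);
}

// Shorten a body for logging so an unexpected HTML page does not flood the console.
function summarizeForLog(body, maxLength = 80) {
  return body.length > maxLength ? body.slice(0, maxLength) + "..." : body;
}
```

A caller could treat a response that fails this check the same way as a 404 and log only the summarized body.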

Working in my TB 68 build (using Ben's test data at https://www.beonex.com/autoconfig/test/ since the ISPDB has the same data for Outlook and O365 presently).

Ben, you realise that on Windows the window is too small and it looks pretty bad, see attachment 9104934 [details]. I guess most customers will be on Windows.

Status: RESOLVED → VERIFIED
Flags: needinfo?(ben.bucksch)

> you realise that on Windows the window is too small and it looks pretty bad, see attachment 9104934 [details]. I guess most customers will be on Windows.

Filed bug 1596582.

Flags: needinfo?(ben.bucksch)
Summary: [ISPDB] support office365 via generic solution (unfortunate MX overlap with outlook.com); specific driving case of easystreet.net → [ISPDB] support office365 via generic solution (unfortunate MX overlap with outlook.com)
No longer depends on: 1299232
Duplicate of this bug: 1299232