Bug 1572418 Comment 35 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which can be slow, that we start all the network calls in parallel. First, we query the domain with our own autoconfig method. Then we query the ISPDB for the domain. Then we get the MX host, and query the ISPDB for that MX domain. Then we do AutoDiscover. When I say "first... then", in the past, we really did them in order. This was slow. Because many hosts are configured to DROP packets for unknown hosts, these queries have to time out before they finally fail. So, each of these can take 10s. If you chain them, one after the other, it becomes really slow. That's why the network calls are all started at the same time. If then one of the queries succeeds, we use them in the order that I mentioned. This change has been very much needed and dramatically increased the speed of the configuration from what could be 50 seconds to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 10s, and the addons.json call starts a second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it, because we measured our installs. The same problem also happened for the ISPDB call. These calls also timed out, even though I have no statistics about it. I know from my logs here that about 2/3 of the addons.json calls timed out. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, that the ISPDB call happens first.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which are slow, that we start all the network calls in parallel. First, we query the email domain with our own autoconfig method. Then we query the ISPDB for the domain. Then we get the MX host, and query the ISPDB for that MX domain. Then we do AutoDiscover. When I say "first... then", we cannot do them in order. This would be very slow. Unfortunately, because many networks are configured to DROP packets, these queries have to time out before they finally fail. So, each of these network calls can take 10s. If you chain them, one after the other, it becomes really slow. That's why the network calls are all started at the same time. But we use them in the order that I mentioned. This change has been very much needed and dramatically increased the speed of the configuration from what could be 1.5 minutes to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 7 seconds, and the addons.json call starts 1 second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it, because we measured our Owl installs. Up to 2/3 of the addons.json calls timed out. But **the same problem also happened for the ISPDB call**. These calls also timed out, even though I have no statistics about it. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, namely that the ISPDB call happens first.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which are slow, that we start all the network calls in parallel. First, we query the email domain with our own autoconfig method. Then we query the ISPDB for the domain. Then we get the MX host, and query the ISPDB for that MX domain. Then we do AutoDiscover. When I say "first... then", we cannot do them in order. This would be very slow. Unfortunately, because many networks are configured to DROP packets, these queries have to time out before they finally fail. So, each of these network calls can take 10s. If you chain them, one after the other, it becomes really slow. That's why the network calls are all started at the same time. But we use them in the order that I mentioned. This change has been very much needed and dramatically increased the speed of the configuration from what could be 1.5 minutes to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 7 seconds, and the addons.json call starts 1 second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it, because we measured our Owl installs. From what I see from our logs, up to 2/3 of the addons.json calls had timed out before this fix. But **the same problem also happened for the ISPDB call**. These calls also timed out, even though I have no statistics about it. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, namely that the ISPDB call happens first.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which are slow, that we start all the network calls in parallel. First, we query the email domain with our own autoconfig method. Then we query the ISPDB for the domain. Then we get the MX host, and query the ISPDB for that MX domain. Then we do AutoDiscover. When I say "first... then", we cannot do them in order. This would be very slow. Unfortunately, because many networks are configured to DROP packets, these queries have to time out before they finally fail. So, each of these network calls can take 10s. If you chain them, one after the other, it becomes really slow. That's why the network calls are all started at the same time. But we use them in the order that I mentioned. This change has been very much needed and dramatically increased the speed of the configuration from what could be 1.5 minutes to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 7 seconds, and the addons.json call starts 1 second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it, because we measured our Owl installs. From what I see from our logs, up to 2/3 of the addons.json calls timed out before this fix. But **the same problem also happened for the ISPDB call**. These calls also timed out, even though I have no statistics about it. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, namely that the ISPDB call happens first.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which are slow, that we start all the network calls in parallel. First, we query the email domain with our own autoconfig method. Then we query the ISPDB for the domain. Then we get the MX host, and query the ISPDB for that MX domain. Then we do AutoDiscover. When I say "first... then", we cannot do them in order. This would be very slow. Unfortunately, because many networks are configured to DROP packets, these queries have to time out before they finally fail. So, each of these network calls can take 10s. If you chain them, one after the other, it becomes really slow. That's why the network calls are all started at the same time. But we use them in the order that I mentioned. This change has been very much needed and dramatically increased the speed of the configuration from what could be 1.5 minutes to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 7 seconds, and the addons.json call starts 1 second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it. But **the same problem also happened for the ISPDB call**. These calls also timed out, even though I have no statistics about it. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, namely that the ISPDB call happens first.

It's true that the addons.json is loaded only after we found a config that has an Exchange server in it. There are currently 2 ways to get an Exchange config: Via Microsoft AutoDiscover protocol, or via ISPDB. The AutoDiscover protocol has been there since last year (you had reviewed that change). This exists to support on-premise Exchange servers. That ISPDB can also return an Exchange config is rather new and was added to support Office365, which has a config in the ISPDB.

We have so many different autoconfig methods, and each needs network calls, which are slow, that we start all the network calls in parallel. We query the email domain with our own autoconfig method. We query the ISPDB for the domain. We get the MX host, and query the ISPDB for that MX domain. We do AutoDiscover. Unfortunately, because many networks are configured to DROP packets, some of these queries have to time out before they finally fail, so many of these network calls can take 10s. If you chain them, one after the other, it becomes really slow. But even though the network calls are all started at the same time, we *use* them in the order that I mentioned, and if a more preferred method succeeds, we abort the rest. This change has been very much needed and dramatically increased the speed of the configuration from what could be 1.5 minutes to 10 seconds. Note that most calls are to the very domain where you host your emails; in fact the very first call is to the domain, and it's always been like that.
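
To illustrate the "start everything at once, but use the results in preference order" idea, here is a minimal sketch. It is written in Python with asyncio purely for illustration and is not the actual Thunderbird code; the names `first_by_priority` and `fake_lookup`, and the use of strings as stand-ins for configs, are assumptions made up for this example.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Hypothetical type: a lookup is a function that starts one autoconfig
# query and eventually yields a config (here just a string) or None.
Lookup = Callable[[], Awaitable[Optional[str]]]


async def first_by_priority(lookups: Sequence[Lookup], timeout: float) -> Optional[str]:
    """Start all lookups in parallel; return the most preferred success.

    `lookups` is ordered from most to least preferred.
    """
    # Start every lookup immediately, so that slow ones overlap in time
    # instead of being chained one after the other.
    tasks = [asyncio.create_task(asyncio.wait_for(lookup(), timeout))
             for lookup in lookups]
    try:
        # Consume the results in preference order, not in completion order.
        for task in tasks:
            try:
                result = await task
            except (asyncio.TimeoutError, OSError):
                continue  # this method failed or timed out; try the next one
            if result is not None:
                return result
        return None
    finally:
        # A more preferred method succeeded (or all failed): abort the rest.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def fake_lookup(delay: float, result: Optional[str]) -> Optional[str]:
    """Stand-in for a network call that answers after `delay` seconds."""
    await asyncio.sleep(delay)
    return result
```

With n lookups that each hit the timeout T, chaining them costs roughly n × T in the worst case, while this version costs roughly T, because all the timeouts run concurrently.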

I think what happened here is that the AutoDiscover results were quick, because they came from the on-premise Exchange and the company-internal network, but the ISPDB and addons.json calls were slow due to the OCSP lookups, and the corresponding DNS lookups. Even if the ISPDB call happens first, and takes 7 seconds, and the addons.json call starts 1 second later while the ISPDB call is still in flight, the addons.json call would still wait for the same DNS lookups and OCSP queries, and block as well, and time out as well.

Please don't concentrate on the addons.json call here. That's just how we found it. But **the same problem also happened for the ISPDB call**. These calls also timed out, even though I have no statistics about it. Given that the addons.json calls happen later than the ISPDB call, I would expect that even more of the ISPDB calls timed out, based on the same reasoning that you mentioned, namely that the ISPDB call happens first.
