Bug 1594366 Comment 20 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

> Digicert (used by the server) has a reported average response time of 19ms for OSCP requests

I don't know who said that or who measured it. Maybe Digicert itself, from the server perspective. But that's definitely not the time of the OCSP request from when the client fires it, with cold DNS caches, TCP setup, TLS setup (!), HTTP request sent, server itself responding (that's maybe those 19ms), and HTTP request transmitted. Such total request times are typically at least 1 second on a normal DSL line, and can be much longer.

> You wrote in bug 1572418 comment 9, you did NOT see it on your system.

OK, I see how that might look contradictory. At the time when I wrote this comment, I didn't see it on my system, but Neil did on his. Then, during the course of reviews and further tests, I did see it repeatedly on my machine as well.

That's exactly why I said repeatedly that this bug is really hiding itself and really difficult to see on a developer system, due to caches, but affects a lot of end users.

I would please ask you not to speculate on the causes. I spent **6 months** trying to figure out why the statistics don't match up. The installs didn't match the expected rate based on other numbers I had. I analyzed them back and forth and nothing made sense. The different numbers just didn't match up. Until I found this bug. Once it was fixed, the numbers fit together. Install rates went up 3 to 4 times, not due to Office365, but due to this very bug here. We know our users; I know exactly who uses Office365, based on the metadata.

Aside from that, I correlated the numbers with other data. Before this fix, the numbers didn't fit at all. With this fix, the numbers fit together exactly as expected. That's why I'm sure that this bug here caused 60-80% lookup failures and consequently missing installs.

> Digicert (used by the server) has a reported average response time of 19ms for OSCP requests

I don't know who said that or who measured it. Maybe Digicert itself, from the server perspective. But that's definitely not the time of the OCSP request from when the client fires it, with cold DNS caches, TCP setup, TLS setup (!), HTTP request sent, server itself responding (that's maybe those 19ms), and HTTP response transmitted. Such total request times are typically at least 1 second on a normal DSL line, and can be much longer. DNS plays a big factor here.

> You wrote in bug 1572418 comment 9, you did NOT see it on your system.

OK, I see how that might look contradictory. At the time when I wrote this comment, I didn't see it on my system, but Neil did on his. Then, during the course of reviews and further tests, I did see it repeatedly on my machine as well.

That's exactly why I said repeatedly that this bug is really hiding itself and really difficult to see on a developer system, due to caches, but affects a lot of end users.

I would please ask you not to speculate on the causes. I spent **6 months** trying to figure out why the statistics don't match up. The installs didn't match the expected rate based on other numbers I had. I analyzed them back and forth and nothing made sense. The different numbers just didn't match up. Until I found this bug. Once it was fixed, the numbers fit together. Install rates went up 3 to 4 times, not due to Office365, but due to this very bug here. We know our users; I know exactly who uses Office365, based on the metadata.

Aside from that, I correlated the numbers with other data. Before this fix, the numbers didn't fit at all. With this fix, the numbers fit together exactly as expected. That's why I'm sure that this bug here caused 60-80% lookup failures and consequently missing installs.

> Digicert (used by the server) has a reported average response time of 19ms for OSCP requests

I don't know who said that or who measured it. Maybe Digicert itself, from the server perspective. But that's definitely not the time of the OCSP request from when the client fires it, with cold DNS caches, TCP setup, TLS setup (!), HTTP request sent, server itself responding (that's maybe those 19ms), and HTTP response transmitted. Such total request times are typically at least 1 second on a normal DSL line, and can be much longer. DNS plays a big factor here.

> You wrote in bug 1572418 comment 9, you did NOT see it on your system.

OK, I see how that might look contradictory. At the time when I wrote this comment, I didn't see it on my system, but Neil did on his. Then, during the course of reviews and further tests, I did see it repeatedly on my machine as well.

That's exactly why I said repeatedly that this bug is really hiding itself and really difficult to see on a developer system, due to caches, but affects a lot of end users.

I would please ask you not to speculate on the causes. I spent **6 months** trying to figure out why the statistics don't match up. The installs didn't match the expected rate based on the other numbers that I had. They just didn't correlate as they normally would. I analyzed them back and forth and nothing made sense. The different numbers just didn't match up. Until I found this bug. Once it was fixed, the numbers fit together. Install rates went up 3 to 4 times, not due to Office365, but due to this very bug here. We know our users; I know exactly who uses Office365, based on the metadata.

Aside from that, I correlated the numbers with other data. Before this fix, the numbers didn't fit at all. With this fix, the numbers fit together exactly as expected. That's why I'm sure that this bug here caused 60-80% lookup failures and consequently missing installs.

> Digicert (used by the server) has a reported average response time of 19ms for OSCP requests

I don't know who said that or who measured it. Maybe Digicert itself, from the server perspective. But that's definitely not the time of the OCSP request from when the client fires it, with cold DNS caches, TCP setup, TLS setup (!), HTTP request sent, server itself responding (that's maybe those 19ms), and HTTP response transmitted. Such total request times are typically at least 1 second on a normal DSL line, and can be much longer. DNS plays a big factor here.

> You wrote in bug 1572418 comment 9, you did NOT see it on your system.

OK, I see how that might look contradictory. At the time when I wrote this comment, I didn't see it on my system, but Neil did on his. Then, during the course of reviews and further tests, I did see it repeatedly on my machine as well.

That's exactly why I said repeatedly that this bug is really hiding itself and really difficult to see on a developer system, due to caches, but affects a lot of end users.

I can tell which of our users use Office365, outlook.com, or on-premises Exchange. Based on the numbers I have, I know that's not the cause of the spike, but this bug here was.

I would please ask you not to speculate on the causes. I spent **6 months** trying to figure out why the statistics don't match up. I correlated the numbers with other data. They just didn't correlate as they normally would. I analyzed them back and forth and nothing made sense. Until I found this bug. Once it was fixed, the numbers fit together. With this fix, the numbers fit together exactly as expected. That's why I'm sure that this bug here caused 60-80% lookup failures and consequently missing installs.

> Digicert (used by the server) has a reported average response time of 19ms for OSCP requests

I don't know who said that or who measured it. Maybe Digicert itself, from the server perspective. But that's definitely not the time of the OCSP request from when the client fires it, with cold DNS caches, TCP setup, TLS setup (!), HTTP request sent, server itself responding (that's maybe those 19ms), and HTTP response transmitted. Such total request times are typically at least 1 second on a normal DSL line, and can be much longer. DNS plays a big factor here.
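To make the breakdown above concrete, here is a rough sketch of how the phases add up from the client's point of view. Everything in it is an assumption for illustration: the class name `RequestPhases`, the DNS and round-trip values, and the default of two TLS round trips (a full TLS 1.2 handshake) are made up and not measured. Only the 19ms server time comes from the quoted figure.

```python
from dataclasses import dataclass


@dataclass
class RequestPhases:
    """Hypothetical latency budget for one cold request, in milliseconds."""
    dns_ms: float             # cold DNS lookup (assumed value)
    rtt_ms: float             # client <-> server round-trip time (assumed value)
    server_ms: float          # time the server itself needs to answer
    tls_round_trips: int = 2  # 2 for a full TLS 1.2 handshake, 1 for TLS 1.3

    def total_ms(self) -> float:
        tcp = self.rtt_ms                         # TCP handshake: one round trip
        tls = self.tls_round_trips * self.rtt_ms  # TLS handshake
        http = self.rtt_ms                        # request out, response back
        return self.dns_ms + tcp + tls + http + self.server_ms

    def server_share(self) -> float:
        """Fraction of the total that is the server's own processing time."""
        return self.server_ms / self.total_ms()


# Illustrative numbers only; real values vary a lot per network and resolver.
cold = RequestPhases(dns_ms=300.0, rtt_ms=60.0, server_ms=19.0)
print(f"total: {cold.total_ms():.0f} ms, server share: {cold.server_share():.1%}")
```

Even with these fairly optimistic made-up inputs, the server's own 19ms is only a few percent of what the client waits for. A slow resolver, a retry, or a lossy line pushes the total up further, and the server-side figure stays the same.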

> You wrote in bug 1572418 comment 9, you did NOT see it on your system.

OK, I see how that might look contradictory. At the time when I wrote this comment, I didn't see it on my system, but Neil did on his. Then, during the course of reviews and further tests, I did see it repeatedly on my machine as well.

That's exactly why I said repeatedly that this bug is really hiding itself and really difficult to see on a developer system, due to caches, but affects a lot of end users.

I can tell which of our users use Office365, outlook.com, or on-premises Exchange. Based on the numbers I have, I know that's not the cause of the spike, but this bug here was.

I would please ask you not to speculate on the causes. I spent **6 months** trying to figure out why the statistics don't match up. I correlated the numbers with other data. They just didn't correlate as they normally would. I analyzed them back and forth and nothing made sense. Until I found this bug. Once it was fixed, the numbers fit together. With this fix, the numbers fit together exactly as expected. That's why this bug here caused 60-80% lookup failures and consequently missing installs.
