DNS_TRR_REQUEST_PER_CONN doesn't seem to work right

RESOLVED FIXED in Firefox 63

Status

()

defect
P1
normal
RESOLVED FIXED
11 months ago
10 months ago

People

(Reporter: bagder, Assigned: bagder)

Tracking

Trunk
mozilla64
Points:
---

Firefox Tracking Flags

(firefox63 fixed, firefox64 fixed)

Details

(Whiteboard: [necko-triaged][trr])

Attachments

(1 attachment, 1 obsolete attachment)

This telemetry probe was added in bug 1470853, but looking at the data now,the counter reports 0 or 1 in 97% (!) of the cases which seems wrong!
Nick,

I'm back with some questions on this, since I manged to botch this patch and now I want to fix it. The patch landed in https://hg.mozilla.org/mozilla-central/rev/92baee16f180 clearly wasn't correct since the DNS_TRR_REQUEST_PER_CONN counters seen on telemetry.mozilla.org don't at all seem reasonable.

The first mistake is this line:
  https://searchfox.org/mozilla-central/source/netwerk/protocol/http/Http2Session.cpp#408

The logic there is surely reversed, right? The client initialized stream IDs are even. At least when I add some printf()-logging there that's what it looks like.

But... if I change that logic to instead become:

  if (!(stream->StreamID() & 1)) {

... and run the TRR-tests, this causes an assert() to trigger like this:

Assertion failure: mPushStream->Transaction() != this, at /netwerk/protocol/http/Http2Push.cpp:367
#01: mozilla::net::Http2PushTransactionBuffer::ConnectionInfo() (/netwerk/protocol/http/Http2Push.cpp:367)
#02: mozilla::net::Http2Session::RegisterStreamID(mozilla::net::Http2Stream*, unsigned int) (/netwerk/protocol/http/Http2Session.cpp:420)
#03: mozilla::net::Http2Session::RecvPushPromise(mozilla::net::Http2Session*) (/netwerk/protocol/http/Http2Session.cpp:1896)

This happens for the TRR via push test case, and the back trace also certainly seems to indicate a push stream there. This makes me think this check isn't good enough to check for non-pushed streams.
Flags: needinfo?(hurley)
No, client-initiated stream IDs are definitely odd: see https://searchfox.org/mozilla-central/source/netwerk/protocol/http/Http2Session.cpp#371 and https://tools.ietf.org/html/rfc7540#section-5.1.1

So that's why you get the assertion when you change the ID check.

You're right, though, those request/conn counts look suspiciously low. I mean, I'm willing to accept the level of zeroes, but the fact that the vast majority of actually-trr counts is 1 seems wrong.

Have you looked at the conn entry hash keys for these trr connections? It seems like there's something going on that causes us to not share connections as often as we would expect, which says to me the hash key is different in some important way.
Flags: needinfo?(hurley) → needinfo?(daniel)
Thanks.

I believe the problem is quite simply that the code checks the wrong stream ID. What do you think about this?

--- a/netwerk/protocol/http/Http2Session.cpp
+++ b/netwerk/protocol/http/Http2Session.cpp
@@ -403,11 +403,11 @@ Http2Session::RegisterStreamID(Http2Stream *stream, uint32_t aNewID)
       mCheckNetworkStallsWithTFO = true;
       mLastRequestBytesSentTime = PR_IntervalNow();
     }
   }
 
-  if (stream->StreamID() & 1) {
+  if (aNewID & 1) {
     // don't count push streams here
     MOZ_ASSERT(stream->Transaction(), "no transation for the stream!");
     RefPtr<nsHttpConnectionInfo> ci(stream->Transaction()->ConnectionInfo());
     if (ci && ci->GetTrrUsed()) {
       IncrementTrrCounter();
Flags: needinfo?(daniel) → needinfo?(hurley)
Aw foo. Yeah, that's the right thing to do there.
Flags: needinfo?(hurley)
The bot seems to have missed to mention that this patch went into m-c here:

https://hg.mozilla.org/mozilla-central/rev/4dfb623bb11e
Pushed by cbrindusan@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/d62f3bddcb7d
Fix the stream ID check when counting TRR streams. r=nwgh
(it was first pushed with the wrong bug number!)
Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Is this something that should be considered for Beta backport or can it ride the trains?
Flags: needinfo?(daniel)
Yes, I want this considered for backport to 63.

How do I request beta-approval when phabricator doesn't add a patch entry here in bugzilla with the betal-approval option?
Flags: needinfo?(daniel)
Posted file (patch was on phabricartor) (obsolete) —
Approval Request Comment
[Feature/Bug causing the regression]: the bug was landed in bug 1470853
[User impact if declined]: getting wrong telemetry data for this probe, data that will help us judge how TRR fares in the wild
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: single line minuscule change
[String changes made/needed]: none
Attachment #9010161 - Flags: approval-mozilla-beta?
Daniel, the attachment you are asking an uplift for is not a patch.
Flags: needinfo?(daniel)
Do we need to export the patch and attach it to get it uplifted? It seems like a weird extra step, but okay here it is.

This patch is from here: https://hg.mozilla.org/mozilla-central/raw-rev/d62f3bddcb7d
Attachment #9010161 - Attachment is obsolete: true
Attachment #9010161 - Flags: approval-mozilla-beta?
Flags: needinfo?(daniel)
Attachment #9010213 - Flags: approval-mozilla-beta?
Comment on attachment 9010213 [details] [diff] [review]
d62f3bddcb7d.patch

Thanks Daniel, P1 fix in early beta, uplift approved for 63 beta 8.
Attachment #9010213 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Daniel Stenberg [:bagder] from comment #9)
> How do I request beta-approval when phabricator doesn't add a patch entry
> here in bugzilla with the betal-approval option?

Under normal circumstances, Phabricator creates an attachment in the bug like you'd expect it to. However, since this bug originally landed under a different bug number, that wasn't the case here. You'll note that bug 1486137 does have this attachment included in it ;)
Oh right, it was all my fault. Should've suspected that! =) Thanks Ryan.
You need to log in before you can comment on or make changes to this bug.