TRR: DNS_TRR_NS_VERFIFIED is wrong

RESOLVED FIXED in Firefox 63

Status

()

defect
P1
normal
RESOLVED FIXED
11 months ago
11 months ago

People

(Reporter: bagder, Assigned: bagder)

Tracking

Trunk
mozilla63
Points:
---

Firefox Tracking Flags

(firefox63 fixed)

Details

(Whiteboard: [necko-triaged][trr])

Attachments

(1 attachment)

Assignee

Description

11 months ago
current TMO results show DNS_TRR_NS_VERFIFIED evaluate False (failure) on 99.46% on nightly-63, while it is expected to show a majority to be True (successful).

The wrong measurement has also been seen in live nightlies for which TRR works. This is data that doesn't reflect the functionality!
Assignee

Comment 1

11 months ago
The NS verification is done once - at startup. Most startups are session restores when LOTS of name resolves are being issued almost at once. 

Current theory: the confirmation state variable is accessed and updated from several threads. The TRRService::CompleteLookup() function first assigns the state variable and then subsequently reads it again within the same method. There's then a risk that the variable contents gets modified by another thread between the assignment and the reading when it updates the DNS_TRR_NS_VERFIFIED boolean, a race condition. But is the risk really *that* high to almost always trigger this?

By avoiding the re-reading of the variable we will avoid the race condition.
Comment hidden (mozreview-request)
my laptop does not have an entry for DNS_TRR_NS_VERIFIED at all in about:telemetry.. and yet its obviously using TRR (mode 2) based on both about:networking and other telemetry. How does this explain that?
What threads do you feel can write mConfirmationState ?

afaict (mt is main thread):

enabled() <- isTrrBlackListed mt
readprefs() mt
observe() mt
maybeconfirm() <- {enabled-mt, observe-mt, notify-mt)
notify() mt
* completelookup() mt

the telem is all handled in completelookup()

Comment 5

11 months ago
mozreview-review
Comment on attachment 8989149 [details]
bug 1472662 - only send DNS_TRR_NS_VERIFIED telemetry once

https://reviewboard.mozilla.org/r/254220/#review261036
Attachment #8989149 - Flags: review?(mcmanus) → review-
Assignee

Comment 6

11 months ago
The reason for this crazy number in TMO is that the "captive portal clear" message keeps getting received during a session, and if TRR failed to verify the NS domain, it will try again on every subsequent CP clear message. In a long-going session, that can sum up to a substantial amount of (failed) retries and for every such failed retry it will send a Telemetry::DNS_TRR_NS_VERIFIED false...

A simple fix is to just report success/failure once. Maybe send another report if the TRR URI is updated?
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 10

11 months ago
mozreview-review
Comment on attachment 8989149 [details]
bug 1472662 - only send DNS_TRR_NS_VERIFIED telemetry once

https://reviewboard.mozilla.org/r/254220/#review263540

I think this still just papers over the telemtry.

in the case of captivepassed -> captivepassed where we are not in CONFIRM_OK, this just keeps trying to confirm the NS but doesn't report the telemetry, right?

the bug is that we keep trying to confirm the NS when there hasn't been any change in the cap-port state. The telemetry is showing us the bug.
Attachment #8989149 - Flags: review?(mcmanus) → review-
Comment hidden (mozreview-request)

Comment 12

11 months ago
mozreview-review
Comment on attachment 8989149 [details]
bug 1472662 - only send DNS_TRR_NS_VERIFIED telemetry once

https://reviewboard.mozilla.org/r/254220/#review263548
Attachment #8989149 - Flags: review?(mcmanus) → review+

Comment 13

11 months ago
Pushed by daniel@haxx.se:
https://hg.mozilla.org/integration/autoland/rev/f73236dac9ec
only send DNS_TRR_NS_VERIFIED telemetry once r=mcmanus

Comment 14

11 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f73236dac9ec
Status: NEW → RESOLVED
Last Resolved: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
You need to log in before you can comment on or make changes to this bug.