Resurrect telemetry for http redirects
Categories
(Core :: Networking: HTTP, enhancement, P2)
Tracking
()
People
(Reporter: overholt, Assigned: CuveeHsu)
References
Details
(Whiteboard: [necko-triaged])
Attachments
(3 files)
We had this telemetry from bug 1413512 in Firefox 59 through 63. There is a long tail of schemas if you look at the telemetry from then (https://mzl.la/2Y16Vec - click on the scheme to see all the types submitted) but I'm interested because Fission will often involve process switching.
Nika, was there anything else we discussed that would be useful?
Nhi, can anyone on the Necko team take this? The code is somewhere around https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#839 FWIW.
Comment 1•5 years ago
|
||
It's good to know the wide diversity of different schemes which HTTP loads can redirect to. Many of these are external protocol handlers which, I believe, will never produce content to draw in a target process, but others (like data:
, ftp:
, jar:
etc) can produce data. If we want to restrict what URLs can be the target of HTTP redirects to a known set in the future, we probably need data on what URLs are being used as targets in the wild.
We should probably do this collection after security checks might reject the attempt to redirect, as redirects which are already redirected likely don't need to be recorded and might throw off the numbers.
I don't think it's important to collect information on which custom protocols are being used. In fact, given that they're referring to people's programs on their computers (e.g. zoomapp:
), we probably shouldn't collect that info for privacy reasons. I think it would be cleanest to collect into a well-defined series of buckets.
There should probably be one bucket for each internally-recognized scheme, and a bucket for externally-handled schemes. It looks like NS_NewURI
now directly encodes this list, rather than using interface lookup, so it might be easiest to maintain the list of known schemes nearby that function (https://searchfox.org/mozilla-central/rev/96403eac4ff6f96f89a0b120b29bd46540d5705e/netwerk/base/nsNetUtil.cpp#1694).
In general, I'm surprised by the number of different schemes which are used as redirect targets! The sheer number helps validate the DocumentChannel
approach, though, by making it so we don't need to handle every one of these independently. Having more formal documentation about what schemes are allowed as targets, though, may be valuable both for internal security and in the standard.
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
We should probably do this collection after security checks might reject the attempt to redirect, as redirects which are already redirected likely don't need to be recorded and might throw off the numbers.
I assume you mean checking if the redirection is valid.
That is https://searchfox.org/mozilla-central/rev/e0b0c38ee83f99d3cf868bad525ace4a395039f1/netwerk/base/nsIOService.cpp#602-611
We should collect the telemetry after this check.
I don't think it's important to collect information on which custom protocols are being used. In fact, given that they're referring to people's programs on their computers (e.g.
zoomapp:
), we probably shouldn't collect that info for privacy reasons. I think it would be cleanest to collect into a well-defined series of buckets.There should probably be one bucket for each internally-recognized scheme, and a bucket for externally-handled schemes. It looks like
NS_NewURI
now directly encodes this list, rather than using interface lookup, so it might be easiest to maintain the list of known schemes nearby that function (https://searchfox.org/mozilla-central/rev/96403eac4ff6f96f89a0b120b29bd46540d5705e/netwerk/base/nsNetUtil.cpp#1694).
So, I look into the telemetry we collected.
What we're interested in is {http(s), about, javascript, ftp, file, jar, data}
I'm not sure if I need to take care of {ssh, moz*, dat}
We have few but non-zero redirections.
ssh
, dat
use external app, and I believe we will remove them from NS_NewURI
in bug 1561860.
moz*
is something we should forbid.
By the way, we have extremely low samples of redirections to ftp/file.
The number is much less than magnet://
, a p2p protocol IIRC.
Assignee | ||
Comment 3•5 years ago
|
||
For fission purpose, we will continue to break down this information by whether the redirection happens for a document load or a subresource.
Assignee | ||
Comment 4•5 years ago
|
||
Assignee | ||
Comment 5•5 years ago
|
||
Not confident with the main thread assertion. Let's see.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7a7f696cd772c3ac01ce2625ae34f86077fa4853
Updated•5 years ago
|
Assignee | ||
Comment 6•5 years ago
|
||
Comment 7•5 years ago
|
||
Comment on attachment 9110921 [details]
request.md
Load balancing to Loren.
Assignee | ||
Comment 8•4 years ago
|
||
Sorry for the push. Could we move this forward? Thanks.
Assignee | ||
Comment 9•4 years ago
|
||
Any possible to have your attention here, Chris? I don't want to miss this train again. Sorry for the inconvenience.
Comment 10•4 years ago
|
||
Comment on attachment 9110921 [details] request.md PRELIMINARY NOTES: The data collection will only collect on pre-release channels (it lacks a "releaseChannelCollection" key with value "opt-out"), but the review request is for all channels. I'll proceed with the review for all channels. DATA COLLECTION REVIEW RESPONSE: Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? Yes. This collection is Telemetry so is documented in its definitions file [Histograms.json](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/Histograms.json) and the [Probe Dictionary](https://telemetry.mozilla.org/probe-dictionary/). Is there a control mechanism that allows the user to turn the data collection on and off? Yes. This collection is Telemetry so can be controlled through Firefox's Preferences. If the request is for permanent data collection, is there someone who will monitor the data over time? Yes, Junior Hsu is responsible. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? Category 1, Technical. Is the data collection request for default-on or default-off? Default on for all channels. Does the instrumentation include the addition of any new identifiers? No. Is the data collection covered by the existing Firefox privacy notice? Yes. Does there need to be a check-in in the future to determine whether to renew the data? No. This collection is permanent. --- Result: datareview+
Assignee | ||
Comment 11•4 years ago
|
||
Thanks for the review. I'll update to patch to opt-out all channels.
Comment 12•4 years ago
|
||
Pushed by juhsu@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f1be062c4e2a Resurrect telemetry for http redirects, r=valentin
Comment 13•4 years ago
|
||
bugherder |
Comment 14•4 years ago
|
||
This test fails on Thunderbird otherwise.
Comment 15•4 years ago
|
||
Pushed by mkmelin@iki.fi: https://hg.mozilla.org/integration/autoland/rev/c68fb518e696 Make the test_redirect_protocol_telemetry.js test collect telemetry data for all products during testing. r=valentin
Comment 16•4 years ago
|
||
(In reply to Pulsebot from comment #15)
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/integration/autoland/rev/c68fb518e696
Make the test_redirect_protocol_telemetry.js test collect telemetry data for
all products during testing. r=valentin
I just realized we should also clear the pref after the test like so:
https://searchfox.org/mozilla-central/rev/62a130ba0ac80f75175e4b65536290b52391f116/netwerk/test/unit/test_parse_content_type.js#430-434
Geoff, can you submit an extra patch that does so?
Comment 17•4 years ago
|
||
AFAIK, xpcshell tests get a clean plate automatically so I would think that's not necessary.
Comment 18•4 years ago
|
||
bugherder |
Comment 19•4 years ago
|
||
I agree with Magnus, but I'll do it if it means you'll remember not to break us in this way again. :-)
Updated•4 years ago
|
Updated•4 years ago
|
Description
•