Closed Bug 1579507 Opened 4 months ago Closed 3 months ago

Collect telemetry on FTP usage

Categories

(Core :: Networking: FTP, task, P1)

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: nhi, Assigned: michal)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files, 1 obsolete file)

Collect telemetry data on FTP usage in order to help with the decision whether or not FTP support should be removed (bug 1574475).

We should do this for 71

Assignee: nobody → michal.novotny
Priority: -- → P1
Whiteboard: [necko-triaged]
Blocks: 1574475

Let's elaborate here what data is useful to make that decision.

Flags: needinfo?(michal.novotny)
Flags: needinfo?(dd.mozilla)

From a product standpoint, it would be useful to know:

  • Percentage of users that visit ftp:// URIs (including ftp:// URIs that download an item)
  • Percentage of navigations to ftp:// URIs (including ftp:// URIs that download an item)
  • Be able to split this data by desktop and mobile

This thread from Google shows the telemetry they used to make their decision to deprecate FTP. Also, maybe this could be generalized into reporting the usage of all protocols (FTP, HTTP, HTTPS, etc.) and be able to segment that data by OS and platform.

Blocks: 1570155
See Also: → 1090762

(In reply to Mike Conca [:mconca] from comment #3)

> From a product standpoint, it would be useful to know:
>
>   • Percentage of users that visit ftp:// URIs (including ftp:// URIs that download an item)
>   • Percentage of navigations to ftp:// URIs (including ftp:// URIs that download an item)

What's the difference between these two? Given that we neither display FTP listings nor render FTP resources anymore, it IMO doesn't make sense.

Flags: needinfo?(michal.novotny)

(In reply to Michal Novotny [:michal] from comment #4)

> What's the difference between these two?

The denominator (distinct client_ids vs total pageloads). A probe that reports a count of FTP navigation events would allow us to compute either.

With respect to desktop vs mobile: note that we collect a very limited set of probes from the Fennec release population, in the core ping. Telemetry histograms and scalars are not collected.

This probe should allow us to compute:

  • percentage of users that use FTP
  • percentage of navigations using FTP protocol
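
As a rough illustration of the two ratios discussed above (same numerator source, different denominators), here is a minimal sketch. It assumes a hypothetical per-client record with a total navigation count and an FTP navigation count; the record type, field names and sample values are invented for illustration and are not the actual telemetry schema.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    """Hypothetical per-client record; field names are illustrative only."""
    client_id: str
    total_navigations: int
    ftp_navigations: int  # value of a count-style FTP probe (assumed)


def ftp_usage_shares(records):
    """Return (share of clients that used FTP, share of navigations that used FTP)."""
    clients = {r.client_id for r in records}
    ftp_clients = {r.client_id for r in records if r.ftp_navigations > 0}
    total_navs = sum(r.total_navigations for r in records)
    ftp_navs = sum(r.ftp_navigations for r in records)
    # Denominator 1: distinct clients. Denominator 2: total navigations.
    user_share = len(ftp_clients) / len(clients) if clients else 0.0
    nav_share = ftp_navs / total_navs if total_navs else 0.0
    return user_share, nav_share


# Made-up sample data: 2 of 4 clients used FTP, 10 of 200 navigations were FTP.
records = [
    ClientRecord("a", 100, 0),
    ClientRecord("b", 50, 5),
    ClientRecord("c", 40, 0),
    ClientRecord("d", 10, 5),
]
user_share, nav_share = ftp_usage_shares(records)
print(f"{user_share:.1%} of clients, {nav_share:.1%} of navigations")
```

With this sample, half of the clients touched FTP while only a small fraction of navigations did, which is why the two numbers can lead to different conclusions.
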
Attached file request.md (obsolete)
Attachment #9100058 - Flags: data-review?(chutten)
Comment on attachment 9100058 [details]
request.md

Load balance to Martin.
Attachment #9100058 - Flags: data-review?(chutten) → data-review?(mlopatka)
Attached file request.md

The patch is going to change. I don't have the code yet, but the data review request is for the new code, and the changes will be:

  1. Since we're considering removing core functionality, we want the data from the release version too.

  2. The probe will be split into two: one will count FTP channels used to download a resource and the other will count FTP channels used to download directory listings. This would allow us to decide to keep FTP only for downloading files, while getting rid of the ugly parsing code in ParseFTPList.cpp.

Attachment #9100058 - Attachment is obsolete: true
Attachment #9100058 - Flags: data-review?(mlopatka)
Attachment #9100256 - Flags: data-review?(mlopatka)
Comment on attachment 9100256 [details]
request.md

Passing review over to :tdsmith who has prior context for this request.
Attachment #9100256 - Flags: data-review?(mlopatka) → data-review?(tdsmith)
Comment on attachment 9100256 [details]
request.md

data-review+
I try to avoid reviewing changes I've consulted on, but my involvement has been pretty arm's-length and this seems straightforward.
--
1) Is there or will there be **documentation** that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, in Scalars.yaml and the probe dictionary.

2) Is there a control mechanism that allows the user to turn the data collection on and off? 

Yes, the Firefox telemetry opt-out.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?

n/a

4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, technical data.

5) Is the data collection request for default-on or default-off?

Default-on.

6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc.  See the appendix for more details)?

No.

7) Is the data collection covered by the existing Firefox privacy notice?

Yes.

8) Does there need to be a check-in in the future to determine whether to renew the data?

:michal and Nhi will be responsible for deciding whether to renew the collection before it expires, ahead of Firefox 77.

9) Does the data collection use a third-party collection tool?

No.
Attachment #9100256 - Flags: data-review?(tdsmith) → data-review+
Pushed by mnovotny@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/77d6ebaeecf6
Collect telemetry on FTP usage, r=valentin
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
Flags: needinfo?(dd.mozilla)

Did you have a chance to test this? I don't see any examples of these being incremented in telemetry. Testing in latest nightly after visiting ftp://ftp.ca.debian.org/ and downloading a file, I don't see the scalars being set in about:telemetry.

I wonder if record_in_processes needs to contain content.

Flags: needinfo?(michal.novotny)

(In reply to Tim Smith 👨‍🔬 [:tdsmith] from comment #14)

> Did you have a chance to test this? I don't see any examples of these being incremented in telemetry. Testing in latest nightly after visiting ftp://ftp.ca.debian.org/ and downloading a file, I don't see the scalars being set in about:telemetry.

I've tested that I hit all three places where I added ScalarAdd when using FTP.

> I wonder if record_in_processes needs to contain content.

This code runs only in the parent process.

Flags: needinfo?(michal.novotny)

(In reply to Michal Novotny [:michal] from comment #15)

> I've tested that I hit all three places where I added ScalarAdd when using FTP.

There is a timing issue. It works with a debug build, but with an optimized build the connection is closed before we parse the last response from the server, so ScalarAdd() isn't called. The solution is to update the scalar value when sending the LIST, STOR and RETR commands, even if the response might be an error.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

We don't close the channel cleanly most of the time, so the probes need to be moved to a place where we have a positive response from the server and the data transfer is about to start.
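
The effect of where the probe sits can be shown with a toy replay of a control-connection session. This is only a sketch under stated assumptions: the reply codes 125, 150 and 226 are standard FTP codes from RFC 959, but the event model, the `count_probes` helper and the probe names are hypothetical and do not mirror the actual Firefox patch.

```python
from collections import Counter

# Standard FTP reply codes (RFC 959). Where Firefox actually hooks them is an assumption.
TRANSFER_STARTING = {"125", "150"}  # positive preliminary reply, data transfer about to start
TRANSFER_COMPLETE = {"226"}         # final reply, sent after the transfer has finished

# Hypothetical probe names, for illustration only.
PROBE_FOR_COMMAND = {"RETR": "ftp.download", "STOR": "ftp.upload", "LIST": "ftp.listing"}


def count_probes(events, count_on):
    """Replay (kind, value) events and bump a counter when a reply in count_on is parsed.

    Events that arrive after 'close' are never parsed, mimicking a channel that
    is torn down before the last server response is read.
    """
    counts = Counter()
    pending = None
    for kind, value in events:
        if kind == "close":
            break
        if kind == "command" and value in PROBE_FOR_COMMAND:
            pending = value
        elif kind == "reply" and value in count_on and pending:
            counts[PROBE_FOR_COMMAND[pending]] += 1
            pending = None
    return counts


# The channel is closed before the final 226 reply is parsed.
session = [
    ("command", "RETR"),
    ("reply", "150"),
    ("close", None),
    ("reply", "226"),
]
print(count_probes(session, TRANSFER_COMPLETE))  # Counter()
print(count_probes(session, TRANSFER_STARTING))  # Counter({'ftp.download': 1})
```

Counting on the final reply misses the download entirely in this replay, while counting on the "transfer starting" reply records it even though the channel never closes cleanly.
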

Attachment #9101970 - Attachment description: Bug 1579507 - Collect telemetry on FTP usage, r?valentin → Bug 1579507 - Fix non-working FTP telemetry probes, r?valentin
Keywords: checkin-needed

Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/650273d19207
Fix non-working FTP telemetry probes, r=valentin

Keywords: checkin-needed
Status: REOPENED → RESOLVED
Closed: 3 months ago → 3 months ago
Resolution: --- → FIXED
Duplicate of this bug: 1090762