Closed Bug 1736864 Opened 3 years ago Closed 3 years ago

Add telemetry for improved download other content type handling

Categories: Firefox :: Downloads Panel, task
Points: 3

Status: RESOLVED FIXED
Target Milestone: 96 Branch
Tracking Status: firefox96 --- fixed

People: Reporter: enndeakin, Assigned: enndeakin

Whiteboard: [fidefe-mr11-downloads]

Attachments: 5 files

With the changes to downloads to not prompt the user as much, we want to verify that users do not encounter issues with downloads and are prompted less often.

There are three telemetry keys that already exist related to downloads:

  • 'downloads.added' which occurs when a download is started by any means. This is further divided by file extension type
    from a hard-coded list of file extensions. Unrecognized extensions not in that list are counted under 'other'.
  • 'downloads.file_opened' which occurs when a downloaded file is opened. This is simply a count of the number of times it occurs.
  • 'pdf.viewer.used' which occurs when the internal pdf viewer is opened to display a pdf file, regardless of source.

The following telemetry is recorded when opening a downloaded file, depending on the action the user has set for that type (in preferences).

  • Open in Firefox is selected -> 'downloads.added' and 'downloads.file_opened' and 'pdf.viewer.used'
  • Open in Other is selected -> 'downloads.added' and 'downloads.file_opened', the former happens when the download starts and the latter happens when the download finishes.
  • Save file is selected -> 'downloads.added' only
  • If Always Ask is selected, the same results as above occur but after the user selects the action from the dialog.

In addition, when a file is opened from the download manager or downloads panel -> 'downloads.file_opened' only. 'pdf.viewer.used' will also occur if the file opens in the internal pdf viewer.
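
As a rough sketch of the mapping above (not Firefox code): the function names and action labels below are hypothetical, and the 'Open in Firefox' case assumes the file is a PDF shown in the internal viewer.

```python
from typing import List

# Hypothetical labels for the per-type action chosen in preferences.
OPEN_IN_FIREFOX = "open-in-firefox"
OPEN_IN_OTHER = "open-in-other"
SAVE_FILE = "save-file"


def expected_probes(action: str) -> List[str]:
    """Existing telemetry keys expected for one download, per the list above."""
    if action not in (OPEN_IN_FIREFOX, OPEN_IN_OTHER, SAVE_FILE):
        raise ValueError(f"unknown action: {action}")
    probes = ["downloads.added"]  # recorded when the download starts
    if action in (OPEN_IN_FIREFOX, OPEN_IN_OTHER):
        # Recorded when the download finishes and the file is opened.
        probes.append("downloads.file_opened")
    if action == OPEN_IN_FIREFOX:
        # Assumption: the file is a PDF displayed by the internal viewer.
        probes.append("pdf.viewer.used")
    return probes


def probes_when_opened_from_panel(opens_in_pdf_viewer: bool) -> List[str]:
    """Keys expected when an already downloaded file is opened from the panel."""
    probes = ["downloads.file_opened"]
    if opens_in_pdf_viewer:
        probes.append("pdf.viewer.used")
    return probes
```

With Always Ask, the same function would apply to whichever action the user picks in the dialog.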

With the download improvements preference disabled (browser.download.improvements_to_download_panel), the same thing happens as above, but the dialog appears in additional cases even when the action is set to Open in Firefox. This means that a user will get prompted about what to do with a download even though it just opens in the internal pdf viewer afterwards.

There isn't currently any telemetry for the content type helper dialog (unknownContentType.xhtml) appearing. It was removed earlier this year.

From this, I believe it has been determined that the extra telemetry needed is to add back the content type helper dialog telemetry, but in a revised form. It would only be temporary until we are satisfied no regression of download functionality has occurred.

For PDF in particular, currently:

If the setting for PDF is set to Ask, Save or Open External, the download gets handed off to the external helper app service. The dialog appears if Ask is assigned, otherwise the download is saved or opened in the external application.

If the setting for PDF is set to Open Internal, then any type with Content-disposition: attachment is also handed off to the external helper app service in a similar way and the dialog always appears. If attachment is not specified, then the pdf viewer is used directly. With browser.download.improvements_to_download_panel set to true, the dialog only appears when Ask is set in preferences. The pdf viewer is shown in every other case.

application/octet-stream files are also handled as pdfs if they are detected to be pdf files as well. A similar process occurs as above.
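
The PDF rules above can be sketched as a single predicate. This is only my reading of the description, not the actual implementation: the function name and the setting labels are hypothetical, and "every other case" is taken to refer to the Open Internal setting.

```python
def helper_dialog_appears(setting: str, is_attachment: bool,
                          improvements_enabled: bool) -> bool:
    """Return True if the content type helper dialog would be shown for a PDF.

    setting: hypothetical label, one of "ask", "save", "external", "internal".
    is_attachment: the response carried Content-disposition: attachment.
    improvements_enabled: browser.download.improvements_to_download_panel.
    """
    if setting not in ("ask", "save", "external", "internal"):
        raise ValueError(f"unknown setting: {setting}")
    if setting == "ask":
        return True  # the dialog always appears when Ask is set
    if setting in ("save", "external"):
        return False  # saved or handed to the external application
    # setting == "internal"
    if improvements_enabled:
        return False  # the pdf viewer is shown without prompting
    # Pref disabled: attachments go through the helper service and prompt.
    return is_attachment
```

The case this bug cares about is `helper_dialog_appears("internal", True, False)`, where the user is prompted even though the file ends up in the internal viewer.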

Summary: Add telemetry for improved → Add telemetry for improved download other content type handling

Proposed:

Adding telemetry when the external helper app gets invoked with information about:

  • the action to take (ask, view-internal, external-app, save)
  • whether this is a content-disposition: attachment or not (mime type sniffed is also a value that could be assigned here as well)
  • content type, although I think we only need to distinguish pdf, octet-stream and other.

Is this enough?

Working on a test, but it modifies the one in browser_download_open_with_internal_handler.js, so I'm going to wait a couple of days, both to get feedback on whether this is the desired telemetry and for bug 1736749 to be finished.

After looking at bug 1719892 a bit, I realize that after that bug is fixed, the telemetry added here won't capture pdf attachments when 'internal' is set as the action, so I may need to add some additional telemetry that gets recorded when pdfs open internally.

Points: --- → 3

This is what will be implemented here:

Each telemetry event occurs when the external helper app is invoked and contains:

  • the action to take, (the last three are failure cases):
    ask - dialog is going to appear to ask the user
    internal - open internally
    external - open in an external application
    save - save to disk
    forbidden - download not allowed (typically because the page is in a sandboxed frame)
    spam - download not allowed because it was not initiated by the user (this feature was added by 1725353)
    savefailed - creating the file on disk to save to failed
  • whether this is a content-disposition: attachment or not. Possible values are 'attachment', 'sniffed', or 'other'. The sniffed value means that the type was guessed from the content.
  • content type, either 'pdf', 'octetstream' or 'other'
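
To make the three fields concrete, here is a minimal sketch of how the values could be bucketed. It is illustrative only: the function names and dictionary keys are hypothetical, the MIME strings are the standard ones for these types, and giving 'attachment' precedence over 'sniffed' is an assumption.

```python
from typing import Dict

# Values listed above; the last three are the failure cases.
ACTIONS = ("ask", "internal", "external", "save",
           "forbidden", "spam", "savefailed")


def content_type_bucket(mime_type: str) -> str:
    """Collapse a MIME type into 'pdf', 'octetstream' or 'other'."""
    normalized = mime_type.split(";", 1)[0].strip().lower()
    if normalized == "application/pdf":
        return "pdf"
    if normalized == "application/octet-stream":
        return "octetstream"
    return "other"


def disposition_bucket(is_attachment: bool, was_sniffed: bool) -> str:
    """Return 'attachment', 'sniffed' or 'other'.

    Assumption: 'attachment' wins if both flags are set.
    """
    if is_attachment:
        return "attachment"
    if was_sniffed:
        return "sniffed"
    return "other"


def build_event(action: str, mime_type: str,
                is_attachment: bool, was_sniffed: bool) -> Dict[str, str]:
    """Assemble the three recorded fields (key names are hypothetical)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {
        "action": action,
        "disposition": disposition_bucket(is_attachment, was_sniffed),
        "type": content_type_bucket(mime_type),
    }
```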

I currently set it to expire in version 100. Is that too early/late?

Romain, does the above sound like it would meet your needs?

Also, in the future, would we also need to handle cases where pdfs are opened internally directly without being downloaded (bug 1719892)? The pdf viewer already has telemetry for when it is used.

Flags: needinfo?(rtestard)

Thanks, the telemetry listed here looks good to me.
Regarding bug 1719892, I agree that adding details about whether the opened PDF originated from Content-disposition: attachment would be useful to ensure that downloads that are no longer counted by 'downloads.added' (Content-disposition: attachment) are indeed captured by 'pdf.viewer.used'. I'm adding a comment to that bug.

Flags: needinfo?(rtestard)

Per https://bugzilla.mozilla.org/show_bug.cgi?id=1719892#c2 it seems like the requirement to add pdf telemetry fits better here so here it is:

  • What: add telemetry that captures whether the PDF being opened was served with Content-disposition: attachment
  • Why: this will help validate that downloads lost on 'downloads.added' are compensated by pdf opens added to 'pdf.viewer.used'.

Note that this doesn't ever trigger currently, as pdf files loaded as attachments always get downloaded and go through the external helper service.
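
One way to think about the "Why" above is as a simple balance check. This is only an illustration of the idea: the function name, its parameters and the 10% tolerance are all hypothetical and not part of any agreed analysis.

```python
def lost_downloads_compensated(added_before: int, added_after: int,
                               attachment_viewer_opens: int,
                               tolerance: float = 0.1) -> bool:
    """Check that a drop in 'downloads.added' counts for PDFs is roughly
    matched by pdf viewer opens flagged as attachments.

    tolerance is the allowed relative mismatch (hypothetical default).
    """
    lost = added_before - added_after
    if lost <= 0:
        return True  # nothing was lost, so nothing needs compensating
    return abs(lost - attachment_viewer_opens) <= tolerance * lost
```

For example, with made-up counts, a drop from 1000 to 700 PDF downloads alongside 290 attachment opens in the viewer would pass, while only 100 such opens would not.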

Attached file data review

This is the unfinished data review. Romain, could you review for correctness, answer the questions that I don't know the answer to, and request data-review?

Flags: needinfo?(rtestard)
Attached file fred5.txt
Flags: needinfo?(rtestard)
Attachment #9249964 - Flags: data-review?(chutten)

Comment on attachment 9249964 [details]
fred5.txt

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

No. This collection will expire in Firefox 100.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction.

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does the data collection use a third-party collection tool?

No.


Result: datareview+

Attachment #9249964 - Flags: data-review?(chutten) → data-review+
Pushed by neil@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e368b894f43c
add telemetry when the external helper service is invoked to handle a content type, specifying the action to take and some details about the data to download, r=Gijs
https://hg.mozilla.org/integration/autoland/rev/2859d2cfa458
add tests that verify download telemetry for different types of download scenarios, r=Gijs
https://hg.mozilla.org/integration/autoland/rev/76b99d14b9f7
add extra telemetry flag to indicate if the pdf viewer is opening an attachment, r=Gijs
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

Neil, can you elaborate a bit more on the telemetry probes that we expect to see with browser.download.improvements_to_download_panel set to true/false?

Flags: needinfo?(enndeakin)

The details are in the data-review request, question 5, copied here:

Description: downloads.helpertype.unknowntype

unknowncontenttype.pdf_action. An event indicating the action to take on a file: ask the user, display internally, use an external viewer or save the file. In addition, three error cases are also indicated (download blocked because the page tried to trigger multiple downloads at once, download blocked due to sandboxing, download failed due to disk error). The values are:
ask - dialog is going to appear to ask the user
internal - open internally
external - open in an external application
save - save to disk
forbidden - download not allowed (typically because the page is in a sandboxed frame)
spam - download not allowed because it was not initiated by the user
savefailed - creating the file on disk to save to failed

Two other details are also recorded:

  • whether this is an attachment download (http header Content-disposition: attachment), or the type was sniffed from the content
  • the type indicator (limited to pdf, octetstream or other)
Flags: needinfo?(enndeakin)

Most of the telemetry here has been verified as part of the Nightly 97 Remove Download Panel test run, with a few questions that popped up:

  1. With the latest changes, now accessing a PDF with any CD type will result in opening it, without downloading (default download settings). Since there is no download involved, we are not logging any telemetry for this case, I'm wondering if there would be intent for this case to be logged in some manner anyways.
  2. pdfjs download button and context menu save as only generate the type indicator (e.g. 17418 downloads added fileExtension pdf) - expected?
  3. couldn't figure out use cases for forbidden, spam, savefailed; anything we might use to trigger these cases?
Flags: needinfo?(enndeakin)
  1. With the latest changes, now accessing a PDF with any CD type will result in opening it, without downloading (default download settings). Since there is no download involved, we are not logging any telemetry for this case, I'm wondering if there would be intent for this case to be logged in some manner anyways.

We should be getting the pdf.viewer.is_attachment telemetry for this case when the pdf viewer opens.

  2. pdfjs download button and context menu save as only generate the type indicator (e.g. 17418 downloads added fileExtension pdf) - expected?

I think that is correct. The telemetry added here only applies when the unknown content type dialog appears (the dialog that asks whether to save or open with another application), and when the pdf viewer is started.

  3. couldn't figure out use cases for forbidden, spam, savefailed; anything we might use to trigger these cases?

I think 'spam' comes from a page auto-triggering more than one download at a time. 'forbidden' triggers when a download occurs from an iframe with a sandbox attribute without the 'allow-downloads' flag. As a guess, 'savefailed' might be tested by setting the download directory to a read-only location. It probably isn't worth spending a lot of time manually testing any of these, and we do have automated tests for them anyway.
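
For the 'forbidden' case, the condition described above can be sketched as a check on the iframe's sandbox attribute. This models only the rule as stated here, not Firefox's actual check, and the function name is hypothetical.

```python
from typing import Optional


def download_forbidden_by_sandbox(sandbox_attr: Optional[str]) -> bool:
    """Return True if a download from this iframe would be blocked.

    sandbox_attr is the raw value of the iframe's sandbox attribute,
    or None when the attribute is absent.
    """
    if sandbox_attr is None:
        return False  # no sandbox attribute, so downloads are not restricted
    # Sandbox tokens are space-separated and matched case-insensitively.
    tokens = {token.lower() for token in sandbox_attr.split()}
    return "allow-downloads" not in tokens
```

So a test page for 'forbidden' would need something like `sandbox="allow-scripts"` on the frame that starts the download, while `sandbox="allow-scripts allow-downloads"` should not trigger it.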

Flags: needinfo?(enndeakin)
See Also: → 1754636
See Also: → 1754659