Closed Bug 1577317 Opened 4 years ago Closed 4 years ago

Telemetry for fix update directory permissions

Categories

(Toolkit :: Application Update, task, P1)

task

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox70 + fixed
firefox71 --- fixed

People

(Reporter: agashlin, Assigned: agashlin)

References

Details

(Whiteboard: [iu_tracking])

Attachments

(2 files)

We'd like to measure how often fix-update-directory-perms is being used from within Firefox. I think this will just require adding a probe to count it at fixUpdateDirectoryPermissions().

Assignee: nobody → agashlin
Status: NEW → ASSIGNED
Priority: -- → P1
Attached file data review request
Attachment #9093489 - Flags: data-review?(tdsmith)
Comment on attachment 9093489 [details]
data review request

1) Is there or will there be **documentation** that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, in Scalars.yaml and the probe dictionary.

2) Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, the Firefox telemetry opt-out.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?

n/a

4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, technical data.

5) Is the data collection request for default-on or default-off?

Default-on.

6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc.)?

No.

7) Is the data collection covered by the existing Firefox privacy notice?

Yes.

8) Does there need to be a check-in in the future to determine whether to renew the data?

:agashlin is responsible for making a decision about whether to renew the collection.

9) Does the data collection use a third-party collection tool?

No.
Attachment #9093489 - Flags: data-review?(tdsmith) → data-review+

The name of the proposed scalar is update.fix_permissions_attempted.

Tracking to keep an eye on this, since we would like it in beta 70.

Attachment #9093479 - Attachment description: Bug 1577317 - Telemetry for "fix update directory permissions". r?rstrong → Bug 1577317 - Add telemetry for "fix update directory permissions". r?rstrong
Pushed by agashlin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c5b0b8cd674f
Add telemetry for "fix update directory permissions". r=rstrong
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

Some early telemetry results:

Across a few Nightly builds there were a few submissions with this flag set:

20190921: 55
20190922: 50

For comparison, a more common probe like SCALARS_STARTUP.IS_COLD has 52k and 60k submissions for the same dates. So roughly 1/1000 submissions have this set.
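As a quick check of the "roughly 1/1000" estimate, here is a minimal sketch of the arithmetic. It uses only the counts quoted above; the totals are the rounded "52k" and "60k" figures, so the result is approximate, and the helper name `flagged_rate` is made up for illustration.

```python
# Flagged-submission counts quoted above, keyed by build date.
flagged = {"20190921": 55, "20190922": 50}
# Rounded totals for the comparison probe ("52k and 60k"), so approximate.
total = {"20190921": 52_000, "20190922": 60_000}


def flagged_rate(flagged_count: int, total_count: int) -> float:
    """Return the fraction of submissions that carried the flag."""
    return flagged_count / total_count


# Per-day rate, plus the rate over both days combined.
rates = {day: flagged_rate(flagged[day], total[day]) for day in flagged}
overall = sum(flagged.values()) / sum(total.values())

for day, rate in rates.items():
    print(f"{day}: {rate:.4%} (about 1 in {round(1 / rate)})")
print(f"overall: {overall:.4%}")
```

Both days land close to one flagged submission per thousand, which is consistent with the estimate.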

I haven't been able to run any more detailed queries on this yet (such as how many clients this affects): there are so few flagged submissions that none show up in the 1% default.longitudinal_v20190921 table, and other tables don't have the probe yet.

It should be available in the moz-fx-data-shared-prod.telemetry_stable.main_v4 table in BigQuery.

Thanks Tim!

Looking at all Windows Nightly builds since 20190921100259, 0.069% of submissions had the flag set to true. Merging these by distinct client_id, 0.41% of clients submitted true at least once. The raw number of affected clients is almost the same as the number of flagged submissions, so it seems like most clients only hit this once (though it's only been a few days so far, with an average of 6 submissions per client). I separately confirmed this by counting: only two clients ever reported it more than once. It was also possible that clients hitting this don't send as many telemetry submissions, but they actually submit more on average (though the stddev is naturally larger, so they may be roughly the same).
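To make the difference between the per-submission and per-client percentages concrete, here is a minimal sketch of the two aggregations plus the repeat count. The rows are made-up illustration data, not real telemetry, and the function names are hypothetical; the actual analysis was run as queries against the telemetry tables.

```python
from collections import defaultdict

# Hypothetical rows: (client_id, flag_set). Invented for illustration only.
submissions = [
    ("a", False), ("a", True), ("a", False),
    ("b", False), ("b", False),
    ("c", False), ("c", False), ("c", False),
]


def submission_share(rows):
    """Fraction of all submissions with the flag set."""
    return sum(1 for _, flag in rows if flag) / len(rows)


def client_share(rows):
    """Fraction of distinct clients that reported the flag at least once."""
    seen = defaultdict(bool)
    for client_id, flag in rows:
        seen[client_id] = seen[client_id] or flag
    return sum(seen.values()) / len(seen)


def repeat_clients(rows):
    """Clients that reported the flag more than once, sorted by id."""
    counts = defaultdict(int)
    for client_id, flag in rows:
        if flag:
            counts[client_id] += 1
    return sorted(c for c, n in counts.items() if n > 1)


print(submission_share(submissions))
print(client_share(submissions))
print(repeat_clients(submissions))
```

When each affected client reports the flag only once, the client share is roughly the submission share multiplied by the average number of submissions per client. That fits the figures here: 0.41 / 0.069 is about 5.9, close to the average of 6 submissions.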

I wondered if these fixes happened on a client's first session, so I checked if the first subsession (min subsession_start_date) with the flag set was also the first subsession from that client, going back to submission date 2019-09-01. There's a roughly 3:4 ratio of first to not first.
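The first-subsession check can be sketched like this. It is only an illustration under assumptions: the rows are invented, the function name `first_vs_later` is hypothetical, and dates are compared at day granularity, so two subsessions starting on the same day count as tied for first.

```python
from datetime import date

# Hypothetical rows: (client_id, subsession_start_date, flag_set).
rows = [
    ("a", date(2019, 9, 21), True),
    ("a", date(2019, 9, 22), False),
    ("b", date(2019, 9, 20), False),
    ("b", date(2019, 9, 23), True),
    ("c", date(2019, 9, 21), False),
]


def first_vs_later(rows):
    """Count clients whose first flagged subsession was their first overall.

    Returns (first, not_first), counting only clients that ever set the flag.
    """
    first_seen = {}     # earliest subsession start per client
    first_flagged = {}  # earliest flagged subsession start per client
    for client_id, start, flag in rows:
        if client_id not in first_seen or start < first_seen[client_id]:
            first_seen[client_id] = start
        if flag and (
            client_id not in first_flagged or start < first_flagged[client_id]
        ):
            first_flagged[client_id] = start
    on_first = sum(
        1 for c, d in first_flagged.items() if d == first_seen[c]
    )
    return on_first, len(first_flagged) - on_first


print(first_vs_later(rows))
```

Here client "a" sets the flag on its first subsession and client "b" on a later one, while client "c" never sets it and is left out of both counts.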

So it looks like it's affecting about 0.4% of clients: the issue comes up suddenly but doesn't recur too frequently.

I'd like to uplift this to beta to see what it looks like in a larger population, and ride towards release.

Comment on attachment 9093479 [details]
Bug 1577317 - Add telemetry for "fix update directory permissions". r?rstrong

Beta/Release Uplift Approval Request

  • User impact if declined: None directly, but it will delay analysis that will inform the design of the updater.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Just adds a telemetry probe; the very small code change is wrapped in an exception handler for safety.
  • String changes made/needed:
Attachment #9093479 - Flags: approval-mozilla-beta?

Comment on attachment 9093479 [details]
Bug 1577317 - Add telemetry for "fix update directory permissions". r?rstrong

Let's try this in beta 10. Adam and rstrong, can you follow up with results next week? Do we need to ship it to release, or is data from beta releases enough?

Attachment #9093479 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

I'd like to have this on release.

Yes, we'll be looking at these values on release, but I think we can let it ride the trains with 70. After it's been on beta for a few beta builds I'll re-run the queries I described above.

I re-ran my analysis on release and beta, for submissions since 2019-12-03 on version 71+, sample_id 0. The flag appears for 0.92% of clients on release, 2.9% of clients on beta.

There are a lot of relevant questions that should be explored:

  • Does this percentage seem to change much over submission time? Build id?
  • How often does this recur for affected clients? Does it happen only once because the client never runs again?
  • How often do affected clients manage to update (via update ready/success ping or later version number)?
  • Is the fix attempt made on the first run of the client? What is the distribution of install ages that encounter it?
Depends on: 1615065
Whiteboard: [iu_tracking]