Closed Bug 1694343 Opened 5 years ago Closed 4 years ago

Investigate why Fenix' metrics.distribution_id is sometimes unset

Categories

(Data Platform and Tools :: Glean: SDK, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: janerik, Assigned: janerik)

References

Details

Attachments

(1 file)

Fenix has a metrics.distribution_id that should have one of 2 values: Mozilla or MozillaOnline.
However in the wild we see it being unset even from versions that have that metric.

We should confirm what's happening there.

Resources:

Distribution ID by ping reason

Vast majority of "overdue" pings are missing the distribution ID.

Low numbers for "reschedule" and "tomorrow".
Still higher for "today" and "upgrade".

Distribution ID in overdue pings

The vast majority of "overdue" pings lack other "application"-lifetime metrics.
This suggests that our logic clears application-lifetime metrics too early, before an "overdue" ping is sent. I have not yet tested this further.
We do have a test that should ensure this works.

I am able to reproduce metric pings that are missing the distribution_id.
See the debug ping viewer here: https://debug-ping-preview.firebaseapp.com/pings/jer-appmissing

I launched a local-build of Fenix once, thens topped it.
Then used adb to set a date in the future: adb shell su root date 022410002021.00

This caused an overdue metrics ping (timestamp Feb 23, 2021, 12:45:42 in the debug ping viewer).

There seem to be certain situations in which Fenix is started, intializes Glean, but not initializes further.
These seem to be due to some events happening.

Here's what I did:

  1. Fresh simulator build, install local build of Fenix (with extended logging)
  2. Open it, all looks normal, it sends an upgrade metrics ping, that one does not include a distribution_id (because no old data around)
  3. Force-close the app
  4. Restart the simulator.
  5. After reboot the log shows that Fenix is started in the background, and Glean is initialized. It does not send a metrics ping, but then clears application metrics:
Process org.mozilla.fenix created for broadcast {org.mozilla.fenix/androidx.work.impl.background.systemalarm.RescheduleReceiver}
[...]
libglean_ffi  I  glean_core: Creating new Glean v35.0.0
[...]
glean/MetricsPingSched  E  The 'metrics' ping was already sent today, 1614097389159.
[...]
glean/Glean  E  Clear application lifetime metrics

This is all as expected.

  1. I start up the Fenix manually, eventually it sets metrics.distribution_id.

If I trigger an "overdue" metrics ping in step 6 then it won't have a value for metrics.distribution_id (because the early soft-launch in step 5 cleared it).
I have the suspicion a similar situation can occur without a full reboot.

From all I have right now Glean works as expected, though that might be surprising behavior for Glean users.

Question to Fenix people, e.g. :st3fan: Under what circumstances is Fenix launched in the background and thus runs early Glean initialization, but not much else?

Flags: needinfo?(sarentz)

pocmo merged PR #18482: "Bug 1694343 - Set distributionId right after initializing Glean" in c47f436.


Keeping this bug open for another week, so I remember to check the dat.
It seems like Nightly is only distributed with the Mozilla distribution id, but we should see a drop in missing values I hope.

Priority: P1 → P2

Looking at this query of distribution ids the overwhelming majority of pings now report a value (in Fenix Nightly).
Still some missing, but I'm satisfied for now.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

(In reply to Jan-Erik Rediger [:janerik] from comment #6)

Looking at this query of distribution ids the overwhelming majority of pings now report a value (in Fenix Nightly).
Still some missing, but I'm satisfied for now.

I asked :rxu to backport the patch in attachment 9209066 [details] [review] into MozillaOnline build of Fenix 87.0.0, and it looks like most of the metrics pings still missing distribution_id field are those with ping_info.seq set to 0, see rows with build_hour as 2021032315 in this query.

Reopening so I don't lose track of it.
If it's only with ping_info.seq = 0 then it seems the first ping is sent before any of the data is set. Not sure why that is happening yet.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Flags: needinfo?(sarentz)

Closing this here, this is tracked in 1706637 as its the same underlying issue it seems.

Status: REOPENED → RESOLVED
Closed: 5 years ago4 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: