Closed Bug 1874884 Opened 8 months ago Closed 8 months ago

Validate Glean.js automatic page loads vs manual page loads in the Debug Ping Viewer

Categories

(Data Platform and Tools :: Glean: SDK, task, P2)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brosa, Assigned: brosa)

Details

The Debug Ping Viewer currently has both automatic Glean.js page loads running as well as manual page loads instrumented. We kept both of these for a short time so that we could verify the automatic page loads were behaving as expected.

We need to compare the two over the same period and confirm that we are seeing the same counts. Once this is completed, a follow up bug can be filed to remove the manual page load events from the Debug Ping Viewer.

After doing some analysis, it looks like our counts are mostly the same with a few extra page loads coming from the automatic instrumentation. The built-in Glean page loads are being recorded more accurately than the existing manual page loads. A few of the edge cases called out below are things we didn't even think about until we saw the built-in page loads.

There are a few edge cases that could be causing us to see more page load events with the built-in Glean page loads

  1. The manual page load events are recorded on first mount of specific components. The built-in Glean page load events are hooked into the react-router and are recorded on each route change. This means if the user is on a page like the details page and clicks on a line number or set of line numbers, this will trigger a new page load event since our location (url) has changed.
  2. Along with what was mentioned in #1, when you click on a specific line number (anchor), if the URL is too long you don't actually see the line number coming through. Our event extras are currently limited to 100 characters. There are a few instances where pings we are sending have truncated URLs meaning we can't appropriately filter out the occurrences where the debug tag is long and the whole URL, with the line numbers, aren't captured.
  3. There are no manual events recorded for the Event Stream component/screen.
  4. If you click on the page you are currently viewing in the list of breadcrumbs it will trigger a built-in page load event, but not a manual page load event. There is no way to count the number of times that this happened.

I think based on the counts and the nature of the two, we can conclude that the built-in page loads are being recorded more accurately than the existing manual page load events. I recommend that we remove the existing page load events from the Debug Ping Viewer and collect solely using the Glean built-in page load events.

Query: https://sql.telemetry.mozilla.org/queries/97035/source

(In reply to Bruno Rosa [:brosa] from comment #1)

There are a few edge cases that could be causing us to see more page load events with the built-in Glean page loads

  1. The manual page load events are recorded on first mount of specific components. The built-in Glean page load events are hooked into the react-router and are recorded on each route change. This means if the user is on a page like the details page and clicks on a line number or set of line numbers, this will trigger a new page load event since our location (url) has changed.

This is a problem with our react integration, not with Glean.js itself, right?

  1. Along with what was mentioned in #1, when you click on a specific line number (anchor), if the URL is too long you don't actually see the line number coming through. Our event extras are currently limited to 100 characters. There are a few instances where pings we are sending have truncated URLs meaning we can't appropriately filter out the occurrences where the debug tag is long and the whole URL, with the line numbers, aren't captured.

This looks like a bad bug to me. Didn't we just increase the limit to 500 characters or so? If not, we should fix this.

Flags: needinfo?(brosa)

This is a problem with our react integration, not with Glean.js itself, right?

Yes, the issue is with a few edges missing in our initial integration. It has nothing to do with Glean.js itself.

This looks like a bad bug to me. Didn't we just increase the limit to 500 characters or so? If not, we should fix this.

Yes we did just increase this limit, but we haven't cut the new release yet. Once we release the new version of Glean.js I will update the DPV to use the new version.

Flags: needinfo?(brosa)

Perry, would you mind taking a look at what I have put together in this bug? There is a SQL query I am running to compare the counts as well as some additional info listed as to why the counts are going to be a small bit off. tl;dr: the way we instrumented the built-in page loads reports data more correctly than the previous manual implementation.

Flags: needinfo?(pmcmanis)
Flags: needinfo?(pmcmanis)

I took a look at the query and I am satisfied with Bruno's methodology and explanation of the differences. For my own edification, I also looked at the client id + day level (see query here) and I think that it is clearly the case that Glean's automatic cannot be worse. Furthermore, given how many cases where it's clearly collecting more, and the explanation above, I see sufficient evidence that I would personally support the recommendation outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1874884#c1 i.e. "that we remove the existing page load events from the Debug Ping Viewer and collect solely using the Glean built-in page load events."

Thank you Perry!

As a followup to this bug, I created https://bugzilla.mozilla.org/show_bug.cgi?id=1875839 for actually removing the previous implementation of page load events.

Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.