Closed Bug 1534462 Opened 9 months ago Closed 7 months ago

Diagnose high component feed request times for "complex" variant of Firefox Home smoke test

Categories

(Firefox :: New Tab Page, enhancement, P1)

66 Branch
enhancement

Tracking

()

RESOLVED WORKSFORME
Iteration:
68.2 - Apr 1 - 14

People

(Reporter: kirill, Assigned: kirill, Mentored)

References

Details

Nick Chapman, Su-Young Hong, and I met earlier today and we’re still curious for the final word on what is going on with the COMPONENT_FEED_REQUEST_TIME performance telemetry ping.

Nan mentioned before that this event measures “how long it takes for AS to fetch the data from the endpoint for component feed when the browser starts up”.

Looking at the average & median of this metric over time for the 66 Beta Smoke Test Part 2, there continues to be a large gap between complex and default variants. I would expect there to potentially be a large gap in total feed request time between variants given complex has more content, but it’s strange that each component feed seems to take longer in complex as well (Nick mentioned they are fetched in parallel).

Median:
https://sql.telemetry.mozilla.org/queries/61707#158803

Average:
https://sql.telemetry.mozilla.org/queries/61644#158681

I have looped Scott Downe, Gavin Suntop, Mathijs Miermans, and Nan Jiang into the conversation here at various points. Scott Downe last mentioned that he believes the event to be working correctly, just that people have slow internet:
https://mozilla.slack.com/archives/CEZEF1AHX/p1551378751023300

I’m not sure this explains why complex is so different from default :confused: Any chance those mentioned above can take another look into this metric? I think this may involve Nan and Scott pairing together to look into it.

Depends on: 1531444

Scott, can you elaborate on your findings on this issue?

Flags: needinfo?(sdowne)
Assignee: nobody → sdowne

To clarify, COMPONENT_FEED_REQUEST_TIME is the time that it takes for ALL feed fetches to complete.

This will always be at least the value of the slowest fetch.

Hmmm, interesting. The description I got from Nan on the performance telemetry was a little different:
https://mozilla.slack.com/archives/CEZEF1AHX/p1551402972039600

"Would you be able to provide a plain-language description of each of the following performance telemetry ping events?

-LAYOUT_REQUEST_TIME
-SPOCS_REQUEST_TIME
-COMPONENT_FEED_REQUEST_TIME
-DS_FEED_TOTAL_REQUEST_TIME

This would greatly help me understand what each of the events means. Thank you!"

"The first three measure how long it takes for AS to fetch the data from the endpoint for layout, spocs, and component feed when the browser starts up

The last one measures how long for AS to get all the data ready (either from endpoint or from local cache)"

Wouldn't this imply the DS_FEED_TOTAL_REQUEST_TIME is the time it takes for all feed fetches to complete?

DS_FEED_TOTAL_REQUEST_TIME is the time to get all feeds, layout, and spocs from either fetch or cache or a combination.
COMPONENT_FEED_REQUEST_TIME is the time from starting the first feed fetch, to the last feed fetch has finished.

Flags: needinfo?(sdowne)
Blocks: 1534775
Iteration: --- → 68.1 - Mar 18 - 31
Priority: -- → P1
Blocks: 1512725
No longer blocks: 1512725

I wasn't able to find anything out of place.

I wonder if maybe this is actually expected?

We're doing more tests on this in release, so we can compare some fx engagement metrics wit those results to see if there is reason to believe the delay in loading more data is connected to any other metrics.

Iteration: 68.1 - Mar 18 - 31 → 68.2 - Apr 1 - 14
See Also: → 1538769
Assignee: sdowne → kirill

Kirill - have you seen this play out in our 66 layout tests?

Flags: needinfo?(kirill)

For the 66 Layout test, we are seeing more reasonable differences:
https://sql.telemetry.mozilla.org/queries/62221#159907

Essentially, the basic variant has an average component feed request time of ~300ms and all of the more complicated variants are in the neighborhood of ~500ms.

I think we can mark this one as done given we are not seeing the 2-3x difference that was in the smoke test. I am not sure how it got fixed, but this data does not set off my suspicion-meter.

Flags: needinfo?(kirill)

\o/ Thank you Kirill!

Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → WORKSFORME

I would probably run it by Nan if we want to make any changes here.

However, I was aware it was doing more tracking than just the fetch time, and actually just tracking the majority of the time we took setting up. I think though if large block lists are negatively impacting our numbers, or something else, that's something we would want to dig into and fix. I don't mind if it's in the impression though.

Originally when Nan was working on this, we discussed the difficulty of tracking just the fetches as they were not one synchronous fetch anymore so we opted to track startup time. That can also be changed but that's where I would want Nan to weigh in.

Flags: needinfo?(sdowne)
Flags: needinfo?(gsuntop)
Component: Activity Streams: Newtab → New Tab Page
You need to log in before you can comment on or make changes to this bug.