Open Bug 1613448 Opened 5 years ago Updated 3 years ago

Collect additional navigation metrics from browsertime

Categories

(Testing :: Raptor, enhancement, P3)

Tracking

(Not tracked)

People

(Reporter: acreskey, Unassigned)

References

Details

Currently browsertime in automation collects the following metrics:

dcf
fcp
fnbpaint
loadtime
+ visual metrics

There are additional metrics that would be useful, e.g.:

fetchStart (statistics|timings|navigationTiming)
redirectionTime (statistics|timings|pageTimings)
domainLookupTime (statistics|timings|pageTimings)
connectStart (statistics|timings|navigationTiming)
requestStart (statistics|timings|navigationTiming)
responseStart (statistics|timings|navigationTiming)
responseEnd (statistics|timings|navigationTiming)
resourceCount (from the resource timing API)
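
For concreteness, here is a minimal Python sketch of reading these values out of the browsertime output, assuming the pipe-separated paths above name nested JSON keys (the file name and exact layout are assumptions):

    import json

    # Minimal sketch: the path statistics|timings|navigationTiming maps to
    # result["statistics"]["timings"]["navigationTiming"], and similarly
    # for pageTimings.
    with open("browsertime.json") as f:
        results = json.load(f)  # browsertime emits one entry per tested URL

    nav_timing = results[0]["statistics"]["timings"]["navigationTiming"]
    page_timings = results[0]["statistics"]["timings"]["pageTimings"]

    fetch_start = nav_timing["fetchStart"]               # summarized over iterations
    redirection_time = page_timings["redirectionTime"]
    domain_lookup_time = page_timings["domainLookupTime"]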

I'm opening this bug as a starting point for discussion; there are likely different opinions about which metrics are worth collecting.

Is there a specific metric you personally would like to see? If there are several, what is the priority of each?

I would suggest we keep this bug as a meta bug; once something is agreed on, dependencies can be filed.

Keywords: meta
Summary: Collect additional navigation metrics from browsertime → [meta] Collect additional navigation metrics from browsertime
Whiteboard: [perftest:triage]

:acreskey, we can add those two metrics without much impact. We would have to modify the metrics we gather from the browsertime JSON here: https://dxr.mozilla.org/mozilla-central/source/testing/raptor/raptor/results.py#442

The rest will need further discussion, because they would add a lot of data to the perfherder-data.json files that browsertime produces. :igoldan, I've cc'ed you in case you have any thoughts/concerns/etc. on our capacity to gather and store all of the metrics outlined in the description.
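
As a rough illustration only (the actual structure of results.py differs; the names here are hypothetical), the change could look something like:

    # Hypothetical sketch of widening the metric list Raptor copies out of
    # the browsertime JSON; see results.py for the real parsing code.
    MEASUREMENTS = ["dcf", "fcp", "fnbpaint", "loadtime"]

    # Proposed additions from comment 0:
    EXTRA_NAV_METRICS = [
        "fetchStart", "connectStart", "requestStart",
        "responseStart", "responseEnd",
    ]

    def metrics_to_collect(collect_extra=False):
        # Only widen the set on request, keeping the default
        # perfherder-data.json volume unchanged.
        return MEASUREMENTS + (EXTRA_NAV_METRICS if collect_extra else [])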

Priority: -- → P3
Whiteboard: [perftest:triage]

Hmm... If this meta bug ends up being fully implemented, we'll basically triple the volume of Raptor/Browsertime data we generate.
These 2 frameworks, by themselves, already have the highest volume at the moment.

This basically increases the pressure on bug 1343328, as we'll deplete the available ids even faster.
This is a gut feeling; I could provide a firmer estimate if we decide to invest in enabling these extra metrics.

See Also: → 1343328

One idea we just discussed was to capture the additional metrics only if a given flag is passed to the tests, e.g. --verbose or --allmetrics or similar.
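
A minimal sketch of what such an opt-in flag could look like (the flag name and wiring are hypothetical; nothing like this exists in Raptor yet):

    import argparse

    # Hypothetical opt-in flag for the extended navigation metrics.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--all-metrics",
        action="store_true",
        help="Also collect fetchStart, connectStart, requestStart, "
             "responseStart, responseEnd, and resourceCount.",
    )
    args = parser.parse_args()
    collect_extra = args.all_metrics  # hand this to the results parser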

(In reply to Ionuț Goldan [:igoldan] from comment #4)

This basically increases the pressure on bug 1343328, as we'll deplete the available ids even faster.
This is a gut feeling; I could provide a firmer estimate if we decide to invest in enabling these extra metrics.

This came up in the biweekly performance strategy meeting, and there's a clear interest in storing these additional metrics. :igoldan, could you provide an estimate of the impact of each additional metric we would store for browsertime page load results? Should we mark bug 1343328 as a blocker?

Flags: needinfo?(igoldan)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚GMT from comment #6)

[...]
This came up in the biweekly performance strategy meeting, and there's a clear interest in storing these additional metrics.

Are these additional metrics the same as those mentioned in comment #0 (e.g. fetchStart, redirectionTime, ..., resourceCount)? Or has the list been updated?

[...] Should we mark bug 1343328 as a blocker?

I can answer this only after estimating the impact.

Flags: needinfo?(igoldan)
Flags: needinfo?(dave.hunt)

Regarding fetchStart (statistics|timings|navigationTiming): what are these pipe-separated fields?
Do we already collect (or at least generate) values for them from any of the tests we currently have (e.g. from dcf, fcp, fnbpaint, etc.)?

Flags: needinfo?(acreskey)
Version: Version 3 → unspecified

Another question: what's the complete list of the visual metrics tests? How many are there?

A preliminary investigation shows that this might increase the database's capacity from 140 million rows (currently) to 190 million.
WRT the pressure on bug 1343328, we'd consume our available ids 20% faster, from 690K ids/day (currently) to 840K ids/day. (Basically, the browsertime/raptor tests would go from consuming 150K ids/day (currently) to 300K ids/day.)
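
Spelling out the arithmetic behind that estimate (all numbers taken from the figures above):

    # Sanity check of the id-consumption estimate.
    current_total = 690_000   # ids/day, all frameworks combined
    raptor_bt_now = 150_000   # ids/day from browsertime/raptor today
    raptor_bt_new = 300_000   # doubled if the extra metrics land
    new_total = current_total - raptor_bt_now + raptor_bt_new  # 840,000
    increase = new_total / current_total - 1  # ~0.22, i.e. ids consumed ~20% faster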

Preliminary investigation breakdown

Our relatively active raptor & browsertime tests generated around 39 million data points for the whole year.
Our database currently holds 140 million of them.

If the addition of these new metrics doubles the number of tests we run, the database's total capacity will increase to 180 million rows (over the course of a year). This could turn out to be a bare minimum, especially if we implement FXP-1450 - Keep stalled data that has historical value (which could add several million more rows).

(Redash queries used for this investigation)

In my opinion, the statistics entries shouldn't be gathered; that's overkill, since it adds 9 extra metrics per metric. We should gather either 1 statistic measurement or the raw values, but not both. The raw values would be more useful, since we can do more with them, but a statistic could be more stable in graph views.
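
To illustrate that trade-off with made-up numbers: raw values keep every option open for later analysis, while a single summarized statistic is cheaper to store and smoother in graphs:

    import statistics

    # Hypothetical per-iteration fetchStart values (ms) for one page-load test.
    raw_fetch_start = [112, 98, 105, 120, 101]

    # Option A: store all raw values; any statistic can be recomputed later.
    # Option B: store one summary statistic; stabler in graph views, less data.
    median_fetch_start = statistics.median(raw_fetch_start)  # 105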

(In reply to Ionuț Goldan [:igoldan] from comment #8)

Regarding fetchStart (statistics|timings|navigationTiming): what are these pipe-separated fields?
Do we already collect (or at least generate) values for them from any of the tests we currently have (e.g. from dcf, fcp, fnbpaint, etc.)?

I used the pipe-separated fields to describe where in the JSON these values can be found.

Flags: needinfo?(acreskey)

Subject to further discussion, but I think connectStart and resourceCount (as a sanity check) would be the most useful additions.

(In reply to Ionuț Goldan [:igoldan] from comment #8)

Do we already collect (or at least generate) values for them from any of the tests we currently have (e.g. from dcf, fcp, fnbpaint, etc.)?

I believe these metrics are stored in the JSON but not collected anywhere.

Also, it occurred to me that we could potentially remove some metrics.

dcf (DOM Content Flushed) is one that many people feel has little to no value.

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #15)

Also, it occurred to me that we could potentially remove some metrics.

dcf (DOM Content Flushed) is one that many people feel has little to no value.

Should we file a separate bug for this?

(In reply to Ionuț Goldan [:igoldan] from comment #16)

Should we file a separate bug for this?

Yes, please. We should probably block on bug 1672794 to avoid false alerts with the geomean, and ask the perf team if there are any objections to removing this metric.

Flags: needinfo?(dave.hunt)

(In reply to Ionuț Goldan [:igoldan] from comment #16)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #15)

Also, it occurred to me that we could potentially remove some metrics.

dcf (DOM Content Flushed) is one that many people feel has little to no value.

Should we file a separate bug for this?

I logged bug 1689104, "Stop recording the dcf, DomContentFlushed, metric", and I've asked the perf team for comments.

See Also: → 1689104

The meta keyword is set, but the bug doesn't depend on any other bugs, and there has been no activity for 12 months.
:sparky, maybe it's time to close this bug?

Flags: needinfo?(gmierz2)
Flags: needinfo?(gmierz2)
Keywords: meta
Summary: [meta] Collect additional navigation metrics from browsertime → Collect additional navigation metrics from browsertime
Severity: normal → S3