Server Timings with navigation timings regressed in firefox 63

RESOLVED FIXED in Firefox 64

Status

()

defect
P2
normal
RESOLVED FIXED
7 months ago
2 months ago

People

(Reporter: mcmanus, Assigned: dragana, NeedInfo)

Tracking

({regression})

63 Branch
mozilla65
Points:
---

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox63 wontfix, firefox64+ fixed, firefox65 fixed)

Details

(Whiteboard: [necko-triaged])

Attachments

(2 attachments)

Reporter

Description

7 months ago
server timings for navigation timing (i.e. the root document), when used with trailers in h2 worked fine in 62 but are not present in >=63

here is an example https://h2.fastlylabs.com/pass.php

goto console (f12), performance.getEntries() under navigation timing find serverTiming.. you should find one called trailer-timestamp - that's the one from the trailer. (the server named it that for convenience)

repeat the test with 63, no entry.

It works fine with sub documents that use resource timing instead of navigation timing.

I believe I have debugged the right(?) answer.. but I'm not sure why it regressed.

The core issue is that the performance object for nav timings was populated as part of docshell creation trigger in OnStart when only the headers are there... any trailers that arrived after that point in time were ignored. Resource timing works a little differently and didn't have the same early start.

Its plausible the regression is simply a race condition.

the good news for us is that there is already a pattern for "update these things from the http channels after the document is done".. called by PerformanceNavigationTiming::UpdatePropertiesFromHttpChannel() so its just a matter of moving the server timing code into the code called by that.

I'll attach the diff
Reporter

Comment 1

7 months ago
[Tracking Requested - why for this release]: regeression vs 63 .. unsure of bug #
Has Regression Range: --- → yes
Has STR: --- → yes
Reporter

Comment 2

7 months ago
(there's no particular reason to think this is limited to h2, its just only verified there. judging by the patch it shouldn't be relevant)
Assignee

Updated

7 months ago
Assignee: nobody → dd.mozilla
Assignee

Updated

7 months ago
Priority: -- → P2
Whiteboard: [necko-triaged]
Assignee

Comment 3

6 months ago
I debugged it a bit. We have HttpBaseChannel::MaybeReportTimingData that should update the server timing (it is called from OnStopRequest) but GetPerformanceStorage returns nullptr because it cannot find loading document.

I will move GetServerTiming into PerformanceNavigationTiming::UpdatePropertiesFromHttpChannel().
Valentin, can you take a look, I think we should have a loading doc in LoadInfo.
Flags: needinfo?(valentin.gosu)
(In reply to Dragana Damjanovic [:dragana] from comment #3)
> Valentin, can you take a look, I think we should have a loading doc in
> LoadInfo.

For top level loads the loading doc is null - that seems to be the case here.
 
> I will move GetServerTiming into
> PerformanceNavigationTiming::UpdatePropertiesFromHttpChannel().

I think that's the right course. This probably regressed in bug 1462879.
Flags: needinfo?(valentin.gosu)
Assignee

Comment 5

6 months ago
(In reply to Valentin Gosu [:valentin] from comment #4)
> (In reply to Dragana Damjanovic [:dragana] from comment #3)
> > Valentin, can you take a look, I think we should have a loading doc in
> > LoadInfo.
> 
> For top level loads the loading doc is null - that seems to be the case here.
>  
> > I will move GetServerTiming into
> > PerformanceNavigationTiming::UpdatePropertiesFromHttpChannel().
> 
> I think that's the right course. This probably regressed in bug 1462879.

Does it mean that MaybeReportTimingData() is not necessary? It does nothing for the top level load but does for everything else? The resource timing should be the same for everybody. Or is this only for serviceWorkers(if so a comment would be good)?
Flags: needinfo?(valentin.gosu)
(In reply to Dragana Damjanovic [:dragana] from comment #5)
> Does it mean that MaybeReportTimingData() is not necessary? It does nothing
> for the top level load but does for everything else? The resource timing
> should be the same for everybody. Or is this only for serviceWorkers(if so a
> comment would be good)?

So, calling performance.getEntries() will usually return an array such as
> [ PerformanceNavigationTiming, PerformanceResourceTiming, PerformanceResourceTiming, PerformanceMark ]

The PerformanceResourceTiming entries are inserted into their loading document list by calling MaybeReportTimingData. These resources could be images, iframes, etc.
Service workers also report their entries via MaybeReportTimingData (90% sure).

The corner case here is PerformanceNavigationTiming which gets inserted by calling CreateNavigationTimingEntry at the moment when the performance object is created for the document.

What we should do is to make CreateNavigationTimingEntry and AddEntry share more code to avoid this kind of inconsistencies.
Flags: needinfo?(valentin.gosu)
Assignee

Updated

6 months ago
Keywords: checkin-needed

Comment 8

6 months ago
Pushed by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f377e1c9ae60
ServerTiming header must be updated onStopRequest as well. r=valentin
Keywords: checkin-needed

Comment 9

6 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f377e1c9ae60
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Assignee

Comment 10

6 months ago
Comment on attachment 9025381 [details]
ServerTiming header must be updated onStopRequest as well.

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: bug 1462879

User impact if declined: This is feature for web developers. The server timing delivered in trailers will not be shown.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: The issue can be reproduce using steps in the bug description.

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): the code is very simple and not risky.

String changes made/needed: none
Attachment #9025381 - Flags: approval-mozilla-beta?
Comment on attachment 9025381 [details]
ServerTiming header must be updated onStopRequest as well.

fix perf timing regression, approved for 64.0b12
Attachment #9025381 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+

Comment 13

4 months ago

Hi,

We would like to verify this issue and could not do it for now. Can you please elaborate a bit about the reproduction steps?
Couldn't find the trailer-timestamp as described on the latest Nightly. Not quite sure if reproduction steps changed in the meantime or I'm not following the instructions correctly.

NI? Dragana since Patrick does not accept NI? requests for the moment.

A screen recording/screenshot would be highly appreciated.
Thanks!

Flags: needinfo?(dd.mozilla)
Whiteboard: [necko-triaged] → [necko-triaged][qa-triaged]
QA Whiteboard: [qa-triaged]
Whiteboard: [necko-triaged][qa-triaged] → [necko-triaged]

Discussed with Julien Cristau [:jcristau], since this is an old one and went to Release already. Removing the qe verify+ flag.

Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.