Closed Bug 1123920 Opened 5 years ago Closed 5 years ago

Resource Timing domainLookupEnd and requestStart out-of-order

Categories

(Core :: DOM: Core & HTML, defect)

35 Branch
x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- wontfix
firefox38 --- fixed
firefox39 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: steve, Assigned: valentin)

References

Details

(Keywords: reproducible)

Attachments

(5 files, 3 obsolete files)

51.46 KB, image/png
Details
2.74 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
4.12 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
1.45 KB, patch
Details | Diff | Splinter Review
840 bytes, patch
Details | Diff | Splinter Review
Attached image ff-res-timing.png
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Build ID: 20150108202552

Steps to reproduce:

Loaded a website and investigated performance.getEntries()


Actual results:

Several entries (resources) have requestStart less than domainLookupEnd. I've attached a screenshot showing the output in Developer Tools. Here are the resource entry's properties sorted:
0: redirectEnd, redirectStart, secureConnectionStart
1545.600197: connectEnd, connectStart, fetchStart, requestStart, startTime
1545.682271: domainLookupStart
1648.527292: domainLookupEnd
1648.580909: responseStart
1872.783224: responseEnd


Expected results:

requestStart MUST be greater than domainLookupEnd according to the spec (see http://www.w3.org/TR/resource-timing/#processing-model).
Valentin, do you know what's going on here?

(I can trivially reproduce on 36 beta, at least, by inspecting this page's getEntries() - the main CSS file shows the same issue)
Status: UNCONFIRMED → NEW
Component: Untriaged → DOM
Ever confirmed: true
Flags: needinfo?(valentin.gosu)
Keywords: reproducible
Product: Firefox → Core
This is a bug in our implementation: bug 1079837 comment 457
Thanks for filing it, Steve.
Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu)
This patch does not fix the issue in this bug's description, but it does take care of the one in bug 1079837. The problem was with reused connections, for which it seems that OnTransportStatus(NS_NET_STATUS_CONNECTED_TO) is called after ReadRequestSegment, and messes up the expected order of the timestamps. Does this approach seem acceptable, or is there a way to change the order those methods are called?

A patch for the other issue is also incoming.
Attachment #8567430 - Flags: feedback?(mcmanus)
It seems that sometimes OnLookupComplete is called after the requestStart timestamp (first write to socket) - maybe because the dnsEntry expires while the socket is still connecte, or maybe for some other reason. In any case, using the dnsPrefetch's timestamps isn't appropriate.
Attachment #8567765 - Flags: review?(mcmanus)
Comment on attachment 8567430 [details] [diff] [review]
requestStart timestamp is before connectEnd

Review of attachment 8567430 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +516,5 @@
> +        // If the connection is reused, we may already have a usable socket,
> +        // and the requestStart may already be set.
> +        // These attributes need to be smaller or equal to requestStart.
> +        mozilla::TimeStamp timestamp = (mConnection && mConnection->IsReused())
> +             ? Timings().requestStart : TimeStamp::Now();

this doesn't feel right.. why is the check IsReused() rather than  some variation of !requestStart.IsNull()

maybe:
timestamp = requestStart.IsNull() ? TimeStamp::Now : requestStart;
Attachment #8567430 - Flags: feedback?(mcmanus)
(In reply to Valentin Gosu [:valentin] from comment #4)
> Created attachment 8567765 [details] [diff] [review]
> Resource Timing domainLookupEnd and requestStart out-of-order
> 
> It seems that sometimes OnLookupComplete is called after the requestStart
> timestamp (first write to socket) - maybe because the dnsEntry expires while
> the socket is still connecte, or maybe for some other reason.

I think you're right.. I think the prefetch really does a DNS query (and thus is 'valid') because of a dns cache expiration but the socket code actually never uses that lookup because a persistent connection is used instead.

> In any case,
> using the dnsPrefetch's timestamps isn't appropriate.

this is a little less clear (taking them if they are less than the other values might make sense) it is conservative
Attachment #8567765 - Flags: review?(mcmanus) → review+
Attachment #8567430 - Attachment is obsolete: true
Attachment #8567765 - Attachment is obsolete: true
Comment on attachment 8568892 [details] [diff] [review]
ConnectStart and ConnectEnd should call DomainLookupEndHighRes if null

Review of attachment 8568892 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsPerformance.cpp
@@ +276,5 @@
>      return mZeroTime;
>    }
> +  return (!mConnectStart.IsNull())
> +          ? TimeStampToDOMHighRes(mConnectStart)
> +          : DomainLookupEndHighRes();

Please reverse the conditional, like:

  return mConnectStart.IsNull() ?
         // null case :
         // non-null case;

@@ +293,5 @@
>      return mZeroTime;
>    }
> +  return (!mConnectEnd.IsNull())
> +          ? TimeStampToDOMHighRes(mConnectEnd)
> +          : DomainLookupEndHighRes();

Ditto.
Attachment #8568892 - Flags: review?(ehsan) → review+
Attachment #8568902 - Flags: review?(mcmanus) → review+
Attachment #8568890 - Flags: review?(mcmanus) → review+
How far back can we safely backport this?
Flags: needinfo?(valentin.gosu)
37 and 38 are both long-lived branches, so I'd really like to get it on both if possible.
Attachment #8568892 - Attachment is obsolete: true
Aurora:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e637a6c86a46
Beta:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=40f15f3d3c11

Need to figure out the failures, will nominate after :)
Flags: needinfo?(valentin.gosu)
Duplicate of this bug: 1124930
(In reply to Valentin Gosu [:valentin] from comment #16)
> Aurora:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=e637a6c86a46
> Beta:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=40f15f3d3c11
> 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6fef3331b56d

Ryan, I have looked into the failures. The oranges don't seem to be related to this patch.
My main concern is the failure in the Crashtest - it fails in gfx in something dealing with timestamps. Have you seen this before? I'm a bit puzzled about how I'd fix this - or what patches I'd also need to uplift to make this work.
Flags: needinfo?(ryanvm)
None of those matter for the release branches. Keep in mind that some of those test suites normally only run on trunk :)
Flags: needinfo?(ryanvm)
Comment on attachment 8568890 [details] [diff] [review]
requestStart timestamp is before connectEnd when using persistent connections

Approval Request Comment

Patch series for aurora: attachment 8568886 [details] [diff] [review], attachment 8568890 [details] [diff] [review], attachment 8569939 [details] [diff] [review], attachment 8568902 [details] [diff] [review]

Patch series for beta: attachment 8576678 [details] [diff] [review], attachment 8576679 [details] [diff] [review], attachment 8569939 [details] [diff] [review], attachment 8568902 [details] [diff] [review]

[Feature/regressing bug #]:
Issue has been present since the initial implementation of timing interface for channels.
[User impact if declined]:
Incorrect ResourceTiming and PerformanceTiming values for resources that are served over a persistent connection.
These patches depend on the patch in bug 1132572 which fixes a data race.
This also reenables several tests for resource timing.
[Describe test coverage new/current, TreeHerder]:
Has been on nightly for a few weeks. Treeherder try pushes.
[Risks and why]:
The risks are low. The patches aren't overly complex.
Possible regression described in bug 1139831 (minor, compared to improvements gained by uplifting this bug)
[String/UUID change made/needed]:
None.
Attachment #8568890 - Flags: approval-mozilla-beta?
Attachment #8568890 - Flags: approval-mozilla-aurora?
Comment on attachment 8568890 [details] [diff] [review]
requestStart timestamp is before connectEnd when using persistent connections

After speaking with Valentin, this change isn't critical enough to take in Firefox 37. As Ryan said, b2g37 will be a long lived branch so it may make sense to take this change there. We should take this change in 38, which is the next ESR (long lived) branch. I'm going to deny uplift to Beta but nom for b2g37 (see comment 21) and approve for Aurora.
Attachment #8568890 - Flags: approval-mozilla-beta?
Attachment #8568890 - Flags: approval-mozilla-beta-
Attachment #8568890 - Flags: approval-mozilla-b2g37?
Attachment #8568890 - Flags: approval-mozilla-aurora?
Attachment #8568890 - Flags: approval-mozilla-aurora+
Attachment #8569939 - Flags: approval-mozilla-aurora+
Attachment #8568902 - Flags: approval-mozilla-aurora+
Comment on attachment 8568890 [details] [diff] [review]
requestStart timestamp is before connectEnd when using persistent connections

Actually, it's the other three patches on this bug that are required for b2g37.
Attachment #8568890 - Flags: approval-mozilla-b2g37?
Attachment #8568902 - Flags: approval-mozilla-b2g37?
Attachment #8569939 - Flags: approval-mozilla-b2g37?
Attachment #8576679 - Flags: approval-mozilla-b2g37?
Attachment #8568902 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Attachment #8569939 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Attachment #8576679 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Duplicate of this bug: 819685
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.