Devtools reports bogus network timing on 64-bit

RESOLVED FIXED in Firefox 58

Status

defect
P2
normal
RESOLVED FIXED
2 years ago
Last year

People

(Reporter: tcampbell, Assigned: dragana)

Tracking

({regression})

unspecified
Firefox 58
x86_64
Unspecified
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55 unaffected, firefox56 disabled, firefox57 wontfix, firefox58 fixed)

Details

Attachments

(1 attachment, 2 obsolete attachments)

STR
1) Open devtools Network tab
2) Load reddit.com
3) Look at |Finished:| time

Expected: A reasonable time in seconds
Actual: A time of 25 million minutes

This occurs on both Win10 and Linux for me on 64-bit only.

Regression window is approx 2017-07-19 but mozregression is giving me trouble with a more precise range.
See Also: → 1385116
Problem is resolved by forcing network.tcp.tcp_fastopen_enable to false. It seems the TCP FastOpen experiment in Bug 1363372 is causing this failure. According to that bug, it should be backed out before merge and that should some this regression.
Depends on: 1363372
Flags: needinfo?(mcmanus) → needinfo?(dd.mozilla)
With https://hg.mozilla.org/mozilla-central/rev/3e01e416c117 landed to turn off TFO, the issue no longer affects FF56. Leaving bug open since TFO will eventually be turned back on and this issue will reoccur.
This was not a problem before 772589 because we were showing connect time and tls handshake time as one. Since 772589 we are showing them separately.
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Flags: needinfo?(dd.mozilla)
Duplicate of this bug: 1388405
Blocks: 772589
No longer depends on: 1363372
Posted patch bug_1384679_v1.patch (obsolete) — Splinter Review
This is a bigger patch than I expected.

No we show that tcp handshake and lts handshake or http data are send at the same time. So the timeline can overlap if tfo or tls1.3 0rtt is used.

A side effect that I have notice:
if we speculatively connect to a host but there is no request immediately for that host there is a gap between ssl ends and sending starts. in that case total time is large. I find it useful to see such a pattern, but the total time is weird. I can fix this if we want (it is one line fix, just mull connection timing if a connection is put into idle state). I notice this for google analysis for example.
Attachment #8895770 - Flags: review?(odvarko)
Attachment #8895770 - Flags: review?(mcmanus)
that's a big patch and I haven't read it all yet.

can't you just make sure that all the timestamp numbers are set to be at least the channel open time? iirc that's the guidance I read into the timings spec wrt speculative connect. It runs with the conceit that we don't record any events before channel open.
(well its not quite so simple as always using channel open time, but all the timestamps are ordered by their definitions, so you know what I mean.. timestamp N needs to be set to atleast timestamp N-1
(In reply to Patrick McManus [:mcmanus] from comment #8)
> that's a big patch and I haven't read it all yet.
> 
> can't you just make sure that all the timestamp numbers are set to be at
> least the channel open time? iirc that's the guidance I read into the
> timings spec wrt speculative connect. It runs with the conceit that we don't
> record any events before channel open.

they are not, but they are not even without this patch and tfo turned off. The example above with speculative connect(comment 6) is one of them. Before this patch it was only tricking that there is no gap in devtool between connect_end and send_start, but timestamps are the same.

I will look into fixing this. This all thing got so complicated with you patch (bug 772589) and TFO is only adding to it.
(In reply to Patrick McManus [:mcmanus] from comment #8)
> that's a big patch and I haven't read it all yet.
> 
> can't you just make sure that all the timestamp numbers are set to be at
> least the channel open time? iirc that's the guidance I read into the
> timings spec wrt speculative connect. It runs with the conceit that we don't
> record any events before channel open.

Timing spec - you mean only wedidl interface. That one does check that timings are not less than async open time. You added this check and I have not change that.

devtools are not really behaving as webidl. I am ok if we want to change that.
The big change that I did in necko code is:

without tfo connection is made only after CONNECTED_TO event. With TFO it is made after CONNECTING_TO and I make sure that events CONNECTED_TO, TLS_START and TLS_END are propagated to the connection (nsHttpConnection::mBootstrappedTimings) or to a nsHttpTransaction (if we have one).
Duplicate of this bug: 1390839
(In reply to Dragana Damjanovic [:dragana] from comment #11)
> (In reply to Patrick McManus [:mcmanus] from comment #8)

> devtools are not really behaving as webidl. I am ok if we want to change
> that.

right.. I mean the devtools can just enforce this ordering. (They actually already do for some specific subset of things)
Comment on attachment 8895770 [details] [diff] [review]
bug_1384679_v1.patch

Review of attachment 8895770 [details] [diff] [review]:
-----------------------------------------------------------------

Changes related to DevTools codebase look reasonable.
How should I test this"?
Not sure if the STR in comment #0 are still valid.

Honza
(In reply to Jan Honza Odvarko [:Honza] from comment #15)
> Comment on attachment 8895770 [details] [diff] [review]
> bug_1384679_v1.patch
> 
> Review of attachment 8895770 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Changes related to DevTools codebase look reasonable.
> How should I test this"?
> Not sure if the STR in comment #0 are still valid.
> 
> Honza

Probably they are valid. Also google.com and facebook.com have tfo turned on on the server side and you can see overlapping of the tls handshake and the tcp handshake.

tc fast open pref must be on. It is on by default on Nightly.
Here are my STRs:

0) Make sure "network.tcp.tcp_fastopen_enable" is true
1) Open a new browser tab
2) Open DevTools and select the Net panel
3) Load reddit.com (or facebook.com)
4) Watch the "Finish" value in Network panel status bar
5) The value is reasonable for me (~1-2 min)

How can I reproduce the problem?

Honza
Flags: needinfo?(dd.mozilla)
Jan, I'm finding it doesn't appear on a warm cache. Using Ctrl+F5 on Windows (64-bit) still reproduces for me.
Flags: needinfo?(dd.mozilla)
(In reply to Jan Honza Odvarko [:Honza] from comment #17)
> Here are my STRs:
> 
> 0) Make sure "network.tcp.tcp_fastopen_enable" is true
> 1) Open a new browser tab
> 2) Open DevTools and select the Net panel
> 3) Load reddit.com (or facebook.com)
> 4) Watch the "Finish" value in Network panel status bar
> 5) The value is reasonable for me (~1-2 min)
> 
> How can I reproduce the problem?
> 
> Honza

I can still reproduce it on the website I gave in https://bugzilla.mozilla.org/show_bug.cgi?id=1390839

Go on memopol.lqdn.fr and then do a clean refresh (ctrl/cmd + shift + r) with the network console opened, you'll most likely see erronous values on some items and for the end time.
I think I would prefer to see this fixed via comment 14
Duplicate of this bug: 1396658
(In reply to Jan Honza Odvarko [:Honza] from comment #22)
> Could bug 1395790 and bug 1386125 be duplicates?
> 
> Honza

In my humble opinion, those are clearly dupes. However I'll let the person you needed infos from decide of it.
In the meantime, I don't get why setting the fix as optional for firefox 57? Imho again, but that's clearly some breaking problem/regression toward the web developpers using Firefox to do their work that might lead them to move elsewhere if their tool is reporting bogus and unusable values, don't you think?
(In reply to Clément Lefèvre from comment #23)
> (In reply to Jan Honza Odvarko [:Honza] from comment #22)
> > Could bug 1395790 and bug 1386125 be duplicates?
> > 
> > Honza
> 
> In my humble opinion, those are clearly dupes. However I'll let the person
> you needed infos from decide of it.
> In the meantime, I don't get why setting the fix as optional for firefox 57?
> Imho again, but that's clearly some breaking problem/regression toward the
> web developpers using Firefox to do their work that might lead them to move
> elsewhere if their tool is reporting bogus and unusable values, don't you
> think?

TFO is turned off on 57(since 09/09 or 09/10). so there will not be any problems with devTools on 57. probably a better status is "disabled". We will fix this for 58.
Duplicate of this bug: 1395790
Duplicate of this bug: 1386125
(In reply to Jan Honza Odvarko [:Honza] from comment #22)
> Could bug 1395790 and bug 1386125 be duplicates?
> 
> Honza

yes.
Flags: needinfo?(dd.mozilla)
Comment on attachment 8895770 [details] [diff] [review]
bug_1384679_v1.patch

Review of attachment 8895770 [details] [diff] [review]:
-----------------------------------------------------------------

I am seeing the following compiler errors if building Firefox with the patch applied:

0:07.83 /Users/jodvarko/src/mozilla.org/mozilla-central/netwerk/protocol/http/nsHttpConnection.cpp:2453:19: error: out-of-line definition of 'SetEvent' does not match any declaration in 'mozilla::net::nsHttpConnection'
 0:07.83 nsHttpConnection::SetEvent(nsresult aStatus)
 0:07.83                   ^~~~~~~~
 0:07.93 BUILDSTATUS OBJECT_FILE plugin-container.o
 0:08.46 1 error generated.
 0:08.48 make[5]: *** [Unified_cpp_protocol_http2.o] Error 1
 0:08.48 make[5]: *** Waiting for unfinished jobs....
 0:08.50 libplugin-container.a.desc
 0:09.09 In file included from /Users/jodvarko/src/mozilla.org/mozilla-central/obj-x86_64-apple-darwin16.6.0/netwerk/protocol/http/Unified_cpp_protocol_http0.cpp:92:
 0:09.09 /Users/jodvarko/src/mozilla.org/mozilla-central/netwerk/protocol/http/Http2Session.cpp:2681:13: error: no member named 'SetEvent' in 'mozilla::net::nsHttpConnection'
 0:09.09       conn->SetEvent(aStatus);
 0:09.09       ~~~~  ^

Honza
Attachment #8895770 - Flags: review?(odvarko) → review-
I am able to reproduce the issue on my machine (Win10) using STR in comment #0

Honza
Has STR: --- → yes
(In reply to Jan Honza Odvarko [:Honza] from comment #28)
> Comment on attachment 8895770 [details] [diff] [review]
> bug_1384679_v1.patch
> 
> Review of attachment 8895770 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I am seeing the following compiler errors if building Firefox with the patch
> applied:
> 
> 0:07.83
> /Users/jodvarko/src/mozilla.org/mozilla-central/netwerk/protocol/http/
> nsHttpConnection.cpp:2453:19: error: out-of-line definition of 'SetEvent'
> does not match any declaration in 'mozilla::net::nsHttpConnection'
>  0:07.83 nsHttpConnection::SetEvent(nsresult aStatus)
>  0:07.83                   ^~~~~~~~
>  0:07.93 BUILDSTATUS OBJECT_FILE plugin-container.o
>  0:08.46 1 error generated.
>  0:08.48 make[5]: *** [Unified_cpp_protocol_http2.o] Error 1
>  0:08.48 make[5]: *** Waiting for unfinished jobs....
>  0:08.50 libplugin-container.a.desc
>  0:09.09 In file included from
> /Users/jodvarko/src/mozilla.org/mozilla-central/obj-x86_64-apple-darwin16.6.
> 0/netwerk/protocol/http/Unified_cpp_protocol_http0.cpp:92:
>  0:09.09
> /Users/jodvarko/src/mozilla.org/mozilla-central/netwerk/protocol/http/
> Http2Session.cpp:2681:13: error: no member named 'SetEvent' in
> 'mozilla::net::nsHttpConnection'
>  0:09.09       conn->SetEvent(aStatus);
>  0:09.09       ~~~~  ^
> 
> Honza

The patch needs rebasing.
Posted patch bug_1384679_v1.patch (obsolete) — Splinter Review
I have change the patch and timing earlier than asyncOpenTime are not displayed.
Attachment #8895770 - Attachment is obsolete: true
Attachment #8895770 - Flags: review?(mcmanus)
Attachment #8916081 - Flags: review?(odvarko)
Attachment #8916081 - Flags: review?(mcmanus)
Sorry, I forgot to refresh patch with the last update.
Attachment #8916081 - Attachment is obsolete: true
Attachment #8916081 - Flags: review?(odvarko)
Attachment #8916081 - Flags: review?(mcmanus)
Attachment #8916571 - Flags: review?(odvarko)
Attachment #8916571 - Flags: review?(mcmanus)
Comment on attachment 8916571 [details] [diff] [review]
bug_1384679_v1.patch

Review of attachment 8916571 [details] [diff] [review]:
-----------------------------------------------------------------

I think this adds some complexity where it would have been ok to just enforce the sorting inside devtools - but this is more accurate
Attachment #8916571 - Flags: review?(mcmanus) → review+
Comment on attachment 8916571 [details] [diff] [review]
bug_1384679_v1.patch

Review of attachment 8916571 [details] [diff] [review]:
-----------------------------------------------------------------

Tested on OSX and I can't see the issue with the patch applied.

Changes in timinngs-panel.js, webconsole.js and network-monitor.js look good.

R+ assuming try is green

Thanks for working on this!

Honza
Attachment #8916571 - Flags: review?(odvarko) → review+
https://hg.mozilla.org/mozilla-central/rev/f83c99399dba
https://hg.mozilla.org/mozilla-central/rev/b1e86c8915a1
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 58
Pretty giant patch. Given where we are in the cycle, I think we should wontfix this for 57. If you feel strongly that this should go into 57, feel free to set the status back to affected and nominate it for Beta approval.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #39)
> Pretty giant patch. Given where we are in the cycle, I think we should
> wontfix this for 57. If you feel strongly that this should go into 57, feel
> free to set the status back to affected and nominate it for Beta approval.

TFO and TLS 1.3 are disabled on 57. Therefore this bug does not affect 57 (comment 24).
Depends on: 1409667
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.