Open
Bug 1417308
Opened 7 years ago
Updated 2 years ago
Navigation timing responseStart reported wrong
Categories
(Core :: DOM: Core & HTML, defect, P3)
Tracking
()
NEW
People
(Reporter: peter, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: regression, regressionwindow-wanted, Whiteboard: [necko-triaged][webpagetest])
Attachments
(1 file)
5.49 MB,
application/zip
|
Details |
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36 Steps to reproduce: I automated Firefox 57 through Browsertime, access the same URL Wikipedia URL multiple times. When loadEventEnd has ended I execute: (function() { var t = window.performance.timing; return { domainLookupTime: (t.domainLookupEnd - t.domainLookupStart), redirectionTime: (t.fetchStart - t.navigationStart), serverConnectionTime: (t.connectEnd - t.connectStart), serverResponseTime: (t.responseEnd - t.requestStart), pageDownloadTime: (t.responseEnd - t.responseStart), domInteractiveTime: (t.domInteractive - t.navigationStart), domContentLoadedTime: (t.domContentLoadedEventStart - t.navigationStart), pageLoadTime: (t.loadEventStart - t.navigationStart), frontEndTime: (t.loadEventStart - t.responseEnd), backEndTime: (t.responseStart - t.navigationStart) }; })(); Actual results: Backend time (responseStart - navigationStart) is sometimes 0 (but it is actually a new request and nothing is cached). But not always. Expected results: The responseStart should have a value different from navigation start. Testing the exact same way on Firefox 54 and Chrome 63 doesn't result in a 0. I've tested it manually by loading different URLs but haven't been able to reproduce. I'll do one change in my setup by clearing the cache through the WebExtension API to make sure it isn't a incompatible thing between Firefox 57/Geckodriver/Selenium and report back.
Reporter | ||
Comment 1•7 years ago
|
||
I've added so all Firefox caches are cleared between runs using a WebExtension and then everything looks ok. I think then Selenium doesn't clear the session as it used to be in Firefox 57. I wonder if the problem is Selenium or Geckodriver then.
Comment 2•7 years ago
|
||
Can you please use https://mozilla.github.io/mozregression/ to determine the regression range?
Comment 3•7 years ago
|
||
(In reply to Peter Hedenskog from comment #1) > I've added so all Firefox caches are cleared between runs using a > WebExtension and then everything looks ok. I think then Selenium doesn't > clear the session as it used to be in Firefox 57. I wonder if the problem is > Selenium or Geckodriver then. Hi David, are you the right person to check Peter's issue and comment? Thanks!
Flags: needinfo?(dburns)
Comment 4•7 years ago
|
||
Without being able to see what the selenium code looks like it's a bit difficult to see the true error. Selenium and Geckodriver doesnt cache anything between session as we make sure to use a fresh profile between sessions. If it looks like it is caching then I suggest making sure you use the #quit command to clear the session. This command will also delete the profile in use. When you do ```python driver = webdriver.Firefox() ``` It will create a new profile. hth
Flags: needinfo?(dburns)
Comment 5•7 years ago
|
||
Hi Peter, Could you help us get the regression window following comment 2? Also, see if David's comment 4 helps resolve your issue. Thanks.
Flags: needinfo?(peter)
Priority: -- → P3
Reporter | ||
Comment 6•7 years ago
|
||
Hey sorry for not getting back. David: Yep we create new Profile. When I run the exact same code with FF 54 it works like a charm. Just changing the browser to 57 I can see that the Navigation Timing metrics sometimes (not always) is wrong. I wasn't using the latest Geckodriver (used 0.18.0) but I guess that doesn't matter. My guess was that it was getting a cached copy of the page but then my guess is wrong maybe. The thing is if I clear the cache between the runs, the metrics is correct. FF 57 is fast but backend time as 0 or 1 ms is probably too fast. When I checked now I can see that also connectEnd, connectStart, responseEnd and requestStart seems affected (they are reported as zero). I guess it doesn't give you any extra info but this is where I get the data: https://results.sitespeed.io/en.wikipedia.org/2017-11-14-22-34-52/pages/en.wikipedia.org/wiki/Barack_Obama/3.html#browsertime I have no way to debug it further until the new HAR plugin is available, sorry. I know not reproducible bugs sucks but I thought I should report it anyway so you can keep track of it if more users get the same. I've tried to look at the RUM data we have at work to see if I can see the same for real users but at the moment we disregard users with faulty nav timings, need to check if that number increased with the 57 release. Hmm I'm haven't tried with mozregression but I can do it if you think it can help?
Flags: needinfo?(peter)
Comment 7•7 years ago
|
||
Not upgrading geckodriver is fine though it solidifies my thoughts that selenium/geckodriver is not the cause. With a lot of the speed up work in Firefox for 57 I know there is more caching in general so maybe something has gone "rogue"
Comment 8•7 years ago
|
||
Does this site use a service worker with a FetchEvent at all?
Reporter | ||
Comment 9•7 years ago
|
||
No service worker at all. We use HTTP/2.
Comment 10•7 years ago
|
||
Its possible we are speculatively loading here: https://searchfox.org/mozilla-central/source/dom/performance/PerformanceTiming.cpp#95 Another possibility is that these are things like cross-origin <img> requests without Timing-Origin-Allow header, etc.
Comment 11•7 years ago
|
||
Adding some necko folks who might be able to help further diagnose what the network stack is doing.
Comment 12•7 years ago
|
||
Sorry for all the questions, but are you by any chance running this in AWS or another virtualized environment with a windows operating system? We've had some issues without time counters being consistent across browser processes in some of these environments.
Reporter | ||
Comment 13•7 years ago
|
||
Yep it's in Docker on AWS. And it happened with the switch to 57. The headers from the server looks like this: accept-ranges:bytes age:80375 backend-timing:D=225027 t=1510765277142761 cache-control:private, s-maxage=0, max-age=0, must-revalidate content-encoding:gzip content-language:en content-length:239442 content-type:text/html; charset=UTF-8 date:Thu, 16 Nov 2017 15:20:53 GMT last-modified:Wed, 15 Nov 2017 00:18:58 GMT link:</static/images/project-logos/enwiki.png>;rel=preload;as=image;media=not all and (min-resolution: 1.5dppx),</static/images/project-logos/enwiki-1.5x.png>;rel=preload;as=image;media=(min-resolution: 1.5dppx) and (max-resolution: 1.999999dppx),</static/images/project-logos/enwiki-2x.png>;rel=preload;as=image;media=(min-resolution: 2dppx) p3p:CP="This is not a P3P policy! See https://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info." server:mw1271.eqiad.wmnet status:200 strict-transport-security:max-age=106384710; includeSubDomains; preload vary:Accept-Encoding,Cookie,Authorization via:1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4 x-analytics:ns=0;page_id=534366;WMF-Last-Access=16-Nov-2017;WMF-Last-Access-Global=16-Nov-2017;https=1 x-cache:cp1053 hit/6, cp3031 hit/5, cp3042 hit/204 x-cache-status:hit-front x-client-ip:84.55.65.82 x-content-type-options:nosniff x-powered-by:HHVM/3.18.6-dev x-ua-compatible:IE=Edge x-varnish:370913959 14487101, 179780583 190590610, 379753910 612690641
Reporter | ||
Comment 14•7 years ago
|
||
Last thing, it isn't consistent and happens only sometimes.
Comment 15•7 years ago
|
||
If you could capture an http log when this triggers, that might be helpful: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging Honza, what logging values would be useful for this problem?
Flags: needinfo?(honzab.moz)
Comment 16•7 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #15) > If you could capture an http log when this triggers, that might be helpful: > > https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging > > Honza, what logging values would be useful for this problem? I think nsHttp:5 would do, for both child and parent (please use env vars). Valentin, do you know about any other module that should be added?
Flags: needinfo?(honzab.moz) → needinfo?(valentin.gosu)
Reporter | ||
Comment 17•7 years ago
|
||
I can make a hack to turn on the logging, can get back with that tomorrow. I can reproduce (in Docker) on Mac OS X the same I get on AWS.
Reporter | ||
Comment 18•7 years ago
|
||
I've could reproduce with the log turned on but needed 21 runs to the same URL (sorry for the size). The URL is https://en.wikipedia.org/wiki/Barack_Obama and access number 15 gives the wrong Navigation Timing metrics.
Reporter | ||
Comment 19•7 years ago
|
||
I got this problems also when cache is cleared so it isn't an issue about caching.
Comment 20•7 years ago
|
||
This was probably caused by bug 919391 - before it we had negative timings. Now at least it's the same as navigationStart. There's a high chance this _is_ related to caching. We only set responseStart (from what I can tell) in nsHttpTransaction. Peter, When the problem reproduces, can you post the all values in perfomance.timing ? Thanks
Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu) → needinfo?(peter)
Whiteboard: [necko-triaged]
Reporter | ||
Comment 21•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #20) > This was probably caused by bug 919391 - before it we had negative timings. > Now at least it's the same as navigationStart. > There's a high chance this _is_ related to caching. We only set > responseStart (from what I can tell) in nsHttpTransaction. > > Peter, > When the problem reproduces, can you post the all values in > perfomance.timing ? > > Thanks Yes I'll fix that first thing tomorrow.
Reporter | ||
Comment 22•7 years ago
|
||
Hmm so the moz HTTP log gets deleted if we open a new Firefox right, so the log I provided is of no value? I've made 21 runs (open a new FF each time) and the error happened in of the runs in between. Sorry I missed that. I've reproduced now and running https://github.com/sitespeedio/browsertime/blob/master/browserscripts/timings/navigationTiming.js gave me: connectEnd 0 connectStart 0 domComplete 3620 domContentLoadedEventEnd 1743 domContentLoadedEventStart 1742 domInteractive 1720 domLoading 138 domainLookupEnd 0 domainLookupStart 0 fetchStart 0 loadEventEnd 3621 loadEventStart 3620 navigationStart 0 requestStart 0 responseEnd 0 responseStart 0 secureConnectionStart 0
Comment 23•7 years ago
|
||
Definitely seems cache related. All of the timestamps that should come from the network are zero.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(peter)
Reporter | ||
Comment 24•7 years ago
|
||
Ok. What's really (In reply to Valentin Gosu [:valentin] from comment #20) > This was probably caused by bug 919391 - before it we had negative timings. > Now at least it's the same as navigationStart. > There's a high chance this _is_ related to caching. We only set > responseStart (from what I can tell) in nsHttpTransaction. > > Peter, > When the problem reproduces, can you post the all values in > perfomance.timing ? > > Thanks Ok. To be more clear if that's help for the caching issue: I got this behavior using a new profile for each run. I also get this when I use a WebExtension to clear the cache between each runs: https://github.com/sitespeedio/browsertime-extension/blob/master/background.js#L146-L165
Comment 25•7 years ago
|
||
(In reply to Peter Hedenskog from comment #24) > Ok. To be more clear if that's help for the caching issue: I got this > behavior using a new profile for each run. I also get this when I use a > WebExtension to clear the cache between each runs: > https://github.com/sitespeedio/browsertime-extension/blob/master/background. > js#L146-L165 That's even more interesting :) I wouldn't expect this to happen with an empty cache, but maybe there's a case where we don't properly set the timings when the request is only from the network. The timings are recorded after onload has fired, right?
Reporter | ||
Comment 26•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #25) > (In reply to Peter Hedenskog from comment #24) > > Ok. To be more clear if that's help for the caching issue: I got this > > behavior using a new profile for each run. I also get this when I use a > > WebExtension to clear the cache between each runs: > > https://github.com/sitespeedio/browsertime-extension/blob/master/background. > > js#L146-L165 > > That's even more interesting :) > I wouldn't expect this to happen with an empty cache, but maybe there's a > case where we don't properly set the timings when the request is only from > the network. The timings are recorded after onload has fired, right? Yes we wait on onloadEventEnd + 2 seconds before we start.
Reporter | ||
Comment 27•6 years ago
|
||
Could we get some more love on this one :) I'm really interested to know if this also could effect our RUM data?
Comment 28•6 years ago
|
||
(In reply to Peter Hedenskog from comment #27) > Could we get some more love on this one :) I'm really interested to know if > this also could effect our RUM data? Thanks for the poke on this. I tried for a bunch of time to reproduce it, but didn't manage. However, I did uncover bug 1429422, as it's easier to reproduce, and it's not clear if can get accurate results on this one while that one is still active.
Blocks: resource-timing
Depends on: 1429422
Comment 29•6 years ago
|
||
Here's a solid repro: Navigate to facebook.com and run this: (function() { function perfTimingsToNavTimings(p, timeOrigin) { const timings = [...Object.entries(p)].filter(e=>typeof e[1] === 'number').map(e=>[e[0], e[1] === 0 ? 0 : e[1] + timeOrigin]); const timingsObj = {}; for (const [key,value] of timings) { timingsObj[key] = value; } return timingsObj; } a = performance.timing.toJSON(); b = perfTimingsToNavTimings(performance.getEntriesByType('navigation')[0].toJSON(), performance.timeOrigin); diffs = {}; for (const key of Object.keys(b)) { if (a[key] != null) diffs[key] = b[key] - a[key] } console.log({ 'performance.timing': a, 'PerformanceNavigationTiming': b, difference: diffs }); })(); What's interesting is that the PerformanceNavigationTiming entry is correct on the responseStart/responseEnd values, but performance.timing is way off.
Whiteboard: [necko-triaged] → [necko-triaged][webpagetest]
Assignee | ||
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•