Closed Bug 691547 Opened 13 years ago Closed 13 years ago

Navigation Timing API returns incorrect values for navigationStart if subframe loads after main page

Categories

(Core :: DOM: Navigation, defect)

7 Branch
x86
macOS
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla10
Tracking Status
firefox9 + fixed

People

(Reporter: sjiang-moz, Assigned: igor.bazarny)

Details

(Keywords: verified-aurora, verified-beta, Whiteboard: [qa-])

Attachments

(5 files, 2 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1

Steps to reproduce:

Load http://finance.yahoo.com/ twice, about 25 seconds apart.  Inspect window.performance.timing using the Web Console after each load.


Actual results:

The values are as follows.  The second load's timing values are not consistent with the spec:

======= First load =======
connectEnd: 1317680797258
connectStart: 1317680797252
domComplete: 1317680800052
domContentLoadedEventEnd: 1317680799067
domContentLoadedEventStart: 1317680798962
domInteractive: 1317680798955
domLoading: 1317680797468
domainLookupEnd: 1317680797252
domainLookupStart: 1317680797203
fetchStart: 1317680797205
loadEventEnd: 1317680800053
loadEventStart: 1317680800052
navigationStart: 1317680797203
redirectEnd: 0
redirectStart: 0
requestStart: 1317680797258
responseEnd: 1317680797492
responseStart: 1317680797466
unloadEventEnd: 0
unloadEventStart: 0

======= Second load =======
connectEnd: 0
connectStart: 0
domComplete: 1317680824409
domContentLoadedEventEnd: 1317680823159
domContentLoadedEventStart: 1317680823094
domInteractive: 1317680823086
domLoading: 1317680821459
domainLookupEnd: 1317680821175
domainLookupStart: 1317680821175
fetchStart: 1317680821177
loadEventEnd: 1317680824410
loadEventStart: 1317680824409
navigationStart: 1317680805055
redirectEnd: 0
redirectStart: 0
requestStart: 1317680821192
responseEnd: 1317680821479
responseStart: 1317680821456
unloadEventEnd: 1317680821555
unloadEventStart: 1317680821459



Expected results:

navigationStart is incorrect for second load (more than 16 seconds before unloadEventStart)

connectStart and connectEnd are 0 when a persistent connection is used to fetch the document.  The Navigation Timing draft spec[1] states:
If a persistent connection [RFC 2616] is used or the current document is retrieved from relevant application caches or local resources, this attribute must return value of domainLookupEnd.

[1] https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html
Component: General → DOM: Core & HTML
Product: Firefox → Core
Component: DOM: Core & HTML → Networking
QA Contact: general → networking
Steve, which build are you testing in?  I can reproduce the connectStart/connectEnd being 0 issue in Firefox 7, but not in a current nightly.
Yes this was with release 7.0.1.  I'll try nightly.
Confirmed connectStart/connectEnd being fixed in nightly, still getting bad values for navigationStart when reloading finance.yahoo.com
OK.  How exactly are you reloading?  Reload button?  Enter in url bar?  Something else?  And just to make sure, you see the bug in safe mode, right?
Yes, tried safe mode.  Happens with reload button, cmd-R, and enter in URL bar.  

Values are such that navigationStart is several seconds less than unloadEventStart/unloadEventEnd.  Also, unloadEventStart is updating during the loading of the page, such that once the page has loaded, unloadEventStart > unloadEventEnd.
Hmm.  I can't reproduce that last bit..  With a current Mac nightly on 10.6.8, so pretty similar to your setup, I see unloadEventEnd be 1ms after unloadEventStart and navigationStart.

I do see navigationStart at about 20ms before the other quantities, though.  Checking on that now.
OK, so what happens here is that we hit EndPageLoad for the main page, copy mTiming to the document, then later a subframe starts a load, which lands us in MaybeInitTiming for the main page and we record a navigationStart at that point.

Igor, should we perhaps be calling MaybeInitTiming in the STATE_START codepath only if |this == aProgress|?
Status: UNCONFIRMED → NEW
Component: Networking → Document Navigation
Ever confirmed: true
QA Contact: networking → docshell
Summary: Navigation Timing API returns incorrect values → Navigation Timing API returns incorrect values for navigationStart if subframe loads after main page
Hmm, I had an impression that nsDocShell::OnProgressChange is not called for frames. If I'm wrong, than [this == aProgress] check seems useful. I have troubles creating testcase which demonstrates the issue.
OnProgressChange is absolutely called for frames.

Here's a testcase for you:

  <iframe onload="setTimeout(function() { alert(window.performance.timing.loadEventEnd - window.performance.timing.navigationStart); }, 0);" 
          src="data:text/html,<a href='http://www.mozilla.org'>Click me, wait 30 seconds, then force-refresh the page</a>"></iframe>
Attached patch Test and fix (obsolete) — Splinter Review
Test and fix for the bug.

I have no rights to run this change though the try server, would appreciate help.
Attachment #571614 - Flags: review?(bzbarsky)
Attachment #571614 - Flags: review?(bugs)
Unused code removed from the test.
Attachment #571614 - Attachment is obsolete: true
Attachment #571614 - Flags: review?(bzbarsky)
Attachment #571614 - Flags: review?(bugs)
I don't understand that test.  This bug is that navigation in a subframe records navigationStart for a _parent_ of that subframe.  But the test looks at window.performance in the subframe itself.  Does the test actually fail without the docshell change?  I doubt it does...
The test did fail before the fix. It contains 2 nested frames. Top window is a driver which waits for the frame (1) to load and than changes location of the inner frame (2). After 3 second timeout frame 1 is reloaded, time before reload is saved. Due to the bug, at the moment of reload frame 1 has timing initialized and navStart set to the time about 3s before the reload. Correct implementation should have navStart set after the time of reload call, that's what the test finally checks. I've seen that timing could be pretty imprecise on some platform, that's why I let the moment recorded by navigation time be 0.2s earlier than time recorded at the reload time.
New Relic uses the Web Timing API to derive navigation start time for reporting of end user response times. We've been seeing extremely large response times from FF7, which could be explained by this bug.
Comment on attachment 571616 [details] [diff] [review]
Same as before, unnecessary content removed from the test

Oh, I see.  I mixed up my |frame| and |innerFrame|.

s/latting/allowing/ and r=me.  Once we have a way to get access to the monotonic clock from script, we can nix this timeout hackery, right?

Is there anything else to do here before this lands?  I'd like to get it on inbound today so it makes Fx10....
Attachment #571616 - Flags: review+
Attached patch Comment typo fixSplinter Review
Typo correction done. Indeed, this seems to be all we need to fix the issue.
Attachment #571616 - Attachment is obsolete: true
Attachment #572788 - Flags: review?(bzbarsky)
I use PR_Now to set navigationStart than add durations calculated from TimeStamps converted to milliseconds for later moments. This way measurements look 'normal'. But this also means monotonic clock is not enough as we need clocks to be synchronized for each session.
Also, there are issues with clock precision and event ordering, so I was not able to compare Bavigation Timing values with values returned from Date.now() on some platforms and removed ordering checks for, say, loadEventStart, time recorded by onload handler and loadEventEnd.
So we might need more work in addition to monotinic clock access from script.

In fact, slack here might be unnecessary, I wanted to be on safe side given the previous experience with timing ordering tests and short time frame.
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

r=me
Attachment #572788 - Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/mozilla-central/rev/54bfd8bf682e
Assignee: nobody → igor.bazarny
Status: NEW → RESOLVED
Closed: 13 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Will there be a way to tell which browsers contain this fix based on the User Agent header? Does Firefox 8 also contain this bug?
Firefox 8 contains the bug, yes.  Firefox 10 will have it fixed.  What happens with Firefox 9 is an open question so far.

Igor, do you think this is safe enough to take on Aurora?  If so, please request approval...
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

I would certainly like to see this issue fixed in Fx9

The fix seems safe to me. I'm a bit confused about release lines. Didn't Fx9 move to beta when Fx8 was released this week? Should I ask for approval‑mozilla‑beta then?
Attachment #572788 - Flags: approval-mozilla-aurora?
Igor,

Yes, ask for approval-mozilla-beta if you want this in FF9.  Patch is small, otherwise I wouldn't say it was a realistic option (we don't take a lot of stuff on beta).
I do think this is a safe change well worth taking in beta so we don't impose this breakage on web developers any longer than we have to....
Attachment #572788 - Flags: approval-mozilla-beta?
Attachment #572788 - Flags: approval-mozilla-aurora?
Attachment #572788 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

Igor, was there a reason you cleared the approval-mozilla-aurora flag?  I assume we want this on aurora too, since we've now got approval-beta.

I'm going ahead and resettting the flag--landing on beta and not aurora sounds way-non-kosher :)
Attachment #572788 - Flags: approval-mozilla-aurora?
Attachment #572788 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Jason Duell (:jduell) from comment #26)
> Comment on attachment 572788 [details] [diff] [review] [diff] [details] [review]
> Comment typo fix
> 
> Igor, was there a reason you cleared the approval-mozilla-aurora flag?  I
> assume we want this on aurora too, since we've now got approval-beta.
> 
> I'm going ahead and resettting the flag--landing on beta and not aurora
> sounds way-non-kosher :)

I had an impression that change went in right before Fx10 Aurora cut, so there is no need for integration.
Yep.  That impression was correct too; see comment 27.
Whiteboard: [qa+]
Attached image screen shot
Hi.
I want to verify this bug and as you can see in the attachment, I get different results when I load the page the second time.
This is intended? What parameters I have to observe for this bug to be verified   in Aurora and Nightly?

Thanks
For your second load, you seem to be looking at the timing information before the page is actually done loading.
Attached image Screen shot
I have re-verefied this bug and I let the tage to fully load.
You can see the figures for Firefox, Aurora and Nightly in the attachment.
They are correct?

I have tried this on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0) Gecko/20100101 Firefox/9.0 beta 4
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0a2) Gecko/20111206 Firefox/10.0a2
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:11.0a1) Gecko/20111206 Firefox/11.0a1

Thanks.
Possibly, depending on the steps you used...
(In reply to Boris Zbarsky (:bz) from comment #33)
> Possibly, depending on the steps you used...

I have used the steps from the description. 
Is this verified based on attachment from comment32?

Thanks.
Hm.  Your numbers actually don't make any sense whatsoever, if I'm doing the math correctly.  For example, for your "Firefox first load" the gap between navigationStart and loadEventStart is order of 500 seconds, right?  Similar for nightly.  Did the page really take 6 minutes to load for you?
The page fully loaded after some time, only then I reloaded the page.
I'll redo the test and wait exactly 25 seconds regardless if the page fully loaded or not.

I'll post back my results.
Attached image screenshot
Hi Boris.
I've attached a new screenshot. Does the figures look ok?
If not, I don't know how else can I test this.

Thanks
In that screenshot, you're examining the page before it's done loading....  I assume the page really does take forever to fully load for you for some reason.

But in any case, that screenshot shows the bug is fixed, as far as I can tell.  Compare the difference between navigationStart and fetchStart for the second load in builds with and and without the patch, maybe?  And then compare both to the same difference for the first load?
FWIW, I've verified that the case in my original report no longer exhibits the problem in Aurora.
Steve, could you do a quick verification with Firefox 9.0b6? (it's on FTP now)
Keywords: verified-aurora
Whiteboard: [qa+] → [qa+][qa!:10]
Vlad, please respond to comment 38. If this is fixed based on that in Beta, please update the whiteboard. Thanks
Anthony, I just checked 9.0b6.  It shows the proper values as well.
Thanks Steve, marking this VERIFIED.
Status: RESOLVED → VERIFIED
Keywords: verified-beta
Whiteboard: [qa+][qa!:10] → [qa-]
Indeed, the page takes a very long time to fully load. 
I've waited well over 5 minutes for the page to load.
This bug is not in the list of bug fixes for FF9.
http://www.mozilla.org/en-US/firefox/9.0/releasenotes/buglist.html

Is it correct that the bug is still present in FF9?
This is fixed in FF9.

Whoever generated that list did it wrong.
Dear navigation timing experts, please take a look at bug 748276
https://bugzilla.mozilla.org/show_bug.cgi?id=748276
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: