Last Comment Bug 691547 - Navigation Timing API returns incorrect values for navigationStart if subframe loads after main page
: Navigation Timing API returns incorrect values for navigationStart if subfram...
: verified-aurora, verified-beta
Product: Core
Classification: Components
Component: Document Navigation (show other bugs)
: 7 Branch
: x86 Mac OS X
: -- normal (vote)
: mozilla10
Assigned To: Igor Bazarny
: Andrew Overholt [:overholt]
Depends on:
  Show dependency treegraph
Reported: 2011-10-03 15:33 PDT by Steve Jiang
Modified: 2013-09-22 22:23 PDT (History)
16 users (show)
bzbarsky: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

Test and fix (4.90 KB, patch)
2011-11-03 05:50 PDT, Igor Bazarny
no flags Details | Diff | Splinter Review
Same as before, unnecessary content removed from the test (4.73 KB, patch)
2011-11-03 05:55 PDT, Igor Bazarny
bzbarsky: review+
Details | Diff | Splinter Review
screen shot of some large page load times (38.26 KB, image/png)
2011-11-07 07:21 PST, Darin Wright
no flags Details
Comment typo fix (4.73 KB, patch)
2011-11-08 05:26 PST, Igor Bazarny
bzbarsky: review+
christian: approval‑mozilla‑aurora+
christian: approval‑mozilla‑beta+
Details | Diff | Splinter Review
screen shot (90.13 KB, image/png)
2011-12-06 08:41 PST, Vlad [QA]
no flags Details
Screen shot (560.58 KB, image/png)
2011-12-07 06:45 PST, Vlad [QA]
no flags Details
screenshot (95.82 KB, image/png)
2011-12-14 05:57 PST, Vlad [QA]
no flags Details

Description Steve Jiang 2011-10-03 15:33:26 PDT
User Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1

Steps to reproduce:

Load twice, about 25 seconds apart.  Inspect window.performance.timing using the Web Console after each load.

Actual results:

The values are as follows.  The second load's timing values are not consistent with the spec:

======= First load =======
connectEnd: 1317680797258
connectStart: 1317680797252
domComplete: 1317680800052
domContentLoadedEventEnd: 1317680799067
domContentLoadedEventStart: 1317680798962
domInteractive: 1317680798955
domLoading: 1317680797468
domainLookupEnd: 1317680797252
domainLookupStart: 1317680797203
fetchStart: 1317680797205
loadEventEnd: 1317680800053
loadEventStart: 1317680800052
navigationStart: 1317680797203
redirectEnd: 0
redirectStart: 0
requestStart: 1317680797258
responseEnd: 1317680797492
responseStart: 1317680797466
unloadEventEnd: 0
unloadEventStart: 0

======= Second load =======
connectEnd: 0
connectStart: 0
domComplete: 1317680824409
domContentLoadedEventEnd: 1317680823159
domContentLoadedEventStart: 1317680823094
domInteractive: 1317680823086
domLoading: 1317680821459
domainLookupEnd: 1317680821175
domainLookupStart: 1317680821175
fetchStart: 1317680821177
loadEventEnd: 1317680824410
loadEventStart: 1317680824409
navigationStart: 1317680805055
redirectEnd: 0
redirectStart: 0
requestStart: 1317680821192
responseEnd: 1317680821479
responseStart: 1317680821456
unloadEventEnd: 1317680821555
unloadEventStart: 1317680821459

Expected results:

navigationStart is incorrect for second load (more than 16 seconds before unloadEventStart)

connectStart and connectEnd are 0 when a persistent connection is used to fetch the document.  The Navigation Timing draft spec[1] states:
If a persistent connection [RFC 2616] is used or the current document is retrieved from relevant application caches or local resources, this attribute must return value of domainLookupEnd.

Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-10-24 19:21:14 PDT
Steve, which build are you testing in?  I can reproduce the connectStart/connectEnd being 0 issue in Firefox 7, but not in a current nightly.
Comment 2 Steve Jiang 2011-10-24 20:07:19 PDT
Yes this was with release 7.0.1.  I'll try nightly.
Comment 3 Steve Jiang 2011-10-24 20:15:39 PDT
Confirmed connectStart/connectEnd being fixed in nightly, still getting bad values for navigationStart when reloading
Comment 4 Boris Zbarsky [:bz] (still a bit busy) 2011-10-24 20:30:19 PDT
OK.  How exactly are you reloading?  Reload button?  Enter in url bar?  Something else?  And just to make sure, you see the bug in safe mode, right?
Comment 5 Steve Jiang 2011-10-24 20:42:24 PDT
Yes, tried safe mode.  Happens with reload button, cmd-R, and enter in URL bar.  

Values are such that navigationStart is several seconds less than unloadEventStart/unloadEventEnd.  Also, unloadEventStart is updating during the loading of the page, such that once the page has loaded, unloadEventStart > unloadEventEnd.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2011-10-24 21:33:41 PDT
Hmm.  I can't reproduce that last bit..  With a current Mac nightly on 10.6.8, so pretty similar to your setup, I see unloadEventEnd be 1ms after unloadEventStart and navigationStart.

I do see navigationStart at about 20ms before the other quantities, though.  Checking on that now.
Comment 7 Boris Zbarsky [:bz] (still a bit busy) 2011-10-24 21:49:51 PDT
OK, so what happens here is that we hit EndPageLoad for the main page, copy mTiming to the document, then later a subframe starts a load, which lands us in MaybeInitTiming for the main page and we record a navigationStart at that point.

Igor, should we perhaps be calling MaybeInitTiming in the STATE_START codepath only if |this == aProgress|?
Comment 8 Igor Bazarny 2011-11-02 05:10:41 PDT
Hmm, I had an impression that nsDocShell::OnProgressChange is not called for frames. If I'm wrong, than [this == aProgress] check seems useful. I have troubles creating testcase which demonstrates the issue.
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2011-11-02 06:23:42 PDT
OnProgressChange is absolutely called for frames.

Here's a testcase for you:

  <iframe onload="setTimeout(function() { alert(window.performance.timing.loadEventEnd - window.performance.timing.navigationStart); }, 0);" 
          src="data:text/html,<a href=''>Click me, wait 30 seconds, then force-refresh the page</a>"></iframe>
Comment 10 Igor Bazarny 2011-11-03 05:50:08 PDT
Created attachment 571614 [details] [diff] [review]
Test and fix

Test and fix for the bug.

I have no rights to run this change though the try server, would appreciate help.
Comment 11 Igor Bazarny 2011-11-03 05:55:12 PDT
Created attachment 571616 [details] [diff] [review]
Same as before, unnecessary content removed from the test

Unused code removed from the test.
Comment 12 Boris Zbarsky [:bz] (still a bit busy) 2011-11-03 07:35:01 PDT
I don't understand that test.  This bug is that navigation in a subframe records navigationStart for a _parent_ of that subframe.  But the test looks at window.performance in the subframe itself.  Does the test actually fail without the docshell change?  I doubt it does...
Comment 13 Igor Bazarny 2011-11-07 01:09:13 PST
The test did fail before the fix. It contains 2 nested frames. Top window is a driver which waits for the frame (1) to load and than changes location of the inner frame (2). After 3 second timeout frame 1 is reloaded, time before reload is saved. Due to the bug, at the moment of reload frame 1 has timing initialized and navStart set to the time about 3s before the reload. Correct implementation should have navStart set after the time of reload call, that's what the test finally checks. I've seen that timing could be pretty imprecise on some platform, that's why I let the moment recorded by navigation time be 0.2s earlier than time recorded at the reload time.
Comment 14 Darin Wright 2011-11-07 07:20:45 PST
New Relic uses the Web Timing API to derive navigation start time for reporting of end user response times. We've been seeing extremely large response times from FF7, which could be explained by this bug.
Comment 15 Darin Wright 2011-11-07 07:21:40 PST
Created attachment 572464 [details]
screen shot of some large page load times
Comment 16 Boris Zbarsky [:bz] (still a bit busy) 2011-11-07 12:04:53 PST
Comment on attachment 571616 [details] [diff] [review]
Same as before, unnecessary content removed from the test

Oh, I see.  I mixed up my |frame| and |innerFrame|.

s/latting/allowing/ and r=me.  Once we have a way to get access to the monotonic clock from script, we can nix this timeout hackery, right?

Is there anything else to do here before this lands?  I'd like to get it on inbound today so it makes Fx10....
Comment 17 Igor Bazarny 2011-11-08 05:26:58 PST
Created attachment 572788 [details] [diff] [review]
Comment typo fix

Typo correction done. Indeed, this seems to be all we need to fix the issue.
Comment 18 Igor Bazarny 2011-11-08 05:47:26 PST
I use PR_Now to set navigationStart than add durations calculated from TimeStamps converted to milliseconds for later moments. This way measurements look 'normal'. But this also means monotonic clock is not enough as we need clocks to be synchronized for each session.
Also, there are issues with clock precision and event ordering, so I was not able to compare Bavigation Timing values with values returned from on some platforms and removed ordering checks for, say, loadEventStart, time recorded by onload handler and loadEventEnd.
So we might need more work in addition to monotinic clock access from script.

In fact, slack here might be unnecessary, I wanted to be on safe side given the previous experience with timing ordering tests and short time frame.
Comment 19 Boris Zbarsky [:bz] (still a bit busy) 2011-11-08 08:44:57 PST
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

Comment 20 Boris Zbarsky [:bz] (still a bit busy) 2011-11-08 09:09:01 PST
Comment 21 Darin Wright 2011-11-10 08:25:28 PST
Will there be a way to tell which browsers contain this fix based on the User Agent header? Does Firefox 8 also contain this bug?
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2011-11-10 09:07:59 PST
Firefox 8 contains the bug, yes.  Firefox 10 will have it fixed.  What happens with Firefox 9 is an open question so far.

Igor, do you think this is safe enough to take on Aurora?  If so, please request approval...
Comment 23 Igor Bazarny 2011-11-11 03:23:00 PST
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

I would certainly like to see this issue fixed in Fx9

The fix seems safe to me. I'm a bit confused about release lines. Didn't Fx9 move to beta when Fx8 was released this week? Should I ask for approval‑mozilla‑beta then?
Comment 24 Jason Duell [:jduell] (needinfo me) 2011-11-11 11:20:09 PST

Yes, ask for approval-mozilla-beta if you want this in FF9.  Patch is small, otherwise I wouldn't say it was a realistic option (we don't take a lot of stuff on beta).
Comment 25 Boris Zbarsky [:bz] (still a bit busy) 2011-11-11 12:29:44 PST
I do think this is a safe change well worth taking in beta so we don't impose this breakage on web developers any longer than we have to....
Comment 26 Jason Duell [:jduell] (needinfo me) 2011-11-15 15:50:08 PST
Comment on attachment 572788 [details] [diff] [review]
Comment typo fix

Igor, was there a reason you cleared the approval-mozilla-aurora flag?  I assume we want this on aurora too, since we've now got approval-beta.

I'm going ahead and resettting the flag--landing on beta and not aurora sounds way-non-kosher :)
Comment 27 christian 2011-11-15 17:04:49 PST

It looks to already be on aurora.
Comment 28 Igor Bazarny 2011-11-16 04:44:11 PST
(In reply to Jason Duell (:jduell) from comment #26)
> Comment on attachment 572788 [details] [diff] [review] [diff] [details] [review]
> Comment typo fix
> Igor, was there a reason you cleared the approval-mozilla-aurora flag?  I
> assume we want this on aurora too, since we've now got approval-beta.
> I'm going ahead and resettting the flag--landing on beta and not aurora
> sounds way-non-kosher :)

I had an impression that change went in right before Fx10 Aurora cut, so there is no need for integration.
Comment 29 Boris Zbarsky [:bz] (still a bit busy) 2011-11-16 04:44:58 PST
Yep.  That impression was correct too; see comment 27.
Comment 30 Vlad [QA] 2011-12-06 08:41:07 PST
Created attachment 579309 [details]
screen shot

I want to verify this bug and as you can see in the attachment, I get different results when I load the page the second time.
This is intended? What parameters I have to observe for this bug to be verified   in Aurora and Nightly?

Comment 31 Boris Zbarsky [:bz] (still a bit busy) 2011-12-06 11:03:17 PST
For your second load, you seem to be looking at the timing information before the page is actually done loading.
Comment 32 Vlad [QA] 2011-12-07 06:45:34 PST
Created attachment 579677 [details]
Screen shot

I have re-verefied this bug and I let the tage to fully load.
You can see the figures for Firefox, Aurora and Nightly in the attachment.
They are correct?

I have tried this on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0) Gecko/20100101 Firefox/9.0 beta 4
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0a2) Gecko/20111206 Firefox/10.0a2
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:11.0a1) Gecko/20111206 Firefox/11.0a1

Comment 33 Boris Zbarsky [:bz] (still a bit busy) 2011-12-07 06:48:46 PST
Possibly, depending on the steps you used...
Comment 34 Vlad [QA] 2011-12-12 05:14:39 PST
(In reply to Boris Zbarsky (:bz) from comment #33)
> Possibly, depending on the steps you used...

I have used the steps from the description. 
Is this verified based on attachment from comment32?

Comment 35 Boris Zbarsky [:bz] (still a bit busy) 2011-12-12 06:17:44 PST
Hm.  Your numbers actually don't make any sense whatsoever, if I'm doing the math correctly.  For example, for your "Firefox first load" the gap between navigationStart and loadEventStart is order of 500 seconds, right?  Similar for nightly.  Did the page really take 6 minutes to load for you?
Comment 36 Vlad [QA] 2011-12-12 07:58:08 PST
The page fully loaded after some time, only then I reloaded the page.
I'll redo the test and wait exactly 25 seconds regardless if the page fully loaded or not.

I'll post back my results.
Comment 37 Vlad [QA] 2011-12-14 05:57:39 PST
Created attachment 581606 [details]

Hi Boris.
I've attached a new screenshot. Does the figures look ok?
If not, I don't know how else can I test this.

Comment 38 Boris Zbarsky [:bz] (still a bit busy) 2011-12-14 09:55:02 PST
In that screenshot, you're examining the page before it's done loading....  I assume the page really does take forever to fully load for you for some reason.

But in any case, that screenshot shows the bug is fixed, as far as I can tell.  Compare the difference between navigationStart and fetchStart for the second load in builds with and and without the patch, maybe?  And then compare both to the same difference for the first load?
Comment 39 Steve Jiang 2011-12-14 10:05:55 PST
FWIW, I've verified that the case in my original report no longer exhibits the problem in Aurora.
Comment 40 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-12-14 10:45:42 PST
Steve, could you do a quick verification with Firefox 9.0b6? (it's on FTP now)
Comment 41 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-12-14 13:14:13 PST
Vlad, please respond to comment 38. If this is fixed based on that in Beta, please update the whiteboard. Thanks
Comment 42 Steve Jiang 2011-12-14 14:21:02 PST
Anthony, I just checked 9.0b6.  It shows the proper values as well.
Comment 43 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-12-14 14:40:40 PST
Thanks Steve, marking this VERIFIED.
Comment 44 Vlad [QA] 2011-12-14 23:29:08 PST
Indeed, the page takes a very long time to fully load. 
I've waited well over 5 minutes for the page to load.
Comment 45 Aaron Peters 2011-12-19 23:55:38 PST
This bug is not in the list of bug fixes for FF9.

Is it correct that the bug is still present in FF9?
Comment 46 Boris Zbarsky [:bz] (still a bit busy) 2011-12-21 07:54:22 PST
This is fixed in FF9.

Whoever generated that list did it wrong.
Comment 47 2012-05-15 06:37:41 PDT
Dear navigation timing experts, please take a look at bug 748276

Note You need to log in before you can comment on or make changes to this bug.