Closed Bug 601126 Opened 14 years ago Closed 12 years ago

Time elapsed in console log doesn't always match time in net request inspector

Categories

(DevTools :: General, defect, P1)

x86
macOS
defect

Tracking

(blocking2.0 -)

RESOLVED WONTFIX
Tracking Status
blocking2.0 --- -

People

(Reporter: johnath, Unassigned)

References

Details

(Keywords: dev-doc-complete)

Attachments

(1 file, 1 obsolete file)

When we log network requests to the console, we append the time elapsed once the request is complete - that is sexy hot. In most cases, if I click on the hyperlinked entry and pop up the network request inspector, the time reported in the "Response Headers" section matches the log entry, but in some cases it does not.

These STR are pretty consistent for me:

- Open console
- Load https://wiki.mozilla.org/Firefox/4/Triage
- Find the log entry for /images/header_tab.gif. The time elapsed for this image is typically 500-800ms for me.
- Click the log entry to open the net panel, and notice that the response headers section reports a time that is ~100ms shorter.

I'm willing to believe that these might be different timestamps - maybe the console logs "time to complete request", while the response headers entry in the inspector is "time until response started." If so, I'll own the misunderstanding, but I think that having only those two timestamps present, with similar-but-not-identical meanings (and values that are often identical), will lead to confusion.

Or maybe they really are supposed to be the same measure, and we just have a bug.

Nominating for blocking particularly in case it's the second thing and we're just getting it wrong. If it's the first thing, I don't know what we can do to disambiguate. Maybe quoting the "time to response start" as a fraction of the total response time (e.g. "700ms/856ms")?
blocking2.0: --- → ?
Just running a quick look-see to verify what you're seeing. All of the response times matched in my initial test run except for one of the other gifs in that list, which was off by 1ms. I'll dig into this to see where those numbers are coming from and why they're different.

I think some unittests are probably a good place to start for this.
Assignee: nobody → rcampbell
Status: NEW → ASSIGNED
I cannot reproduce this using the URL in the description. The calculation that figures out the elapsed time is here: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/console/hudservice/HUDService.jsm#906
Blocks: devtools4b8
Gotta be able to trust the data. Web devs will notice this stuff, so blocking+.
blocking2.0: ? → final+
I have news!

I've done some preliminary spelunking to see what's going on here and I believe I understand what is up.

In the simple case:
1. we send a GET request to the server to fetch a page. This is marked with a timestamp.
2. The server receives that request header, then sends back a response header. This is marked with a delta in the Response Header.
3. We receive the Response Body. This is marked with a further delta.
4. The total time to transmit is equal to the Response Header Delta + Response Body Delta, in ms.

The timing information that's appended to the end of the HTTP status in the log itself is calculated with a single Math.round(response.COMPLETE - InitialTimestamp). The totals in the network panel are composed of two delta calculations as above, request.header.complete and response.body.complete. That means two calls to Math.round() which, when added together, can differ by a millisecond from the total time reported in the console.
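A minimal sketch of that arithmetic, with made-up timestamp values and hypothetical variable names (the real values come from the HTTP activity observer):

  // Made-up fractional-millisecond timestamps, for illustration only.
  var requestStart   = 0;
  var headerComplete = 500.6;   // response headers fully received
  var bodyComplete   = 1001.1;  // response body fully received

  // Console log: one rounding over the whole duration.
  var consoleTotal = Math.round(bodyComplete - requestStart);      // 1001

  // Network panel: two separately rounded deltas.
  var headerDelta = Math.round(headerComplete - requestStart);     // 501
  var bodyDelta   = Math.round(bodyComplete - headerComplete);     // 501 (500.5 rounds up)
  var panelTotal  = headerDelta + bodyDelta;                       // 1002 - off by 1ms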

Of course there's a catch...

The "bad" news is that I'm unable to reproduce the weirder timing problems of images misreporting their times in the network panel. I'm wondering if there were some fixes to the networking code or httpActivityObserver that may have improved the accuracy here.

I need to do a little digging in bugzilla, but at the very least, I have a handle on how to write some tests for this (assuming our unittest webserver is sufficiently tardy) and how to fix up the reported time to get rid of cumulative rounding errors.
Assignee: rcampbell → mihai.sucan
Attached patch floor (obsolete) — Splinter Review
convert round() to floor() to eliminate additive rounding errors.
Attachment #490898 - Flags: review?(gavin.sharp)
Why does floor() instead of round() help?
Sometimes,

round(a) + round(b) (... + round(n)) != round(a + b (... + n))

We can get additive rounding errors of 1ms per Math.round call.

e.g., round(5.2) + round(6.4) = 11
vs. round(5.2 + 6.4) = 12.

I don't think losing the fractions amounts to enough to make a big difference and I'm not sure the numbers below 1ms are accurate anyway.
Comment on attachment 490898 [details] [diff] [review]
floor

(In reply to comment #7)
> round(a) + round(b) (... + round(n)) != round(a + b (... + n))

The same is true with floor(), though, and you're just replacing one with the other (not changing when they're called).

Please re-request if I'm missing something and this does actually help somehow...
Attachment #490898 - Flags: review?(gavin.sharp) → review-
mass change: filter on PRIORITYSETTING
Blocks: devtools4
Priority: -- → P1
(In reply to comment #8)
> Comment on attachment 490898 [details] [diff] [review]
> floor
> 
> (In reply to comment #7)
> > round(a) + round(b) (... + round(n)) != round(a + b (... + n))
> 
> The same is true with floor(), though, and you're just replacing one with the
> other (not changing when they're called).
> 
> Please re-request if I'm missing something and this does actually help
> somehow...

no, you're right. That was some poor logic.
mass change: filter mail on BLOCKERSETTING
Severity: normal → blocker
I did investigate this and I believe this is not a bug.

What Johnathan noticed is:

1. in the Web Console output we display the total request duration. Say 992 ms.
2. In the Network panel we display the response header ms. Say, 501ms. We also display the response body ms, say 491ms. Those two, summed up, give the total ms. 
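In code form (same numbers as the example above, variable names hypothetical):

  // Hypothetical values matching the example above.
  var headerMs = 501;   // shown next to "Response Headers" in the network panel
  var bodyMs   = 491;   // shown next to "Response Body" (when body logging is on)

  // The Web Console line shows the sum of the two.
  var consoleMs = headerMs + bodyMs;   // 992

  // Comparing consoleMs (992) against headerMs (501) alone looks like a bug,
  // but both numbers are correct - they just measure different spans.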

To actually notice, and reliably reproduce that, load a big image, and enable network logging of response bodies.

What Robert tried to fix doesn't account for the big differences in ms that Johnathan saw. Johnathan has valid concerns, but it works as expected. The difference between the ms shown in the Web Console and Response Headers of the network panel should not be a problem - it *is* expected to be different. I agree that we could make it more informative, but not sure how... given we are in string freeze now. We'd best explain this in the Web Console wiki documentation.

Rob's patch tries to tackle sum rounding errors, which can only amount to 1ms. To "fix" this, I can provide a really simple patch (basically I'd use the formulas from the response header and response body to calculate each of the two durations and sum them up, instead of what's done now in the Web Console output display of duration). That would "fix" the issue by ensuring the total never shows a rounding error. However, I think this is a non-issue.

Thoughts?
Mihai is correct in his interpretation of the original problem... which is to say that this is a problem with user expectations.

I think that the best fix for this might be to display the total response time up near the top of the request/response information, perhaps just under the status code. I don't think we have a string for that, however, so adding one would require breaking string freeze.

This can also be solved in the console docs on MDC.

So, yes, it's not actually a bug... it's just a question of how confused people will likely be.
I agree. This was pretty much the assessment I had in comment #4.

"4. The total time to transmit is equal to the Response Header Delta + Response
Body Delta, in ms."

The silly rounding->floor fix I tried was just wrong. If we want to make sure we don't display rounding errors, we just need to be consistent with how we do our additions. (I.e., always do round(a) + round(b) instead of round(a+b)).

Displaying total time at the top would be good, but would require a different string. I think the documentation solution is the way to go. Let's write that up somewhere and close this down.
And to answer Mihai's question, I think the fix for the rounding errors is probably not worth it.
I agree with you, Rob, that this doesn't seem like a good enough reason to break string freeze.

Adding dev-doc-needed + sheppy

To summarize the issue:

1. the time for network traffic in the console output area is the total time for the response (headers + body)

2. in the Network Panel that pops up when you click, the time listed by the header is just the amount of time required for the header.

3. if you have response body logging turned on, the time listed for the response body + response header is equal to the total time listed in the console output area


Taking care of this in the docs may be as simple as adding something under the "Network messages" part of the Using the Web Console page. Something like "The time listed after the HTTP status is the time taken for the complete response (header + body)."
Keywords: dev-doc-needed
I thought about resisting this conclusion based on the fact that, if I'm confused, others may be as well, and I believe that we can probably make the presentation better. But my reason for nominating (and Dietrich's reason for plussing) was the worry that this reflected a logic error in the way we collected our data, not just a failure on my part to notice the second timestamp in the inspector window.

So I think the bug should stay open, but I'm clearing the blocking flag.
blocking2.0: final+ → -
I certainly agree that fixing this in the docs is not ideal. But, is this something we want to break string freeze for?
Urr - to clarify - this should not block FF4, nor should we expect it to be fixed there. We should keep the bug open and look for ways to fix it, but not in an FF4 (and string freeze) timeframe.
Gotcha. So we add a little to the docs for now and fix it for real post Firefox 4.
Severity: blocker → normal
Assignee: mihai.sucan → nobody
When testing this I had an image with a difference of > 1500ms due to a slow connection.

So in summary of all the above ...

In the Web Console we display:
DNS Lookup + Connecting + Sending + Waiting + Receiving

And in the network request inspector we display:
DNS Lookup + Connecting + Sending + Waiting

Would it not make sense to display e.g. "delta 4600ms (-receiving)"?
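A sketch with made-up phase timings, chosen so the inspector delta matches the "4600ms" quoted above:

  // Hypothetical per-phase timings in ms.
  var dnsLookup  = 50;
  var connecting = 150;
  var sending    = 10;
  var waiting    = 4390;
  var receiving  = 1600;

  // Web Console line: every phase, including receiving the response body.
  var consoleTotal = dnsLookup + connecting + sending + waiting + receiving;   // 6200

  // Network request inspector: everything except receiving.
  var inspectorTotal = dnsLookup + connecting + sending + waiting;             // 4600

  // Labelling the inspector delta could look something like:
  var label = "delta " + inspectorTotal + "ms (-receiving)";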
Attached patch Simple fix — Splinter Review
Simple patch to add (-received) after delta time. Also marked robcee's patch as obsolete since he said it was wrong.
Attachment #490898 - Attachment is obsolete: true
Attachment #583468 - Flags: review?(rcampbell)
Comment on attachment 583468 [details] [diff] [review]
Simple fix

I'm not sure I like having this big, inert word hanging off the end of the line. 

From Kevin's summary description in comment 16, the deltas seem fairly explanatory if you log request and response bodies.

I'd kind of like to mark this bug wontfix or wfm.
Agreed, let's wontfix this.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Product: Firefox → DevTools