Closed Bug 662555 Opened 14 years ago Closed 14 years ago

Race condition when measuring requestEnd

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla9

People

(Reporter: mayhemer, Assigned: mayhemer)

References

Details

Attachments

(1 file, 2 obsolete files)

v1 14 years ago Honza Bambas (:mayhemer) 1.72 KB, patch	jduell.mcbugs : review+	Details \| Diff \| Splinter Review
v1.1 14 years ago Honza Bambas (:mayhemer) 1.77 KB, patch	mayhemer : review+	Details \| Diff \| Splinter Review
v2 14 years ago Honza Bambas (:mayhemer) 1.74 KB, patch	jduell.mcbugs : review+	Details \| Diff \| Splinter Review

Honza Bambas (:mayhemer)

Assignee

Description

•

14 years ago

nsHttpChannel::OnStopRequest calls gHttpHandler->CancelTransaction. That method posts an event to the socket thread that cancels the transaction asynchronously (by call to nsHttpTransaction::Close). In that method we stamp the requestEnd time. Sometimes this event fires later then we read the value of requestEnd time on the main thread, giving us a null value for it. The time (also more precise) has be stamped at nsHttpTransaction::HandleContent when we detect eof. In general we should check if the timings are (and can be/have to be) in sync with the activity distributor.

Honza Bambas (:mayhemer)

Assignee

Updated

•

14 years ago

Blocks: 576006

Jason Duell

Comment 1

•

14 years ago

Would it be good enough to set the requestEnd timestamp in nsHttpChannel::OnStopRequest? AFAICT we really only care about when the data finished getting to the user, not when the transaction was cancelled asynchronously.

Honza Bambas (:mayhemer)

Assignee

Comment 2

•

14 years ago

We could do that in OnStopRequest, right, but just to note that this is also an event posted from the socket thread to the main thread. So, there is some delay. Question is, do we want to include that delay or not? All other measures are captured on the socket thread, so why not this one as well? Also these should be pure-as-can-be network only timings. Also it shows that we don't protect the timings with a mutex. I wouldn't actually add a mutex (negative perf effect) but rather (if possible, wouldn't break the web timing API) make the values accessible only after OnStopRequest has been called, e.g. mPending == false. But it is not for this bug. Adding a hard block for bug 658894 as it shots leak test down (undetected by try before).

Blocks: 658894

Jason Duell

Comment 3

•

14 years ago

> We could do that in OnStopRequest, right, but just to note that this is also an event posted from the socket thread to the main thread. So, there is some delay. Question is, do we want to include that delay or not? Well, OnStop is when the channel "ends" from the client's perspective, so it seems like a good place to take the measurement. > All other measures are captured on the socket thread, so why not this one as well? Some events (DNS resolution, TCP connection, etc) do not get notifiction on the user thread, so we can't measure them from there. But the timings are ultimately for a user to understand how much time in the network layer their code takes, and that include whatever delay we take in posting OnStop to the main thread. I'm not too worried about which thread we measure which event on, as long as the metrics are sensible. And OnStopReq seems the sensible place for requestEnd.

Honza Bambas (:mayhemer)

Assignee

Comment 4

•

14 years ago

There are 3 points of view IMO: 1. To be precise. To exactly measure the times we spend by some activities. We are using this API for our C++ coded telemetry, that has to be precise as it can be, but we can use different ways to get the telemetry, so not a strong argument... 2. To be compatible with other ways we measure. Mainly with the activity observer that Firebug uses. NS_HTTP_ACTIVITY_SUBTYPE_RESPONSE_COMPLETE is what Firebug uses for stamping the "response end" time and it is exactly at the place I suggest, [1]. I checked we are compatible for the rest of the notification, which is good. 3. Deliver what Web Timing API wants: "This attribute (responseEnd) must return the time immediately after the user agent receives the last byte of the current document or immediately before the transport connection is closed, whichever comes first. The document here can be received either from the server, relevant application caches or from local resources." Also, there is a web version of Firebug that would like to use it. It wouldn't be good to give different numbers using XPI Firebug and web Firebug. So, it is clear this needs to be put to EOF detection, the same place where the activity observer notifies. [1] http://hg.mozilla.org/mozilla-central/annotate/e319657f9c92/netwerk/protocol/http/nsHttpTransaction.cpp#l1125

Honza Bambas (:mayhemer)

Assignee

Comment 5

•

14 years ago

Attached patch v1 (obsolete) — Details — Splinter Review

This is a patch that puts the stamp to the right place. We have to leave it as well in nsHttpTransaction::Close when we get an error state. In case of an error that method seems to be called before nsHttpChannel::OnStopRequest, so we are safe.

Assignee: nobody → honzab.moz

Status: NEW → ASSIGNED

Attachment #540246 - Flags: review?(jduell.mcbugs)

Jason Duell

Comment 6

•

14 years ago

Comment on attachment 540246 [details] [diff] [review] v1 Review of attachment 540246 [details] [diff] [review]: ----------------------------------------------------------------- > This attribute (responseEnd) must return the time immediately > after the user agent receives the last byte of the current document > or immediately before the transport connection is closed, whichever comes first You're right.

Attachment #540246 - Flags: review?(jduell.mcbugs) → review+

Honza Bambas (:mayhemer)

Assignee

Comment 7

•

14 years ago

http://hg.mozilla.org/integration/mozilla-inbound/rev/edf3a8d6cff3

Whiteboard: [inbound]

Christian :Biesinger (don't email me, ping me on IRC)

Comment 8

•

14 years ago

thanks for fixing this! but shouldn't this have a TimingEnabled() check? I guess I forgot that in the original codepath...

Honza Bambas (:mayhemer)

Assignee

Comment 9

•

14 years ago

(In reply to comment #8) > thanks for fixing this! but shouldn't this have a TimingEnabled() check? I > guess I forgot that in the original codepath... Probably should. I can do that in a followup.

Joe Drew (not getting mail)

Comment 10

•

14 years ago

Backed out for apparently causing reftest assertion failures on Windows debug: http://hg.mozilla.org/integration/mozilla-inbound/rev/3890c438f22d

Whiteboard: [inbound]

Honza Bambas (:mayhemer)

Assignee

Comment 11

•

14 years ago

I have no idea why this should cause reftest assertions. Unfortunatelly logs are gone by now. I believe this was backed out because the original version of the bug 658894 patch was causing assertion failures (changeset fcf08eff68eb) but a new version currently landed on m-c (eb5866601f88) doesn't suffer from it. I vote for just relanding this.

Honza Bambas (:mayhemer)

Assignee

Comment 12

•

14 years ago

Attached patch v1.1 (obsolete) — Details — Splinter Review

Same as the v1 patch just added the TimingEnabled() check.

Attachment #540246 - Attachment is obsolete: true

Attachment #550146 - Flags: review+

Honza Bambas (:mayhemer)

Assignee

Comment 13

•

14 years ago

Hmm... looks like the spec has changed in the mean time: responseEnd: "This attribute must return the time immediately after the user agent finishes receiving the last byte of the resource from from relevant application caches or from local resources." There is no more anything about closing the connection. I will update the patch.

Honza Bambas (:mayhemer)

Assignee

Comment 14

•

14 years ago

Attached patch v2 — Details — Splinter Review

This will also fix bug 673226. http://tbpl.mozilla.org/?tree=Try&rev=07e1908a3b94

Attachment #550146 - Attachment is obsolete: true

Attachment #550161 - Flags: review?(jduell.mcbugs)

Honza Bambas (:mayhemer)

Assignee

Updated

•

14 years ago

Blocks: 673226

Jason Duell

Updated

•

14 years ago

Attachment #550161 - Flags: review?(jduell.mcbugs) → review+

Jason Duell

Updated

•

14 years ago

Keywords: checkin-needed

Ed Morley [:emorley]

Comment 15

•

14 years ago

Sent to try: http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=626e32292fdc Assuming all green, will push to inbound after. I'm happy to fix it this time, but for the future please can you make sure the format (context, commit message, author) is correct: http://blog.bonardo.net/2010/06/22/so-youre-about-to-use-checkin-needed Thanks :-)

Ed Morley [:emorley]

Comment 16

•

14 years ago

http://hg.mozilla.org/integration/mozilla-inbound/rev/95996d0763a6

Keywords: checkin-needed

Whiteboard: [inbound]

Ed Morley [:emorley]

Updated

•

14 years ago

Target Milestone: --- → mozilla9

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 17

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/95996d0763a6

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Whiteboard: [inbound]

You need to log in before you can comment on or make changes to this bug.