Add telemetry for Connection Manager transaction time

RESOLVED FIXED in mozilla18

Status

()

Core
Networking: HTTP
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: valentin, Assigned: valentin)

Tracking

unspecified
mozilla18
Points:
---
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Assignee)

Description

6 years ago
We need to have telemetry implemented to determine how long a transaction sits in
the http connection manager queue (i.e the time between submission to the connection manager and the time we dispatch it)
It needs to be broken down between normal http, http with pipelines, and spdy.

In the absence of stoneridge (or in addition to it) We could use this to
decide on pipeline deployment in conjunction with a/b testing of that.
(Assignee)

Updated

6 years ago
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
(Assignee)

Comment 1

6 years ago
Created attachment 661157 [details] [diff] [review]
Patch 1.0
Attachment #661157 - Flags: review?(mcmanus)
Comment on attachment 661157 [details] [diff] [review]
Patch 1.0

Review of attachment 661157 [details] [diff] [review]:
-----------------------------------------------------------------

valentin - that was fast!

can you do some testing of this - maybe with wireshark or some har processor to see if we're reporting sane values? Just go to a site without a lot of sharding and a million embedded images and you should see some queueing.

::: netwerk/protocol/http/nsHttpConnectionMgr.cpp
@@ +38,5 @@
>  {
>      // insert into queue with smallest valued number first.  search in reverse
>      // order under the assumption that many of the existing transactions will
>      // have the same priority (usually 0).
> +    trans->SetPendingTime();

Put this early in processnewtransaction()..

2 problems with putting it here:
 * rescheduling means this gets called more than once
 * things that are immediately dispatched never call this function at all - but we still want to record very short q times for them

@@ +941,5 @@
>                  LOG(("  removing pending transaction based on "
>                       "TryDispatchTransaction returning hard error %x\n", rv));
>  
>              ent->mPendingQ.RemoveElementAt(i);
> +            if (trans->QueryPipeline())

that looks tempting! but it won't work :(.. that will just give you the nsHttpPipeline datastructure, which we use a lot more often than we actually engage in pipelining (we do that because it allows changing our mind at a later stage)


For the non spdy cases you can just test  (mCaps & NS_HTTP_ALLOW_PIPELINING) to determine spdy vs regular.. that tests policy (would we pipeline this if we could) instead of the action (maybe we can't pipeline it because there are no connections established) which is what we want.

@@ +942,5 @@
>                       "TryDispatchTransaction returning hard error %x\n", rv));
>  
>              ent->mPendingQ.RemoveElementAt(i);
> +            if (trans->QueryPipeline())
> +                Telemetry::AccumulateTimeDelta(

these telemetry calls should be made in the success case of AddToShortestPipeline, dispatchtransaction (spdy), and dispatchabstracttransaction.

@@ +948,5 @@
> +                    trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +            else
> +                Telemetry::AccumulateTimeDelta(
> +                    Telemetry::TRANSACTION_WAIT_TIME_HTTP,
> +                    trans->GetPendingTime(), mozilla::TimeStamp::Now());

what does accumulatetimedelta() do when GetPendingTime().IsNull() is true? I hope it just fails to create a sample.. but double check that.

also, after a telemetry call set the pending time back to null so that anything that is dispatched twice (due to error and rescheduling) doesn't make 2 samples.

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +209,5 @@
>  
>      // protected by nsHttp::GetLock()
>      nsHttpResponseHead             *mForTakeResponseHead;
>      bool                            mResponseHeadTaken;
> +    mozilla::TimeStamp                       mTimePending;

document what it means and make its name consistent with the getter/setter
Attachment #661157 - Flags: review?(mcmanus)
(Assignee)

Comment 3

6 years ago
Created attachment 662031 [details] [diff] [review]
Patch 1.1

Hi Patrick,

I've changed the implementation to match your suggestions.
SetPendingTime is now called at the begining of ProcessNewTransaction.
Telemetry is accumulated in DispatchTransaction and AddToShortestPipeline.
Accumulation for DispatchAbstractTransaction is done in DispatchTransaction.
Not reporting for nsHttpConnectionMgr.cpp#2700 because that's only a null transaction.

Tested on SPDY and HTTP connections. HTTP_PIPELINES stats started showing up when activating pipelines. 

http://goo.gl/mYJSN
http://goo.gl/elDTW

Push to Try: https://tbpl.mozilla.org/?tree=Try&rev=f3bd3baffb97
Attachment #661157 - Attachment is obsolete: true
Attachment #662031 - Flags: review?(mcmanus)
Comment on attachment 662031 [details] [diff] [review]
Patch 1.1

Review of attachment 662031 [details] [diff] [review]:
-----------------------------------------------------------------

r=mcmanus with the two changes below, plus:

I appreciate you checked that telemetry showed up for the right cases, but before checkking in can you also spot check that the data being reported is roughly correct.. perhaps running wireshark should give you some idea of the delay for a deeply queued image.

::: netwerk/protocol/http/nsHttpConnectionMgr.cpp
@@ +1360,5 @@
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +        else
> +            AccumulateTimeDelta(
> +                Telemetry::TRANSACTION_WAIT_TIME_HTTP,
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());

we need a setpendingtime(false), right?

@@ +1567,5 @@
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +            trans->SetPendingTime(false);
> +        }
> +        return rv;
> +    } else {

you don't need the else{} after the return.. you can return this to the prior level of nesting
Attachment #662031 - Flags: review?(mcmanus) → review+
(Assignee)

Comment 5

6 years ago
Created attachment 662385 [details] [diff] [review]
Patch 1.2

Implemented the suggestions in the above comment.
Marking as r+ following Patrick's review.

https://tbpl.mozilla.org/?tree=Try&rev=99849ae0c5eb

Tested with wireshark and results appear to be correlated.
Attachment #662031 - Attachment is obsolete: true
Attachment #662385 - Flags: review+
Attachment #662385 - Flags: checkin?
(Assignee)

Updated

6 years ago
Keywords: checkin-needed
Attachment #662385 - Flags: checkin?
(In reply to Valentin Gosu from comment #5)
> https://tbpl.mozilla.org/?tree=Try&rev=99849ae0c5eb

Green on Try.

https://hg.mozilla.org/integration/mozilla-inbound/rev/4974c79f0439
Flags: in-testsuite-
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4974c79f0439
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in before you can comment on or make changes to this bug.