Closed Bug 791088 Opened 10 years ago Closed 10 years ago

Add telemetry for Connection Manager transaction time


(Core :: Networking: HTTP, defect)

Not set





(Reporter: valentin, Assigned: valentin)



(1 file, 2 obsolete files)

We need to have telemetry implemented to determine how long a transaction sits in
the http connection manager queue (i.e the time between submission to the connection manager and the time we dispatch it)
It needs to be broken down between normal http, http with pipelines, and spdy.

In the absence of stoneridge (or in addition to it) We could use this to
decide on pipeline deployment in conjunction with a/b testing of that.
Assignee: nobody → valentin.gosu
Attached patch Patch 1.0 (obsolete) — Splinter Review
Attachment #661157 - Flags: review?(mcmanus)
Comment on attachment 661157 [details] [diff] [review]
Patch 1.0

Review of attachment 661157 [details] [diff] [review]:

valentin - that was fast!

can you do some testing of this - maybe with wireshark or some har processor to see if we're reporting sane values? Just go to a site without a lot of sharding and a million embedded images and you should see some queueing.

::: netwerk/protocol/http/nsHttpConnectionMgr.cpp
@@ +38,5 @@
>  {
>      // insert into queue with smallest valued number first.  search in reverse
>      // order under the assumption that many of the existing transactions will
>      // have the same priority (usually 0).
> +    trans->SetPendingTime();

Put this early in processnewtransaction()..

2 problems with putting it here:
 * rescheduling means this gets called more than once
 * things that are immediately dispatched never call this function at all - but we still want to record very short q times for them

@@ +941,5 @@
>                  LOG(("  removing pending transaction based on "
>                       "TryDispatchTransaction returning hard error %x\n", rv));
>              ent->mPendingQ.RemoveElementAt(i);
> +            if (trans->QueryPipeline())

that looks tempting! but it won't work :(.. that will just give you the nsHttpPipeline datastructure, which we use a lot more often than we actually engage in pipelining (we do that because it allows changing our mind at a later stage)

For the non spdy cases you can just test  (mCaps & NS_HTTP_ALLOW_PIPELINING) to determine spdy vs regular.. that tests policy (would we pipeline this if we could) instead of the action (maybe we can't pipeline it because there are no connections established) which is what we want.

@@ +942,5 @@
>                       "TryDispatchTransaction returning hard error %x\n", rv));
>              ent->mPendingQ.RemoveElementAt(i);
> +            if (trans->QueryPipeline())
> +                Telemetry::AccumulateTimeDelta(

these telemetry calls should be made in the success case of AddToShortestPipeline, dispatchtransaction (spdy), and dispatchabstracttransaction.

@@ +948,5 @@
> +                    trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +            else
> +                Telemetry::AccumulateTimeDelta(
> +                    Telemetry::TRANSACTION_WAIT_TIME_HTTP,
> +                    trans->GetPendingTime(), mozilla::TimeStamp::Now());

what does accumulatetimedelta() do when GetPendingTime().IsNull() is true? I hope it just fails to create a sample.. but double check that.

also, after a telemetry call set the pending time back to null so that anything that is dispatched twice (due to error and rescheduling) doesn't make 2 samples.

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +209,5 @@
>      // protected by nsHttp::GetLock()
>      nsHttpResponseHead             *mForTakeResponseHead;
>      bool                            mResponseHeadTaken;
> +    mozilla::TimeStamp                       mTimePending;

document what it means and make its name consistent with the getter/setter
Attachment #661157 - Flags: review?(mcmanus)
Attached patch Patch 1.1 (obsolete) — Splinter Review
Hi Patrick,

I've changed the implementation to match your suggestions.
SetPendingTime is now called at the begining of ProcessNewTransaction.
Telemetry is accumulated in DispatchTransaction and AddToShortestPipeline.
Accumulation for DispatchAbstractTransaction is done in DispatchTransaction.
Not reporting for nsHttpConnectionMgr.cpp#2700 because that's only a null transaction.

Tested on SPDY and HTTP connections. HTTP_PIPELINES stats started showing up when activating pipelines.

Push to Try:
Attachment #661157 - Attachment is obsolete: true
Attachment #662031 - Flags: review?(mcmanus)
Comment on attachment 662031 [details] [diff] [review]
Patch 1.1

Review of attachment 662031 [details] [diff] [review]:

r=mcmanus with the two changes below, plus:

I appreciate you checked that telemetry showed up for the right cases, but before checkking in can you also spot check that the data being reported is roughly correct.. perhaps running wireshark should give you some idea of the delay for a deeply queued image.

::: netwerk/protocol/http/nsHttpConnectionMgr.cpp
@@ +1360,5 @@
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +        else
> +            AccumulateTimeDelta(
> +                Telemetry::TRANSACTION_WAIT_TIME_HTTP,
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());

we need a setpendingtime(false), right?

@@ +1567,5 @@
> +                trans->GetPendingTime(), mozilla::TimeStamp::Now());
> +            trans->SetPendingTime(false);
> +        }
> +        return rv;
> +    } else {

you don't need the else{} after the return.. you can return this to the prior level of nesting
Attachment #662031 - Flags: review?(mcmanus) → review+
Attached patch Patch 1.2Splinter Review
Implemented the suggestions in the above comment.
Marking as r+ following Patrick's review.

Tested with wireshark and results appear to be correlated.
Attachment #662031 - Attachment is obsolete: true
Attachment #662385 - Flags: review+
Attachment #662385 - Flags: checkin?
Keywords: checkin-needed
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in before you can comment on or make changes to this bug.