Closed Bug 1130444 Opened 9 years ago Closed 9 years ago

Add telemetry for the SocketTransportService Poll cycle duration

Categories

(Core :: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: dragana, Assigned: dragana)

References

Details

Attachments

(1 file, 3 obsolete files)

Add telemetry to get some measurements of Poll calls durations.
do we need a pref?
Flags: needinfo?(mcmanus)
(In reply to Dragana Damjanovic [:dragana] from comment #1)
> do we need a pref?

on whether or not to do it? Just test whether telemetry is enabled to not, and use that.

as for what data we want here - I think we want the trip time through the socket thread event loop.. not sure if that's what you're thinking or not. i.e. posted at X ran at Y.. telemetry y-x
Flags: needinfo?(mcmanus)
Actually, I have a patch already. I measure only a single poll, then the number of pending events in the pending queue in one trip through the socket thread and the whole trip duration, so that we can make difference between how long we spend in a poll and how many events we serve, and the time all together. (hey will be correlated :))

I am thinking about adding telemetry for the duration of the last trip which ends up in shutdown it is probably not longer than others. (maybe something with trytoattach and shutdown goes wrong, or some other service on will_shutdown or shutdown event does something strange - maybe not ).
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Attached patch STS telemetry v1 (obsolete) — Splinter Review
Attachment #8561349 - Flags: review?(mcmanus)
Comment on attachment 8561349 [details] [diff] [review]
STS telemetry v1

Review of attachment 8561349 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/nsSocketTransportService2.cpp
@@ +730,5 @@
>          bool pendingEvents = false;
>          thread->HasPendingEvents(&pendingEvents);
>  
> +        int numberOfPendingEvents = 0;
> +        TimeStamp cycleStart = TimeStamp::Now();

declare cycleStart and pollStart outside of the for(;;) and only udpate them if(telemetry).. we've had some issues in the past with the cost of timestamps on windows and most folks aren't running telemetry.

@@ +1014,5 @@
>              mKeepaliveEnabledPref = keepaliveEnabled;
>              OnKeepaliveEnabledPrefChange();
>          }
> +
> +        rv = tmpPrefService->GetBoolPref(TELEMETRY_PREF, &mTelemetryEnabledPref);

you're not using rv..

::: toolkit/components/telemetry/Histograms.json
@@ +2430,5 @@
>      "n_buckets": 50,
>      "extended_statistics_ok": true,
>      "description": "How many speculative connections are made needlessly"
>    },
> +  "STS_POLL_AND_EVENT_CYCLE": {

I'm not sure how this is interesting. It combines the time blocked in poll() plus the time to service the handlers found by poll plus the time to run other xpcom pending events. They are all interesting, but if we can't tease apart the blocking time from the running time I'm not sure how its helpful.

the question we want to answer is "how long does it take my new event to run" and the time spent blocking isn't that interesting there. (though it would be useful to record separately for power consumption reasons).

@@ +2440,5 @@
> +  },
> +  "STS_NUMBER_OF_PENDING_EVENTS": {
> +    "expires_in_version": "never",
> +    "kind": "linear",
> +    "high": "100",

I would make it exponential with a much higher limit. We want to capture explosions of activity. also number of events found by poll is also pretty interesting (invocations of onsocketready())

@@ +2444,5 @@
> +    "high": "100",
> +    "n_buckets": 50,
> +    "description": "Number of pending events per SocketThread cycle."
> +  },
> +  "STS_POLL_CYCLE": {

same issue as poll_and_event_cycle. we should tease out the amount of time that poll() itself takes as that is likely to be dominated by blocking not cpu. (the cpu is a concern, but I think relatively small all things considered).. 

also keep high and n_buckets consistent with other ms based measurements you are adding
Attachment #8561349 - Flags: review?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #5)
> Comment on attachment 8561349 [details] [diff] [review]
> STS telemetry v1
> 
> Review of attachment 8561349 [details] [diff] [review]:
> -----------------------------------------------------------------


I had something else in mind when I was making this patch, something more connected to bug 1124880.

I will rewrite the pach to get the answer to question "how long does it take my new event to run".
> I had something else in mind when I was making this patch, something more
> connected to bug 1124880.

that too :)
Attached patch bug_1130444_v1.patch (obsolete) — Splinter Review
Attachment #8561349 - Attachment is obsolete: true
Depends on: 1131557
Attached patch bug_1130444_v2.patch (obsolete) — Splinter Review
Attachment #8562765 - Attachment is obsolete: true
Attachment #8575856 - Flags: review?(mcmanus)
Comment on attachment 8575856 [details] [diff] [review]
bug_1130444_v2.patch

Review of attachment 8575856 [details] [diff] [review]:
-----------------------------------------------------------------

This should be really interesting info. r+

I'm on the fence about whether it should use NowLoRes() or Now() (as written). If you want to reconsider and make that change its fine by me, but I won't insist either way. (are we going to learn less by having ~16ms granularity? We're really looking for outliers anyhow.. otoh its all done under the telemetry pref and really isn't that invasive.)
Attachment #8575856 - Flags: review?(mcmanus) → review+
I like to get better resolution when it is possible, but we really do not need it. NowLoRes is enough to see the distribution and to see if it is changing.
Attachment #8575856 - Attachment is obsolete: true
Attachment #8576527 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/12ee3efc026f
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: