Closed Bug 729736 Opened 12 years ago Closed 12 years ago

experiment with different ping keep-alive values in spdy

Categories

(Core :: Networking: HTTP, enhancement)

13 Branch
x86_64
Linux
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: mcmanus, Assigned: mcmanus)

References

Details

(Whiteboard: [spdy])

Attachments

(1 file, 1 obsolete file)

bug 728113 uses spdy-level pings as keep-alive (and dead connection detection) tools based on somewhere around just less than 60 seconds of idle time. (the default is at least 44, but the timer has a 15 second granularity.)

that's based on my old experience, recently gut checked with a few folks in the know, that the lowest nat timeout value we really need to worry about is 60 seconds. But nobody was really certain.

it would be good to have some data, and here is a nice place to get some where we don't have to do any kind of site tracking (because nat timeouts are generally a function of client topology) from folks already using telemetry.

So this patch is a first of its kind for me - I'm pretty excited by it.

* it takes around 1 of 1500 spdy sessions (of those that are already using telemetry) and assigns them a random ping threshold rather than the configured one. This value is between 10 and 180 seconds assuming default config.

* if the ping timer fires then we report via telemetry whether or not a timeout was detected afterwards and the value of the timer . A timeout means the value was too big :)

we can use that data to determine what this should be set at, etc.. the result has applicability to http too, but http can't measure it in the same way as it lacks a ping like mechanism.
Attached patch patch 0 (obsolete) — Splinter Review
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Attachment #599790 - Flags: review?(honzab.moz)
Comment on attachment 599790 [details] [diff] [review]
patch 0

Review of attachment 599790 [details] [diff] [review]:
-----------------------------------------------------------------

We need a pref to turn this off and have a normal behavior of the ping if the experiment would cause some problems or just for debugging purposes with telemetry left on.

r- only for using "@mozilla.org/security/random-generator;1".

::: netwerk/protocol/http/SpdySession.cpp
@@ +149,5 @@
> +        }
> +      }
> +      NS_Free(buffer);
> +    }
> +  }

Worth have a separate method for this code also since it is just an experimental thing.  You can also do nice early returns then.

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ +536,5 @@
> +nsHttpHandler::GetRandomGeneratorService()
> +{
> +    if (!mRandomGeneratorService)
> +        mRandomGeneratorService =
> +            do_GetService("@mozilla.org/security/random-generator;1");

This is very expensive including SHA256 calculation and eats entropy that is precious, you may even get blocked when called too often I think.  

You don't need a security save results for your purpose.  If you are concerned about leaking some rnd sequences with using just a linear generator, then it is void since you throw a lot of data away and you have way too few sessions probably (do we have a telemetry for how many parallel session we usually open?).

I think you can go with just a rand() seeded on the network thread (I didn't find a place where we would be doing that for that thread, so just add it).

::: netwerk/protocol/http/nsHttpHandler.h
@@ +347,5 @@
>      // For broadcasting the preference to not be tracked
>      bool           mDoNotTrackEnabled;
>      
> +    // Whether telemetry is reported or not
> +    bool          mTelemetryEnabled;

space count
Attachment #599790 - Flags: review?(honzab.moz) → review-
Attached patch patch v1Splinter Review
Attachment #599790 - Attachment is obsolete: true
Attachment #600288 - Flags: review?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #2)
> (do we have a telemetry for how many parallel
> session we usually open?).

we don't.. though its usually just a handful per visit.. and they do time out after 3 minutes, so that will repeat if you keep going back to google search, for instance.

I know I set the selection rate really low. If we don't get any data its easy enough to bump it up safely on aurora or beta.. do you think I should double or quadruple the selection rate?
(In reply to Patrick McManus [:mcmanus] from comment #4)
> do you think I should
> double or quadruple the selection rate?

No idea.  We know we have some 50000 ADU nightly users.  Say that some less <5% turn spdy on (= <2500 users) - any telemetry on this? :)  So, every 1024th session does a test, a user may have some 100 (tele?) sessions per browser session.  Then we could have some 250 measurements per day.  I don't know if you've made any calculus your self, but I think it is OK to have this for nightly, we may land updates.  

Hmm.. what about to make the probability a preference?
Comment on attachment 600288 [details] [diff] [review]
patch v1

Review of attachment 600288 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks

r=honzab

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ +202,5 @@
>      , mSendSecureXSiteReferrer(true)
>      , mEnablePersistentHttpsCaching(false)
>      , mDoNotTrackEnabled(false)
> +    , mTelemetryEnabled(false)
> +    , mAllowExperiments(true)

Rather false ? (for xpcshell profile-less tests to not get influenced by anything random)  Up to you.
Attachment #600288 - Flags: review?(honzab.moz) → review+
> Rather false ? (for xpcshell profile-less tests to not get influenced by
> anything random)  Up to you.

tehcnically true, but they won't have telem enabled either without a profile - so it will still be disabled.

(In reply to Honza Bambas (:mayhemer) from comment #5)
> (In reply to Patrick McManus [:mcmanus] from comment #4)
> > do you think I should
> > double or quadruple the selection rate?
> 
> No idea. 

I'm just kind of guessing too - that's why I started conservative. Needing to turn it up is better than panicing to turn it down :)
Backed out for OS X mochitest-1 orange, possibly also other m-1 and m-oth orange.

https://hg.mozilla.org/integration/mozilla-inbound/rev/d06799066707
This patch was ok - it just got a bad rap because it was pushed along wiht another bad one. here is the try reverification: https://tbpl.mozilla.org/?tree=Try&rev=732f7ed957a0

https://hg.mozilla.org/integration/mozilla-inbound/rev/483efb99bf3f
https://hg.mozilla.org/mozilla-central/rev/483efb99bf3f
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla13
Summary: experiement with different ping keep-alive values in spdy → experiment with different ping keep-alive values in spdy
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: