Closed Bug 1354405 Opened 7 years ago Closed 7 years ago

Add new telemetry probe for bandwidth cost based on average cache hit when we rcwn

Categories

(Core :: Networking: Cache, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: CuveeHsu, Assigned: CuveeHsu)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-active])

Attachments

(1 file, 5 obsolete files)

To evaluate rcwn algorithm, we need new probe for real bandwidth cost will be just an estimate based on average cache hit
Tend to report from nsHttpTransaction::GetTransferSize
We will get 3 probes:
1) not race/race 
2) use cache/race 
3) use network

+-----------------+---------------------+----------------------+----------------------+
|                 |       Not Race      | Race and Use Network |  Race and Use Cache  |
+-----------------+---------------------+----------------------+----------------------+
| No Entry        |         Net         |          Net         |          Net         |
+-----------------+---------------------+----------------------+----------------------+
| Stale Entry     |         Net         |          Net         |          Net         |
+-----------------+---------------------+----------------------+----------------------+
| Re-validate 304 | Cache and small net |          Net         |        Depends       |
+-----------------+---------------------+----------------------+----------------------+
| Re-validate 200 |         Net         |          Net         |          Net         |
+-----------------+---------------------+----------------------+----------------------+
| Fresh Entry     |        Cache        |          Net         |        Cache*        |
+-----------------+---------------------+----------------------+----------------------+

* After we enable cancel-net-on-cache-win
Attached patch bandwidth - v1 (obsolete) — Splinter Review
The patch is based on Bug 1354409.
Attachment #8861343 - Flags: review?(michal.novotny)
Comment on attachment 8861343 [details] [diff] [review]
bandwidth - v1

Review of attachment 8861343 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +7012,5 @@
>  
> +    if (mTransaction) {
> +        mTransferSize = mTransaction->GetTransferSize();
> +    }
> +    Telemetry::Accumulate(kRcwnTelemetry[rcwnStatus], mTransferSize);

See https://bugzilla.mozilla.org/show_bug.cgi?id=1354409#c11, the same apply for this probe.

::: toolkit/components/telemetry/Histograms.json
@@ +2165,5 @@
>    },
> +  "NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK": {
> +    "expires_in_version": "58",
> +    "kind": "exponential",
> +    "high": 1048576,

1MB upper limit looks quite small to me. Why did you choose this value?

@@ +2167,5 @@
> +    "expires_in_version": "58",
> +    "kind": "exponential",
> +    "high": 1048576,
> +    "n_buckets": 100,
> +    "description": "Amount of bytes sent when we decide to race cache with network and network wins.",

I think it should be "bytes received".
Attachment #8861343 - Flags: review?(michal.novotny) → feedback+
> ::: toolkit/components/telemetry/Histograms.json
> @@ +2165,5 @@
> >    },
> > +  "NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK": {
> > +    "expires_in_version": "58",
> > +    "kind": "exponential",
> > +    "high": 1048576,
> 
> 1MB upper limit looks quite small to me. Why did you choose this value?
> 

I'd like to choose 1GB before. Thanks for pointing this.
Attached patch bandwidth - v2 (obsolete) — Splinter Review
Attachment #8861343 - Attachment is obsolete: true
Attachment #8864018 - Flags: review?(michal.novotny)
Comment on attachment 8864018 [details] [diff] [review]
bandwidth - v2

Review of attachment 8864018 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +2076,5 @@
>    },
> +  "NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK": {
> +    "expires_in_version": "58",
> +    "kind": "exponential",
> +    "high": 1073741824,

I think this needs some tweaking. mTransferSize contains size including headers and it won't be usually smaller than few hundreds of bytes. So first 20-25 buckets would be unused and we would lose the most fine grained buckets. We need to have some reasonable low limit. And maybe with 1GB we should also have more than 100 buckets to have a good resolution for sizes <= 1MB.
Attachment #8864018 - Flags: review?(michal.novotny) → feedback+
(In reply to Michal Novotny (:michal) from comment #7)
> Comment on attachment 8864018 [details] [diff] [review]
> bandwidth - v2
> 
> Review of attachment 8864018 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/Histograms.json
> @@ +2076,5 @@
> >    },
> > +  "NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK": {
> > +    "expires_in_version": "58",
> > +    "kind": "exponential",
> > +    "high": 1073741824,
> 
> I think this needs some tweaking. mTransferSize contains size including
> headers and it won't be usually smaller than few hundreds of bytes. So first
> 20-25 buckets would be unused and we would lose the most fine grained
> buckets. We need to have some reasonable low limit. 

I'll grab some statistics for the header size.
However, H2 might have extremely small header size since HPACK compression, frequently < 100bytes.
Gathering <32 bytes into one bucket might be good.

> And maybe with 1GB we
> should also have more than 100 buckets to have a good resolution for sizes
> <= 1MB.

for 100 buckets, the rate is around 1.23
for 300 buckets, the rate is around 1.07, sounds better.
1) Per SPDY_SYN_REPLY_SIZE telemetry, 4% of H2 header size is <16 bytes.
I don't think it's a good idea to do much tweaking.

2) Without special approval, we can not have more than 100 buckets.

3) We won't race if the size of content is expected to be big.

Based on the above facts, I choose low=32 and high=16MB
The rate would be 1.14, which is similar to SPDY_SYN_REPLY_SIZE
Attached patch bandwidth - v3 (obsolete) — Splinter Review
Modify by last comment, also do some refactor.
Attachment #8864018 - Attachment is obsolete: true
Attachment #8864482 - Flags: review?(michal.novotny)
Comment on attachment 8864482 [details] [diff] [review]
bandwidth - v3

Review of attachment 8864482 [details] [diff] [review]:
-----------------------------------------------------------------

r+ but please change the expiration and also make sure to ask for data review before landing

::: toolkit/components/telemetry/Histograms.json
@@ +2083,5 @@
>      "n_buckets": 100,
>      "description": "Time in milliseconds between onStartRequest from the cache and onCacheEntryAvailable. Report only when net wins and OCEC is before onStartRequest from net."
>    },
> +  "NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK_WIN": {
> +    "expires_in_version": "58",

I think we want to keep these probes longer than till release 58.
Attachment #8864482 - Flags: review?(michal.novotny) → review+
Attached patch bandwidth -v4 (obsolete) — Splinter Review
Data review for additive telemetries.

These telemetries record the bandwidth of different conditions when we race cache with network, in order to measure the trade-off network data from earning speed.
Attachment #8864482 - Attachment is obsolete: true
Attachment #8864807 - Flags: review+
Attachment #8864807 - Flags: feedback?(benjamin)
Comment on attachment 8864807 [details] [diff] [review]
bandwidth -v4

In addition to the necko@m.c mailing list, please add a real person's name to the alert_emails so that we know who's responsible for this probe.

data-r=me with that change
Attachment #8864807 - Flags: feedback?(benjamin) → feedback+
Attached patch bandwidth - v5 (obsolete) — Splinter Review
Rebase on bug 1363700.
Good to have another eye on this small patch again.
Attachment #8864807 - Attachment is obsolete: true
Attachment #8868492 - Flags: review?(valentin.gosu)
Comment on attachment 8868492 [details] [diff] [review]
bandwidth - v5

Review of attachment 8868492 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with this one question answered.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +8855,5 @@
> +        Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK_WIN,
> +        Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_CACHE_WIN,
> +        Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE,
> +        Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE
> +    };

Isn't this the other way around? Shouldn't kDidNotRaceUsedNetwork go into the NETWORK_RACE_CACHE_BANDWITH_NOT_RACE bucket, and kRaceUsedCache go into NETWORK_RACE_CACHE_BANDWITH_RACE_CACHE_WIN ?
Attachment #8868492 - Flags: review?(valentin.gosu) → review+
Good catch, looks like following order is correct:

Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE,
Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE
Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK_WIN,
Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_CACHE_WIN,
> Isn't this the other way around? Shouldn't kDidNotRaceUsedNetwork go into
> the NETWORK_RACE_CACHE_BANDWITH_NOT_RACE bucket, and kRaceUsedCache go into
> NETWORK_RACE_CACHE_BANDWITH_RACE_CACHE_WIN ?
>
> Good catch, looks like following order is correct:
> 
> Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE,
> Telemetry::NETWORK_RACE_CACHE_BANDWITH_NOT_RACE
> Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_NETWORK_WIN,
> Telemetry::NETWORK_RACE_CACHE_BANDWITH_RACE_CACHE_WIN,

Yes, it's correct. Thanks for saving this horrible mistake.
Attached patch bandwidth -v6Splinter Review
Attachment #8868492 - Attachment is obsolete: true
Attachment #8868829 - Flags: review+
Attachment #8868829 - Flags: feedback+
Attachment #8868829 - Attachment description: bandwidth → bandwidth -v6
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb52b557bc32
Add new telemetry probe for bandwidth cost for rcwn, r=michal, valentin, data-r=bsmedberg
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/bb52b557bc32
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Blocks: 1367353
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: