Closed Bug 904170 Opened 6 years ago Closed 6 years ago

telemetry for how much data used by product

Categories

(Core :: Networking, defect)

20 Branch
x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: mcmanus, Assigned: mcmanus)

References

Details

Attachments

(1 file)

how much http/https data is transferred day for different firefox deployments (desktop, fennec, b2g).. upstream and downstream.

socket transport is probably the best place to capture this. It won't capture DNS or RTC, but that's probably ok for now.
Hi Patrick, I want to work on this bug.
It's not clear to me, this telemetry should report per-day statistics or just per-session?
hi robert - thanks. be warned this one is going to be tricky but I appreciate you pitching in.

We need per day.. which means wall clock time is interesting even when firefox isn't running. Some persistence will be required. On mobile platforms we also need separate counters between ethernet and cell data (and iirc b2g and fennec have different ways of doing that). On desktop we can just pretend its all ethernet.
I've just talked with froydnj and mreid on irc and they've got an interesting point of view on this:
> 22:08:06 - froydnj: robertbindar: per-day statistics are difficult to do with telemetry; it's > also not clear to me why they would be superior to per-session statistics
> 22:09:12 - froydnj: robertbindar: maybe you could just ask patrick whether per-browsing-session >statistics would be ok
> 22:09:31 - mreid: seems like if it *were* per-day, you'd  also need to know the number and    > duration of sessions to make any sense of it anyways
> 22:09:39 - mreid: so it might as well just be per session
> 22:09:44 - froydnj: indeed
Sorry about formatting, bad shortcut.
We could just report only the transferred data, but if we also report the number and duration of sessions along with it like mreid said, wouldn't this aggregate of information be more meaningful?
Yes its difficult - I said that in comment 2 :)

Yes I need per day. Or even better would be per quartile of the day - but per day will probably suffice. The time we spend not running a session is important.

A question you could help answer with this data would be: "if we deployed N firefox OS devices bundled with my university VPN how much data should I expect to provision the VPN for?"

The amount of data in a session, or the number of sessions used aren't interesting for that question. They are no doubt interesting for other questions and may be worth recording in a different patch.

hope that helps.
It was very helpful, thanks. I will start working on it this week, but first time I have to do some documentation on persistent storage between restarts.

I'm also curious why is time between sessions important? (maybe for statistics matters?)
Attachment #804682 - Flags: review?(jduell.mcbugs)
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
(In reply to Patrick McManus [:mcmanus] from comment #0)
> how much http/https data is transferred day for different firefox
> deployments (desktop, fennec, b2g).. upstream and downstream.

FYI, AFAIK Telemetry is currently disabled on B2G.
Comment on attachment 804682 [details] [diff] [review]
telemetry for daily http data consumption

Review of attachment 804682 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpDataUsage.js
@@ +112,5 @@
> +        this._dataUsage.resetHttpDataUsage();
> +    },
> +
> +    // writeCounters also releases the lock
> +    writeCounters: function writeCounters(stream, result) {

looks like you should get rid of (stream, result) args--not used, nor passed in call site.

@@ +119,5 @@
> +            this._ethernetWritten.toString() + "," +
> +            this._cellRead.toString() + "," +
> +            this._cellWritten.toString() + "\n";
> +
> +        var buffer = new this._pipe(true, false, 0, 0, null);

nit: you do know the sizes involved, here, and they're much smaller than the average pipe (which allocates 16 buffers of 4K each), so IIUC you could save a little space/time here by passing {4096, 1} instead of {0, 0}.

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ +1917,5 @@
> +
> +NS_IMETHODIMP
> +nsHttpHandler::GetEthernetBytesRead(uint64_t *aEthernetBytesRead)
> +{
> +    MOZ_ASSERT(NS_IsMainThread());

Might be worth also returning error if !mainThread, for opt builds.

::: netwerk/protocol/http/nsHttpHandler.h
@@ +478,5 @@
>          mRequestTokenBucket = aTokenBucket;
>      }
>  
>  private:
> +// for nsIHttpDataUsage

indent the comment

@@ +490,3 @@
>      nsRefPtr<mozilla::net::Tickler> mWifiTickler;
> +    nsresult GetNetworkInfo(nsIInterfaceRequestor *cb,
> +                            bool *ethernet, uint32_t *gw);

Usage/implemention for this function is a little complex: ethernet/gw are optional parameters (but it doesn't make sense to call w/o one or the other); the function may fail if you ask for gateway on some platforms (B2G); callback object only used on android to get gateway, but function always fails if you don't pass it.  I suspect both the code and interface would be cleaner as separate GetIsEthernet and GetGateway functions.  Not an r+-breaker, but consider it. It really smells simpler to me.

::: netwerk/protocol/http/nsIHttpDataUsage.idl
@@ +5,5 @@
> +
> +#include "nsISupports.idl"
> +
> +/**
> + * nsIHttpDataUsage

Descriptive comment about class, please.  It's not apparent that this is per-day data, for example.

@@ +12,5 @@
> +[scriptable, uuid(79dee3eb-9323-4d5c-b0a8-1baa18934d9e)]
> +interface nsIHttpDataUsage : nsISupports
> +{
> +    readonly attribute unsigned long long ethernetBytesRead;
> +    readonly attribute unsigned long long ethernetBytesWritten;

worth noting in comment that "ethernet" includes wifi.

::: toolkit/components/telemetry/Histograms.json
@@ +868,5 @@
>    "HTTP_PAGELOAD_IS_SSL": {
>      "kind": "boolean",
>      "description": "Whether a HTTP base page load was over SSL or not."
>    },
> +  "HTTPDATA_ETHERNET_IN": {

Maybe put "PER_DAY" in names?  Last time I looked at telemetry UI the "description" field was never usefully shown, so if the name itself needed to be as descriptive as possible.
Attachment #804682 - Flags: review?(jduell.mcbugs) → review+
thanks jason!

(In reply to Jason Duell (:jduell) from comment #9)
> ::: netwerk/protocol/http/nsHttpHandler.cpp
> @@ +1917,5 @@
> > +
> > +NS_IMETHODIMP
> > +nsHttpHandler::GetEthernetBytesRead(uint64_t *aEthernetBytesRead)
> > +{
> > +    MOZ_ASSERT(NS_IsMainThread());
> 
> Might be worth also returning error if !mainThread, for opt builds.

I don't like to do more than the assert() unless I'm very wary of defensing against a bug. This case is pretty clear.

> 
> @@ +490,3 @@
> >      nsRefPtr<mozilla::net::Tickler> mWifiTickler;
> > +    nsresult GetNetworkInfo(nsIInterfaceRequestor *cb,
> > +                            bool *ethernet, uint32_t *gw);
> 
> Usage/implemention for this function is a little complex: 
[..]
> would be cleaner as separate GetIsEthernet and GetGateway functions. 

I tried to simplify this for a forthcoming update.. I ended up with one method for type and one for type+gw.. the latter is combined because I don't want to replicate the ugly series of QI twice when both data elements are wanted by the tickler. the latter also always fails on b2g and desktop. (I'm sure we could fill it in with platform specific code if there was a need). Some code is duplicated but the interface is better. meh :)
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/8edbe82f4f68 - looks like you probably wanted to wrap your nsINetworkManager.h include in #ifdef MOZ_WIDGET_GONK to avoid having it burn b2g desktop.
https://hg.mozilla.org/mozilla-central/rev/a906226a3865
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Depends on: 931856
You need to log in before you can comment on or make changes to this bug.