Closed Bug 1003174 Opened 10 years ago Closed 10 years ago

Server needs to report an active user

Categories

(Hello (Loop) :: Server, defect, P1)

Tracking

(Not tracked)

VERIFIED FIXED
mozilla33

People

(Reporter: RT, Assigned: tarek)

References

Details

(Whiteboard: [qa+] p=?)

User Story

As a product manager I want to know daily the number of unique users, weekly recurring users, fortnightly recurring users and monthly recurring users so that I know the service user adoption both on FFOS and FF desktop.
      No description provided.
Summary: Client needs to report the an active user → Client needs to report an active user
User Story: (updated)
Priority: -- → P3
Target Milestone: --- → mozilla33
Stable ID will allow identifying unique users as part of FHR; we may be able to leverage this.
Priority: P3 → P1
A user is defined as someone who has a successful call.
Whiteboard: [s=fx33]
Whiteboard: [s=fx33] → p=?
Believe we can gather from server rather than client.
Flags: needinfo?(alexis+bugs)
Once the server knows whether a call was accepted or rejected, we will have a way to build this on the server, yes.

I'm marking this bug as a duplicate of the "server reporting" one.
No longer blocks: 972031
Status: NEW → RESOLVED
Closed: 10 years ago
Component: Client → Server
Depends on: 1003170
Flags: needinfo?(alexis+bugs)
Resolution: --- → DUPLICATE
Summary: Client needs to report an active user → Server needs to report an active user
How will you identify unique and recurring users on the server?
Flags: needinfo?(alexis+bugs)
I believe this is the work of the aggregator to tell which users are recurring and which ones are unique. The server will just say that user-<uuid> was active, and then the aggregator will aggregate the information the right way.
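
A minimal sketch of the aggregator-side counting described above, assuming the server emits one (day, uid) event per activity. The helper names and the definition of "recurring" (active on the given day and on at least one earlier day inside the window) are assumptions for illustration; the thread does not define the term.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_unique(events):
    """Map each day to the set of uids active on that day."""
    by_day = defaultdict(set)
    for day, uid in events:
        by_day[day].add(uid)
    return by_day


def recurring_users(by_day, day, window_days):
    """Uids active on `day` that were also active earlier in the window.

    Assumption: "recurring" means seen on `day` and at least once in the
    preceding `window_days - 1` days (7 = weekly, 14 = fortnightly).
    """
    earlier = set()
    for offset in range(1, window_days):
        earlier |= by_day.get(day - timedelta(days=offset), set())
    return by_day.get(day, set()) & earlier


events = [
    (date(2014, 7, 1), "user-a"),
    (date(2014, 7, 1), "user-b"),
    (date(2014, 7, 5), "user-a"),
    (date(2014, 7, 5), "user-a"),  # duplicate events count once
    (date(2014, 7, 5), "user-c"),
]
by_day = daily_unique(events)
print(len(by_day[date(2014, 7, 5)]))                 # daily unique users
print(recurring_users(by_day, date(2014, 7, 5), 7))  # weekly recurring users
```

Keeping exact sets per day is simple but grows with the user count, which is why an approximate counter may be preferable at scale.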
Flags: needinfo?(alexis+bugs)
After discussion with Alexis, the server will store the uuid and the number of links generated.
It will actually send these to statsd and not store them.
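
As a rough illustration of sending these values to statsd, the sketch below formats a counter and a set metric in the statsd line protocol and sends them over UDP. The bucket names, the uid value, and the choice of a statsd set for the uuid are assumptions; the thread does not say which metric types or names loop-server uses.

```python
import socket


def statsd_counter(bucket, value=1):
    """Format a statsd counter increment, e.g. 'bucket:1|c'."""
    return f"{bucket}:{value}|c"


def statsd_set(bucket, member):
    """Format a statsd set update; statsd counts distinct members per flush."""
    return f"{bucket}:{member}|s"


def send(lines, host="127.0.0.1", port=8125):
    """Send metric lines to a statsd daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("utf-8"), (host, port))


# Hypothetical bucket names and uid; not taken from loop-server.
metrics = [
    statsd_counter("loop.links_generated"),
    statsd_set("loop.active_users", "user-1234"),
]
print(metrics)
```

Calling `send(metrics)` would emit both lines to a daemon listening on the default statsd port.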
Depends on: 1024920
No longer depends on: 1003170
No longer depends on: 1024920
Blocks: 1024920
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
This has to be reported per user agent, i.e. Windows, Linux, Mac, FFOS, ...
Adding QA and Ops
Whiteboard: p=? → [qa+] p=?
We can do this with statsd, but it would be better to do it with heka. The latest version supports bloom filters and HyperLogLog for counting unique items. We already use HyperLogLog in production. It consumes very little memory and can count billions of unique items with a very low probability of error.

Heka does this by looking for specific events in the log stream and incrementing a counter when an event is deemed unique.
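
To make the HyperLogLog idea concrete, here is a small self-contained sketch of the algorithm. It is not heka's implementation; the class name, the SHA-1 based hash, and the register count are choices made for illustration only.

```python
import hashlib
import math


class HyperLogLog:
    """Small HyperLogLog counter with 2**p registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)               # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)


hll = HyperLogLog()
for i in range(20000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # repeated uids do not change the estimate
print(hll.count())  # approximate number of distinct uids
```

With 4096 registers the state stays a few kilobytes regardless of how many uids are added, at the cost of an estimate that is typically off by a percent or two.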
Benson, I'm not sure I understand.

We're currently using heka as a transport for statsd, so wouldn't it work out of the box for us?
Can you point us to an example of such a log consumed by heka/HyperLogLog?
Flags: needinfo?(bwong)
More context/info: 

The fxa-auth-server generates two log streams (to stdout/stderr): the nginx logs and the "application" logs. In particular, it logs a summary line for every request. Example:

{"name":"fxa-auth-server","hostname":"ip-10-220-156-235","pid":29233,"level":30,"op":"request.summary","code":200,"errno":0,"rid":"1403895904787-29233-33476","path":"/auth/v1/account/destroy","lang":"en-US,en;q=0.5","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:30.0) Gecko/20100101 Firefox/30.0","remoteAddressChain":["75.37.29.214","10.245.46.178","127.0.0.1"],"t":579,"email":"kparlante@mozilla.com","msg":"","time":"2014-06-27T19:05:05.366Z","v":0}

Heka has a logstreamer input that takes the log data from all of the nodes running the servers and sends it to heka nodes, which feed it to a heka aggregator. The heka nodes & aggregator do a few things:
- filters out pii
- parses User Agent string
- uses hyperloglog algorithm for counting unique daily uids on one particular endpoint (/certificate/sign)
- displays realtime graphs for some endpoints/measures (like the one benson linked to)
- triggers email alerts (mostly anomaly detection, also a couple of fraud/abuse related alerts)
- outputs pii-stripped data to elastic search, which can then be viewed by/explored with Kibana dashboards
- outputs log data to file for long term storage (eventually to canonical data store)
- outputs a json file to be used by "all mozilla" dashboard
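
As an illustration of the PII-filtering step in the list above, the sketch below drops selected fields from a parsed summary line. Which fields count as PII is an assumption based on the sample log line, and this is plain Python rather than an actual heka filter.

```python
import json

# Assumption: these two fields from the sample line are treated as PII.
PII_FIELDS = {"email", "remoteAddressChain"}


def strip_pii(record):
    """Return a copy of a parsed log record without the PII fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


line = (
    '{"op":"request.summary","code":200,"path":"/auth/v1/account/destroy",'
    '"email":"user@example.com","remoteAddressChain":["127.0.0.1"],"t":579}'
)
clean = strip_pii(json.loads(line))
print(json.dumps(clean))
```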

Heka and Kibana dashboards are only available to qa/ops/dev working on servers, for privacy/data security reasons. (Stackdriver pulls data from aws/cloudwatch, so similar to nginx logs + cpu/memory utilization etc.)

Here's the dashboard that's visible to everyone at mozilla: https://metrics.fxa.us-west-2.prod.mozaws.net/accounts-dashboard/

The plan is to do a similar dashboard for loop, to meet this use case. I'm assuming the plan is also to have heka and kibana dashboards, alerts, stackdriver, etc. similar to what we're doing for fxa & sync.

More info on the fxa-auth summary log line: https://github.com/mozilla/fxa-auth-server/pull/565
Thanks for all the info. 

> The plan is to do a similar dashboard for loop, to meet this use case. I'm assuming the plan is also to 
> have heka and kibana dashboards, alerts, stackdriver, etc. similar to what we're doing for fxa & sync.

Do you have an idea of how long it would take to set everything up for this reporting system?
Flags: needinfo?(kparlante)
(once we provide all the logging info)
Please note that after discussions with TEF, we need this statistic both for Desktop Loop users and FFOS Loop users in order to be able to differentiate service take-up on both platforms.
Please confirm if you will have the client identifier necessary to provide this.
User Story: (updated)
Hello Katie,

I am looking at the log function we are writing to match the Heka log stream protocol.

You said we need to return something like:

{
    "name": "fxa-auth-server",
    "hostname": "ip-10-220-156-235",
    "pid": 29233,
    "level": 30,
    "op": "request.summary",
    "code": 200,
    "errno": 0,
    "rid": "1403895904787-29233-33476",
    "path": "/auth/v1/account/destroy",
    "lang": "en-US,en;q=0.5",
    "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:30.0) Gecko/20100101 Firefox/30.0",
    "remoteAddressChain": [
        "75.37.29.214",
        "10.245.46.178",
        "127.0.0.1"
    ],
    "t": 579,
    "email": "kparlante@mozilla.com",
    "msg": "",
    "time": "2014-06-27T19:05:05.366Z",
    "v": 0
}

Do you know what the v parameter is?

What is the email for? Can we provide the user unique Id instead?
Also, do you know what the level parameter is?

Can we add other fields to it, or are we tied to these fields?

Thank you very much.
User Story: (updated)
I added a requirement for "monthly recurring" statistics to the user story, given that most communication services share data on their monthly recurring users, so we can compare.
Hi Rémy,

> Do you know what the v parameter is?

This is the version of the log format, in case it changes later. You can just set it to 1.

> What is the email for? Can we provide the user unique Id instead?

"email" was specific to the auth server. You should include whatever fields you want to log for each loop event that you are logging.

Yes, provide the unique user id. Using the identifier "uid" is probably a good idea; I think "uuid" is used to uniquely identify the heka message.

> Also, do you know what the level parameter is?

"level" is used to rank the importance of different log entries. You can ignore it if you don't need/want to rank types of log entries.

> Can we add other fields to it, or are we tied to these fields?

Yes, you should add other fields. 

> Thank you very much.
yw
Flags: needinfo?(kparlante)
Assignee: nobody → tarek
(In reply to Romain Testard [:RT] from comment #18)
> Please note that after discussions with TEF, we need this statistic both for
> Desktop Loop users and FFOS Loop users in order to be able to differentiate
> service take-up on both platforms.
> Please confirm if you will have the client identifier necessary to provide
> this.

Fernando, how do we distinguish between desktop users and mobile users?
Flags: needinfo?(ferjmoreno)
^ is the user-agent good enough ?
(In reply to Rémy Hubscher (:natim) from comment #19)
> I am looking at the log function we are doing to match the Heka log stream
> protocol.

Thanks! 

Please synchronize with me if you are going to work on this so we don't duplicate efforts.
Rémy & Tarek,

I should list more clearly which fields are required/important for heka processing:

- "time" : this should be a UTC timestamp 
- "op" : type of log entry, you can have multiple types if you want/need. We might filter or process types differently
- "hostname" : os.hostname() -- allows qa/ops to notice problems on specific aws instances
- "v" : should be 1, allows us to process different log versions differently in the future
- "lang" : allows segmentation of the data by locale
- "agent" : user agent, allows segmentation of the data by desktop/fxos, segmentation by browser

Other conventions that might be useful to follow:

- "uid" : as you noted, we'll need uid for the active daily, recurring, etc. counts
- "errno" : other servers have found it useful to have a number for each error type. 0 for success.
- "path" : request url, assuming the event being logged is a request that is being handled

And of course any other fields that might be useful specifically for loop.
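
The field list above could be assembled into a summary line roughly as follows. This is a sketch, not the loop-server code: the function name, the example path, and the extra `callId` field are hypothetical, and `socket.gethostname()` stands in for `os.hostname()`.

```python
import json
import socket
from datetime import datetime, timezone


def summary_line(op, path, uid, agent, lang, errno=0, **extra):
    """Build one JSON log line carrying the fields listed for heka."""
    now = datetime.now(timezone.utc)
    record = {
        # UTC timestamp with millisecond precision, e.g. 2014-06-27T19:05:05.366Z
        "time": now.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z",
        "op": op,
        "hostname": socket.gethostname(),
        "v": 1,          # log format version
        "lang": lang,
        "agent": agent,
        "uid": uid,
        "errno": errno,  # 0 for success
        "path": path,
    }
    record.update(extra)  # any loop-specific fields
    return json.dumps(record)


line = summary_line(
    op="request.summary",
    path="/calls",  # hypothetical path
    uid="user-1234",
    agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:30.0) Gecko/20100101 Firefox/30.0",
    lang="en-US,en;q=0.5",
    callId="abc",   # hypothetical loop-specific field
)
print(line)
```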
(In reply to Tarek Ziadé (:tarek) from comment #22)
> (In reply to Romain Testard [:RT] from comment #18)
> > Please note that after discussions with TEF, we need this statistic both for
> > Desktop Loop users and FFOS Loop users in order to be able to differentiate
> > service take-up on both platforms.
> > Please confirm if you will have the client identifier necessary to provide
> > this.
> 
> Fernando, how do we distinguish between desktop users and mobile users?

We're using user agent for device segmentations with other servers (to distinguish between Desktop/FxOS/Android). The user agent from FxOS should look like this: https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference#Firefox_OS
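
A minimal sketch of this kind of segmentation, assuming the Gecko user agent formats documented on the page linked above (Firefox OS sends "Mobile" or "Tablet" as its platform token with no OS name). The function name and the three bucket labels are illustrative, not the ones used by the heka filters.

```python
import re


def device_class(user_agent):
    """Classify a Gecko user agent string as android, fxos, or desktop."""
    if "Android" in user_agent:
        return "android"
    # Firefox OS: the parenthesised platform section starts with Mobile or Tablet.
    if re.search(r"\((Mobile|Tablet);", user_agent):
        return "fxos"
    return "desktop"


samples = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:30.0) Gecko/20100101 Firefox/30.0",
    "Mozilla/5.0 (Mobile; rv:26.0) Gecko/26.0 Firefox/26.0",
    "Mozilla/5.0 (Android; Mobile; rv:26.0) Gecko/26.0 Firefox/26.0",
]
for ua in samples:
    print(device_class(ua), "<-", ua)
```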
(In reply to Katie Parlante from comment #25)
> I should list more clearly which fields are required/important for heka processing

Thanks. The final list should not be that different from what you have in fxa. I guess agent/uid/path are the 3 main ones to be able to build aggregated views.

I'll push a first version this week so that ops, and then you, can hook it up.
(In reply to Tarek Ziadé (:tarek) from comment #23)
> ^ is the user-agent good enough ?

Yes, I believe the UA should be enough.
Flags: needinfo?(ferjmoreno)
Depends on: 1036059
Depends on: 1036069
Depends on: 1036073
https://github.com/mozilla-services/loop-server/commit/61da2e98999f74eefe29b36d6d45c61eeb678631
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
I don't see the required stats on https://metrics.fxa.us-west-2.prod.mozaws.net/accounts-dashboard/
Is there another dashboard to access it?
I don't think it should be on Firefox Account stats page.
Yeah, that's not the right dashboard.

:mostlygeek can you help here?
Flags: needinfo?(bwong)
We don't have a high level dashboard for this yet. I'm working on total users and JSON output (should be done early this week), and :trink is working on the recurring users (hopefully by end of week). We're targeting having a dashboard available before Beta which I believe is the beginning of September. At least one app code change needs to go out before that (see bug #1046236).
Flags: needinfo?(bwong)
OK, will hold off on verifying this bug until then.
FWIW, the bug for high level dashboard itself is here: https://bugzilla.mozilla.org/show_bug.cgi?id=1036059
(In reply to Wesley Dawson [:whd] from comment #33)
> We don't have a high level dashboard for this yet. I'm working on total
> users and JSON output (should be done early this week), and :trink is
> working on the recurring users (hopefully by end of week). We're targeting
> having a dashboard available before Beta which I believe is the beginning of
> September. At least one app code change needs to go out before that (see bug
> #1046236).

Thanks Wesley. We have a product review early next week. If you can, please e-mail me any data you have on total users and recurring users (rough current numbers or, ideally, some form of historical data) so I can get an idea.
Where are we with this bug?
This does not feel "Resolved" to me...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Indeed, the latest metrics code is running in production as of yesterday. :jbonacci I would say this bug can be closed as it is mostly tracking the server-side logging, which is in place.

The static prototype dashboard fueled by this logging is currently living at https://metrics.fxa.us-west-2.prod.mozaws.net/loop-server-dashboard/ but that has its own tracking bugs.
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Thank you.
Status: RESOLVED → VERIFIED