Closed
Bug 1122506
Opened 11 years ago
Closed 10 years ago
Report Hello conversation URL funnel
Categories
(Cloud Services :: Operations: Metrics/Monitoring, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: RT, Assigned: kparlante)
References
Details
User Story
Report Hello conversation URL funnel
Acceptance criteria:
[P1] Provide daily:
- Number of conversations created
- Number of conversation URLs copied/shared
- Unique conversation URLs clicked
- Unique conversations joined on the clicker side
- Number of occurrences of "2 users in a conversation at the same time"
[P2] Provide a graph showing the daily evolution of the above numbers
No description provided.
Reporter
Updated•11 years ago
Assignee
Comment 1•11 years ago
:whd, we want these counts (per day) in a json file: room_funnel.json
* room_created
This is a simple count of successful calls to "POST /rooms" (https://docs.services.mozilla.com/loop/apis.html#post-rooms). Note that we're not counting "POST /rooms/<token>", which is used for joining/leaving a room.
The equivalent kibana query is: "method:post AND path:rooms AND -token:* AND errno:0"
* room_url_clicked
We need to look at the nginx log for this one. The url that the user clicks looks like "GET https://hello.firefox.com/xxxxxxxxxxx" where "xxxxxxxxxxx" is the room token. It hits the static http server. One thing to look out for: we don't want to match any file extensions. We want to count the number of unique room tokens for this one.
* room_joined
We want to count the number of rooms that were joined by the link clicker. We want to count unique room tokens seen on "POST /rooms/<token>" with action=join. https://docs.services.mozilla.com/loop/apis.html#joining-the-room
kibana query: "method:post AND path:rooms AND action:join AND errno:0"
- count unique room tokens, in this case handily available in the "token" field
- the extra wrinkle here is that we only want to count "link clickers", not people who rejoin a room after having left the room. The "link clickers" should be using basic auth instead of hawk (https://docs.services.mozilla.com/loop/apis.html#rooms). I'm guessing that we'll have a uid for the hawk session users, but I can't test that out in kibana.
* two_in_room
We want to count the number of rooms that have ever had two participants in the room at the same time. Once 0.15 goes to production, we should have a "participants" field (https://bugzilla.mozilla.org/show_bug.cgi?id=1112379).
kibana query: "method:post AND path:rooms AND action:join AND errno:0 AND participants:2"
- unique room tokens, again found in "token"
Let's combine these into one file, like so:
[{"date":"2014-07-24","time_t":1406160000,"room_created":100,"room_url_clicked":99,"room_joined":98, "two_in_room":97},...]
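A minimal sketch of how one daily record in this format could be assembled, assuming "time_t" is the Unix timestamp of midnight UTC for the given date (this agrees with the example, where 2014-07-24 pairs with 1406160000). The helper name `funnel_record` and its `counts` argument are hypothetical and only serve the illustration.

```python
import calendar
import json
from datetime import date


def funnel_record(day, counts):
    """Build one daily entry for the funnel JSON (hypothetical helper)."""
    # Assumption: time_t is the Unix timestamp of midnight UTC for `day`.
    time_t = calendar.timegm(day.timetuple())
    record = {"date": day.isoformat(), "time_t": time_t}
    # Copy the four funnel counts, defaulting to 0 when a metric is missing.
    for key in ("room_created", "room_url_clicked", "room_joined", "two_in_room"):
        record[key] = counts.get(key, 0)
    return record


example = funnel_record(
    date(2014, 7, 24),
    {"room_created": 100, "room_url_clicked": 99, "room_joined": 98, "two_in_room": 97},
)
print(json.dumps([example]))
```

Defaulting a missing metric to 0 is a choice made for this sketch; a real pipeline might prefer to omit the key so that gaps stay visible.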
Assignee: kparlante → whd
Comment 2•11 years ago
* room_created
This is a cbuf filter with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] == '/rooms' && Fields[token] == NIL && Fields[errno] == 0
* room_url_clicked
This is a HLL filter on the request field with message matcher:
Logger == 'LoopWebserver' && Fields[request] =~ /^GET \/[^\.\/]{11}$/
* room_joined
This is a HLL filter on the token field with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0
Regarding the wrinkle, it appears all "join" actions are accompanied by a uid, so I can't tell from loop server logs which requests used basic auth and which did not. I'm going to start out implementing naively and we can fix it up from there.
* two_in_room
This is a simple HLL filter on the token field with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0 && Fields[participants] >= 2
I'm starting work on this in the loop_room_metrics branch of puppet-config.
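To illustrate the room_url_clicked matcher, here is a rough Python equivalent of its regular expression, with an exact set standing in for the HLL (HyperLogLog) filter, which only estimates the unique count. The sample request strings and the `unique_room_tokens` helper are made up for the example; the real filter runs in the logging pipeline, not in Python.

```python
import re

# Same pattern as the message matcher: "GET /" followed by exactly 11
# characters that are neither "." nor "/", so file requests are excluded.
ROOM_CLICK = re.compile(r"^GET /([^./]{11})$")


def unique_room_tokens(requests):
    """Return the set of distinct room tokens among the request strings."""
    tokens = set()
    for request in requests:
        match = ROOM_CLICK.match(request)
        if match:
            tokens.add(match.group(1))
    return tokens


sample = [
    "GET /abcdefghijk",   # room URL click
    "GET /abcdefghijk",   # same room again, counted once
    "GET /ZYXWVUTSRQP",   # a different room
    "GET /favicon.ico",   # 11 characters but contains ".", so not matched
    "GET /css/main.css",  # static asset, not matched
]
print(len(unique_room_tokens(sample)))
```

An exact set is fine for a small sample; an HLL trades a small error for bounded memory, which matters when counting over a full day of traffic.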
Comment 3•11 years ago
These filters are live and if the data looks good I can backfill the JSON at:
https://metrics.services.mozilla.com/loop-server-dashboard/data/loop_room_funnel.json
for all metrics except ones involving participants.
Assignee
Comment 4•11 years ago
I have this graphing locally, and it looks reasonable. Let's get it backfilled. Also, "two_in_room" data should be available now that "participants" is showing up.
Comment 5•11 years ago
For the "method:post AND path:rooms AND action:join AND participants:2" query in kibana, every response I'm seeing has errno 202 and code 400, so none of them are currently counted by the "two_in_room" filter (which requires errno 0). I can change the message matcher if need be, but it looks like something might be up app-side.
Assignee
Comment 6•11 years ago
202 is "Room Full": https://github.com/mozilla-services/loop-server/blob/master/loop/errno.json (there's a bug looking into the high-ish frequency of that error: https://bugzilla.mozilla.org/show_bug.cgi?id=1123588).
Looks like "participants" is the number of people in the room before the join, so let's change the matcher to:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0 && Fields[participants] >= 1
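The reasoning behind the ">= 1" threshold can be sketched as follows, assuming each log entry is a dict carrying the fields used in the matcher and that "participants" is the number of people already in the room before the join. The `is_two_in_room` and `two_in_room_count` helpers and the sample entries are hypothetical.

```python
def is_two_in_room(entry):
    """Mirror the updated matcher for a single loop-server log entry."""
    return (
        entry.get("method") == "post"
        and "rooms" in entry.get("path", "")
        and entry.get("action") == "join"
        and entry.get("errno") == 0
        # participants counts people present before this join, so one
        # existing participant plus the joiner makes two in the room.
        and entry.get("participants", 0) >= 1
    )


def two_in_room_count(entries):
    """Count distinct room tokens that ever had two people at once."""
    return len({e["token"] for e in entries if is_two_in_room(e)})


entries = [
    # First person joins an empty room: not yet two in the room.
    {"method": "post", "path": "/rooms/aaa", "action": "join", "errno": 0, "participants": 0, "token": "aaa"},
    # Second person joins the same room: this one counts.
    {"method": "post", "path": "/rooms/aaa", "action": "join", "errno": 0, "participants": 1, "token": "aaa"},
    # A room that only ever had one person.
    {"method": "post", "path": "/rooms/bbb", "action": "join", "errno": 0, "participants": 0, "token": "bbb"},
    # A rejected join ("Room Full", errno 202): excluded by errno == 0.
    {"method": "post", "path": "/rooms/ccc", "action": "join", "errno": 202, "participants": 2, "token": "ccc"},
]
print(two_in_room_count(entries))
```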
Comment 7•11 years ago
The matcher has been updated.
Comment 8•11 years ago
The data has been backfilled. In this pass I added the filter "Fields[user_agent_browser] != NIL" as we have with some other metrics to discount load tests.
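As a sketch of that extra condition, assuming log entries are dicts and that a missing or `None` "user_agent_browser" corresponds to NIL in the matcher; `drop_load_tests` is a hypothetical helper, not part of the actual filter code.

```python
def drop_load_tests(entries):
    """Keep only entries whose user agent parsed into a known browser bucket."""
    return [e for e in entries if e.get("user_agent_browser") is not None]


entries = [
    {"token": "aaa", "user_agent_browser": "Firefox"},
    {"token": "bbb", "user_agent_browser": None},  # unrecognized agent
    {"token": "ccc"},                               # field absent entirely
]
kept = drop_load_tests(entries)
print([e["token"] for e in kept])
```

Note that this drops every client whose user agent falls outside the recognized buckets, not only the load tester.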
Assignee
Comment 9•11 years ago
This is now live on the dashboard: https://metrics.services.mozilla.com/loop-server-dashboard/
Missing:
- Number of conversation URLs Copied/Shared (should come from FHR+Telemetry, ideally working with Saptarshi to make sure we have a count that we can compare to the others in the funnel)
- Ideally, correlate with Tokbox data to know whether or not a successful audio/video connection happened.
- Not on a separate "analytics" dashboard
Romain, let me know if you'd like any of the text tweaked for terminology consistency.
Assignee
Updated•11 years ago
Assignee: whd → kparlante
Reporter
Comment 10•11 years ago
Mark, the shared URLs on Telemetry seem broken, is there a bug tracking this already?
Or is this something that should be part of the FHR work?
Flags: needinfo?(standard8)
Reporter
Comment 11•11 years ago
(In reply to Katie Parlante from comment #9)
> This is now live on the dashboard:
> https://metrics.services.mozilla.com/loop-server-dashboard/
>
> Missing:
> - Number of conversation URLs Copied/Shared (should come from FHR+Telemetry,
> ideally working with Saptarshi to make sure we have a count that we can
> compare to the others in the funnel)
> - Ideally, correlate with Tokbox data to know whether or not a successful
> audio/video connection happened.
> - Not on a separate "analytics" dashboard
>
> Romain, let me know if you'd like any of the text tweaked for terminology
> consistency.
Thanks Katie, this all looks good!
Reporter
Comment 12•11 years ago
The spike on Jan 27th is related to the load tests?
Any way to take out the data related to the load tests?
Flags: needinfo?(kparlante)
Assignee
Comment 13•11 years ago
(In reply to Romain Testard [:RT] from comment #12)
> The spike on Jan 27th is related to the load tests?
> Any way to take out the data related to the load tests?
The room funnel data has the load test spike filtered out, as do most of the call metrics.
The metrics on the graph that have not been filtered are "active daily" (unique users that hit any endpoint) and the call setup stats that come from websocket logging. Those are trickier to filter, as we don't have the user agents on the log -- we have to associate the callId with other endpoints and look at those user agents. I'll log a separate bug for that one. It might be prudent to just asterisk the data.
Flags: needinfo?(kparlante)
Assignee
Comment 14•11 years ago
Tracked here: https://bugzilla.mozilla.org/show_bug.cgi?id=1127564
Comment 15•11 years ago
(In reply to Wesley Dawson [:whd] from comment #8)
> The data has been backfilled. In this pass I added the filter
> "Fields[user_agent_browser] != NIL" as we have with some other metrics to
> discount load tests.
If we filter this out permanently are we going to hide from ourselves if other folks are writing clients and using them with Hello? (I don't expect this to be a high usage, but you never know). Or if we do filter out, maybe we should make sure we can track it somewhere...
Comment 16•11 years ago
(In reply to Romain Testard [:RT] from comment #10)
> Mark, the shared URLs on Telemetry seem broken, is there a bug tracking this
> already?
> Or is this something that should be part of the FHR work?
I've filed bug 1127574 on this. I'm sure we've discussed it before - we never took telemetry info across to call urls, though I think it's different info tbh.
I think we should keep it separate from the fhr work at the moment - we'll more likely get something out sooner.
Flags: needinfo?(standard8)
Comment 17•11 years ago
(In reply to Mark Banner (:standard8) from comment #15)
> If we filter this out permanently are we going to hide from ourselves if
> other folks are writing clients and using them with Hello? (I don't expect
> this to be a high usage, but you never know). Or if we do filter out, maybe
> we should make sure we can track it somewhere...
This will potentially hide information about folks writing clients from the high-level dashboard (such as the "load test" client...). We still keep this info in kibana.
Alternatively we know what the non-standard user agent we use with our load tests is, and we can filter out that agent explicitly instead.
Assignee
Comment 18•11 years ago
(In reply to Mark Banner (:standard8) from comment #15)
> (In reply to Wesley Dawson [:whd] from comment #8)
> > The data has been backfilled. In this pass I added the filter
> > "Fields[user_agent_browser] != NIL" as we have with some other metrics to
> > discount load tests.
>
> If we filter this out permanently are we going to hide from ourselves if
> other folks are writing clients and using them with Hello? (I don't expect
> this to be a high usage, but you never know). Or if we do filter out, maybe
> we should make sure we can track it somewhere...
We're not filtering it from elasticsearch/kibana or any ops team monitoring, we're just filtering it out of the aggregates for the custom dashboard. The specter of some third party client hitting the hello server endpoints makes me wonder if we should send a flag with each endpoint indicating the application the HTTP or websocket request is on behalf of. FxA has a service=sync, for example. That could also help when we need to distinguish standalone from desktop. Just a thought.
Assignee
Comment 19•11 years ago
(In reply to Wesley Dawson [:whd] from comment #17)
> Alternatively we know what the non-standard user agent we use with our load
> tests is, and we can filter out that agent explicitly instead.
One detail to explain here: the raw user agent is not passed along to kibana right now, in part because it can be used as PII. The user agent field is parsed and put into common buckets for user_agent_browser, user_agent_os, user_agent_version. Looks like the load tester shows up as user_agent_os=Linux but the other fields don't fall into a recognized bucket. I think whd's proposal is that we modify that bucketing logic to specifically identify the load tester (user_agent_browser=loads) and then filter on that. Which is an excellent idea.
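A rough sketch of what that bucketing change could look like. Everything here is an assumption for illustration: the real load-test user agent string is not given in this thread, so `LOAD_TEST_MARKER` is a placeholder, and `KNOWN_BROWSERS` is an illustrative subset rather than the actual bucket list.

```python
# Placeholder only: the actual load-test user agent is not stated in this bug.
LOAD_TEST_MARKER = "example-load-tester"

# Illustrative subset of browser buckets, not the real parsing rules.
KNOWN_BROWSERS = ("Firefox", "Chrome")


def bucket_browser(raw_user_agent):
    """Map a raw user agent to a coarse bucket, tagging the load tester."""
    if LOAD_TEST_MARKER in raw_user_agent:
        return "loads"
    for name in KNOWN_BROWSERS:
        if name in raw_user_agent:
            return name
    return None  # unrecognized agents stay unbucketed (NIL)


def is_load_test(entry):
    """True when an already-bucketed log entry came from the load tester."""
    return entry.get("user_agent_browser") == "loads"


print(bucket_browser("example-load-tester/1.0 (Linux)"))
print(bucket_browser("Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0"))
print(bucket_browser("SomeThirdPartyClient/0.1"))
```

With an explicit "loads" bucket, the dashboard aggregates could exclude only the load tester while unrecognized third-party clients remain countable.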
Comment 20•10 years ago
I think this is done. Reopen if necessary.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED