Closed Bug 1122506 Opened 11 years ago Closed 10 years ago

Report Hello conversation URL funnel

Category: Cloud Services :: Operations: Metrics/Monitoring
Type: task
Priority: Not set
Severity: normal
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: (Reporter: RT, Assigned: kparlante)
User Story

Report Hello conversation URL funnel

Acceptance criteria:
[P1] Daily provide:
Number of conversations created
Number of conversation URLs Copied/Shared
Unique conversation URLs clicked
Unique conversation joined on the clicker side
Number of occurrences of “2 users in a conversation at the same time”
[P2] Provide graph showing evolution of the above numbers daily
No description provided.
Assignee: nobody → kparlante
Blocks: 1122505
User Story: (updated)
:whd, we want these counts (per day) in a json file: room_funnel.json

* room_created
This is a simple count of successful calls to "POST /rooms" (https://docs.services.mozilla.com/loop/apis.html#post-rooms). Note that we're not counting "POST /rooms/<token>", which is used for joining/leaving a room.
The equivalent kibana query is: "method:post AND path:rooms AND -token:* AND errno:0"

* room_url_clicked
We need to look at the nginx log for this one. The URL that the user clicks looks like "GET https://hello.firefox.com/xxxxxxxxxxx", where "xxxxxxxxxxx" is the room token. It hits the static http server. One thing to look out for: we don't want to match any file extensions. We want to count the number of unique room tokens for this one.

* room_joined
We want to count the number of rooms that were joined by the link clicker: unique room tokens seen on "POST /rooms/<token>" with action=join (https://docs.services.mozilla.com/loop/apis.html#joining-the-room).
kibana query: "method:post AND path:rooms AND action:join AND errno:0" - count unique room tokens, in this case handily available in the "token" field.
The extra wrinkle here is that we only want to count "link clickers", not people who rejoin a room after having left it. The "link clickers" should be using basic auth instead of hawk (https://docs.services.mozilla.com/loop/apis.html#rooms). I'm guessing that we'll have a uid for the hawk session users, but I can't test that out in kibana.

* two_in_room
We want to count the number of rooms that have ever had two participants in the room at the same time. Once 0.15 goes to production, we should have a "participants" field (https://bugzilla.mozilla.org/show_bug.cgi?id=1112379).
kibana query: "method:post AND path:rooms AND action:join AND errno:0 AND participants:2" - unique room tokens, again found in "token".

Let's combine these into one file, so:
[{"date":"2014-07-24","time_t":1406160000,"room_created":100,"room_url_clicked":99,"room_joined":98,"two_in_room":97},...]
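The combined daily record described above could be assembled along these lines (a sketch only; field names follow the example in the spec, the counts are illustrative, and the producer of the per-metric counts is out of scope here):

```python
import json
from datetime import datetime, timezone

def make_funnel_record(date_str, counts):
    """Build one daily entry for room_funnel.json (field names from the spec above)."""
    # time_t is midnight UTC of the given day, as a Unix timestamp
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    record = {"date": date_str, "time_t": int(dt.timestamp())}
    record.update(counts)
    return record

# Illustrative counts, matching the example row in the spec
records = [make_funnel_record("2014-07-24",
    {"room_created": 100, "room_url_clicked": 99,
     "room_joined": 98, "two_in_room": 97})]
print(json.dumps(records))
```

Note that midnight UTC on 2014-07-24 comes out to 1406160000, matching the `time_t` in the example row.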
Assignee: kparlante → whd
* room_created
This is a cbuf filter with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] == '/rooms' && Fields[token] == NIL && Fields[errno] == 0

* room_url_clicked
This is a HLL filter on the request field with message matcher:
Logger == 'LoopWebserver' && Fields[request] =~ /^GET \/[^\.\/]{11}$/

* room_joined
This is a HLL filter on the token field with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0
Regarding the wrinkle: it appears all "join" actions are accompanied by a uid, so I can't tell from loop server logs which requests used basic auth and which did not. I'm going to start out implementing naively and we can fix it up from there.

* two_in_room
This is a simple HLL filter on the token field with message matcher:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0 && Fields[participants] >= 2

I'm starting work on this in the loop_room_metrics branch of puppet-config.
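The room_url_clicked matcher hinges entirely on that request regex, so its intent is worth spelling out: exactly 11 characters after the slash, with no dot (to exclude file extensions) and no further slash (to exclude sub-paths). A quick sketch with the same pattern and some illustrative request strings:

```python
import re

# Same pattern as the HLL filter's message matcher above: "GET /" followed
# by exactly 11 characters containing neither "." nor "/".
ROOM_URL = re.compile(r"^GET /[^./]{11}$")

assert ROOM_URL.match("GET /aBcDeFgHiJk")        # an 11-char room token: counted
assert not ROOM_URL.match("GET /favicon.ico")    # 11 chars but has a dot: skipped
assert not ROOM_URL.match("GET /css/site.css")   # sub-path: skipped
assert not ROOM_URL.match("GET /short")          # wrong length: skipped
```

"favicon.ico" is a useful edge case: it is exactly 11 characters long, so the length check alone would count it; the "no dot" character class is what keeps file fetches out of the unique-token count.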
These filters are live and if the data looks good I can backfill the JSON at: https://metrics.services.mozilla.com/loop-server-dashboard/data/loop_room_funnel.json for all metrics except ones involving participants.
I have this graphing locally and it looks reasonable. Let's get it backfilled. Also, "two_in_room" data should be available now that "participants" is showing up.
For the "method:post AND path:rooms AND action:join AND participants:2" query in kibana, I'm seeing only errno 202 and HTTP 400 code responses, so those aren't currently being counted in the "two_in_room" filter (which requires errno 0). I can change the message matcher if need be, but it looks like something might be up app-side.
202 is "Room Full": https://github.com/mozilla-services/loop-server/blob/master/loop/errno.json (there's a bug looking into the high-ish frequency of that error: https://bugzilla.mozilla.org/show_bug.cgi?id=1123588).
It looks like "participants" is the number of people in the room before the join, so let's change the matcher to:
Logger == 'mozilla-loop-server' && Fields[method] == 'post' && Fields[path] =~ /rooms/ && Fields[action] == 'join' && Fields[errno] == 0 && Fields[participants] >= 1
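The "before the join" semantics are what drive the >= 1 threshold: the second successful joiner arrives with participants == 1, and that is the moment two people are in the room. A toy illustration with hypothetical join events:

```python
# Hypothetical join events for one room. "participants" is the count
# *before* each join, so the second successful join carries
# participants == 1 -- that's when the room first has two people in it.
joins = [
    {"token": "aBcDeFgHiJk", "errno": 0, "participants": 0},  # first joiner
    {"token": "aBcDeFgHiJk", "errno": 0, "participants": 1},  # second joiner
]

# Unique rooms that have ever had two participants at the same time,
# mirroring the message matcher's errno == 0 && participants >= 1 test.
two_in_room = {j["token"] for j in joins
               if j["errno"] == 0 and j["participants"] >= 1}
print(len(two_in_room))  # 1
```

With the earlier participants:2 threshold, neither event would match and this room would be missed entirely.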
The matcher has been updated.
The data has been backfilled. In this pass I added the filter "Fields[user_agent_browser] != NIL" as we have with some other metrics to discount load tests.
This is now live on the dashboard: https://metrics.services.mozilla.com/loop-server-dashboard/

Missing:
- Number of conversation URLs Copied/Shared (should come from FHR+Telemetry, ideally working with Saptarshi to make sure we have a count that we can compare to the others in the funnel)
- Ideally, correlate with Tokbox data to know whether or not a successful audio/video connection happened.
- Not on a separate "analytics" dashboard

Romain, let me know if you'd like any of the text tweaked for terminology consistency.
Assignee: whd → kparlante
Mark, the shared URLs on Telemetry seem broken, is there a bug tracking this already? Or is this something that should be part of the FHR work?
Flags: needinfo?(standard8)
(In reply to Katie Parlante from comment #9) > This is now live on the dashboard: > https://metrics.services.mozilla.com/loop-server-dashboard/ > > Missing: > - Number of conversation URLs Copied/Shared (should come from FHR+Telemetry, > ideally working with Saptarshi to make sure we have a count that we can > compare to the others in the funnel) > - Ideally, correlate with Tokbox data to know whether or not a successful > audio/video connection happened. > - Not on a separate "analytics" dashboard > > Romain, let me know if you'd like any of the text tweaked for terminology > consistency. Thanks Katie, this all looks good!
Is the spike on Jan 27th related to the load tests? Is there any way to take out the data related to the load tests?
Flags: needinfo?(kparlante)
(In reply to Romain Testard [:RT] from comment #12) > The spike on Jan 27th is related to the load tests? > Any way to take out the data related to the load tests? The room funnel data has the load test spike filtered out, as do most of the call metrics. The metrics on the graph that have not been filtered are "active daily" (unique users that hit any endpoint) and the call setup stats that come from websocket logging. Those are trickier to filter, as we don't have the user agents on the log -- we have to associate the callId with other endpoints and look at those user agents. I'll log a separate bug for that one. It might be prudent to just asterisk the data.
Flags: needinfo?(kparlante)
(In reply to Wesley Dawson [:whd] from comment #8) > The data has been backfilled. In this pass I added the filter > "Fields[user_agent_browser] != NIL" as we have with some other metrics to > discount load tests. If we filter this out permanently are we going to hide from ourselves if other folks are writing clients and using them with Hello? (I don't expect this to be a high usage, but you never know). Or if we do filter out, maybe we should make sure we can track it somewhere...
Depends on: 1127574
(In reply to Romain Testard [:RT] from comment #10) > Mark, the shared URLs on Telemetry seem broken, is there a bug tracking this > already? > Or is this something that should be part of the FHR work? I've filed bug 1127574 on this. I'm sure we've discussed it before - we never took telemetry info across to call URLs, though I think it's different info, to be honest. I think we should keep it separate from the FHR work at the moment - we're more likely to get something out sooner.
Flags: needinfo?(standard8)
(In reply to Mark Banner (:standard8) from comment #15) > If we filter this out permanently are we going to hide from ourselves if > other folks are writing clients and using them with Hello? (I don't expect > this to be a high usage, but you never know). Or if we do filter out, maybe > we should make sure we can track it somewhere... This will potentially hide information about folks writing clients from the high-level dashboard (such as the "load test" client...). We still keep this info in kibana. Alternatively we know what the non-standard user agent we use with our load tests is, and we can filter out that agent explicitly instead.
(In reply to Mark Banner (:standard8) from comment #15) > (In reply to Wesley Dawson [:whd] from comment #8) > > The data has been backfilled. In this pass I added the filter > > "Fields[user_agent_browser] != NIL" as we have with some other metrics to > > discount load tests. > > If we filter this out permanently are we going to hide from ourselves if > other folks are writing clients and using them with Hello? (I don't expect > this to be a high usage, but you never know). Or if we do filter out, maybe > we should make sure we can track it somewhere... We're not filtering it from elasticsearch/kibana or any ops team monitoring, we're just filtering it out of the aggregates for the custom dashboard. The specter of some third party client hitting the hello server endpoints makes me wonder if we should send a flag with each endpoint indicating the application the HTTP or websocket request is on behalf of. FxA has a service=sync, for example. That could also help when we need to distinguish standalone from desktop. Just a thought.
(In reply to Wesley Dawson [:whd] from comment #17) > Alternatively we know what the non-standard user agent we use with our load > tests is, and we can filter out that agent explicitly instead. One detail to explain here: the raw user agent is not passed along to kibana right now, in part because it can be used as pii. The user agent field is parsed and put into common buckets for user_agent_browser, user_agent_os, user_agent_version. Looks like the load tester shows up as user_agent_os=Linux but the other fields don't fall into a recognized bucket. I think whd's proposal is that we modify that bucketing logic to specifically identify the load tester (user_agent_browser=loads) and then filter on that. Which is an excellent idea.
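The bucketing change described could look roughly like this. This is a sketch only: the real user agent parser is more involved, and the exact marker string in the load tester's user agent is an assumption made for illustration.

```python
def bucket_user_agent(raw_ua):
    """Map a raw user agent onto coarse buckets (sketch; the production
    parser is assumed to be more thorough). The load tester gets its own
    browser bucket so downstream aggregates can filter it explicitly."""
    if raw_ua and "loads" in raw_ua.lower():  # assumed load-tester marker
        return {"user_agent_browser": "loads",
                "user_agent_os": "Linux",
                "user_agent_version": None}
    if raw_ua and "Firefox/" in raw_ua:
        version = raw_ua.split("Firefox/")[-1].split(".")[0]
        return {"user_agent_browser": "Firefox",
                "user_agent_os": "Other",   # real parser would detect the OS
                "user_agent_version": version}
    return {"user_agent_browser": None,
            "user_agent_os": None,
            "user_agent_version": None}

# Dashboard aggregation can then skip the load tester by name instead of
# relying on the blunt "user_agent_browser != NIL" filter:
events = [bucket_user_agent("Mozilla/5.0 (X11; Linux x86_64) Firefox/35.0"),
          bucket_user_agent("loads/1.0")]
counted = [e for e in events if e["user_agent_browser"] != "loads"]
print(len(counted))  # 1
```

The advantage over the NIL filter is that unrecognized third-party clients stay visible in the aggregates; only the explicitly named load tester is dropped.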
I think this is done. Reopen if necessary.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED