Closed Bug 1461441 Opened 7 years ago Closed 7 years ago

Metric Documentation: Add DAU definition documentation

Categories

(Data Science :: Documentation, task, P3)

x86_64
Linux
task
Points: 2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gkaberere, Assigned: jmccrosky)

Details

Metric to be defined: DAU
Description of metric: countDistinct(client_id) by submission_date_s3 from main_summary
Any existing dashboards you know of:
Filing bug per Ryan Harter's instructions. Currently there is no documentation that clearly outlines the DAU definition, nor an official report that analysts can use to validate their queries when they start working with main_summary. This lack of clarity could result in confusion and misleading reporting, particularly for teams new to the dataset.
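For illustration only, the definition requested above (distinct client_id per submission_date_s3) can be sketched in plain Python. This is not the official query. The sample rows and the helper name `dau_by_day` are hypothetical, and real main_summary rows carry many more columns.

```python
from collections import defaultdict

# Hypothetical sample rows shaped like (submission_date_s3, client_id).
rows = [
    ("20180501", "a"),
    ("20180501", "a"),  # same client, second ping on the same day
    ("20180501", "b"),
    ("20180502", "a"),
]


def dau_by_day(rows):
    """Count distinct client_id values per submission_date_s3."""
    clients = defaultdict(set)
    for day, client_id in rows:
        clients[day].add(client_id)
    return {day: len(ids) for day, ids in clients.items()}


print(dau_by_day(rows))  # {'20180501': 2, '20180502': 1}
```

The duplicate ping for client "a" on the first day is counted once, which is the point of counting distinct clients rather than rows.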
Assignee: nobody → rharter
Points: --- → 2
Priority: -- → P2
Priority: P2 → P1
Priority: P1 → P3
Assignee: rharter → nobody
Component: Documentation and Knowledge Repo (RTMO) → Documentation
Product: Data Platform and Tools → Data Science
Assignee: nobody → jmccrosky
Status: NEW → ASSIGNED
Hi George, I'm happy to work on this. Given that you filed this some time ago, I just wanted to check in and see if your need is the same as it was seven months ago. Can you offer any additional clarification on exactly what you'd like?
Flags: needinfo?(gkaberere)
Hey Jesse, no changes to the ask. This is really a documentation bug for the metric. The goal is to have a single source of truth for the metric(s), as well as something to validate against, so that we are all consistent in methodology.
Flags: needinfo?(gkaberere)
Are you sure you want DAU? Generally, we use aDAU now instead, which is documented here: https://docs.telemetry.mozilla.org/cookbooks/active_dau.html If you still have a use case for DAU, can you please provide some information on it? Note that I am also in the process of building more standard metric implementations and documentation, so if you feel that the aDAU doc I linked is inadequate, please let me know what more you would like to see.
Flags: needinfo?(gkaberere)
Yes, I was looking for DAU. My ask was to have documentation for DAU similar to what exists for aDAU, so that we remove the ambiguity about how it is calculated. I know it's the count(distinct) client_ids on a given day, but we should have some documentation for it so we are all on the same page. I'm thinking of it more from the point of view of a new hire. It would be great if they had sufficient information to feel confident in the metrics they pull. In terms of use case, we do use DAU in a few ways. The one I use most often is to see what percentage of DAU are aDAU. I sometimes look at different acquisition channels to see whether some perform better than others at bringing us high-value users.
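As a rough sketch of the "percentage of DAU that are aDAU" comparison described above: the activity criterion itself lives in the linked aDAU documentation and is not reproduced here. The `is_active` flag is a hypothetical, precomputed stand-in for it, and `adau_share_by_day` is an illustrative name.

```python
from collections import defaultdict


def adau_share_by_day(rows):
    """Return the fraction of each day's DAU that also counts as aDAU.

    Each row is (submission_date_s3, client_id, is_active), where
    is_active is an assumed flag standing in for the aDAU criterion.
    """
    dau = defaultdict(set)
    adau = defaultdict(set)
    for day, client_id, is_active in rows:
        dau[day].add(client_id)
        if is_active:
            adau[day].add(client_id)
    # Every day in dau has at least one client, so the division is safe.
    return {day: len(adau[day]) / len(dau[day]) for day in dau}


sample = [
    ("20180501", "a", True),
    ("20180501", "b", False),
    ("20180501", "b", False),
    ("20180502", "a", False),
]
print(adau_share_by_day(sample))
```

Grouping the rows by a channel column as well as by day would give the per-acquisition-channel view mentioned in the comment.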
Flags: needinfo?(gkaberere)
Thanks for clarifying. Sent a PR (https://github.com/mozilla/firefox-data-docs/pull/234). Please take a look and feel free to let me know if anything needs further clarification.
Flags: needinfo?(gkaberere)
The DAU definitions look good to me. I like that you called out the HLL method as an approximation. The only suggestion I have is around MAU. You defined it at the top but didn't provide a code example for how to pull it. I think it would be helpful to include a query for it as well.
Flags: needinfo?(gkaberere)
Cool. That PR is now merged. To be honest, the HLL callout is a consequence of copy and paste from the aDAU documentation ;) I sent https://github.com/mozilla/firefox-data-docs/pull/236 to add a MAU example. There's actually no great solution right now. I'm working on a project that will improve this and will dramatically update documentation when that's ready. In the meantime, hope this suffices? Please close the bug if so.
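A minimal sketch of the MAU idea, assuming MAU means distinct clients seen in a trailing 28-day window ending on a given day. The window length is an assumption here, not something stated in this bug, and the function name and sample rows are hypothetical. It is not the query from the PR.

```python
from datetime import date, timedelta


def mau(rows, end_day, window_days=28):
    """Count distinct clients seen in the window ending on end_day.

    Each row is (submission_date, client_id). The 28-day default is an
    assumed window length.
    """
    start_day = end_day - timedelta(days=window_days - 1)
    return len({cid for day, cid in rows if start_day <= day <= end_day})


sample_rows = [
    (date(2018, 4, 1), "c"),   # outside the 28-day window
    (date(2018, 5, 1), "a"),
    (date(2018, 5, 20), "b"),
    (date(2018, 5, 28), "a"),
]
print(mau(sample_rows, date(2018, 5, 28)))
```

With `window_days=1` the same function reduces to DAU for `end_day`, which makes it easy to check one metric against the other.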
Flags: needinfo?(gkaberere)
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Flags: needinfo?(gkaberere)
Resolution: --- → FIXED