Closed Bug 1303044 Opened 8 years ago Closed 8 years ago

Validate engagement measurements on Beta

Categories

(Toolkit :: Telemetry, defect, P1)

defect
Points:
3

Tracking

()

RESOLVED FIXED
Tracking Status
firefox51 --- affected

People

(Reporter: gfritzsche, Assigned: Dexter)

References

Details

(Whiteboard: [measurement:client])

Attachments

(1 obsolete file)

+++ This bug was initially created as a clone of Bug #1298342 +++

In bug 1276200 we validated the engagement measurements on Nightly. We should do the same on Aurora and Beta, once we have enough data there.

Nightly analysis gist: https://gist.github.com/Dexterp37/9bea37f536d5b25651aecc8b22d2dfb6
Priority: P1 → P2
The engagement measurements (phase 1) have been on Beta since Sep 19, so we should validate the data on Beta now.
Assignee: nobody → jdorlus
Priority: P2 → P1
In addition to the considerations that were valid for the Aurora validation (reported below for convenience), there are a couple more things to change:

- The release calendar (https://wiki.mozilla.org/RapidRelease/Calendar) says that the engagement measurements were merged on 2016-09-19. That means we should consider pings submitted from the 26th Sept. 2016 (Monday) to the 2nd of October (Sunday), to have a full week. Change the submission_date accordingly.
- The channel needs to be set to "beta".
- The considered build ids must be from 20160920000000 to 20160929999999
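
As a quick sanity check of the dates above, here is a minimal sketch that derives the first full Monday-to-Sunday week after the merge day. The helper name |first_full_week_after| is made up for illustration and is not part of the notebook.

```python
from datetime import date, timedelta


def first_full_week_after(merge_day):
    """Return (monday, sunday) of the first Monday-to-Sunday week
    that starts strictly after merge_day."""
    # weekday(): Monday == 0 ... Sunday == 6
    days_to_next_monday = 7 - merge_day.weekday()
    monday = merge_day + timedelta(days=days_to_next_monday)
    sunday = monday + timedelta(days=6)
    return monday, sunday


merge_day = date(2016, 9, 19)  # merge date taken from the release calendar
monday, sunday = first_full_week_after(merge_day)

# Format the bounds as YYYYMMDD strings, as used for submission_date.
submission_date = (monday.strftime("%Y%m%d"), sunday.strftime("%Y%m%d"))
print(submission_date)  # ('20160926', '20161002')
```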

(In reply to Alessio Placitelli [:Dexter] from bug 1298342 comment #3)
> A few sparse comments:
> 
> - You don't need to validate the broken uri pings bug. This means that you
> can avoid fetching the pings and defining the |broken_uri_pings| variable.
> Just keep |latest_pings| and change the |all_pings = broken_uri_pings +
> latest_pings| to |all_pings = latest_pings|. Most of the remaining analysis
> should still work.
> - By doing that, you should see a clear drop in the output of cell [3], as
> the count for the 'scalars section is None' error seems a bit high (3487059)
> - You can remove the plot from cell [22] as it's only plotting data from
> |broken_uri_pings|, that you should have removed.
> - Same for cell [37].
> 
> Please run the notebook again with those changes and keep us posted if you
> have any problem!
Flags: needinfo?(gfritzsche)
Flags: needinfo?(alessio.placitelli)
(In reply to John Dorlus [:Silne30] from comment #3)
> https://gist.github.com/silne30/ced6e5db314528013a3393ec89d303b8

Thanks John! The number of pings missing the scalars section looks a bit too high, and that's unexpected. However, I noticed a typo that could be triggering this: the starting date for the "submission_date" for |latest_pings| is "2016920", which is missing a leading 0 before the 9. It should really read "20160920". However, we should be comparing a Monday to Sunday period, so we should be using |submission_date=("20160926", "20161002")| as suggested in comment 2.
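
To catch this kind of typo before starting a long job, something like the following could be run on the date strings first. This is only a sketch: |check_submission_date| and |check_range| are hypothetical helpers, not part of the notebook or of the ping-fetching API.

```python
from datetime import datetime


def check_submission_date(value):
    """Raise ValueError unless value is a real date written as exactly YYYYMMDD."""
    # Check length and digits explicitly: strptime alone may accept
    # fields that are missing their zero padding.
    if len(value) != 8 or not value.isdigit():
        raise ValueError("expected 8 digits (YYYYMMDD), got %r" % value)
    datetime.strptime(value, "%Y%m%d")  # raises ValueError for impossible dates
    return value


def check_range(start, end):
    """Validate both bounds and make sure they are in chronological order."""
    check_submission_date(start)
    check_submission_date(end)
    if start > end:  # fixed-width digit strings sort chronologically
        raise ValueError("start %s is after end %s" % (start, end))
    return (start, end)


print(check_range("20160926", "20161002"))

try:
    check_submission_date("2016920")  # the typo: only 7 digits
except ValueError as error:
    print("rejected:", error)
```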

Since we're here, would you also remove the following block from one of the initial cells?

> broken_uri_pings = filter(get_pings(sc,
>                             app="Firefox",
>                             channel="beta",
>                             doc_type="main",
>                             schema="v4",
>                             submission_date=("20160920", "20160929"), # Only one week of submissions.
>                             build_id=("20160920000000", "20160929999999"), # Up to bug 1293222
>                             fraction=1.0))

And the whole cell 45 which contains:

> plot_histogram_scalar(broken_uri_pings, "browser.engagement.total_uri_count")

Together with cell 57 which contains:

> plot_histogram_scalar(broken_uri_pings, "browser.engagement.unique_domains_count")

To remove a cell, simply click on it to make it active, then delete it.
Flags: needinfo?(alessio.placitelli) → needinfo?(jdorlus)
Tried running this notebook but my spark cluster is throwing errors. Apparently, I am not the only one having this issue. So I am waiting for that to be resolved.
Flags: needinfo?(gfritzsche)
Flags: needinfo?(alessio.placitelli)
John, this is missing the updates Alessio mentioned in comment 4.

(In reply to Alessio Placitelli [:Dexter] from comment #4)
> (In reply to John Dorlus [:Silne30] from comment #3)
> > https://gist.github.com/silne30/ced6e5db314528013a3393ec89d303b8
> 
> Thanks John! The number of pings missing the scalars section looks a bit too
> high, and that's unexpected. However, I noticed a typo that could be
> triggering this: the starting date for the "submission_date" for
> |latest_pings| is "2016920", which is missing a leading 0 before the 9. It
> should really read "20160920". However, we should be comparing a Monday to
> Sunday period, so we should be using |submission_date=("20160926",
> "20161002")| as suggested in comment 2.
> 
> Since we're here, would you also remove the following block from one of the
> initial cells?
> 
> > broken_uri_pings = filter(get_pings(sc,
> >                             app="Firefox",
> >                             channel="beta",
> >                             doc_type="main",
> >                             schema="v4",
> >                             submission_date=("20160920", "20160929"), # Only one week of submissions.
> >                             build_id=("20160920000000", "20160929999999"), # Up to bug 1293222
> >                             fraction=1.0))
> 
> And the whole cell 45 which contains:
> 
> > plot_histogram_scalar(broken_uri_pings, "browser.engagement.total_uri_count")
> 
> Together with cell 57 which contains:
> 
> > plot_histogram_scalar(broken_uri_pings, "browser.engagement.unique_domains_count")
> 
> To remove a cell, simply click on it to make it active, then delete it.
Flags: needinfo?(jdorlus)
I'm taking the bug, as we only have one week left of Beta time before it freezes.
Assignee: jdorlus → alessio.placitelli
Flags: needinfo?(alessio.placitelli)
Brendan,

it would be great if you could kindly take a look at the notebook validating engagement measurements on Beta [1], for a single week of data (Monday to Sunday). It doesn't look too surprising, and seems to confirm what we saw for Nightly [2].

One thing worth noting is that the 50th percentile for the Window Open Event Count is just 2.

[1] - https://gist.github.com/Dexterp37/56aeadf8520d0a6d9c23f24bb5609916
[2] - https://gist.github.com/Dexterp37/9bea37f536d5b25651aecc8b22d2dfb6
Flags: needinfo?(bcolloran)
Flags: needinfo?(jdorlus)
 
> One thing worth noting is that the 50th percentile for the Window Open Event
> Count is just 2.
 

Hmm, I'm trying to square this assertion (which looks like it comes from Out[20]) with some of the results and comments a few cells farther down.

At "In [20]" we have 

    plot_histogram_scalar(all_pings, "browser.engagement.window_open_event_count")

which results in "50%      2.000000e+00". I assume this is just the 50th percentile of window open events among all subsession pings.

But further down, below "In [23]", you try to "Plot and describe the number of window open events per client session." Using this code:

    per_session_win_open_events = combined_per_session_win.map(lambda x: x[1][0])
    plot_series(pd.Series(per_session_win_open_events.collect()))

you found "50%      0.000000e+00".

So if I'm understanding correctly, we have found that the median *sub*session has two window open events, but the median full client session has *zero* window open events. This is a really weird result: we should expect more window open events when we sum over subsessions than when we look at subsessions individually.

I guess it could be possible in an example like this:
- 90% of sessions have only one subsession, and those sessions also have 0 window open events
- the remaining 10% of sessions have 10 subsessions each, and each of those subsessions has two window open events

So for this to make sense, on a per-session basis we'd need to see that if you plot a graph of subsessions per session on the X-axis, and window open events per session on the Y-axis, the curve would need to have a slope greater than 1. That is: the sessions with more subsessions would need to have a disproportionate share of window open events.

This seems possible, but since it is a bit weird relative to the naive assumption that sessions should have a greater median number of window open events than subsessions, we should double check it.
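
The hypothetical 90%/10% example above can be checked numerically. This is a toy sketch with made-up data (100 sessions), not real telemetry:

```python
from statistics import median

# Hypothetical population of 100 sessions, following the example above.
# Each session is a list of per-subsession window open event counts.
sessions = (
    [[0] for _ in range(90)]         # 90 sessions: 1 subsession, 0 events
    + [[2] * 10 for _ in range(10)]  # 10 sessions: 10 subsessions, 2 events each
)

# One value per subsession vs. one summed value per session.
per_subsession = [count for session in sessions for count in session]
per_session = [sum(session) for session in sessions]

# 190 subsessions: 90 zeros and 100 twos, so the median is 2.
print(median(per_subsession))
# 100 sessions: 90 zeros and 10 twenties, so the median is 0.
print(median(per_session))
```

So a subsession median of 2 and a session median of 0 are compatible, as long as the subsessions are concentrated in a small share of the sessions.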
Flags: needinfo?(bcolloran)
Points: --- → 3
Attached image engagement_beta.png (obsolete) —
Attachment #8803357 - Attachment is obsolete: true
Yes Brendan, you understood correctly, and I agree, it's a weird result.

However, we should consider that if users rely on tabs and windows restored by the session store, then we'd see zero window/tab open events from these users.

Anyway, I dug a bit more into that part and updated the notebook at [1]. My findings:

- The more subsessions you have, the fewer window open events you trigger.
- The distribution of subsessions per session is skewed towards sessions having just 1 or 2 subsessions.
- The mean number of window open events for sessions with only one subsession is ~0.8 -> ~1.

The new bits of analysis start at cell [27]. This doesn't match exactly your example, but it's close. Do you have an opinion on this? Should I dig deeper? And, if so, any suggestions about how to continue?

[1]: https://gist.github.com/Dexterp37/56aeadf8520d0a6d9c23f24bb5609916
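
For readers without access to the cluster, the grouping behind the findings above can be sketched in plain Python. The input pairs are invented for illustration and the function name is hypothetical; the notebook itself works on Spark RDDs.

```python
from collections import defaultdict

# Hypothetical (subsession_count, window_open_events) pairs, one per session.
per_session = [(1, 1), (1, 0), (1, 2), (2, 1), (2, 0), (5, 0)]


def mean_events_by_subsession_count(pairs):
    """Map each subsession count to the mean window open events per session."""
    totals = defaultdict(lambda: [0, 0])  # count -> [event sum, number of sessions]
    for subsessions, events in pairs:
        totals[subsessions][0] += events
        totals[subsessions][1] += 1
    return {count: s / n for count, (s, n) in sorted(totals.items())}


print(mean_events_by_subsession_count(per_session))
# {1: 1.0, 2: 0.5, 5: 0.0}
```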
Flags: needinfo?(bcolloran)
Ah, I'd forgotten about the session store tabs+windows. Do you happen to know if we are on track to white-list those for opt-out submission?

I think we should call it good for now, this is a solid amount of attention and diligence. Thank you Alessio!
Flags: needinfo?(bcolloran)
(In reply to brendan c from comment #13)
> Ah, I'd forgotten about the session store tabs+windows. Do you happen to
> know if we are on track to white-list those for opt-out submission?

There's a discussion about that in bug 1303278. We can follow up there.

> I think we should call it good for now, this is a solid amount of attention
> and diligence. Thank you Alessio!

Thank you Brendan!
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED