Closed Bug 1262609 Opened 8 years ago Closed 8 years ago

Validate Games Hardware Survey Spark Notebook

Categories

(Cloud Services :: Metrics: Product Metrics, defect, P2)

defect
Points:
2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: avaughn, Assigned: Dexter)

References

Details

Attachments

(3 files, 2 obsolete files)

Hello!

This is a request for a code review of Alessio's Spark notebook for the Games Hardware Survey. It can be found at:

https://github.com/almossawi/firefox-hardware-survey/pull/1


The original Games Hardware Survey bug is here for reference:

https://bugzilla.mozilla.org/show_bug.cgi?id=1241696

Thank You,

Anthony
Assignee: nobody → rweiss
Summary: Validate Games Hardware Survey Notebook → Validate Games Hardware Survey Spark Notebook
Thank you for filing this bug, Anthony! Some notes about the Spark notebook:

- It doesn't produce CSVs; rather, it produces weekly aggregated JSON blobs that are appended to the previous state file fetched from S3, which allows for iterative updates and eases backfilling (a rough sketch of this flow appears below).
- The format of the produced JSON file is like [1].
- We only aggregate data coming from the current and the previous major release version of FF.
- We are generating the aggregates by considering only the newest ping for each client.

[1] - https://metrics.mozilla.com/protected/firefox-hardware-survey/data/hwsurvey-FF43.json
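For readers who can't open the notebook yet, here is a minimal sketch of the flow described above. All field names, keys, and the file handling are assumptions for illustration; this is not code from the PR.

import json


def newest_ping_per_client(pings):
    """Keep only the most recent ping for each client (assumed fields)."""
    latest = {}
    for ping in pings:
        cid = ping["client_id"]
        if cid not in latest or ping["creation_date"] > latest[cid]["creation_date"]:
            latest[cid] = ping
    return list(latest.values())


def aggregate_week(pings, allowed_versions):
    """Turn one week of pings into {config_key: fraction_of_clients}."""
    counts = {}
    for ping in newest_ping_per_client(pings):
        if ping["version"] not in allowed_versions:  # current + previous major only
            continue
        key = "display_{}x{}".format(ping["screen_width"], ping["screen_height"])
        counts[key] = counts.get(key, 0) + 1
    total = float(sum(counts.values())) or 1.0
    return {k: v / total for k, v in counts.items()}


def append_week_to_state(state_path, week_label, weekly_blob):
    """Append this week's aggregate to the state file fetched from S3."""
    with open(state_path) as f:
        state = json.load(f)  # assumed to be a list of weekly entries
    state.append({"date": week_label, "data": weekly_blob})
    with open(state_path, "w") as f:
        json.dump(state, f)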
Hey guys, the goal is to get this data eventually published (and automatically updating on some TBD frequency) on the mozilla.games.org website. My understanding is that the goal of this bug is for us to validate that the data looks right before taking the next step, which is engaging with our web agency to start building the actual presentation. What is the easiest/right way to do that, if we can't do it via CSV?
Thanks for the added info, Alessio!

Andre, we have answers for your questions which will come after this bug is complete. This bug is specifically for the Telemetry folks to make sure this is all good to go before we work with Games to do final tweaks (like frequency of updates, format etc) and involve the web agency.

FYI JSON should be just as easy for the web agency to use as CSV. I believe Ali already made a harness to read it, which we can provide to the agency.
All sounds awesome... really looking forward to this, exciting stuff! Thanks for all the great work!
Points: --- → 2
Priority: -- → P1
I don't appear to have read access to that github pull request.  Was this contributed under the Mozilla project or as a personal repo?
Flags: needinfo?(aalmossawi)
rweiss, done. Happy to move the repo to the mozilla org post-review.
Flags: needinfo?(aalmossawi)
Now I need access to :Dexter, since it seems like the PR originates from him:

I am trying to read this notebook:
https://github.com/Dexterp37/firefox-hardware-survey/blob/spark_job/report/summarize_json.ipynb
Flags: needinfo?(alessio.placitelli)
(In reply to Rebecca Weiss from comment #7)
> Now I need access to :Dexter, since it seems like the PR originates from him:
> 
> I am trying to read this notebook:
> https://github.com/Dexterp37/firefox-hardware-survey/blob/spark_job/report/
> summarize_json.ipynb

Done! It should be working now.
Flags: needinfo?(alessio.placitelli)
Hi, everyone! Thanks for coming together on this one. Rebecca, are you good to go with no more blockers preventing you access to the repo?

The web agency is ready to go once the review is done and we've vetted final needs with the games group internally.

Thanks again.
Flags: needinfo?(rweiss)
Yup, I've reviewed the notebook; it seems right to me.  I would like to see the resulting json file, if that's possible.

:Dexter, could you attach a sample json for a given week (can be any week, doesn't matter) to this bug thread?

:avaughn, please be sure to include me with the web agency discussion for ensuring that we accurately describe how the data was collected and computed on the mozgames site.
Flags: needinfo?(rweiss) → needinfo?(alessio.placitelli)
FYI, we have changed scope with the web agency, and are now looking to get them to prioritize finishing the Open Web Games site, and not the games.mozilla.org site. So, at this point, we imagine that we will have someone internal (likely Harald) add a tile to the games.mozilla.org site that will link *somewhere* to show the live data. I assume it would make sense for us to connect to what Ali has mocked up? Short form, the "web agency discussion" is really an internal one, where we figure out how to publish the data regularly somewhere and display it using Ali's work. Make sense?
(In reply to Rebecca Weiss from comment #10)
> Yup, I've reviewed the notebook; it seems right to me.  I would like to see
> the resulting json file, if that's possible.
> 
> :Dexter, could you attach a sample json for a given week (can be any week,
> doesn't matter) to this bug thread?

Thanks for taking time to review this. Here's a link to the file (7z compressed json, only accessible to Mozilla): https://drive.google.com/a/mozilla.com/file/d/0B1RACVTBwZ94YTZhdW9fUVlPd1E/view?usp=sharing 

I uploaded it to gdrive, as I was unsure about the policy for this kind of aggregated data: can it be publicly shared/attached to the bug?
Flags: needinfo?(alessio.placitelli)
:Dexter, I actually think this should be added to the bug given that it will be powering a page that will also be open.  almossawi, will we have a link that allows for people to download the data from the page?

:bsmedberg, do you agree or disagree that we should include a copy of an example raw json that will power the public-facing Games Hardware Survey to this bug?
Flags: needinfo?(benjamin)
Flags: needinfo?(aalmossawi)
Note: we made a point of allowing anyone who runs tests on the Open Web Games site to download both a representative JSON blob sample and the actual data we collect. If possible, we would like to do the same for the hardware telemetry data. We defer to Benjamin on this.
I am trying to understand what this data represents. The file from comment 12 doesn't parse as JSON. It appears to be multiple lines of JSON separated by commas?

 And I don't have access to the scripts, perhaps because of some security setting in github. If we're at the point of making the aggregates public, can we make the github repo public as well?

Do things like "display_1920x956":0.0000005969 mean that 0.00006% of Firefox users have a 1920x956 window size? Or is that combined with other data in the same line somehow? It seems likely that this is a single user?

In general, we're ok to publish any aggregate data that doesn't identify particular users. But there's a lot of data here I'm not certain about: is it possible for gpu_* or os_* lines to contain identifying strings? One way to avoid that is to make sure that any bucket you're publishing has at least N users (I suggest N=100) and clump everyone else into "other". 

Should we be bucketing these a little bit? So for example if I look at just the first line of this data, we have:

display_1366x768         0.291692284
display_1920x1080        0.2040645903
display_1600x900         0.0924423595
display_1280x1024        0.0866290654
display_1024x768         0.0685334788
display_1440x900         0.0567424422
display_1680x1050        0.0487181982
display_1280x800         0.0484170577
display_1360x768         0.0222858859
display_1920x1200        0.0161792111
.... thousands more
For users with less-common configurations, can we bucket these to the nearest hundred for export? So in this case the "display_1920x956" would end up grouped in a larger "display_~1900x1000" bucket. And if we still don't have 100 users, then group the rest into "other".
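
A rough sketch of that rounding (a hypothetical helper, not code from the notebook): round each dimension to the nearest 100 pixels, then collapse anything still under the minimum count into "other".

def bucket_resolutions(counts, min_count=100):
    """counts maps "display_<w>x<h>" to user counts; names and shape are assumed."""
    rounded = {}
    for key, n in counts.items():
        width, height = key.replace("display_", "").split("x")
        bucket = "display_~{}x{}".format(int(round(int(width), -2)),
                                         int(round(int(height), -2)))
        rounded[bucket] = rounded.get(bucket, 0) + n
    out = {"display_other": 0}
    for bucket, n in rounded.items():
        if n >= min_count:
            out[bucket] = n
        else:
            out["display_other"] += n
    return out

With this, "display_1920x956" lands in "display_~1900x1000", matching the example above.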

This would also reduce concerns about junk data showing up in the long tail of OS names. For example:

>>> oses.sort(key=lambda (k, v): v, reverse=True)
>>> oses[:10]
[(u'os_Windows_NT-6.1', 0.5061198067),
 (u'os_Windows_NT-10.0', 0.1934596114),
 (u'os_Windows_NT-6.3', 0.1121137851),
 (u'os_Windows_NT-5.1', 0.1070555806),
 (u'os_Windows_NT-6.0', 0.0296806896),
 (u'os_Windows_NT-6.2', 0.0167110568),
 (u'os_Darwin-14.5.0', 0.0064561648),
 (u'os_Darwin-15.3.0', 0.0061302526),
 (u'os_Darwin-10.8.0', 0.0051531131),
 (u'os_Darwin-13.4.0', 0.003410438)]

This is fine.

>>> oses[-10:]
[(u'os_Linux-4.2.0-1-586', 2.985e-07),
 (u'os_Linux-3.0.0-24-generic', 2.985e-07),
 (u'os_Linux-3.5.0-31-generic', 2.985e-07),
 (u'os_Linux-3.4.110-2vl6pae', 2.985e-07),
 (u'os_Linux-3.11.10-170.g1e76e80-desktop', 2.985e-07),
 (u'os_Linux-3.11.0-19-generic', 2.985e-07),
 (u'os_Linux-3.0.66', 2.985e-07),
 (u'os_Linux-2.6.43.8-1.fc15.x86_64', 2.985e-07),
 (u'os_Linux-4.3.3preempt-r2', 2.985e-07),
 (u'os_Linux-2.6.32-431.29.2.el6.i686', 2.985e-07)]

I believe this means there is only one active user in the world on each of these configs. With this particular data I don't think that's a privacy risk, but since we don't know what data might show up I'd prefer for us to be careful.
Flags: needinfo?(benjamin)
bsmedberg, to your last point: yes, pruning or consolidating many of the fields in the json file is essential, since fields like display resolutions and OSs contribute a lot to its size.

For clarity, the demo that uses Alessio's json file is here (behind ldap): 
https://metrics.mozilla.com/protected/firefox-hardware-survey

The choice of fields for each chart is based solely on frequency, which may or may not make product sense.

Also for clarity, and to alleviate any potential issues with access, the mockup is now in this repo: https://github.com/mozilla/firefox-hardware-survey
Flags: needinfo?(aalmossawi)
Agree with :bsmedberg, and I will go further and suggest that it is better to simply roll up every tiny group at the 1% level, meaning any group with a <1% share should just be categorized as "other".

Given that this hardware survey is primarily designed to give insight into the state of hardware among Firefox users for the purpose of helping game developers identify market segments, I posit that any configuration with a <1% share is unlikely to be of much interest anyway.
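For concreteness, the 1% roll-up could look like this minimal sketch (the input shape, a dict of label to fraction for one week, is an assumption):

def collapse_small_groups(fractions, threshold=0.01, other_key="other"):
    """Fold every group under the threshold into a single "other" bucket."""
    collapsed = {other_key: 0.0}
    for label, share in fractions.items():
        if share >= threshold:
            collapsed[label] = share
        else:
            collapsed[other_key] += share
    return collapsed

For example, {"display_1366x768": 0.29, "display_1920x956": 6e-07} keeps the first key and adds the second fraction to "other".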
Put differently, Alessio has all the fields there so we can quickly swap fields in the mockup in response to requests from stakeholders. The final json would presumably only have the fields that we’re displaying in the charts.
Well, we should not make the unfiltered data public without reviewing it "by hand". But I expect that shoving the <1% buckets into "other" is a trivial change to the notebook. I'd still like the notebook to be public.
Thanks for the feedback so far! I'll work on fixing the mentioned issues (JSON validity, per-resolution grouping, <1% group collapsing).
Assigned to Dexter and prioritized to allow for next week's scrub.
Assignee: rweiss → alessio.placitelli
Priority: P1 → --
The PR at [1] now contains the updated notebook with the following changes:

- We're now using the longitudinal dataset, which makes everything MUCH faster (from 10+ hrs down to 15 mins on a 10-node cluster); a rough sketch of pulling that dataset follows below.
- We're collapsing <1% groups to the "other" bucket (OS and Resolutions are collapsed in a slightly different way, as per comment 16).
- We're producing a valid JSON output.

The output can be found at [2].

@Benjamin, do you think it should be ok now to share the JSON file on the bug? Is there anything else privacy-sensitive that stands out to you?

@Rebecca, any suggestion on how to deal with partial data from clients? For example, we have clients with broken screen resolutions but correct CPU and GPU information. Should we drop just the broken fields and use the partial data we have, or leave out the whole client?

[1] - https://github.com/mozilla/firefox-hardware-survey/pull/1
[2] - Data for April 2016: https://drive.google.com/a/mozilla.com/file/d/0B1RACVTBwZ94MFdoeEJlWm9qZTQ/view?usp=sharing
Flags: needinfo?(rweiss)
Flags: needinfo?(benjamin)
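For reference, pulling the longitudinal dataset from a Spark notebook usually looks roughly like the sketch below; the column selection is an assumption for illustration and may differ from what the actual notebook does.

# sqlContext is provided by the notebook's Spark environment; the
# "longitudinal" table name is how the dataset is typically exposed there.
# The columns below are illustrative only.
frame = sqlContext.sql("""
    SELECT client_id, system, system_os, system_gfx, build
    FROM longitudinal
""")

# One row per client, with the per-client history already grouped,
# which is what makes this so much faster than scanning raw pings.
subset = frame.rdd.map(lambda row: (row.client_id, row))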
Yes, this is fine to share.
Flags: needinfo?(benjamin)
Attached file Sample HW survey JSON (obsolete) —
Status: NEW → ASSIGNED
Priority: -- → P2
This is mostly for public facing dashboard purposes, so I would try to keep as many client records in as possible.  If we have a broken record, use the partial data.
Flags: needinfo?(rweiss)
(In reply to Rebecca Weiss from comment #26)
> This is mostly for public facing dashboard purposes, so I would try to keep
> as many client records in as possible.  If we have a broken record, use the
> partial data.

I had a nice conversation with Rebecca over IRC, and we agreed to drop broken records and discard the partial data. Otherwise, we would end up with the issue of having different subgroups with distinct totals (e.g. the total count of cpu records being different from the total count of gfx records).

We'd rather throw away the entire record and mention on the public dashboard that we throw away records with "corrupted fields" (to make sure people know).
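For illustration, the agreed-upon policy amounts to something like the following sketch; the field names are made up, not the notebook's actual ones.

REQUIRED_FIELDS = ("resolution", "cpu_cores", "gpu_vendor", "gpu_device")


def is_complete(record):
    """Keep a client's record only if every required field is present."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def split_records(records):
    """Return (valid_records, discarded_count) for one week of data."""
    valid = [r for r in records if is_complete(r)]
    return valid, len(records) - len(valid)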
Attached file Sample HW survey JSON (obsolete) —
Updated the PR as per comment 27: this is the generated output.

@Rebecca: I flagged you for reviewing the PR (as discussed over IRC).
Attachment #8753239 - Attachment is obsolete: true
Attachment #8756431 - Flags: review?(rweiss)
I have also asked :spenrose to review this with me as the second reviewer.
Flags: needinfo?(spenrose)
1) The notebook discards an unknown number of records without, as far as I can tell, counting them or distinguishing among the different reasons for discarding them. I think we should add counts for each reason and display them at the end, to see how they compare with the totals (a sketch of this follows below).

2) The notebook does not display intermediate results and it does not come with unit tests the way code would, which means that reviewing it requires holding a lot of state in one's head. I think that "reviewing" this work should consist in part of showing input and intermediate results so we can have a sense of the state machine. I am a candidate for doing that work (though not currently scheduled to).
Flags: needinfo?(spenrose)
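A small sketch of what point 1) could look like inside the notebook, with example (not actual) reason labels and ping fields:

from collections import Counter

discard_reasons = Counter()


def keep_ping(ping):
    """Validate one ping, tallying a reason whenever it is discarded."""
    if "monitors" not in ping:
        discard_reasons["no monitor section"] += 1
        return False
    if not isinstance(ping.get("cpu_cores"), int):
        discard_reasons["invalid CPU core count"] += 1
        return False
    return True

# At the end of the job, print the breakdown next to the totals:
#   for reason, count in discard_reasons.most_common():
#       print(reason, count)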
(In reply to Sam Penrose from comment #30)
> 1) The notebook discards an unknown amount of records without, as far as I
> can tell, counting them or distinguishing among several different reasons
> for discarding them. I think we should add counts for each reason and
> display them at the end, to see how they compare with the totals.

That's a good point. I think we could report the reasons why things were discarded in the logs, or produce a separate, private report with this kind of information.

For what it's worth, I had a look at the reasons why data was being discarded. Here's a breakdown, in percentages of the discarded pings, for the clients that were active in the past weeks:

- 1.8% due to pings not containing a monitor section
- 2.9% due to pings not containing a valid number for the number of CPU cores
- 0.04% due to pings not containing a valid vendor id for the video card
- 1.8% due to pings not containing a valid device id for the video card

Other reasons pings were being discarded (each below 0.001%): no video adapters section in the ping, no CPU vendor in the ping.

Most of the clients are discarded because they are inactive (no submission) in the week we are extracting the data for. This is not something that has to do with data validation, though.

> 2) The notebook does not display intermediate results and it does not come
> with unit tests the way code would, which means that reviewing it requires
> holding a lot of state in one's head. I think that "reviewing" this work
> should consist in part of showing input and intermediate results so we can
> have a sense of the state machine. I am a candidate for doing that work
> (though not currently scheduled to).

The notebook is not intended to be run as a one-off; rather, it should run as a scheduled weekly job and easily allow backfilling if we find some issues in the data. That's the reason why not much care was put into displaying intermediate results (even though this collides with the nature of Jupyter notebooks, I know).

That said, I'm open to suggestions on how to make things clearer/easier/faster to review. Please let me know if you were thinking of any specific action or you have a particular example/style I can adhere to!
(In reply to Alessio Placitelli [:Dexter] from comment #31)
> (In reply to Sam Penrose from comment #30)
> > 1) The notebook discards an unknown amount of records without, as far as I
> > can tell, counting them or distinguishing among several different reasons
> > for discarding them. I think we should add counts for each reason and
> > display them at the end, to see how they compare with the totals.
> 
> That's a good point, I think we could report the reason why things were
> discarded in the logs or produce a separate, private report with this kind
> of information.
> 
> For what is worth, I had a look at the reasons why data was being discarded.
> Here's a breakdown in percentages of the discarded pings for the clients
> that were active in the past weeks:
> 
> - 1.8% due to pings not containing a monitor section
> - 2.9% due to pings not containing a valid number for the number of CPU cores
> - 0.04% due to pings not containing a valid vendor id for the video card
> - 1.8% due to pings not containing a valid device id for the video card
> 
> Other reasons pings were being discarded (below .001%): no video adapters
> section in the ping, no CPU vendor in the ping.

How about just including them in the report in a separate bucket?
Either putting them into "other" or into more specific buckets like "no cpu info", "no gfx info", etc.?
Attached file Sample HW survey JSON
A slightly bigger sample with backfilled data, from January 2016 to the most recent week.
Attachment #8756431 - Attachment is obsolete: true
Attachment #8756431 - Flags: review?(rweiss)
Attachment #8765874 - Flags: review?(rweiss)
Attached image OperatingSystemHW.png
As Ali pointed out, the data doesn't contain os_Darwin-15* entries on the four dates where we have troughs in the "Operating System" graph at [1] (screenshot in comment 35).

What's going on? Let's look at the data.

Data from 2016-02-21 (OSX 15 is there):

...
    "os_Darwin-14.5.0": 0.0135462513,
    "os_Darwin-15.3.0": 0.0129951447,
    "os_Darwin-Other": 0.0356674444,
...

Data from 2016-05-22 (OSX 15 isn't there):

...
    "os_Darwin-14.5.0": 0.0125374454,
    "os_Darwin-Other": 0.0509810781,
...

In the first case, Darwin-Other accounts for 3% of the users, while in the second case this generic bucket accounts for 5%. Since we're bucketing any configuration that accounts for less than 1% of the sample into a generic bucket, I think that OSX 15.* for that particular week is under that threshold and gets folded into the Darwin-Other bucket.

Question time!
Can we live with this kind of artifact?
Flags: needinfo?(rweiss)
Flags: needinfo?(waxmigs2902)
To close out the earlier comment: until we hear otherwise from stakeholders, we can live with this artifact.
Flags: needinfo?(rweiss)
Attached file HW Survey Notebook
Ryan, as mentioned over IRC, I'm flagging you for review (r?) on this notebook for a sanity check.
Attachment #8781881 - Flags: review?(rharter)
Flags: needinfo?(waxmigs2902)
Hey Alessio, 

Can you open a pull request so I can use reviewable for inline comments?

A few high level comments:
* Do you have an example of the generated report? Are these saved for future review by the automated job or are they discarded?
* The % of discarded observations does not appear to be reported/saved. I expect this measure will be very useful if anything should look strange in the data. Can we save this in the above-mentioned report or, better yet, include it in the payload as suggested by Georg?

[0] https://github.com/mozilla/firefox-hardware-survey/pull/1
Flags: needinfo?(alessio.placitelli)
(In reply to Ryan Harter [:harter] from comment #39)
> Hey Alessio, 
> 
> Can you open a pull request so I can use reviewable for inline comments?

Hey, first of all, thanks for taking a look at this! As we cleared up over IRC, you can drop your comments on the PR for bug 1293954, as the PR for the notebook was already merged.

> A few high level comments:
> * Do you have an example of the generated report? Are these saved for future
> review by the automated job or are they discarded?

An example of the generated data is attached to this bug. This job runs weekly
and the most recent generated report will be saved in the public analysis bucket
by bug 1293954.

> * The % of discarded observations does not appear to be reported/saved. I
> expect this measure will be very useful if anything should look strange in
> the data. Can we save this in the above mentioned report or, better yet,
> include it in the payload as suggested by Georg

This should be saved, unless there's a bug :-(
Look for |discarded_count| in the notebook.

> [0] https://github.com/mozilla/firefox-hardware-survey/pull/1
Flags: needinfo?(alessio.placitelli)
Attachment #8781881 - Flags: review?(rharter) → review+
As noted offline, the example ping is out of date but the current ping [0] does include the discarded ping rate. This LGTM. Thanks!

[0] https://analysis-output.telemetry.mozilla.org/game-hardware-survey/data/hwsurvey-weekly.json
Comment on attachment 8765874 [details]
Sample HW survey JSON

This looks good to me.  You should r? :ali on this to make sure this doesn't break the site.
Attachment #8765874 - Flags: review?(rweiss) → review+
(In reply to Rebecca Weiss from comment #42)
> Comment on attachment 8765874 [details]
> Sample HW survey JSON
> 
> This looks good to me.  You should r? :ali on this to make sure this doesn't
> break the site.

Cool. Ali is aware of the format. Some new fields are being added by bug 1293955, and he has already updated the website to handle them.

Closing this bug, feel free to reopen if needed.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED