Improve the partial suggestion using actual usage number

RESOLVED FIXED

Status

Release Engineering
Ship It
RESOLVED FIXED
3 years ago
a year ago

People

(Reporter: sylvestre, Unassigned)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

(Reporter)

Description

3 years ago
For now, the algorithm to suggest partial is stupid, it suggests the last versions published (2 for releases, 3 for beta).

The PR proposes to improve the results by using actual ADI.
(Reporter)

Comment 1

3 years ago
Created attachment 8582372 [details] [review]
Implement most of the feature

I have a few remaining problems:
* we need to have a vertica access to retrieve the data
I can push the json files somewhere but I don't think we want that to be public.
Anyway, having it on the same machine as ship-it run a few time for day should be the easiest solution.

* in kickoff/views/submit.py, I have hardcoded the URL to http://localhost/, not sure how we want to set this up? (running it locally would tackle this issue)

* Beta ADI are missing. I reported bug 1145178. For now, still proposing the 2 last shipped builds.

Despite these issues, guys, I would like to have your feedback on this change.
Attachment #8582372 - Flags: review?(rail)
Attachment #8582372 - Flags: review?(bhearsum)
Comment hidden (typo)
Comment hidden (typo)
For background, we had an e-mail conversation about this. Tl;dr is that Vertica doesn't have a JSON api, so we're going to dump its data with a Python script, make it publicly available, and Ship It's frontend will source it.
Comment on attachment 8582372 [details] [review]
Implement most of the feature

Feedback given in PR.
Attachment #8582372 - Flags: review?(bhearsum) → review-
Duplicate of this bug: 1084559
(Reporter)

Comment 7

3 years ago
Release management has now a VM and we can publish the results here:
http://relman-vm1.dmz.scl3.mozilla.com/partial.json
(vpn access needed)
I updated the PR.
Flags: needinfo?(bhearsum)
(Reporter)

Comment 8

3 years ago
Comment on attachment 8582372 [details] [review]
Implement most of the feature

Ben, I think this is ready now!
Flags: needinfo?(bhearsum)
Attachment #8582372 - Flags: review- → review?(bhearsum)
Comment on attachment 8582372 [details] [review]
Implement most of the feature

I think this looks good now....I merged it.

Can you see how things are looking on https://ship-it-dev.allizom.org/ before we push to prod? Hopefully we can do that push before 38.0b4 next week.
Attachment #8582372 - Flags: review?(bhearsum)
Attachment #8582372 - Flags: review+
Attachment #8582372 - Flags: checked-in+
(Reporter)

Updated

3 years ago
Depends on: 1155935
(Reporter)

Comment 10

3 years ago
I wrote this PR to update the URL to match the new URL
https://github.com/mozilla/ship-it/pull/17
(Reporter)

Updated

3 years ago
Assignee: nobody → sledru
(Reporter)

Updated

3 years ago
Depends on: 1169614
Duplicate of this bug: 1092319
We had some issues with this today with 40.0.1. The suggestion was
  40.0build5, 39.0build6, 38.0.5b3build1, 35.0.1build1

* 40.0 and 39.0 are correct
* 39.0.3 should be added, it has the highest ADI right now
* 38.0.5b3build1 is bogus, was probably mean to be 38.0.5
* next would be 38.0.1, 37.0.2, and then 35.0.1

I'm getting ADI from https://dataviz.mozilla.org/views/DesktopADI/VersionChannelADI; August 10 data. Perhaps the Vertica datasource is unmaintained ?   https://ship-it.mozilla.org/cron/partial.json says it's recently updated but the numbers look very off to me.
Flags: needinfo?(sledru)
(Reporter)

Comment 13

3 years ago
(In reply to Nick Thomas [:nthomas] from comment #12)
> * 39.0.3 should be added, it has the highest ADI right now
OK, probably because the vertica db is not updated frequently enough.
Saptarshi, Sheeri would it be possible to update the data daily? Thanks


> * 38.0.5b3build1 is bogus, was probably mean to be 38.0.5
This is fixed for a while but ship-it prod instance is not up to date:
https://github.com/mozilla/ship-it/commit/6f809e28e1b77bd30f219cad5cdc25f8a26fa428

> I'm getting ADI from
> https://dataviz.mozilla.org/views/DesktopADI/VersionChannelADI; August 10
> data. 
To bad here is no API to access these data...

Perhaps the Vertica datasource is unmaintained ?  
> https://ship-it.mozilla.org/cron/partial.json says it's recently updated 
The update is the generation of the json file, not the latest refresh of vertica data.
Flags: needinfo?(sledru)
Flags: needinfo?(sguha)
Flags: needinfo?(scabral)
The data base fhr_rollups_daily_base (despite the name ..) cannot be updated daily. It depends on the 'de-orphaning' ETL which is once a week. Hence this table can only be updated once a week
Flags: needinfo?(sguha)
Rob, how does Socorro get ADI data ? Perhaps we can arrange access so we can use it here too
Flags: needinfo?(rhelmer)
(In reply to Nick Thomas [:nthomas] from comment #15)
> Rob, how does Socorro get ADI data ? Perhaps we can arrange access so we can
> use it here too

Socorro queries a Hive DB in SCL - there's a little VM in SCL just for accessing this, it runs a cron job which pushes the data to an RDS instances in AWS. Maybe this could push somewhere for you as well.

Alternatively - as this is already in the Socorro RDS instance, you could potentially pull it from crash-stats. I don't see any API endpoint for this when logged into https://crash-stats.mozilla.com/api/ right now though.

:peterbe/:adrian, is there a plan to have a crash-stats API endpoint for ADI? 

:lonnen, any problem with such an API (provided it required Persona auth or token and special privs)?
Flags: needinfo?(rhelmer)
Flags: needinfo?(peterbe)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(adrian)
As far as I know, we plan on exposing ADIs as part of our public API. I do not know if we have explicit permission to do that, though. The ADI middleware service exists but is currently only accessible internally, if we have agreement from Lonnen we can easily expose that in the public API.
Flags: needinfo?(adrian)
(In reply to Robert Helmer [:rhelmer] from comment #16)
> (In reply to Nick Thomas [:nthomas] from comment #15)
> > Rob, how does Socorro get ADI data ? Perhaps we can arrange access so we can
> > use it here too
> 
> Socorro queries a Hive DB in SCL - there's a little VM in SCL just for
> accessing this, it runs a cron job which pushes the data to an RDS instances
> in AWS. Maybe this could push somewhere for you as well.
> 
> Alternatively - as this is already in the Socorro RDS instance, you could
> potentially pull it from crash-stats. I don't see any API endpoint for this
> when logged into https://crash-stats.mozilla.com/api/ right now though.
> 
> :peterbe/:adrian, is there a plan to have a crash-stats API endpoint for
> ADI? 
> 
Any day now :)
It's on my todo-list to add such an endpoint.

Because it's quite new I haven't decided perfect yet how the constraints should be controlled or changed. For example, doing a GET on ADI for the last 2 months might cause such a large slow query that it'd significantly slow down our database. So that might be restrained to smaller date ranges. 
Also, at the time of writing there is no way to filter by operating system. You have to filter by product name and version. 

Wanna set up a blocker on https://bugzilla.mozilla.org/show_bug.cgi?id=1194329 ?
Flags: needinfo?(peterbe)

Comment 19

3 years ago
I'm fine with an api. Now that more than 1 project relies on it, I'd even be ok breaking that logic out and putting it into it's own service that both projects can consume, rather than using crash-stats as a passthrough.
Flags: needinfo?(chris.lonnen)

Comment 20

3 years ago
Hi there,

> we plan on exposing ADIs as part of our public API. 

Am I interpreting this correctly? We are planning on creating a public API where anyone can obtain information about our active profiles, per day, per build/channel/etc?

Comment 21

3 years ago
I spoke with John on Friday and I believe we're ok to move forward here. We're talking about using already public numbers from the crash reporter for a feature on an auth-protected internal tool.

The raw data is kind of messy and the data push is fairly brittle. With other and more reliable pipelines nearly done I'm not sure it makes sense to break out the common functionality from Socorro into a new service, though depending on Socorro for this info seems like a weird dependency. That might warrant a little extra thought.
Flags: needinfo?(scabral)
(Reporter)

Updated

2 years ago
Depends on: 1194329
(Reporter)

Updated

2 years ago
Depends on: 1229682
Sylvestre, what's the status here ?
(Reporter)

Comment 23

2 years ago
Yes, I contacted the telemetry team to have better data, they have something for this.
Depends on: 1240522
OK. I'd really like to fix this up as we generated these partials for 45.0.1 - 45.0 is correct, but 42.0, 41.0.2, 39.0 aren't so great. We really ought to have 44.0.2 in there, and 43.0.4 would have been useful too. Could we add a manual step for RelMan to check the partials when setting up a new release ?
I'm going to have a stab at using the Socorro data.
Assignee: sledru → nthomas
Created attachment 8765336 [details] [review]
PR for switching to crash-stats ADI data
Attachment #8765336 - Flags: review?(rail)
Comment on attachment 8765336 [details] [review]
PR for switching to crash-stats ADI data

Commented in PR
Attachment #8765336 - Flags: review?(rail) → review+
Depends on: 1282735
Ready to land the PR, but first bug 1282735 for the server-side changes to support it.
I realized I'm not actually blocked by server-side changes as requests should be added to vendor/, and a symlink handles the change of script in the cron job. I've deployed with that added to the PR. Bug 1282735 becomes tidy up work. Still need to verify https://ship-it.mozilla.org/cron/partial.json is updated by the new script.
No longer depends on: 1282735
See Also: → bug 1282735
The cron works, eg {"release": [{"version": "45.0.1", "ADI": 1288229}, {"version": "47.0.1", "ADI": 1251634}, {"version": "46.0", "ADI": 416029}, {"version": "45.0.2", "ADI": 1205393}, {"version": "47.0", "ADI": 85961983}, {"version": "46.0.1", "ADI": 6095365}, {"version": "45.0", "ADI": 394895}], "lastUpdate": "30/06/2016 03:01 using 28/06/2016 ADI data", "beta": [{"version": "48.0b2", "ADI": 203671}, {"version": "48.0b3", "ADI": 1405879}, {"version": "48.0b1", "ADI": 295490}, {"version": "48.0b4", "ADI": 2514}], "esr": [{"version": "45.0.2esr", "ADI": 32879}, {"version": "45.0.1esr", "ADI": 56243}, {"version": "38.7.1esr", "ADI": 95134}, {"version": "38.8.0esr", "ADI": 210381}, {"version": "45.1.0esr", "ADI": 87733}, {"version": "45.1.1esr", "ADI": 128450}, {"version": "45.2.0esr", "ADI": 1367821}]}

There are a couple of bugs which mean we still get the wrong partials
* the JS code expects the results to be in descending order of ADI, but it's random coming out of crash-stats
* traversing a X.0 version in the ADI list will match X.0.y in getVersionWithBuildNumber(), eg get 46.0.1buildN instead of 46.0buildN
Created attachment 8766931 [details] [review]
PR to fix up use of blocklist ADI

Explanations on the PR.
Attachment #8766931 - Flags: review?(rail)
Attachment #8766931 - Flags: review?(rail) → review+
Deployed PR 74, waiting for the cron to run to refresh the ADI data.
https://github.com/mozilla/ship-it/pull/75 to fix up one more case where we're matching X.0.Y for X.0 when we shouldn't. Deployed and getting sensible results now - eg for a 48.0 build1 the suggestion is 
  47.0.1build1,47.0build3,46.0.1build1,45.0.1build1
on ADI of
  {"version": "47.0",  "ADI": 76838356}, 
  {"version": "47.0.1", "ADI": 9165114},
  {"version": "46.0.1", "ADI": 5314865},
  {"version": "45.0.1", "ADI": 1225742},
  {"version": "45.0.2", "ADI": 1123018},
  {"version": "46.0",    "ADI": 387763},
  {"version": "45.0",    "ADI": 378686}

Possibly a regression in the number of partials suggested for ESR though, getting 4 instead of 1.
I'm not planning on pushing on a telemetry datasource, as the blocklist one seems adequate for now. Will be working on other improvements, eg bug 1292419 for the # of partials, bug 1229682 for using ADI for beta, bug 1177235 & bug 1223196 for RC builds on beta.

RESOLVED GOODENOUGH ?
Assignee: nthomas → nobody
(In reply to Nick Thomas [:nthomas] from comment #35) 
> RESOLVED GOODENOUGH ?

YES!
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.