Closed
Bug 1257602
Opened 9 years ago
Closed 7 years ago
Remove revision_hash code once all clients are transitioned to using revisions only
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P1)
Tree Management
Treeherder: Data Ingestion
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: camd, Assigned: emorley)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
There is some backward-compatible code for ``revision_hash``es since they're used everywhere at the moment. This is a bug to remove that code when that field is no longer used.
Should *probably* just remove the field from the DB as well..
Reporter | ||
Updated•9 years ago
|
Assignee: nobody → cdawson
Reporter | ||
Comment 1•8 years ago
|
||
There are still a few cases where we get revision_hash instead of revision due to longer-lived repos like ESR and such.
I have instrumentation in New Relic for when we see revision_hash in with the "revision_hash_usage" key. I'm not seeing any, but they're very infrequent. Only when longer-lived builds like ESR and Beta are built.
Comment 2•8 years ago
|
||
Can we backport the changes needed to get rid of revision hash to ESR (beta / release is not such a big deal, since they will auto-expire soon)? Especially since we're about to do large-scale changes to the way job and revision data is stored in bug 1178641 and friends, it would be nice not to have to worry about this.
Reporter | ||
Comment 3•8 years ago
|
||
Yeah, I agree. The main thing comes down to using the new v2 routing keys that use revision instead of revision_hash. Wes/Greg: how hard would it be to backport these to ESR or any other "old" repos?
from IRC: <garndt> Wouldn't be hard if sheriffs don't mind checking in some commits to them for me
Wes, is this something you'd be up for doing?
Flags: needinfo?(wkocher)
Happy to do so once those patches are posted.
Flags: needinfo?(wkocher) → needinfo?(garndt)
Comment 5•8 years ago
|
||
I commented on IRC, but probably better to ask here...
I can create a patch for the necessary branches, but it seems that some branches (like aurora) have some of the new in-tree scheduling changes that landed on m-c but not all of them. I am not aware of the merge schedule for aurora, but if I make changes to aurora specifically, would that cause merge problems later (I would be applying just the treeherder specific changes rather than the rest of the in-tree stuff)?
Comment 6•8 years ago
|
||
ok, confirmed with Kwierso on IRC that we can still create a patch, and if there is a merge conflict we can resolve that at that time.
Comment 7•8 years ago
|
||
So it appears that for aurora/beta we can wait until the next merge and it should get the newest in-tree changes that includes the new routes.
esr45 is the only branch that something manually should be done for, but I have not had a chance yet to create a patch. I'm going to keep ni? here just so it stays on my radar, but if this needs to be done soon, I would need to defer this to someone else while I'm on PTO (such as dustin or wander)
Reporter | ||
Comment 8•8 years ago
|
||
This is mostly so that we can remove some complicated backward-compatible code from treeherder, making other refactors we do easier. But a couple weeks won't make much difference. As I recall aurora/beta will trickle down by early August. If this came in sometime in August as well, that'd be great!
Thanks for sticking with this!
Assignee | ||
Comment 9•8 years ago
|
||
Btw I spotted that the python client's "support add_revision_hash but give a deprecation warning" behaviour is broken, since `required_properties` includes `revision`, which isn't set when just using `add_revision_hash`.
Probably not worth fixing that at this point though.
Comment 10•8 years ago
|
||
I haven't had a chance to get to this, so I'm hoping that soon as things merge around it'll automatically get these changes.
Flags: needinfo?(garndt)
Comment 11•8 years ago
|
||
(In reply to Greg Arndt [:garndt] from comment #10)
> I haven't had a chance to get to this, so I'm hoping that soon as things
> merge around it'll automatically get these changes.
ESR45 is going to be around for a while yet, would it be possible to port any changes to that branch only? I'm hoping to start a large-scale refactoring of the way we store revision data in treeherder in a week or so, and it would be nice not to have to worry about revision hashes.
I'm happy to help test stuff or even apply patches myself if you can point me at the right places.
Flags: needinfo?(garndt)
Comment 12•8 years ago
|
||
Alternately, maybe we could just turn off taskcluster for that branch? I'm not sure what good running tier 2 taskcluster builds there is doing...
https://treeherder.mozilla.org/#/jobs?repo=mozilla-esr45
Comment 13•8 years ago
|
||
William, does it mean I have to update all of those instances for mozmill-ci?
https://github.com/mozilla/mozmill-ci/search?utf8=%E2%9C%93&q=revision_hash
Comment 14•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #13)
> William, does it mean I have to update all of those instances for mozmill-ci?
>
> https://github.com/mozilla/mozmill-ci/search?utf8=%E2%9C%93&q=revision_hash
To be honest I'm still not an expert on this part of the Treeherder code (yet). From what I understand it doesn't matter if you're still submitting the revision_hash, just that you should also always submit the revision. It might reduce confusion if you took this out of your code though. :camd, can you confirm this?
Flags: needinfo?(cdawson)
Reporter | ||
Comment 15•8 years ago
|
||
Yikes, sorry it took me so long to respond. Yeah, that's correct. But you can submit both. We'll use the revision anyway.
It would be better to remove all the references to revision_hash, if possible though. We will hopefully be removing all code using revision_hash in the next quarter or so. I don't think we'll fail if it's there, it'll just be ignored. But better to remove the confusion.
Flags: needinfo?(cdawson)
Comment 16•8 years ago
|
||
Ok so I assume it also includes the retrieval of jobs. Right now we determine the revision hash first before using the API to retrieve jobs. Does it mean we can do it directly now?
Assignee | ||
Comment 17•8 years ago
|
||
Adding bug 1306674 to dependencies, since whilst it doesn't block per se, cleanup here would simplify the refactoring.
Blocks: 1306674
Assignee | ||
Comment 19•8 years ago
|
||
There are still some things submitting revision hashes, but I'm struggling to tell what:
https://insights.newrelic.com/accounts/677903/explorer/events?eventType=revision_hash_usage&duration=604800000
I think we need more parameters to be included in the New Relic logging call.
Comment 20•8 years ago
|
||
Could this be us with mozmill-ci?
https://github.com/mozilla/mozmill-ci/blob/master/jenkins-master/jobs/scripts/workspace/submission.py#L120
Assignee | ||
Comment 21•8 years ago
|
||
Ah yes that must be it. I don't suppose you could switch it to using revision, along the lines of the deprecation message from the Python client? :-)
https://github.com/mozilla/treeherder/blob/184ba384f0c132f01470ef39c189fc14c66d0cc4/treeherder/client/thclient/client.py#L150-L157
Comment 22•8 years ago
|
||
I filed https://github.com/mozilla/mozmill-ci/issues/839 to get this removed. Please give me some more information over there, so I can get started. Thanks.
Comment 23•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #19)
> There are still some things submitting revision hashes, but I'm struggling
> to tell what:
> https://insights.newrelic.com/accounts/677903/explorer/
> events?eventType=revision_hash_usage&duration=604800000
>
> I think we need more parameters to be included in the New Relic logging call.
Highly likely that part of them are caused by pushes to the esr45 branch, which hasn't been patched to not use revision hash. I'm investigating if we can just land a patch there to turn off taskcluster automation all together. Otherwise I'll need to pick this back up and try to get rid of the revision hash stuff.
Flags: needinfo?(garndt)
Comment 24•8 years ago
|
||
(In reply to Greg Arndt [:garndt] from comment #23)
> (In reply to Ed Morley [:emorley] from comment #19)
> > There are still some things submitting revision hashes, but I'm struggling
> > to tell what:
> > https://insights.newrelic.com/accounts/677903/explorer/
> > events?eventType=revision_hash_usage&duration=604800000
> >
> > I think we need more parameters to be included in the New Relic logging call.
>
> Highly likely that part of them are caused by pushes to the esr45 branch,
> which hasn't been patched to not use revision hash. I'm investigating if we
> can just land a patch there to turn off taskcluster automation all together.
> Otherwise I'll need to pick this back up and try to get rid of the revision
> hash stuff.
esr45 is going to be retired pretty soon, so it might be easiest to just wait.
Comment 25•8 years ago
|
||
(In reply to William Lachance (:wlach) (use needinfo!) from comment #24)
> esr45 is going to be retired pretty soon, so it might be easiest to just
> wait.
It's still 3*6 weeks out, so 18 weeks. If it can wait 4 and 1/2 months...
Comment 26•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #25)
> (In reply to William Lachance (:wlach) (use needinfo!) from comment #24)
> > esr45 is going to be retired pretty soon, so it might be easiest to just
> > wait.
>
> It's still 3*6 weeks out, so 18 weeks. If it can wait 4 and 1/2 months...
Hmm, that's longer than I thought, but I'd still wait (unless it's super easy to turn off taskcluster on esr45). It would be nice to clean this up, but it's not critical.
Assignee | ||
Comment 27•8 years ago
|
||
If running Taskcluster jobs on esr45 is providing no value it seems like it would make sense to save the AWS cost and switch them off entirely.
If turning them off is too much of a pain from a technical point of view (but there there is still consensus that they provide no value), then Treeherder could always just ignore any jobs that use revision_hash (once the mozmill-ci parts are switched over).
Comment 28•8 years ago
|
||
After speaking with RyanVM, by disabling TaskCluster on ESR45, we would lose the static analysis builds since those are not running in buildbot.
Assignee | ||
Comment 29•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #25)
> It's still 3*6 weeks out, so 18 weeks. If it can wait 4 and 1/2 months...
I don't think it's that long, unless I'm reading https://wiki.mozilla.org/RapidRelease/Calendar wrong?
52.1 will be released on 2017-04-18, and that normally marks the last ESR point release (in this case 45.9).
Comment 30•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #29)
> 52.1 will be released on 2017-04-18, and that normally marks the last ESR
> point release (in this case 45.9).
We have to keep it working until 2017-06-13 which is the date when esr45 get abandoned. Until then we can always have a security release.
Assignee | ||
Updated•8 years ago
|
Assignee: cdawson → emorley
Assignee | ||
Comment 31•7 years ago
|
||
ESR45 is now EOL, I'll open a bug to have it removed from Treeherder.
Another submitter I've found is AWFY - I've opened a PR to switch their TreeherderClient usage to add_revision() from add_revision_hash():
https://github.com/mozilla/arewefastyet/pull/171
Assignee | ||
Updated•7 years ago
|
See Also: → https://github.com/mozilla/arewefastyet/pull/171
Assignee | ||
Comment 32•7 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #31)
> Another submitter I've found is AWFY - I've opened a PR to switch their
> TreeherderClient usage to add_revision() from add_revision_hash():
> https://github.com/mozilla/arewefastyet/pull/171
My PR has now been merged, so AWFY is no longer using revision_hash.
With that, there don't appear to be any more submitters using it, but I'll check again in 24 hours now that the New Relic noise from AWFY is gone (the custom event doesn't record the submitter unfortunately, so no way to filter).
Comment 33•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8907537 -
Flags: review?(cdawson)
Assignee | ||
Comment 34•7 years ago
|
||
Looking at New Relic earlier showed there were still jobs using revision_hash. In order to try and track them down we need more metadata to be included in the New Relic custom event, hence the PR.
I deployed that PR to stage just now, and it found that funsize is still using revision_hash, for which I've filed bug 1399440.
Reporter | ||
Updated•7 years ago
|
Attachment #8907537 -
Flags: review?(cdawson) → review+
Comment 35•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/da4ba40c7d3342be6fcfca6076543371aa39617e
Bug 1257602 - Include more metadata in revision_hash usage event (#2769)
Since previously it was virtually impossible to track down the source
of jobs that are still using `revision_hash`.
Assignee | ||
Comment 36•7 years ago
|
||
I have a branch for this locally which is pretty much done, however I'm wondering how best to handle the job pulse schema? Rather than revving the version now, I was thinking we could just adjust the description to say the revision_hash field will now be ignored, and then when we get around to a v2 schema (or switch to consuming from TC directly), we could fold several changes into one?
Thoughts?
Flags: needinfo?(cdawson)
Reporter | ||
Comment 37•7 years ago
|
||
Yeah, that sounds good. We should continue to log to new relic so we can contact people. But odds are if they're not seeing their jobs, then they'll gripe and we can tell them. Or they won't notice and don't care, so we can ignore them. :)
Thanks for doing this!!
Flags: needinfo?(cdawson)
Assignee | ||
Comment 38•7 years ago
|
||
Checking New Relic Insights shows nothing is using revision_hash any more. Let's do this!
Status: NEW → ASSIGNED
Priority: -- → P1
Comment 39•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8931458 -
Flags: review?(cdawson)
Reporter | ||
Comment 40•7 years ago
|
||
Comment on attachment 8931458 [details] [review]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/2983
I couldn't find anything you missed. This is awesome; so incredibly cathartic... :)
Attachment #8931458 -
Flags: review?(cdawson) → review+
Comment 41•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/f7f38ef4d0b52b0fe2c583a0ea9c35f76236e851
Bug 1257602 - Remove support for revision_hash (#2983)
Now that no submissions are using revision_hash, it can be removed.
This removes everything but the model field, which will be handled
later.
I've removed revision_hash from the Pulse jobs schema without bumping
the version, which wouldn't normally be ok, but no one is still using
it, and I'd rather have explicit failures later than if we left the
schema unchanged.
Comment 42•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8950431 -
Flags: review?(cdawson)
Reporter | ||
Comment 43•7 years ago
|
||
Comment on attachment 8950431 [details] [review]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/3218
It's finally gone! :)
Attachment #8950431 -
Flags: review?(cdawson) → review+
Comment 44•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/b98094d4f2490b3cdbf55c19915688b3a17fe9ff
Bug 1257602 - Remove revision_hash from the push table (#3218)
The last of the usages were removed a while ago, so we're safe to
drop the column. The auto-generated migration caused an exception
until it was manually re-ordered, due to a Django bug:
https://code.djangoproject.com/ticket/29124
The `push` table on production is only 300K rows and 100MB, so the
migration should be fairly fast.
The output from `sqlmigrate` is:
```
BEGIN;
--
-- Alter field revision on push
--
ALTER TABLE `push` MODIFY `revision` varchar(40) NOT NULL;
--
-- Alter unique_together for push (1 constraint(s))
--
CREATE INDEX `push_repository_id_e7501345` ON `push` (`repository_id`);
ALTER TABLE `push` DROP INDEX `push_repository_id_revision_hash_3cd3c5e3_uniq`;
--
-- Remove field revision_hash from push
--
ALTER TABLE `push` DROP COLUMN `revision_hash`;
COMMIT;
```
Assignee | ||
Updated•7 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 45•7 years ago
|
||
Dropping the column only took 45 seconds on stage (and will be quicker on prod since db.m4.2xlarge there vs db.m4.xlarge on stage), so looking fine for when the prod deploy occurs.
Assignee | ||
Comment 46•7 years ago
|
||
Feb 13 18:08:46 treeherder-stage app/release.7432: -----> PRE-DEPLOY: Running Django migration...
Feb 13 18:08:49 treeherder-stage app/release.7432: Operations to perform:
Feb 13 18:08:49 treeherder-stage app/release.7432: Apply all migrations: auth, contenttypes, credentials, model, perf, sessions, seta
Feb 13 18:08:49 treeherder-stage app/release.7432: Running migrations:
Feb 13 18:09:35 treeherder-stage app/release.7432: Applying model.0021_remove_revision_hash... OK
You need to log in
before you can comment on or make changes to this bug.
Description
•