Closed Bug 1291882 Opened 9 years ago Closed 9 years ago

Some Jobs not being stored or updated because of duplicates in tables being called with get_or_create()

Categories

(Tree Management :: Treeherder, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: camd, Assigned: camd)

References

Details

Attachments

(1 file, 1 obsolete file)

When we do a get_or_create() on either of these tables, it sometimes creates a duplicate record. That's because we don't have a unique_together index on the fields in question.
The tables I know of so far are build_platform and machine_platform.
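For illustration, the failure mode can be reproduced with a minimal sketch. It uses Python's sqlite3 module as a stand-in for the real database, and a hypothetical helper, get_or_create_platform, that only approximates what Django's get_or_create() does (look up, insert if missing, raise when more than one row matches). The column names come from this bug; the helper, the exception name, the index name and the sample values are assumptions.

```python
import sqlite3


class MultipleRowsReturned(Exception):
    """Hypothetical stand-in for Django's MultipleObjectsReturned."""


INSERT_SQL = (
    "INSERT INTO build_platform (os_name, platform, architecture) "
    "VALUES (?, ?, ?)"
)


def make_db(with_unique_index):
    """Create an in-memory build_platform table, optionally with a unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE build_platform ("
        "id INTEGER PRIMARY KEY, os_name TEXT, platform TEXT, architecture TEXT)"
    )
    if with_unique_index:
        conn.execute(
            "CREATE UNIQUE INDEX build_platform_uniq "
            "ON build_platform (os_name, platform, architecture)"
        )
    return conn


def get_or_create_platform(conn, key):
    """Return (id, created); raise if the lookup matches more than one row."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM build_platform "
        "WHERE os_name = ? AND platform = ? AND architecture = ?", key)]
    if len(ids) > 1:
        raise MultipleRowsReturned(key)
    if ids:
        return ids[0], False
    return conn.execute(INSERT_SQL, key).lastrowid, True


key = ("linux", "linux64", "x86_64")  # made-up sample values

# Without a unique index, a second writer that missed the first row
# can insert a duplicate, and every later lookup then fails.
plain = make_db(with_unique_index=False)
first_id, created = get_or_create_platform(plain, key)
plain.execute(INSERT_SQL, key)  # simulates the racing writer
try:
    get_or_create_platform(plain, key)
    lookup_failed = False
except MultipleRowsReturned:
    lookup_failed = True

# With the unique index, the racing insert is rejected by the database.
guarded = make_db(with_unique_index=True)
get_or_create_platform(guarded, key)
try:
    guarded.execute(INSERT_SQL, key)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
row_count = guarded.execute("SELECT COUNT(*) FROM build_platform").fetchone()[0]
```

The point of the sketch is that the lookup-then-insert pattern is only safe when the database itself enforces uniqueness; without the index, a single duplicate is enough to make every later lookup for that key fail.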
Summary: Jobs not being stored because of duplicates in the build_platform and machine_platform tables → Some Jobs not being stored or updated because of duplicates in tables being called with get_or_create()
Comment on attachment 8777576
[treeherder] mozilla:multiple-objects-workaround > mozilla:master

This is a temporary workaround until we can get the db fixed.
Attachment #8777576 - Flags: review?(emorley)
This is going to take a few steps:
1. Push the attached patch, which will allow job ingestion despite the duplicate rows.
2. Repoint all job records that reference a duplicate to the first matching record.
3. Delete the duplicate records.
4. Create a migration with a unique_together index on os_name, platform and architecture for each table.
5. Undo or revert the initial PR.
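Steps 2 to 4 could look roughly like the sketch below. It runs against an in-memory SQLite database rather than the real one, and the job table, its build_platform_id column, the index name and the sample rows are all simplified assumptions made for illustration, not the actual Treeherder schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE build_platform (
        id INTEGER PRIMARY KEY, os_name TEXT, platform TEXT, architecture TEXT);
    -- Hypothetical, simplified job table; only the foreign key matters here.
    CREATE TABLE job (id INTEGER PRIMARY KEY, build_platform_id INTEGER);
    INSERT INTO build_platform (id, os_name, platform, architecture) VALUES
        (1, 'linux', 'linux64', 'x86_64'),
        (2, 'linux', 'linux64', 'x86_64'),   -- duplicate of row 1
        (3, 'mac', 'osx-10-10', 'x86_64');
    INSERT INTO job (id, build_platform_id) VALUES (10, 1), (11, 2), (12, 3);
""")


def dedupe_build_platform(conn):
    """Repoint jobs, delete duplicate rows, then add the unique index."""
    # One row per duplicated key, carrying the lowest id as the row to keep.
    groups = conn.execute(
        "SELECT MIN(id), os_name, platform, architecture "
        "FROM build_platform "
        "GROUP BY os_name, platform, architecture HAVING COUNT(*) > 1"
    ).fetchall()
    removed = 0
    for keep_id, os_name, platform, architecture in groups:
        dup_ids = [row[0] for row in conn.execute(
            "SELECT id FROM build_platform "
            "WHERE os_name = ? AND platform = ? AND architecture = ? "
            "AND id != ?",
            (os_name, platform, architecture, keep_id))]
        for dup_id in dup_ids:
            # Step 2: point jobs at the first matching record.
            conn.execute(
                "UPDATE job SET build_platform_id = ? "
                "WHERE build_platform_id = ?", (keep_id, dup_id))
            # Step 3: delete the now-unreferenced duplicate.
            conn.execute("DELETE FROM build_platform WHERE id = ?", (dup_id,))
            removed += 1
    # Step 4: enforce uniqueness so new duplicates cannot appear.
    conn.execute(
        "CREATE UNIQUE INDEX build_platform_uniq "
        "ON build_platform (os_name, platform, architecture)")
    conn.commit()
    return removed


removed = dedupe_build_platform(conn)
job_targets = conn.execute(
    "SELECT id, build_platform_id FROM job ORDER BY id").fetchall()
```

The same routine would need to be repeated for machine_platform. The order matters: the index can only be created once the duplicates are gone, and the duplicates can only be deleted once no job references them.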
OK, so I checked, and this was localized to try. (Sure appreciated the run_sql command.) So I manually pointed the jobs using the duplicate records to the first in the list and deleted the dups from build_platform and machine_platform. I may go ahead and create the unique index on those tables by hand and create the migration after. We could possibly forgo the first patch with the workaround. Only 6 or so duplicate records were created since this code was introduced in March. So it's not as bad as I'd feared.
Attachment #8777576 - Attachment is obsolete: true
Attachment #8777576 - Flags: review?(emorley)
Assignee: nobody → cdawson
Priority: -- → P1
Comment on attachment 8777624
[treeherder] mozilla:unique-together-machine-build-platforms > mozilla:master

I already created the indexes for this on stage and prod so we won't get any more. If this looks good to you, I'll merge and it will update Heroku when we push it there.
Attachment #8777624 - Flags: review?(emorley)
OK, amending comment 4... :) The issue was localized to just try, which made it easy to fix the job records. We can skip the earlier PR I made, I think. Just adding the index should be enough to resolve this.
Blocks: 1270629
> I already created the indexes for this on stage and prod so we won't get any more.

Won't this mean the migration (and thus deploy) will fail?
Comment on attachment 8777624
[treeherder] mozilla:unique-together-machine-build-platforms > mozilla:master

Thank you for looking into this! :-)
Attachment #8777624 - Flags: review?(emorley) → review+
(In reply to Ed Morley [:emorley] from comment #9)
> > I already created the indexes for this on stage and prod so we won't get any more.
>
> Won't this mean the migration (and thus deploy) will fail?

I'll go ahead and fake the migration on stage and prod prior to deploy. tbh, I'm not sure it would break it. It *might* just create a second unique index with another name. I'll try it on stage and see how that goes. Thanks for the quick review!
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/0f5564d13d5167e9b442d5b6d62b5fa98b2a2438
Bug 1291882 - Add unique_together index on build_platform table (#1759)
Also to the machine_platform table. This is necessary because we use a get_or_create() on these tables, but without the unique index, we can (and did) get duplicates which then blocked data ingestion of jobs on try.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
The Heroku stage deploy failed with this during the migration:
django.db.utils.IntegrityError: (1062, "Duplicate entry '--android-api-15-gradle--' for key 'build_platform_os_name_6c016409abba5b6e_uniq'")
(https://papertrailapp.com/systems/treeherder-stage/events?r=697898411588091934-697898549874290721)
I don't suppose you could clean up the duplicate rows and then press the deploy button on the web UI, to trigger another run? Thanks :-)
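A pre-flight check along the following lines would surface such rows before a deploy. This is a sketch against an in-memory SQLite database, not the MySQL instance from the log. The helper find_duplicate_groups and the index name are hypothetical, and the sample values are only a guess at how the key in the logged error splits across os_name, platform and architecture.

```python
import sqlite3

PLATFORM_TABLES = ("build_platform", "machine_platform")


def find_duplicate_groups(conn, table):
    """Return (os_name, platform, architecture, count) for each duplicated key."""
    if table not in PLATFORM_TABLES:
        # Table names cannot be bound as parameters, so whitelist them.
        raise ValueError("unexpected table: {}".format(table))
    return conn.execute(
        "SELECT os_name, platform, architecture, COUNT(*) FROM {} "
        "GROUP BY os_name, platform, architecture HAVING COUNT(*) > 1 "
        "ORDER BY os_name, platform, architecture".format(table)
    ).fetchall()


conn = sqlite3.connect(":memory:")
for table in PLATFORM_TABLES:
    conn.execute(
        "CREATE TABLE {} (id INTEGER PRIMARY KEY, "
        "os_name TEXT, platform TEXT, architecture TEXT)".format(table))

# Sample rows (assumed values): one duplicated key and one unique key.
rows = [
    ("-", "android-api-15-gradle", "-"),
    ("-", "android-api-15-gradle", "-"),
    ("linux", "linux64", "x86_64"),
]
conn.executemany(
    "INSERT INTO build_platform (os_name, platform, architecture) "
    "VALUES (?, ?, ?)", rows)

duplicates = find_duplicate_groups(conn, "build_platform")

# Creating the unique index while duplicates remain is rejected,
# which mirrors the failed migration above.
try:
    conn.execute(
        "CREATE UNIQUE INDEX build_platform_uniq "
        "ON build_platform (os_name, platform, architecture)")
    index_created = True
except sqlite3.IntegrityError:
    index_created = False
```

If the check returns any rows for either table, the unique index cannot be created until those rows are merged or removed.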
Flags: needinfo?(cdawson)
OK, Heroku stage is fixed now. Deploy succeeded.
Flags: needinfo?(cdawson)
Awesome - thank you :-)
Depends on: 1304338