Bug 1598333 - Optimise job_details inserts
Opened 5 years ago; closed 5 years ago
Status: RESOLVED FIXED
Categories: Tree Management :: Treeherder: Infrastructure, defect, P1
Tracking: Not tracked
People: Reporter: jgraham; Assignee: camd
References: Blocks 1 open bug
Attachments: 1 file
(I don't really know if this is related to 1597136 but was asked to file)
https://github.com/mozilla/treeherder/blob/4704764a896417c58e9cc3daf97bcdebb2f33b72/treeherder/etl/artifact.py#L24-L45 looks like it's doing one insert per row in the job_details table. That seems suspicious on performance grounds. I also wonder how common it is to reprocess jobs with different artifacts (i.e. it's unclear why this operation is update_or_create). A possible solution (sketched below) would be:
- Use the bulk_create API to insert the new rows with a single query.
- If that fails due to duplicates, select the existing rows, remove the entries that already exist from the batch, and try again (I assume it's not possible to get duplicates with different auxiliary data here, since creating an artifact is idempotent).
Updated 5 years ago
Assignee: nobody → cdawson
Priority: -- → P1

Comment 1 • 5 years ago (Assignee)

Updated 5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED