Open Bug 1979059 Opened 25 days ago Updated 7 days ago

improve performance of group_results API endpoint

Categories

(Tree Management :: Treeherder: API, task)

Tracking

(Not tracked)

People

(Reporter: aryx, Unassigned)

References

(Blocks 1 open bug)

Details

The group_results API end point is slower, 3-4s per request, e.g. https://treeherder.mozilla.org/api/project/autoland/push/group_results/?revision=<revisionhere>&format=json.

Interestingly, this current query

select group_status.status,
       "group".name,
       taskcluster_metadata.task_id
from "group"
join group_status
on "group".id = group_status.group_id
join job_log
on group_status.job_log_id = job_log.id
join job
on job_log.job_id = job.id
join push
on job.push_id = push.id
join repository
on push.repository_id = push.repository_id
join taskcluster_metadata
on job.id = taskcluster_metadata.job_id
where repository.name = 'autoland' and
      push.revision = 'cbd3b1d025d3a7f98753c65078a50c01c9f49778' and
      group_status.status in (1, 2)

is faster than this simplified one:

select taskcluster_metadata.task_id
from job
join push
on job.push_id = push.id
join repository
on push.repository_id = push.repository_id
join taskcluster_metadata
on job.id = taskcluster_metadata.job_id
where repository.name = 'autoland' and
      push.revision = 'cbd3b1d025d3a7f98753c65078a50c01c9f49778'

This query yields the same result as the simplified query but with a ~ 10x time improvement

SELECT taskcluster_metadata.task_id
FROM taskcluster_metadata
INNER JOIN job ON taskcluster_metadata.job_id = job.id
INNER JOIN
  (SELECT DISTINCT push.id
   FROM push
   JOIN repository ON push.repository_id = push.repository_id
   WHERE repository.name = 'autoland'
     AND push.revision = 'cbd3b1d025d3a7f98753c65078a50c01c9f49778' ) AS pushes ON (pushes.id = job.push_id)

The trick is that the job table is huge obviously, so if we avoid iterating a lot on it by joining it on a subquery that gives a small list of results, it's much faster.

I might have done this by hacking around with running tests standalone- specifically:

  • groups are supposed to be manifests- total is ~4200
  • I ran most mochitests standalone, therefore each test name was a group: now total is ~26000

so the 6x increase is unique group.name values is probably not helping.

You need to log in before you can comment on or make changes to this bug.