Closed Bug 1588436 Opened 5 years ago Closed 4 years ago

taskcluster-github: (ref, sha, event) -> task group ID

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: pmoore, Unassigned)

References

(Blocks 1 open bug)

Details

If you visit https://github.com/mozilla-releng/OpenCloudConfig/commits/master (and it isn't too far into the future after this bug has been created) you should see a commit from Wander "Deploying generic-worker 16.3.0 / taskcluster-proxy 5.1.0 to STAGING.".

If you follow the link to the task group inspector for that commit, you will reach https://tools.taskcluster.net/groups/dBZF1cHZQM6dU889wf8tcQ. If you select the one and only task for that task group, and view the task definition you will see it has

"GITHUB_HEAD_REF": "refs/heads/bug1587471"

The original github link shows commits for the master branch, but we see it has linked to a task for the bug1587471 branch.

This is because it has looked up the taskGroupId based on the git sha (df8b356d50ef67598e6eaf1c31b953a8f7fb8948) and found this other task group that was for a different branch, that had the same git sha. In this case, the git sha is not unique across branches as the same commit was pushed to master branch and to bug1587471 branch.

I think the following changes are worthwhile:

  1. [Necessary] Instead of mapping git sha to task group ID, instead the service should map the compound key (git sha, git ref, github event) to task group ID. This way, if the same commit was pushed to multiple branches, the link should still respect the branch that the user is inspecting in the github UI.

  2. [Preferred] If more than one result is returned, an alert should be thrown. Presumably, with the existing git sha -> task group ID mapping, more than one result is being returned, and the first or last result is being shown. However, if we checked that only one result is returned, and alerted if not, we would detect the problem and not provide a link at all, which could be the wrong one.

  3. [Preferred] If possible, we should have a table constraint to prevent duplicates, and detect the problem even earlier if an attempt is made to insert a key which is already present.

Flags: needinfo?(bugzeeeeee)

What is the question for me, Pete? I saw NI, but no question

Flags: needinfo?(bugzeeeeee) → needinfo?(pmoore)

Sorry, the needinfo was just a way of bringing your attention to the bug as module owner, so you could prioritise the issue against other issues, to see whether it is worth fixing or not, or how important you think it might be.

I'm not too familiar with the code, so I just wanted to raise awareness of the issue, but let you decide as module owner how important you think it is, and decide if or when we should fix it. Or for example, if you think it is is a good first bug, it would be a chance to tag it in case a contributor wants to work on it, etc.

In other words, I wasn't really expecting direct action, or it to be solved, just wanted to make you aware of the issue to essentially triage it. :-)

Flags: needinfo?(pmoore)

[Necessary] Instead of mapping git sha to task group ID, instead the service should map the compound key (git sha, git ref, github event) to task group ID.

What is "the service" here? The green check mark visible in GitHub’s web UI for commit list is a "status", which GitHub associates to commits by SHA regardless of what branch(es) that commit is in. As far as I understand this is just how GitHub works, and we don’t have control to change this.

Or are you saying that Taskcluster should maintain its own database with the desired mapping, plus UI to look things up?

Blocks: github-bugs

This is good background for a rethink of the GitHub integration.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.