Index should be using full 40 character revisions

RESOLVED FIXED

Status

RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: gps, Assigned: jlund)

Tracking

unspecified
Dependency tree / graph

Firefox Tracking Flags

(firefox42 fixed)

Details

Attachments

(1 attachment)

(Reporter)

Description

3 years ago
As part of a code review for bug 1162191, it was noticed that the TaskCluster Index API is using 12 character Mercurial short revisions for the <revision> parameter. Furthermore, look-ups against the full 40 character revisions were failing.

I'm not sure whether the 12 character revisions are a side-effect of how automation is scheduling/storing tasks or whether TaskCluster is truncating the 20 byte / 40 character revisions down to 12 characters for storage and/or exposure to the Index API. Either way it is wrong: *all automation should use the full 20 byte / 40 character SHA-1 hashes*. Full stop. No exceptions.

The birthday paradox says we'll get a collision in sqrt(256^6) = 256^3 = 16,777,216 commits, which is within 2 magnitudes of the current repository size. We will have a SHA-1 prefix collision sooner or later and this will almost certainly cause chaos by automated systems unless full SHA-1 hashes are used. See also http://www.jandemooij.nl/blog/2015/05/05/using-rust-to-generate-mercurial-short-hash-collisions/ for how easy it is to generate short hash collisions (this could be abused to create a hash collision that jeopardizes a Firefox chemspill, for example).

Please use full 20 byte / 40 character SHA-1 revisions everywhere. No exceptions.
Thanks for filing this; I found this issue in Task Cluster and in Tree Herder and just didn't get to filing it.  I don't know how deep the issue runs.

Comment 2

3 years ago
Should TH show on its UI the 40-char value?
I don't mind if the 12-char is used for visual purposes but we should allow for the 40-char to be copied.
I think.

Comment 3

3 years ago
I took a look at our in tree configs for b2g, emulator, mulet, etc based builds and it appears that they are using the 40 char revisions.

When browsing through our index browser [1] , these are the revisions I see:

Which after looking at mozilla-taskcluster (the tool that schedules taskcluster jobs per push), it looks like we just get the revision from the pushlog and use that in its entirety.

I did notice that some indexed routes under the buildbot namespace were 12 characters.  Digging in, I saw this [2] so perhaps that is where some of the routes are coming from, unless I'm mistaken.

Taskcluster index doesn't have the concept of revisions in it, so it's whatever the client creates.

Let me know if this helps out or if I can dig in more to figure out what might be going wrong so we can fix it immediately.  Thanks for raising this bug.

[1] https://tools.taskcluster.net/index/artifacts/#gecko.v1.mozilla-central.revision.linux/gecko.v1.mozilla-central.revision.linux
[2] https://dxr.mozilla.org/build:mozharness/source/mozharness/mozilla/building/buildbase.py#1126
(Reporter)

Comment 4

3 years ago
(In reply to Armen Zambrano G. (:armenzg - Toronto) from comment #2)
> Should TH show on its UI the 40-char value?
> I don't mind if the 12-char is used for visual purposes but we should allow
> for the 40-char to be copied.
> I think.

For presentation purposes, the 12 character form is acceptable. But the links and API calls should all be using the 40 character form. Mercurial's templating engine has a "shortest(rev, length)" function that displays the shortest unambiguous hash for a given revision. While Mercurial's templates don't all use it, "shortest(rev, 12)" should ideally be used in all places where a hash is displayed. Other UIs like Treeherder should arguably do something similar.
(In reply to Greg Arndt [:garndt] from comment #3)

> https://dxr.mozilla.org/build:mozharness/source/mozharness/mozilla/building/
> buildbase.py#1126

Nice spot, this indeed looks like a mozharness issue that got introduced in bug 858797.

I'm not sure if this is the only culprit, but like Greg says, since the Index doesn't have any concept of revision number, this is an issue with systems using the index, rather than the index itself. Therefore moving to Mozharness component...

Jordan, can this substring be changed to take the full 40 character hash?

Pete
Component: TaskCluster → Mozharness
Flags: needinfo?(jlund)
Product: Testing → Release Engineering
QA Contact: jlund
(Assignee)

Comment 6

3 years ago
(In reply to Pete Moore [:pmoore][:pete] from comment #5)
> (In reply to Greg Arndt [:garndt] from comment #3)
> 
> > https://dxr.mozilla.org/build:mozharness/source/mozharness/mozilla/building/
> > buildbase.py#1126
> 
> Nice spot, this indeed looks like a mozharness issue that got introduced in
> bug 858797.
> 
> I'm not sure if this is the only culprit, but like Greg says, since the
> Index doesn't have any concept of revision number, this is an issue with
> systems using the index, rather than the index itself. Therefore moving to
> Mozharness component...
> 
> Jordan, can this substring be changed to take the full 40 character hash?
> 
> Pete

apologies for delay

I think it's just for short convenience. best thing to do would be do a try run on a couple platforms/tests with the following change in query_revision (^ buildbase.py#1126):
s/revision[0:12]/revision

query_revision is used quite a lot and the unknowns would be places like post_upload.py, source-package target (android), query_pushdate (for taskcluster submissions), etc. However a try run will be the quickest way to spot any spurious results.
Flags: needinfo?(jlund)
Hey gps, are you happy to take it from here?

Thanks,
Pete
Flags: needinfo?(gps)
(Reporter)

Comment 8

3 years ago
I usually don't go digging into the bowels of mozharness. I'd prefer to just be the bug reporter and let someone with more domain over this code fix things.
Flags: needinfo?(gps)
Hey Jordan,

Are you ok to pick this one up?

Thanks,
Pete
Flags: needinfo?(jlund)
(Assignee)

Comment 10

3 years ago
I've put aside other tasks this week to wrap up q2. I have put in a reminder to test this in a few weeks.
Assignee: nobody → jlund
Flags: needinfo?(jlund)
Created attachment 8641253 [details] [diff] [review]
0001-Bug-1175655-mozharness-should-use-full-hg-revision.patch

I tried this while working on bug 1133074, and it seems to work fine:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=053c33f819e8
Attachment #8641253 - Flags: review?(jlund)

Updated

3 years ago
Blocks: 1133074
(Assignee)

Comment 12

3 years ago
Comment on attachment 8641253 [details] [diff] [review]
0001-Bug-1175655-mozharness-should-use-full-hg-revision.patch

Review of attachment 8641253 [details] [diff] [review]:
-----------------------------------------------------------------

this may bubble up issues with nightlies or releases or it may just be a convenience thing for shortening a url or prop with the rev in it but since I can's see how, let's try it :)
Attachment #8641253 - Flags: review?(jlund) → review+
https://hg.mozilla.org/mozilla-central/rev/88eead104033
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox42: --- → fixed
Resolution: --- → FIXED
Depends on: 1217223
You need to log in before you can comment on or make changes to this bug.