Closed Bug 1148941 Opened 9 years ago Closed 8 years ago

gecko, tc-vcs: Decision tasks should fail if tc-vcs fails to pull hg.m.o

Categories

(Taskcluster :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: jonasfj, Unassigned)

References

Details

(Whiteboard: [tc-vcs])

Attachments

(1 file)

See:
http://docs.taskcluster.net/tools/task-inspector/#7yZHS0D3RH2WXsHNyP7YCw/0

The task basically ended up running the wrong revision.
That is pretty bad: we only discovered it here because running
the wrong revision caused a test to fail due to configuration issues.

We have the following issues here:
 1) tc-vcs **must** exit non-zero if it doesn't do the job
 2) decision task **must** fail if tc-vcs fails
 3) decision task should validate that it has the right checkout
 4) tc-vcs should retry with exponential back-off

Issue (3) is a nice-to-have, but a simple sanity check that we got the right
revision would add some extra robustness. We could do it right here:
https://dxr.mozilla.org/mozilla-central/source/testing/docker/decision/bin/entrypoint#23
Just: if [ "$(tc-vcs revision "$DEST")" != "$GECKO_HEAD_REV" ]; then exit 1; fi
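A slightly more defensive version of that one-liner could look like the sketch below. It is only an illustration: `verify_revision` is a hypothetical helper, not part of tc-vcs, and it assumes that either value may be an abbreviated hash, so the shorter one is accepted as a prefix of the longer one. An empty value (for example, when the revision lookup itself failed) is never treated as a match.

```shell
#!/usr/bin/env bash
# Sketch only: verify_revision is a hypothetical helper, not part of tc-vcs.

# Succeeds when the checked-out revision matches the expected one.
# Assumption: either value may be an abbreviated hash, so the shorter
# one is compared as a prefix of the longer one.
verify_revision() {
  local actual="$1" expected="$2"
  # An empty value means the lookup failed; never treat that as a match.
  if [ -z "$actual" ] || [ -z "$expected" ]; then
    echo "revision check failed: empty revision" >&2
    return 1
  fi
  case "$actual" in
    "$expected"*) return 0 ;;
  esac
  case "$expected" in
    "$actual"*) return 0 ;;
  esac
  echo "revision mismatch: got $actual, expected $expected" >&2
  return 1
}
```

In the entrypoint this might be called as `verify_revision "$(tc-vcs revision "$DEST")" "$GECKO_HEAD_REV" || exit 1`. Prefix matching is a convenience that also accepts very short prefixes; an exact string comparison is stricter if both sides are known to be full hashes.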
Any news on this?
Flags: needinfo?(jopsen)
Did we fix this... I can't remember if you fixed this.

Looking at entrypoint for decision task, it seems we're still missing the robustness check in bash.
Assuming we want it.
Flags: needinfo?(jopsen) → needinfo?(jlal)
Assignee: nobody → jlal
Flags: needinfo?(jlal)
Bug 1148941 - Use set in decision task vcs pull r=auswerk
Attachment #8623313 - Flags: review?(aus)
https://reviewboard.mozilla.org/r/11481/#review9905

::: testing/docker/builder/bin/decision.sh:1
(Diff revision 1)
> +set -ex

This is the important part
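For context on why that line matters: without `-e`, a shell script keeps going after a failed command and exits with the status of its last command, so a failed pull can be masked by whatever runs afterwards. The sketch below shows the difference, using `false` as a stand-in for a failing vcs pull; the two function names are hypothetical and exist only for this illustration. (`-x` additionally traces each command to stderr, which helps when reading the task log.)

```shell
# Sketch only: `false` stands in for a vcs pull that fails.

# Without -e the failure is ignored: the script carries on and exits
# with the status of its last command (the echo), which is 0.
run_without_errexit() {
  sh -c 'false; echo "continuing with whatever is on disk"'
}

# With -e the script stops at the first failing command and exits
# with that command's non-zero status; the echo is never reached.
run_with_errexit() {
  sh -ec 'false; echo "never reached"'
}
```

Each variant runs in its own `sh` process on purpose: when a function body runs as part of an `if`, `&&` or `||` construct, the shell ignores `-e` inside it, which would hide the effect being demonstrated.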
Comment on attachment 8623313 [details]
MozReview Request: Bug 1148941 - Use set in decision task vcs pull r=auswerk

https://reviewboard.mozilla.org/r/11483/#review9909

Ship It!
Attachment #8623313 - Flags: review?(aus) → review+
Reason for the backout:

23:03:01     INFO - Copy/paste: hgtool.py -r RELEASE_AUTOMATION https://hg.mozilla.org/build/compare-locales /home/worker/workspace/B2G/compare-locales
23:03:01     INFO - Using env: {'PATH': '/home/worker/workspace/gecko/testing/taskcluster/scripts/builder:/tools/gcc-4.7.3-0moz1/bin:/tools/tools/buildfarm/utils:/tools/python27-mercurial/bin:/tools/python27/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/worker/bin/'}
23:03:01     INFO -  Traceback (most recent call last):
23:03:01     INFO -    File "/tools/tools/buildfarm/utils/hgtool.py", line 14, in <module>
23:03:01     INFO -      from util.hg import mercurial, out, remove_path
23:03:01     INFO -    File "/tools/python27/lib/python2.7/site-packages/buildtools-1.0.6-py2.7.egg/util/hg.py", line 11, in <module>
23:03:01     INFO -      from util.retry import retry, retrier
23:03:01     INFO -    File "/tools/python27/lib/python2.7/site-packages/buildtools-1.0.6-py2.7.egg/util/retry.py", line 6, in <module>
23:03:01     INFO -      from redo import retry, retriable, retrying, retrier
23:03:01     INFO -  ImportError: No module named redo
23:03:01    ERROR - Return code: 1

https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/Pj1RXMXLSkCNv_qJm4BczA/0/public/logs/live_backing.log
https://treeherder.mozilla.org/#/jobs?repo=b2g-inbound&revision=47ef6f3abc9f
Component: TaskCluster → General
Product: Testing → Taskcluster
Resurrecting this bug to see whether it still needs to be worked on or whether the issue has been resolved. Comment 10 is about a failed build, but this patch does not touch any build tasks. It only changed the decision task that extends the task graph.

Armen, this is old, and I apologize for bugging you with it, but do you recall why this patch was backed out over failures in tasks that it didn't touch?
Flags: needinfo?(armenzg)
garndt: I would try to push to try and see what happens.
If we could output the revision of /tools we would know the real difference.

From looking at the treeherder push we can see all builds are orange:
https://treeherder.mozilla.org/#/jobs?repo=b2g-inbound&revision=47ef6f3abc9f&filter-searchStr=taskcluster

Perhaps the image of taskcluster/builder:0.6.0 has a newer tools repo which has the new redo dependency? [1]

For the record, we can now call hgtool.py from external_tools/ inside of mozharness instead of the one in the tools repo,
which makes the tools repo unnecessary.



(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #0)
>  3) decision task should validate that it has the right checkout
> 
> Issue (3) is a nice to have. But a simple sanity check that we got the right
> revision would add some extra robustness, we could do it right here:
> https://dxr.mozilla.org/mozilla-central/source/testing/docker/decision/bin/entrypoint#23
> Just: if [ "$(tc-vcs revision "$DEST")" != "$GECKO_HEAD_REV" ]; then exit 1; fi

Should tc-vcs do this validation itself and exit non-zero on a mismatch (instead of the decision task)?
We do this validation in hgtool.py and gittool.py.

>  4) tc-vcs should retry with exponential back-off

What does back-off mean here?
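For reference, exponential back-off usually means retrying a failed operation with a delay that doubles after every failure, so a struggling server is not hammered with immediate retries. A minimal sketch of the idea follows; `retry_with_backoff` is a hypothetical helper written for this illustration and is not something tc-vcs provides.

```shell
# Sketch only: retry_with_backoff is a hypothetical helper, not a tc-vcs feature.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while true; do
    # Run the command; stop retrying as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    sleep "$delay"
    # Double the wait before the next attempt.
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 5 2 hg pull` would make up to five attempts, waiting 2, 4, 8 and 16 seconds between them, and return non-zero if all five fail.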

[1]
22:39:23     INFO - Copy/paste: hgtool.py -r RELEASE_AUTOMATION https://hg.mozilla.org/build/compare-locales /home/worker/workspace/B2G/compare-locales
22:39:23     INFO - Using env: {'PATH': '/home/worker/workspace/gecko/testing/taskcluster/scripts/builder:/tools/gcc-4.7.3-0moz1/bin:/tools/tools/buildfarm/utils:/tools/python27-mercurial/bin:/tools/python27/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/worker/bin/'}
22:39:23     INFO -  Traceback (most recent call last):
22:39:23     INFO -    File "/tools/tools/buildfarm/utils/hgtool.py", line 14, in <module>
22:39:23     INFO -      from util.hg import mercurial, out, remove_path
22:39:23     INFO -    File "/tools/python27/lib/python2.7/site-packages/buildtools-1.0.6-py2.7.egg/util/hg.py", line 11, in <module>
22:39:23     INFO -      from util.retry import retry, retrier
22:39:23     INFO -    File "/tools/python27/lib/python2.7/site-packages/buildtools-1.0.6-py2.7.egg/util/retry.py", line 6, in <module>
22:39:23     INFO -      from redo import retry, retriable, retrying, retrier
22:39:23     INFO -  ImportError: No module named redo
22:39:23    ERROR - Return code: 1
Assignee: jlal → nobody
Flags: needinfo?(armenzg)
In order to take advantage of hgtool.py living in the tree, bug 1201171 would also need to be fixed.

Unlike the hgtool.py from tools/, which has various dependencies, this is a single file which does the same job:
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/external_tools/hgtool.py
Whiteboard: [tc-vcs]
Well, we're using `hg robustcheckout` now instead of tc-vcs or hgtool :)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID