Closed Bug 590329 Opened 9 years ago Closed 5 years ago

fail early if the clobberer is broken

Categories

(Release Engineering :: General, defect, P3)

x86
macOS

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jhford, Unassigned)

References

Details

(Whiteboard: [clobberer][automation][simple])

During a staging release run, I hit an issue where a slave wasn't able to reach the clobberer. Instead of failing when it couldn't reach the clobberer, the slave happily kept going and failed out during a later operation. Had the slave had the right combination of files and steps, old bits could have been used in a new build with minimal warning.

One idea to fix this is a new MozillaBuildFactory argument, 'mandatory_clobber', that defaults to false. For release builders, this would set haltOnFailure to true, ensuring that either the clobberer can be reached and provides valid information, or the build fails out, because failing (nearly) silently is bad.
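A rough sketch of how such a flag could behave, written as plain Python rather than against the real buildbot or MozillaBuildFactory code. The helper names clobber_step_options and run_steps are hypothetical and exist only for illustration; only the 'mandatory_clobber' argument, haltOnFailure, and the step names come from this report.

```python
def clobber_step_options(mandatory_clobber=False):
    """Build the options for the clobber-check step (hypothetical helper).

    When mandatory_clobber is true, a failed clobberer check halts the build.
    """
    return {
        "name": "checking_clobber_times",
        "haltOnFailure": bool(mandatory_clobber),
    }


def run_steps(steps):
    """Toy step runner, not buildbot's: run (options, action) pairs in order.

    Each action returns True on success. A failing step with haltOnFailure
    set stops the run. Returns the names of the steps that were executed.
    """
    executed = []
    for options, action in steps:
        executed.append(options["name"])
        succeeded = action()
        if not succeeded and options.get("haltOnFailure"):
            break
    return executed
```

With mandatory_clobber left at its default, a failed clobberer check is followed by the next step (hg_clone in the run described here); with it set to true, the run stops at checking_clobber_times.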

Currently, if the clobberer can't be reached, the only indication of this is the word 'failed' in the 'descriptionDone' of the following step:

9. checking_clobber_times [checking clobber times failed]
    1. stdio

The header for the build page showed only the following hg_clone error.

Results:
failed hg_clone



====== misc reference information about this specific issue ======
The STDIO log contained

Checking clobber URL: http://build.mozilla.org/stage-clobberer/index.php?master=http%3A%2F%2Fstaging-master.build.mozilla.org%3A8011%2F&slave=mv-moz2-linux-ix-slave01&builddir=tag&branch=nothing&buildername=tag
Error contacting server

Wget returned

[cltbld@mv-moz2-linux-ix-slave01 ~]$ wget 'http://build.mozilla.org/stage-clobberer/index.php?master=http%3A%2F%2Fstaging-master.build.mozilla.org%3A8011%2F&slave=mv-moz2-linux-ix-slave01&builddir=tag&branch=nothing&buildername=tag'
--22:56:31--  http://build.mozilla.org/stage-clobberer/index.php?master=http%3A%2F%2Fstaging-master.build.mozilla.org%3A8011%2F&slave=mv-moz2-linux-ix-slave01&builddir=tag&branch=nothing&buildername=tag
Resolving build.mozilla.org... 10.2.74.128
Connecting to build.mozilla.org|10.2.74.128|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://build.mozilla.org/stage-clobberer/index.php?master=http%253A%252F%252Fstaging-master.build.mozilla.org%253A8011%252F&slave=mv-moz2-linux-ix-slave01&builddir=tag&branch=nothing&buildername=tag [following]
--22:56:31--  https://build.mozilla.org/stage-clobberer/index.php?master=http%253A%252F%252Fstaging-master.build.mozilla.org%253A8011%252F&slave=mv-moz2-linux-ix-slave01&builddir=tag&branch=nothing&buildername=tag
Connecting to build.mozilla.org|10.2.74.128|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
22:56:31 ERROR 403: Forbidden.
How about an option to default to clobber=True if the clobberer isn't reachable?
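A minimal sketch of that fallback, assuming the check is a plain HTTP GET. The function name should_clobber, the injectable fetch parameter, and the rule that a non-empty response means "clobber requested" are all assumptions made for illustration; the real clobberer response format is not shown in this bug.

```python
import urllib.error
import urllib.request


def should_clobber(url, fetch=None, timeout=30):
    """Return True if the build directory should be clobbered.

    If the clobberer cannot be reached (connection error, HTTP error such
    as 403, timeout), fall back to clobbering instead of carrying on.
    """
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=timeout) as response:
                return response.read().decode("utf-8", "replace")

    try:
        body = fetch(url)
    except (urllib.error.URLError, OSError):
        # Unreachable clobberer: assume the worst and clobber.
        return True

    # Assumption: an empty response means no clobber is required.
    return bool(body.strip())
```

The trade-off is slower builds whenever the clobberer is down, in exchange for never reusing old bits because of a silent failure.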
Priority: -- → P5
Duplicate of this bug: 650291
This is quite important; raising the priority, as it is easy to hit.
Blocks: 627271
Priority: P5 → P3
What makes this easy to hit? I've never seen this in the real world, personally.
Mass move of bugs to Release Automation component.
Component: Release Engineering → Release Engineering: Automation (Release Automation)
No longer blocks: hg-automation
This doesn't matter for releases anymore, because we always clobber. I do think it's important to bail or fail early for non-release builds if we're being told to clobber the currently running build.
Component: Release Engineering: Automation (Release Automation) → Release Engineering: Automation (General)
Summary: fail early if the clobberer is broken (at least for releases) → fail early if the clobberer is broken
Product: mozilla.org → Release Engineering
Not a problem for releases; hasn't been an issue for non-release builds that I know of.

We're also going to be clobbering as part of pre-flight checks soon.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
Component: General Automation → General