Talos should halt on download or unzip failure

RESOLVED FIXED

Status

Release Engineering
General
P3
normal
RESOLVED FIXED
8 years ago
5 years ago

People

(Reporter: philor, Assigned: edransch)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [talos][automation])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

8 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274399232.1274399865.15943.gz
Rev3 WINNT 6.1 mozilla-central talos on 2010/05/20 16:47:12  

  inflating: firefox/xul.dll          bad CRC ff3b34a5  (should be f3a57fd9)
program finished with exit code 2
...
Running test tdhtml: 
NOISE: __FAILbrowser non-zero return code (-1073741515)__FAIL

The xpcshell test run at http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274399212.1274399358.13547.gz got the same busted zip, but it sensibly gave up when unzipping failed rather than trying to run a partial, broken browser. No good can come from running part of a browser: at best the run goes red, and at worst (if it's possible to unzip only part of the browser and still run it) the perf numbers would be unreliable. Talos should bail when the unzip fails.
Created attachment 446722 [details] [diff] [review]
untested patch

This will force the step to halt the build if there is an error unpacking the file.  The build will still run the reboot step because it has alwaysRun=True.
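To illustrate the intended behaviour, here is a minimal stand-alone simulation. It does not use Buildbot itself; the Step class and run_build function are hypothetical stand-ins, and only the haltOnFailure and alwaysRun names are taken from this bug. It is a sketch of the semantics under those assumptions, not the actual patch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """Hypothetical stand-in for a build step (not Buildbot's real class)."""
    name: str
    run: Callable[[], int]        # returns a process-style exit code
    haltOnFailure: bool = False   # a failure stops the remaining normal steps
    alwaysRun: bool = False       # still runs after the build has been halted


def run_build(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names of the steps that executed."""
    executed = []
    halted = False
    for step in steps:
        # Once halted, skip everything except steps marked alwaysRun.
        if halted and not step.alwaysRun:
            continue
        executed.append(step.name)
        if step.run() != 0 and step.haltOnFailure:
            halted = True
    return executed


steps = [
    Step("download build", lambda: 0, haltOnFailure=True),
    # Exit code 2 mirrors the failed unzip in the log quoted above.
    Step("unpack build", lambda: 2, haltOnFailure=True),
    Step("run talos", lambda: 0),
    Step("reboot", lambda: 0, alwaysRun=True),
]

print(run_build(steps))  # ['download build', 'unpack build', 'reboot']
```

With the unpack step failing, the Talos run is skipped but the reboot step still executes.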
If we fixed bug 557336, we could use the same base class for the setup and teardown of unit tests and Talos runs.  That seems like a good thing: both run in the same pool of slaves, and it would avoid code duplication.
Priority: -- → P3
Whiteboard: [talos][automation]
(Reporter)

Updated

7 years ago
Summary: Talos should flunk on unzip failure → Talos should halt on download or unzip failure
We should probably add haltOnFailure for most, if not all, of the DownloadFile and UnpackFile steps in TalosFactory and RuntimeTalosFactory. The only exception, I think, is for the download/unpack symbols. Because these aren't a _crucial_ part of the test process, I think it'd be better to continue on even if we fail to download or unpack them. We need to go through all the DownloadFile/UnpackFile steps starting from http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py#l7004, ending at http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py#l7723, and see which ones need this change applied.
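As a rough sketch of that rule (halt on every download/unpack failure except for symbols), the selection could be expressed as below. SETUP_STEPS and step_kwargs are hypothetical names used only for illustration; the real steps live in factory.py and are not reproduced here.

```python
# Hypothetical inventory of setup steps: (step class name, artifact handled).
SETUP_STEPS = [
    ("DownloadFile", "build"),
    ("UnpackFile", "build"),
    ("DownloadFile", "symbols"),
    ("UnpackFile", "symbols"),
]


def step_kwargs(step_class, artifact):
    """Return the extra keyword arguments a setup step would receive."""
    # Symbols are not crucial to the test run, so their failures do not halt.
    crucial = artifact != "symbols"
    if step_class in ("DownloadFile", "UnpackFile") and crucial:
        return {"haltOnFailure": True}
    return {}


for step_class, artifact in SETUP_STEPS:
    print(step_class, artifact, step_kwargs(step_class, artifact))
```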
(Assignee)

Comment 5

7 years ago
Created attachment 586171 [details] [diff] [review]
add haltOnFailure=True to Talos Factory

Added haltOnFailure=True to the UnpackFile and DownloadFile steps in TalosFactory (except for the download/unpack of symbols, as mentioned above).
Attachment #586171 - Flags: review?(catlee)
(Assignee)

Updated

7 years ago
Assignee: nobody → edransch
(Assignee)

Comment 6

7 years ago
Created attachment 586487 [details] [diff] [review]
Revised patch
Attachment #586171 - Attachment is obsolete: true
Attachment #586487 - Flags: review?(catlee)
Attachment #586171 - Flags: review?(catlee)
(Assignee)

Comment 7

7 years ago
Comment on attachment 586487 [details] [diff] [review]
Revised patch

Revised patch to correctly handle post-failure reboot.

Updated

7 years ago
Attachment #586487 - Flags: review?(catlee) → review+

Updated

7 years ago
Attachment #586487 - Flags: checked-in+
This made it to production yesterday. Yay!
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering