Closed
Bug 398460
Opened 17 years ago
Closed 16 years ago
Intermittent slave failures on qm-pxp0*
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rcampbell, Unassigned)
On the qm-pxp0n "blades", a Talos run occasionally turns red because of download or zip file failures. Download failures could probably be fixed by re-polling the download site a few times. Zip failures are not really "fixable" and should probably continue to fail.
Comment 1•17 years ago
If you are being served corrupted zips, is there someone in build who could look at it?
Comment 2•17 years ago (Reporter)
Yes, absolutely. And they should!
Comment 3•17 years ago
I've been watching for these lately and I'm 99% sure the packages get corrupted during the transfer. On a few occasions I've manually tested a package that the Talos machine had trouble with and each time I had no issues with it.
Comment 4•17 years ago
So, we could try something simple like adding a step to test the zip and looping to re-download, with some failsafe to break out if it looks like there really is something wrong with the copy on the server. unzip comes with a test feature: you can check whether a zip is okay with unzip -tq. I'd assume that other unzippers have similar options for us to work with.
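A rough sketch of that test-and-retry loop, in Python rather than shell. It uses the standard zipfile module in place of unzip -tq, and the fetch callable, the function names and the default of three attempts are all made up for illustration; this is not what Talos actually does.

```python
import io
import zipfile
import zlib


def zip_is_intact(data: bytes) -> bool:
    """Return True if every member of the archive passes its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # testzip() returns the name of the first bad member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error):
        # Truncated or garbled downloads usually end up here.
        return False


def fetch_verified(fetch, max_attempts: int = 3) -> bytes:
    """Call fetch() until it returns an intact zip, up to max_attempts times.

    fetch is a hypothetical zero-argument callable that returns the raw
    bytes of the build package (for example a thin download wrapper).
    """
    for _ in range(max_attempts):
        data = fetch()
        if zip_is_intact(data):
            return data
    # Failsafe: repeated failures suggest the copy on the server is bad.
    raise RuntimeError(f"zip still corrupt after {max_attempts} attempts")
```

If the package is still bad after the last attempt, the exception makes the run fail, which matches the idea that a genuinely broken zip on the server should stay red.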
Comment 5•17 years ago
It's going to be a bit more complicated than that: Buildbot doesn't have a "looping" concept for BuildSteps.
We may be able to string some shell commands together to accomplish this, something like "wget ... && unzip -tq firefox.zip".
Another idea that popped into my head is simply doing a TinderboxPrint when unpacking the zip file fails. Showing "bad build" on the main page may mitigate the red tree.
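The TinderboxPrint idea could look something like the sketch below: one small script that a single build step runs against the downloaded package. The function name and the exact wording of the printed line are assumptions, as is the premise that the log scraper picks up lines starting with "TinderboxPrint:".

```python
import zipfile
import zlib


def check_package(path: str) -> int:
    """Return 0 if the zip at path tests clean, 1 otherwise."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the first bad member name, or None.
            bad_member = archive.testzip()
    except (OSError, zipfile.BadZipFile, zlib.error):
        # Missing, truncated or unreadable file: treat the whole package as bad.
        bad_member = path
    if bad_member is not None:
        # Assumed convention: the prefix makes the text show up on the main page.
        print("TinderboxPrint:bad build")
        return 1
    return 0
```

A step could pass the return value to sys.exit() so that the step still fails, while the waterfall shows why.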
Comment 6•17 years ago
Can't Buildbot have a BuildStep that is a script that can loop? It's got to be running the Talos code somehow...
Comment 7•17 years ago
Yeah, I think that's what I was trying to say in my second paragraph (but did a poor job of it).
Comment 8•16 years ago
This was resolved by having Talos pull build zips from dated directories. The redness was caused by Talos attempting to download a build while a build machine was dropping a new build with the same name into the same directory. Putting builds into unique, dated directories means that we no longer get any collisions.
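For illustration only, here is a minimal sketch of how a dated directory keeps two builds with the same file name apart. The function name, the base path and the timestamp format are all hypothetical; the actual directory layout is not recorded in this bug.

```python
from datetime import datetime
import posixpath


def dated_build_dir(base_dir: str, built_at: datetime) -> str:
    """Return a per-build directory named after the build time."""
    # Minute resolution is an assumption; it only needs to be unique per build.
    stamp = built_at.strftime("%Y-%m-%d-%H-%M")
    return posixpath.join(base_dir, stamp)


# Two builds that both produce "firefox.zip" now land in different places,
# so a download in progress is never overwritten by the next build.
older = posixpath.join(
    dated_build_dir("builds/trunk", datetime(2008, 5, 14, 3, 27)), "firefox.zip"
)
newer = posixpath.join(
    dated_build_dir("builds/trunk", datetime(2008, 5, 14, 4, 2)), "firefox.zip"
)
```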
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 9•16 years ago
(In reply to comment #8)
> This was resolved with having talos pull build zips from dated directories.
> The redness was due to talos attempting to download a build while a given build
> machine was dropping a new build with the same name in the same directory.
> Having builds go into unique, dated directories means that we no longer get any
> collisions.
...and we really like that! :-D
Comment 10•16 years ago
Mass move of Core:Testing bugs to mozilla.org:ReleaseEngineering. Filter on RelEngMassMove to ignore.
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Updated•11 years ago
Product: mozilla.org → Release Engineering