Bug 421917: Talos changes for improvements to staging setup
Status: RESOLVED FIXED
Opened 17 years ago, closed 17 years ago
Categories: Release Engineering :: General, defect, P2
Tracking: Not tracked
People: Reporter: nthomas, Assigned: nthomas
Attachments: 2 files, 2 obsolete files
2.75 KB, patch | anodelman: review+
4.48 KB, patch | rhelmer: review+
As part of bug 419978, we've been planning to improve the virus scanning for everything that goes onto ftp.m.o. Currently new files are scanned after they are published, and there is no delay between a tinderbox pushing a build and it being downloadable. The new system will go one better and scan them before they become available, so there will be a short delay. The directories used by Talos have a special fast-path to a scanner, so that time-critical bits don't (eg) get stuck behind a huge pile of l10n hourlies. We're testing this now, but the time-to-scan-and-publish should be less than 10 minutes.
The other change is that http://stage.m.o will no longer be accessible, and http://ftp.m.o should be used instead.
This attachment is a stab at the changes needed. There's currently no difference between ftp.m.o and stage.m.o, so I think this should be safe to use now.
Comment 1 • 17 years ago (Assignee)
Alternatively, the existing staging server will be available at stage-old.m.o (once bug 421915 is fixed), so this is an option to preserve the status quo while any bugs shake out.
Comment 2 • 17 years ago (Assignee)
Rob, Alice,
Sorry for the late notice, the full impact on Talos only occurred to me today. We had Thursday pencilled in for this change, but need your feedback on how plausible this is.
20 questions time: Did I put enough information into comment #0? How much testing/prep would make you comfortable with this change? Are the patches any good?
Updated • 17 years ago (Assignee)
Comment 3 • 17 years ago
From my understanding of comment #0:
- talos exclusively uses stage.m.o links to download builds, we'd need that to switch over to ftp.m.o
- the builds will be published as available but we won't be able to download them
So, for the switch to ftp.m.o I'd need a time line of when that should be done and then we can get that fixed. I'm more concerned about tinderbox publishing builds as complete/successful and them not being available for download. We've already had some problems in the past with talos attempting to download builds that aren't there and I'd like to avoid that in the future.
Can I get a better idea of how the talos buildmaster could realize that a build is completed but not yet available? Ideally, we have some in between state where a build was completed/unavailable and then switched to completed/available - or we'd simply include the virus scan as part of the build and consider it incomplete without it.
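To make the in-between state concrete, here is a rough sketch of the states a build could be in from the buildmaster's point of view. The names (`BuildState`, `classify_build`, `should_start_talos_run`) are invented for illustration, and how the two inputs would actually be obtained from tinderbox and the download server is left open.

```python
from enum import Enum


class BuildState(Enum):
    """Hypothetical build states as seen by the Talos buildmaster."""
    IN_PROGRESS = "in progress"
    COMPLETED_UNAVAILABLE = "completed/unavailable"  # built, not yet scanned and published
    COMPLETED_AVAILABLE = "completed/available"      # built and downloadable


def classify_build(build_finished: bool, download_available: bool) -> BuildState:
    """Map the two observations onto one of the three states."""
    if not build_finished:
        return BuildState.IN_PROGRESS
    if not download_available:
        return BuildState.COMPLETED_UNAVAILABLE
    return BuildState.COMPLETED_AVAILABLE


def should_start_talos_run(state: BuildState) -> bool:
    """Only start a test run once the build can actually be downloaded."""
    return state is BuildState.COMPLETED_AVAILABLE
```

With this split, a build that tinderbox reports as finished but that is still waiting on the scanner would sit in `COMPLETED_UNAVAILABLE` and would not trigger a download attempt.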
Comment 4 • 17 years ago
Sorry, I just looked over the patches and that would be enough to switch from stage.m.o to ftp.m.o/stage-old.m.o. But, I'm still more concerned about talos seeing a build as available and then failing on attempting to download it.
Comment 5 • 17 years ago (Assignee)
I'm not sure how easy this will be to do in buildbot, but here goes.
From the looks of tinderboxpoller.py, it's currently pulling quickparse.txt and knows there is something to test when the timestamp changes. With all the hourlies going into one directory, we could test the timestamp on the file. If it's later than the stamp from tinderbox (build start time) then the build has been scanned and published, and the Change should be fired. Otherwise, don't update the saved state and check again on the next poll. Once we keep 24 hours' worth of hourlies (bug 291167) then there's a unique dir & file to test for (using the build start time).
I guess there is a danger when Talos's 10 min poll interval + the scan lag is more than a tinderbox cycle time (eg 15 mins on Fx/Trunk/Linux), so that start times and builds get scrambled. Perhaps we can mitigate that by polling more frequently, and/or making sure the scan fast-path doesn't wait long between its checks for new builds. Probably also good to use the uncached quickparse at (eg)
http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox&quickparse=1
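The check described above might look roughly like the following. This is only a sketch of the decision logic, not the actual tinderboxpoller.py code: `PollState`, `should_fire_change` and `poll_can_fall_behind` are made-up names, timestamps are assumed to be Unix seconds, and fetching quickparse.txt and the file's timestamp from the server is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PollState:
    """Saved poller state (hypothetical; not the real tinderboxpoller.py state)."""
    last_fired_build_start: Optional[int] = None


def should_fire_change(build_start: int,
                       published_mtime: Optional[int],
                       state: PollState) -> bool:
    """Return True when a Change should be fired for this build.

    build_start     -- build start time reported by tinderbox (Unix seconds)
    published_mtime -- timestamp of the build file on the download server,
                       or None if the file could not be found (Unix seconds)
    """
    if build_start == state.last_fired_build_start:
        # Already fired for this build; nothing new to test.
        return False
    if published_mtime is None or published_mtime <= build_start:
        # The file on the server is still an older build, so the new one has
        # not been scanned and published yet. Leave the saved state alone and
        # check again on the next poll.
        return False
    state.last_fired_build_start = build_start
    return True


def poll_can_fall_behind(poll_interval: int, scan_lag: int, cycle_time: int) -> bool:
    """True when poll interval plus scan lag exceeds one tinderbox cycle."""
    return poll_interval + scan_lag > cycle_time
```

With a 10 minute poll, a scan lag close to 10 minutes and a 15 minute cycle, `poll_can_fall_behind(600, 600, 900)` is true, which is the case where start times and builds could get scrambled.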
Comment 6 • 17 years ago (Assignee)
If we need time to get something like comment #5 into place, then the fallback plan is to have Talos builds not go thru the scanner yet. Then we'd only need to switch to ftp.m.o to keep Talos going.
Comment 7 • 17 years ago
It does sound like there would need to be some work on tinderboxpoller.py, along with testing/baking time to ensure that it was working correctly. We could split that into another bug and go with switching to ftp.m.o if you want to move this ahead quickly.
Updated • 17 years ago (Assignee)
Attachment #308425 - Flags: review?(anodelman)
Comment 9 • 17 years ago (Assignee)
Switch the 6 machines doing 1.8branch and trunk nightlies/hourlies to push to stage-old.m.o until we can teach Talos how to cope with the scan lag.
Attachment #308427 - Attachment is obsolete: true
Attachment #309164 - Flags: review?(rhelmer)
Comment 10 • 17 years ago (Assignee)
Farmed the talos changes out to bug 422725.
Comment 11 • 17 years ago (Assignee)
Comment on attachment 309164
Firefox Trunk & Moz1.8 should not go thru the scanner for now
Oops, these should be done in the bootstrap config.
Attachment #309164 - Attachment is obsolete: true
Attachment #309164 - Flags: review?(rhelmer)
Comment 12 • 17 years ago (Assignee)
Bootstrap changes for 1.8 branch (and 1.9 for completeness), and tinder-config.pl for trunk.
Attachment #309179 - Flags: review?(rhelmer)
Updated • 17 years ago
Attachment #309179 - Flags: review?(rhelmer) → review+
Updated • 17 years ago
Attachment #308425 - Flags: review?(anodelman) → review+
Comment 13 • 17 years ago (Assignee)
Comment on attachment 309179
[checked in] Firefox Trunk & Moz1.8 should not go thru the scanner for now - v2
Checking in release-auto-nightly/fx-moz18-nightly-bootstrap.cfg;
/cvsroot/mozilla/tools/release/configs/fx-moz18-nightly-bootstrap.cfg,v <-- fx-moz18-nightly-bootstrap.cfg
new revision: 1.12; previous revision: 1.11
done
Checking in release-auto-nightly/fx-moz19-nightly-bootstrap.cfg;
/cvsroot/mozilla/tools/release/configs/fx-moz19-nightly-bootstrap.cfg,v <-- fx-moz19-nightly-bootstrap.cfg
new revision: 1.9; previous revision: 1.8
done
Checking in trunk/firefox/linux/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/linux/tinder-config.pl,v <-- tinder-config.pl
new revision: 1.24; previous revision: 1.23
done
Checking in trunk/firefox/macosx/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/macosx/tinder-config.pl,v <-- tinder-config.pl
new revision: 1.40; previous revision: 1.39
done
Checking in trunk/firefox/win32/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/win32/tinder-config.pl,v <-- tinder-config.pl
new revision: 1.31; previous revision: 1.30
done
Attachment #309179 - Attachment description: Firefox Trunk & Moz1.8 should not go thru the scanner for now - v2 → [checked in] Firefox Trunk & Moz1.8 should not go thru the scanner for now - v2
Comment 14 • 17 years ago
Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v <-- master.cfg
new revision: 1.46; previous revision: 1.45
done
Updated • 17 years ago
Attachment #308425 - Attachment description: Use ftp.m.o → [checked in] Use ftp.m.o
Updated • 17 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 15 • 17 years ago (Assignee)
A "buildbot reconfig" worked fine this time.
Updated • 12 years ago
Product: mozilla.org → Release Engineering