Closed Bug 433392 Opened 17 years ago Closed 17 years ago

tinderboxes on mozilla-central doing build-on-landing don't cycle enough

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: dbaron, Unassigned)

I think a number of machines on mozilla-central (at least the unit test boxes) are currently doing build-on-push (which is what we should call build-on-checkin for the hg world). However, if I compare the pushlog at http://hg.mozilla.org/mozilla-central/index.cgi/pushlog to the output on http://tinderbox.mozilla.org/showbuilds.cgi?tree=Mozilla2 I'm not seeing any evidence that builds are actually starting when pushes occur.
In order to get mozilla-central open we need to do ONE of the following:
(a) make build-on-push work
(b) significantly increase the frequency of the builds that happen when no pushes happen (esp. for unit test boxes)
This bug is on doing either one; if we choose (b) for now we should get a separate bug filed for (a).
It's a little hard for me to analyze this given that I'm not sure which machines are *supposed* to be doing build-on-push right now, though. But I think at least the unit test boxes are, and maybe others. I'm not even 100% sure that my analysis is correct since pushes have been pretty infrequent.
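The comparison described above (pushlog entries against build start times) could be sketched roughly as follows. This is only an illustration: the function name, the 15-minute window, and the assumption that both pages have already been reduced to lists of datetimes are hypothetical and not taken from the pushlog or tinderbox tools.

```python
from datetime import datetime, timedelta


def pushes_without_builds(push_times, build_start_times,
                          window=timedelta(minutes=15)):
    """Return the pushes that no build started within `window` after.

    push_times and build_start_times are lists of datetime objects,
    assumed to have been extracted by hand or by some other tool.
    The 15-minute default window is an arbitrary illustrative value.
    """
    starts = sorted(build_start_times)
    missed = []
    for push in sorted(push_times):
        # A push counts as "picked up" if any build began at or after it
        # and no later than push + window.
        if not any(push <= start <= push + window for start in starts):
            missed.append(push)
    return missed
```

A non-empty result would suggest that some pushes did not trigger a build, although with infrequent pushes the sample may be too small to conclude much.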
I saw the builds and debug builds kicking off properly before I left, after bug 421175 landed. I didn't check the unit-test stuff: it's possible the unit-test master did not get the new HgPoller.
Depends on: mustard
AFAIK the unit test master does not have the new HgPoller yet.
I'm trying to confirm or deny this but the unittest master appears to be offline. Talking with Ben Hearsum, it looks like the patched HgPoller should do the trick, so when I can find the machine, I'll make that change.
John O'Duinn pointed out to me that my term "build-on-push" could be confusing given other meanings of push; his suggestion of "build-on-landing" seems VCS-neutral and probably less confusing.
Summary: tinderboxes on mozilla-central doing build-on-push don't cycle enough → tinderboxes on mozilla-central doing build-on-landing don't cycle enough
Build-on-landing is working on mozilla-central+actionmonkey. For dep/nightly/debug we currently build every 2 hours + on-landing. Now that build-on-landing is working we should lengthen that periodic interval to 6 or 8 hours IMHO. This will result in fewer Talos cycles, since Talos is unable to test the same build more than once. Alice can speak more to that.
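To make the trade-off concrete, here is a minimal sketch of the "periodic + on-landing" rule described above, plus the arithmetic for how many builds an idle stretch produces. The function names and the 48-hour example are hypothetical; this is not the actual buildbot scheduler configuration.

```python
from datetime import datetime, timedelta


def should_build(now, last_build, pending_landings, periodic_interval):
    """Decide whether to start a build, and why.

    Returns "landing" if there are unbuilt landings, "periodic" if the
    periodic interval has elapsed since the last build, otherwise None.
    """
    if pending_landings > 0:
        return "landing"
    if now - last_build >= periodic_interval:
        return "periodic"
    return None


def idle_builds(idle_span, periodic_interval):
    """Number of periodic builds produced during a stretch with no landings."""
    return idle_span // periodic_interval
```

Under these assumptions, a 48-hour stretch with no landings yields `idle_builds(timedelta(hours=48), timedelta(hours=2))` = 24 builds at the current 2-hour interval, but only 8 at a 6-hour interval, which is the reduction in Talos data discussed in the comments that follow.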
From the talos side of the world:
- we don't have a way to trigger re-testing a build if a machine is idle
- the graph server keys on time stamps for talos results, this will soon be build time (it is currently test time see bug 419487). We would need to have the graph server be able to display a re-tested result in a reasonable manner (ie, stack the dot on top of the previous result or some such)
- right now if we send new data associated with a time stamp that has already been used the graph server gets very, very confused and data gets destroyed.
If we move to test on landing we can expect to see a lot fewer talos results until we have systems in place to re-test builds and reasonably store the results. If we are okay with having sparse results then we are good to go. If you want to have more testing then we'd have to block on the necessary talos work.
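One way to picture the storage change that re-testing would need: keep a list of results per build time stamp instead of a single value, so a second run of the same build is appended rather than colliding with the first. The sketch below is purely illustrative; the class and method names are hypothetical and it does not reflect the graph server's real schema.

```python
from collections import defaultdict


class ResultStore:
    """Toy store that tolerates several results for one build time stamp."""

    def __init__(self):
        # Maps a build time stamp to every result recorded for that build.
        self._by_build = defaultdict(list)

    def add(self, build_time, value):
        """Record a result; a re-test of the same build is appended."""
        self._by_build[build_time].append(value)

    def results(self, build_time):
        """Return a copy of all results for a build (empty list if none)."""
        return list(self._by_build.get(build_time, []))
```

With a structure like this, a re-tested build would simply show up as an extra point at the same time stamp, which matches the "stack the dot on top of the previous result" idea.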
I have two questions for everyone:
1) With build-on-landing now working, how do people feel about the number of cycles?
2) How do people feel about turning down the build-periodically time to once every 6 hours?
With build-on-landing working I think things are ok. I'm a little concerned about turning down the number of cycles because of the reduction in talos data.
Is that still going to be a concern when mozilla-central is opened to general check-in?
I think it will be -- I think we want the extra data overnight and on weekends when nobody is checking in -- it gives us better baselines. In any case, marking this bug fixed or worksforme at this point is fine with me.
I'm OK with that too. When Talos is able to re-run the same build multiple times I'd like to revisit the periodical builds though.
ok, marking worksforme then.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WORKSFORME
Product: mozilla.org → Release Engineering