Closed Bug 1429410 Opened 7 years ago Closed 7 years ago

masterball needs to start with changemaster's nextNumber and Builder's nextBuildNumber being high enough

Categories

(Localization Infrastructure and Tools :: Automation, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

In the AWS world, we might lose changes.pck. buildbot doesn't care much, but it resets the changemaster, including setting nextNumber to 1, which will conflict with the mbdb database. So on startup, the changesource should check that self.parent.nextNumber is bigger than the max in the db. changes.pck is the only file on the local drive that persists state; the builder dirs will be on persistent storage.
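The startup check described above could look roughly like the following. This is a minimal sketch, not the code that landed: `FakeChangeMaster` and `ensure_next_number` are hypothetical names, the class only models the `nextNumber` attribute, and the change numbers from mbdb are passed in as a plain iterable instead of being queried from the database.

```python
class FakeChangeMaster:
    """Hypothetical stand-in for buildbot's ChangeMaster; only nextNumber is modelled."""

    def __init__(self, next_number=1):
        self.nextNumber = next_number


def ensure_next_number(change_master, db_change_numbers):
    """Raise change_master.nextNumber above every number already stored in the db.

    db_change_numbers is any iterable of change numbers read from mbdb
    (how they are queried is out of scope for this sketch).
    """
    highest = max(db_change_numbers, default=0)
    if change_master.nextNumber <= highest:
        # buildbot would hand out numbers the db already uses, so jump past them.
        change_master.nextNumber = highest + 1
    return change_master.nextNumber


# A master that lost changes.pck starts again at 1, while the db already holds 1..41.
master = FakeChangeMaster(next_number=1)
ensure_next_number(master, range(1, 42))
print(master.nextNumber)
```

In the real setup the check would run from a change source attached to the master, with `self.parent` playing the role of `change_master`.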
Assignee: nobody → l10n
Commit pushed to develop at https://github.com/mozilla/elmo

https://github.com/mozilla/elmo/commit/1f2520ed396fdc3c23e50c3690af261c2d0721ed
bug 1429410, add fake ChangeSource to ensure ChangeMaster.nextNumber matches mbdb, r=me

When losing the working directory of the master, we lose changes.pck, and with that the nextNumber of the ChangeMaster. Sadly, we store that in the mbdb, too, so in that case, go back to the db and set the nextNumber to the highest number + 1.
Tweaking the summary to include Builders, too.
Summary: masterball needs to start with changemaster's nextNumber being high enough → masterball needs to start with changemaster's nextNumber and Builder's nextBuildNumber being high enough
Commits pushed to develop at https://github.com/mozilla/elmo

https://github.com/mozilla/elmo/commit/4ad552f9e3f3773dbc3f71966ffa33ef643eb9b0
bug 1429410, follow-up, run ChangeMaster.nextNumber reset always

We actually need to sync the nextNumber on each startup. There are two reasons for losing data:
- mbdb loses obsolete Changes, freeing up entries in the db.
- buildbot loses the last change number due to an unclean shutdown.

The first case, mbdb losing data, we just ignore. IDs are cheap.
In the latter case, buildbot would try to use the numbers in our db again, causing all kinds of trouble. In that case, we set the nextNumber. We also clear out the existing changes that ChangeMaster synced back, as some parts of buildbot code assume that the changes are stored consecutively. That's harmless, and our changeHorizon is 1 anyway by now, so we don't have much data anymore in ChangeMaster.

https://github.com/mozilla/elmo/commit/968ced7d17d9ba4e54160983e46f5b470d0a0a6f
bug 1429410, use Build.objects to ensure the nextBuildNumber is high enough

In unclean shutdowns or build cleanup scripts, we might lose build data. Let's ensure that the BuilderStatus.nextBuildNumber is higher than the current max build number in the db.
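The two follow-up commits can be illustrated with a small sketch that runs on every startup. Everything here is an assumption made for illustration: `FakeChangeMaster`, `FakeBuilderStatus`, `sync_change_master` and `sync_builders` are hypothetical names, the db contents are passed in as plain Python collections instead of being read through `Build.objects`, build numbers are assumed to start at 0, and clearing the cached changes only when the number is actually bumped is one reading of the commit message.

```python
class FakeChangeMaster:
    """Hypothetical stand-in for ChangeMaster: a counter plus cached changes."""

    def __init__(self, next_number=1, changes=()):
        self.nextNumber = next_number
        self.changes = list(changes)


class FakeBuilderStatus:
    """Hypothetical stand-in for BuilderStatus: a name plus a build counter."""

    def __init__(self, name, next_build_number=0):
        self.name = name
        self.nextBuildNumber = next_build_number


def sync_change_master(change_master, db_change_numbers):
    """Run on each startup; return True if nextNumber had to be bumped."""
    highest = max(db_change_numbers, default=0)
    if change_master.nextNumber > highest:
        # Either nothing was lost, or mbdb dropped obsolete changes.
        # Both are ignored: IDs are cheap.
        return False
    # buildbot lost its counter and would reuse numbers that are in the db.
    change_master.nextNumber = highest + 1
    # Drop cached changes, since some code expects them to be consecutive.
    change_master.changes.clear()
    return True


def sync_builders(builder_statuses, db_build_numbers):
    """Make each builder's nextBuildNumber higher than its max build in the db.

    db_build_numbers maps a builder name to the build numbers stored for it.
    """
    for status in builder_statuses:
        highest = max(db_build_numbers.get(status.name, ()), default=-1)
        if status.nextBuildNumber <= highest:
            status.nextBuildNumber = highest + 1


# Unclean shutdown: the counters fell behind what the db already recorded.
master = FakeChangeMaster(next_number=5, changes=["c3", "c4"])
builders = [FakeBuilderStatus("linux", 2), FakeBuilderStatus("mac", 9)]
sync_change_master(master, [8, 9, 10])
sync_builders(builders, {"linux": [0, 1, 2, 3], "mac": [4, 5]})
print(master.nextNumber, [b.nextBuildNumber for b in builders])
```

Here the `mac` builder is left alone because its counter is already ahead of the db, which mirrors the "IDs are cheap" reasoning for changes.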
This is fixed and deployed; https://github.com/Pike/master-ball/commit/6e7c2c41095f420cf9c395da0ffab046edef3ff3 took the elmo changes into master-ball.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED