Closed
Bug 1429410
Opened 7 years ago
Closed 7 years ago
masterball needs to start with changemaster's nextNumber and Builder's nextBuildNumber being high enough
Categories
(Localization Infrastructure and Tools :: Automation, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Pike, Assigned: Pike)
In the AWS world, we might lose changes.pck. buildbot doesn't care much, but it resets the changemaster, including setting nextNumber to 1, which will conflict with the mbdb database.
So on startup, the changesource should check that self.parent.nextNumber is greater than the highest change number in the db.
changes.pck is the only file on the local drive that's persisting state. The builder dirs will be on persistent storage.
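The startup check described above could look roughly like the following. This is only a sketch under assumptions: `FakeChangeMaster` is a stand-in that models nothing but `nextNumber`, and the `changes` table with its `number` column is a hypothetical schema, not the actual mbdb layout.

```python
import sqlite3


class FakeChangeMaster:
    """Stand-in for buildbot's ChangeMaster; only models nextNumber."""

    def __init__(self, next_number=1):
        self.nextNumber = next_number


def ensure_next_number(master, conn):
    """Raise master.nextNumber above the highest number stored in the db."""
    # MAX() yields NULL (None in Python) for an empty table; treat that as 0.
    (max_number,) = conn.execute("SELECT MAX(number) FROM changes").fetchone()
    floor = (max_number or 0) + 1
    if master.nextNumber < floor:
        master.nextNumber = floor
    return master.nextNumber


# Hypothetical schema: one row per change already known to the db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changes (number INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO changes (number) VALUES (?)", [(1,), (2,), (41,)])

# A master that lost changes.pck starts over at 1.
master = FakeChangeMaster()
ensure_next_number(master, conn)
```

A master whose counter is already ahead of the db is left alone, so the check is safe to run when nothing was lost.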
Updated•7 years ago
Assignee: nobody → l10n
Comment 1•7 years ago
Commit pushed to develop at https://github.com/mozilla/elmo
https://github.com/mozilla/elmo/commit/1f2520ed396fdc3c23e50c3690af261c2d0721ed
bug 1429410, add fake ChangeSource to ensure ChangeMaster.nextNumber matches mbdb, r=me
When losing the working directory of the master, we lose changes.pck,
and with that the nextNumber of the ChangeMaster.
Sadly, we store that in the mbdb, too, so in that case, go back
to the db and set the nextNumber to the highest number + 1.
Comment 2•7 years ago
Tweaking the summary to include Builders, too.
Summary: masterball needs to start with changemaster's nextNumber being high enough → masterball needs to start with changemaster's nextNumber and Builder's nextBuildNumber being high enough
Comment 3•7 years ago
Commits pushed to develop at https://github.com/mozilla/elmo
https://github.com/mozilla/elmo/commit/4ad552f9e3f3773dbc3f71966ffa33ef643eb9b0
bug 1429410, follow-up, run ChangeMaster.nextNumber reset always
We actually need to sync the nextNumber on each startup.
There are two reasons for losing data:
- mbdb loses obsolete Changes, freeing up entries in the db.
- buildbot loses the last change number due to unclean shutdown.
The first case, mbdb losing data, we just ignore. IDs are cheap.
In the latter case, buildbot would try to use the numbers in our
db again, causing all kinds of trouble.
In that case, we set the nextNumber. We also clear out the existing
changes that ChangeMaster synced back, as some parts of buildbot
code assume that the changes are stored consecutively.
That's harmless, and our changeHorizon is 1 anyway by now, so
we don't have much data anymore in ChangeMaster.
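The two cases in this commit message can be illustrated with a small sketch. It is not the code from the commit: `FakeChangeMaster` and `sync_on_startup` are hypothetical names, and the stand-in only models a `nextNumber` counter and an in-memory `changes` list.

```python
class FakeChangeMaster:
    """Stand-in for buildbot's ChangeMaster: a counter plus cached changes."""

    def __init__(self, next_number, changes):
        self.nextNumber = next_number
        self.changes = list(changes)


def sync_on_startup(master, max_db_number):
    """Run on every startup; return True if the master had to be reset."""
    if master.nextNumber > max_db_number:
        # Either in sync, or the db dropped obsolete changes.
        # Nothing to do: unused IDs are cheap.
        return False
    # The master forgot numbers the db already uses, so move past them.
    master.nextNumber = max_db_number + 1
    # Drop cached changes so the remaining list has no gaps in numbering.
    master.changes = []
    return True


# Case 1: the db lost obsolete changes, the master's counter is ahead.
ahead = FakeChangeMaster(50, ["change 49"])
sync_on_startup(ahead, 41)

# Case 2: unclean shutdown, the master's counter is behind the db.
behind = FakeChangeMaster(1, ["stale change"])
sync_on_startup(behind, 41)
```

Only the second case modifies the master, which matches the "ignore the first case" reasoning in the message.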
https://github.com/mozilla/elmo/commit/968ced7d17d9ba4e54160983e46f5b470d0a0a6f
bug 1429410, use Build.objects to ensure the nextBuildNumber is high enough
In unclean shutdowns or build cleanup scripts, we might lose
build data.
Let's ensure that the BuilderStatus.nextBuildNumber is higher
than the current max build number in the db.
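For the builder side, a comparable sketch follows. The actual commit queries `Build.objects`; to stay self-contained, this example assumes the per-builder maximum build numbers have already been collected into a plain dict. `FakeBuilderStatus` and `ensure_next_build_numbers` are hypothetical names.

```python
class FakeBuilderStatus:
    """Stand-in for buildbot's BuilderStatus; only models nextBuildNumber."""

    def __init__(self, name, next_build_number=0):
        self.name = name
        self.nextBuildNumber = next_build_number


def ensure_next_build_numbers(builders, max_by_builder):
    """Bump each builder's nextBuildNumber past the db's max for that builder."""
    for builder in builders:
        max_number = max_by_builder.get(builder.name)
        if max_number is None:
            # No builds recorded for this builder; leave it untouched.
            continue
        if builder.nextBuildNumber <= max_number:
            builder.nextBuildNumber = max_number + 1


builders = [
    FakeBuilderStatus("a"),      # lost its state, db knows builds up to 7
    FakeBuilderStatus("b", 10),  # already ahead of the db
    FakeBuilderStatus("c"),      # no builds in the db at all
]
ensure_next_build_numbers(builders, {"a": 7, "b": 3})
```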
Comment 4•7 years ago
This is fixed and deployed. https://github.com/Pike/master-ball/commit/6e7c2c41095f420cf9c395da0ffab046edef3ff3 took the elmo changes into master-ball.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED