Closed
Bug 1132939
Opened 9 years ago
Closed 9 years ago
buildbot-master67's time gets reset when it reboots
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: jlund)
Details
Feb 9 13:29:10 buildbot-master67.bb.releng.use1.mozilla.com ntpd[12710]: 0.0.0.0 c61c 0c clock_step -17999.988448 s
Feb 12 09:50:26 buildbot-master67.bb.releng.use1.mozilla.com ntpdate[1102]: step time server 10.26.75.40 offset -28799.497552 sec

That's not so good! I suspect that this is because the underlying host hardware's timezone is misconfigured -- an AWS bug. Should we
(a) report to AWS?
(b) terminate and re-create this master in hopes of finding a better instance?
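Both offsets in the two log lines above are within about a second of a whole number of hours, which fits a timezone-style misconfiguration better than ordinary clock drift. A minimal Python sketch of that arithmetic, assuming log lines shaped like the ones quoted; the function names and the one-second tolerance are illustrative choices, not part of any tooling mentioned in this bug:

```python
import re

# Matches the signed offset in either an ntpd "clock_step" line or an
# ntpdate "step time server ... offset" line, as quoted in this bug.
OFFSET_RE = re.compile(r"(?:clock_step|offset)\s+(-?\d+(?:\.\d+)?)\s*s")


def step_offset_seconds(line):
    """Return the stepped offset in seconds, or None if the line has none."""
    match = OFFSET_RE.search(line)
    return float(match.group(1)) if match else None


def whole_hours(offset_seconds, tolerance=1.0):
    """Return the offset as a non-zero whole number of hours if it lies
    within `tolerance` seconds of one, else None.

    The tolerance is an assumption made for this sketch.
    """
    hours = round(offset_seconds / 3600)
    if hours != 0 and abs(offset_seconds - hours * 3600) <= tolerance:
        return hours
    return None
```

Applied to the quoted offsets, -17999.988448 s comes out as -5 hours and -28799.497552 s as -8 hours.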
Updated•9 years ago
Assignee: relops → dustin
Flags: needinfo?(rail)
Comment 1•9 years ago
We had something similar in bug 962099. I'm not sure how common this is in AWS though. :/
Flags: needinfo?(rail)
Comment 3•9 years ago
I can't find where I said it before, but I had issues with my new BMs as well (120..123), with ntpdate doing the very same thing:
Feb 12 09:50:26 buildbot-master67.bb.releng.use1.mozilla.com ntpdate[1102]: step time server 10.26.75.40 offset -28799.497552 sec
Off by many hours. And in those cases it always seemed to happen after buildbot was up and running.
Reporter
Comment 4•9 years ago
I won't be available during the TCW, so someone else will need to take care of this.
Assignee: dustin → relops
Updated•9 years ago
Assignee: relops → jlund
Assignee
Comment 5•9 years ago
/me takes steps:
1) disable in slavealloc
2) python buildfarm/maintenance/manage_masters.py -f buildfarm/maintenance/production-masters.json -H bm67-tests1-linux64 graceful_stop
3) terminate in console
4) aws_create_instance -c configs/buildbot-master -r us-east-1 -s aws-releng -k /builds/aws_manager/secrets/aws-secrets.json --ssh-key ~/.ssh/aws-ssh-key -i ./instance_data/us-east-1.instance_data_master.json buildbot-master67
5) python buildfarm/maintenance/manage_masters.py -f buildfarm/maintenance/production-masters.json -H bm67-tests1-linux64 start
6) enable in slavealloc
7) mark as done and celebrate saturday
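The steps above could be written down as a dry-run plan before touching anything. A minimal Python sketch, assuming the commands are run exactly as quoted in this comment; the helper names are hypothetical, the slavealloc and console steps are treated as manual, and nothing is executed:

```python
MANAGE = "buildfarm/maintenance/manage_masters.py"
MASTERS_JSON = "buildfarm/maintenance/production-masters.json"


def manage_masters_cmd(master, action):
    """Build the manage_masters.py argv quoted in the comment above."""
    return ["python", MANAGE, "-f", MASTERS_JSON, "-H", master, action]


def create_instance_cmd(hostname):
    """Build the aws_create_instance argv quoted in the comment above."""
    return [
        "aws_create_instance",
        "-c", "configs/buildbot-master",
        "-r", "us-east-1",
        "-s", "aws-releng",
        "-k", "/builds/aws_manager/secrets/aws-secrets.json",
        "--ssh-key", "~/.ssh/aws-ssh-key",
        "-i", "./instance_data/us-east-1.instance_data_master.json",
        hostname,
    ]


def recreate_plan(master, hostname):
    """Return ordered (description, argv) pairs; argv is None for manual steps."""
    return [
        ("disable in slavealloc", None),
        ("graceful stop", manage_masters_cmd(master, "graceful_stop")),
        ("terminate in console", None),
        ("create instance", create_instance_cmd(hostname)),
        ("start master", manage_masters_cmd(master, "start")),
        ("enable in slavealloc", None),
    ]


if __name__ == "__main__":
    plan = recreate_plan("bm67-tests1-linux64", "buildbot-master67")
    for number, (description, argv) in enumerate(plan, 1):
        # No argument contains spaces, so a plain join is enough for display.
        shown = " ".join(argv) if argv else "(manual)"
        print(f"{number}) {description}: {shown}")
```

Keeping the manual steps in the same list as the scripted ones preserves the ordering, which matters here: the master has to be disabled and stopped gracefully before the instance is terminated.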
Updated•9 years ago
Flags: cab-review? → cab-review+
Assignee
Comment 6•9 years ago
master disabled and stopped, terminated, and is currently in the process of being recreated. puppetizing should complete soon
Assignee
Comment 7•9 years ago
master has come back up. I noticed that it was reset back to an m3.medium, so I stopped it, bumped it back up to m3.large, and ensured it had swap. ni: myself to come back to this slave and ensure its last few jobs look good
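One way to confirm the swap part of that check is to read SwapTotal from /proc/meminfo on the master. A minimal Python sketch, assuming a Linux host; the function name is hypothetical, and this is not necessarily how it was verified here:

```python
def swap_total_kb(meminfo_text):
    """Return SwapTotal in kB from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line looks like "SwapTotal:       2097148 kB".
            return int(line.split()[1])
    return None
```

Passing it the contents of /proc/meminfo returns 0 when no swap is configured, so a non-zero result is the thing to look for after resizing the instance.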
Flags: needinfo?(jlund)
Assignee
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee
Comment 8•9 years ago
looks healthy
Assignee
Updated•9 years ago
Flags: needinfo?(jlund)
Updated•9 years ago
Change Request: --- → approved
Flags: cab-review+