Closed Bug 551205 Opened 14 years ago Closed 14 years ago

[Tracking Bug] Downtime for March 11th

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: catlee)

Planned events:

* Land bug 550876
* Restart pm02
* Add-ons branch (bug 542910)
* Reconfig of Talos master to enable the addonsmgr branch
* Reset Try
Assignee: armenzg → catlee
Depends on: 550256
This downtime has commenced.
Memory usage for buildbot before restart is: 59%

Started with:
 nohup ./start_buildbot.sh &
 disown

Pruned events with:
  python ~/tools/buildfarm/maintenance/purge_events.py


Memory usage after the restart is 60%, while pm01 is at 22%.

In my opinion, the difference shouldn't be that large.
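
The thread does not say how these percentages were measured. A minimal sketch, assuming they are the resident-memory share of the buildbot process on a Linux host (similar to the %MEM column in top or ps), is shown here. The function names are hypothetical and not part of the buildfarm tools.

```python
import re


def parse_kb(text, field):
    """Return the kB value of `field` from /proc-style 'Name:  123 kB' text."""
    pattern = r"^%s:\s+(\d+)\s*kB" % re.escape(field)
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError("field %r not found" % field)
    return int(match.group(1))


def memory_percent(status_text, meminfo_text):
    """Resident set size as a percentage of total physical memory."""
    rss_kb = parse_kb(status_text, "VmRSS")
    total_kb = parse_kb(meminfo_text, "MemTotal")
    return 100.0 * rss_kb / total_kb


def memory_percent_for_pid(pid):
    """Read the Linux /proc files for `pid` (assumption: Linux-only)."""
    with open("/proc/%d/status" % pid) as status_file:
        status_text = status_file.read()
    with open("/proc/meminfo") as meminfo_file:
        meminfo_text = meminfo_file.read()
    return memory_percent(status_text, meminfo_text)
```

Running memory_percent_for_pid with the master's process ID on both pm01 and pm02 would give directly comparable numbers.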

We discovered that changes.pck is huge on pm02 (200MB), while on pm01 it is only 1.4MB.

After removing it and starting the master again, pm02's memory usage stayed at around 45%.
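
A quick way to spot an oversized changes.pck is to list the pickle files in a master's directory along with their sizes. This is only a sketch: the function name is hypothetical, and the directory passed in is whatever path the master actually lives at, which this bug does not state.

```python
import os


def pickle_sizes_mb(master_dir):
    """Map each *.pck file directly inside master_dir to its size in MB."""
    sizes = {}
    for name in sorted(os.listdir(master_dir)):
        if name.endswith(".pck"):
            path = os.path.join(master_dir, name)
            # os.path.getsize returns bytes; convert to megabytes.
            sizes[name] = os.path.getsize(path) / (1024.0 * 1024.0)
    return sizes
```

Comparing the result for the two masters would show the kind of gap described above (roughly 200MB versus 1.4MB for changes.pck).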

The differences between these masters are the following:
* 254 builders vs 351 builders
* 3 branches vs 5 branches + 6 mobile branches
* 2 HgAllLocalesPoller vs 3 HgAllLocalesPoller (currently they are even - mobile repacks-on-change are disabled)
* pm02 has three extra pollers (mobile, mc and m192) besides the per-branch poller differences added in misc.py


I noticed that staging-master02 also had really high memory usage (>50%); after restarting and pruning, it is at 44% (almost the same as production-master02). It seems the difference in slaves is irrelevant.
The only thing left here is restarting the try repo. We are waiting on IT.
All done here. The Try server will be done at a later time.
Status: NEW → RESOLVED
Closed: 14 years ago
No longer depends on: 550256
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering