Closed
Bug 1151284
Opened 10 years ago
Closed 10 years ago
APAC and EU wireless upgrade
Categories
(Infrastructure & Operations :: Change Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: van, Assigned: van)
Details
The current controller software has critical bugs that have been fixed in the latest update. We need to upgrade these controllers' software ASAP. Refer to bug 1150113 for more info.
date, time, duration of maintenance
April 17, 10PM PDT, 12 hours
system(s) affected
APAC and EU (par1, lon1, tpe1, akl1, ber1) wireless controllers and APs
end-user impact
wireless will be unavailable for ~30min (60min) while the APs reboot to install their new code release
maintenance plan and timeline
https://mana.mozilla.org/wiki/display/NETOPS/Juniper+wireless+upgrade#Juniperwirelessupgrade-WLC%28MSS:MobilitySystemSoftware%29
rollback plan / rollback point
If any issue is noticed (instability, connection issues, etc.), restart the controllers to their backup partition and investigate.
notification mechanisms
Oncall, whistlepig
who will be point, who else will be involved
Van, Arzhel
| Assignee | ||
Updated•10 years ago
Flags: cab-review+
| Assignee | ||
Updated•10 years ago
Flags: cab-review+ → cab-review?
| Assignee | ||
Comment 2•10 years ago
APAC upgrade completed without issues. EU upgrade completed, but we still have 1 AP down (wap302.ops.lon1). The AP boots but won't download the image. It stops/crashes at the 5 minute mark. I am working with JTAC to resolve the issue or RMA the device, depending on the outcome of their review of our logs.
I've disabled PoE on the AP's interface and will try to reenable it tomorrow to see if the issue resolves. (Hoping that it just requires a long drain, as several reboot attempts didn't work.)
| Assignee | ||
Comment 3•10 years ago
Attempted to bring the AP back online this morning. Still the same issue: the AP tries to download the image for 5 minutes, then gives up. 2015-0411-0058 opened for WLA RMA.
The AP is displaying some unusual behaviors.
packet loss while downloading image:
--- 10.246.1.2 ping statistics ---
500 packets transmitted, 403 packets received, 0 errors, 19% packet loss
packet gain after it gives up (more replies received than packets sent):
--- 10.246.1.2 ping statistics ---
500 packets transmitted, 742 packets received, 0 errors, 0% packet loss
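The two ping summaries above can be sanity-checked with a few lines of Python. This is only a sketch: the `summarize` helper and its regular expression are hypothetical, written for the summary format pasted in this comment, and are not part of any tool used during the upgrade. It recomputes the loss percentage and counts surplus replies, which ping reports as 0% loss and which usually point to duplicated replies.

```python
import re

# Hypothetical helper: matches the "N packets transmitted, M packets received"
# part of the ping summary lines pasted in this comment.
PING_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def summarize(line):
    """Return (loss_percent, surplus_replies) for one ping summary line."""
    match = PING_SUMMARY.search(line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % line)
    sent, received = int(match.group(1)), int(match.group(2))
    lost = max(sent - received, 0)      # requests that were never answered
    surplus = max(received - sent, 0)   # replies beyond one per request
    return 100.0 * lost / sent, surplus


during_download = summarize(
    "500 packets transmitted, 403 packets received, 0 errors, 19% packet loss"
)
after_giving_up = summarize(
    "500 packets transmitted, 742 packets received, 0 errors, 0% packet loss"
)

print("during download: %.1f%% loss, %d surplus replies" % during_download)
print("after giving up: %.1f%% loss, %d surplus replies" % after_giving_up)
```

The first summary works out to 97 lost packets out of 500, which matches the reported 19% after truncation. The second shows 242 more replies than requests, which a plain loss percentage hides.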
*wifi2.ops.lon1.mozilla.net# show log trace match 4632 -300
APM_RF Apr 11 18:12:41.332410 DEBUG AUTORF_ERROR: autorf_reset_radio_state: AP 4632: ap not in configured state
APM_RF Apr 11 18:12:41.332319 DEBUG AUTORF_ERROR: autorf_reset_radio_state: AP 4632: ap not in configured state
SM Apr 11 18:12:41.331254 NOTICE SM-EVENT: APM reports AP 4632 is down (LB)
APM_MGR Apr 11 18:12:41.329894 NOTICE MX_SEL_REC: deleting wlc sel rec for AP 4632. Reason = "connection timer" flags=0
APM_MGR Apr 11 18:11:06.213855 DEBUG dap_mgr_lb_run: Running LB for AP(4632) on Ctrler(10.246.0.30)
WLA Apr 11 18:07:44.023415 INFO AP 4632 network: <254>Dec 31 17:00:25 syslog: dap_set_initial_state: has_wlcinfo was 1, state is 1, try_last_good is 1
WLA Apr 11 18:07:44.023333 INFO AP 4632 tapa: <318>Dec 31 17:00:23 syslog: Boot count: 146 [BsVS: na; UbP:7: 7720118081012]
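The timestamps in this trace are consistent with the "5 minute" behavior described above. As a rough check, the sketch below measures the time between the earliest AP syslog line in the excerpt (the boot count message) and the controller's "connection timer" message. Treating those two lines as the start and end of the attempt is an assumption, and `parse_ts` is a hypothetical helper written for this log layout only.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the 'Mon DD HH:MM:SS.ffffff' timestamp that follows the facility name."""
    fields = line.split()
    # fields[0] is the facility (WLA, APM_MGR, ...); the next three are the timestamp.
    # The log carries no year, so strptime falls back to its default year;
    # that is fine here because only the difference between two lines is used.
    # Note: %b parses English month names under the default C locale.
    return datetime.strptime(" ".join(fields[1:4]), "%b %d %H:%M:%S.%f")


boot_line = (
    "WLA Apr 11 18:07:44.023333 INFO AP 4632 tapa: <318>Dec 31 17:00:23 "
    "syslog: Boot count: 146 [BsVS: na; UbP:7: 7720118081012]"
)
down_line = (
    "APM_MGR Apr 11 18:12:41.329894 NOTICE MX_SEL_REC: deleting wlc sel rec "
    'for AP 4632. Reason = "connection timer" flags=0'
)

elapsed = (parse_ts(down_line) - parse_ts(boot_line)).total_seconds()
print("AP dropped %.1f seconds after the boot message" % elapsed)
```

Under these assumptions the gap is just under 300 seconds, which lines up with the AP giving up at the 5 minute mark.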
| Assignee | ||
Comment 4•10 years ago
I've also tried moving the AP interface out of the interface-range and pruned all VLANs but the ops VLAN, but unfortunately that didn't resolve the issue either. There was still a lot of packet loss during the download phase and the AP still disappears after 5 minutes.
I will configure an AP in SFO1 and we'll send it to LON1 to swap out. I'll see if we can RMA wap302, but it appears to be out of warranty per the support site.
Comment 5•10 years ago
Any known issues with wap301.ops.lon1.mozilla.net? It bounced this morning.
Comment 6•10 years ago
Just lost another WAP in London:
Mon 03:30:30 PDT [5040] wap309.ops.lon1.mozilla.net (10.246.1.9) is DOWN :PING CRITICAL - Packet loss = 100%
Comment 7•10 years ago
At least some of this is locals moving an AP without telling us first :(
| Assignee | ||
Comment 8•10 years ago
The clusters have been upgraded in all regions. We'll be tracking the one bad AP in bug 1154024.
Assignee: server-ops → vle
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
Change Request: --- → approved
Flags: cab-review+