Closed Bug 1237675 Opened 8 years ago Closed 8 years ago

TOR Commons ClearOne VH20 Reporting Critical Error Message

Categories

(Infrastructure & Operations Graveyard :: NetOps: Other, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: trecendez, Assigned: justdave)

Details

The Toronto Commons ClearOne VH20 is flapping between CRITICAL and OK according to Nagios monitoring:

7:05 AM <nagios-scl3> Thu 07:05:39 PST [5157] pbx1.voip.tor1.mozilla.com:ClearOne devices tor is CRITICAL: [ClearOne device failed: 75500] (http://m.mozilla.org/ClearOne+devices+tor)

7:15 AM <nagios-scl3> Thu 07:15:38 PST [5162] pbx1.voip.tor1.mozilla.com:ClearOne devices tor is OK: All devices: devices are OK (http://m.mozilla.org/ClearOne+devices+tor)

Device IP: 10.242.47.7
Device MAC: 00:06:24:01:8A:85

Troubleshooting steps from AV:

1. Power-cycled the device. The issue persisted.
2. Swapped the network cables between the VH20 and the 880. The issue stayed with the VH20, which suggests a possible hardware failure.

The device stays in the "Critical" state for 10 to 50 minutes before the error clears.

Next step from AV: check for a dial tone when the device reports offline.

Creating network ticket to see if NetOps team is seeing any errors on their end regarding this device.
While troubleshooting we came across an active device 75552 that wasn't monitored. sysadmins r113244 adds monitoring for 75552.
Thanks ashish!

To add some information: while the Nagios alert was active, I placed an outbound call through the VH20 ClearOne SIP interface. The call shows up on

https://pbx1.voip.tor1.mozilla.com/asterisk/pbxinfo.cgi

in the SIP channels list:

SIP Channels

Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry     Peer      
10.242.47.7      75500            97c92ba0-af22f0  0x4 (ulaw)       No       Rx: ACK                    75500     

But the SIP peers list shows:

75500/75500               (Unspecified)                            D                 0        UNREACHABLE
Possible registration issue. Can we push this over to JustDave to have a look once he returns from PTO?
They use Bugzilla.  You will need to open a bug and refer to this SN Ticket.
Assignee: network-operations → justdave
Any update on this ticket?

Note that the alert has not recurred since 01/08/2016.
full-20151124:[Nov 23 17:48:29] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151124:[Nov 24 07:20:48] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 8
full-20151129:[Nov 24 08:29:11] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (2ms / 2000ms)
full-20151129:[Nov 24 08:45:48] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (14ms / 2000ms)
full-20151213:[Dec 11 07:16:53] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Dec 11 07:26:22] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 0
full-20151213:[Dec 11 07:35:04] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 0
full-20151213:[Dec 11 07:39:45] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Dec 11 07:40:00] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 8
full-20151213:[Dec 11 07:49:43] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 0
full-20151213:[Dec 11 08:08:34] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 0
full-20151213:[Dec 11 08:39:44] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Dec 11 10:12:00] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Dec 11 10:39:42] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Dec 11 10:41:49] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Dec 11 11:17:43] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 0
full-20151213:[Dec 11 11:39:41] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Dec 11 11:42:43] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Dec 11 12:39:40] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  4 13:58:34] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 8
full-20151213:[Jan  4 14:29:40] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  5 04:32:22] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Jan  5 05:29:25] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  5 07:29:23] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  5 08:29:22] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  5 14:48:44] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 2
full-20151213:[Jan  5 15:29:15] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  6 05:54:20] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 12
full-20151213:[Jan  6 05:55:40] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (5ms / 2000ms)
full-20151213:[Jan  6 06:08:24] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (14ms / 2000ms)
full-20151213:[Jan  6 10:09:19] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  6 11:05:12] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 26
full-20151213:[Jan  6 11:05:23] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (914ms / 2000ms)
full-20151213:[Jan  6 11:49:34] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (33ms / 2000ms)
full-20151213:[Jan  6 12:09:17] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (14ms / 2000ms)
full-20151213:[Jan  6 12:42:49] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (1ms / 2000ms)
full-20151213:[Jan  6 12:43:53] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 1
full-20151213:[Jan  6 12:45:13] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (2ms / 2000ms)
full-20151213:[Jan  6 12:46:17] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 2
full-20151213:[Jan  6 13:10:15] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  7 07:09:57] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  7 08:58:30] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 41
full-20151213:[Jan  7 08:58:40] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (3ms / 2000ms)
full-20151213:[Jan  7 09:14:34] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 25
full-20151213:[Jan  7 09:15:39] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (3ms / 2000ms)
full-20151213:[Jan  7 09:25:30] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 54
full-20151213:[Jan  7 09:26:08] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (4ms / 2000ms)
full-20151213:[Jan  7 09:53:31] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 2
full-20151213:[Jan  7 09:54:08] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (90ms / 2000ms)
full-20151213:[Jan  7 09:56:13] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 1016
full-20151213:[Jan  7 09:56:37] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (128ms / 2000ms)
full-20151213:[Jan  7 10:47:53] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (2ms / 2000ms)
full-20151213:[Jan  7 11:09:53] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (9ms / 2000ms)
full-20151213:[Jan  7 12:04:05] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 112
full-20151213:[Jan  7 12:09:52] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  8 05:52:02] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Jan  8 06:09:33] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  8 06:11:52] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Jan  8 07:09:32] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  8 08:09:31] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan  8 08:48:16] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
full-20151213:[Jan  8 09:05:13] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (14ms / 2000ms)
full-20151213:[Jan 14 21:41:34] NOTICE[17959] chan_sip.c: Peer '75500' is now Lagged. (3008ms / 2000ms)
full-20151213:[Jan 14 21:41:44] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
full-20151213:[Jan 15 00:57:31] NOTICE[17959] chan_sip.c: Peer '75500' is now Lagged. (3008ms / 2000ms)
full-20151213:[Jan 15 00:57:42] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (1008ms / 2000ms)
full-20151213:[Jan 15 02:01:29] NOTICE[17959] chan_sip.c: Peer '75500' is now Lagged. (2008ms / 2000ms)
full-20151213:[Jan 15 02:01:39] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)
Short version of the above: you had multiple instances of several minute disconnects on the 8th.  On the 14th and 15th there were a few times when it was severely lagged, but it lasted less than 10 seconds so nagios didn't notice.

The "UNREACHABLE" here is Asterisk claiming that the ClearOne is not responding when it pings it (SIP OPTIONS packet, usually).
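The UNREACHABLE/Reachable transitions in the log excerpt above can be turned into outage durations with a short script. What follows is a minimal Python sketch, not something that was run for this bug. The function names (parse_events, outages), the regular expression, and the year parameter are assumptions for illustration. The log lines carry no year, so one has to be supplied, and an outage that spans New Year would be computed incorrectly.

```python
import re
from datetime import datetime

# Matches lines such as:
# full-20151213:[Jan  8 05:52:02] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7
LINE_RE = re.compile(
    r"\[(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2})\]"
    r".*?Peer '(?P<peer>[^']+)' is now (?P<state>Reachable|UNREACHABLE|Lagged)"
)


def parse_events(lines, year=2016):
    """Return (timestamp, peer, state) tuples for every matching log line.

    The year is an assumption: the log format does not include one.
    """
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        # Collapse the padded day ("Jan  8") to a single space before parsing.
        stamp = " ".join(m.group("ts").split())
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((ts, m.group("peer"), m.group("state")))
    return events


def outages(events):
    """Pair the first UNREACHABLE with the next Reachable and return
    (start, duration) tuples. Repeated UNREACHABLE lines extend the same
    outage; Lagged lines are ignored."""
    down_since = None
    result = []
    for ts, _peer, state in events:
        if state == "UNREACHABLE":
            if down_since is None:
                down_since = ts
        elif state == "Reachable" and down_since is not None:
            result.append((down_since, ts - down_since))
            down_since = None
    return result


# Four lines copied from the excerpt above.
SAMPLE = [
    "full-20151213:[Jan  8 05:52:02] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7",
    "full-20151213:[Jan  8 06:09:33] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)",
    "full-20151213:[Jan  8 06:11:52] NOTICE[17959] chan_sip.c: Peer '75500' is now UNREACHABLE!  Last qualify: 7",
    "full-20151213:[Jan  8 07:09:32] NOTICE[17959] chan_sip.c: Peer '75500' is now Reachable. (8ms / 2000ms)",
]

if __name__ == "__main__":
    for start, duration in outages(parse_events(SAMPLE)):
        print(f"{start:%b %d %H:%M:%S}  down for {duration}")
```

On the four sample lines this reports two outages on Jan 8, of roughly 17 and 58 minutes, which fits the 10 to 50 minute "Critical" periods described earlier in this bug.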
(In reply to Dave Miller [:justdave] (justdave@bugzilla.org) from comment #7)
> Short version of the above: you had multiple instances of several minute
> disconnects on the 8th.  On the 14th and 15th there were a few times when it
> was severely lagged, but it lasted less than 10 seconds so nagios didn't
> notice.
> 
> The "UNREACHABLE" here is Asterisk claiming that the ClearOne is not
> responding when it pings it (SIP OPTIONS packet, usually).

Are the alerts false positives, given that we were able to dial out during the critical state? If so, how do we prevent them from recurring?
The alerts meant that if Asterisk had tried to place a call TO the ClearOne device, it would not have been able to. A likely cause is that the device's CPU was busy and could not reply fast enough when pinged, or something along those lines. Presumably, when you place a call, the device starts throwing CPU at your call attempt, and then it works.
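The three states that appear in the log (Reachable, Lagged, UNREACHABLE) are consistent with a simple threshold model of the qualify check. The sketch below is inferred from the log lines in this bug, not taken from the Asterisk source. classify_qualify and its parameters are hypothetical names, 2000 ms is the threshold printed in the log, and treating a reply at exactly the threshold as Reachable is an assumption.

```python
from typing import Optional


def classify_qualify(rtt_ms: Optional[int], max_ms: int = 2000) -> str:
    """Model of the peer state after one qualify probe.

    rtt_ms is the round-trip time of the reply in milliseconds,
    or None when no reply arrived at all.
    """
    if rtt_ms is None:
        # No answer to the probe: the peer is treated as down.
        return "UNREACHABLE"
    if rtt_ms > max_ms:
        # The peer answered, but more slowly than the threshold allows.
        return "Lagged"
    return "Reachable"


if __name__ == "__main__":
    # Round-trip times taken from the log excerpt in this bug.
    for rtt in (8, 914, 2008, 3008, None):
        print(rtt, classify_qualify(rtt))
```

Under this model an outbound call from the device can still succeed while the peer is marked UNREACHABLE, because the state only reflects whether the device answered Asterisk's probes in time.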
Closing out unless this comes up again.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard