Closed Bug 1072979 Opened 10 years ago Closed 10 years ago

HTTPS - Port 443 on etherpad.zlb.phx.mozilla.net is CRITICAL: CRITICAL - Socket timeout after 10 seconds

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nagiosapi, Unassigned)

Details

(Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/1337] [id=nagios1.private.phx1.mozilla.com:373317])

Automated alert report from nagios1.private.phx1.mozilla.com:

Hostname: etherpad.zlb.phx.mozilla.net
Service:  HTTPS - Port 443
State:    CRITICAL
Output:   CRITICAL - Socket timeout after 10 seconds

Runbook:  http://m.allizom.org/HTTPS+-+Port+443
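
For context, an alert like this boils down to opening a TCP connection to port 443 and giving up after 10 seconds. The following is a minimal Python sketch of that idea, not the Nagios plugin itself: the helper name `probe_port` and its OK message are made up for illustration, and a real HTTPS check would also complete a TLS handshake and an HTTP request, which this sketch skips.

```python
import socket


def probe_port(host, port=443, timeout=10.0):
    """Try to open a TCP connection and return a Nagios-style status line.

    Hypothetical helper: only the 10 second timeout and the wording of the
    timeout message are taken from the alert above.
    """
    try:
        # create_connection raises socket.timeout when the connection
        # cannot be established within `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "OK - connected to %s:%d" % (host, port)
    except socket.timeout:
        return "CRITICAL - Socket timeout after %g seconds" % timeout
    except OSError as exc:
        # Refused connections, DNS failures and similar errors land here.
        return "CRITICAL - %s" % exc
```

Calling `probe_port("etherpad.zlb.phx.mozilla.net")` would return one of the status lines above, depending on whether the load balancer answers in time.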
[root@etherpad3.webapp.phx1 lhirlimann]# screen -r

        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707) 
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
2014-09-25 07:49:53.881::WARN:  EXCEPTION
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:165)  
        at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
        at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:565)
        at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192)
        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707) 
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Task checkForStalePads execution failed with non-200 response: 500
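
"Too many open files" means the Jetty process reached its per-process file descriptor limit, so it could no longer accept new sockets. The bug does not record how (or whether) descriptor usage was inspected; as a rough sketch, assuming a Linux host with /proc mounted, usage can be compared against the soft limit like this. The function name `fd_usage` is hypothetical.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fds, soft_limit) for a process, read from /proc (Linux only).

    soft_limit is None when the limit is reported as "unlimited" or the
    "Max open files" row is missing.
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir("/proc/%s/fd" % pid))
    soft_limit = None
    with open("/proc/%s/limits" % pid) as limits:
        for line in limits:
            # Row layout: "Max open files  <soft>  <hard>  files"
            if line.startswith("Max open files"):
                value = line.split()[3]
                soft_limit = None if value == "unlimited" else int(value)
                break
    return open_fds, soft_limit
```

Passing the PID of the Java process instead of the default `"self"` (as root, or as the user owning the process) shows how close that process is to the limit.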
Assignee: nobody → server-ops-webops
Component: MOC: Incidents → WebOps: Other
QA Contact: dmoore → nmaul
Task checkForStalePads execution failed with non-200 response: 500
^C[2014-09-25 07:59:26.240-0700]: Shutting down...
[2014-09-25 07:59:29.240-0700]: ...done, running onshutdown.
[2014-09-25 07:59:30.543-0700]: ...done, stopping server.
Whiteboard: [id=nagios1.private.phx1.mozilla.com:373317] → [kanban:https://kanbanize.com/ctrl_board/4/1337] [id=nagios1.private.phx1.mozilla.com:373317]
[root@etherpad3.webapp.phx1 lhirlimann]# service etherpad start
Starting soffice: 
Waiting for service to come up ...bash: /usr/lib64/openoffice.org3/program/soffice: No such file or directory
.
Starting etherpad: 

done
[root@etherpad3.webapp.phx1 lhirlimann]#  screen -r
[detached]
[root@etherpad3.webapp.phx1 lhirlimann]#
Closing the bug, since the service is back up. Still wishing for a team-pad-like extension for the new Etherpad. >_<
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard