While verifying security bugs, I noticed that crash reports sent to Socorro are not being queued up and processed. The server status page (https://crash-stats.mozilla.com/status) says there are 13 jobs waiting, which is pretty low, but none of the processors appear to be working on them. The oldest job in the queue has not changed: 2011-12-06 00:30:34.798482
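The symptom reported here (jobs waiting while the oldest job's timestamp never advances) could be checked automatically. The following is only a minimal sketch, not part of Socorro: the function name `queue_is_stalled`, its parameters, and the 30-minute threshold are all assumptions made for illustration, and it expects the timestamp in the format shown in the report.

```python
from datetime import datetime, timedelta

# Hypothetical helper for illustration; not an existing Socorro function.
def queue_is_stalled(oldest_job, waiting_jobs, now, max_age=timedelta(minutes=30)):
    """Return True when jobs are waiting and the oldest one is older than max_age.

    oldest_job is a string such as "2011-12-06 00:30:34.798482".
    The 30-minute default is an assumed threshold, not a Socorro setting.
    """
    if waiting_jobs == 0:
        # An empty queue cannot be stalled.
        return False
    oldest = datetime.strptime(oldest_job, "%Y-%m-%d %H:%M:%S.%f")
    return now - oldest > max_age
```

Called with the values from this report and a check time a couple of hours later, the function would flag the queue as stalled even though only 13 jobs are waiting, since the check depends on the age of the oldest job rather than on queue length.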
HBase logs show that network communication with ZooKeeper was interrupted at 00:30 Pacific time, and one minute later the entire cluster shut down. I restarted the cluster and it is in the process of coming back up. It should be operational in an hour.
1. This is the second network event since our request to relocate the ZK nodes was denied. We should re-evaluate that decision.
2. I will check with tmary tomorrow on why on-call did not have the docs to cover this restart, since it was nice and clean, about the easiest HBase issue there is to handle.
Assignee: nobody → deinspanjer
Thanks Daniel. Stats are looking good now and jobs are getting processed.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Component: Socorro → General
Product: Webtools → Socorro