Closed Bug 479985 Opened 17 years ago Closed 17 years ago

Crash-stats.mozilla.com seems to be overloaded

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cmtalbert, Assigned: reed)

Details

I hit a bad crash last night on Shiretoko, and was trying to get my crash stacks out of Socorro, but it has never been able to produce them. It gives me the "we are giving your report priority" page, but it never moves off of that. I let it run for 3 hours last night, and it's been running for 2 more this morning (attempted to refresh all the tabs) and there's still no data.

The stacks are:
http://crash-stats.mozilla.com/report/pending/ce7a37b4-ebbc-471f-ac37-b3a3c2090223
http://crash-stats.mozilla.com/report/pending/7825f177-1058-4fad-aef2-6afa42090223
http://crash-stats.mozilla.com/report/pending/7541f53f-05db-40dd-a194-ac8372090223

This is pretty critical because this crash should be a beta 3 blocker. I've built debug in the meantime on the 1.9.1 branch and will attempt to get a stack from there to be sure the bug is filed.
Not really overloaded (status page says otherwise), but these aren't coming through for some reason. http://crash-stats.mozilla.com/status
Weren't those crashes transmitted correctly? Seems like they are not in the database yet.
Might be due to the nature of the crash. It happens very quickly on startup. Was able to ascertain that the crash is already reported and known, so knocking this down a couple of notches on the severity rating. The crash reporter told me that they were submitted successfully. What else can I check to see if it was lying?
Severity: critical → normal
chizu: you're the only one with access to the database and filesystem, so only you can tell us what has gone wrong.

1st step: grep the monitor's log for any of ce7a37b4-ebbc-471f-ac37-b3a3c2090223, 7825f177-1058-4fad-aef2-6afa42090223, 7541f53f-05db-40dd-a194-ac8372090223. If there are any hits, send me the log so I can see what happened.

2nd step (if there were no hits in the 1st step): the configuration for collector/monitor/processor has definitions for two file system paths called storageRoot and deferredStorageRoot. Search those paths for any of the following files: ce7a37b4-ebbc-471f-ac37-b3a3c2090223.json, 7825f177-1058-4fad-aef2-6afa42090223.json, 7541f53f-05db-40dd-a194-ac8372090223.json.

Report back to me any findings or lack thereof.
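For anyone repeating these two steps, here is a minimal Python sketch of them. It assumes the monitor log is a plain text file and that storageRoot and deferredStorageRoot are ordinary directory trees. The helper names `grep_log` and `find_json_files` are made up for illustration and are not part of Socorro; the actual paths have to come from the collector/monitor/processor configuration.

```python
import os

# The three crash IDs from this bug.
CRASH_IDS = [
    "ce7a37b4-ebbc-471f-ac37-b3a3c2090223",
    "7825f177-1058-4fad-aef2-6afa42090223",
    "7541f53f-05db-40dd-a194-ac8372090223",
]


def grep_log(log_path, crash_ids):
    """Step 1: return {crash_id: [matching log lines]} for IDs found in the log."""
    hits = {}
    with open(log_path, errors="replace") as log:
        for line in log:
            for crash_id in crash_ids:
                if crash_id in line:
                    hits.setdefault(crash_id, []).append(line.rstrip("\n"))
    return hits


def find_json_files(roots, crash_ids):
    """Step 2: walk each storage root and return paths named <crash_id>.json."""
    wanted = {crash_id + ".json" for crash_id in crash_ids}
    found = []
    for root in roots:
        # os.walk silently yields nothing if a root does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name in wanted:
                    found.append(os.path.join(dirpath, name))
    return sorted(found)
```

An empty result from both functions would correspond to the "no hits" case in both steps, which is worth reporting back as well.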
There is a problem with processor or monitor -- this needs to be looked at ASAP.
Severity: normal → blocker
Assignee: server-ops → reed
OS: Mac OS X → All
Hardware: x86 → All
Since no new crashes are being processed at all, I suspect that the monitor has halted for some reason. Could someone get me the last 100 lines or so of its log file?
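If `tail` is not handy on the box, the last 100 lines can be pulled with a few lines of Python. This is only a sketch; the function name `tail_lines` is made up here and the log path is whatever the monitor is configured to write to.

```python
from collections import deque


def tail_lines(path, n=100):
    """Return the last n lines of a text file without loading all of it."""
    with open(path, errors="replace") as f:
        # A bounded deque keeps only the most recent n lines as the file is read.
        return list(deque(f, maxlen=n))
```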
Both the monitor and the processor were running, and I saw stackwalk processing dumps. I restarted both of them anyway. monitor.log is at people.mozilla.org/~reed/monitor.log.tar.bz2 for your perusal.
Mike, do we know of a good way to monitor for this condition? Something that can be pageable? (maybe we already know but I don't know...)
Could probably do something with the status page and provide a value for "reports processed in last 5 minutes" that nagios can check. Lars - any ideas there?
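A rough sketch of what such a nagios check could look like, assuming the status page (or a database query) can hand us a single number for "reports processed in last 5 minutes". How that number is obtained is left out on purpose, since the status page does not expose it today. The thresholds and the name `evaluate` are placeholders; the only fixed part is the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(processed, warn_below=10, crit_below=1):
    """Map a 'reports processed in last 5 minutes' count to (exit_code, message).

    processed  -- the count, or None if it could not be fetched
    warn_below -- hypothetical threshold: fewer than this is a WARNING
    crit_below -- hypothetical threshold: fewer than this is CRITICAL
    """
    if processed is None:
        return UNKNOWN, "UNKNOWN - could not read processed count"
    if processed < crit_below:
        return CRITICAL, "CRITICAL - %d reports processed in last 5 minutes" % processed
    if processed < warn_below:
        return WARNING, "WARNING - %d reports processed in last 5 minutes" % processed
    return OK, "OK - %d reports processed in last 5 minutes" % processed
```

A wrapper script would print the message and exit with the returned code. With the placeholder defaults above, a stalled pipeline like the one in this bug (zero reports processed) would come back as CRITICAL and could page someone.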
Aravind has bug 426941, bug 465687, and bug 465689 on his plate to deal with monitoring. Since it's working now, I'm resolving this.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Was the existing monitor killed with "-9"? I cannot see in the log that it properly shut itself down. The log showed that it was working normally until it was restarted. Were the existing processors killed using "-9"? I'd like to see their logs, too.
(In reply to comment #12)
> was the existing monitor killed with "-9"?

Yes.

> Were the existing processors killed using "-9"? I'd like to see their logs,
> too.

Yes, but I only killed one processor. I just found the other processor on another machine, but I didn't touch it earlier. Logs for both processors are available at http://people.mozilla.org/~reed/socorro/processor_logs.tar.bz2.
Product: mozilla.org → mozilla.org Graveyard