Closed
Bug 479985
Opened 17 years ago
Closed 17 years ago
Crash-stats.mozilla.com seems to be overloaded
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cmtalbert, Assigned: reed)
Details
I hit a bad crash last night on Shiretoko, and was trying to get my crash stacks out of Socorro, but it has never been able to produce them. It gives me the "we are giving your report priority" page, but it never moves off of that. I let it run for 3 hours last night, and it's been running for 2 more this morning (attempted to refresh all the tabs) and there's still no data.
The stacks are:
http://crash-stats.mozilla.com/report/pending/ce7a37b4-ebbc-471f-ac37-b3a3c2090223
http://crash-stats.mozilla.com/report/pending/7825f177-1058-4fad-aef2-6afa42090223
http://crash-stats.mozilla.com/report/pending/7541f53f-05db-40dd-a194-ac8372090223
This is pretty critical because this crash should be a beta 3 blocker. I've built debug in the meantime on the 1.9.1 branch and will attempt to get a stack from there to be sure the bug is filed.
Comment 1•17 years ago
Not really overloaded (status page says otherwise), but these aren't coming through for some reason.
http://crash-stats.mozilla.com/status
Comment 2•17 years ago
Weren't those crashes transmitted correctly? Seems like they are not in the database yet.
Comment 3•17 years ago
Might be due to the nature of the crash. It happens very quickly on startup. Was able to ascertain that the crash is already reported and known, so knocking this down a couple of notches on the severity rating. The crash reporter told me that they were submitted successfully. What else can I check to see if it was lying?
Severity: critical → normal
Comment 4•17 years ago
chizu: you're the only one with access to the database and filesystem, so only you can tell us what has gone wrong.
1st step - grep the monitor's log for any of ce7a37b4-ebbc-471f-ac37-b3a3c2090223, 7825f177-1058-4fad-aef2-6afa42090223, 7541f53f-05db-40dd-a194-ac8372090223. If there are any hits, send me the log so I can see what happened.
2nd step (if there were no hits on the 1st step): In the configuration for collector/monitor/processor there are definitions for some file system paths called storageRoot and deferredStorageRoot. Search those paths for any of the following files: ce7a37b4-ebbc-471f-ac37-b3a3c2090223.json, 7825f177-1058-4fad-aef2-6afa42090223.json, 7541f53f-05db-40dd-a194-ac8372090223.json
Report back to me any findings or lack thereof.
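For anyone repeating the two steps above, here is a minimal sketch in Python. It is not part of Socorro; the function names are made up for illustration, and the log path and the two storage roots must be taken from your own collector/monitor/processor configuration.

```python
import os

# Crash IDs from this bug report.
CRASH_IDS = [
    "ce7a37b4-ebbc-471f-ac37-b3a3c2090223",
    "7825f177-1058-4fad-aef2-6afa42090223",
    "7541f53f-05db-40dd-a194-ac8372090223",
]


def grep_log(log_path, crash_ids):
    """Step 1: return (line_number, line) pairs that mention any crash ID."""
    hits = []
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if any(crash_id in line for crash_id in crash_ids):
                hits.append((number, line.rstrip("\n")))
    return hits


def find_json_files(roots, crash_ids):
    """Step 2: walk each root and return full paths of <crash_id>.json files."""
    wanted = {crash_id + ".json" for crash_id in crash_ids}
    found = []
    for root in roots:
        # os.walk silently yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name in wanted:
                    found.append(os.path.join(dirpath, name))
    return found
```

Call `grep_log(path_to_monitor_log, CRASH_IDS)` first. Only if it returns an empty list, call `find_json_files([storage_root, deferred_storage_root], CRASH_IDS)` with the values of storageRoot and deferredStorageRoot from the configuration.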
Comment 5•17 years ago
There is a problem with processor or monitor -- this needs to be looked at ASAP.
Severity: normal → blocker
Assignee
Updated•17 years ago
Assignee: server-ops → reed
OS: Mac OS X → All
Hardware: x86 → All
Comment 6•17 years ago
Since no new crashes are being processed at all, I suspect that the monitor has halted for some reason. Could someone get me the last 100 lines or so of its log file?
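If a shell `tail` is not convenient, the last lines of the log can be pulled with a few lines of Python. This is only a sketch; the function name is made up, and the log path is whatever the monitor is configured to write to.

```python
from collections import deque


def tail(log_path, count=100):
    """Return the last `count` lines of a text file, oldest first."""
    with open(log_path, errors="replace") as log:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return list(deque(log, maxlen=count))
```

The returned lines keep their trailing newlines, so `"".join(tail(path))` gives text that can be pasted or attached as-is.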
Assignee
Comment 7•17 years ago
Both the monitor and the processor were running, and I saw stackwalk processing dumps. I restarted both of them anyway. monitor.log is at people.mozilla.org/~reed/monitor.log.tar.bz2 for your perusal.
Comment 8•17 years ago
This just worked for me:
http://crash-stats.mozilla.com/report/index/3e1c9474-2ed8-448f-86c1-8fdde2090224?p=1
Comment 9•17 years ago
Mike, do we know of a good way to monitor for this condition? Something that can be pageable? (maybe we already know but I don't know...)
Comment 10•17 years ago
Could probably do something with the status page and provide a value for "reports processed in last 5 minutes" that nagios can check. Lars - any ideas there?
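As a rough illustration of that idea, the sketch below maps a "reports processed in last 5 minutes" count onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). How the count would be read from the status page is left out because no such value exists yet; the function name and both thresholds are placeholders, not agreed values.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_processed(count, warn_below=50, crit_below=1):
    """Map a processed-report count to (exit_code, status_line).

    `count` is the hypothetical "reports processed in last 5 minutes"
    value; pass None when it could not be read. Thresholds are placeholders.
    """
    if count is None:
        return UNKNOWN, "UNKNOWN - could not read processed-report count"
    if count < crit_below:
        return CRITICAL, f"CRITICAL - {count} reports processed in last 5 minutes"
    if count < warn_below:
        return WARNING, f"WARNING - {count} reports processed in last 5 minutes"
    return OK, f"OK - {count} reports processed in last 5 minutes"
```

A wrapper script would print the status line and exit with the returned code, which is what lets Nagios page someone when processing stalls the way it did here.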
Assignee
Comment 11•17 years ago
Aravind has bug 426941, bug 465687, and bug 465689 on his plate to deal with monitoring.
Since it's working now, I'm resolving this.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 12•17 years ago
Was the existing monitor killed with "-9"? I cannot see in the log that it properly shut itself down. The log showed that it was working normally until it was restarted.
Were the existing processors killed using "-9"? I'd like to see their logs, too.
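For context on why "-9" matters: SIGKILL cannot be caught, so a process killed that way never gets a chance to write a shutdown line to its log, while SIGTERM can be handled. The sketch below shows the general pattern in Python; it is an illustration only and not the monitor's or processor's actual shutdown code. The class and function names are invented.

```python
import logging
import signal


class ShutdownFlag:
    """Records that a catchable termination signal arrived."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # This line is what would show up in the log on a clean shutdown.
        logging.info("received signal %d, shutting down cleanly", signum)
        self.requested = True


def install_handlers(flag):
    """Route SIGTERM and SIGINT to the flag; SIGKILL (-9) cannot be routed."""
    signal.signal(signal.SIGTERM, flag.handle)
    signal.signal(signal.SIGINT, flag.handle)
```

A main loop written as `while not flag.requested:` would finish its current item and log its exit after a plain `kill`, but would vanish mid-entry after `kill -9`, which matches a log that looks normal right up to the restart.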
Assignee
Comment 13•17 years ago
(In reply to comment #12)
> Was the existing monitor killed with "-9"?
Yes.
> Were the existing processors killed using "-9"? I'd like to see their logs,
> too.
Yes, but I only killed one processor. I just found the other processor on another machine, but I didn't touch it earlier.
Logs for both processors are available at http://people.mozilla.org/~reed/socorro/processor_logs.tar.bz2.
Updated•11 years ago
Product: mozilla.org → mozilla.org Graveyard