Closed
Bug 1235436
Opened 8 years ago
Closed 8 years ago
All Socorro Processors failing on memory error
Categories
(Socorro :: Backend, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: lars, Unassigned)
At approximately 7:30am PST, all the Socorro processors died within moments of each other. Investigation showed that the Python module ujson was core dumping and killing the entire processor process. For a reason that is not clear, the automatic restart was failing. Interestingly, while the problem is recurring every few minutes on all of the processors, the automatic restart is succeeding 95% of the time.

The problem appears to happen in the processor rule "OutOfMemoryBinaryRule" when it attempts to read the JSON "memory report" submitted by the client. ujson raises an unrecoverable "double free" or "memory corruption" error.

There is no history of this problem happening in the past. It began suddenly and continues to recur every few minutes. It is not the same single crash repeating over and over; each crash that triggers the problem is new. Something changed on the Web that induces a crash in Firefox that in turn induces a crash in ujson, which brings the processors down.

Experimenting with a workaround: substituting the 'json' module for 'ujson' forestalls the problem entirely. PR pending...

Collectors and crashmovers are not affected. No crashes are being lost.
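For illustration only, the workaround described above might look roughly like the sketch below. It is not the pending PR. The function name `load_memory_report` is hypothetical, and it assumes the memory report arrives as UTF-8 encoded JSON bytes. The point is that the standard-library `json` module reports malformed input as a Python exception, which the rule can catch, rather than aborting the whole process.

```python
import json


def load_memory_report(raw_bytes):
    """Parse a client-submitted memory report with the stdlib json module.

    Hypothetical helper: returns the decoded object, or None when the
    payload is not valid UTF-8 or not valid JSON.
    """
    try:
        # json.loads raises ValueError (JSONDecodeError) on bad input;
        # UnicodeDecodeError is also a ValueError subclass.
        return json.loads(raw_bytes.decode("utf-8"))
    except ValueError:
        return None


if __name__ == "__main__":
    print(load_memory_report(b'{"reports": []}'))
    print(load_memory_report(b'{"reports": ['))
```

A caller would treat a `None` result as "no usable memory report" and carry on processing the rest of the crash.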
Reporter
Comment 1•8 years ago
Interestingly, the problem ceased almost exactly at noon, 12pm DST 2015-12-18.
Reporter
Comment 2•8 years ago
And then it came back for a couple of hours on 12/28. 3 processors died of it; 7 recovered but gained no immunity.
Comment 3•8 years ago
How can we get access to these kinds of blobs for local testing/debugging?
Comment 4•8 years ago
By the way we're running a release of ujson from April 2014. https://bugzilla.mozilla.org/show_bug.cgi?id=1237386
Comment 5•8 years ago
Upgrading ujson is unlikely to solve this. We still need something to reproduce against. But I want to connect the bugs, which we might want to re-evaluate later.
Depends on: 1237386
Comment 6•8 years ago
It happened again. ujson 1.35 "solved it", but we don't want 1 GB of processed crash data in, so we filed https://bugzilla.mozilla.org/show_bug.cgi?id=1248610, which took care of it.
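As a rough sketch of how an oversized memory report could be kept out of the processed crash, one option is to cap how many bytes are read before parsing. This is an assumption for illustration, not a description of what the linked bug actually changed. The names `MAX_REPORT_BYTES` and `load_capped_report` and the 20 MB limit are hypothetical.

```python
import io
import json

# Hypothetical cap; the real limit (if any) is not stated in this bug.
MAX_REPORT_BYTES = 20 * 1024 * 1024


def load_capped_report(fp, max_bytes=MAX_REPORT_BYTES):
    """Read at most max_bytes from a binary file object and parse as JSON.

    Returns the decoded object, or a small dict with an "error" key
    when the report is too large or is not valid JSON.
    """
    # Read one byte past the cap so an oversized report is detectable
    # without pulling the whole payload into memory.
    data = fp.read(max_bytes + 1)
    if len(data) > max_bytes:
        return {"error": "memory report too large"}
    try:
        return json.loads(data.decode("utf-8"))
    except ValueError:
        return {"error": "memory report is not valid JSON"}


if __name__ == "__main__":
    small = io.BytesIO(b'{"reports": []}')
    print(load_capped_report(small, max_bytes=1024))
```

Returning a short error record instead of the parsed report keeps the processed crash small while still recording why the memory report is missing.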
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED