Closed Bug 474863 Opened 16 years ago Closed 8 years ago

Invalid minidump being produced without a thread list (can't be processed by minidump_stackwalk, stackwalk.sh failed)

Categories

(Toolkit :: Crash Reporting, defect)

x86
Linux
defect
Not set
normal

RESOLVED INCOMPLETE

People

(Reporter: dholbert, Unassigned)

I just submitted a crash report to the crash-stats server, but it's not processing it -- when I try to load its URL...
http://crash-stats.mozilla.com/report/pending/249e42c9-fc6c-4835-8e95-4ea192090122
... I get a page with these contents:

-------------------
Your report is being processed
    * We've given your report priority
    * Your report should be ready in a minute
    * This page will refresh and display your report's status
    * When your report has been processed, we will redirect you to your report

Queue Info
ID
    249e42c9-fc6c-4835-8e95-4ea192090122
Time Queued
    2009-01-22 13:51:12.196595
Time Started
    2009-01-22 14:20:19.773969
Message
    <type 'instance'>:/home/processor/stackwalk/bin/stackwalk.sh failed with return code 1 when processing dump 249e42c9-fc6c-4835-8e95-4ea192090122
-------------------

Note in particular the final line there re "stackwalk.sh failed" -- looks like something's broken there.

Also, the status page http://crash-stats.mozilla.com/status says the crash server's mood is "Deathly" with 1250 jobs waiting -- I'm not sure if that's related.

(See also bug 459397, filed a few months back, and bug 472775, filed a few weeks back.  They both mention a similar stackwalk.sh failure.)
I've retrieved this crash dump from the production server and sent it through my development server. The actual dump file is huge at 129MB. The processor made these comments:

/home/processor/stackwalk/bin/minidump_stackwalk returned no header lines for reportid: 2028
no thread was identified as the cause of the crash
No signature could be created because we don't know which thread crashed
/home/processor/stackwalk/bin/minidump_stackwalk returned no frame lines for reportid: 2028
/home/processor/stackwalk/bin/minidump_stackwalk failed with return code 1 when processing dump
Clearly, minidump_stackwalk doesn't like this crash dump.  Ted, if you're interested in exploring this further, you can find the dump and json files at: http://people.mozilla.com/~lars/breakpad
I don't have time at the moment to try this out, but could you capture the raw output (stdout+stderr) of running minidump_stackwalk on this dump? Also, 129MB??!! That's ridiculously big. I've only ever seen dumps of up to 1MB in cases of stack overflow.
I've run minidump_stackwalk manually and captured the output.  There was nothing on stdout.  All output came from stderr: http://people.mozilla.com/~lars/breakpad/249e42c9-fc6c-4835-8e95-4ea192090122.stderr
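For anyone repeating this, a minimal sketch of capturing the two streams separately might look like the following. The helper name `capture_output` is made up for illustration, and it assumes a `minidump_stackwalk` binary is on the PATH and is invoked with just the dump path.

```python
import subprocess


def capture_output(cmd):
    """Run cmd (a list of arguments) and return (return code, stdout, stderr).

    stdout and stderr are captured separately so it is obvious which
    stream a message came from.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Hypothetical usage (binary location and dump path are assumptions):
#   rc, out, err = capture_output(["minidump_stackwalk", "some-crash.dump"])
#   print("return code:", rc)
#   print("stderr:", err)
```

With a dump like this one, the expectation would be an empty stdout, a non-zero return code, and everything of interest on stderr.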
Thanks:
2009-01-28 07:57:45: minidump.cc:3635: INFO: GetStream: type 3 not present
2009-01-28 07:57:45: minidump_processor.cc:101: ERROR: Minidump /home/lars/breakpad/oldStyleTestData/testdata.diditfail/249e42c9-fc6c-4835-8e95-4ea192090122.dump has no thread list

That's the relevant output. The minidump doesn't have a list of threads in the process, so it can't actually be processed. Sounds like it's a broken minidump.
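For reference, stream type 3 in the minidump format is the thread list stream, which is why "type 3 not present" and "has no thread list" say the same thing. A rough sketch of checking a dump for it without minidump_stackwalk follows. It only reads the header and stream directory, the function names are made up for illustration, and it is not how Breakpad itself does the check.

```python
import struct

MD_SIGNATURE = 0x504D444D   # "MDMP" read as a little-endian 32-bit value
THREAD_LIST_STREAM = 3      # stream type of the thread list
HEADER_SIZE = 32            # size of the minidump header in bytes
DIR_ENTRY_SIZE = 12         # stream type, data size, RVA (32 bits each)


def stream_types(data):
    """Return the stream types listed in a minidump's stream directory."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short to hold a minidump header")
    signature, _version, stream_count, dir_rva = struct.unpack_from("<IIII", data, 0)
    if signature != MD_SIGNATURE:
        raise ValueError("not a minidump (bad signature)")
    types = []
    for i in range(stream_count):
        offset = dir_rva + i * DIR_ENTRY_SIZE
        if offset + DIR_ENTRY_SIZE > len(data):
            raise ValueError("stream directory is truncated")
        stream_type, _size, _rva = struct.unpack_from("<III", data, offset)
        types.append(stream_type)
    return types


def has_thread_list(data):
    """True if the stream directory contains a thread list entry."""
    return THREAD_LIST_STREAM in stream_types(data)


# Hypothetical usage (the path is an assumption):
#   with open("some-crash.dump", "rb") as f:
#       print(has_thread_list(f.read()))
```

This only confirms that a directory entry exists. It does not validate the contents of the thread list itself.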
I see this is a dump from Linux. On Linux, we do a lot of things in the crashed process to collect all the necessary info. It's a bit fragile, as obviously the process has already crashed, so any time you touch memory or anything you run the risk of having things fail again.
This all means:
 1: Socorro worked normally for this crash (after the 1/29/09 upgrade, Socorro will report the same messages that I echoed in Comment #1 above and then retain the report)
 2: minidump_stackwalk behaved within specs, too.  There's not much it can do with a bad dump.
 3: the problem lies with breakpad on the client or with the client itself.

Options for actions to be taken:
 1: live with it - it's not always possible to generate a proper dump file from a crashed program
 2: try to repeat the trouble and adjust/fix the breakpad client libraries to prevent this from happening.

Ted, I fear you're the one that will have to decide the disposition of this bug.
-> Breakpad Integration, anyway. Socorro is just dealing with a broken dump.
Component: Socorro → Breakpad Integration
Product: Webtools → Toolkit
QA Contact: socorro → breakpad.integration
Summary: crash-stats isn't processing minefield crash-reports (stackwalk.sh failed) → minidump being produced without a thread list (can't be processed by minidump_stackwalk, stackwalk.sh failed)
(In reply to comment #6)
> Options for actions to be taken:
>  2: try to repeat the trouble and adjust/fix the breakpad client libraries to
> prevent this from happening.

FWIW: As the originator of the crash report in question, I have no idea how to reproduce the problem... IIRC, it was just a random crash, and I don't remember much about what I did (if anything) to trigger it.
Yeah, that's a bummer. If we find a reproducible crash that triggers this, we could debug the breakpad code and figure out where it's falling down.
Depends on: 507876
It's quite possible that bug 514188 will fix this.
(In reply to comment #10)
> It's quite possible that bug 514188 will fix this.

I don't think it has (assuming that that fix is in my Firefox 3.6.4 build 6) since I just ran into this - see http://crash-stats.mozilla.com/report/index/faf28c45-38ba-416e-aa9d-2a0612100607
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #9)
> Yeah, that's a bummer. If we find a reproducible crash that triggers this,
> we could debug the breakpad code and figure out where it's falling down.

We can more or less reproduce this crash for our Firefox UI update tests (bug 1202375 and bug 1222197) on Windows only. This is happening together with Marionette. I'm going to add the minidump, extra file, and more information to bug 1202375.
I uploaded the minidump as attachment 8698022 [details]. When I tried to open it with windbg the tool failed with an error 0x80004005. Looks like we indeed still fail to produce valid minidumps here.
Summary: minidump being produced without a thread list (can't be processed by minidump_stackwalk, stackwalk.sh failed) → Invalid minidump being produced without a thread list (can't be processed by minidump_stackwalk, stackwalk.sh failed)
This bug is overly-broad. Specific reproducible crashes that produce bad dumps should be filed separately.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE