10.20 KB, patch
|Details | Diff | Splinter Review|
148.28 KB, application/octet-stream
1.52 KB, application/octet-stream
1.65 KB, text/plain
Since we switched to VC++ 2010, we've been seeing some bad stacks in certain cases. It looks like the new compiler is producing stack unwind information that Breakpad doesn't know how to handle. There are two specific things that are causing problems: 1) A new operator in the unwind expressions: @. This appears to be an alignment operator. 2) A new built-in value: .raSearch. We haven't narrowed down exactly what the difference is between this and .raSearchStart. I have a WIP Breakpad patch that adds support for both of these, but I'm not sure if it's correct, and if it's going to fix the problem.
Created attachment 596661 [details] [diff] [review] Add support for @ operator and .raSearch in Breakpad x86 stackwalker Very much a WIP. I don't know if the @ implementation is correct, and I'm pretty sure the value of .raSearch isn't perfect.
I put my patch for the @ operator up for review upstream: http://breakpad.appspot.com/349002/ bsmedberg has a patch to make .raSearch handling work which he's also going to upstream.
The .raSearch fix is up at http://breakpad.appspot.com/349003/ I need the @ patch to land before I can upload a useful patch for the remaining issue, since it touches/refactors the same code blocks.
Created attachment 597449 [details] [diff] [review] Postfix tokenizer fix, on top of the "@" change For archiving.
The @ operator landed upstream: http://code.google.com/p/google-breakpad/source/detail?r=923
http://code.google.com/p/google-breakpad/source/detail?r=926 http://code.google.com/p/google-breakpad/source/detail?r=927 This bug is now ready to morph into a deployment bug. We need minidump_stackwalk recompiled against breakpad r927 and deployed to staging.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #6) > http://code.google.com/p/google-breakpad/source/detail?r=926 > http://code.google.com/p/google-breakpad/source/detail?r=927 > > This bug is now ready to morph into a deployment bug. We need > minidump_stackwalk recompiled against breakpad r927 and deployed to staging. We now have a jenkins job for breakpad: https://ci.mozilla.org/job/breakpad/52/ The Socorro job automatically pulls the last successful build (which is that one): https://github.com/mozilla/socorro/blob/1c80b6cb7afcc0072d7b6a7010398d2cc0fcf906/scripts/build.sh#L12 https://ci.mozilla.org/job/breakpad/lastSuccessfulBuild/? So I think we don't need to do anything special here, this should automatically go out on the next Socorro release (and should be on the dev server crash-stats-dev.allizom.org already)
rhelmer is going to help shepherd this onto stage so that it can go out on Wednesday with the scheduled release.
Created attachment 601042 [details] Dump for testing on stage I'm going to attach a .dump and .extra file to test this with.
Created attachment 601403 [details] json file for testing on stage This is equivalent to the .extra file, but formatted as JSON so we can use the submitter tool.
I inserted crashes but can't complete testing quite yet, we're doing a DB import on dev and staging hbase seems to be down (bug 730986).
OK here is an example: https://crash-stats-dev.allizom.org/report/index/e80d6c06-0bdb-4072-a71a-baf5f2120229 Can someone help to make sure this looks ok? I'd like to also make sure "normal" jobs work before marking verified.
KaiRo says comment 13 looks ok, and seems to do no harm with other crashes: https://crash-stats-dev.allizom.org/report/index/ad99fc9c-c0ab-4454-8df4-5b37d2120229
Unfortunately, it looks like we also still have MSVC2010-generated builds coming in with bad signatures/stacks: https://crash-stats.mozilla.com/report/list?signature=hang%20%7C%20WaitForSingleObjectEx%20%7C%20google_breakpad%3A%3AExceptionHandler%3A%3AInitialize%28std%3A%3Abasic_string%3Cwchar_t%2C%20std%3A%3Achar_traits%3Cwchar_t%3E%2C%20std%3A%3Aallocator%3Cwchar_t%3E%20%3E%20const%26%2C%20bool%20%28%2A%29%28void%2A%2C%20_EXCEPTION_POINTERS%2A%2C%20MDRawAssertionInfo%2A%29%2C%20bool%20%28%2A%29%28wchar_t%20const%2A%2C%20wc... e.g. bp-12b703af-625d-4f51-9d48-d9e242120305 and bp-3f51f95c-ce33-4301-b96a-066b02120305
Are you sure the symbols were present at processing time? Without actually having the processing log, this looks like the symbols weren't fully uploaded/synced yet.
Oh, ignore me, the google_breakpad symbols are in xul.dll also. I don't know.
Kairo: those two crashes you linked are both 64-bit (note the amd64). We don't currently have useful unwind info in Win64 symbols. That's bug 548035.