Closed Bug 1280659 Opened 8 years ago Closed 7 years ago

Experiment with crash clustering on crash data in telemetry

Categories

(Socorro :: General, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ted, Unassigned)

Once we get production crash data in telemetry (lonnen plz mark this bug as dependent on that bug) Kyle wants to experiment with clustering techniques on it. We're pretty sure the techniques that FuzzManager uses won't scale to this size, but there are other things we can try.
I'm not Lonnen, but I will play Lonnen in an upcoming major motion picture. Pretty sure the "let's send it to telemetry!" bug is bug #1273657. If that's not it, I'll apologize profusely and eat my shorts in penance.
Depends on: 1273657
I've been experimenting with this topic a bit, and the results seem promising.

Some examples of stack traces with different signatures that would have been clustered together:
1
fnhkinlpcwpstructw messagebuilder::writetraversestateout @0xde3bd88b nsbaseappshell::doprocessnextnativeevent processincomingrequest nsbaseappshell::onprocessnextevent mozilla::ipc::messagepump::run nsappshell::processnextnativeevent ns_dispatchtocurrentthread hookutil<t>::callwndproc remoteuianodestub::onmessage messageloop::runhandler nsthread::processnextevent hookutilbase::basecallwndproc cthreadinputmgr::peekmessagew kiuserapcdispatcher ns_processnextevent invokeoncorrectcontext __fndword remoteuianodestub::incoming_find nullinvoker::calltarget handlehookmessage hookbasedserverconnectionmanager::hookcallback kiusercallbackdispatcher dispatchhookw invokeoncorrectcontext2_callback _peekmessage

peekmessagew fnhkinlpcwpstructw messagebuilder::writetraversestateout nsbaseappshell::doprocessnextnativeevent processincomingrequest nsbaseappshell::onprocessnextevent mozilla::ipc::messagepump::run nsappshell::processnextnativeevent ns_dispatchtocurrentthread hookutil<t>::callwndproc remoteuianodestub::onmessage messageloop::runhandler nsthread::processnextevent hookutilbase::basecallwndproc cthreadinputmgr::peekmessagew kiuserapcdispatcher ns_processnextevent invokeoncorrectcontext __fndword remoteuianodestub::incoming_find nullinvoker::calltarget @0xe4458d00 handlehookmessage hookbasedserverconnectionmanager::hookcallback kiusercallbackdispatcher dispatchhookw invokeoncorrectcontext2_callback



2
js::jit::x86encoding::baseassembler::addl_ir js::irregexp::interpretedregexpmacroassembler::loadcurrentcharacter js::irregexp::choicenode::emitoutoflinecontinuation js::irregexp::choicenode::emit @0x27d2a7 js::autoenteroomunsaferegion::crash js::irregexp::interpretedregexpmacroassembler::expand js::irregexp::interpretedregexpmacroassembler::emit32

js::jit::x86encoding::baseassembler::jmp_i js::irregexp::interpretedregexpmacroassembler::loadcurrentcharacter js::irregexp::choicenode::emitoutoflinecontinuation js::irregexp::choicenode::emit js::autoenteroomunsaferegion::crash js::irregexp::interpretedregexpmacroassembler::expand js::irregexp::interpretedregexpmacroassembler::emit32



3
mozilla::condvar::wait nsthread::processnextevent pr_waitcondvar scopedxpcomstartup::~scopedxpcomstartup pr_lock nsobserverservice::notifyobservers waitforsingleobject nsurlclassifierdbservice::observe nsthread::shutdown nsthread::shutdowninternal ntwaitforsingleobject kifastsystemcallret nseventqueue::getevent waitforsingleobjectex nsthread::putevent

nsthread::processnextevent mozilla::condvar::wait zwwaitforsingleobject pr_waitcondvar scopedxpcomstartup::~scopedxpcomstartup pr_lock nsobserverservice::notifyobservers waitforsingleobject nsurlclassifierdbservice::observe nsthread::shutdown nsthread::shutdowninternal kifastsystemcallret nseventqueue::getevent waitforsingleobjectex nsthread::putevent



4
js::jit::eagersimdunbox js::jit::compilebackend js::helperthread::handleionworkload js::helperthread::threadloop js::jit::mdefinition::issimdunbox js::jit::optimizemir

js::jit::compilebackend js::helperthread::handleionworkload js::jit::optimizemir js::helperthread::threadloop js::jit::aliasanalysis::analyze



I'm thinking of building a service that suggests similar stack traces given a stack trace (or similar signatures given a signature, etc.), so we can evaluate the results before using them for clustering. It would also be useful in its own right: for example, to check whether a signature is actually related to multiple bugs (e.g. if there are a lot of very different stack traces with the same signature), or whether a crash is actually fixed (e.g. if the patch that supposedly fixed a bug only changed the stack trace slightly and the bug is still there).
Example of suggestion of similar stack traces given a stack trace: https://pastebin.mozilla.org/8930337.
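
For anyone who wants to poke at the idea, here is a minimal sketch of what "suggest similar stack traces" could look like. It is not the method used for the results above: it assumes each trace is a whitespace-separated bag of lowercased frames (as in the examples in this bug), drops bare-address frames such as @0xde3bd88b, and ranks candidates by Jaccard similarity. The names frames, jaccard, suggest_similar and the crash IDs are hypothetical.

```python
from __future__ import annotations

import re

# Frames that are bare addresses (e.g. "@0xde3bd88b") carry no symbol
# information, so this sketch drops them before comparing traces.
_ADDRESS = re.compile(r"^@0x[0-9a-f]+$")


def frames(trace: str) -> frozenset[str]:
    """Split a whitespace-separated trace into a set of symbolised frames."""
    return frozenset(f for f in trace.split() if not _ADDRESS.match(f))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of frames common to both traces (0.0 = disjoint, 1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_similar(query: str, corpus: dict[str, str], top_n: int = 5):
    """Return up to top_n (crash_id, score) pairs, most similar first."""
    q = frames(query)
    scored = [(cid, jaccard(q, frames(t))) for cid, t in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# The two traces from example 4, plus an unrelated one (hypothetical IDs).
query = (
    "js::jit::eagersimdunbox js::jit::compilebackend "
    "js::helperthread::handleionworkload js::helperthread::threadloop "
    "js::jit::mdefinition::issimdunbox js::jit::optimizemir"
)
corpus = {
    "crash-b": (
        "js::jit::compilebackend js::helperthread::handleionworkload "
        "js::jit::optimizemir js::helperthread::threadloop "
        "js::jit::aliasanalysis::analyze"
    ),
    "crash-c": "mozilla::condvar::wait nsthread::processnextevent pr_waitcondvar",
}
suggestions = suggest_similar(query, corpus)
```

With these inputs the example 4 pair shares 4 of 7 distinct frames, so crash-b ranks first with a score of 4/7, while the unrelated trace scores 0. A set-based measure ignores frame order and repetition, which may or may not be what we want; an order-aware measure would be a reasonable thing to compare against.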
Crash data's coming in from Nightly. Here's an analysis I wrote earlier: https://gist.github.com/chutten/2063fe236a6ed46eb9b566dfa51ea755 

As expected, no JIT frames... and there's an odd JSON error I haven't pinned down that occasionally pollutes things, but I'm able to symbolicate collected stacks from raw crash ping data.

So if you were waiting for data, wait no longer!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED