Bug 1220430
Opened 9 years ago
Closed 8 years ago
Intermittent PROCESS-CRASH | tsvgx | application crashed [@ google_breakpad::ExceptionHandler::WriteMinidump(std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > const &,bool (*)(wchar_t const *,wchar_t const *,void *,_EXCEPTION_P
Categories
(Core :: DOM: Content Processes, defect)
Tracking
RESOLVED
WORKSFORME
| | Tracking | Status |
|---|---|---|
| firefox45 | --- | affected |
People
(Reporter: philor, Unassigned)
References
Details
(Keywords: intermittent-failure)
https://treeherder.mozilla.org/logviewer.html#?job_id=16627810&repo=mozilla-inbound

PROCESS-CRASH | tsvgx | application crashed [@ google_breakpad::ExceptionHandler::WriteMinidump(std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > const &,bool (*)(wchar_t const *,wchar_t const *,void *,_EXCEPTION_POINTERS *,MDRawAssertionInfo *,bool),void *)]
PROCESS-CRASH | tsvgx | application crashed [@ KiFastSystemCallRet + 0x0]
Comment 1•9 years ago
Interesting set of bugs. Looking at this in slightly more detail, I see that we complete the test successfully, and it appears that we make it past the code which parses and stores the results, but we fail on the check for crashes. In fact the browser exits with no error code, so we are seeing minidumps sitting around. The question I have is: should we take action on these or ignore them? Talos proper works just fine. If we have these errors, show them, file bugs; we should have somebody looking at them. I would vote for not checking for errors if we have success, but I know others might have reasons to care about the errors. As a note, the related bugs in here all seem to follow the same pattern. :wlach, what are your thoughts on this?
Flags: needinfo?(wlachance)
Comment 2•9 years ago
Interesting question: is talos used to report only performance changes? It seems like the browser crashed here, which is probably worth investigating, but maybe not under the talos bug category?
Comment 3•9 years ago
yeah, talos is designed to measure performance and it is possible to do that even with these *crashes*. If we could find somebody who can make these crashes actionable, that would be great, otherwise we are missing data and creating more noise for the sheriffs.
Comment 4•9 years ago
Firefox crashing during normal operation is super serious, not something we can ignore. We should get someone to look into these problems when they happen. I'm not sure who that should be, but we should find out IMO.
Flags: needinfo?(wlachance)
Comment 5•9 years ago
(In reply to Joel Maher (:jmaher) from comment #3)
> yeah, talos is designed to measure performance and it is possible to do that
> even with these *crashes*. If we could find somebody who can make these
> crashes actionable, that would be great, otherwise we are missing data and
> creating more noise for the sheriffs.

There is a stack trace of the crash, which will probably help in investigating the issue, even if it is not easily reproducible.
Comment 6•9 years ago
ted, given the dump from comment 0, where do we look for the failure? it shows breakpad at the top of the stack.
Flags: needinfo?(ted)
Comment 7•9 years ago
If you look down to frame 4 you'll see:

17:03:56 INFO - 4 xul.dll!mozilla::dom::ContentParent::ForceKillTimerCallback(nsITimer *,void *) [ContentParent.cpp:fe6809fd4d43 : 3543 + 0xd]

This is the chrome process detecting that the content process is not responding, writing a pair of minidumps for itself+the content process and then killing the content process. Up a bit you can see:

17:03:44 INFO - 2015-10-31 17:03:44,348 INFO : Browser exited with error code: 0

The main process actually exited successfully after doing this. If you look down to the next PROCESS-CRASH line you can see the stack for the content process. Some relevant lines are:

17:04:02 INFO - 9 xul.dll!mozilla::ipc::MessageChannel::WaitForSyncNotify(bool) [WindowsMessageLoop.cpp:fe6809fd4d43 : 1080 + 0x5]
17:04:02 INFO - eip = 0x646bff7c esp = 0x002ded20 ebp = 0x002ded68
17:04:02 INFO - Found by: call frame info
17:04:02 INFO - 10 xul.dll!mozilla::ipc::MessageChannel::Send(IPC::Message *,IPC::Message *) [MessageChannel.cpp:fe6809fd4d43 : 946 + 0xa]
17:04:02 INFO - eip = 0x646c7bb0 esp = 0x002ded70 ebp = 0x002dedc0
17:04:02 INFO - Found by: call frame info
17:04:02 INFO - 11 xul.dll!mozilla::dom::PBrowserChild::SendGetInputContext(int *,int *,int *) [PBrowserChild.cpp:fe6809fd4d43 : 963 + 0x10]
17:04:02 INFO - eip = 0x647a4dce esp = 0x002dedc8 ebp = 0x002dee0c
17:04:02 INFO - Found by: call frame info
17:04:02 INFO - 12 xul.dll!mozilla::widget::PuppetWidget::GetInputContext() [PuppetWidget.cpp:fe6809fd4d43 : 680 + 0x14]
17:04:02 INFO - eip = 0x65309dad esp = 0x002dee14 ebp = 0x002dee28
17:04:02 INFO - Found by: call frame info

It's hung up doing a synchronous IPC call in some IME code.
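For anyone triaging similar logs, the reading of the two stacks above could be roughly automated. The following is only a minimal sketch, not anything Talos or mozcrash actually does: the regex is inferred from the frame lines quoted in this comment, and the names `parse_frames`, `classify`, and the returned labels are hypothetical.

```python
import re

# Frame lines as quoted above look like:
#   "17:03:56 INFO - 4 xul.dll!Some::Function(args) [File.cpp:rev : line + off]"
# The pattern is an assumption based only on those quoted lines.
FRAME_RE = re.compile(
    r"^\s*(?:\d{2}:\d{2}:\d{2}\s+INFO\s+-\s+)?"  # optional log prefix
    r"(?P<index>\d+)\s+"                         # frame number
    r"(?P<module>[^\s!]+)!"                      # module, e.g. xul.dll
    r"(?P<function>.+?)\s+"                      # symbolicated function
    r"\[(?P<file>[^\s:\]]+):"                    # source file name
)

# Function-name fragments taken from the stacks quoted in this comment.
FORCE_KILL_MARKER = "ContentParent::ForceKillTimerCallback"
SYNC_WAIT_MARKER = "MessageChannel::WaitForSyncNotify"


def parse_frames(log_lines):
    """Return (index, module, function, file) for each frame line found."""
    frames = []
    for line in log_lines:
        match = FRAME_RE.match(line)
        if match:
            frames.append((
                int(match.group("index")),
                match.group("module"),
                match.group("function"),
                match.group("file"),
            ))
    return frames


def classify(frames):
    """Label a stack using the two patterns described in this comment."""
    functions = [frame[2] for frame in frames]
    if any(FORCE_KILL_MARKER in fn for fn in functions):
        # Parent (chrome) process wrote minidumps and killed the child.
        return "parent-forced-kill"
    if any(SYNC_WAIT_MARKER in fn for fn in functions):
        # Content process was waiting on a synchronous IPC reply.
        return "child-blocked-on-sync-ipc"
    return "unclassified"
```

Feeding it the frame 4 line would label that stack as a forced kill by the parent, and the content process lines (frames 9 to 12) as blocked on sync IPC. Register lines (`eip = ...`) and the "Browser exited" line are ignored because they do not match the frame pattern.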
Flags: needinfo?(ted)
Comment 8•9 years ago
awesome, thanks ted!
Updated•9 years ago
Component: Talos → DOM: Content Processes
Product: Testing → Core
Comments 9-12 hidden (Intermittent Failures Robot)
Comment 13•8 years ago
Using this bug to track this. We have a unique bug for each talos test; this means that we are seeing about 120 failures/week with this :( Luckily this is 95% win7 e10s, and some win8, all on trunk. We should investigate why the content process is hanging.
Flags: needinfo?(jmaher)
Comment 14•8 years ago
ok, my plan to look at recent failures and find a percentage of failures while collecting info on where we are forcing this crash (timeout, etc.) seems silly now that none of these errors have happened since Jan 28th. I think I need to wait this out and see what comes up in the weekly reports; this might have been fixed by something else.
Flags: needinfo?(jmaher)
Comment 15•8 years ago
this bug hasn't been seen in 5+ weeks
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME