Closed Bug 1220430 Opened 9 years ago Closed 8 years ago

Intermittent PROCESS-CRASH | tsvgx | application crashed [@ google_breakpad::ExceptionHandler::WriteMinidump(std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > const &,bool (*)(wchar_t const *,wchar_t const *,void *,_EXCEPTION_P

Categories: Core :: DOM: Content Processes (defect)
Priority: Not set
Severity: normal
Status: RESOLVED WORKSFORME
Tracking Status: firefox45 --- affected
Reporter: philor (Unassigned)
Keywords: intermittent-failure

https://treeherder.mozilla.org/logviewer.html#?job_id=16627810&repo=mozilla-inbound

 PROCESS-CRASH | tsvgx | application crashed [@ google_breakpad::ExceptionHandler::WriteMinidump(std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > const &,bool (*)(wchar_t const *,wchar_t const *,void *,_EXCEPTION_POINTERS *,MDRawAssertionInfo *,bool),void *)] 
 PROCESS-CRASH | tsvgx | application crashed [@ KiFastSystemCallRet + 0x0]
Blocks: 1220489
Blocks: 1220933
Blocks: 1220934
Interesting set of bugs. Looking at this in slightly more detail, I see that we complete the test successfully, and it appears that we make it past the code which parses and stores the results, but we then fail on the check for crashes. In fact the browser exits with error code 0, so what we are seeing is minidumps left sitting around.
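
For context, that check essentially amounts to scanning the minidump directory after the run: if any .dmp files are left behind, the harness reports PROCESS-CRASH even though the browser exited with code 0. Below is a minimal sketch of the idea; the real check lives in the Python harness, and the directory name, test name, and output format here are only illustrative:

    // Sketch of the post-run crash check: the browser exited cleanly, but any
    // minidumps left in the dump directory are still reported as PROCESS-CRASH.
    // Paths and output format are illustrative only.
    #include <filesystem>
    #include <iostream>

    namespace fs = std::filesystem;

    // Returns true if leftover .dmp files were found (i.e. the run should fail).
    bool CheckForLeftoverMinidumps(const fs::path& dumpDir, const char* testName) {
      bool foundCrash = false;
      if (!fs::exists(dumpDir)) {
        return false;
      }
      for (const auto& entry : fs::directory_iterator(dumpDir)) {
        if (entry.path().extension() == ".dmp") {
          // The real harness symbolicates the dump to get the top stack frame;
          // here we only report the file name.
          std::cout << "PROCESS-CRASH | " << testName << " | application crashed ["
                    << entry.path().filename().string() << "]\n";
          foundCrash = true;
        }
      }
      return foundCrash;
    }

    int main() {
      int browserExitCode = 0;  // the browser itself exited cleanly
      bool crashed = CheckForLeftoverMinidumps("minidumps", "tsvgx");
      // The run fails if minidumps were found, even though the exit code was 0.
      return (browserExitCode != 0 || crashed) ? 1 : 0;
    }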

The question I have is: should we take action on these, or ignore them? Talos proper works just fine. If we do surface these errors and file bugs for them, we should have somebody looking at them. I would vote for not checking for crashes when the run itself succeeds, but I know others might have reasons to care about the errors.

As a note, the related bugs here all seem to follow the same pattern.

:wlach, what are your thoughts on this?
Flags: needinfo?(wlachance)
Interesting question: is Talos meant to report only performance changes? It seems like the browser crashed here, which is probably worth investigating, but maybe not under the Talos bug category?
yeah, talos is designed to measure performance and it is possible to do that even with these *crashes*.  If we could find somebody who can make these crashes actionable, that would be great, otherwise we are missing data and creating more noise for the sheriffs.
Firefox crashing during normal operation is super serious, not something we can ignore. We should get someone to look into these problems when they happen. I'm not sure who that should be, but we should find out IMO.
Flags: needinfo?(wlachance)
(In reply to Joel Maher (:jmaher) from comment #3)
> yeah, talos is designed to measure performance and it is possible to do that
> even with these *crashes*.  If we could find somebody who can make these
> crashes actionable, that would be great, otherwise we are missing data and
> creating more noise for the sheriffs.

There is a stack trace of the crash; that will probably help in investigating the issue, even if it is not easily reproducible.
Ted, given the dump from comment 0, where do we look for the failure? It shows Breakpad at the top of the stack.
Flags: needinfo?(ted)
If you look down to frame 4 you'll see:
 17:03:56 INFO - 4 xul.dll!mozilla::dom::ContentParent::ForceKillTimerCallback(nsITimer *,void *) [ContentParent.cpp:fe6809fd4d43 : 3543 + 0xd] 

This is the chrome process detecting that the content process is not responding, writing a pair of minidumps for itself+the content process and then killing the content process.
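
To illustrate the mechanism, here is a rough sketch of that parent-side watchdog pattern: arm a timer when the content process is asked to shut down, and if it never responds, write paired minidumps and hard-kill it. This is not Gecko's actual ContentParent code; the types and helper names (ContentProcessHandle, WritePairedMinidumps, KillChildProcess) are hypothetical stand-ins:

    // Simplified sketch of the force-kill watchdog described above, not the
    // real ContentParent/ForceKillTimerCallback code. The helpers below are
    // hypothetical stand-ins for Breakpad and OS process-control calls.
    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    struct ContentProcessHandle {
      int pid = 0;  // hypothetical identifier for the child process
    };

    void WritePairedMinidumps(const ContentProcessHandle&) { /* dump parent + child */ }
    void KillChildProcess(const ContentProcessHandle&)     { /* hard-kill the child */ }

    class ForceKillWatchdog {
     public:
      // Arm the watchdog when the parent asks the child to shut down.
      void Arm(ContentProcessHandle child, std::chrono::seconds timeout) {
        mWorker = std::thread([this, child, timeout] {
          std::unique_lock<std::mutex> lock(mMutex);
          // If the child never acknowledges shutdown, the wait times out and we
          // capture minidumps for diagnosis, then kill the unresponsive child.
          if (!mCv.wait_for(lock, timeout, [this] { return mChildExited; })) {
            WritePairedMinidumps(child);
            KillChildProcess(child);
          }
        });
      }

      // Called when the child shuts down normally, cancelling the force-kill.
      void ChildExited() {
        { std::lock_guard<std::mutex> lock(mMutex); mChildExited = true; }
        mCv.notify_one();
        if (mWorker.joinable()) mWorker.join();
      }

      ~ForceKillWatchdog() { ChildExited(); }  // make sure the worker is joined

     private:
      std::thread mWorker;
      std::mutex mMutex;
      std::condition_variable mCv;
      bool mChildExited = false;
    };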

Up a bit you can see:
 17:03:44 INFO - 2015-10-31 17:03:44,348 INFO : Browser exited with error code: 0 

The main process actually exited successfully after doing this.

If you look down to the next PROCESS-CRASH line you can see the stack for the content process. Some relevant lines are:
17:04:02     INFO -   9  xul.dll!mozilla::ipc::MessageChannel::WaitForSyncNotify(bool) [WindowsMessageLoop.cpp:fe6809fd4d43 : 1080 + 0x5]
17:04:02     INFO -      eip = 0x646bff7c   esp = 0x002ded20   ebp = 0x002ded68
17:04:02     INFO -      Found by: call frame info
17:04:02     INFO -  10  xul.dll!mozilla::ipc::MessageChannel::Send(IPC::Message *,IPC::Message *) [MessageChannel.cpp:fe6809fd4d43 : 946 + 0xa]
17:04:02     INFO -      eip = 0x646c7bb0   esp = 0x002ded70   ebp = 0x002dedc0
17:04:02     INFO -      Found by: call frame info
17:04:02     INFO -  11  xul.dll!mozilla::dom::PBrowserChild::SendGetInputContext(int *,int *,int *) [PBrowserChild.cpp:fe6809fd4d43 : 963 + 0x10]
17:04:02     INFO -      eip = 0x647a4dce   esp = 0x002dedc8   ebp = 0x002dee0c
17:04:02     INFO -      Found by: call frame info
17:04:02     INFO -  12  xul.dll!mozilla::widget::PuppetWidget::GetInputContext() [PuppetWidget.cpp:fe6809fd4d43 : 680 + 0x14]
17:04:02     INFO -      eip = 0x65309dad   esp = 0x002dee14   ebp = 0x002dee28
17:04:02     INFO -      Found by: call frame info

It's hung up doing a synchronous IPC call in some IME code.
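
To make the hang concrete, here is a minimal sketch of a blocking synchronous IPC send like the one in the stack above (GetInputContext -> SendGetInputContext -> Send -> WaitForSyncNotify): the child's main thread waits for the parent's reply, and if that reply never arrives the child is stuck until the parent's force-kill timer fires. The class and method names below are illustrative, not the real MessageChannel API:

    // Minimal sketch of a blocking synchronous IPC round trip. Names are
    // illustrative; the real implementation is mozilla::ipc::MessageChannel.
    #include <condition_variable>
    #include <mutex>
    #include <optional>
    #include <string>

    struct Message { std::string payload; };

    class SyncChannel {
     public:
      // Child side: send a request and block until a reply arrives. If the
      // parent never answers, this waits forever and the child's main thread
      // is hung (the WaitForSyncNotify analogue in the stack above).
      Message Send(const Message& request) {
        std::unique_lock<std::mutex> lock(mMutex);
        mPending = request;  // conceptually handed to the parent process
        mCv.wait(lock, [this] { return mReply.has_value(); });
        Message reply = *mReply;
        mReply.reset();
        return reply;
      }

      // Parent side (another thread/process in reality): deliver the reply,
      // waking up the blocked sender.
      void Reply(const Message& reply) {
        { std::lock_guard<std::mutex> lock(mMutex); mReply = reply; }
        mCv.notify_one();
      }

     private:
      std::mutex mMutex;
      std::condition_variable mCv;
      std::optional<Message> mPending;
      std::optional<Message> mReply;
    };

    // The IME query from the stack is one such synchronous round trip: the
    // widget code asks the parent for the current input context and cannot
    // make progress until the answer comes back.
    Message SendGetInputContext(SyncChannel& channel) {
      return channel.Send({"GetInputContext"});
    }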
Flags: needinfo?(ted)
awesome, thanks ted!
Blocks: 1227716
Blocks: 1228035
Component: Talos → DOM: Content Processes
Product: Testing → Core
Blocks: 1234429
Using this bug to track the issue. We have a unique bug for each Talos test, which means we are seeing about 120 failures/week across them :(

Luckily this is about 95% Win7 e10s, with some Win8, all on trunk. We should investigate why the content process is hanging.
Flags: needinfo?(jmaher)
OK, my plan to look at recent failures, work out a failure rate, and collect info on where we are forcing this crash (timeout, etc.) seems silly now that none of these errors have happened since Jan 28th. I think I need to wait this out and see what comes up in the weekly reports; this might have been fixed by something else.
Flags: needinfo?(jmaher)
This failure hasn't been seen in 5+ weeks.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME