Win64 TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\toolkit\crashreporter\test\unit\test_crash_AsyncShutdown.js | test failed (with xpcshell return code: 0),

RESOLVED FIXED in mozilla35

Status

Toolkit
General
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: dmajor (offline), Unassigned)

Tracking

(Blocks: 1 bug)

unspecified
mozilla35
x86_64
Windows 7
Bug Flags:
qe-verify -

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=41403121&tree=Date

Not sure what the actual error is.

Comment 1

4 years ago
It's this line:

19:14:53     INFO -  ERROR: AsyncShutdown timeout in profile-before-change Conditions: [{"name":"OS.File: flush I/O queued before profile-before-change","state":{"launched":true,"shutdown":false,"worker":true,"pendingReset":false,"latestSent":["Mon Jun 09 2014 19:14:53 GMT-0700 (Pacific Standard Time)","getCurrentDirectory"],"latestReceived":["Mon Jun 09 2014 19:14:53 GMT-0700 (Pacific Standard Time)",{"ok":{"string":"C:\\\\slave\\\\test\\\\build\\\\tests\\\\xpcshell\\\\tests\\\\toolkit\\\\crashreporter\\\\test\\\\unit"},"id":1,"durationMs":null,"timeStamps":{"entered":1402366491220,"loaded":1402366491236}}],"messagesSent":0,"messagesReceived":1,"messagesQueued":1,"DEBUG":false},"filename":"resource://gre/modules/osfile/osfile_async_front.jsm","lineNumber":1515}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources.

Comment 2

4 years ago
Oh wait, I'm wrong. This test is checking to make sure that the crash does happen properly. So:

19:14:53  WARNING -  TEST-UNEXPECTED-FAIL | C:/slave/test/build/tests/xpcshell/tests/toolkit/crashreporter/test/unit/head_crashreporter.js | No minidump found! - See following stack:

I expect this is something to do with crash reporting in win64.
(Reporter)

Updated

4 years ago
Blocks: 1033110
(Reporter)

Comment 3

4 years ago
test_crash_AsyncShutdown.js:

function run_test() {
  do_crash(setup_crash, after_crash);
  do_crash(setup_osfile_crash_noerror, after_osfile_crash_noerror);
  do_crash(setup_osfile_crash_exn, after_osfile_crash_exn);
}

The first do_crash test is fine. The second and third fail. The NS_DebugBreak never hits Breakpad's ExceptionHandler::HandleException. Instead I just get my default postmortem debugger.
(Reporter)

Comment 4

4 years ago
It passes with --disable-ion. I think this is something to do with JIT frames and stack walking.

I can see ntdll!RtlDispatchException calling ntdll!RtlLookupFunctionEntry in a loop, looking up each frame from the stack that produced the "int 3". After the frame for mozjs!js::jit::DoCallFallback, there is a RWX address that I assume is JIT code, and after that, the addresses go off into nowhere. The walk never finds its way back to "regular" code. I guess that stops it from finding the UnhandledExceptionFilter?

Comment 5

4 years ago
Oh, that's bad! Maybe our x86-64 JIT doesn't set up the ABI stackwalk properly?
Flags: needinfo?(jdemooij)
(Reporter)

Comment 6

4 years ago
http://msdn.microsoft.com/en-us/library/ft9x1kdx.aspx:

> For dynamically generated functions [JIT compilers], the runtime to support these functions must 
> either use RtlInstallFunctionTableCallback or RtlAddFunctionTable to provide this information to 
> the operating system. Failure to do so will result in unreliable exception handling and debugging
> of processes.

I don't see either of those in DXR.
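To make the quoted requirement concrete, here is a minimal toy model of a function-table lookup during a stack walk. It does not call RtlAddFunctionTable or RtlInstallFunctionTableCallback; `FunctionTable`, `walkStack`, the frame names, and all addresses are hypothetical and exist only for illustration. The real x64 unwinder is documented to treat an address with no entry as a leaf function and keep going, which would fit the "addresses go off into nowhere" observation in comment 4; the sketch simplifies that to stopping at the first miss.

```javascript
// Toy model only: not the Windows API. Every name and address is made up.
class FunctionTable {
  constructor() {
    this.ranges = []; // entries of the form { begin, end, name }
  }

  // Stands in for unwind data shipped with a loaded image, or for a runtime
  // registering a region of dynamically generated code.
  add(begin, end, name) {
    this.ranges.push({ begin, end, name });
  }

  // Stands in for the per-frame lookup performed during exception dispatch.
  lookup(address) {
    return this.ranges.find(r => address >= r.begin && address < r.end) || null;
  }
}

// Walk return addresses from the innermost frame outward and stop at the
// first address that has no function-table entry.
function walkStack(table, returnAddresses) {
  const walked = [];
  for (const addr of returnAddresses) {
    const entry = table.lookup(addr);
    if (!entry) {
      return { walked, complete: false, failedAt: addr };
    }
    walked.push(entry.name);
  }
  return { walked, complete: true, failedAt: null };
}

const table = new FunctionTable();
table.add(0x1000, 0x2000, "compiled code A");
table.add(0x2000, 0x3000, "compiled code B");

// 0x9000 plays the role of JIT code that nobody registered.
const stack = [0x1010, 0x2020, 0x9000, 0x1800];
console.log(walkStack(table, stack).complete); // false

// Once the runtime registers the region, the same walk completes.
table.add(0x9000, 0xa000, "jit region");
console.log(walkStack(table, stack).complete); // true
```

The point of the sketch is only that a single unregistered region in the middle of the stack is enough to keep the walk from ever reaching the frames beyond it.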

Comment 7

4 years ago
(In reply to David Major [:dmajor] from comment #4)
> It passes with --disable-ion. I think this is something to do with JIT
> frames and stack walking.

That's a configure flag, right? This will also disable Baseline. Judging from DoCallFallback you mentioned this is probably Baseline code.

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #5)
> Oh, that's bad! Maybe our x86-64 JIT doesn't set up the ABI stackwalk
> properly?

Unfortunately, stack walking for JIT code is unreliable, because Ion code can allocate the frame pointer register like any other register... Baseline code should maintain the frame pointer, but I'm not sure this works 100% of the time. There are also some differences between x64 and Win64; I'll take a look at that.

Do we know what mechanism they use?

Comment 8

4 years ago
"they" in this case is the official x86-64 ABI for Windows and the link dmajor provided.

Comment 9

4 years ago
Win64 requires that all dynamically generated code register unwind info with the runtime for SEH to work, and we totally don't do that (it'd be a significant undertaking). This is only a problem because Breakpad uses the unhandled exception filter, which runs after SEH (and never gets called if SEH fails). Fortunately, we can simply switch to a "vectored" exception handler, which runs before SEH and doesn't depend on any stack walking. I filed bug 844196 on this a while ago. An important corollary, though, is that we can't use SEH anywhere in FF (well, on any threads that can interleave JIT code).
Flags: needinfo?(jdemooij)
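
The ordering described in comment 9 can be sketched as a small toy model. This is not Breakpad or Windows code: `dispatchException`, the frame list, and the handler objects are hypothetical, and the model only encodes the claim from the comment that vectored handlers run before the frame-based (SEH) walk, while the unhandled exception filter is reached only if that walk completes.

```javascript
// Toy model of the dispatch order described in comment 9. All names are
// hypothetical; nothing here calls a real OS or crash reporter API.
function dispatchException(frames, handlers) {
  const log = [];

  // 1. Vectored handlers run first and do not depend on a stack walk.
  for (const h of handlers.vectored) {
    log.push(`vectored:${h.name}`);
    if (h.handles) return { handledBy: h.name, log };
  }

  // 2. Frame-based handling walks the stack; every frame needs unwind info.
  for (const f of frames) {
    if (!f.hasUnwindInfo) {
      log.push(`walk failed at ${f.name}`);
      return { handledBy: null, log }; // the filter below is never reached
    }
    log.push(`walked:${f.name}`);
    if (f.hasHandler) return { handledBy: f.name, log };
  }

  // 3. Only a completed walk with no taker reaches the unhandled filter.
  if (handlers.unhandledFilter) {
    log.push("unhandled filter");
    return { handledBy: handlers.unhandledFilter, log };
  }
  return { handledBy: null, log };
}

// A stack shaped like the one in comment 4: compiled frames, then JIT code
// with no registered unwind info, then more compiled frames.
const frames = [
  { name: "NS_DebugBreak", hasUnwindInfo: true },
  { name: "DoCallFallback", hasUnwindInfo: true },
  { name: "JIT code", hasUnwindInfo: false },
  { name: "main", hasUnwindInfo: true },
];

const viaFilter = dispatchException(frames, {
  vectored: [],
  unhandledFilter: "crash reporter",
});
const viaVectored = dispatchException(frames, {
  vectored: [{ name: "crash reporter", handles: true }],
  unhandledFilter: null,
});

console.log(viaFilter.handledBy);   // null
console.log(viaVectored.handledBy); // crash reporter
```

In the model, moving the crash reporter from the unhandled filter slot to the vectored slot makes the result independent of whether the JIT frames carry unwind info, which is the trade described in the comment.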
(Reporter)

Updated

4 years ago
Depends on: 844196

Updated

3 years ago
Blocks: 880004

Updated

3 years ago
Blocks: 886640

Comment 10

3 years ago
Is this working post-bug 844196?
(Reporter)

Comment 11

3 years ago
Yep! Was just waiting for a Date run. https://treeherder.mozilla.org/ui/#/jobs?repo=date&revision=335bc10c5ecb
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Reporter)

Updated

3 years ago
Target Milestone: --- → mozilla35
Flags: qe-verify-