Closed Bug 522829 Opened 10 years ago Closed 10 years ago

crash stacks on Windows unittests broken (MINIDUMP_STACKWALK path translation problem) (WindowsError: [Error 2] The system cannot find the path specified)

Categories

(Release Engineering :: General, defect, critical)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ddahl, Assigned: bhearsum)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

*** registerContentHandler(text/html,http://localhost:8888/%s,Foo handler)
TEST-UNEXPECTED-FAIL | automation.py | Exited with code -1073741819 during test run
INFO | automation.py | Application ran for: 0:05:23.547000
TEST-UNEXPECTED-FAIL | automation.py | application crashed (minidump found)
Traceback (most recent call last):
  File "mochitest/runtests.py", line 580, in <module>
  File "mochitest/runtests.py", line 478, in main
  File "e:\builds\moz2_slave\mozilla-central-win32-unittest-mochitests\build\mochitest\automation.py", line 480, in runApp
    if checkForCrashes(os.path.join(profileDir, "minidumps"), symbolsPath):
  File "e:\builds\moz2_slave\mozilla-central-win32-unittest-mochitests\build\mochitest\automationutils.py", line 80, in checkForCrashes
    subprocess.call([stackwalkPath, d, symbolsPath], stderr=nullfd)
  File "d:\mozilla-build\python25\lib\subprocess.py", line 443, in call
    return Popen(*popenargs, **kwargs).wait()
  File "d:\mozilla-build\python25\lib\subprocess.py", line 593, in __init__
    errread, errwrite)
  File "d:\mozilla-build\python25\lib\subprocess.py", line 793, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the path specified

command timed out: 1200 seconds without output
program finished with exit code 1
elapsedTime=1531.172000


http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1255733097.1255734835.5391.gz&fulltext=1#err2

WINNT 5.2 mozilla-central test mochitests  [testfailed] Started 15:44, finished 16:14
Whiteboard: [orange]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256053889.1256055563.14931.gz
WINNT 5.2 mozilla-central test mochitests on 2009/10/20 08:51:29
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256048602.1256050220.19341.gz
WINNT 5.2 mozilla-central test opt mochitests on 2009/10/20 07:23:22
Blocks: 438871
There are two bugs here:
1) Something is crashing mochitest and producing a minidump
2) The harness code is failing to execute minidump_stackwalk to print a stack. It looks like perhaps the minidump_stackwalk binary doesn't exist. I pushed an extra print statement to verify if this is the problem:
http://hg.mozilla.org/mozilla-central/rev/ba9c06d83b3e
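
For illustration, a guard of the kind described in point 2 might look like the minimal sketch below. This is not the code from the changeset linked above: `run_stackwalk` and its parameter names are hypothetical, and only the `subprocess.call` invocation mirrors the traceback quoted in this bug.

```python
import os
import subprocess


def run_stackwalk(stackwalk_path, dump_path, symbols_path):
    """Run minidump_stackwalk on one dump; return True if it was launched.

    Hypothetical helper for illustration, not the real automationutils.py code.
    """
    if not stackwalk_path or not os.path.exists(stackwalk_path):
        # Report the bad path instead of letting subprocess raise an OS error.
        print("MINIDUMP_STACKWALK binary not found: %s" % stackwalk_path)
        return False
    with open(os.devnull, "w") as nullfd:
        # Same argument order as the call shown in the traceback.
        subprocess.call([stackwalk_path, dump_path, symbols_path], stderr=nullfd)
    return True
```

Printing the path that failed the existence check makes it visible in the log whether the harness was handed a path the Python process cannot resolve.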

Also, why is this filed in Core: General? It's either a harness problem (and belongs in Testing:Mochitest) or a problem with the build slave/environment (and belongs in mozilla.org:Release Engineering). Nobody looks at bugs in Core: General, so I had no idea this bug existed.
Component: General → Mochitest
Product: Core → Testing
QA Contact: general → mochitest
And this got hit again
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256087154.1256087561.13988.gz#err0
WINNT 5.2 mozilla-central test opt mochitests on 2009/10/20 18:05:54

I went to moz2-win32-slave02 and 
$ ls -l /e/builds/moz2_slave/mozilla-central-win32-opt-unittest-mochitests/tools/breakpad/win32/minidump_stackwalk.exe
worked just fine at the msys prompt.
But! There's something interesting here. 

TEST-UNEXPECTED-FAIL | automation.py | application crashed (minidump found)
SUCCESS: The process with PID 3608 has been terminated.
MINIDUMP_STACKWALK binary not found: /e/builds/moz2_slave/mozilla-central-win32-opt-unittest-mochitests/tools/breakpad/win32/minidump_stackwalk.exe

That printed out *an MSYS path*, but Python is a normal Windows program. MSYS is supposed to do path translation on command-line arguments as well as environment variables. Clearly something is going wrong here, since we're getting an MSYS path inside Python, which doesn't know how to handle it.
And, as usual at this time of the morning, after writing a comment I suspect I see the problem. There's no MSYS involved here. With the packaged unit tests, we simply have Buildbot invoking Python directly, one Windows program to another. Passing MSYS paths in that situation is broken.

Can we either make the toolsdir a Windows-style path here:
http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py#2971
(pwd -W should produce a path that looks like c:/foo/bar, which should work with MSYS or Windows programs)
or run the packaged unittest steps via MSYS bash (gross, but would solve the problem).
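
To make the path difference concrete, here is a rough Python sketch of the translation that MSYS would normally perform and that a native Windows Python never sees. The helper name `msys_to_windows_path` is hypothetical and is not part of buildbotcustom or the test harness. It only handles the simple `/<drive letter>/...` form seen in this bug, and it would mistranslate a genuine POSIX path such as `/a/b`, so it only makes sense on Windows.

```python
import re


def msys_to_windows_path(path):
    """Turn an MSYS-style path like /e/foo into e:/foo; leave others alone.

    Hypothetical helper for illustration only.
    """
    match = re.match(r"^/([A-Za-z])(/.*)?$", path)
    if match is None:
        # Not of the form /<drive letter>/...; return it unchanged.
        return path
    drive = match.group(1)
    rest = match.group(2) or "/"
    return "%s:%s" % (drive, rest)


msys_path = "/e/builds/moz2_slave/tools/breakpad/win32/minidump_stackwalk.exe"
print(msys_to_windows_path(msys_path))
```

The forward-slash form with a drive letter is the same shape that `pwd -W` is described as producing above, which is why it is usable from both MSYS and native Windows programs.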
Component: Mochitest → Release Engineering
Product: Testing → mozilla.org
QA Contact: mochitest → release
Version: Trunk → other
http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py#4184
for the packaged unit test case. UnittestBuildFactory's days are numbered.
Severity: normal → critical
Summary: build/automation problem: WindowsError: [Error 2] The system cannot find the path specified → crash stacks on Windows unittests broken (MINIDUMP_STACKWALK path translation problem) (WindowsError: [Error 2] The system cannot find the path specified)
I manually ran some packaged mochitests with MINIDUMP_STACKWALK set to e:/... instead of /e/.... After forcing a crash with Ted's crashinject app, I got this:
http://people.mozilla.org/~bhearsum/misc/mochitest.log

Ted says this looks good, so this patch should fix us up for all of the packaged build cases.
Attachment #407540 - Flags: review?(ted.mielczarek)
Comment on attachment 407540 [details] [diff] [review]
use pwd -W on windows to get a useful path

Great, thanks!
Attachment #407540 - Flags: review?(ted.mielczarek) → review+
Attachment #407540 - Flags: checked-in+
Comment on attachment 407540 [details] [diff] [review]
use pwd -W on windows to get a useful path

changeset:   449:8e3e2a4208d4
I updated the masters with this patch and kicked off a few opt packaged mochitest runs. Hopefully (ha) one will crash.
Duplicate of this bug: 523645
No point in keeping it open if a fix is in.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee: nobody → bhearsum
Whiteboard: [orange]
Product: mozilla.org → Release Engineering