Closed Bug 826135 Opened 9 years ago Closed 8 years ago

Throw error if --xre-path is invalid

Categories

(Testing :: Mochitest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla21

People

(Reporter: jgriffin, Assigned: jgriffin)

References

Details

Attachments

(2 files, 2 obsolete files)

If you specify an invalid --xre-path while running B2G mochitests, the harness will use the version produced by the build.  For B2G, that's an arm executable which won't run on the desktop, and the result will be some inscrutable failure along the lines of:

Traceback (most recent call last):
  File "runtestsb2g.py", line 527, in main
    retVal = mochitest.runTests(options)
  File "/files/mozilla/b2g/take2/objdir-gecko/_tests/testing/mochitest/runtests.py", line 693, in runTests
    self.startWebServer(options)
  File "runtestsb2g.py", line 325, in startWebServer
    self.server.start()
  File "/files/mozilla/b2g/take2/objdir-gecko/_tests/testing/mochitest/runtests.py", line 401, in start
    self._process = self._automation.Process([xpcshell] + args, env = env)
  File "/files/mozilla/b2g/take2/objdir-gecko/_tests/testing/mochitest/automation.py", line 179, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 672, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1202, in _execute_child
    raise child_exception
OSError: [Errno 8] Exec format error

This has come up several times when developers are first running mochitests.

When --xre-path is invalid, we abort with an error, rather than falling back to the default.
The same problem happens with reftest
we have a need for this in android automation.  When developers are doing 'make mochitest-remote' it is very confusing for them if they don't have the proper xre-path setup.
I'll submit this to try as well.
Attachment #697740 - Flags: review?(ahalberstadt)
Assignee: nobody → jgriffin
Comment on attachment 697740 [details] [diff] [review]
Error out when invalid --xre-path is passed,

Review of attachment 697740 [details] [diff] [review]:
-----------------------------------------------------------------

I'm a little confused by what this is trying to accomplish. Yes it's good to have these checks in here, but wasn't the problem that devs didn't know they had to specify a desktop --xre-path? With this patch it still looks like we'd have this confusion.
I agree with ahal here, this is putting code in the right places, but we are not executing the xpcshell binary and verifying it is for the host os.

if I run xpcshell for arm on my linux os:
jmaher@jmaher-MacBookPro:~/mozilla/src$ ./objdir-droid/dist/bin/xpcshell --help
/system/bin/linker: No such file or directory

if I run it from my desktop build:
export LD_LIBRARY_PATH=/home/jmaher/mozilla/inbound/obj-x86_64-unknown-linux-gnu/dist/bin
jmaher@jmaher-MacBookPro:~/mozilla/inbound$ ./obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell --help
JavaScript-C 1.8.5+ 2011-04-16
usage: xpcshell [-g gredir] [-a appdir] [-r manifest]... [-PsSwWxCijmIn] [-v version] [-f scriptfile] [-e script] [scriptfile] [scriptarg...]
jmaher@jmaher-MacBookPro:~/mozilla/inbound$ 


we should be able to detect the stderr or the valid stdout and report success/failure
If we really want a sanity check here we can run file(1) on the binary and ensure that it's the type we expect (for remote, it should always be an ELF ARM binary).

$ file -b ../fennec-android-mozilla-central/dist/bin/xpcshell 
ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), not stripped

$ file -b ../opt-mozilla-central/dist/bin/xpcshell 
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x4910fb75eca1ac6592abedc2820cb0b7e9e3594c, not stripped
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #7)
> If we really want a sanity check here we can run file(1) on the binary and
> ensure that it's the type we expect (for remote, it should always be an ELF
> ARM binary).

Assuming the slaves all have libmagic installed
The problem this was attempting to fix is that very often, developers will pass an --xre-path which is really the (desktop) xpcshell binary, not the directory containing that binary.  When that happens currently, we can't find xpcshell (--xre-path is always treated as a dir) so we fall back to the default one, which breaks everything.

E.g., --xre-path ~/xulrunner-sdk/bin/xpcshell will cause this problem, and I've had many complaints from developers who have encountered this and didn't understand what was happening.

This patch fixes that particular issue.

But I agree that adding an extra check that we can actually invoke xpcshell, and give an appropriate error if we can't, would be good too.  I'll amend the patch with that.
Here you go:

def arm_elf(filename):
    data = open(filename, 'rb').read(20)
    return data[:4] == "\x7fELF" and ord(data[18]) == 40 # EM_ARM
Try run for a0d7534b101d is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a0d7534b101d
Results (out of 51 total builds):
    success: 49
    warnings: 1
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jgriffin@mozilla.com-a0d7534b101d
Comment on attachment 697740 [details] [diff] [review]
Error out when invalid --xre-path is passed,

Clearing review as jgriffin said he was amending the patch. Jgriffin, if you want to land this for now and file a new bug about having the proper binary, consider my review an r+.
Attachment #697740 - Flags: review?(ahalberstadt)
This adds Ted's arm_elf method.  Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=b22ad4b336f5
Attachment #703700 - Flags: review?(ahalberstadt)
Comment on attachment 703700 [details] [diff] [review]
Error out when invalid --xre-path is passed,

Review of attachment 703700 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good

::: layout/tools/reftest/runreftestb2g.py
@@ +173,5 @@
> +            self.error('xpcshell not found at %s' % xpcshell)
> +        if self._automation.elf_arm(xpcshell):
> +            self.error('--xre-path points to an ARM version of xpcshell; it '
> +                       'should instead point to a version that can run on '
> +                       'your desktop')

Does elf_arm work on windows? If so this should probably live in verifyCommonOptions seeing as all three need a desktop binary
Attachment #703700 - Flags: review?(ahalberstadt) → review+
elf_arm will accurately detect that a binary is an ARM ELF executable, such as the target xpcshell that gets built for Android. That should work fine on Windows.
Duplicate of this bug: 822479
Updated per review comments.  Pushed to try again to check just the Windows run:  https://tbpl.mozilla.org/?tree=Try&rev=afc8c0bfc642
Attachment #703700 - Attachment is obsolete: true
Attachment #697740 - Attachment is obsolete: true
Comment on attachment 703961 [details] [diff] [review]
Error out when invalid --xre-path is passed,

Carry r+ forward
Attachment #703961 - Flags: review+
Try run failed with:

Traceback (most recent call last):
  File "reftest/runreftest.py", line 320, in <module>
    main()
  File "reftest/runreftest.py", line 301, in main
    options = parser.verifyCommonOptions(options, reftest)
  File "reftest/runreftest.py", line 279, in verifyCommonOptions
    xpcshell = os.path.join(options.xrePath, 'xpcshell')
  File "C:\mozilla-build\buildbotve\lib\ntpath.py", line 90, in join
    assert len(path) > 0
TypeError: object of type 'NoneType' has no len()

Try again: https://tbpl.mozilla.org/?tree=Try&rev=b2ea5e48107d
Blah:

Traceback (most recent call last):
  File "reftest/runreftest.py", line 320, in <module>
    main()
  File "reftest/runreftest.py", line 311, in main
    options = parser.verifyCommonOptions(options, reftest)
  File "reftest/runreftest.py", line 280, in verifyCommonOptions
    if self._automation._IS_WIN32:
AttributeError: 'Automation' object has no attribute '_IS_WIN32'

and again:  https://tbpl.mozilla.org/?tree=Try&rev=85ea227a489a

(I'm really just trying to increase my score at http://people.mozilla.org/~catlee/highscores/highscores.html)
https://hg.mozilla.org/mozilla-central/rev/20df9d2e6138
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
This change seems to cause problems when running mochitest-remote locally:

    mozdev@mozdev-virtual-machine:/media/extra/objdir-native-droid$ export DM_TRANS=sut
    mozdev@mozdev-virtual-machine:/media/extra/objdir-native-droid$ export TEST_DEVICE=192.168.0.131
    mozdev@mozdev-virtual-machine:/media/extra/objdir-native-droid$ export TEST_PATH=content/smil/test
    mozdev@mozdev-virtual-machine:/media/extra/objdir-native-droid$ make mochitest-remote
    Usage: Usage instructions for runtests.py.
    All arguments are optional.
    If --chrome is specified, chrome tests will be run instead of web content tests.
    If --browser-chrome is specified, browser-chrome tests will be run instead of web content tests.
    See <http://mochikit.com/doc/html/MochiKit/Logging.html> for details on the logging levels.
     
    runtestsremote.py: error: --utility-path '/mnt/sdcard/tests/fennec/bin' is not a directory
    make: *** [mochitest-remote] Error 2
    mozdev@mozdev-virtual-machine:/media/extra/objdir-native-droid$ 


I debugged a little and found utilityPath set to /mnt/sdcard/tests/fennec/bin here: https://hg.mozilla.org/mozilla-central/file/19f630648c80/testing/mochitest/runtestsremote.py#l114 ... clearly a remote path (that doesn't exist) but then we check for it on the local machine later on....
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I don't think that's related to this patch. The check is getting tripped up on --utility-path while this patch only affects --xre-path.

My guess is that the make target has been broken for a long time and no one has noticed until now. We should probably just get rid of the make target and work on hooking remote mochitests into mach instead.
Does this patch fix the problem?
Attachment #706514 - Flags: review?(gbrown)
Attachment #703961 - Attachment is obsolete: true
Attachment #703961 - Attachment is obsolete: false
ahal, FWIW, this worked until recently for me, as we regularly run mochitests locally as part of the Android WebRTC bringup.

jgriffin this fixes it for me quite nicely.  Thanks for the quick turnaround!
(In reply to Dan Mosedale (:dmose) from comment #26)
> ahal, FWIW, this worked until recently for me, as we regularly run
> mochitests locally as part of the Android WebRTC bringup.

Yeah, looking more closely we set options.xrePath = options.utilityPath if none is passed in via command line. This is probably not the correct behaviour when running remotely, but lets just take the workaround for now as this is likely going to be refactored this quarter anyway.
Comment on attachment 706514 [details] [diff] [review]
Fix --utility-path error on Fennec,

Works for me too - thanks!
Attachment #706514 - Flags: review?(gbrown) → review+
https://hg.mozilla.org/mozilla-central/rev/f9800b4da607
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Duplicate of this bug: 835778
You need to log in before you can comment on or make changes to this bug.