OSX 10.6 xpcshell perma-fail due to "TypeError: active_tests() keywords must be strings"

RESOLVED FIXED

Status

defect
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: RyanVM, Assigned: zwol)

Tracking

({intermittent-failure})

Trunk
x86
Windows 7
Points:
---

Firefox Tracking Flags

(firefox28 unaffected, firefox29 affected, firefox30 fixed, firefox-esr24 unaffected)

Details

Attachments

(1 attachment)

Calling this bustage from bug 933885 given the runxpcshelltests.py blame and when this started. What version of python is running on these slaves?

https://tbpl.mozilla.org/php/getParsedLog.php?id=33634874&tree=Thunderbird-Trunk

TB Rev4 MacOSX Snow Leopard 10.6 comm-central debug test xpcshell on 2014-01-27 08:55:39 PST for push 5a029594d5f0
slave: talos-r4-snow-167

Found node at /builds/slave/talos-slave/test/build/xpcshell/node
Found moz-spdy at /builds/slave/talos-slave/test/build/xpcshell/moz-spdy/moz-spdy.js
Could not run moz-spdy server: [Errno 8] Exec format error
Found moz-http2 at /builds/slave/talos-slave/test/build/xpcshell/moz-http2/moz-http2.js
Could not run moz-http2 server: [Errno 8] Exec format error
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 1623, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 1619, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 1323, in runTests
    self.buildTestList()
  File "xpcshell/runxpcshelltests.py", line 799, in buildTestList
    self.alltests = mp.active_tests(**mozinfo.info)
TypeError: active_tests() keywords must be strings
Not only can I not reproduce this locally, my dev box blew up on me this morning (don't ask), and I have to concentrate on the day job for the next several days.  But in order to get more information for debugging, could someone please check in this change to runxpcshelltests.py?  Change

    self.alltests = mp.active_tests(**mozinfo.info)

to

    try:
        self.alltests = mp.active_tests(**mozinfo.info)
    except TypeError:
        sys.stderr.write("*** offending mozinfo.info: %s\n" % repr(mozinfo.info))
        raise

and wait for it to fail again.  (Given the proximate cause of the failure, I bet it's Unicode strings snuck in there somehow.)
In case it wasn't clear, this is on comm-central, btw.
(In reply to TBPL Robot from comment #23)

*** offending mozinfo.info: {u'bin_suffix': u'', u'datareporting': True, u'toolkit': u'cocoa', u'webm': True, u'buildapp': u'../mail', u'crashreporter': True, u'asan': False, u'ogg': True, u'wave': True, u'tests_enabled': True, u'appname': u'thunderbird', 'bits': 64, u'topsrcdir': u'/builds/slave/tb-c-cen-osx64-d-0000000000000/build/mozilla', 'version': 'OS X 10.6.8', u'mozconfig': u'/builds/slave/tb-c-cen-osx64-d-0000000000000/build/.mozconfig', u'debug': True, 'hasNode': False, 'os': u'mac', 'processor': u'x86_64'}
It is not apparent to me why the changes in bug 933885 would have triggered this failure, but the fundamental issue is that Python only started allowing Unicode kwargs in version 2.6.5 (see http://bugs.python.org/issue4978).  The Thunderbird/OSX10.6 build workers must have an older version of 2.6.

I am inclined to think that this should be addressed by upgrading Python on those workers.  I know MacPorts has newer Python for OSX 10.6.
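For anyone who wants to see the shape of a harness-side workaround rather than a Python upgrade, here is a minimal sketch. Both `call_with_native_keys` and the stand-in `active_tests` are hypothetical names made up for illustration; they are not part of mozinfo or manifestparser. The idea is only to coerce every key to the native `str` type before using the dict as `**kwargs`.

```python
def call_with_native_keys(func, info):
    """Call func(**info) after coercing every key to the native str type.

    Hypothetical helper for illustration; not part of mozinfo or manifestparser.
    """
    fixed = {}
    for key, value in info.items():
        if not isinstance(key, str):
            # On Python 2 this turns a unicode key into a byte string.
            # A non-ASCII key would raise UnicodeEncodeError here.
            key = key.encode("ascii")
        fixed[key] = value
    return func(**fixed)


def active_tests(**values):
    # Stand-in for the real manifest filter; it only echoes the key names.
    return sorted(values)


print(call_with_native_keys(active_tests, {u"os": u"mac", "bits": 64}))
```

On a Python new enough to accept Unicode kwargs the helper is a no-op in effect, so it should be harmless to leave in place.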
It just dawned on me why this happened.  In passing in bug 933885, I removed a place where runxpcshelltests.py was parsing JSON with eval() instead of json.loads() [attachment 8349569].  That place ... was the load of mozInfoFile.
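A small sketch of the difference, assuming a cut-down JSON string rather than a real mozinfo.json (the sample text is made up, and the exact shape of the old eval() call is not reproduced here). On Python 2, eval() of a JSON-looking literal yields byte-string keys, while json.loads() yields unicode keys; on Python 3 both report `str`, so the printed types depend on the interpreter.

```python
import json

# Made-up sample; a real mozinfo.json has many more entries.
text = '{"os": "mac", "bits": 64}'

from_eval = eval(text)        # roughly the old approach (unsafe on untrusted input)
from_json = json.loads(text)  # the approach after bug 933885

for name, parsed in (("eval", from_eval), ("json.loads", from_json)):
    key_types = sorted(set(type(k).__name__ for k in parsed))
    print("%s key types: %s" % (name, key_types))
```

The two results compare equal as dicts, which is presumably why the change looked harmless until the keys were used as kwargs on an old 2.6.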

I still don't have a functional development machine (checking out m-c causes filesystem corruption - seriously), but here is a potential patch, quite literally composed by hand in this <textarea> and untested. gps, can you comment please?

--- a/testing/xpcshell/runxpcshelltests.py
+++ b/testing/xpcshell/runxpcshelltests.py
@@ -1266,17 +1259,29 @@ class XPCShellTests(object):
         self.event = Event()
 
         # Handle filenames in mozInfo
         if not isinstance(self.mozInfo, dict):
             mozInfoFile = self.mozInfo
             if not os.path.isfile(mozInfoFile):
                 self.log.error("Error: couldn't find mozinfo.json at '%s'. Perhaps you need to use --build-info-json?" % mozInfoFile)
                 return False
-            self.mozInfo = json.loads(open(mozInfoFile).read())
+            self.mozInfo = json.load(open(mozInfoFile))
+
+        # mozinfo.info is used as kwargs. Some builds are done with
+        # an older Python that can't handle Unicode keys in kwargs.
+        # All of the keys in question should be ASCII.
+        if 'info' in self.mozInfo:
+            fixedInfo = {}
+            for k, v in self.mozInfo['info'].items():
+                if isinstance(k, unicode):
+                    k = k.encode("ascii")
+                fixedInfo[k] = v
+            self.mozInfo['info'] = fixedInfo
+
         mozinfo.update(self.mozInfo)
 
         # buildEnvironment() needs mozInfo, so we call it after mozInfo is initialized.
         self.buildEnvironment()
 
         # The appDirKey is a optional entry in either the default or individual test
         # sections that defines a relative application directory for test runs. If
         # defined we pass 'grePath/$appDirKey' for the -a parameter of the xpcshell
Flags: needinfo?(gps)
The proposed patch will work. But you should really upgrade your Python to 2.7.3+. We've officially deprecated everything less for all of automation. The fact 2.6 continues to work for some test suites is luck.
Flags: needinfo?(gps)
If by "your" python, you mean what's running on the test slaves, I completely agree. Now please convince RelEng of that :)
It's tempting to take comment 32 as an r+, but maybe let's do the official procedure dance anyway.

Re upgrading build workers to newer Python, I guess I could file a releng bug, but can't you just walk across the office, gps?
Assignee: nobody → zackw
Status: NEW → ASSIGNED
Attachment #8374113 - Flags: review?(gps)
Component: XPConnect → XPCShell Harness
Product: Core → Testing
mozinfo.json loading is performed in a number of places in automation land, and many of our slaves are still running 2.6. I suspect this code has been written before. If it hasn't, we should probably add it to mozbase rather than inside runxpcshelltests.py.

Ted or Andrew should know more.
Flags: needinfo?(ted)
Flags: needinfo?(ahalberstadt)
I don't think "many" of our slaves are running 2.6. I personally think we should be pushing to get 2.7 in the few remaining places. Let's just take the quick band-aid to get this working again.
Flags: needinfo?(ted)
It's just our rev3 minis that still have 2.6. These slaves are in the process of being deprecated. I think some tegras might also have it.
Flags: needinfo?(ahalberstadt)
Comment on attachment 8374113
964379-mozinfo-info-keys-ascii.diff

Review of attachment 8374113:
-----------------------------------------------------------------

Reluctant (but necessary) r+.

::: testing/xpcshell/runxpcshelltests.py
@@ +1311,5 @@
> +            fixedInfo = {}
> +            for k, v in self.mozInfo['info'].items():
> +                if isinstance(k, unicode):
> +                    k = k.encode('ascii')
> +                fixedInfo[k] = v

Keys will always be unicode. I'd write this as:

self.mozInfo['info'] = dict((k.encode('ascii'), v) for k, v in self.mozInfo['info'].items())
Attachment #8374113 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #53)
> 
> Keys will always be unicode.

Alas, no, some of them are not.  Here's the dict from comment 24 run through pprint.pprint:

    {u'appname': u'thunderbird',
     u'asan': False,
     u'bin_suffix': u'',
      'bits': 64,
     u'buildapp': u'../mail',
     u'crashreporter': True,
     u'datareporting': True,
     u'debug': True,
      'hasNode': False,
     u'mozconfig': u'/builds/slave/tb-c-cen-osx64-d-0000000000000/build/.mozconfig',
     u'ogg': True,
      'os': u'mac',
      'processor': u'x86_64',
     u'tests_enabled': True,
     u'toolkit': u'cocoa',
     u'topsrcdir': u'/builds/slave/tb-c-cen-osx64-d-0000000000000/build/mozilla',
      'version': 'OS X 10.6.8',
     u'wave': True,
     u'webm': True}

I doubt figuring out why is a good use of anyone's time. I've pushed the patch unmodified:

https://hg.mozilla.org/integration/mozilla-inbound/rev/11f6bc1228c7
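One plausible explanation for the mixed keys, offered as a guess and not confirmed anywhere in this bug: if the dict starts out with a few native-string keys and is then updated from parsed JSON, `dict.update` keeps the existing key object and only replaces the value whenever the new key compares equal to an old one. The starting values below are assumptions for illustration, not the real mozinfo defaults.

```python
# Assumed starting state: a few keys already present as native strings.
info = {"os": "unknown", "bits": 64}

# Assumed result of json.loads(): on Python 2 these keys are unicode.
loaded = {u"os": u"mac", u"debug": True}

# update() replaces the value for "os" but keeps the original key object,
# while "debug" is inserted with whatever key type the JSON parser produced.
info.update(loaded)

for key in sorted(info):
    print("%r: %r" % (key, info[key]))
```

On Python 2 this would leave `'os': u'mac'` (native key, unicode value) next to `u'debug': True`, which matches the pattern in the dump above. On Python 3 every key is `str`, so the distinction is not visible.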
Python 2.6, go home: you are drunk.
(In reply to TBPL Robot from comment #60)

This was on m-c rev 5f1d4098333f, which contains the fix from comment 59 :(
The patch failed because it tries to fix the keys of a dictionary that doesn't exist. When I pushed to try with this one: <https://hg.mozilla.org/try-comm-central/diff/59bb9943845b/mozilla-0000-test-fix.patch>, it works (or at least, the perma-fail stops).
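A sketch of how that can happen, assuming the loaded mozinfo.json is a flat dict with no nested 'info' entry (consistent with the dump earlier in this bug, but still an assumption). The function names are hypothetical, and this is not the content of the follow-up patch, which is not shown here.

```python
import json


def to_native(key):
    # Coerce a Python 2 unicode key to a byte string; leave native str alone.
    return key if isinstance(key, str) else key.encode("ascii")


def fix_nested(moz_info):
    """Same shape as the landed patch: only a nested 'info' entry is touched."""
    if "info" in moz_info:
        moz_info["info"] = dict((to_native(k), v)
                                for k, v in moz_info["info"].items())
    return moz_info


def fix_top_level(moz_info):
    """Alternative: normalise the keys of the loaded dict itself."""
    return dict((to_native(k), v) for k, v in moz_info.items())


# Made-up, cut-down stand-in for a mozinfo.json file.
loaded = json.loads('{"os": "mac", "debug": true}')

print("info" in loaded)  # False, so fix_nested() changes nothing
fixed = fix_top_level(loaded)
print(all(isinstance(k, str) for k in fixed))  # True
```

With a flat dict the guard in `fix_nested` never fires, so the Unicode keys reach `mozinfo.update()` unchanged.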
The followup appears to have fixed the issue.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]