Closed Bug 1042963 Opened 10 years ago Closed 10 years ago

KeyError: 'message' | OSError: IO Completion Port failed to signal process shutdown

Categories

(Testing :: Mochitest, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(firefox33 fixed, firefox34 fixed)

RESOLVED FIXED
mozilla34
Tracking Status
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: vaibhav1994, Assigned: akachkach)

References

Details

Attachments

(1 file, 2 obsolete files)

Traceback:
Exception in thread Thread-24:
12:10:19     INFO -  Traceback (most recent call last):
12:10:19     INFO -    File "c:\mozilla-build\python27\Lib\threading.py", line 551, in __bootstrap_inner
12:10:19     INFO -      self.run()
12:10:19     INFO -    File "c:\mozilla-build\python27\Lib\threading.py", line 504, in run
12:10:19     INFO -      self.__target(*self.__args, **self.__kwargs)
12:10:19     INFO -    File "C:\slave\test\build\venv\lib\site-packages\mozprocess\processhandler.py", line 774, in _processOutput
12:10:19     INFO -      self.processOutputLine(line.rstrip())
12:10:19     INFO -    File "C:\slave\test\build\venv\lib\site-packages\mozprocess\processhandler.py", line 719, in processOutputLine
12:10:19     INFO -      handler(line)
12:10:19     INFO -    File "C:\slave\test\build\tests\mochitest/runtests.py", line 1813, in processOutputLine
12:10:19     INFO -      message = handler(message)
12:10:19     INFO -    File "C:\slave\test\build\tests\mochitest/runtests.py", line 1906, in first_error
12:10:19     INFO -      self.harness.expectedError[key] = message['message'].strip()
12:10:19     INFO -  KeyError: 'message'
12:13:49     INFO -  IO Completion Port unexpectedly closed
12:13:49     INFO -  Traceback (most recent call last):
12:13:49     INFO -    File "C:\slave\test\build\tests\mochitest/runtests.py", line 1729, in doTests
12:13:49     INFO -      quiet=options.quiet
12:13:49     INFO -    File "C:\slave\test\build\tests\mochitest/runtests.py", line 1419, in runApp
12:13:49     INFO -      status = proc.wait()
12:13:49     INFO -    File "C:\slave\test\build\venv\lib\site-packages\mozprocess\processhandler.py", line 824, in wait
12:13:49     INFO -      self.returncode = self.proc.wait()
12:13:49     INFO -    File "C:\slave\test\build\venv\lib\site-packages\mozprocess\processhandler.py", line 162, in wait
12:13:49     INFO -      self.returncode = self._wait()
12:13:49     INFO -    File "C:\slave\test\build\venv\lib\site-packages\mozprocess\processhandler.py", line 462, in _wait
12:13:49     INFO -      raise OSError(err)
12:13:49     INFO -  OSError: IO Completion Port failed to signal process shutdown
12:13:49     INFO -  5524 INFO Automation Error: Received unexpected exception while running application
12:13:49     INFO -  5525 INFO Stopping web server
12:13:49     INFO -  5526 INFO Stopping web socket server
12:13:49     INFO -  5527 INFO Stopping ssltunnel
12:13:49     INFO -  5528 INFO runtests.py | Running tests: end.
12:13:49     INFO -  5529 INFO SUITE-END | took 970s
12:13:50     INFO -  rmtree() failed for "('c:\\users\\cltbld\\appdata\\local\\temp\\tmpunvd1j.mozrunner',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
12:13:50     INFO -  rmtree() failed for "('c:\\users\\cltbld\\appdata\\local\\temp\\tmpunvd1j.mozrunner',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
12:13:51     INFO -  rmtree() failed for "('c:\\users\\cltbld\\appdata\\local\\temp\\tmpunvd1j.mozrunner',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
12:13:51     INFO -  rmtree() failed for "('c:\\users\\cltbld\\appdata\\local\\temp\\tmpunvd1j.mozrunner',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
12:13:52     INFO -  rmtree() failed for "('c:\\users\\cltbld\\appdata\\local\\temp\\tmpunvd1j.mozrunner',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
12:13:52     INFO -  Exception WindowsError: WindowsError(32, 'The process cannot access the file because it is being used by another process') in <bound method Profile.__del__ of <mozprofile.profile.Profile object at 0x027C1C30>> ignored
12:30:32     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['C:\\slave\\test\\build\\venv\\Scripts\\python', '-u', 'C:\\slave\\test\\build\\tests\\mochitest/runtests.py', '--total-chunks', '5', '--this-chunk', '1', '--appname=C:\\slave\\test\\build\\application\\firefox\\firefox.exe', '--utility-path=tests/bin', '--extra-profile-file=tests/bin/plugins', '--symbols-path=https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/vaibhavmagarwal@gmail.com-05d069406c9d/try-win32/firefox-34.0a1.en-US.win32.crashreporter-symbols.zip', '--certificate-path=tests/certs', '--autorun', '--close-when-done', '--console-level=INFO', '--quiet', '--chunk-by-dir=4']
12:30:32    ERROR - timed out after 1000 seconds of no output
12:30:32    ERROR - Return code: 1
Depends on: 886570
This error was hidden until now because the code associated is not used on try servers till now (the code is used for bisection of failing tests). This is likely a regression from bug 886570.

Try run: https://tbpl.mozilla.org/?tree=Try&rev=05d069406c9d

Logs: https://tbpl.mozilla.org/php/getParsedLog.php?id=44456211&tree=Try&full=1 ,
https://tbpl.mozilla.org/php/getParsedLog.php?id=44456134&tree=Try&full=1
Blocks: 886570
No longer depends on: 886570
This should do it ! When updating your changes to work with structured logs, I didn't really understand what this handler was doing. might also be the case for some other handler now that I think about it, you might want to check that.
Attachment #8462761 - Flags: review?(vaibhavmagarwal)
Assignee: nobody → akachkach
Here is the try push: https://tbpl.mozilla.org/?tree=Try&rev=ab79560540c9
And the logs of failing jobs: https://tbpl.mozilla.org/php/getParsedLog.php?id=44633552&tree=Try&full=1
https://tbpl.mozilla.org/php/getParsedLog.php?id=44622025&tree=Try&full=1

Ahmed, looks like the patch didn't work. KeyError:'message' looks like the problem here. Can you figure what may be going wrong?
Flags: needinfo?(akachkach)
Comment on attachment 8462761 [details] [diff] [review]
0001-Bug-1042963-Only-execute-the-first_error-handler-on-.patch

Review of attachment 8462761 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for picking up this bug!
Fails on try: https://tbpl.mozilla.org/?tree=Try&rev=ab79560540c9 , hence r-
Attachment #8462761 - Flags: review?(vaibhavmagarwal) → review-
Mmh, OK, so it seems like some mochitest assertions may not come with a message. One thing we could use to identify those errors is the name of the subtest (a name describing the assertion). We could also do something like : "<name of the assertion>" if we don't have a message and "<name of the assertion> - <message of the assertion>" if we have a message.

What solution would you prefer ?
Flags: needinfo?(akachkach)
Looking at both the logs, I see that the same test invokes this failure, which is, /tests/content/media/test/test_load_candidates.html . So ideally, I think it would be better to fix the assertion in that particular test. 

> We could also do something like :
> "<name of the assertion>" if we don't have a message and "<name of the
> assertion> - <message of the assertion>" if we have a message.
> 
> What solution would you prefer ?

If there are a large number of tests that makes the above solution difficult, I would prefer only adding "<name of the assertion>" if we don't have a message. Something of this sort:

if 'message' in message:
   self.harness.expectedError[key] = message['message'].strip()
else:
   self.harness.expectedError[key] = "<name of the assertion>"

This won't change the structure of the tests which already have a message and just add a message for the ones which don't have.
This should do what you requested. Do another try push to make sure it works though.
Attachment #8462761 - Attachment is obsolete: true
Attachment #8462952 - Flags: review?(vaibhavmagarwal)
Try Push: https://tbpl.mozilla.org/?tree=Try&rev=499053952a7a

This patch is now working!

The problem that was causing this failure is:
01:00:51     INFO -  5526 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_load_candidates.html | Error events on source children should not bubble

Looking at the test in dxr, http://dxr.mozilla.org/mozilla-central/source/content/media/test/test_load_candidates.html#48, there is the 'message' in the ok expression. Ahmed, this make me wonder why is the message not being passed to the structured logger in this particular case. Ideally, we would want:
   self.harness.expectedError[key] = "Error events on source children should not bubble" in this case.
I think that field is actually the subtest, not the message, that's why it is now working.
Comment on attachment 8462952 [details] [diff] [review]
0001-Bug-1042963-Only-execute-the-first_error-handler-on-.patch

Review of attachment 8462952 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch :)
A nit:

-      if 'expected' in message and message['status'] == 'FAIL':
+      if message['action'] == 'test_status' and 'expected' in message and message['status'] == 'FAIL':

I would like to see the 'if' condition here same as the condition in record_result method http://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1888
, that is,
if message['action'] in ['test_end', 'test_status'] and 'expected' in message:

Don't want to leave out any test that is marked as 'FAIL' in record_result method and not there in self.expectedError.
Attachment #8462952 - Flags: review?(vaibhavmagarwal) → review+
True, we should change both!

So, I think we should make it clear what we record: test_status messages (== subtests, assertions in mochitest) and/or test_end messages (== tests, an html file containing many assertions). Maybe recording test_status only is enough? (you're looking for the assertion that failed, you can get the test's name from there)
Ahmed, I think what we should check for "test_end" as well as "test_status", since "test_end" also checks for CRASH, which we are thinking of supporting in Bug 1036328. I think the current implementation in record_result (http://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1901) is right, and we should just replicate that in first_error method. Correct me if I am wrong.
The thing with test_end is that you don't have any message or subtest, so I'm not sure what we'd put in expectedError[key].
We finally decided to drop test_end messages from record_result.

carry r+ forward.
Attachment #8462952 - Attachment is obsolete: true
Attachment #8463524 - Flags: review+
Both try runs look good, bisecting seem to detect the failures correctly.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6720ba79340d
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: