Closed Bug 1316309 Opened 3 years ago Closed 3 years ago
XP debug logs contain several "test _end for <test> logged while test not in progress"
58 bytes, text/x-review-board-request
This happens on other windows platforms as well, and infrequently on debug, but seems to happen most frequently on XP debug. Take any xpcshell log from that platform and grep for "test_end". For example: https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-win32-debug/1478640923/mozilla-central_xp_ix-debug_test-xpcshell-bm119-tests1-windows-build38.txt.gz You'll notice these are logged by mozharness at the INFO level, even though mozlog is emitting them at the ERROR level: https://dxr.mozilla.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/structuredlog.py#359 Paul's patch in bug 1261197 got backed out because it is (correctly) making those lines get logged as an error. As a workaround, we could change that to a WARNING, but it seems like there is a legitimately bad problem happening here that warrants a fix.
> and infrequently on debug Should read "infrequently on opt"
So it seems roughly the following is happening for ~77 different tests in that log: 1) Test times out 2) We log a "test_end" with status=TIMEOUT 3) We attempt to kill the test process, but that fails 4) Test ends eventually, we try to log a second "test_end" How none of this was turning the job orange is kind of mind boggling to me. There are several things failing to happen here that are combining to cause this issue: 1) Many tests on Win XP debug are exceeding the timeout threshold 2) Killing the test processes after timeout is broken (at least on Windows) 3) All throughout this process, nothing gets logged that is sufficient for mozharness to turn the job orange (this is fixed by bug 1261197 at least) I'll make fixing this my priority.
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
The situation didn't up being as bad as it originally seemed in comment 2. These tests were being re-tried in the non-parallel sequence at the end of the test run and passing there. This patch is still important though as it makes sure we clean up properly and unblocks bug 1261197.
Comment on attachment 8809954 [details] Bug 1316309 - Use mozcrash.kill_and_get_minidump() in xpcshell instead of automation.py, https://reviewboard.mozilla.org/r/92426/#review93528 Apparently I had a half-finished patch for this in bug 1193738 but I never quite got around to getting it into a landable state. Thanks for fixing this! ::: testing/xpcshell/runxpcshelltests.py (Diff revision 1) > import psutil > HAVE_PSUTIL = True > except Exception: > HAVE_PSUTIL = False > > -from automation import Automation Hooray! If we got the Android harnesses off of remoteautomation we could actually remove automation.py...
Attachment #8809954 - Flags: review?(ted) → review+
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/1a9545f0b58e Use mozcrash.kill_and_get_minidump() in xpcshell instead of automation.py, r=ted
You need to log in before you can comment on or make changes to this bug.