Closed
Bug 1489277
Opened 6 years ago
Closed 6 years ago
mozrunner unit tests causing crashes in sh.exe during mozbuild regression test suite
Categories
(Firefox Build System :: General, defect)
Tracking
(firefox64 fixed)
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: bugzilla, Assigned: bugzilla)
References
Details
Attachments
(1 file)
See the failures on "S" and "BR" Windows builds:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=a21462dc142ad715926864ac5ccccadcdc2595f7
I added some additional information to the exception being thrown by the test suite, and it turns out that sh.exe's return code is 0xC0000005, which is the Windows exception code for access violations.
I don't know what it is about the config/define changes made by launcher process by default, but something has created an enviornment that causes sh.exe to randomly crash!
Comment 1•6 years ago
|
||
Were you able to reproduce these issues on try?
Flags: needinfo?(aklotz)
Keywords: in-triage
Assignee | ||
Comment 2•6 years ago
|
||
Flags: needinfo?(aklotz)
Comment 3•6 years ago
|
||
My working theory was that something was running your new code during this test run somehow and it's breaking the environment in a way that eventually causes msys' sh.exe to crash. I was puzzling over this for a bit, since I didn't think we should be running Firefox itself in non-PGO builds, and your code doesn't seem to get used for xpcshell or anything like that that we *would* run, then I noticed this in your log:
17:04:20 INFO - ..\testing\mozbase\mozrunner\tests\test_interactive.py::test_run_interactive[firefox] PASSED
Those mozrunner tests look like they actually do run Firefox:
https://dxr.mozilla.org/mozilla-central/rev/2a59b432d2bd9b15ceec6b9435f60c785a820ef2/testing/mozbase/mozrunner/tests/test_interactive.py#10
There's code in the test fixtures to locate the binary from the objdir:
https://dxr.mozilla.org/mozilla-central/rev/2a59b432d2bd9b15ceec6b9435f60c785a820ef2/testing/mozbase/mozrunner/tests/conftest.py#26
https://dxr.mozilla.org/mozilla-central/rev/2a59b432d2bd9b15ceec6b9435f60c785a820ef2/testing/mozbase/moztest/moztest/selftest/fixtures.py#87
Given that, I'd be interested to see if disabling those tests changes the output here.
Comment 4•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #3)
> Those mozrunner tests look like they actually do run Firefox:
> https://dxr.mozilla.org/mozilla-central/rev/
> 2a59b432d2bd9b15ceec6b9435f60c785a820ef2/testing/mozbase/mozrunner/tests/
> test_interactive.py#10
Note that the `mb` python-test jobs are not dependent on the build job, and as such cannot run with the real Firefox binary. See also bug 1065583 for getting those tests enabled.
> There's code in the test fixtures to locate the binary from the objdir:
> https://dxr.mozilla.org/mozilla-central/rev/
> 2a59b432d2bd9b15ceec6b9435f60c785a820ef2/testing/mozbase/mozrunner/tests/
> conftest.py#26
Maybe this only works on Windows because the tests are part of the build job on that platform? Eventually the tests will also be moved out to `mb`, but until then it might be related.
Assignee | ||
Comment 5•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #3)
> Given that, I'd be interested to see if disabling those tests changes the
> output here.
Holy cow, Ted, you were right! Setting skip-if = true on those tests cleared up the builds!
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d596c1ebaca76ec53808be13e00256d1eb9a58c2
Assignee | ||
Comment 6•6 years ago
|
||
Now my next question is, what do we do about this? Henrik, any suggestions?
Flags: needinfo?(hskupin)
Comment 7•6 years ago
|
||
Andrew recently refactored those tests so that we can get coverage for Chrome. I think it would be best if he has a look at this.
Flags: needinfo?(hskupin) → needinfo?(ahal)
Comment 8•6 years ago
|
||
aklotz: I think it'd also be good for you to test PGO builds with your patch on Try, since those are definitely going to run Firefox during the build. If we hit the same problem there then we'll have a larger problem. (I think we could disable these tests in the build tasks without it being a big issue.)
Does making those tests run Firefox with your new flags (-wait-for-browser, -no-deelevate) change the behavior here at all? Related: will mozrunner want to use -wait-for-browser by default when running a new-enough Firefox?
Comment 9•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #8)
> I think we could disable these tests in the build tasks without it being a big issue.)
Yeah, disabling the tests such that they are skipped in automation but not locally would be fine. As Henrik mentioned we want to pull that whole suite out of the build anyway (at which point there will be no Firefox binary for them to find).
I'm not sure if it's possible to skip a test if it's running in CI or not via mozinfo/manifest.ini, if not we could use something like:
@pytest.skip(os.environ.get('MOZ_AUTOMATION'))
def test_...
I imagine pytest has a way to skip entire files as well.
> Does making those tests run Firefox with your new flags (-wait-for-browser,
> -no-deelevate) change the behavior here at all? Related: will mozrunner want
> to use -wait-for-browser by default when running a new-enough Firefox?
If we can fix the issue by setting flags, that would be better than skipping the tests. Without knowing any specifics, -wait-for-browser definitely sounds like something mozrunner should be using.
Flags: needinfo?(ahal)
Assignee | ||
Comment 10•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #8)
> aklotz: I think it'd also be good for you to test PGO builds with your patch
> on Try, since those are definitely going to run Firefox during the build. If
> we hit the same problem there then we'll have a larger problem. (I think we
> could disable these tests in the build tasks without it being a big issue.)
Yeah, I've got other issues on PGO courtesy of bug 1488627. But yeah, I'll keep an eye out.
>
> Does making those tests run Firefox with your new flags (-wait-for-browser,
> -no-deelevate) change the behavior here at all? Related: will mozrunner want
> to use -wait-for-browser by default when running a new-enough Firefox?
(In reply to Andrew Halberstadt [:ahal] from comment #9)
> If we can fix the issue by setting flags, that would be better than skipping
> the tests. Without knowing any specifics, -wait-for-browser definitely
> sounds like something mozrunner should be using.
-no-deelevate is specific to Talos, but the other flag needs to be set, yes. I already included a patch in that series on the try build that addresses this: https://hg.mozilla.org/try/rev/8d217f130399acda7e7393dd91adc3666c2d219e, but the sh.exe crashes were still being triggered until I disabled these tests outright.
Assignee | ||
Comment 11•6 years ago
|
||
Depends on D5921
Assignee | ||
Comment 12•6 years ago
|
||
Try build with this patch applied:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9cfcefe45af78b7dda94f56ff3af7bd3fe3bbafc
Comment 13•6 years ago
|
||
OK. I think working around this by disabling those tests is totally reasonable. We are using a known ancient msys version of sh.exe and I don't think it's worth anyone's time to actually try to figure out why it's crashing. It's entirely possible that it's only even happening in this way because we have a python.exe that launches Firefox and then the same process goes on to launch sh.exe (in the mozconfig loader in configure tests), which I don't think is a common scenario in the build.
Comment 14•6 years ago
|
||
Comment on attachment 9009305 [details]
Bug 1489277: Skip mozrunner tests if running under automation; r=ahal!
Andrew Halberstadt [:ahal] has approved the revision.
Attachment #9009305 -
Flags: review+
Comment 15•6 years ago
|
||
Comment on attachment 9009305 [details]
Bug 1489277: Skip mozrunner tests if running under automation; r=ahal!
Andrew Halberstadt [:ahal] has requested changes to the revision.
Attachment #9009305 -
Flags: review+
Comment 16•6 years ago
|
||
Comment on attachment 9009305 [details]
Bug 1489277: Skip mozrunner tests if running under automation; r=ahal!
Andrew Halberstadt [:ahal] has approved the revision.
Attachment #9009305 -
Flags: review+
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
Summary: Seemingly random crashes in sh.exe during mozbuild regression test suite → mozrunner unit tests causing crashes in sh.exe during mozbuild regression test suite
Assignee | ||
Comment 17•6 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/2cdf22687fa74d1f01f4888837e46c8ae2ad7b5e
Bug 1489277: Skip mozrunner tests if running under automation; r=ahal!
Comment 18•6 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Assignee | ||
Comment 19•6 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•