Closed Bug 1485259 Opened 2 years ago Closed 2 years ago

wptrunner doesn't print crash stack for process crashes

Categories

(Testing :: web-platform-tests, defect)

defect
Not set

Tracking

(firefox64 fixed)

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: whimboo, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

As I have seen in various bugs like bug 1485240 which is wdspec, wptrunner doesn't print any crash details in case of a content crash happening.

Maybe it's because the exit code of Firefox is 0, and a call into mozcrash only happens for not equal 0?
The problematic code seem to be located here:

https://searchfox.org/mozilla-central/rev/f2ac80ab7dbde5400a3400d463e07331194dec94/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py#573-575

The check for crashes is only run in case of the following test statuses: "TIMEOUT", "EXTERNAL-TIMEOUT", "INTERNAL-ERROR".

But in case of the recent wdspec failures the status is TEST-UNEXPECTED-ERROR.

Given that there might be too many different status, I wonder why we are not always checking for crashes? In some cases a test might even pass, but a crash like in content could happen with a little delay, and eg the tear down step won't notice it.

James, do you foresee any problems when we always run a check for crashes? If not I would enable that. We do that for a very long time with Marionette now, and it works great.
Flags: needinfo?(james)
I don't know that it would be bad although it would certainly come with a performance penalty. I think we are assuming that if there's a content process crash we end up getting an error back from marionette, which should be an INTERNAL-ERROR, but maybe isn't for wdspec tests. So it might be sufficient to ensure that if we loose the marionette connection in the wdspec tests we return INTERNAL-ERROR rather than just ERROR (or also cehck for crashes on ERROR).
Flags: needinfo?(james)
(In reply to James Graham [:jgraham] from comment #2)
> I don't know that it would be bad although it would certainly come with a
> performance penalty.

I think the performance penalty is somewhat marginal because mozcrash only looks for files being present in the `minidump` subfolder of the Firefox profile. If none exist, it returns immediately.

> I think we are assuming that if there's a content
> process crash we end up getting an error back from marionette, which should
> be an INTERNAL-ERROR, but maybe isn't for wdspec tests. So it might be

No, given that wptrunner is managing the binary itself, Marionette doesn't check for exit codes, nor uses mozcrash itself for checking for crashes. Also note that for content process crashes Firefox quits with exit code 0 at the moment, because there is no way to tell `nsIAppStartup.Quit()` to use a different exit code yet.

That said wptrunner can get any kind of error from Marionette.

> sufficient to ensure that if we loose the marionette connection in the
> wdspec tests we return INTERNAL-ERROR rather than just ERROR (or also cehck
> for crashes on ERROR).

You speak about wdspec tests only here. I wonder if all wpt tests are actually affected by this problem.

The simplest wdspec test to trigger a content process crash is:

> def test_crash(session):
>     session.url = "about:crashcontent"

When I run that I even get a PASS:

> 0:12.54 pid:22416 A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down
> 0:12.80 pid:22416 1535102559224	Marionette	TRACE	0 <- [1,3,null,{"value":null}]
PASSED
> 0:12.81 pid:22416 [Child 22420, Main Thread] WARNING: nsAppShell::Exit() called redundantly: file /builds/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 758
Oh, when I checked the logs I see that the Firefox profile is actually handled by geckodriver:

> "moz:profile":"/var/folders/4k/sf4gz5fn3kl9hr3nd7pzbvhc0000gn/T/rust_mozprofile.XwPc005YD2QW"

As such I think this bug is dependent on bug 1433495. The profile is removed by geckodriver before any crash checks can be performed by mozcrash.

Lets keep this bug for wdspec tests for now.
Depends on: 1433495
Summary: wptrunner doesn't print crash stack for content process crashes → [wdspec] wptrunner doesn't print crash stack for content process crashes
(In reply to Henrik Skupin (:whimboo) from comment #4)
> As such I think this bug is dependent on bug 1433495. The profile is removed
> by geckodriver before any crash checks can be performed by mozcrash.

As reported on https://github.com/web-platform-tests/results-collection/issues/592 there are also crashes outside of wdspec which aren't getting analyzed. So it doesn't really depend on just geckodriver.

Also this is for all crashes and not only for the content process.
No longer depends on: 1433495
Summary: [wdspec] wptrunner doesn't print crash stack for content process crashes → [wdspec] wptrunner doesn't print crash stack for process crashes
An option here could be to change the condition when checking for crashes to:

>        if not self.browser.is_alive():
>            if self.browser.check_for_crashes():
>                status = "CRASH"

But I don't know all the states when a browser gets closed (or waiting for user input) after a test finished. If we want to go that route note `is_alive()` is using `mozrunner.is_running()` which is currently broken! See bug 1433905. And I also see at least one case in wptrunner where we make use of it.

Given that wptrunner is hard to debug, maybe you can give a little information James if this could be a feasible approach, or if it won't work? Thanks.
Flags: needinfo?(james)
Assignee: nobody → james
Status: NEW → ASSIGNED
Flags: needinfo?(james)
Comment on attachment 9007310 [details]
Bug 1485259 - Always check for crashes after wpt tests

Andreas Tolfsen ❲:ato❳ has approved the revision.
Attachment #9007310 - Flags: review+
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/1445262bf861
Always check for crashes after wpt tests r=ato
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/12968 for changes under testing/web-platform/tests
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6a9c8ef40c06
Always check for crashes after wpt tests r=ato
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
https://hg.mozilla.org/mozilla-central/rev/1445262bf861
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
This fix was actually not only related to wdspec but wptrunner in general. For wdspec I filed bug 1490906.

James, shall we request uplift for this patch to beta?
Blocks: 1490906
Flags: needinfo?(james)
Summary: [wdspec] wptrunner doesn't print crash stack for process crashes → wptrunner doesn't print crash stack for process crashes
No longer blocks: 1485240, 1485482
Upstream PR merged
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/12968
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/GLdBhOhoTNOuMgJ8wma9aw)
I don't think it's worthwhile to uplift this patch since we've had the existing behaviour for a long time and we'll get the new one in the next cycle anyway.
Flags: needinfo?(james)
You need to log in before you can comment on or make changes to this bug.