Closed Bug 1493907 Opened Last year Closed 4 months ago

Run Wd tests in headless mode

Categories

(Testing :: geckodriver, defect, P1)

Tracking

(firefox69 fixed)

RESOLVED FIXED
mozilla69

People

(Reporter: ato, Assigned: whimboo)

References

(Depends on 4 open bugs, Blocks 2 open bugs)

Details

Attachments

(4 files, 1 obsolete file)

The WPT wdspec test type tests Firefox’ WebDriver (geckodriver +
Marionette) implementation exhaustively.  This would be a good test
suite to run headlessly to discover problems with headless mode and
prevent it from regressing.
Also, lots of people use Selenium and geckodriver with Firefox running in headless mode nowadays. So it would be very helpful to see any kind of regression as early as possible.
I'm trying to enable headless mode by using `built-projects` in the following line:

https://searchfox.org/mozilla-central/source/taskcluster/ci/test/web-platform.yml#155

But sadly `mach try fuzzy` doesn't recognize that change, so that I'm not able to push to try.

Andrew, any idea what's wrong here?
Flags: needinfo?(ahal)
Fuzzy's default is all tasks that run on mozilla-central. If you want to schedule a task that doesn't run on mozilla-central you need to pass --full.
Flags: needinfo?(ahal)
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
(In reply to Andrew Halberstadt [:ahal] from comment #3)
> Fuzzy's default is all tasks that run on mozilla-central. If you want to
> schedule a task that doesn't run on mozilla-central you need to pass --full.

No, this doesn't help for the attached patch. It still doesn't list headless. Maybe using `built-projects` isn't correct here? Or do you see anything else that could cause it?
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(ahal)
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Component: web-platform-tests → geckodriver
If you don't see your task with |mach try fuzzy --full|, then it's not being generated by the taskgraph module. To verify you can run:
$ ./mach taskgraph full | grep headless

The full taskgraph is generated before target task filtering, so run-on-projects shouldn't have any effect on what shows up in the full taskgraph.
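
To illustrate the distinction above, here is a minimal sketch of the two stages. It is a simplified model and not the real taskgraph code: the task labels, the `TASKS` table, and the helper names are all made up for illustration. It only shows why a task that is missing from the full graph cannot be a run-on-projects problem.

```python
# Simplified model, NOT the real taskgraph module. Labels and project
# lists are hypothetical and only serve the illustration.
TASKS = {
    "test-linux64/opt-web-platform-tests-wdspec": ["mozilla-central", "try"],
    # A task that is generated but scheduled on no project by default.
    "test-linux64/opt-web-platform-tests-wdspec-headless": [],
}


def full_task_graph(tasks):
    """Every generated task label, before any target task filtering."""
    return sorted(tasks)


def target_tasks(tasks, project):
    """Labels that survive run-on-projects filtering for one project."""
    return sorted(
        label for label, projects in tasks.items() if project in projects
    )


def grep(labels, needle):
    """Rough stand-in for piping the labels through `grep <needle>`."""
    return [label for label in labels if needle in label]
```

In this model, `grep(full_task_graph(TASKS), "headless")` still finds the headless task, while `grep(target_tasks(TASKS, "mozilla-central"), "headless")` returns nothing. If the first call came back empty as well, the task would not be generated at all, which is the situation described in this comment.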
Flags: needinfo?(ahal)
I think you need to add 'web-platform-tests-wdspec-headless' to test-platforms.yml, and that's why it's not showing up.
Depends on: 1370636
(In reply to Andrew Halberstadt [:ahal] from comment #7)
> I think you need to add 'web-platform-tests-wdspec-headless' to
> test-platforms.yml, and that's why it's not showing up.

Oh, right. But this is only one source, and will work for Linux. For Mac and Windows I have to update the appropriate sets in test-sets.yml.
Priority: -- → P1
Depends on: 1496409
As expected, we see lots of hangs in that try push when minimizing a window or making it fullscreen. Let's wait for the patch on bug 1492499 to land before continuing on this bug.
Depends on: 1492499
Most of the window manipulation tests fail with failures like:

> /webdriver/tests/maximize_window/maximize.py | test_restore_the_window - assert False

That is actually exactly what we also see on wpt.fyi for all those commands. I assume that they only run in headless mode, and as such we haven't seen it yet ourselves.
This failure is actually for `document.hidden`:

> 05:54:34     INFO - >       assert document_hidden(session)
> 05:54:34     INFO - E       assert False

I will file a new bug for that particular issue.
(In reply to Henrik Skupin (:whimboo) from comment #13)
> This failure is actually for `document.hidden`:
> 
> > 05:54:34     INFO - >       assert document_hidden(session)
> > 05:54:34     INFO - E       assert False
> 
> I will file a new bug for that particular issue.

Actually as discussed in the WebDriver meeting yesterday, Andreas will go ahead and file that issue.
Flags: needinfo?(ato)
Andreas, can you please follow-up on it? Thanks!
Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1510305 about
document.hidden in headless mode.

I wonder if we should not just go ahead and enable WPT WebDriver
tests in headless mode for the time being, with the failing tests
marked as expected to fail?
Flags: needinfo?(ato)
Is there a way to automatically generate the manifest files? Given the number of failing tests I don't want to do it manually.
I know there’s a way.

jgraham, is it documented anywhere?
Flags: needinfo?(james)
For context: What is being requested here is to take the test failure
log from TC and pass it into "./mach wpt" to have it update the
expected results, so that we can ignore the failing Wd tests in
headless mode.
It's not fully documented, and possibly doesn't work out of the box with headless.

You want a try run with both passing and failing examples, so the code knows which condition is causing the failure. But in this case that won't work, because we don't use headless mode as a criterion by default. So you could either add "headless" to both lists at [1], or just use the failing logs and then use sed or whatever to update the generated criterion.

To fetch the logs I have a tool [2] which you can install on the path and then

fetchlogs try <sha1> --log-type wptreport --out-dir logs

Once you have the logs then run

./mach wpt-update /path/to/logs/*

[1] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/firefox.py#159
[2] https://github.com/jgraham/fetchlogs
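
As a rough illustration of the "use sed or whatever" step, the sketch below rewrites unconditional `expected: FAIL` lines in generated metadata into a condition on headless. The function name, the sample stanza, and the assumption that the generated files contain a plain `expected: FAIL` line and that `headless` is usable as a condition property are all assumptions, not something confirmed by the tooling at this point in the bug.

```python
import re

# Hypothetical sample of generated metadata; real files may look different.
SAMPLE = (
    "[maximize.py]\n"
    "  [test_restore_the_window]\n"
    "    expected: FAIL\n"
)


def make_headless_conditional(metadata, status="FAIL"):
    """Turn unconditional `expected: STATUS` lines into `if headless: STATUS`."""
    pattern = re.compile(
        rf"^(?P<indent>[ \t]*)expected: {re.escape(status)}[ \t]*$",
        re.MULTILINE,
    )

    def repl(match):
        indent = match.group("indent")
        # Keep the original indentation and nest the condition below it.
        return f"{indent}expected:\n{indent}  if headless: {status}"

    return pattern.sub(repl, metadata)


if __name__ == "__main__":
    print(make_headless_conditional(SAMPLE))
```

Lines that already carry a condition are left alone, because the pattern only matches `expected: FAIL` when the status sits on the same line.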
Flags: needinfo?(james)
Sounds like a lot of work. Given that there are more important things on my plate, this will have to wait until the dependencies have been fixed, or someone else takes it.
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2
Depends on: 1521179
Depends on: 1510305

Joel, what's our current situation with headless tests? I know that you disabled a lot of them, so I wonder if we still want to run those, and if yes, on which platforms. Maybe that helps us to get the wdspec ones landed easier.

Headless tests are only run to ensure compatibility of headless mode, not to run in parallel or to save resources. If there is a future investment into headless mode to make it support more of our needs for tests, then we could look at running normal tests as tier-2 and headless as tier-1, taking advantage of the faster runtime and possibility of parallel execution.

Given those constraints, running Wd in headless makes sense as
users are relying on using headless WebDriver and we want to avoid
any regressions in that area.

I should also add that, although not intended to be one, Wd is probably
the best regression test suite we have for headless mode, considering
its scope is to ensure that all things related to browser automation
work.

Blocks: 1560181

Andreas, we could try to get the headless tests added, and simply mark those tests as expected fail where we know they are failing due to broken behavior in Firefox. I hope there won't be that many affected tests.

Enabling headless is important before we can get started with bug 1560181.

I just pushed a try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=db4d725fb86b5294d0d24e05d64ea9271644098a

Try run is looking better with a patch:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4b90f83a778aa26f792f0f8d94478e8834a96511

Still have one more failure:
TEST-UNEXPECTED-FAIL | /webdriver/tests/get_window_rect/get.py | test_payload - AssertionError: assert {'height': 60...x': 0, 'y': 0} == {'height': 600...100, 'y': 100}

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #25)

> Andreas, we could try to get the headless tests added, and simply
> mark those tests as expected fail where we know those are failing
> due to broken behavior in Firefox. I hope that those shouldn't be
> that many affected tests.

Perfect! I agree with that approach.

(In reply to Brendan Dahl [:bdahl] from comment #26)

> Try run is looking better with a patch:

Lovely, thanks for pitching in!

Note that I will wait a bit more before marking tests as expected fail. After talking to Brendan yesterday, he has to do some more verification that his patch doesn't cause any regressions (which did happen in the past). If all goes well, we might be able to run nearly all the tests, which is fantastic!

Brendan, please let me know if there is something I could help with. We would appreciate it if we could get at least the recent patch landed.

Flags: needinfo?(bdahl)
Depends on: 1562025

Patch is up in bug 1562025

Flags: needinfo?(bdahl)

Wonderful. Thanks a lot again. Once it has landed, I will have a look at what the remaining test failure is related to, and whether it's even a bug in the test.

Depends on: 1563161

So I investigated the two remaining failing tests. The payload one for GetWindowRect was just poorly written. After refactoring it, it works fine. For the negative coordinates test I filed bug 1563161, and will mark the test as expected fail for now when run under headless.
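
For illustration, marking a test as expected fail only under headless can be thought of as a conditional lookup against the properties of the current run. The sketch below is a simplified model and not wptrunner's implementation: `resolve_expected`, the `(property, status)` pairs, and the `run_info` dictionaries are hypothetical, and real conditions are expressions rather than single boolean properties.

```python
def resolve_expected(conditions, run_info, default="PASS"):
    """Return the status of the first condition whose property is truthy.

    `conditions` is a list of (property_name, status) pairs, checked in order.
    """
    for prop, status in conditions:
        if run_info.get(prop):
            return status
    return default


# Hypothetical expectation for test_negative_x_y: fail only in headless runs.
NEGATIVE_X_Y = [("headless", "FAIL")]
```

With this model, `resolve_expected(NEGATIVE_X_Y, {"headless": True})` gives `"FAIL"`, while a run without the headless property set falls back to the default `"PASS"`, so the same test keeps its normal expectation in non-headless jobs.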

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Priority: P2 → P1
Depends on: 1563248
Depends on: 1563251
Attachment #9075598 - Attachment description: Bug 1493907 - [wdspec] Mark "test_negative_x_y" for "Set Window Rect" as expected fail under headless. r=#webdriver → Bug 1493907 - [wdspec] Mark remaining failing tests as expected fail for headless mode. r=#webdriver
Attachment #9075597 - Attachment is obsolete: true
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0609705a3472
[wptrunner] Expose headless flag for expected meta data. r=webdriver-reviewers,ato
https://hg.mozilla.org/integration/autoland/rev/d62f57d8e0b7
[wdspec] Mark remaining failing tests as expected fail for headless mode. r=webdriver-reviewers,ato
https://hg.mozilla.org/integration/autoland/rev/60ee55f4c31d
[wdspec] Run Wdspec tests for shippable builds in headless mode on all platforms. r=webdriver-reviewers,ato
Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Blocks: 1563516
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/17724 for changes under testing/web-platform/tests
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/17724
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/J_LoqpdWSx2zK8yQuIs_ZA)
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/17724
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/O_agcpZATxm7XQLlD8aHwA)
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/17724
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/J5slsTOUSbCXbun1x6pL_A)
Upstream PR was closed without merging
Upstream PR merged
Pushed by wptsync@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/264ce2118077
[wpt PR 17724] - [Gecko Bug 1493907] [wptrunner] Expose headless flag for expected meta data., a=testonly