1425595 - Missing coverage on opt-web-platform-tests-wdspec-e10s (Wd)

Reporter

Description

•

6 years ago

ato says:

> The code coverage results for testing/marionette are lacking coverage
> from the WPT WebDriver conformance test suite.  This job is called 
> opt-web-platform-tests-wdspec-e10s (Wd).  For the coverage results
> to be useful for this component, we would have to include this
> test job.

This data came from ActiveData, so any part of the ETL pipeline may be in error.

Kyle Lahnakoski [:ekyle]

Reporter

Updated

•

6 years ago

Blocks: code-coverage

Kyle Lahnakoski [:ekyle]

Reporter

Comment 1

•

6 years ago

Well, it appears to be running, but failing...

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=c6b71032e0831ddd09b67391e62024bc729a1d0d&filter-searchStr=ccov

Marco Castelluccio [:marco]

Comment 2

•

6 years ago

Does ActiveData ignore failing test chunks? Or does it collect coverage for failing test chunks too?

Andreas Tolfsen ❲:ato❳

Comment 3

•

6 years ago

There’s only one failing test in that run, which due to the nature
of slow builds could be unstable because it is an interaction test.
We will want to include failing jobs in the code coverage, if this
isn’t the case.

Marco Castelluccio [:marco]

Comment 4

•

6 years ago

(In reply to Andreas Tolfsen ‹:ato› from comment #3)
> There’s only one failing test in that run, which due to the nature
> of slow builds could be unstable because it is an interaction test.
> We will want to include failing jobs in the code coverage, if this
> isn’t the case.

It is the case on the codecov.io reports; I don't know about ActiveData.
Kyle?

Flags: needinfo?(klahnakoski)

Kyle Lahnakoski [:ekyle]

Reporter

Comment 5

•

6 years ago

ActiveData ingests all the coverage artifacts; it does not look at the job status.

:ato, what file were we looking at again?

Flags: needinfo?(klahnakoski) → needinfo?(ato)

Andreas Tolfsen ❲:ato❳

Comment 6

•

6 years ago

So for example, the GeckoDriver#setWindowRect function [1] is invoked
through the set_window_rect.py WPT test [2].  You can run this file
locally with the following incantation:

	./mach wpt testing/web-platform/tests/webdriver/tests/set_window_rect.py 

I guess it is interesting that we have another test for this function
in the Mn job [3] that also calls this function.  Maybe there is
something more sinister at play here?

  [1] https://codecov.io/gh/marco-c/gecko-dev/src/3ec05888ca32b2d8a14d700474efb0c63411fca2/testing/marionette/driver.js#L1426
  [2] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/webdriver/tests/set_window_rect.py#14
  [3] https://searchfox.org/mozilla-central/source/testing/marionette/harness/marionette_harness/tests/unit/test_window_rect.py

Flags: needinfo?(ato)

Marco Castelluccio [:marco]

Comment 7

•

6 years ago

I've run the test locally and the function is shown as covered.

I've also noticed it is covered in this build (https://codecov.io/gh/marco-c/gecko-dev/src/b56a0f8804e00086266672540d7eacb63ae3dbf1/testing/marionette/driver.js#L1407).

It isn't covered in this other build (https://codecov.io/gh/marco-c/gecko-dev/src/0d112b8fadbe994214d9419944ee1e4fba987226/testing/marionette/driver.js#L1407) where Wd failed with:
> [task 2018-01-31T19:13:36.165Z] 19:13:36     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-', '--log-raw=/builds/worker/workspace/build/blobber_upload_dir/wpt_raw.log', '--log-wptreport=/builds/worker/workspace/build/blobber_upload_dir/wptreport.json', '--log-errorsummary=/builds/worker/workspace/build/blobber_upload_dir/wpt_errorsummary.log', '--binary=/builds/worker/workspace/build/application/firefox/firefox', '--symbols-path=https://queue.taskcluster.net/v1/task/AhYwkCQ3RIqvvAf-gbx7_Q/artifacts/public/build/target.crashreporter-symbols.zip', '--stackwalk-binary=/usr/local/bin/linux64-minidump_stackwalk', '--stackfix-dir=/builds/worker/workspace/build/tests/bin', '--run-by-dir=3', '--no-pause-after-test', '--test-type=wdspec', '--stylo-threads=4', '--webdriver-binary=/builds/worker/workspace/build/tests/bin/geckodriver', '--prefs-root=/builds/worker/workspace/build/tests/web-platform/prefs', '--processes=1', '--config=/builds/worker/workspace/build/tests/web-platform/wptrunner.ini', '--ca-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/cacert.pem', '--host-key-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.key', '--host-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.pem', '--certutil-binary=/builds/worker/workspace/build/tests/bin/certutil']
> [task 2018-01-31T19:13:36.171Z] 19:13:36    ERROR - timed out after 1000 seconds of no output
> [task 2018-01-31T19:13:36.171Z] 19:13:36    ERROR - Return code: -15
> [task 2018-01-31T19:13:36.171Z] 19:13:36    ERROR - No suite end message was emitted by this harness.
> [task 2018-01-31T19:13:36.172Z] 19:13:36    ERROR - # TBPL FAILURE #

How are we handling the process on timeout? Are we abruptly killing it? It's possible that, if we are abruptly killing it, we could lose coverage.

Marco Castelluccio [:marco]

Comment 8

•

6 years ago

(In reply to Marco Castelluccio [:marco] from comment #7)
> How are we handling the process on timeout? Are we abruptly killing it? It's
> possible that, if we are abruptly killing it, we could lose coverage.

We are indeed losing coverage, there are no gcda or jsvm info files generated.
The first question still stands, are we killing the process or are we letting it run and terminating the job? If we are killing it, how are we killing it?
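To illustrate why the way the process is killed matters, here is a minimal sketch (POSIX only, not taken from the harness). It assumes that coverage counters are flushed by an exit-time hook, as gcov normally does for gcda files, and uses Python's atexit as a stand-in for that hook. The names CHILD and run_and_terminate are made up for this example.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: registers an exit hook that writes a marker file,
# standing in for a coverage runtime flushing its counters at exit.
CHILD = r"""
import atexit, signal, sys, time
marker, mode = sys.argv[1], sys.argv[2]

def flush():
    with open(marker, "w") as f:
        f.write("flushed")

atexit.register(flush)
if mode == "handle":
    # Turn SIGTERM into a normal interpreter exit so exit hooks run.
    signal.signal(signal.SIGTERM, lambda *_: sys.exit(0))
print("ready", flush=True)
time.sleep(60)
"""

def run_and_terminate(handle_sigterm):
    """Start the child, send SIGTERM, return (returncode, marker_exists)."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        mode = "handle" if handle_sigterm else "default"
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, marker, mode],
            stdout=subprocess.PIPE,
        )
        proc.stdout.readline()  # wait until the hook is registered
        proc.terminate()        # sends SIGTERM
        proc.wait(timeout=10)
        proc.stdout.close()
        return proc.returncode, os.path.exists(marker)
```

With the default SIGTERM disposition the child dies without running its exit hook, and the parent sees a negative return code equal to the signal number, which matches the "Return code: -15" in the log. When the child handles SIGTERM and exits normally, the hook runs and the marker is written.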

Marco Castelluccio [:marco]

Comment 9

•

6 years ago

(In reply to Marco Castelluccio [:marco] from comment #8)
> (In reply to Marco Castelluccio [:marco] from comment #7)
> > How are we handling the process on timeout? Are we abruptly killing it? It's
> > possible that, if we are abruptly killing it, we could lose coverage.
> 
> We are indeed losing coverage, there are no gcda or jsvm info files
> generated.
> The first question still stands, are we killing the process or are we
> letting it run and terminating the job? If we are killing it, how are we
> killing it?

James, do you know?

Flags: needinfo?(james)

James Graham [:jgraham]

Comment 10

•

6 years ago

The logging

> [task 2018-01-31T19:13:36.171Z] 19:13:36    ERROR - timed out after 1000 seconds of no output
> [task 2018-01-31T19:13:36.171Z] 19:13:36    ERROR - Return code: -15

indicates that Taskcluster is killing the whole testrunner with SIGTERM. Generally wpt tries to do a graceful shutdown by first requesting an in-app shutdown and then falling back on SIGTERM and SIGKILL only if required. But the shutdown for wdspec tests is a little different; it looks like we don't necessarily try for a more graceful shutdown than sending SIGTERM in that case. So there is possibly a bug to fix here, but in the specific case mentioned it has nothing to do with wpt and it's not really fixable.
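The escalation described above (in-app shutdown request first, then SIGTERM, then SIGKILL) can be sketched roughly as follows. This is not the wptrunner code; stop_gracefully, request_shutdown and grace are hypothetical names, and the in-app request is whatever callable the caller supplies.

```python
import subprocess

def stop_gracefully(proc, request_shutdown, grace=5.0):
    """Stop proc, escalating only when the previous step did not work.

    request_shutdown is a caller-supplied callable that asks the
    application to quit on its own (hypothetical; for a browser this
    would be some in-app quit command). Returns the name of the step
    after which the process exited.
    """
    steps = (
        ("in-app", request_shutdown),  # application exits by itself
        ("SIGTERM", proc.terminate),   # polite signal, can be handled
        ("SIGKILL", proc.kill),        # cannot be caught or ignored
    )
    for name, action in steps:
        action()
        try:
            proc.wait(timeout=grace)
            return name
        except subprocess.TimeoutExpired:
            continue  # still running, try the next step
    raise RuntimeError("process did not exit after SIGKILL")
```

Only the first step gives the application a guaranteed chance to run its normal exit path, which is why skipping it for wdspec could matter for coverage. In the timeout case from the log the harness itself is terminated from outside, so a sequence like this would never get to run.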

Flags: needinfo?(james)

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Categories

(Testing :: Code Coverage, enhancement)

Tracking

(Not tracked)

People

(Reporter: ekyle, Unassigned)

References

(Blocks 1 open bug)

Security

(public)
