Closed Bug 1190474 Opened 4 years ago Closed 4 years ago

cppunittests: allow longer timeouts to be specified in manifest

Categories

(Testing :: General, defect)

defect
Not set

Tracking

(firefox43 fixed)

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: jgriffin, Assigned: jgriffin)

Details

Attachments

(2 files)

ekr wants to enable a long-running cppunittest which currently triggers the 900s timeout.  Let's allow longer timeouts to be specified in the manifest on a per-test basis.

This means we'll also have to change how 'mach cppunittest' works, because it doesn't currently use the manifest at all.
I realized (looking at the code) that the way we implemented this manifest means that if someone adds new CPP_UNIT_TESTS to moz.build they won't be run in automation unless they're also added to cppunittest.ini. That's unfortunate, but we're probably also not adding many new CPP_UNIT_TESTS nowadays.
Still appears to be busted.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> I realized (looking at the code) that the way we implemented this manifest
> means that if someone adds new CPP_UNIT_TESTS to moz.build they won't be run
> in automation unless they're also added to cppunittest.ini. That's
> unfortunate, but we're probably also not adding many new CPP_UNIT_TESTS
> nowadays.

A small price to pay
(In reply to Eric Rescorla (:ekr) from comment #4)
> Still appears to be busted.

Yes, testing a fix now.
Assignee: nobody → jgriffin
Status: NEW → ASSIGNED
Attachment #8644081 - Flags: review?(ted) → review?(cmanchester)
Comment on attachment 8644081 [details] [diff] [review]
Allow test-specific timeouts to be specified in cppunittest.ini,

Review of attachment 8644081 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/runcppunittests.py
@@ +199,5 @@
> +                    mp.read(os.path.join(p, 'cppunittest.ini'))
> +                except IOError:
> +                    tests.extend([(os.path.abspath(os.path.join(p, x)), 0) for x in os.listdir(p)])
> +            else:
> +                tests.append((os.path.abspath(p)), 0)

misplaced paren, should be:

tests.append((os.path.abspath(p), 0))

@@ +206,5 @@
>      # for all platforms (and it will fail on Windows anyway)
> +    active_tests = mp.active_tests(exists=False, disabled=False, **environ)
> +    suffix = '.exe' if mozinfo.isWin else ''
> +    if binary_path:
> +        tests.extend([(os.path.join(binary_path, test['relpath'] + suffix), int(test.get('timeout', 0))) for test in active_tests])

xpcshell tests specify per-test timeouts in manifests with "requesttimeoutfactor". We could do the same here or as a follow up if consistency seems desirable.
Attachment #8644081 - Flags: review?(cmanchester) → review+
(In reply to Chris Manchester [:chmanchester] from comment #9)
> xpcshell tests specify per-test timeouts in manifests with
> "requesttimeoutfactor". We could do the same here or as a follow up if
> consistency seems desirable.

We don't do this in xpcshell, but in Mochitest we have a larger timeout for debug tests, so the "timeout factor" is nice there.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> (In reply to Chris Manchester [:chmanchester] from comment #9)
> > xpcshell tests specify per-test timeouts in manifests with
> > "requesttimeoutfactor". We could do the same here or as a follow up if
> > consistency seems desirable.
> 
> We don't do this in xpcshell, but in Mochitest we have a larger timeout for
> debug tests, so the "timeout factor" is nice there.

"requesttimeoutfactor" was added to xpcshell manifests in bug 927550.
(In reply to Chris Manchester [:chmanchester] from comment #11)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> > (In reply to Chris Manchester [:chmanchester] from comment #9)
> > > xpcshell tests specify per-test timeouts in manifests with
> > > "requesttimeoutfactor". We could do the same here or as a follow up if
> > > consistency seems desirable.
> > 
> > We don't do this in xpcshell, but in Mochitest we have a larger timeout for
> > debug tests, so the "timeout factor" is nice there.
> 
> "requesttimeoutfactor" was added to xpcshell manifests in bug 927550.

This makes sense and I'd like to be consistent; I'll amend the patch.
(In reply to Pulsebot from comment #14)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/bedff4a78d9a

This uses 'requesttimeoutfactor' for consistency with prior art, e.g., requesttimeoutfactor = 2 means allow twice the normal timeout.  This will be useful if the default timeout for desktop and android diverge (which currently isn't the case).
Backed out for B2G ICS Emulator Cpp unittests permafail: https://hg.mozilla.org/integration/mozilla-inbound/rev/34ab0b8678b2

23:36:48     INFO - Calling ['/builds/slave/test/build/venv/bin/python', 'remotecppunittests.py', '--dm_trans=adb', '--symbols-path=http://pvtbuilds.pvt.build.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-inbound-emulator/20150812230633/b2g-43.0a1.en-US.android-arm.crashreporter-symbols.zip', '--xre-path=/builds/slave/test/build/xre/bin', '--addEnv', 'LD_LIBRARY_PATH=/vendor/lib:/system/lib:/system/b2g', '--with-b2g-emulator=/builds/slave/test/build/emulator/b2g-distro', '--skip-manifest=b2g_cppunittest_manifest.txt', '.'] with output_timeout 1000
23:36:48     INFO -  Usage: remotecppunittests.py [options]
23:36:48     INFO -  remotecppunittests.py: error: no such option: --skip-manifest
23:36:48    ERROR - Return code: 2
23:36:48    ERROR - No tests run or test summary not found
--skip-manifest looks like a no-op.  I removed it, testing on try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8f2479350709
https://hg.mozilla.org/mozilla-central/rev/8fa8ea614226
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Thanks. Do you have an example of the required syntax.
Flags: needinfo?(jgriffin)
in cppunittest.ini, you could specify e.g.

[turn_unittest]
requesttimeoutfactor = 2

This would give that test 2 times the normal (900s, at present) timeout.  You can use as large a timeout factor as you need.
Flags: needinfo?(jgriffin)
I think this may not work correctly. Consider the following two try pushes:

Without modifying the timeout, the test times out at 900s.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9de8e6c9386c

With an expanded timeout, we see a complaint about no output and the
logs show no signaling_unittests output even though without this change
you see a lot of log output:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f843b56f18a8

09:11:36     INFO -  TEST-START | signaling_unittests
09:28:16     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/slave/test/build/venv/bin/python', '-u', '/builds/slave/test/build/tests/cppunittest/runcppunittests.py', '--symbols-path=/builds/slave/test/build/symbols', '--xre-path=/builds/slave/test/build/application/firefox', 'tests/cppunittest']
09:28:16    ERROR - timed out after 1000 seconds of no output
09:28:16    ERROR - Return code: -9
Flags: needinfo?(jgriffin)
You're running into a different (mozharness) timeout.  I suspect the out differences are just due to how we process the output from these processes.  Here's a try run with a much longer mozharness timeout.  It's not the right way to fix it, but will tell us if that's all that needs to be done:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=caa1621e2b26
Flags: needinfo?(jgriffin)
(In reply to Jonathan Griffin (:jgriffin) from comment #23)
> You're running into a different (mozharness) timeout.  I suspect the out
> differences are just due to how we process the output from these processes. 
> Here's a try run with a much longer mozharness timeout.  It's not the right
> way to fix it, but will tell us if that's all that needs to be done:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=caa1621e2b26

Looks like this worked; the test failed with exit code 1 but didn't time out.  I'll make a follow-up patch to fix the mozharness timeout.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Alternately we could make runcppunittests print some output every N seconds while it waits for a test to finish. We do that in the script we use to wrap PGO linking on Windows:
https://dxr.mozilla.org/mozilla-central/source/config/link.py
Attachment #8648830 - Flags: review?(cmanchester) → review+
For some reason, we don't dump the output from the test while it's running; we store it up and dump it all at once after the process terminates.  This seems suboptimal; I'll file a separate follow-up bug for changing this.
https://hg.mozilla.org/mozilla-central/rev/afb1daac8215
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.