Closed Bug 1322616 Opened 3 years ago Closed 2 years ago

gtest timeout message does not differentiate between timeout and no-output timeout

Categories

(Testing :: General, defect)

Version 3
defect
Not set

Tracking

(firefox59 fixed)

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

(Whiteboard: [PI:December])

Attachments

(3 files)

The message "TEST-UNEXPECTED-FAIL | gtest | timed out after 1200 seconds" can mean either:
 - gtest ran for more than 1200 seconds
 - gtest ran for more then 300 seconds without output

In bug 1115253, many of the current failures are due to 300 seconds without output (likely hang during TestNamedPipeService, bug 1322377), but that certainly is not clear from the error message.
See Also: → 1332352
:gbrown could you please help us assign this bug to someone who can take a look? I am asking this because this Bug is blocking Bug 1115253
Flags: needinfo?(gbrown)
I think that resolving this bug is a good idea, but I don't have time for it and I'm not sure who else might be able to look at it. While this bug would help diagnose bug 1115253, it doesn't really block it.
No longer blocks: 1115253
Flags: needinfo?(gbrown)
See Also: → 1115253
Whiteboard: [PI:November]
It's worse than this, and it hit me while trying to debug bug 1419196:

I did a push try with a of GTests.TEST_PROC_TIMEOUT set to 3600, and when the test failed with a timeout it said:

[task 2017-11-23T04:45:51.239Z] 04:45:51    ERROR -  gtest TEST-UNEXPECTED-FAIL | gtest | timed out after 3600 seconds
https://public-artifacts.taskcluster.net/cxxPLjMxRDO_0kKy7WV_KA/0/public/logs/live_backing.log

which is obviously wrong, considering the task had only run for 13 minutes.

The actual reason it failed is because of the 5 minutes without output, which this bug is about. Note the last log line before the one I copy/pasted is 5 minutes earlier.

Now, I also changed the mozprocess code to kill -11 the process that has no output... and guess what? that didn't happen.

So, on top of that, I also set GTests.TEST_PROC_NO_OUTPUT_TIMEOUT to 3600 seconds. What happened next was that when the test failed with a timeout, it said:

[task 2017-11-23T06:15:23.570Z] 06:15:23     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/gtest/rungtests.py', '--xre-path=/builds/worker/workspace/build/application/firefox', '--cwd=/builds/worker/workspace/build/tests/gtest', '--symbols-path=/builds/worker/workspace/build/symbols', '--utility-path=tests/bin', '/builds/worker/workspace/build/application/firefox/firefox']
[task 2017-11-23T06:15:23.583Z] 06:15:23    ERROR - timed out after 1000 seconds of no output

So we're hitting a *different* no-output timeout, and this time, my modification to mozprocess did apply... except I was hoping this would trigger a minidump that rungtests.py would pick up, and /that/ didn't happen.
:gbrown, is this something you wanted to work on?
Flags: needinfo?(gbrown)
I'll see what I can do...
Assignee: nobody → gbrown
Flags: needinfo?(gbrown)
Whiteboard: [PI:November] → [PI:December]
This doesn't modify the meaning or implementation of any of the existing timeout handling; it just adds a new property to specifically query for output timeout.
Attachment #8934671 - Flags: review?(jmaher)
mozharness keeps a copy of mozprocess -- this applies the same changes to that copy.
Attachment #8934672 - Flags: review?(jmaher)
Use the new property to report a different message when the no-output timeout is hit.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c44821d84a23d2b84575dc407448f286dbce124d demonstrates no harm done to mozharness + demonstrates the new timeout message in gtest.
Attachment #8934673 - Flags: review?(jmaher)
Attachment #8934671 - Flags: review?(jmaher) → review+
Attachment #8934672 - Flags: review?(jmaher) → review+
Comment on attachment 8934673 [details] [diff] [review]
query for output timeout in gtest harness

Review of attachment 8934673 [details] [diff] [review]:
-----------------------------------------------------------------

this will make the error more specific
Attachment #8934673 - Flags: review?(jmaher) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f1af834586d7
Differentiate between timeout and output timeout in mozprocess (mozharness copy of mozprocess); r=jmaher
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e9235dc3a18
Differentiate between timeout and output timeout in mozprocess; r=jmaher
https://hg.mozilla.org/integration/mozilla-inbound/rev/2490244fe65d
Report unique message for gtest output timeout; r=jmaher
You need to log in before you can comment on or make changes to this bug.