Stop appending redundant "buildbot.slave.commands.TimeoutError" string

RESOLVED FIXED

Status

Release Engineering
General Automation
P3
normal
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: emorley, Assigned: emorley)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

Normally buildbot timeouts occur as:
{
23490 INFO TEST-PASS | /tests/content/base/test/test_bug473162-2.html | Unexpected first node - [object HTMLSpanElement] should equal [object HTMLSpanElement]
23491 INFO TEST-PASS | /tests/content/base/test/test_bug473162-2.html | Unexpected second node - [object HTMLSpanElement] shou
command timed out: 2400 seconds without output, killing pid 20841
process killed by signal 9
program finished with exit code -1
elapsedTime=2698.630792
TinderboxPrint: mochitest-plain<br/><em class="testfail">T-FAIL</em>
buildbot.slave.commands.TimeoutError: command timed out: 2400 seconds without output, killing pid 20841
TinderboxPrint: mochitest-plain<br/><em class="testfail">timeout</em>
========= Finished 'python mochitest/runtestsremote.py ...' warnings (results: 1, elapsed: 44 mins, 59 secs) (at 2012-09-23 20:11:31.546325) =========
}
(https://tbpl.mozilla.org/php/getParsedLog.php?id=15470170&tree=Firefox#error0)

However, with mock (so I presume due to mozharness?), we get just:
{
localhost.localdomain - - [24/Sep/2012 08:04:57] "GET /safebrowsing-dummy/newkey HTTP/1.1" 404 -
--DOMWINDOW == 15 (0x39c3630) [serial = 15] [outer = 0x398c1e0] [url = about:blank]
--DOMWINDOW == 14 (0x2c36a80) [serial = 9] [outer = 0x2aab5e0] [url = about:blank]
--DOMWINDOW == 13 (0x3074f30) [serial = 12] [outer = 0x2aab5e0] [url = about:blank]
--DOMWINDOW == 12 (0x106c5d0) [serial = 2] [outer = 0x13c9700] [url = about:blank]
localhost.localdomain - - [24/Sep/2012 08:05:01] "GET /sample.html HTTP/1.1" 200 -
++DOMWINDOW == 13 (0x1358910) [serial = 17] [outer = 0x398c1e0]

command timed out: 3600 seconds without output, attempting to kill
program finished with exit code 247
elapsedTime=3609.427740
========= Finished alive test failed (results: 2, elapsed: 1 hrs, 9 secs) (at 2012-09-24 06:05:01.341018) =========
}
(this example from https://tbpl.mozilla.org/php/getParsedLog.php?id=15479030&tree=Firefox)

TBPL's parser only recognises "buildbot.slave.commands.TimeoutError:", so misses the latter case.
Blocks: 793016

Comment 1

5 years ago
The former case looks like mozharness (with the timestamp + INFO on every line); the latter looks like non-mozharness.

Comment 2

5 years ago
n/m, neither is mozharness.
Also:
https://tbpl.mozilla.org/php/getParsedLog.php?id=15846492&tree=Thunderbird-Trunk

{
WARNING: NS_ENSURE_TRUE(wrapper) failed: file ../../../../mozilla/dom/base/nsJSUtils.cpp, line 81
WARNING: NS_ENSURE_TRUE(wrapper) failed: file ../../../../mozilla/dom/base/nsJSUtils.cpp, line 81

command timed out: 1200 seconds without output, attempting to kill
program finished with exit code 247
elapsedTime=1227.483098
========= Finished 'mock_mozilla -v ...' failed (results: 2, elapsed: 20 mins, 27 secs) (at 2012-10-05 08:38:49.536995) =========
}

The mock changes have regressed TBPL visibility for these failures; please may someone take a look? (I have no idea where to start with mock).
Keywords: regression
Can we update TBPL to look for the new error message? I don't think updating mock to print out error messages that look like they come from buildbot is a good idea.

Updated

5 years ago
Priority: -- → P3
(In reply to Chris AtLee [:catlee] from comment #4)
> Can we update TBPL to look for the new error message? I don't think updating
> mock to print out error messages that look like they come from buildbot is a
> good idea.

We could, but it would mean we match twice against the buildbot logs, due to:
https://hg.mozilla.org/build/buildbotcustom/file/tip/steps/unittest.py#l373
...and I believe we won't be able to add "^command timed out:" to TBPL (to avoid the double match), since once we switch to mozharness, it prefixes the lines.

However, I can't see anything else that uses the buildbot prefixed line - so would you be ok with removing:
   373  'buildbot.slave.commands.TimeoutError: ' +
   374  line +

...and then we can just get TBPL to match against "command timed out: " rather than "buildbot.slave.commands.TimeoutError".
Assignee: nobody → bmo
Status: NEW → ASSIGNED
Flags: needinfo?(catlee)
catlee, ping on the question in comment 5.
Ok, presuming the answer to comment 5 is yes...

Bug 813039 will make TBPL match against "comment timed out:" directly, which will cover both the buildbot and mock cases.

We can then cleanup:
https://hg.mozilla.org/build/buildbotcustom/file/tip/steps/unittest.py#l373
No longer blocks: 778688
Depends on: 813039
Summary: mock's "command timed out: 3600 seconds without output" failures not accompanied by "buildbot.slave.commands.TimeoutError" → Stop appending redundant "buildbot.slave.commands.TimeoutError" string
Created attachment 683070 [details] [diff] [review]
Patch v1

(Line was originally added by bug 438324)
Attachment #683070 - Flags: review?(catlee)
Attachment #683070 - Flags: review?(catlee) → review?(bhearsum)
Flags: needinfo?(catlee)
Whiteboard: [sheriff-want]
Comment on attachment 683070 [details] [diff] [review]
Patch v1

Review of attachment 683070 [details] [diff] [review]:
-----------------------------------------------------------------

Shouldn't we take out the TinderboxPrint at the same time?
Attachment #683070 - Flags: review?(bhearsum) → review+
Phil, do we hit the "TinderboxPrint: <testsuite-name> \n timeout" case any more, or just the s/timeout/T-FAIL/ one elsewhere?
(In reply to Ben Hearsum [:bhearsum] from comment #9)
> Shouldn't we take out the TinderboxPrint at the same time?

Philor says that we are still using this. I trust him more than me on this! hehe :-)

https://hg.mozilla.org/build/buildbotcustom/rev/afb1205961f3
Keywords: regression
In production.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.