Closed
Bug 457976
Opened 16 years ago
Closed 6 years ago
Brief Log Summary should report failed Python (i.e. Hg) command (returning error code)
Categories
(Webtools Graveyard :: Tinderbox, defect)
Webtools Graveyard
Tinderbox
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: sgautherie, Unassigned)
References
Details
(Whiteboard: [See comment 23])
Attachments
(2 files)
2.38 KB, patch
1.84 KB, patch
Either the box should be green,
or the parser should report what is wrong.
***
This starts with
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1222796962.1222801391.31567.gz
Linux comm-central dep unit test on 2008/09/30 10:49:22
continues with
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1222801084.1222805511.10266.gz
Linux comm-central dep unit test on 2008/09/30 11:58:04
...
and is still there with (current)
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1222817573.1222821894.27234.gz
Linux comm-central dep unit test on 2008/09/30 16:32:53
Comment 1•16 years ago
Looks like leaks, so this is a dupe of bug 445596 IMO.
Comment 2•16 years ago
Ah, beg your pardon. The error is
Executing command: ['hg', 'update', '-r', 'tip', '-R', './mozilla']
abort: data/browser/app/Makefile.in.i@0a2578e045ed: no match found!
Traceback (most recent call last):
File "client.py", line 184, in <module>
do_hg_pull('mozilla', options.mozilla_repo, options.hg, options.mozilla_rev)
File "client.py", line 65, in do_hg_pull
check_call_noisy(cmd)
File "client.py", line 47, in check_call_noisy
check_call(cmd, *args, **kwargs)
File "/tools/python-2.5.1/lib/python2.5/subprocess.py", line 461, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['hg', 'update', '-r', 'tip', '-R', './mozilla']' returned non-zero exit status 255
program finished with exit code 1
Either way, these boxes don't get first-line support from Moco IT, so please file bugs in a more appropriate component. eg, to improve the error parser, that's Webtools::Tinderbox. Or you could go over to Thunderbird::Build for investigation on why this is happening.
(yes, it's a maze of twisty passages, all alike)
Reporter | ||
Comment 3•16 years ago
(In reply to comment #2)
> abort: data/browser/app/Makefile.in.i@0a2578e045ed: no match found!
Arf, I found it for the first red, and missed it for the next(s)... :-(
> to improve the error parser, that's Webtools::Tinderbox.
I'm morphing this bug.
> (yes, it's a maze of twisty passages, all alike)
Thanks for the second pair of eyes !
Assignee: server-ops → nobody
Severity: major → normal
Component: Server Operations: Tinderbox Maintenance → Tinderbox
Product: mozilla.org → Webtools
QA Contact: mrz → tinderbox
Summary: "Linux comm-central dep unit test" is RED, yet Brief Log Summary reports nothing :-( → Brief Log Summary should report failed Hg command (returning error code)
Whiteboard: [See comment 2]
Comment 4•16 years ago
You could argue that this is an example of a python error, and we should catch them in general.
Reporter | ||
Comment 5•16 years ago
Agreed, sure :-)
Summary: Brief Log Summary should report failed Hg command (returning error code) → Brief Log Summary should report failed Python (i.e. Hg) command (returning error code)
Comment 6•16 years ago
This actually is some sort of hg bug, it seems.
killed mozilla/ and forced another build, should go non-red now.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 7•16 years ago
Oh, sorry, just realized you converted a bug I thought was about a buildbot failure to a bug about error reporting. The box is fixed, the error reporting isn't.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 8•16 years ago
(In reply to comment #7)
> Oh, sorry, just realized you converted a bug I thought was about a buildbot
Yes...
> failure to a bug about error reporting. The box is fixed, the error reporting
> isn't.
Good :-)
Status: REOPENED → NEW
Comment 9•16 years ago
For reference, regarding comment #6, this is tracked upstream here:
http://www.selenic.com/mercurial/bts/issue1313
Comment 10•16 years ago
Until the hg bug itself is resolved, wouldn't it make sense to abort builds when client.py fails?
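As an illustration of the "abort when client.py fails" idea, here is a minimal sketch in plain Python. It is not the patch that was checked in to buildbot-configs; `run_steps` is a hypothetical helper that runs a list of commands and stops at the first one that returns a non-zero exit status, using the same `subprocess.check_call` behaviour seen in the traceback in comment 2.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first non-zero exit.

    `steps` is a list of argument lists, e.g. [["hg", "pull"], ["make"]].
    Returns 0 if every command succeeded, otherwise the failing exit status.
    (Hypothetical helper, for illustration only.)
    """
    for cmd in steps:
        try:
            # check_call raises CalledProcessError on a non-zero exit status.
            subprocess.check_call(cmd)
        except subprocess.CalledProcessError as error:
            sys.stderr.write(
                "Error: %r returned non-zero exit status %d\n"
                % (cmd, error.returncode)
            )
            return error.returncode
    return 0
```

With this shape, a failed checkout step prevents the later build steps from running against a broken tree.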
Reporter | ||
Comment 11•16 years ago
(In reply to comment #9)
> http://www.selenic.com/mercurial/bts/issue1313
Fwiw,
that Mercurial bug is about "abort: data/<file>@<rev>: unknown parent!",
this report here is about "abort: data/<file>.i@<rev>: no match found!";
but they may be the same, yes...
(Might be worth adding a comment there, or opening a separate bug.)
Comment 13•16 years ago
I'm not completely convinced that this is a tinderbox parser issue but abort is a generic enough warning term that it could be highlighted.
Reporter | ||
Comment 14•16 years ago
gozer, did you mean to ask for review of your patch ?
***
Are such errors still happening ?
Reporter | ||
Comment 15•16 years ago
Comment on attachment 359244 [details] [diff] [review]
v1.0
See bug 471295 comment 4:
shouldn't we rather check for the more generic |program finished with exit code [^0]| ?
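A rough sketch of what such a generic check could look like, written in Python for illustration only (it is not Tinderbox's parser). The pattern here is a slightly stricter variant of `[^0]`: it accepts any integer exit code other than exactly 0, including negative ones. `failed_exit_codes` is a hypothetical helper name.

```python
import re

# Matches buildbot's trailer line unless the reported code is exactly 0.
NONZERO_EXIT = re.compile(r"^program finished with exit code (?!0$)(-?\d+)$")


def failed_exit_codes(log_text):
    """Return the non-zero exit codes reported in a log, in order."""
    codes = []
    for line in log_text.splitlines():
        match = NONZERO_EXIT.match(line.strip())
        if match:
            codes.append(int(match.group(1)))
    return codes
```

As noted in the discussion, the matched line only says that something failed; the meaningful error is usually a few lines earlier in the log.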
Comment 16•16 years ago
Comment on attachment 341286 [details] [diff] [review]
[checked in] Abort build if python client.py breaks
No review was required, and I checked in that patch quite a while ago.
Attachment #341286 -
Attachment description: Abort build if python client.py breaks → [checked in] Abort build if python client.py breaks
Reporter | ||
Comment 17•16 years ago
Comment on attachment 341286 [details] [diff] [review]
[checked in] Abort build if python client.py breaks
(In reply to comment #16)
> No review was required, and I checked in that patch quite a while ago.
http://hg.mozilla.org/build/buildbot-configs/rev/e984bfabaae2
Comment 18•16 years ago
(In reply to comment #15)
> (From update of attachment 359244 [details] [diff] [review])
> See bug 471295 comment 4:
> shouldn't we rather check for the more generic |program finished with exit code
> [^0]| ?
I considered it but that string appears to be very buildbot specific not a generic python error check. You could just as easily argue that buildbot should use a generic error string that could be caught by the existing generic parsers.
Reporter | ||
Comment 19•16 years ago
(In reply to comment #18)
> I considered it but that string appears to be very buildbot specific not a
> generic python error check.
From bug 471295 comment 1 "To find problems you can search for non-zero exit codes", I thought that catching the buildbot line would ensure not to miss any error ... with the drawback that the caught line is not meaningful by itself.
If you think it's better to catch the initial/meaningful (python, or other) error line, then your patch should solve bug 471295 and this bug cases ... Then we'll see later if there would be other remaining cases to catch too.
> You could just as easily argue that buildbot should use
> a generic error string that could be caught by the existing generic parsers.
Could be, if there is one. (I don't know.)
Reporter | ||
Comment 20•16 years ago
(In reply to comment #19)
> we'll see later if there would be other remaining cases to catch too.
A different 'bug 471295 like' case:
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1233236627.1233242943.13621.gz
Win2k3 comm-central check on 2009/01/29 05:43:47
{
...
make[7]: Entering directory `/d/buildbot/win32-comm-central-check/build/objdir-tb/mailnews/extensions/smime/build'
command timed out: 2400 seconds without output
program finished with exit code 1
...
}
Need to check for "command timed out: " too...
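For illustration, a check for that prefix could be as small as the following Python sketch. `find_timeouts` is a hypothetical helper, not part of Tinderbox or buildbot; it assumes the timeout message always starts at the beginning of a line, as in the log excerpt above.

```python
TIMEOUT_PREFIX = "command timed out: "


def find_timeouts(log_text):
    """Return (line_number, line) pairs for lines reporting a timeout."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if line.startswith(TIMEOUT_PREFIX)
    ]
```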
Comment 21•16 years ago
(In reply to comment #19)
> > You could just as easily argue that buildbot should use
> > a generic error string that could be caught by the existing generic parsers.
>
> Could be, if there is one. (I don't know.)
Um, how about a line that starts with 'Error' ? That's pretty generic and the parsers already support it.
(In reply to comment #20)
> command timed out: 2400 seconds without output
> program finished with exit code 1
> ...
> }
>
> Need to check for "command timed out: " too...
Which leads back to the argument that buildbot should be fixed to report each of these issues as an error that can be parsed by tinderbox.
Barring that, someone needs to generate an authoritative list of python errors that should be caught. You shouldn't have to scrape log files to generate that list.
Reporter | ||
Comment 22•16 years ago
Another (ongoing) case:
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.1/1234871310.1234871469.22366.gz
OS X 10.5.2 mozilla-1.9.1 leak test build on 2009/02/17 03:48:30
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.1/1234878510.1234878658.17605.gz
OS X 10.5.2 mozilla-1.9.1 leak test build on 2009/02/17 05:48:30
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.1/1234886017.1234886178.3361.gz
OS X 10.5.2 mozilla-1.9.1 leak test build on 2009/02/17 07:53:37
[...]
======== BuildStep started ========
alive test failed
=== Output ===
python leaktest.py -- -register
[...]
Traceback (most recent call last):
[...]
socket.error: (48, 'Address already in use')
program finished with exit code 1
}
Comment 23•16 years ago
Still not helpful. Can you provide a pointer to an authoritative method of highlighting python errors and not just snippets of build logs? If not, then this bug is going to morph back to buildbot so that it can use an existing standard error string.
Reporter | ||
Comment 24•16 years ago
(In reply to comment #23)
> Can you provide a pointer to an authoritative method of
> highlighting python errors
I don't know about that: you're the one suggesting this approach.
> this bug is going to morph back to buildbot so that it can use an existing
> standard error string.
***
Another comment 20 -like example...
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1237123771.1237129206.15061.gz
WINNT 5.2 mozilla-central unit test on 2009/03/15 06:29:31
======== BuildStep started ========
'make buildsymbols' failed
=== Output ===
[...]
command timed out: 1200 seconds without output
program finished with exit code 1
}
Comment 25•16 years ago
(In reply to comment #24)
> (In reply to comment #23)
> > Can you provide a pointer to an authoritative method of
> > highlighting python errors
>
> I don't know about that: you're the one suggesting this approach.
Back to buildbot maintainers to use a standard error string.
Assignee: cls → server-ops
Status: ASSIGNED → NEW
Component: Tinderbox → Server Operations: Tinderbox Maintenance
Product: Webtools → mozilla.org
QA Contact: tinderbox → mrz
Whiteboard: [See comment 2] → [See comment 23]
Attachment #359244 -
Flags: review?(reed)
Comment 26•16 years ago
Isn't this more a build & release thingy than server-ops tinderbox maintenance?
Reporter | ||
Comment 27•16 years ago
Another example:
{
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1237471648.1237476641.29184.gz&fulltext=1
Linux comm-central dep unit test on 2009/03/19 07:07:28
client.py checkout failed
...
abort: HTTP Error 500: Internal Server Error
...
subprocess.CalledProcessError: Command '['hg', 'pull', '-R', './mozilla', '-r', 'tip']' returned non-zero exit status 255
program finished with exit code 1
}
Updated•16 years ago
Assignee: server-ops → nobody
Component: Server Operations: Tinderbox Maintenance → Release Engineering
QA Contact: mrz → release
Comment 28•16 years ago
I think the parser should catch terms like ^abort for hg and ^Traceback for python, with the usual following lines and a few lines of pre-context. Those are canonical triggers for errors AFAICT. Both are big enough projects that they should be treated like gmake and cvs, in the sense that tinderbox has to learn how they report errors.
Buildbot errors (in particular the buildstep exit codes) I don't think we want to catch, because some of them are allowed to be non-fatal on non-zero exit. We should properly handle specific error conditions in separate bugs, eg bug 479308 and elsewhere where we massage summaries and status. And in the long term, we can use the new exception result from bug 476656 to try to separate genuine code errors, which developers should fix, from infrastructure errors that they have to hassle RelEng about.
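The trigger-plus-context idea can be sketched as follows. This is an illustrative Python version under stated assumptions, not Tinderbox's actual error parser: `error_snippets` is a hypothetical helper, and the `before`/`after` window sizes are arbitrary defaults.

```python
import re

# Lines starting with these words are treated as error triggers:
# "abort" for hg, "Traceback" for python.
TRIGGER = re.compile(r"^(abort|Traceback)")


def error_snippets(log_text, before=2, after=3):
    """Return one list of lines per trigger, with surrounding context.

    Each snippet holds up to `before` lines of pre-context, the trigger
    line itself, and up to `after` following lines.
    """
    lines = log_text.splitlines()
    snippets = []
    for index, line in enumerate(lines):
        if TRIGGER.match(line):
            start = max(0, index - before)
            end = min(len(lines), index + after + 1)
            snippets.append(lines[start:end])
    return snippets
```

Because an hg `abort:` is often followed by a Python `Traceback`, overlapping snippets are possible; a real summary would probably want to merge them.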
Component: Release Engineering → Tinderbox
Product: mozilla.org → Webtools
QA Contact: release → tinderbox
Reporter | ||
Comment 29•16 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1242074648.1242081983.32217.gz&fulltext=1
WINNT 5.2 mozilla-central unit test on 2009/05/11 13:44:08
hg clone http://hg.mozilla.org/build/buildbot-configs mozconfigs
...
abort: data/thunderbird/buildbot.tac.i@c2f391dbc109: no match found!
program finished with exit code -1
Reporter | ||
Comment 30•16 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1242482310.1242486518.20445.gz
WINNT 5.2 comm-central unit test on 2009/05/16 06:58:30
Reporter | ||
Comment 31•16 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1242834544.1242835340.24258.gz
MacOSX 10.4 comm-central check on 2009/05/20 08:49:04
/opt/local/bin/hg clone https://hg.mozilla.org/comm-central /Volumes/Build/macosx-comm-central-check/build
...
abort: Python support for SSL and HTTPS is not installed
program finished with exit code 255
Severity: normal → major
Reporter | ||
Updated•16 years ago
OS: Linux → All
Hardware: x86 → All
Reporter | ||
Comment 32•16 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244839230.1244844777.2324.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/06/12 13:40:30
======== BuildStep started ========
sendchange to localhost:9010 failed
=== Output ===
master: localhost:9010
branch: mozilla-1.9.1-macosx-unittest
revision: None
comments:
user: sendchange-unittest
files: ['http://stage.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-1.9.1-macosx-unittest/1244839363/firefox-3.5pre.en-US.mac.dmg']
[Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.TimeoutError'>: User timeout caused connection failure.
]
=== Output ended ===
======== BuildStep ended ========
Reporter | ||
Comment 33•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1250638176.1250644942.12196.gz&fulltext=1
WINNT 5.2 comm-1.9.1 unit test on 2009/08/18 16:29:36
{
======== BuildStep started ========
clean old builds failed
=== Output ===
[...]
python: can't open file 'tools/buildfarm/maintenance/purge_builds.py': [Errno 2] No such file or directory
program finished with exit code 2
}
Reporter | ||
Comment 34•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1250697214.1250699429.7114.gz
Linux mozilla-central test everythingelse on 2009/08/19 08:53:34
{
destination directory: tools
[...]
abort: consistency error adding group!
program finished with exit code 255
}
Reporter | ||
Comment 35•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird3.0/1250726417.1250729616.23078.gz
Linux comm-1.9.1 build on 2009/08/19 17:00:17
{
======== BuildStep started ========
upload package(s) to stage.mozilla.org failed
=== Output ===
[...]
ssh: connect to host stage.mozilla.org port 22: Connection timed out
program finished with exit code 255
}
Assignee | ||
Updated•11 years ago
Product: Webtools → Webtools Graveyard
Comment 36•6 years ago
Tinderbox isn't maintained anymore. Closing.
Status: NEW → RESOLVED
Closed: 16 years ago → 6 years ago
Resolution: --- → WONTFIX