Closed
Bug 695467
Opened 13 years ago
Closed 13 years ago
Bm builders are hitting mercurial bugs & failing (turning blue)
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 693202
People
(Reporter: dholbert, Assigned: bkero)
References
Details
(Whiteboard: [buildduty][hg])
We seem to be hitting a mercurial bug on Mozilla-Beta and Mozilla-Aurora, triggering tons of (automatically-retriggered) Bm builds, e.g.

https://tbpl.mozilla.org/php/getParsedLog.php?id=6914607&tree=Mozilla-Beta
OS X 10.5.2 Mobile Desktop mozilla-beta build on 2011-10-18 12:19:07 PDT for push 72be1d924c35

https://tbpl.mozilla.org/php/getParsedLog.php?id=6913654&tree=Mozilla-Beta
WINNT 5.2 Mobile Desktop mozilla-beta build on 2011-10-18 11:36:38 PDT for push 52c9c801be77

The error looks like this:
{
argv: ['/usr/local/bin/hg', 'clone', '--verbose', '--noupdate', u'http://hg.mozilla.org/releases/mozilla-beta', 'build']
environment:
  Apple_PubSub_Socket_Render=/tmp/launch-O1SyV8/Render
  CVS_RSH=ssh
  DISPLAY=/tmp/launch-wS10Xl/:0
  HOME=/Users/cltbld
  LOGNAME=cltbld
  PATH=/tools/buildbot/bin:/tools/python/bin:/opt/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/builds/slave/m-beta-osx-mb
  SHELL=/bin/bash
  SSH_AUTH_SOCK=/tmp/launch-bqv65a/Listeners
  TMPDIR=/var/folders/TL/TLg3RrMbFAur2hBCXvCeqk+++TM/-Tmp-/
  USER=cltbld
  __CF_USER_TEXT_ENCODING=0x1F6:0:0
using PTY: False
transaction abort!
requesting all changes
adding changesets
rollback completed
** unknown exception encountered, please report by visiting
** http://mercurial.selenic.com/wiki/BugTracker
** Python 2.5.1 (r251:54863, Jan 17 2008, 19:35:17) [GCC 4.0.1 (Apple Inc. build 5465)]
** Mercurial Distributed SCM (version 1.7.5)
** Extensions loaded: share, rebase, mq, purge
Traceback (most recent call last):
  File "/usr/local/bin/hg", line 38, in <module>
    mercurial.dispatch.run()
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 16, in run
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 36, in dispatch
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 58, in _runcatch
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 593, in _dispatch
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 401, in runcommand
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 644, in _runcommand
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 598, in checkargs
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/dispatch.py", line 591, in <lambda>
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/util.py", line 426, in check
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/commands.py", line 736, in clone
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/hg.py", line 337, in clone
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/localrepo.py", line 1886, in clone
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/localrepo.py", line 1295, in pull
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/localrepo.py", line 1692, in addchangegroup
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/revlog.py", line 1381, in addgroup
  File "tools/mercurial-1.7.5/lib/python2.5/site-packages/mercurial/revlog.py", line 1220, in _addrevision
mpatch.mpatchError: patch cannot be decoded
elapsedTime=1044.706320
program finished with exit code 1
}
Comment 1•13 years ago
cc-ing bkero because he upgraded varnish in bug 693202 this morning
Priority: -- → P3
Whiteboard: [buildduty][hg]
Reporter
Comment 2•13 years ago
The problem appears to have started today.

The last-good push to Mozilla-Beta was last Thursday:
https://tbpl.mozilla.org/?tree=Mozilla-Beta&rev=522217082f0d
The first-bad push was this morning at 11:00 AM:
https://tbpl.mozilla.org/?tree=Mozilla-Beta&rev=df9841857c9c
(nothing else was pushed between Thursday and today)

On Mozilla-Aurora, the last-good push appears to be this morning at 11:11 AM:
https://tbpl.mozilla.org/?tree=Mozilla-Aurora&rev=4754469691db
and the first-bad push was today at 12:16 PM:
https://tbpl.mozilla.org/?tree=Mozilla-Aurora&rev=54a04805efe1
Comment 3•13 years ago
bkero: are there scripts involved here that need to be updated, like those for comm-beta this morning?
Assignee
Comment 4•13 years ago
coop: I'm wondering if the scripts I updated were the same as the comm-beta ones. Is it possible to rerun this job to see if this problem was fixed with the scripts that I updated for comm-beta?
Comment 5•13 years ago
They are continually rerunning themselves, see https://tbpl.mozilla.org/?tree=Mozilla-Beta&jobname=OS%20X%2010.5.2%20Mobile%20Desktop%20mozilla-beta%20build
Comment 6•13 years ago
So, these mobile desktop builders (Bm) do a regular 'hg clone .../releases/mozilla-beta' instead of using our hgtool.py, which uses hg share where it can. Consequently they cause more traffic than the other builds (B), and the constant retrying could lead to a situation where you never get out of a broken state. The slaves having issues are located in SJC1, so the traffic is intra-colo.

We know we changed varnish this morning, and we're consistently getting
mpatch.mpatchError: patch cannot be decoded

Can we try dumping all the pages for mozilla-beta and mozilla-aurora in the varnish cache?
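The retry concern in this comment could be eased on the client side by spacing out attempts rather than retriggering immediately. A minimal sketch, assuming a hypothetical wrapper named `clone_with_backoff` (it is not part of hgtool.py or buildbot) that runs the same `hg clone` arguments shown in comment 0 and gives up after a fixed number of tries. The delay values are illustrative only.

```python
import subprocess
import time


def clone_with_backoff(repo_url, dest, attempts=5, base_delay=30.0,
                       max_delay=600.0, run=subprocess.call, sleep=time.sleep):
    """Run 'hg clone' until it succeeds or the attempts are exhausted.

    Hypothetical helper for illustration. `run` and `sleep` are injectable
    so the retry logic can be exercised without a network or real delays.
    Returns the number of attempts used, or raises RuntimeError.
    """
    cmd = ["hg", "clone", "--verbose", "--noupdate", repo_url, dest]
    for attempt in range(1, attempts + 1):
        if run(cmd) == 0:
            return attempt
        if attempt < attempts:
            # Double the wait after each failure, capped at max_delay,
            # so a struggling server is not hit again straight away.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    raise RuntimeError("hg clone failed after %d attempts" % attempts)
```

A real wrapper would probably also want to clear any partially created destination directory between attempts; that step is left out here.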
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Updated•13 years ago
Assignee: server-ops-releng → bkero
Reporter
Comment 7•13 years ago
This is happening on the main mozilla-central and mozilla-inbound trees too. Removing specific mention of Mozilla-Beta from summary.
Summary: Bm builders on Mozilla-Beta are hitting mercurial bugs & failing (turning blue) → Bm builders are hitting mercurial bugs & failing (turning blue)
Assignee
Comment 8•13 years ago
I've dumped all of the varnish cache to see if that helps resolve this problem. I have been attempting to replicate the issue. I've done a duplicate clone on a separate varnish instance (on an unrelated VM) and did not observe the issue. At this point I think the check might be related to how the cache is expired. I'll be investigating that.
Component: Server Operations: RelEng → Release Engineering
Comment 9•13 years ago
Ok, thanks. It would be good to know that the cache eviction on a push is working with the new version of varnish.
Comment 10•13 years ago
This is still happening. See https://tbpl.mozilla.org/?noignore=1&tree=Mozilla-Beta&rev=6120192ea12e

The first build that failed on this branch showed

/usr/local/bin/hg clone --verbose --noupdate http://hg.mozilla.org/releases/mozilla-beta build
<snip>
requesting all changes
abort: HTTP Error 503: Service Unavailable
elapsedTime=930.644596
program finished with exit code 255

The following builds show the traceback in comment 0.
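Since the same push is failing in more than one way, it may help triage to label each failed log by its signature. A minimal sketch, assuming a hypothetical helper `classify_clone_failure`; the label strings are invented for illustration, and the only signatures matched are the two strings quoted in this bug so far.

```python
def classify_clone_failure(log_text):
    """Return a short label for a failed 'hg clone' log.

    Hypothetical helper: the labels are made up for this sketch, and the
    signatures are the literal error strings quoted in this bug.
    """
    signatures = [
        ("HTTP Error 503", "server-unavailable"),
        ("mpatch.mpatchError", "patch-decode-error"),
    ]
    for needle, label in signatures:
        if needle in log_text:
            return label
    return "unknown"
```

Counting labels across the retriggered builds would show whether the 503s and the decode errors come in a consistent order, as this comment suggests.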
Comment 11•13 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=6925363&tree=Mozilla-Aurora (minus some of the noise)

/usr/local/bin/hg clone --verbose --noupdate http://hg.mozilla.org/releases/mozilla-aurora build
requesting all changes
adding changesets
adding manifests
adding file changes
added 58555 changesets with 0 changes to 0 files (+27 heads)
elapsedTime=208.757507
/usr/local/bin/hg identify --num --branch
-1 default
/usr/local/bin/hg update --clean --repository build --rev 0cb1870e32d2d63b380f48bd30e1f8e281dbd5ec
abort: unknown revision '0cb1870e32d2d63b380f48bd30e1f8e281dbd5ec'!
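The clone in this log exits successfully even though it reports 58555 changesets with 0 changes to 0 files, so the failure only surfaces at the later `hg update`. A minimal sketch of catching that earlier by parsing the summary line; `clone_looks_truncated` is a hypothetical name, and the heuristic assumes a non-empty repository should always report at least one changed file.

```python
import re

# Matches the summary line hg prints after pulling a changegroup, e.g.
# "added 58555 changesets with 0 changes to 0 files (+27 heads)".
SUMMARY_RE = re.compile(
    r"added (\d+) changesets with (\d+) changes to (\d+) files")


def clone_looks_truncated(hg_output):
    """Return True if hg reported changesets but no file changes at all.

    Heuristic only (an assumption of this sketch, not an hg feature).
    """
    match = SUMMARY_RE.search(hg_output)
    if match is None:
        return False  # no summary line, so nothing to judge
    changesets, changes, files = (int(g) for g in match.groups())
    return changesets > 0 and (changes == 0 or files == 0)
```

A build step could treat a True result as a failed clone and retry, instead of continuing to an update that cannot find the requested revision.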
Comment 12•13 years ago
... and that was the 49th attempt at building for that push :)
Comment 13•13 years ago
(In reply to Phil Ringnalda (:philor) from comment #12)
> ... and that was the 49th attempt at building for that push :)

The darwin9 clones are succeeding now after downgrading varnish. This build should finally persevere.
Comment 14•13 years ago
I'm going to dup this, since it's clearly related to / a symptom of changes in bug 693202.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Updated•11 years ago
Product: mozilla.org → Release Engineering