Closed Bug 907693 Opened 11 years ago Closed 10 years ago

B2G builds intermittently fail with "fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle", "HTTP error 500", "repo init failed;"

Product/Component: Developer Services :: General
Type: task
Hardware/OS: ARM, Gonk (Firefox OS)
Priority: Not set
Severity: critical
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: emorley
Assignee: fubar
Keywords: intermittent-failure

b2g_mozilla-central_inari_dep on 2013-08-21 05:10:14 PDT for push ba6c02fc1fe6

slave: bld-linux64-ec2-326

https://tbpl.mozilla.org/php/getParsedLog.php?id=26815020&tree=Mozilla-Central

{
05:12:48     INFO - Changing directory to /builds/slave/b2g_m-cen_inari_dep-0000000000.
05:12:48     INFO - rmtree: /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - mkdir: /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - Running command: ['git', 'init'] in /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - Copy/paste: git init
05:12:48     INFO -  Initialized empty Git repository in /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest/.git/
05:12:48     INFO - Return code: 0
05:12:48     INFO - Running command: ['git', 'add', '/builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest/inari.xml'] in /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - Copy/paste: git add /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest/inari.xml
05:12:48     INFO - Return code: 0
05:12:48     INFO - Running command: ['git', 'commit', '-m', 'manifest'] in /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - Copy/paste: git commit -m manifest
05:12:48     INFO -  [master (root-commit) df0bb77] manifest
05:12:48     INFO -   1 file changed, 115 insertions(+)
05:12:48     INFO -   create mode 100644 inari.xml
05:12:48     INFO - Return code: 0
05:12:48     INFO - Running command: ['git', 'branch', '-m', u'master'] in /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest
05:12:48     INFO - Copy/paste: git branch -m master
05:12:48     INFO - Return code: 0
05:12:48     INFO - rmtree: /builds/git-shared/repo/.repo
05:12:48     INFO - retry: Calling <function rmtree at 0x1760d70> with args: ('/builds/git-shared/repo/.repo',), kwargs: {}, attempt #1
05:12:49     INFO - Running command: ['/builds/slave/b2g_m-cen_inari_dep-0000000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '-q', '--mirror', '-u', '/builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest', '-m', 'inari.xml', '-b', u'master'] in /builds/git-shared/repo
05:12:49     INFO - Copy/paste: /builds/slave/b2g_m-cen_inari_dep-0000000000/build/repo init --repo-url https://git.mozilla.org/external/google/gerrit/git-repo.git -q --mirror -u /builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest -m inari.xml -b master
05:12:49     INFO -  fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle
05:12:49     INFO -  fatal: HTTP error 500
05:12:49     INFO -  fatal: repo init failed; run without --quiet to see why
05:12:49    ERROR - Return code: 1
05:12:49    FATAL - Halting on failure while running ['/builds/slave/b2g_m-cen_inari_dep-0000000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '-q', '--mirror', '-u', '/builds/slave/b2g_m-cen_inari_dep-0000000000/build/tmp_manifest', '-m', 'inari.xml', '-b', u'master']
05:12:49    FATAL - Running post_fatal callback...
05:12:49    FATAL - Exiting 1
program finished with exit code 1
elapsedTime=149.687067
========= Finished 'scripts/scripts/b2g_build.py --target ...' failed (results: 2, elapsed: 2 mins, 29 secs) (at 2013-08-21 05:12:49.409248) =========
}

also:
https://tbpl.mozilla.org/php/getParsedLog.php?id=26814957&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=26815602&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=26815173&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=26815280&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=26815241&tree=Mozilla-Central
All trunk + b2g branch trees closed.
IT, I'm intermittently getting what look like Zeus ISE 500s when trying to access http://git.mozilla.org/?p=external/google/gerrit/git-repo.git;a=summary.
Assignee: nobody → server-ops
Component: Buildduty → Server Operations
Product: Release Engineering → mozilla.org
QA Contact: armenzg → shyam
Assignee: server-ops → ludovic
Fubar is actually looking into it.
Assignee: ludovic → klibby
I am unable to reproduce the ISEs on http://git.mozilla.org/?p=external/google/gerrit/git-repo.git;a=summary; it wfm.

However, the URL in the subject (https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle) does NOT work, even locally on the git server. Changing it to "git.m.o/?p=external/..." DOES work, though.

I am unaware of any recent changes to the gitweb configuration, but I presume you've been using that URL all along, yes?
hwine, is there anything that you are aware of that could have deleted (or missed syncing) the clone.bundle?
That is https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle

(In reply to Kendall Libby [:fubar] from comment #4)
> I am unaware of any recent changes to the gitweb configuration, but I
> presume you've been using that URL all along, yes?

Yes, we have.
Flags: needinfo?(hwine)
I'm not sure this is an IT issue; it may be an external repos issue instead.

https://tbpl.mozilla.org/?jobname=b2g_mozilla-central_inari_dep
I was mistaken in my previous comment.
It seems to be IT related.

CalledProcessError: Command '['git', 'fetch', '-q', 'https://git.mozilla.org/b2g/B2G.git', '+refs/heads/*:refs/remotes/origin/*']' returned non-zero exit status 128
Return code: 1
CalledProcessError: Command '['git', 'fetch', '-q', 'https://git.mozilla.org/b2g/B2G.git', '+refs/heads/*:refs/remotes/origin/*']' returned non-zero exit status 128
Return code: 1
caught OS error 2: No such file or directory while running ['./gonk-misc/add-revision.py', '-o', 'sources.xml', '--force', '.repo/manifest.xml']
Running post_fatal callback...
Exiting -1

07:29:33     INFO -  error: The requested URL returned error: 500 while accessing https://git.mozilla.org/external/caf/platform/external/mksh/info/refs

caught OS error 2: No such file or directory while running ['./gonk-misc/add-revision.py', '-o', 'sources.xml', '--force', '.repo/manifest.xml']
Flags: needinfo?(hwine)
Summary: B2G builds failing with "fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle", "HTTP error 500", "repo init failed;" → [intermittent] B2G builds failing with "fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle", "HTTP error 500", "repo init failed;"
Whiteboard: [intermittent] Trees are closed.
Zeus was intermittently erroring out the gitweb pool; I've changed the health monitor over from simple-http to the same monitor we use for hg, which has a longer timeout and checks the status code. So far, it looks happier.
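
For illustration only, here is a minimal Python sketch of the kind of check described above: a probe with a longer timeout that also inspects the HTTP status code instead of treating any response as healthy. It is not the actual Zeus monitor. The function name, the 30-second default, and the injectable `opener` parameter are all assumptions made for this example.

```python
import urllib.error
import urllib.request


def check_backend(url, timeout=30.0, expected_status=200,
                  opener=urllib.request.urlopen):
    """Return True only if `url` answers with `expected_status` in time.

    Hypothetical helper: `timeout` and `expected_status` are illustrative
    defaults, not values taken from the real monitor configuration.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == expected_status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx replies, so an ISE 500
        # ends up here and is reported as unhealthy.
        return err.code == expected_status
    except OSError:
        # Covers URLError, connection failures, and socket timeouts.
        return False
```

A plain connect-and-read check would count a fast HTTP 500 as a success; comparing the status code is what lets a monitor notice a backend that answers but is broken.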

Probably unrelated, but possibly an issue for releng - the gitmo/full/path URLs don't appear to work for any repo, while a URL like gitmo/?p=full/path does work. I'm not sure if they should work, or ever have worked. It certainly contributed to confusion on my end. :)
No zeus ISEs in 20+ minutes (i.e. since the monitor change). Calling it fixed. Will review with bkero and see if we want to tweak the monitor any, or make any other changes.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Thank you everyone! :-)

Trees were reopened ~25 mins ago.
This has started happening again in bug 920096.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: [intermittent] B2G builds failing with "fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle", "HTTP error 500", "repo init failed;" → B2G builds intermittently fail with "fatal: Cannot get https://git.mozilla.org/external/google/gerrit/git-repo.git/clone.bundle", "HTTP error 500", "repo init failed;"
Whiteboard: [intermittent] Trees are closed.
Severity: blocker → critical
Can we please get someone working on this?

14:54 RyanVM|sheriffduty: armenzg_buildduty: b2g builds on inbound are all dying like this - https://tbpl.mozilla.org/php/getParsedLog.php?id=28297916&tree=Mozilla-Inbound
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #13)
> Can we please get someone working on this?
> 
> 14:54 RyanVM|sheriffduty: armenzg_buildduty: b2g builds on inbound are all
> dying like this -
> https://tbpl.mozilla.org/php/getParsedLog.php?id=28297916&tree=Mozilla-
> Inbound

Please ignore this comment. It seems that the actual error is slightly different.
Sorry!
No worries! 

This time, it was a process eating 34 GB of virtual memory:

2351     22279 22278 99 18:53 ?        00:37:49 git pack-objects 
--keep-true-parents --honor-pack-keep --non-empty --all --reflog 
--unpacked --incremental --local -q --delta-base-offset 
/var/lib/gitolite3/repositories/users/eakhgari@mozilla.com/mozilla-history-tools.git/objects/pack/.tmp-22263-pack

Talked to ehsan in #releng and changed the config on his repo so that it no longer runs auto-gc.
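
The comment does not say which setting was changed. As a hedged sketch, one common way to stop automatic garbage collection in a single repository is to set git's `gc.auto` option to `0`. The Python helper names and the repository path in the usage note are hypothetical.

```python
import subprocess


def disable_auto_gc_command(git_dir):
    """Build the git command that turns off automatic gc for one repo.

    Assumption: `gc.auto = 0` is the setting used; the bug comment only
    says the repo was configured "to not auto-gc".
    """
    return ["git", "--git-dir=" + git_dir, "config", "gc.auto", "0"]


def disable_auto_gc(git_dir):
    """Run the command; raises CalledProcessError if git exits non-zero."""
    subprocess.run(disable_auto_gc_command(git_dir), check=True)
```

For example, `disable_auto_gc("/path/to/some-repo.git")` would write the setting into that repository's own config file, leaving other repositories on the server untouched.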
Component: Server Operations → WebOps: Source Control
Product: mozilla.org → Infrastructure & Operations
QA Contact: shyam → nmaul
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Component: WebOps: Source Control → General
Product: Infrastructure & Operations → Developer Services