B2G builds failing with | error.GitError: manifests rev-list (u'^d3a898d0ef4b0115c579a095347ce6a9498430ff', 'HEAD', '--'): fatal: bad object HEAD

RESOLVED FIXED

Status

Developer Services
General
--
critical
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: KWierso, Assigned: bkero)

(Reporter)

Description

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28313960&tree=B2g-Inbound
slave: bld-linux64-ec2-375


16:52:20     INFO - Running command: ['/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '-q', '--reference', '/builds/git-shared/repo', '-u', '/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/tmp_manifest', '-m', 'generic.xml', '-b', u'master'] in /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build
16:52:20     INFO - Copy/paste: /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/repo init --repo-url https://git.mozilla.org/external/google/gerrit/git-repo.git -q --reference /builds/git-shared/repo -u /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/tmp_manifest -m generic.xml -b master
16:52:21     INFO -  ... A new repo command ( 1.20) is available.
16:52:21     INFO -  ... You should upgrade soon:
16:52:21     INFO -      cp /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/repo /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/repo
16:52:21     INFO -  error: refs/heads/default does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/m/master does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/origin/master does not point to a valid object!
16:52:21     INFO -  error: refs/heads/default does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/m/master does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/origin/master does not point to a valid object!
16:52:21     INFO -  error: refs/heads/default does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/m/master does not point to a valid object!
16:52:21     INFO -  error: refs/remotes/origin/master does not point to a valid object!
16:52:21     INFO -  From /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/tmp_manifest
16:52:21     INFO -   * [new branch]      master     -> origin/master
16:52:21     INFO -  Traceback (most recent call last):
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/main.py", line 418, in <module>
16:52:21     INFO -      _Main(sys.argv[1:])
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/main.py", line 394, in _Main
16:52:21     INFO -      result = repo._Run(argv) or 0
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/main.py", line 142, in _Run
16:52:21     INFO -      result = cmd.Execute(copts, cargs)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/subcmds/init.py", line 369, in Execute
16:52:21     INFO -      self._SyncManifest(opt)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/subcmds/init.py", line 222, in _SyncManifest
16:52:21     INFO -      m.MetaBranchSwitch(opt.manifest_branch)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/project.py", line 2419, in MetaBranchSwitch
16:52:21     INFO -      self.Sync_LocalHalf(syncbuf)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/project.py", line 1106, in Sync_LocalHalf
16:52:21     INFO -      lost = self._revlist(not_rev(revid), HEAD)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/project.py", line 2069, in _revlist
16:52:21     INFO -      return self.work_git.rev_list(*a, **kw)
16:52:21     INFO -    File "/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/.repo/repo/project.py", line 2222, in rev_list
16:52:21     INFO -      p.stderr))
16:52:21     INFO -  error.GitError: manifests rev-list (u'^d3a898d0ef4b0115c579a095347ce6a9498430ff', 'HEAD', '--'): fatal: bad object HEAD
16:52:21    ERROR - Return code: 1
16:52:21    FATAL - Halting on failure while running ['/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '-q', '--reference', '/builds/git-shared/repo', '-u', '/builds/slave/b2g_b2g-in_emu-jb-d_dep-000000/build/tmp_manifest', '-m', 'generic.xml', '-b', u'master']
16:52:21    FATAL - Running post_fatal callback...
16:52:21    FATAL - Exiting 1
program finished with exit code 1
I checked the recent build history and this was the only b2g compile job to fail in that period. Will investigate the particular slave, but the previous job it did was the same build and that was green.
Assignee: nobody → nthomas
Severity: blocker → normal
(Reporter)

Comment 2

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28319230&tree=Mozilla-Inbound
using slave: bld-linux64-ec2-191
The log in comment #0 was bld-linux64-ec2-375 doing b2g_b2g-inbound_emulator-jb-debug_dep. There was a clobber at 08:53 PT today. The first build after that, at 14:17, was green. It built again at 16:47 and failed.

Comment #2 is similar - bld-linux64-ec2-191 doing b2g_mozilla-inbound_unagi_dep. A clobber was set 13:22 today. There was a green build at 15:11, then a failure at 19:06.
I've run 'git fsck' on bld-linux64-ec2-191 in
  /builds/slave/b2g_m-in_unagi_dep-00000000000/build
  /builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest
and it didn't report any problems. I also tried other options such as --full, --no-reflogs and --strict, with nothing showing up.

catlee, do you know how to debug this?
Assignee: nthomas → nobody
Flags: needinfo?(catlee)
This seems to be happening more and more often, on nearly every second build on b2g-inbound, so I'm raising the severity of this bug.

example failure logs
https://tbpl.mozilla.org/php/getParsedLog.php?id=28335883&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28335838&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28335692&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28332495&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28332438&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28332973&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28331508&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28330153&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28329566&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28322548&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28322137&tree=B2g-Inbound
Severity: normal → critical
https://tbpl.mozilla.org/php/getParsedLog.php?id=28342220&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28343424&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28343359&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28343391&tree=B2g-Inbound
Any updates here? We're still hitting this frequently...

Comment 8

4 years ago
14:00 Callek: CC: ehsan gps (since they likely know more about git internals than some of us)
14:00 aki: ping IT -- https://git.mozilla.org/external/google/gerrit/git-repo.git is 404ing again
14:01 Callek: fubar: ^
14:01 Callek does all-the-pings
14:02 rail: but you can clone it
14:03 rail: https://git.mozilla.org/?p=external/google/gerrit/git-repo.git;a=summary works
14:03 aki: yup, that same thing happened when ehsan's repo gc was killing git.m.o
14:04 gps: Callek: you might have a corrupt repo there
14:04 Callek: ugh :/
14:04 fubar: gitmo seems fine
14:05 gps: cat .git/HEAD
14:05 gps: it sounds like HEAD is just pointing to a bad / non-existing ref?
14:05 gps: you can fix that by doing a git reset --hard
14:06 ehsan: what's the problem?
14:08 gps: git fsck --full
14:09 armenzg_buildduty: ehsan: last night we started seeing B2G builds failing and it is increasing
14:09 gps: (you should always run git fsck periodically)
14:09 gps: if you don't run git fsck periodically, you may get silent repository corruption
14:09 gps: well, you may get it regardless - fsck helps you recover sooner
14:10 armenzg_buildduty: gps: is git fsck --full something that should get run on the client checkouts or the server side?
14:10 gps: armenzg_buildduty: everywhere
14:10 gps: all it takes is a bad disk I/O and your repo is corrupted
14:13 armenzg_buildduty: gps: how do I recover? do I just ssh into each machine that had an issue and I run that command on the git checkouts?
14:13 armenzg_buildduty: why would it start spreading?
14:13 armenzg_buildduty: anything that IT can do on their side to fix it?
14:13 gps: armenzg_buildduty: it would start spreading if the server is corrupted
14:14 gps: if clients are corrupted, you typically just do a fresh clone
14:14 ehsan: shouldn't gitolite handle that?
14:14 gps: that's the easy way

Updated

4 years ago
Assignee: nobody → infra
Component: Buildduty → Infrastructure: Other
Product: Release Engineering → Infrastructure & Operations
QA Contact: armenzg → jdow

Comment 9

4 years ago
14:15 gps: if we are not periodically running git fsck on the canonical git server, that is a P1 bug
14:15 gps: someone should get a page if git fsck fails on any repo
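A minimal sketch of the periodic check gps is describing, assuming a plain list of repository paths and that a non-zero exit from `git fsck --full` is what should trigger a page. The function name `fsck_repos` and the injectable `run` parameter are made up for illustration; nothing like this is confirmed to exist on git.m.o.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def fsck_repos(
    repo_paths: Iterable[str],
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> List[Tuple[str, str]]:
    """Run `git fsck --full` in each repository and collect the failures.

    Returns a list of (path, combined output) pairs for every repository
    where git exited non-zero. `run` is injectable (hypothetical design
    choice) so the logic can be exercised without real repositories.
    """
    failures = []
    for path in repo_paths:
        proc = run(
            ["git", "fsck", "--full"],
            cwd=path,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if proc.returncode != 0:
            failures.append((path, proc.stdout))
    return failures
```

A cron job could call `fsck_repos([...])` with the server's repository paths and hand any returned failures to whatever paging mechanism is in use.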

Comment 10

4 years ago
14:19 fubar: fwiw, git fsck on /external/gerrit/git-repo.git ran fine
14:19 fubar: also, AFAIK, git fsck is not run on gitmo, but 302 bkero
14:19 gps: btw the mercurial equivalent is |hg verify|
14:20 fubar: also not run, afaik, except for the few times releng has asked us to
14:20 gps: server side mercurial repos are harder to corrupt because all writes are append only
14:20 gps: with git, there is repacking, more I/O, and thus more potential for filesystem corruption
14:21 gps: well, 98% of writes are append only :)
Assignee: infra → server-ops-webops
Component: Infrastructure: Other → WebOps: Source Control
QA Contact: jdow → nmaul
(Assignee)

Comment 11

4 years ago
I've run:

[root@git1.dmz.scl3 git-repo.git]# pwd
/var/lib/gitolite3/repositories/external/google/gerrit/git-repo.git

[root@git1.dmz.scl3 git-repo.git]# git fsck --full
Checking object directories: 100% (256/256), done.
Checking objects: 100% (2354/2354), done.

[root@git1.dmz.scl3 log]# grep sda /var/log/dmesg
sd 0:0:0:0: [sda] 781357232 512-byte logical blocks: (400 GB/372 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 6b 00 00 08
sd 0:0:0:0: [sda] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk
EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: 
dracut: Mounted root filesystem /dev/sda3
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: 
Adding 2097144k swap on /dev/sda2.  Priority:-1 extents:1 across:2097144k 

[root@git1.dmz.scl3 log]# dmesg|grep sda
[root@git1.dmz.scl3 log]# 

This suggests there is no file corruption in the repository on the server. Likely something between the filesystem and the client is responsible.

Comment 12

4 years ago
fubar: armenzg_brb: git fsck on that repo shows no corruption. that said, I do see random 404s for it, but a lot more 200s
(Assignee)

Comment 13

4 years ago
I am seeing this in the apache error_log:

[root@git1.dmz.scl3 httpd]# grep "Cannot open" error_log
[Sun Sep 22 14:57:30 2013] [error] [client 10.22.74.208] Cannot open 'objects/7b/6147372cbf560744a02be50e0a862a825caef6': No such file or directory
[Sun Sep 22 14:57:32 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Sun Sep 22 14:57:33 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Sun Sep 22 14:57:36 2013] [error] [client 10.22.74.208] Cannot open 'objects/46/5bba049a435866b9aacf8fb9bceccaa4170b3e': No such file or directory
[Sun Sep 22 14:57:37 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Sun Sep 22 14:57:38 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Sun Sep 22 14:59:37 2013] [error] [client 10.22.74.208] Cannot open 'objects/95/5bc891ad101e5abd8905e46127ab0b752b7983': No such file or directory
[Sun Sep 22 14:59:38 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Sun Sep 22 14:59:39 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Sun Sep 22 15:07:25 2013] [error] [client 10.22.74.208] Cannot open 'objects/37/19c6d25733d4c03bc03503e49447e3ab85a15c': No such file or directory
[Sun Sep 22 15:07:25 2013] [error] [client 10.22.74.208] Cannot open 'objects/7a/0bb6ec11ea41315a1f4bf955645676d97013b2': No such file or directory
[Sun Sep 22 15:07:25 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Sun Sep 22 15:07:25 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Mon Sep 23 14:57:43 2013] [error] [client 10.22.74.208] Cannot open 'objects/62/fff6a76a6c34471ed2f560287c7c64914e44c3': No such file or directory
[Mon Sep 23 14:57:43 2013] [error] [client 10.22.74.208] Cannot open 'objects/4e/6c8564edf0ef8c21c29edb25b836b1f20fb98b': No such file or directory
[Mon Sep 23 14:57:45 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Mon Sep 23 14:57:45 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Mon Sep 23 14:57:47 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Mon Sep 23 14:57:49 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Mon Sep 23 14:58:40 2013] [error] [client 10.22.74.208] Cannot open 'objects/bb/681b91a206094ed38300b796e6e400d47d1f65': No such file or directory
[Mon Sep 23 18:01:50 2013] [error] [client 10.22.74.208] Cannot open 'objects/ea/ae9cdb28b1fb28eddbb562b205c676e5bc1eb2': No such file or directory
[Mon Sep 23 18:01:51 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Mon Sep 23 18:01:51 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Mon Sep 23 22:35:22 2013] [error] [client 10.22.74.208] Cannot open 'objects/ea/ae9cdb28b1fb28eddbb562b205c676e5bc1eb2': No such file or directory
[Mon Sep 23 22:35:22 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Mon Sep 23 22:35:22 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Mon Sep 23 22:35:34 2013] [error] [client 10.22.74.208] Cannot open 'objects/8a/141e1ae12eee9e790b8277e3cc397d17be5216': No such file or directory
[Tue Sep 24 14:57:46 2013] [error] [client 10.22.74.208] Cannot open 'objects/ff/474390146bb8a5b0cce783f8036995fba9fb85': No such file or directory
[Tue Sep 24 14:57:48 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Tue Sep 24 19:26:34 2013] [error] [client 10.22.74.208] Cannot open 'objects/c0/7a74de4fb4ebbad5a9b84a5a7b68e4f489a6e0': No such file or directory
[Tue Sep 24 19:26:34 2013] [error] [client 10.22.74.208] Cannot open 'objects/bd/22519d29bbc927a063c629c77b972cdc57ad8e': No such file or directory
[Tue Sep 24 19:26:34 2013] [error] [client 10.22.74.208] Cannot open 'objects/84/eecbde146ae500dec1dee4dd85958358a1b56a': No such file or directory
[Tue Sep 24 19:26:35 2013] [error] [client 10.22.74.208] Cannot open 'objects/a6/ab108f96ce6167de1baddf175e2fc79e824792': No such file or directory
[Tue Sep 24 19:26:35 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Tue Sep 24 19:26:35 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Tue Sep 24 19:26:36 2013] [error] [client 10.22.74.208] Cannot open 'objects/60/44d51b1284c18c1e4fde5ca22d91fd4e2249f6': No such file or directory
[Tue Sep 24 19:26:37 2013] [error] [client 10.22.74.208] Cannot open 'objects/8d/6bd65d5b90fb7a3b390aa475790f7df7ecbf13': No such file or directory
[Wed Sep 25 07:18:09 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Wed Sep 25 14:57:42 2013] [error] [client 10.22.74.208] Cannot open 'objects/1e/02f948e4b68ca882549d1c808c34ccb7ddae9c': No such file or directory
[Wed Sep 25 14:57:43 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/http-alternates': No such file or directory
[Wed Sep 25 14:57:44 2013] [error] [client 10.22.74.208] Cannot open 'objects/info/alternates': No such file or directory
[Wed Sep 25 14:58:17 2013] [error] [client 10.22.74.208] Cannot open 'objects/87/b11c4496ea41c60af43dff185bd11248cba3d2': No such file or directory
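To make a log like the one above easier to read, here is a rough sketch that tallies the "Cannot open" entries per client and separates requests for loose objects from requests for other paths (such as the alternates files). The regular expressions are written against the lines pasted here only; the function name `summarize` is hypothetical, and other Apache log formats are not handled.

```python
import re
from collections import Counter

# Matches lines such as:
# [Sun Sep 22 14:57:30 2013] [error] [client 10.22.74.208] Cannot open 'objects/...': No such file or directory
CANNOT_OPEN = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] "
    r"Cannot open '(?P<path>[^']+)': No such file or directory"
)
# A loose object path is objects/<2 hex chars>/<38 hex chars>.
LOOSE_OBJECT = re.compile(r"^objects/[0-9a-f]{2}/[0-9a-f]{38}$")


def summarize(lines):
    """Count failed opens per client, split into loose objects and other paths."""
    summary = {}
    for line in lines:
        match = CANNOT_OPEN.match(line.strip())
        if not match:
            continue  # not a "Cannot open" entry
        is_loose = LOOSE_OBJECT.match(match.group("path"))
        kind = "loose_object" if is_loose else "other"
        summary.setdefault(match.group("client"), Counter())[kind] += 1
    return summary
```

Feeding it the lines of error_log would show whether the failures come from a single client and how many distinct kinds of path are involved.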
Assignee: server-ops-webops → bkero

Comment 14

4 years ago
14:21 gps: but we also run mercurial over NFS, which isn't a recommended configuration
14:21 fubar: out of curiosity, what is your observed disk corruption rate?  only because I can't recall the last time I ran into that issue (which doesn't mean it's not an issue...)
14:24 catlee-dnd: I think it's slave-side corruption
14:26 catlee-dnd: RyanVM|sheriffduty: does clobbering the slaves fix the build?
14:27 RyanVM|sheriffduty: catlee-dnd: dunno, b-i was last clobbered ~27h ago
14:27 catlee: give that a try

A b2g-inbound clobber request happened at 2013-09-25 11:58:07 PDT
(Assignee)

Comment 15

4 years ago
I'm trying to replicate this problem, although am having a hard time running the repo command because I'm lacking a few files/repositories for doing this.

Is there some way you could provide us with the information or files to replicate this?
(In reply to Ben Kero [:bkero] from comment #15)
> I'm trying to replicate this problem, although am having a hard time running
> the repo command because I'm lacking a few files/repositories for doing this.
> 
> Is there some way you could provide us with the information or files to
> replicate this?

I'm not sure, but maybe https://tbpl.mozilla.org/?tree=B2g-Inbound helps (most of the red build bustages there are currently this issue).

Comment 17

4 years ago
Is this important at all?
05:03:20     INFO -  ... A new repo command ( 1.20) is available.
05:03:20     INFO -  ... You should upgrade soon:
05:03:20     INFO -      cp /builds/slave/b2g_b2g-in_emu-jb_dep-00000000/build/.repo/repo/repo /builds/slave/b2g_b2g-in_emu-jb_dep-00000000/build/repo
(Reporter)

Comment 18

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28424348&tree=Mozilla-Inbound#error1
Looking at one slave, it looks like the repository that it's complaining about is a temporary repository we create for manifests. It's not any remote repository that's corrupted.

I'm not sure how or why this temporary repo gets into this state though.
Flags: needinfo?(catlee)
(Reporter)

Comment 20

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28433395&tree=Mozilla-Inbound
(Reporter)

Comment 21

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28466778&tree=Mozilla-Inbound
(Reporter)

Comment 22

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28475201&tree=Mozilla-Inbound
(Reporter)

Comment 23

4 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=28475704&tree=Mozilla-Inbound#error1 
https://tbpl.mozilla.org/php/getParsedLog.php?id=28475754&tree=Mozilla-Inbound#error1
https://tbpl.mozilla.org/php/getParsedLog.php?id=28528492&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28526591&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28525345&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28524278&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28525104&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28524381&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28545448&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28545249&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28542074&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28542388&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28542122&tree=Mozilla-Inbound
OK I've done a little research to see what is going on.

During the B2G build process, checkout_sources is called (https://github.com/mozilla/build-mozharness/blob/c378177e2fe06818a52efd0c7c8df024386f1473/scripts/b2g_build.py#L469-L586)

During this method, the repo tool is called, e.g.: ['/builds/slave/b2g_m-in_unagi_dep-00000000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '-q', '--reference', '/builds/git-shared/repo', '-u', '/builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest', '-m', 'unagi.xml', '-b', 'master'] in /builds/slave/b2g_m-in_unagi_dep-00000000000/build


This repo tool is a googlecode project built on top of git, see http://source.android.com/source/developing.html

This is the repo help for the "init" command:


Summary
-------
Initialize repo in the current directory

Usage: repo init [options]

Options:
  -h, --help            show this help message and exit

  Logging options:
    -q, --quiet         be quiet

  Manifest options:
    -u URL, --manifest-url=URL
                        manifest repository location
    -b REVISION, --manifest-branch=REVISION
                        manifest branch or revision
    -m NAME.xml, --manifest-name=NAME.xml
                        initial manifest file
    --mirror            create a replica of the remote repositories rather
                        than a client working directory
    --reference=DIR     location of mirror directory
    --depth=DEPTH       create a shallow clone with given depth; see git clone
    -g GROUP, --groups=GROUP
                        restrict manifest projects to ones with specified
                        group(s) [default|all|G1,G2,G3|G4,-G5,-G6]
    -p PLATFORM, --platform=PLATFORM
                        restrict manifest projects to ones with a specified
                        platform group [auto|all|none|linux|darwin|...]

  repo Version options:
    --repo-url=URL      repo repository location
    --repo-branch=REVISION
                        repo branch or revision
    --no-repo-verify    do not verify repo source code

  Other options:
    --config-name       Always prompt for name/e-mail

Description
-----------
The 'repo init' command is run once to install and initialize repo. The
latest repo source code and manifest collection is downloaded from the
server and is installed in the .repo/ directory in the current working
directory.

The optional -b argument can be used to select the manifest branch to
checkout and use. If no branch is specified, master is assumed.

The optional -m argument can be used to specify an alternate manifest to
be used. If no manifest is specified, the manifest default.xml will be
used.

The --reference option can be used to point to a directory that has the
content of a --mirror sync. This will make the working directory use as
much data as possible from the local reference directory when fetching
from the server. This will make the sync go a lot faster by reducing
data traffic on the network.

Switching Manifest Branches
---------------------------
To switch to another manifest branch, `repo init -b otherbranch` may be
used in an existing client. However, as this only updates the manifest,
a subsequent `repo sync` (or `repo sync -d`) is necessary to update the
working directory files.




So this command, which fails, installs and initialises repo. The repo URL is https://git.mozilla.org/external/google/gerrit/git-repo.git, and it uses the reference repository under /builds/git-shared/repo.

Now the repo tool is calling git commands under the hood, and what it is actually doing is calling:
git rev-list ^5979535f856cbaf4a673f14dba5329debf7a9375 HEAD --

The git rev-list command should return all commits reachable from HEAD that are not reachable from the commit 5979535f856cbaf4a673f14dba5329debf7a9375 (the leading ^ excludes that commit and its ancestors). The error message says, e.g.:
21:29:57     INFO -  error.GitError: manifests rev-list (u'^5979535f856cbaf4a673f14dba5329debf7a9375', 'HEAD', '--'): fatal: bad object HEAD

This is surprising, because it does not say revision 5979535f856cbaf4a673f14dba5329debf7a9375 is a bad commit, it says *HEAD* is a bad commit.
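For reference, a toy model of what a `rev-list ^X HEAD` call computes, assuming the usual set semantics (everything reachable from HEAD, minus everything reachable from X). The commit graph, the names and the `LookupError` are made up for illustration; this is not how git or repo implement it.

```python
def reachable(parents, start):
    """Return the set of commits reachable from `start` via parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def rev_list(parents, include, exclude):
    """Model `git rev-list ^exclude include` on a dict of commit -> parents."""
    if include not in parents:
        # Mirrors the shape of the failure in the log: the positive
        # revision (HEAD) cannot be resolved to a known object.
        raise LookupError("bad object %s" % include)
    return reachable(parents, include) - reachable(parents, exclude)
```

With `parents = {"a": [], "b": ["a"], "c": ["b"]}`, `rev_list(parents, "c", "a")` gives the commits added on top of `a`. The point of the model is that the call fails before any set arithmetic happens if the positive revision itself cannot be resolved, which matches the message blaming HEAD rather than the excluded commit.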

Looking on the slave, it is not clear why this command does not work:

[cltbld@bld-linux64-ec2-312.build.releng.usw2.mozilla.com tmp_manifest]$ cd /builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest && (echo LOG; echo; git log; echo; echo REV-LIST; echo; git rev-list 5979535f856cbaf4a673f14dba5329debf7a9375 HEAD --)

This returns the following output, suggesting the repo is ok, and HEAD is defined:

LOG

commit 5979535f856cbaf4a673f14dba5329debf7a9375
Author: Mozilla Release Engineering <release@mozilla.com>
Date:   Sun Sep 29 21:24:18 2013 -0700

    manifest

REV-LIST

5979535f856cbaf4a673f14dba5329debf7a9375

Furthermore, the HEAD file is intact and valid:

[cltbld@bld-linux64-ec2-312.build.releng.usw2.mozilla.com tmp_manifest]$ cat /builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest/.git/HEAD
ref: refs/heads/master
[cltbld@bld-linux64-ec2-312.build.releng.usw2.mozilla.com tmp_manifest]$ cat /builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest/.git/refs/heads/master
5979535f856cbaf4a673f14dba5329debf7a9375
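The two `cat` commands above can be folded into a small check that could be run against any git directory on the slave. This is only a sketch: `resolve_head` is a hypothetical helper, it looks at HEAD, loose refs and packed-refs, and it does not verify that the commit object actually exists in the object store.

```python
import os


def resolve_head(git_dir):
    """Resolve HEAD in `git_dir` to a commit id, or return None if it dangles.

    Handles a detached HEAD, loose refs and packed-refs.
    """
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return head  # detached HEAD: the file holds the commit id directly
    ref = head[len("ref: "):]
    loose = os.path.join(git_dir, *ref.split("/"))
    if os.path.isfile(loose):
        with open(loose) as f:
            return f.read().strip()
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.isfile(packed):
        with open(packed) as f:
            for line in f:
                parts = line.split()
                # Entries look like "<sha> <refname>"; header and peeled
                # lines have a different number of fields and are skipped.
                if len(parts) == 2 and parts[1] == ref:
                    return parts[0]
    return None  # HEAD names a ref that does not exist
```

A return value of `None` would correspond to HEAD pointing at a non-existent ref, which is one way to end up with "bad object HEAD".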

However, running the repo command directly still fails:

[cltbld@bld-linux64-ec2-312.build.releng.usw2.mozilla.com tmp_manifest]$ '/builds/slave/b2g_m-in_unagi_dep-00000000000/build/repo' 'init' '--repo-url' 'https://git.mozilla.org/external/google/gerrit/git-repo.git' '--reference' '/builds/git-shared/repo' '-u' '/builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest' '-m' 'unagi.xml' '-b' 'master'

... A new repo command ( 1.20) is available.
... You should upgrade soon:

    cp /builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/repo /builds/slave/b2g_m-in_unagi_dep-00000000000/build/repo

error: refs/heads/default does not point to a valid object!
error: refs/heads/default does not point to a valid object!
['rev-list', u'^5979535f856cbaf4a673f14dba5329debf7a9375', 'HEAD', '--']
Traceback (most recent call last):
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/main.py", line 418, in <module>
    _Main(sys.argv[1:])
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/main.py", line 394, in _Main
    result = repo._Run(argv) or 0
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/main.py", line 142, in _Run
    result = cmd.Execute(copts, cargs)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/subcmds/init.py", line 369, in Execute
    self._SyncManifest(opt)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/subcmds/init.py", line 222, in _SyncManifest
    m.MetaBranchSwitch(opt.manifest_branch)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/project.py", line 2420, in MetaBranchSwitch
    self.Sync_LocalHalf(syncbuf)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/project.py", line 1106, in Sync_LocalHalf
    lost = self._revlist(not_rev(revid), HEAD)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/project.py", line 2069, in _revlist
    return self.work_git.rev_list(*a, **kw)
  File "/builds/slave/b2g_m-in_unagi_dep-00000000000/build/.repo/repo/project.py", line 2223, in rev_list
    p.stderr))
error.GitError: manifests rev-list (u'^5979535f856cbaf4a673f14dba5329debf7a9375', 'HEAD', '--'): fatal: bad object HEAD

I tried updating the repo command using the cp command suggested above, but this did not change the result, so I restored the old version.

Summary
=======
We need to understand why this repo command fails ('/builds/slave/b2g_m-in_unagi_dep-00000000000/build/repo' 'init' '--repo-url' 'https://git.mozilla.org/external/google/gerrit/git-repo.git' '--reference' '/builds/git-shared/repo' '-u' '/builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest' '-m' 'unagi.xml' '-b' 'master') with the message:
error.GitError: manifests rev-list (u'^5979535f856cbaf4a673f14dba5329debf7a9375', 'HEAD', '--'): fatal: bad object HEAD

despite the fact that:
cd /builds/slave/b2g_m-in_unagi_dep-00000000000/build/tmp_manifest && git rev-list 5979535f856cbaf4a673f14dba5329debf7a9375 HEAD --
does not fail
Thank you for taking an initial look at this! :-)
https://tbpl.mozilla.org/php/getParsedLog.php?id=28548613&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28546753&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28548168&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28545631&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28547594&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28546157&tree=B2g-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=28545295&tree=B2g-Inbound
This failure mode is currently not starrable. Even in the short term, if we could catch the non-zero return code from running the repo command and output an appropriate (parsable) error message, it would help quite a bit :-)
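Something along these lines might be enough for the short term. It is a sketch only: `run_and_report` and the `repo-init-failed` tag are hypothetical names, not anything that exists in mozharness, and the exact line format a log parser would need is an assumption.

```python
import subprocess
import sys


def run_and_report(cmd, tag="repo-init-failed"):
    """Run `cmd`, echo its output, and print one parsable line if it fails.

    Returns the command's exit code. `tag` is a hypothetical marker that a
    log parser could match on.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    output, _ = proc.communicate()
    sys.stdout.write(output)
    if proc.returncode != 0:
        # Use the last non-empty output line as a short reason, if any.
        lines = [line for line in output.splitlines() if line.strip()]
        reason = lines[-1].strip() if lines else "no output"
        print("ERROR - %s: exit code %d: %s" % (tag, proc.returncode, reason))
    return proc.returncode
```

For the failure in this bug, the last output line is the `error.GitError: ... bad object HEAD` message, so the single ERROR line would carry both the exit code and the reason.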
https://tbpl.mozilla.org/php/getParsedLog.php?id=28555043&tree=Mozilla-Aurora
I'm hoping https://hg.mozilla.org/build/mozharness/rev/1480b72edb19 fixes this, as it changes how we use repo and shared local checkouts.
https://tbpl.mozilla.org/php/getParsedLog.php?id=28621209&tree=Mozilla-Inbound
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #34)
> https://tbpl.mozilla.org/php/getParsedLog.php?id=28621209&tree=Mozilla-Inbound

That's a different problem:
10:06:34     INFO - Copy/paste: /builds/slave/b2g_m-in_emulator-jb-d_dep-000/build/repo init --repo-url https://git.mozilla.org/external/google/gerrit/git-repo.git -q -u /builds/slave/b2g_m-in_emulator-jb-d_dep-000/build/tmp_manifest -m generic.xml -b master
10:06:38     INFO -  object e76efdd7b342577c40aa271fa5ded9d66a783a9b
10:06:38     INFO -  type commit
10:06:38     INFO -  tag v1.12.4
10:06:38     INFO -  tagger Conley Owens <cco3@android.com> 1380645867 -0700
10:06:38     INFO -  repo 1.12.4
10:06:38     INFO -  gpg: Signature made Tue 01 Oct 2013 09:44:27 AM PDT using RSA key ID 692B382C
10:06:38     INFO -  gpg: Can't check signature: No public key
10:06:38     INFO -  error: could not verify the tag 'v1.12.4'
10:06:38     INFO -  fatal: repo init failed; run without --quiet to see why
10:08:05     INFO -  Traceback (most recent call last):
10:08:05     INFO -    File "/builds/slave/b2g_m-in_emulator-jb-d_dep-000/build/repo", line 738, in <module>
10:08:05     INFO -      main(sys.argv[1:])
10:08:05     INFO -    File "/builds/slave/b2g_m-in_emulator-jb-d_dep-000/build/repo", line 712, in main
10:08:05     INFO -      os.rmdir(repodir)
10:08:05     INFO -  OSError: [Errno 20] Not a directory: '.repo'

This is caused by upstream changes to the repo tool. I'm trying to figure out a way to handle this, as this also bit us last week.
Git supports signing commits and tags with GPG. It sounds like the public key of the signer of that v1.12.4 tag can't be found. I'm not sure if repo/git is smart enough to import GPG keys from public keyservers or if you need to do it beforehand. I'm also not sure if there is a way to have repo not verify signatures.

It looks like the public key for this person is on a public server and can be found at http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x19269455338871A4.

Perhaps this is a firewall issue? If the repo clients can't talk to the public key servers, they obviously can't import public keys.
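On the question of skipping verification: the `repo init` help quoted earlier in this bug lists a `--no-repo-verify` option ("do not verify repo source code"). Below is a sketch of building the command line with that option. The helper `repo_init_command` is hypothetical, and whether this option is what was eventually used to fix the builds is not established here.

```python
def repo_init_command(repo_path, manifest_url, manifest_name, branch="master",
                      repo_url=None, reference=None, verify=True):
    """Build an argv list for `repo init` from options shown in its help text."""
    cmd = [repo_path, "init"]
    if repo_url:
        cmd += ["--repo-url", repo_url]
    if not verify:
        # Option taken from the help output; it skips verification of the
        # downloaded repo source.
        cmd.append("--no-repo-verify")
    if reference:
        cmd += ["--reference", reference]
    cmd += ["-u", manifest_url, "-m", manifest_name, "-b", branch]
    return cmd
```

Skipping verification trades away the signature check on the repo tool itself, so importing the signer's public key beforehand may be the safer route if the clients can reach a keyserver.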
It also fails for me on my laptop which doesn't have any firewall restrictions.
I think bug 922750 fixed this.
(In reply to Chris AtLee [:catlee] from comment #38)
> I think bug 922750 fixed this.

Looks like it did - closing
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Component: WebOps: Source Control → General
Product: Infrastructure & Operations → Developer Services