Closed Bug 1206666 Opened 9 years ago Closed 9 years ago

fxos branch of valgrind diverged between github.com and git.mozilla.org

Categories

(Developer Services :: General, task)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pmoore, Assigned: hwine)

Details

Compare https://github.com/mozilla-b2g/valgrind/tree/fxos vs http://git.mozilla.org/?p=b2g/valgrind.git;a=shortlog;h=refs/heads/fxos


pmoore@Petes-iMac:~/git/valgrind fxos $ git remote -v
mozilla	ssh://gitolite3@git.mozilla.org/b2g/valgrind.git (fetch)
mozilla	ssh://gitolite3@git.mozilla.org/b2g/valgrind.git (push)
origin	git@github.com:mozilla-b2g/valgrind.git (fetch)
origin	git@github.com:mozilla-b2g/valgrind.git (push)


pmoore@Petes-iMac:~/git/valgrind fxos $ git ls-remote mozilla fxos 2>/dev/null
daa61633c32b9606f58799a3186395fd2bbb8d8c  refs/heads/fxos

pmoore@Petes-iMac:~/git/valgrind fxos $ git ls-remote origin fxos 2>/dev/null
331f5e1089fbbd1573d40da3593b22a27d195712  refs/heads/fxos


## 9 commits on git.mozilla.org fxos branch that are not on github.com fxos branch:

pmoore@Petes-iMac:~/git/valgrind fxos $ git log --pretty=oneline origin/fxos..mozilla/fxos
daa61633c32b9606f58799a3186395fd2bbb8d8c Bug 977156 - Turn on vgdb builds and add xml copies, add arm64 sources
ebc732512dbf2e89aea44400381511d4b88a9afa Comment out Elf32_Nhdr redefine since jelly bean (>=4.3) now includes this. This will break builds on < 4.3 (ics and lower).
de650e964ae0a7adcc0434578d3588b564664087 Added image.mk to android makefile
1efc0dc5183c6820145e7535efd5d9614eeb6096 Bug 877859: Make symlinks to required header files inside valgrind repo; r=mwu
cdc6b43746c3066faca2d2eb4dfc73ad9931da15 Bug 875659: Only build B2G Valgrind with B2G_VALGRIND environment variable is on; r=mwu
4efadc9b0b2dd77ae6940436a73f2f91fc69fba2 Bug 875468: Add syswrap-xen.c to valgrind Android.mk for B2G; r=vyang
26e5088a5c6112c27e47b28751d1e7e8e99e79d9 Bug 854517: Fixes for hard VEX includes in some files
f0cf0491817af3469f604220cac924917b241e4f Bug 854517: Package include file for valgrind repo
2bc03643298d344e05fe0df94b7aa3012b7687e9 Bug 854517: B2G specific valgrind patches; r=jseward

pmoore@Petes-iMac:~/git/valgrind fxos $ git log --pretty=oneline origin/fxos..mozilla/fxos | wc -l
       9


## The fxos branch on github.com has 1295 commits that do not exist on the fxos branch on git.mozilla.org:


pmoore@Petes-iMac:~/git/valgrind fxos $ git log --pretty=oneline mozilla/fxos..origin/fxos | wc -l
    1295


In other words, they have *diverged*.
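
The two `git log ... | wc -l` counts above can also be obtained in one step with `git rev-list --left-right --count A...B`, which prints the number of commits unique to each side. A minimal sketch of turning those counts into a verdict follows; the function names are illustrative and not part of any tool mentioned in this bug.

```python
import subprocess
from typing import Tuple


def classify(only_left: int, only_right: int) -> str:
    """Describe how two branch heads relate, given the commits unique to each."""
    if only_left == 0 and only_right == 0:
        return "identical"
    if only_left == 0:
        return "left can fast-forward to right"
    if only_right == 0:
        return "right can fast-forward to left"
    return "diverged"


def parse_counts(output: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated counts printed by
    `git rev-list --left-right --count A...B`."""
    left, right = output.split()
    return int(left), int(right)


def compare_refs(repo_dir: str, left_ref: str, right_ref: str) -> str:
    """Run git inside repo_dir and classify left_ref against right_ref."""
    out = subprocess.run(
        ["git", "rev-list", "--left-right", "--count",
         f"{left_ref}...{right_ref}"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return classify(*parse_counts(out))
```

With the counts reported above, `classify(9, 1295)` returns `"diverged"`. In a clone with both remotes fetched, `compare_refs(".", "mozilla/fxos", "origin/fxos")` would be the equivalent call.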


Furthermore, we have manifests pinned to fxos branch, e.g. see:

pmoore@Petes-iMac:~/git/b2g-manifest master $ grep -r fxos . | grep valgrind
./base-caf-jb.xml:  <project path="external/valgrind" name="valgrind" remote="b2g" revision="fxos" />
./base-jb.xml:  <project path="external/valgrind" name="valgrind" remote="b2g" revision="fxos" />
./base-kk.xml:  <project name="valgrind" path="external/valgrind" remote="b2g" revision="fxos"/>
./base-l.xml:  <project name="valgrind" path="external/valgrind" remote="b2g" revision="fxos"/>
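
A manifest entry that names a branch follows whatever that branch head is at sync time, so it is affected by any divergence. As a rough sketch, such entries could be listed by parsing the manifest and flagging every `revision` that is not a full 40-character SHA-1. The helper name and the second sample project are made up for illustration, and a tag name would be flagged the same way as a branch.

```python
import re
import xml.etree.ElementTree as ET

SHA1_RE = re.compile(r"^[0-9a-f]{40}$")


def branch_pinned_projects(manifest_xml: str):
    """Return (name, revision) for projects whose revision is not a full SHA-1."""
    root = ET.fromstring(manifest_xml)
    pinned = []
    for project in root.iter("project"):
        revision = project.get("revision")
        if revision and not SHA1_RE.match(revision):
            pinned.append((project.get("name"), revision))
    return pinned


# The first entry is copied from base-jb.xml above; the second is a
# hypothetical project pinned to a fixed commit, added for contrast.
SAMPLE = """<manifest>
  <project path="external/valgrind" name="valgrind" remote="b2g" revision="fxos" />
  <project path="external/example" name="example" remote="b2g"
           revision="331f5e1089fbbd1573d40da3593b22a27d195712" />
</manifest>"""

print(branch_pinned_projects(SAMPLE))
```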


The mirroring looks correct:

http://hg.mozilla.org/users/hwine_mozilla.com/repo-sync-configs/file/baefebb1fd27/b2g-valgrind/config


[core]
	repositoryformatversion = 0
	filemode = true
	bare = true
[remote "origin"]
	url = git://github.com/mozilla-b2g/valgrind
	fetch = +refs/heads/*:refs/heads/*
	fetch = +refs/tags/*:refs/tags/*
[remote "git.m.o"]
	url = git+ssh://git.m.o/b2g/valgrind.git
	mirror = true
[gc]
	auto = 0


We are getting the following error, e.g. in https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/MWcfPMxNRs-h_zOSDC9Vfg/0/public/logs/live_backing.log


....
....
....
Fetching project platform/system/netd
Fetching project platform/external/wpa_supplicant_8
Fetching project platform/external/libnl-headers
Fetching project platform/external/skia
Fetching project valgrind
Traceback (most recent call last):
  File "/home/worker/workspace/B2G/.repo/repo/main.py", line 506, in <module>
    _Main(sys.argv[1:])
  File "/home/worker/workspace/B2G/.repo/repo/main.py", line 482, in _Main
    result = repo._Run(argv) or 0
  File "/home/worker/workspace/B2G/.repo/repo/main.py", line 161, in _Run
    result = cmd.Execute(copts, cargs)
  File "/home/worker/workspace/B2G/.repo/repo/subcmds/sync.py", line 641, in Execute
    fetched = self._Fetch(to_fetch, opt)
  File "/home/worker/workspace/B2G/.repo/repo/subcmds/sync.py", line 342, in _Fetch
    self._FetchProjectList(**kwargs)
  File "/home/worker/workspace/B2G/.repo/repo/subcmds/sync.py", line 237, in _FetchProjectList
    success = self._FetchHelper(opt, project, *args, **kwargs)
  File "/home/worker/workspace/B2G/.repo/repo/subcmds/sync.py", line 278, in _FetchHelper
    no_tags=opt.no_tags, archive=self.manifest.IsArchive)
  File "/home/worker/workspace/B2G/.repo/repo/project.py", line 1140, in Sync_NetworkHalf
    self._InitMRef()
  File "/home/worker/workspace/B2G/.repo/repo/project.py", line 2211, in _InitMRef
    self._InitAnyMRef(R_M + self.manifest.branch)
  File "/home/worker/workspace/B2G/.repo/repo/project.py", line 2223, in _InitAnyMRef
    self.bare_git.UpdateRef(ref, dst, message=msg, detach=True)
  File "/home/worker/workspace/B2G/.repo/repo/project.py", line 2477, in UpdateRef
    self.update_ref(*cmdv)
  File "/home/worker/workspace/B2G/.repo/repo/project.py", line 2554, in runner
    p.stderr))
error.GitError: valgrind update-ref: fatal: 331f5e1089fbbd1573d40da3593b22a27d195712^0: not a valid SHA1

[taskcluster-vcs:warning] run end (with error) try (10/20) retrying in 9852607.375383377 ms : ./repo sync -j1

Note that the mentioned SHA (331f5e1089fbbd1573d40da3593b22a27d195712) is the current head of the fxos branch on github (https://github.com/mozilla-b2g/valgrind/tree/fxos).


I'm not sure how they diverged, but it means the situation cannot be automatically reconciled.

Normally it should not be possible/allowed for the mirror to be updated by anything other than the vcs sync agent.
Things that should be reviewed after fixing this issue:

1) Find out how it was possible for them to diverge - was there a push to git.mozilla.org from something other than vcs sync? Can controls be tightened, to make this impossible?
2) What happened when vcs sync hit this problem - did it trigger a monitoring alert? If not, we should make sure that such issues trigger a monitoring alert so that support teams are notified.
3) How did the b2g build process pick up the SHA from github.com branch, if it is meant to pull from the mirror? Even though this helped us in this case, our automation should not be touching github.com repositories - we should fix this too.
4) Do we in general want a monitoring tool that periodically checks that mirrors share the same branch heads and tags as their upstream counterparts? This may be the same as 2), except that an external monitoring system has the benefit that it sits outside the vcs sync tool, which might hang, fail, be misconfigured, etc. Something independent with a monitoring check, even if only once every couple of hours, might be more reliable than e.g. email sent from the vcs sync tool.
Basically, I'm saying we should be able to automate ourselves away from such errors in future. =)
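
To make point 4) concrete, here is a rough sketch of such an independent check. It assumes the monitor has already captured the text output of `git ls-remote` for the upstream and for the mirror; the function names are illustrative, and scheduling and alerting are left out.

```python
def parse_ls_remote(output: str) -> dict:
    """Map ref name -> SHA from `git ls-remote` output (one 'sha ref' per line)."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split()
        refs[ref] = sha
    return refs


def mismatched_refs(upstream: dict, mirror: dict) -> dict:
    """Return ref -> (upstream_sha, mirror_sha) for every ref that differs.

    A ref that is missing on one side is reported with None for that side.
    """
    report = {}
    for ref in sorted(set(upstream) | set(mirror)):
        if upstream.get(ref) != mirror.get(ref):
            report[ref] = (upstream.get(ref), mirror.get(ref))
    return report


# Sample input: the two ls-remote results quoted earlier in this bug.
UPSTREAM = "331f5e1089fbbd1573d40da3593b22a27d195712  refs/heads/fxos\n"
MIRROR = "daa61633c32b9606f58799a3186395fd2bbb8d8c  refs/heads/fxos\n"

print(mismatched_refs(parse_ls_remote(UPSTREAM), parse_ls_remote(MIRROR)))
```

A mirror that is merely lagging would also show up here for a short while, so a real check would probably only alert when the same mismatch persists across several runs.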
(In reply to Pete Moore [:pmoore][:pete] from comment #1)

[...]

> 3) How did the b2g build process pick up the SHA from github.com branch, if
> it is meant to pull from the mirror? Even though this helped us in this
> case, our automation should not be touching github.com repositories - we
> should fix this too.

I might be able to get more info on this topic. This is happening in the context of https://bugzilla.mozilla.org/show_bug.cgi?id=1206368

I did a couple of initial pushes that were buggy and triggered https://bugzilla.mozilla.org/show_bug.cgi?id=1206383 https://bugzilla.mozilla.org/show_bug.cgi?id=1206395

Part of the bugginess was that I could not find any proper documentation on the bootstrap process for b2g/nexus-4-kk/sources.xml file, so I pulled one out of my build tree and used it. That was wrong because:
 - I included a reference to the gecko tree (triggering the bug where we hit the 5GB limit)
 - I did not rewrite the remotes to pull from the mirrors, and instead left references to github.com

So maybe your SHA1 from github is coming from this?
move to correct product/component
Component: Tools → General
Product: Release Engineering → Developer Services
QA Contact: hwine
The divergence is caused by non-fast-forward changes being made to the upstream repository (github.com/mozilla-b2g/valgrind). The original b2g request was that no repo holding "Mozilla contributions to b2g code" should allow non-fast-forward changes to become visible to partners. Non-fast-forward changes required advance email notification to partners, as their repositories were also set to reject such changes.

ni: mwu -- who can sign off on these changes being okay? Is it just this one change that is okay, or all changes to valgrind?
Assignee: nobody → hwine
Status: NEW → ASSIGNED
Flags: needinfo?(mwu)
FYI while hacking on this during the weekend, I also had the issue on other repos: VEX, broadcom/wlan. I stopped at this point, but there might be others?
Likely - the original request was to treat all repos hosted in the mozilla-b2g org on github as being subject to the "no non-fast forward, no delete" rules, as they are partner facing.
AFAIK this non-FF change was not supposed to be ok, but qDot can probably help/answer better here.
Flags: needinfo?(mwu) → needinfo?(kyle)
Oh. Huh. I've been rebasing the fxos branches of both VEX and valgrind since I started including the repos, had no idea there was supposed to be signoff to do that, and was never told so. However, the repo is updated so infrequently (basically, whenever someone complains at me) that I guess we didn't catch that until now.

All the changes on the mozilla-b2g VEX and valgrind are fine to take, as it includes fixes from the valgrind developers for B2G.

In the future, I suppose we can start trying to do merges for the fxos branch, though we very much need to find another maintainer for this, as all I do is merge patches, I haven't actually /run/ valgrind on B2G in years. It'd be nice to have someone that still uses it and knows how git-svn works maintaining this.
Flags: needinfo?(kyle)
(In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment #9)
> All the changes on the mozilla-b2g VEX and valgrind are fine to take, as it
> includes fixes from the valgrind developers for B2G.

qDot - the current issue isn't whether these are "approved" changes, it is how do we fix the currently "broken" repository.  The two choices are:

a) Notify partners that non-fast-forward changesets are coming their way, then do the changes (that requires signoff from the TAMs aiui), or

b) redo the changes such that they can be fast forwarded. (no notifications or coordination needed)

Which path do you want to take here?
Flags: needinfo?(kyle)
Maybe Julian would be interested in Valgrind?
Flags: needinfo?(jseward)
Hal or Pete, during my hacking this weekend I tried using a revision that was in the mirror and I hit a similar issue on other repos. At least VEX and broadcom/wlan from what I recall.

Can we make sure all repos are fine?
Flags: needinfo?(pmoore)
Flags: needinfo?(hwine)
(In reply to Hal Wine [:hwine] (use NI) from comment #10)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment
> #9)
> > All the changes on the mozilla-b2g VEX and valgrind are fine to take, as it
> > includes fixes from the valgrind developers for B2G.
> 
> qDot - the current issue isn't whether these are "approved" changes, it is
> how do we fix the currently "broken" repository.  The two choices are:
> 
> a) Notify partners that non-fast-forward changesets are coming their way,
> then do the changes (that requires signoff from the TAMs aiui), or
> 
> b) redo the changes such that they can be fast forwarded. (no notifications
> or coordination needed)
> 
> Which path do you want to take here?

I'll try synchronizing things as a merge instead of a rebase, I guess. That is going to make the branches messy, but I'm not sure anyone cares at this point.
Flags: needinfo?(kyle)
Ok. Changed fxos branches of valgrind and vex to use merges instead of rebases. Things should fast forward now.
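
Why merging restores fast-forwarding while rebasing does not can be shown on a toy commit graph. The sketch below models history as a mapping from each commit to its parents and treats an update as a fast-forward when the old head is an ancestor of the new head. The commit names are hypothetical and do not correspond to real commits in valgrind or VEX.

```python
def ancestors(parents: dict, commit: str) -> set:
    """Return commit plus everything reachable from it via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_fast_forward(parents: dict, old_head: str, new_head: str) -> bool:
    """An update is a fast-forward when old_head is an ancestor of new_head."""
    return old_head in ancestors(parents, new_head)


# Hypothetical history: "base" is shared, "local" is a B2G patch that the
# mirror currently has as its head, and "up" is newer upstream work.
HISTORY = {
    "base": [],
    "local": ["base"],
    "up": ["base"],
    "local2": ["up"],           # rebase: the patch is recreated on top of "up"
    "merge": ["local", "up"],   # merge: keeps "local" as a parent
}

print(is_fast_forward(HISTORY, "local", "local2"))  # rebased head
print(is_fast_forward(HISTORY, "local", "merge"))   # merged head
```

The rebased head no longer contains the mirror's old head in its ancestry, so a mirror or partner repository that rejects non-fast-forward updates refuses it; the merge commit does contain it, so the same repositories can accept the update.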
(In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment #14)
> Ok. Changed fxos branches of valgrind and vex to use merges instead of
> rebases. Things should fast forward now.

Mirrors not catching this yet: https://tools.taskcluster.net/task-inspector/#SPHNViRWRBGuwAch2Yf_FA/0
(In reply to Alexandre LISSY :gerard-majax from comment #12)
> Hal or Pete, during my hacking this weekend I tried using a revision that
> was in the mirror and I hit a similar issue on other repos. At least VEX and
> broadcom/wlan from what I recall.
> 
> Can we make sure all repos are fine?

This will be a task for Hal or Dev Services I think - I'm not working in RelEng these days.
Flags: needinfo?(pmoore)
(In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment #9)
> Oh. Huh. I've been rebasing the fxos branches of both VEX and valgrind since
> I started including the repos, had no idea there was supposed to be signoff
> to do that, and was never told so. However, the repo is updated so
> infrequently (basically, whenever someone complains at me) that I guess we
> didn't catch that until now.
> 
> All the changes on the mozilla-b2g VEX and valgrind are fine to take, as it
> includes fixes from the valgrind developers for B2G.
> 
> In the future, I suppose we can start trying to do merges for the fxos
> branch, though we very much need to find another maintainer for this, as all
> I do is merge patches, I haven't actually /run/ valgrind on B2G in years.
> It'd be nice to have someone that still uses it and knows how git-svn works
> maintaining this.

@Hal - is there a possibility to allow non-fast-forwards, or are these repos partner facing such that this would be absolutely prohibited? If non-fast-forwards could be allowed, it would no doubt make life simpler for Kyle.
(In reply to Alexandre LISSY :gerard-majax from comment #12)
> Hal or Pete, during my hacking this weekend I tried using a revision that
> was in the mirror and I hit a similar issue on other repos. At least VEX and
> broadcom/wlan from what I recall.
> 
> Can we make sure all repos are fine?

Right, so I have hacked around the sources.xml: switched VEX and valgrind to pull the "fxos" branch from the mirror. Then I hit an issue with the mako kernel repo not being mirrored. Filed this under bug 1207092.

As per https://tools.taskcluster.net/task-inspector/#KVToGvJjQ9edkvIB3rruqg/0 it looks like it is now doing the build nicely. So I think once VEX and valgrind mirrors are updated and bug 1207092 is fixed we are all good :)
Flags: needinfo?(hwine)
Rebasing isn't needed, merging is fine, I just wasn't aware of the requirement to update third parties. Whoever takes over maintaining B2G Valgrind can just merge from now on with no problems.
Flags: needinfo?(hwine)
(In reply to Pete Moore [:pmoore][:pete] from comment #17)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment
> @Hal - is there a possibility to allow non-fast-forwards, or are these repos
> partner facing such that this would be absolutely prohibited? If
> non-fast-forwards could be allowed, it would no doubt make life simpler for
> Kyle.

This is a b2g business decision to make, not a vcs-sync/git.mozilla.org configuration choice.
From review of the vcs-sync logs, I believe this problem has been resolved and there are no more issues. Please open a new bug if there are.

Thanks :qDot!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: needinfo?(hwine)
Resolution: --- → FIXED
Flags: needinfo?(jseward)