Open Bug 1119438 Opened 9 years ago Updated 9 years ago

Convert CVS tree into a DVCS repository

Categories

(Developer Services :: General, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

People

(Reporter: jcranmer, Unassigned)

For historical archival purposes, it would be useful to have a DVCS version of the CVS history, complete with all branches and tags. I personally prefer Mercurial, but the difference between Mercurial and git is quite frankly minor in comparison to the value of having a DVCS version of record.

Some notes:
A. The mozilla-cvs-history repository exists, but it doesn't have tag or branch information.
B. For ease of repository grafting and archaeology, it may be beneficial to make a few touchups to the information. In particular, replacing the % in usernames with @ so that they read as email addresses (e.g., Pidgeot18%gmail.com -> Pidgeot18@gmail.com).
C. In the long term, it would be helpful to start piecing everything together into a unified CVS/hg repository of record. However, the existence of a unified git repo that may have different revision numbers for SHA1 days makes this extraordinarily hard. Trying to piece together the history of comm-central fully accurately is probably a lost cause (although I started on it, and that is what led me to file this bug).
D. It turns out that mozilla-central contains a mirror of CVS trunk for the first several revisions. This is more data that makes piecing together a proper unified history difficult.
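The username touchup in note B could be sketched roughly as follows. This is only an illustration: the helper name `cvs_user_to_email` and the rule for what "looks like an email address" are assumptions, not part of any existing conversion tool.

```python
import re

# Assumption: a CVS username is an email address written with "%" in place
# of "@" when it has exactly one "%" and a dotted domain after it.
_CVS_USER = re.compile(r"^([^%@\s]+)%([^%@\s]+\.[^%@\s]+)$")


def cvs_user_to_email(username: str) -> str:
    """Hypothetical helper: turn Pidgeot18%gmail.com into Pidgeot18@gmail.com.

    Usernames that do not match the pattern are returned unchanged, so
    plain account names are left alone.
    """
    match = _CVS_USER.match(username)
    if match is None:
        return username
    return f"{match.group(1)}@{match.group(2)}"
```

Returning unmatched names unchanged keeps the mapping conservative, which matters if the same author table is reused when grafting repositories.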
(I added a see also to decommissioning CVS. Strictly speaking, this doesn't need to occur after CVS is made read-only, but I don't know how robust various conversion tools are to being set up as a one-way mirror. Given that we already have a mess of different IDs for the same revision, it seems prudent to me to avoid making it even worse.)
Oh, one more thing: we ought to pay attention to the proper charset of commit messages. I see that the current git mozilla-cvs-history has "Bug 434857 ��� Crash [@ nsAccessibleWrap::Next(unsigned long, tagVARIANT*, unsigned long*), r=aaronlev, sr=neil, a=ss" as a commit message. Bonsai reported this as "Bug 434857 – Crash [@ nsAccessibleWrap::Next(unsigned long, tagVARIANT*, unsigned long*), r=aaronlev, sr=neil, a=ss" (cf. <http://bonsai.mozilla.org/cvslog.cgi?file=mozilla/accessible/src/msaa/nsAccessibleWrap.h&rev=HEAD&mark=1.34>, <https://github.com/jrmuizel/mozilla-cvs-history/commit/d67c8717b45a2b0bfd8f62b56a2b05c98a217bbf>).
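One way a converter could avoid emitting replacement characters is to try a list of encodings in order instead of decoding with a single lossy codec. The sketch below is an assumption about how that might look; I have not checked which encodings the CVS log messages actually use, and the function name and fallback list are hypothetical.

```python
def decode_commit_message(raw: bytes, fallbacks=("cp1252", "latin-1")) -> str:
    """Hypothetical helper: decode a CVS log message without losing characters.

    UTF-8 is tried first because it rejects most non-UTF-8 input. The
    fallback encodings are guesses and should be checked against real data.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the message, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

A message decoded this way could then be compared with what Bonsai reports for the same revision, to confirm that the chosen encoding is right.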
I've long wanted an official-ish version of mozilla-central with CVS history.

I've entertained the idea of writing a Mercurial extension that allowed us to graft rev 0 of existing mozilla-central onto the CVS history and swallow any SHA-1 errors Mercurial complains about. This way, people could develop against a local full history repo and still push to production without having SHA-1 mismatch.

FTR, any time we attempt to convert *any* repository from CVS to *anything*, the conversion is "fun." As people like dbaron and hwine will tell you, there was some crazy shit that went down in the days of CVS. Our conversions of mozilla-central from CVS to Git have all gotten different parts wrong. I have no doubt we could do a *better* conversion than what's currently in gecko-dev. But we're not going to get it perfect. I'm half tempted to piece together something very hacky and write the Mercurial tools necessary to "regraft" repository histories easily, e.g. we could make it easy to swap out CVS conversion A with conversion B once we fix a bug in the conversion process.
(In reply to Gregory Szorc [:gps] from comment #3)
> I've entertained the idea of writing a Mercurial extension that allowed us
> to graft rev 0 of existing mozilla-central onto the CVS history and swallow
> any SHA-1 errors Mercurial complains about. This way, people could develop
> against a local full history repo and still push to production without
> having SHA-1 mismatch.

If you do add such an extension, I'd like to ask that the graft also transparently handle file moves. Some of the mozilla-central->comm-central grafts I have looked at need to rename some/path to another/path.
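The path handling being asked for could be as simple as a prefix rewrite applied to every file in a grafted changeset. The following is a minimal sketch under that assumption; `remap_path` and the rename table are hypothetical and do not correspond to any existing Mercurial API.

```python
from typing import Mapping


def remap_path(path: str, renames: Mapping[str, str]) -> str:
    """Hypothetical helper: rewrite a path using the longest matching prefix.

    `renames` maps old directory prefixes to new ones, for example
    {"some/path": "another/path"}. Only whole path components match, so
    "some/pathology" is not affected by a rule for "some/path".
    """
    best = None
    for old in renames:
        prefix = old.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best.rstrip("/")):
                best = old
    if best is None:
        return path
    # Keep whatever follows the matched prefix, including its leading "/".
    return renames[best].rstrip("/") + path[len(best.rstrip("/")):]
```

Preferring the longest prefix lets a specific rule for a subdirectory override a broader rule for its parent.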