Closed Bug 797919 Opened 12 years ago Closed 11 years ago

Add a hook which prevents case difference renames from being pushed

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

Attachments

(3 files)

We should add a hook which prevents changesets which includes renames from things like fooBar to FoObAr in order to prevent disasters such as bug 797910.  This needs to be deployed to all of the hg repos we host.
Component: Release Engineering → Hg: Customizations
Assignee: nobody → ted.mielczarek
We run multiple repo systems now - we'll need the matching git hook ready to be deployed as needed as well.
(In reply to comment #1)
> We run multiple repo systems now - we'll need the matching git hook ready to be
> deployed as needed as well.

No, git does not have this bug.
Here's a hook. It passes the tests, so I think it should work.
Comment on attachment 668084 [details] [diff] [review]
Add a hook which prevents case difference renames from being pushed

Review of attachment 668084 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozhghooks/prevent_case_only_renames.py
@@ +24,5 @@
> +    rev = repo.changectx(node).rev()
> +    tip = repo.changectx("tip").rev()
> +    rejecting = False
> +
> +    for i in reversed(xrange(rev, tip + 1)):

Does

range(tip + 1, rev, -1)

work?
Made that change, and also added a test for renaming a directory (since that's what caused this ruckus in the first place):
http://hg.mozilla.org/hgcustom/hghooks/rev/81840580bc95

You'll need to file a new IT bug to get this hook deployed. You want

[hooks]
pretxnchangegroup.renamecase = python:mozhghooks.prevent_case_only_renames.hook

deployed globally.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Ted, correct me if I'm wrong, but this hook specifically looks for stuff renamed using hg mv, right?  If that's the case we should probably change it to look for removals and additions in the same changeset as well, as that was actually what triggered the problem today.
Comment on attachment 668084 [details] [diff] [review]
Add a hook which prevents case difference renames from being pushed

Review of attachment 668084 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozhghooks/prevent_case_only_renames.py
@@ +12,5 @@
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.

y u no MPL?

@@ +24,5 @@
> +    rev = repo.changectx(node).rev()
> +    tip = repo.changectx("tip").rev()
> +    rejecting = False
> +
> +    for i in reversed(xrange(rev, tip + 1)):

I don't think this is the right approach. The underlying problem is the revlog .i files could clash on a case insensitive filesystem, right? Assuming that to be true, I think the proper thing to do is cross-reference the set of revlogs already in the repository with the set that would be in the repository if the changectx is allowed. If there is a case sensitivity conflict, reject the changeset.

Also, the Mercurial people told me that changectx() is not reliable. It should be fine on newer Mercurial versions (which version exactly, I'm not sure). Basically, they said that on some older Mercurial releases changectx() didn't return all the metadata really associated with a changeset. They instead told me to use repo.status(). See also https://github.com/indygreg/hg-git/blob/performance-master/hggit/hg2git.py#L69
(In reply to Ehsan Akhgari [:ehsan] from comment #6)
> Ted, correct me if I'm wrong, but this hook specifically looks for stuff
> renamed using hg mv, right?  If that's the case we should probably change it
> to look for removals and additions in the same changeset as well, as that
> was actually what triggered the problem today.

You can pretty easily try it, or write a test and see if it catches that.

(In reply to Gregory Szorc [:gps] from comment #7)
> y u no MPL?

Mercurial is GPL. Hook code gets loaded in Mercurial.

> I don't think this is the right approach. The underlying problem is the
> revlog .i files could clash on a case insensitive filesystem, right?
> Assuming that to be true, I think the proper thing to do is cross-reference
> the set of revlogs already in the repository with the set that would be in
> the repository if the changectx is allowed. If there is a case sensitivity
> conflict, reject the changeset.
> 
> Also, the Mercurial people told me that changectx() is not reliable. It
> should be fine on newer Mercurial versions (which version exactly, I'm not
> sure). Basically, they said that on some older Mercurial releases
> changectx() didn't return all the metadata really associated with a
> changeset. They instead told me to use repo.status(). See also
> https://github.com/indygreg/hg-git/blob/performance-master/hggit/hg2git.
> py#L69

Please fix this if you know a better solution. I fumbled my way around to this solution.
I'm going to try running the casestop extension: http://mercurial.selenic.com/wiki/CasestopExtension

Since I import from an upstream source with tons of files and a propensity to rename dirs/files, it may me avoid this again.
Ehsan: I assume this is the case that we actually hit? The hook does not in fact catch this.
(In reply to comment #10)
> Created attachment 668409 [details] [diff] [review]
>   --> https://bugzilla.mozilla.org/attachment.cgi?id=668409&action=edit
> hook should prevent remove+add with case change
> 
> Ehsan: I assume this is the case that we actually hit? The hook does not in
> fact catch this.

No, it's not exactly.  The fauly cset was <https://bitbucket.org/ehsan/broken-inbound/changeset/d81b605dc8d8da6d9b7ed3ca3b5d03038c630c30>, which both added and removed the directory in question.  I'll try pushing that to a user repo in a sec...
(In reply to comment #7)
> I don't think this is the right approach. The underlying problem is the revlog
> .i files could clash on a case insensitive filesystem, right?

No, this is about directory names, not file names.
To clarify, there is no problem with case sensitivity of either files or directories in the internal mercurial file store (i.e. files & dirs under .hg/store/data/) for the versions of mercurial we run on hg.m.o. (The case folding plan discussed in the wiki has been implemented - an '_' is prepended to any letter that should be upper case, otherwise the letter is lower case.)

The issue occurs when hg client software tries to map case changes onto non-case sensitive file systems. (e.g. during updates or merges)
The hook indeed does not work as expected.  I managed to push the same faulty changeset here: http://hg.mozilla.org/users/eakhgari_mozilla.com/broken-inbound/rev/d81b605dc8d8
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
After poking around with Mercurial, it appears that when you |hg mv| things it keeps the old .i indexes in the data dir. e.g. a rename of foo/bar.c -> foo/Bar.c resulted in:

.hg/store/data
.hg/store/data/foo
.hg/store/data/foo/_bar.c.i
.hg/store/data/foo/bar.c.i

This in of itself is not a problem because Mercurial encodes filenames in the repo. Good for Mercurial.

I /think/ the actual problem is when one particular changeset contains multiple directories or paths that should be different but wouldn't on a case insensitive filesystem.
I'm confused on what the underlying problem is. Is it that Mercurial doesn't handle moves/renames when performing an update/checkout/switch? Is it that we could currently allow files of mixed case into a single changeset?

If it is the latter, this patch address that. But, this patch removes part of the former. Tell me what conditions we need to prevent and I'll get make a hook for you.
I don't think we have the root cause yet - at small scale, both file/dir case-only renames work fine whether or not done inside or outside hg.

There is some condition that our 'known bad' changeset trips that is not yet well enough described to do a specific hook. The "huge net" is to stop case only renames of files/dirs within a single changeset (needs to be done as 2 change sets: 'a' -> 'A.tmp'; 'A.tmp'-> 'A').

Based on the above, that sounds like we need the hook to sit on two codepaths: that for an 'inside hg' rename (hg mv) and that for a 'outside of hg' rename (where it's a side effect of applying the patch).
There's a set of reproducible steps in http://bz.selenic.com/show_bug.cgi?id=3648.
Any updates here? Are we waiting for an upstream fix?
(In reply to comment #19)
> Any updates here? Are we waiting for an upstream fix?

An upstream fix will only help when everyone updates their client side hg installation.
To sum up:
 - there are no problems with the hg server

 - this is a relatively rare situation that requires both of the following to happen:
   - have a certain type of case changes that typically only occurs when someone is incorporating upstream changes. These cases can also be detected by an extension run on the hg client by the upstream maintainer (see comment #9)
   - the update is performed using an hg client affected by the regression, version 2.1 up (fix still pending) AND on a case-insensitive file system (windows/stock mac)

With luck, the actions :jesup is taking in comment #9 will mean we won't see this issue again. Or at least not until the hg client has been fixed (http://bz.selenic.com/show_bug.cgi?id=3648).

Since we don't have a working hook to prevent this, we are agreeing to use one of the following courses of action if we're unlucky enough to have another occurrence:
 - have affected users downgrade to hg client 2.0.x just until after they've merged the affected code
 - provide access to linux boxes for affected users to do the merge (push their repo, do the merge, pull their repo)
 - discuss whether or not we should strip the hg repo as done last time. The impact on related repos is now much larger, and this may not be an automatic "yes".
I don't know what to do here.
Assignee: ted → nobody
Any objection from anyone on closing wontfix? It never was a server issue (code or data store), and the upstream client bug has been fixed.
Well, we got lucky and got away with ignoring this for 6 months.  We might as well make the implicit wontfix here explicit!
Status: REOPENED → RESOLVED
Closed: 12 years ago11 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → Release Engineering
Product: Release Engineering → Developer Services
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: