Support l20n file in pre-commit hook

RESOLVED INCOMPLETE

Status

RESOLVED INCOMPLETE
3 years ago
a year ago

People

(Reporter: eragonj, Assigned: eragonj)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

In bug 1164346, we did introduce this pre-commit hook to hint developers that if they changed *.properties files, when committing related changes, our hook will print out some messages to tell them what localeId got changed and they should double check again to make sure they won't introduce a new translation under the same localeId.

For this bug, we need to add support of l20n format.
Here comes some proposals based on the comment which Zibi left on mail.

> So, the best strategy for l20n would be to:
> 
> 1) identify .l20n files that are affected by the changeset
> 2) open/parse them with and without the change
> 3) run a simple script to find entities that have values changes but no ID change.
> 
> I can write 3), just tell me which language. 2) is available in JS and Python. Hope that will be sufficient. I don't know how to do 1) :)
> 
> Should we file a bug and work there?
> 
> zb.
(In reply to EJ Chen [:eragonj][:小龍哥][ni? if you need me] from comment #1)
> Here comes some proposals based on the comment which Zibi left on mail.
> 
> > So, the best strategy for l20n would be to:
> > 
> > 1) identify .l20n files that are affected by the changeset

I think this should work -> `git diff --staged --name-only --diff-filter=ACMRT | grep '\.l20n$'`

> > 2) open/parse them with and without the change

If we have 3), I think we don't need this ?

> > 3) run a simple script to find entities that have values changes but no ID change.

Vote for this one, what we need to know here is to use changed diffs to match back its localeId and it's enough ! Because we don't care too much about the other intact content, we just want this localeId and then print it out.

Any idea :) ?
(In reply to EJ Chen [:eragonj][:小龍哥][ni? if you need me] from comment #2)
> (In reply to EJ Chen [:eragonj][:小龍哥][ni? if you need me] from comment #1)
> > Here comes some proposals based on the comment which Zibi left on mail.
> > 
> > > So, the best strategy for l20n would be to:
> > > 
> > > 1) identify .l20n files that are affected by the changeset
> 
> I think this should work -> `git diff --staged --name-only
> --diff-filter=ACMRT | grep '\.l20n$'`

Yup.

> > > 2) open/parse them with and without the change
> 
> If we have 3), I think we don't need this ?

I don't know if we can avoid that. L20n is semantically more complex than .properties. As you can see in the grammar[0] document

I think we will want to be able to parse the whole entity and compare it to the same entity in another file. If the values changed, but ID didn't, we report that.

If we have to parse the entity, we can as well parse the whole file of those entities and look for differences.

In other words, I don't know if it would be easier to take file example.en-US.l20n, parse it with and without the changes and loop through the entities launching isEntityChanged(old, new) or parse the entity out of a diff.

> > > 3) run a simple script to find entities that have values changes but no ID change.
> 
> Vote for this one, what we need to know here is to use changed diffs to
> match back its localeId and it's enough ! Because we don't care too much
> about the other intact content, we just want this localeId and then print it
> out.

I think we do. L20n entity may have a value and arbitrary number of attributes, like this:

<entity1 "Value of the Entity"
 placeholder: "Value of the placeholder">

If either of the values changed, we would like to update the entity1's ID.

What's more, the value may be a hash:

<entity2 {
  zero: "Value for zero",
  other: "Value for other"
}
  placeholder:  {
    zero: "Attribute for zero",
    other: "Attribute for other"
  }
>

Now, the rule is that if the *value* of any key changed, we want to update the entity ID. If, on the other hand the key has been added, or the ID of the key has been changed, we don't need to update the entity ID.

So, change like this:

<entity2 {
  zero: "New value for zero",
  other: "Value for other"
}
  placeholder:  {
    zero: "Attribute for zero",
    other: "Attribute for other"
  }
>

should trigger your hook, but this:

<entity2 {
  zero: "Value for zero",
  one: "Value for one",
  other: "Value for other"
}
  placeholder:  {
    zero: "Attribute for zero",
    other: "Attribute for other"
  }
>

shouldn't.

I believe we need to parse the entity into its AST and compare two ASTs. I'm happy to write the function that will compare those two ASTs and we have parsers that can parse them.

Now, if you don't want to use L20n parser to parse the whole file, we need code that will pick the whole entity out of the diff. L20n Parser can parse just the entity into AST and then we can compare the AST's.

Does it sound like a plan?

[0] http://l20n.github.io/spec/grammar.html
The reason why I didn't parse anything in my original implementation is because that there is one scenario that we can't report - "If developers are just trying to update a typo."

For this typo problem, for any kind of implementation, we can't easily recognize it, or we have no way to recognize it. So that's in my original patch, I just tried to know which translation is changed (no matter why), then find its localeId and report it.

Any idea for this typo problem ? :)
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.