Bug 1659691 Comment 0 Edit History


Original comment by

Francesco Lodolo [:flod]

on 2020-08-18 04:41:24 PDT

As discussed in our last l10n+RelEng meeting, I'm filing this bug to keep track of an alternative approach for cross-channel.

The general idea is to keep using the current cross-channel code without changes for as long as it works, and to keep this alternative as a fallback.

Pros:
- No ties to Mercurial's internals
- Generally better code

Cons:
- gecko-strings becomes useless for blame/annotate
- It would become harder to understand what caused issues in the output

I'm attaching Axel's notes here, along with a patch that could be resurrected if we need to go down this road.

> The history-aware code-base for l10n cross-channel needs a lot of
> information on the graph and data from the Mercurial source
> repositories. Thus, it's written as Python code directly interacting
> with the internal APIs of Mercurial.
> 
> That it has close ties to the data enables it to create very meaningful
> output, replaying history in the gecko-strings repository very strongly
> tied to the history in mozilla-central. Being able to reason about the
> generated commits in strong context of the original change it links to
> makes reviewing quarantine much more efficient.
> 
> The downside is that new combinations of data can trigger crashing bugs
> in the code, or unwanted output. Finding out where the problem comes
> from exactly is hard. Also, updates to Mercurial require updates to the
> code, though this hasn't happened since Mercurial 4.8 back in May 2019.
> 
> Another mixed bag is the split between mozilla-central and comm-central
> history. With both repository families being processed independently,
> it's hard to reason about which files in the repository are actually
> needed, as they could be needed by the other repo. This is a challenge when
> dropping branches or projects.
> 
> An alternative approach is creating all content for one set of revisions
> among all repositories, ignoring history. The needs here are limited to
> revision identification, and loading data for a file and a revision.
> This can be safely done with hglib. That way we have a stable API,
> though it's not terribly performant. Also, there's no need to analyze
> the history graph, so it's just overall less code.
> 
> Plus, it's new code, so it's genuinely in better shape.
> 
> The downside is that changes to gecko-strings can't be traced back to
> original commits immediately. One needs to investigate all revranges in
> both repos and all branches to see which commit probably changed the
> input responsible for the offending output.
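
To make the history-free approach in the notes more concrete, here is a minimal sketch, not the attached patch. It assumes one pinned revision per repository and collects, for each path, the file content from every repository that has it. The names `snapshot`, `hglib_loader`, and `Loader` are hypothetical, and so are the example paths in the usage note. How the per-repository contents get merged into gecko-strings is deliberately left out. The `hglib_loader` helper uses python-hglib's `open`, `root`, and `cat` and has not been run against the real repositories.

```python
import os
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# (repo name, revision, path) -> file content, or None if the file is absent.
Loader = Callable[[str, str, str], Optional[bytes]]


def snapshot(
    revisions: Dict[str, str],
    paths: Iterable[str],
    load: Loader,
) -> Dict[str, List[Tuple[str, bytes]]]:
    """Collect file contents for one fixed set of revisions, ignoring history.

    `revisions` maps a repository name to the single revision to read from.
    The result maps each path to the (repo, content) pairs that exist.
    """
    result: Dict[str, List[Tuple[str, bytes]]] = {}
    for path in paths:
        for repo, rev in revisions.items():
            data = load(repo, rev, path)
            if data is not None:
                result.setdefault(path, []).append((repo, data))
    return result


def hglib_loader(repo_dirs: Dict[str, str]) -> Loader:
    """Build a Loader on top of python-hglib (assumed to be installed).

    `repo_dirs` maps a repository name to its local clone directory.
    """
    import hglib  # third-party; imported lazily so snapshot() works without it
    from hglib.error import CommandError

    clients = {name: hglib.open(path) for name, path in repo_dirs.items()}

    def load(repo: str, rev: str, path: str) -> Optional[bytes]:
        client = client_for = clients[repo]
        # hglib works with bytes; use an absolute path so the lookup does
        # not depend on the current working directory.
        full_path = os.path.join(client_for.root(), path.encode())
        try:
            return client.cat([full_path], rev=rev.encode())
        except CommandError:
            # Assumption: any failure here means the file is not in `rev`.
            return None

    return load
```

With local clones available, usage could look like `snapshot({"mozilla-central": "default", "comm-central": "default"}, ["browser/locales/en-US/browser/example.ftl"], hglib_loader({...}))`. Because `snapshot` only depends on the `Loader` callable, it can be exercised with an in-memory fake, which also illustrates the trade-off from the notes: the output carries no link back to the commits that produced it.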

Tip at the time of creating this bug was https://hg.mozilla.org/users/axel_mozilla.com/cross-channel-experimental/rev/0be59f582ee00f88d74b87652544a8cf2be33593

Revision 1 by

Francesco Lodolo [:flod]

on 2020-08-18 04:43:42 PDT

As discussed in our last l10n+RelEng meeting, I'm filing this bug to keep track of an alternative approach for cross-channel.

The general idea is to keep using the current cross-channel code without changes for as long as it works, and to keep this alternative as a fallback.

Pros:
- No ties to Mercurial's internals
- Generally better code

Cons:
- gecko-strings becomes useless for blame/annotate
- It would become harder to understand what caused issues in the output

I'm attaching Axel's notes here, along with a patch that could be resurrected if we need to go down this road.

> The history-aware code-base for l10n cross-channel needs a lot of
> information on the graph and data from the Mercurial source
> repositories. Thus, it's written as Python code directly interacting
> with the internal APIs of Mercurial.
> 
> That it has close ties to the data enables it to create very meaningful
> output, replaying history in the gecko-strings repository very strongly
> tied to the history in mozilla-central. Being able to reason about the
> generated commits in strong context of the original change it links to
> makes reviewing quarantine much more efficient.
> 
> The downside is that new combinations of data can trigger crashing bugs
> in the code, or unwanted output. Finding out where the problem comes
> from exactly is hard. Also, updates to Mercurial require updates to the
> code, though this hasn't happened since Mercurial 4.8 back in May 2019.
> 
> Another mixed bag is the split between mozilla-central and comm-central
> history. With both repository families being processed independently,
> it's hard to reason about which files in the repository are actually
> needed, as they could be needed by the other repo. This is a challenge when
> dropping branches or projects.
> 
> An alternative approach is creating all content for one set of revisions
> among all repositories, ignoring history. The needs here are limited to
> revision identification, and loading data for a file and a revision.
> This can be safely done with hglib. That way we have a stable API,
> though it's not terribly performant. Also, there's no need to analyze
> the history graph, so it's just overall less code.
> 
> Plus, it's new code, so it's genuinely in better shape.
> 
> The downside is that changes to gecko-strings can't be traced back to
> original commits immediately. One needs to investigate all revranges in
> both repos and all branches to see which commit probably changed the
> input responsible for the offending output.
