Bug 1659691 Comment 0 Edit History
As discussed in our last l10n+RelEng meeting, I'm filing a bug to keep track of this alternative approach for cross-channel.
The general idea is to keep using the current cross-channel code without changes, as long as it works.
Pros of the alternative:
- No ties to Mercurial's internals
- Generally better code
Cons:
- gecko-strings becomes useless for blame/annotate
- It would become harder to understand what caused issues in the output
Attaching Axel's notes here, along with a patch that could be resurrected in case we need to go down this road.
> The history-aware code-base for l10n cross-channel needs a lot of
> information on the graph and data from the Mercurial source
> repositories. Thus, it's written as Python code directly interacting
> with the internal APIs of Mercurial.
>
> Its close ties to the data enable it to create very meaningful
> output, replaying history in the gecko-strings repository very strongly
> tied to the history in mozilla-central. Being able to reason about the
> generated commits in strong context of the original change they link to
> makes reviewing quarantine much more efficient.
>
> The downside is that new combinations of data can trigger crashing bugs
> in the code, or unwanted output. Finding out exactly where the problem
> comes from is hard. Also, updates to Mercurial require updates to the
> code, though this hasn't happened since Mercurial 4.8 back in May 2019.
>
> Another mixed bag is the split between mozilla-central and comm-central
> history. With both repository families being processed independently,
> it's hard to reason about which files in the repository are actually
> needed, as they could be needed by the other repo. This is a challenge
> when dropping branches or projects.
>
> An alternative approach is creating all content for one set of revisions
> among all repositories, ignoring history. The needs here are limited to
> revision identification, and loading data for a file and a revision.
> This can be safely done with hglib. That way we have a stable API,
> though it's not terribly performant. Also, there's no need to analyze
> the history graph, so it's just overall less code.
>
> Plus, it's new code, so it's genuinely in better shape.
>
> The downside is that changes to gecko-strings can't be traced back to
> original commits immediately. One needs to investigate all revranges in
> both repos and all branches to see which commit probably changed the
> input responsible for the offending output.

Tip at the time of creating this bug was https://hg.mozilla.org/users/axel_mozilla.com/cross-channel-experimental/rev/0be59f582ee00f88d74b87652544a8cf2be33593
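For illustration only, here is a rough sketch of what the history-ignoring approach in Axel's notes might look like: pin one revision per repository, load each file at that revision, and combine the results. This is not the code from the attached patch. `hglib.open`, `client.identify` and `client.cat` are python-hglib calls; everything else (`build_snapshot`, `merge_properties`, the `Loader` callable, the repository names) is made up for this example, and the naive "first source wins" key merge is only a stand-in for whatever string merging cross-channel really does. The two hglib helpers need a local clone and are not exercised here.

```python
import os
from typing import Callable, Dict, List, Optional, Tuple

# (repository path, revision) pinned for one run. Hypothetical type alias.
Source = Tuple[str, str]
# Hypothetical loader: (repo, rev, file path) -> content, or None if absent.
Loader = Callable[[str, str, str], Optional[bytes]]


def hglib_identify(repo_path: str, rev: str) -> str:
    """Resolve a revision name to a changeset id via python-hglib."""
    import hglib  # imported lazily so the rest of the sketch runs without it

    client = hglib.open(repo_path)
    try:
        return client.identify(rev=rev.encode(), id=True).strip().decode()
    finally:
        client.close()


def hglib_load(repo_path: str, rev: str, file_path: str) -> Optional[bytes]:
    """Load one file at one revision via python-hglib; None if `hg cat` fails."""
    import hglib

    client = hglib.open(repo_path)
    try:
        full_path = os.path.join(repo_path, file_path).encode()
        return client.cat([full_path], rev=rev.encode())
    except hglib.error.CommandError:
        return None
    finally:
        client.close()


def merge_properties(versions: List[bytes]) -> bytes:
    """Naive union of `key = value` lines; the first source defining a key wins."""
    merged: Dict[bytes, bytes] = {}
    for content in versions:
        for line in content.splitlines():
            key, sep, value = line.partition(b"=")
            if not sep or line.lstrip().startswith(b"#"):
                continue  # skip comments and lines without a key/value pair
            merged.setdefault(key.strip(), value.strip())
    return b"".join(k + b" = " + v + b"\n" for k, v in merged.items())


def build_snapshot(
    sources: List[Source], paths: List[str], load: Loader
) -> Dict[str, bytes]:
    """Create merged content for one set of pinned revisions, ignoring history."""
    snapshot: Dict[str, bytes] = {}
    for path in paths:
        versions = []
        for repo, rev in sources:
            content = load(repo, rev, path)
            if content is not None:
                versions.append(content)
        if versions:  # files missing from every source are left out
            snapshot[path] = merge_properties(versions)
    return snapshot
```

Passing the loader in as a parameter keeps the merge logic independent of Mercurial: `build_snapshot(sources, paths, hglib_load)` would go through hglib, while a dictionary-backed fake loader is enough to try the merge without any clone. Note that nothing in the output records which commit contributed a string, which is exactly the traceability downside described in the notes.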