beta l10n is broken for locale be and lij

NEW
Unassigned

Status

enhancement
6 months ago
5 months ago

People

(Reporter: jlund, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 obsolete attachment)

needs l10n-changeset bump to take on new patch in l10n-central/be
Summary: beta l10n is broken for locale be → beta l10n is broken for locale be and lij
gandalf> delphine: so, there's an error in `be` - https://hg.mozilla.org/l10n-central/be/file/tip/toolkit/toolkit/about/aboutAddons.ftl#l76
14:24:32 reported here https://l10n.mozilla.org/dashboard/compare?run=1109301
14:24:40 which prevents beta4 build :(

delphine fixed up l10n-central/{be,lij} repos.

l10n bumper did not pick up the new revs as it needs to be blessed somehow in: https://l10n.mozilla.org/shipping/l10n-changesets?av=fx64

So we manually bumped in-tree l10n revs via:

https://hg.mozilla.org/releases/mozilla-beta/rev/aeac14f57e9a41301655ae8b488c8a2563d2b615
https://hg.mozilla.org/releases/mozilla-beta/rev/6ac07ce08ab8ff40601bd3bfe952f78bbd479434

Now we need to cancel build1 and start build2 for dev and beta

Comment 2

6 months ago
For the l10n team:

1. can we bump the signed-off version for beta? we should do this before b5 gtb; or before b4 build3 if needed.
2. is there a way we can prevent signing off on revisions that show as busted on the dashboard? If there are levels of bustedness, maybe we need 3 statuses (good, broken but shippable, broken and unshippable)
(In reply to Aki Sasaki [:aki] from comment #2)
> For the l10n team:
> 
> 1. can we bump the signed-off version for beta? we should do this before b5
> gtb; or before b4 build3 if needed.

n-i to flod

> 2. is there a way we can prevent signing off on revisions that show as
> busted on the dashboard? If there are levels of bustedness, maybe we need 3
> statuses (good, broken but shippable, broken and unshippable)

n-i to flod as well

and I'll add a new item

3. Is this the sort of error that in tree packaging should catch (like with compare-locales) and error out if we hit it?

n-i to Pike for this one
Flags: needinfo?(l10n)
Flags: needinfo?(francesco.lodolo)
I've updated the sign-offs for both locales. As said in the email thread, I understand why builds would be busted in the first place.

1) We shouldn't die on one error of unparsed content.

2) That content is not used in Beta, it's only used in Nightly.
Sign-offs happen on Beta, so there's no visibility for errors
happening in Nightly only.
Flags: needinfo?(francesco.lodolo)

Comment 5

6 months ago
(In reply to Justin Wood (:Callek) from comment #3)
> 3. Is this the sort of error that in tree packaging should catch (like with
> compare-locales) and error out if we hit it?

Yes, and it should just build. I'll need a link to the actual bustage to see why this doesn't work as designed.
Flags: needinfo?(l10n)

Comment 6

6 months ago
Jordan, can you point me to the broken builds and logs?
Flags: needinfo?(jlund)
(In reply to Axel Hecht [:Pike] from comment #6)
> Jordan, can you point me to the broken builds and logs?

the failed release task:

https://taskcluster-artifacts.net/fPTlnhPwT7qHYL6C62ZQTQ/5/public/logs/live_backing.log

irc context in debugging `be`

gandalf> jlund: well, the fix is literally to move the `.` character to the end of the previous line, sooo...
14:21:50 I think we're conficent
14:22:02 the question is who can do that, I have no been working with l10n-central ever :)
14:22:33 I asked my colleague - delphine, who may be able to help
14:22:37 she's joining this channel
14:23:31 
<•jlund> okay, given it's out of your knowledge, and unless delphine says otherwise, I would rather use a known good revision from the past.
14:23:47 
<gandalf> makes sense, let's wait for her pls
<gandalf> delphine: so, there's an error in `be` - https://hg.mozilla.org/l10n-central/be/file/tip/toolkit/toolkit/about/aboutAddons.ftl#l76
14:24:32 reported here https://l10n.mozilla.org/dashboard/compare?run=1109301
14:24:40 which prevents beta4 build :(
14:24:42 
<delphine> Ok I'll take a look, and let you know if I can help
14:25:06 
<gandalf> jlund suggests that we revert to the last good revision, or we fix it manually by moving the `.` to the previous line
14:25:15 can you advise on the course of action here?
14:25:30 
<•jlund> tx
14:26:46 
<delphine> ok just give me a bit of time, I'll look into this asap
14:27:33 
<gandalf> thank you!
14:28:05 
<•jlund> no rush. fwiw - I won't be able to push myself: `ssh hg.mozilla.org` tells me "You will NOT have write access to the following repos: Localization Repos (releases/l10n/*, others)"
14:28:10 
<delphine> gandalf: I don't see that it's localized in Pontoon though: https://pontoon.mozilla.org/be/firefox/toolkit/toolkit/about/aboutAddons.ftl/?search=discover-description&string=192091
14:28:13 we're talking about this string right?
14:28:29 
<gandalf> yes
14:28:52 that's what is in the hg repository - https://hg.mozilla.org/l10n-central/be/file/tip/toolkit/toolkit/about/aboutAddons.ftl#l76
14:28:59 not sure why pontoon doesn't show it :(
14:29:25 
<delphine> ok he may be working on mercurial
14:29:34 
<Aryx> no
14:29:37 
<delphine> but wouldn't it sync back to pontoon?
14:29:39 ok so I dunno
14:29:59 
<Aryx> this is from automatic migration, try moving the full stop at the end of the previous line
14:29:59 
<gandalf> looking at https://hg.mozilla.org/l10n-central/be/log/tip/toolkit/toolkit/about/aboutAddons.ftl 
14:30:08 Aryx: oh, maybe
14:30:18 
<Aryx> it likely expects something after the stop as a property of the entity
14:30:30 
<gandalf> yes, that's why the error is there, moving the dot will fix it
14:30:45 
<Aryx> not sure if fluent supports multi-line texts
14:30:49 
<gandalf> it does
14:30:59 we should improve our migration tools to not migrate if the result is a broken FTL
14:31:12 
<delphine> hmmm, so if I just copy/paste the string into pontoon but move the dot it should fix it at next sync?
14:32:27 
<Aryx> delphine: my theory, trying to verify locally but toolchain wants an update just now
14:33:03 
<delphine> Aryx: what I can do is try what I just said. And I can't do much more, unfortunately
14:33:09 
<gandalf> delphine: I'm surprised pontoon didn't sync back from VCS
14:34:11 
<delphine> well, yeah. that's why I'm scared doing anything :) 
14:34:30 gandalf: want me to try putting it in pontoon? that's the best I can do at this point
14:34:31 
<Aryx> "This will give you write access to the following repos:   Autoland (integration/autoland), Firefox Repos (mozilla-central, releases/*), Localization Repos (releases/l10n/*, others), Project Repos (projects/), Try, User Repos (users/)"
14:35:16 
<gandalf> delphine: yeah!
14:35:25 
<delphine> ok!
14:35:44 gandalf: done: https://pontoon.mozilla.org/be/firefox/toolkit/toolkit/about/aboutAddons.ftl/?search=discover-description&string=192091
14:35:47 let's see if that works!
14:36:36 
<gandalf> delphine: do you know how often pontoon syncs?
14:36:43 
<delphine> yup
14:36:48 
<gandalf> jlund: we'll also want to update the revision for the release to it, right?
14:36:50 
<delphine> https://pontoon.mozilla.org/sync/log/
14:37:00 should be around :50
14:37:10 hope it fixes it...
14:40:13 
<gandalf> ok, I gotta run now. I'll be back in 1h if you need me
14:43:38 
— •jlund reads scrollback
14:45:15 
<•jlund> gandalf: what revision?
14:45:42 delphine: ^
15:00:10 
<Aryx> failure is gone: https://l10n.mozilla.org/teams/be
Flags: needinfo?(jlund)

Comment 8

6 months ago
OK, figured out what's happening.

We're over-shipping Fluent files. That is, because we're using **/*.ftl in our jar.mns, we're packaging Fluent files which are not in en-US, and not used in the build. Now, compare-locales doesn't run a check on a localized file that doesn't have an en-US file. Which means that aboutAddons.ftl got copied over to the merge dir as is, unsanitized. Which means that we shipped an un-used Fluent file with parser errors.

If I introduce a Fluent error in a file that exists in en-US, the packaged version gets stripped off that error.

The log for `installers-be` in https://taskcluster-artifacts.net/WMHyK4mBREKR0eWJrM7X9g/0/public/logs/live_backing.log shows no errors, corresponding to what flod saw in the signoff view on l10n.m.o.

Error assessment from my POV:

The one error that Fluent can't really cope well with at runtime is bad mixtures of values and attributes, which would set text or attributes on DOM nodes that the developer didn't expect. As that's only affecting files we have in en-US, my testing shows that we're covered here.

Error recovery from parse errors works fine in Fluent, and thus trigger of this incident was more of a nuance, IMHO.
(In reply to Axel Hecht [:Pike] from comment #8)
> Error recovery from parse errors works fine in Fluent, and thus trigger of
> this incident was more of a nuance, IMHO.

A nuance, sure, however we can't have fluent issues breaking builds at beta/release/esr build time. Especially if the build is to recover from a security zero day (wasn't the case here).

So, that brings me to what our solution is, do we

A) make language packs (at least for now) strip files not in en-US somehow?
B) make AMO not error (and only warn) on invalid .ftl?
C) make compare-locales error-check .ftl even if not present in en-US? 
D) make beta l10n-dashboard error check .ftl not used in beta (so signoff doesn't hit this)?
E) All of the above?
F) <something else, or some combination of above>

My main goals here are resiliency for users and automation, how can we achieve that now that we know root cause?
Flags: needinfo?(l10n)
(In reply to Justin Wood (:Callek) from comment #9)
> (In reply to Axel Hecht [:Pike] from comment #8)
> > Error recovery from parse errors works fine in Fluent, and thus trigger of
> > this incident was more of a nuance, IMHO.
> 
> A nuance, sure, however we can't have fluent issues breaking builds at
> beta/release/esr build time. Especially if the build is to recover from a
> security zero day (wasn't the case here).

I agree.

> So, that brings me to what our solution is, do we
> 
> A) make language packs (at least for now) strip files not in en-US somehow?

I'm not sure if we should explicitly forbid that. This is also a question for the build folks, I think. For Fluent, we (I) want a build step that accepts missing files, which we're currently emulating with the jar.mn wildcard logic. The wildcard also has the benefit that we're not recreating the mess where source files and runtime files had no useful mapping. There's probably something more expressive and intentional we can do for building, packaging and repackaging Fluent.

> B) make AMO not error (and only warn) on invalid .ftl?

I think this should be worded stronger, and we should really do the following: the language pack signing process should not make decisions about the worthiness of shipping something.
If there's something that shouldn't ship, we need to have found that way earlier in the process.

> C) make compare-locales error-check .ftl even if not present in en-US? 
> D) make beta l10n-dashboard error check .ftl not used in beta (so signoff
> doesn't hit this)?

C) and D) would practically be the same thing. Filed bug 1502935.

> E) All of the above?
> F) <something else, or some combination of above>

F) We'll need to have clear engagement rules for bustages in localized releases. There was positive feedback about how we tracked the langpack-removal incident. Slack channel, incident document, clear owners and severity assessment would have helped here, too, I think.
Flags: needinfo?(l10n)
critic is buggy (see bug 1508686 about passing wrong data in).
Fixing the bugs seems to be a rabbit hole, so gps would rather
remove it.
Use ./mach lint instead.
Attachment #9029270 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.