Fluent migration generates broken FTL when migrated pattern starts or ends with a new line

RESOLVED FIXED

Status

enhancement
RESOLVED FIXED
7 months ago
6 months ago

People

(Reporter: flod, Assigned: Pike)

Tracking

Details

Attachments

(1 attachment)

(Reporter)

Description

7 months ago
Migration from bug 1488788 generates a broken FTL.

https://searchfox.org/mozilla-central/rev/bdc89dfd7869e418d788b28eb60ab8d94e708a15/browser/locales/en-US/chrome/browser/aboutRestartRequired.dtd#8

Results in

restart-required-description =
    { "
    " }<p>We have just installed an update in the background. Click Restart { -brand-short-name } to finish applying it.</p>
    <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
(Reporter)

Updated

7 months ago
Summary: Fluent migration generates broken FTL → Fluent migration generates broken FTL when DTD string starts with a new line
Do you think it's safe to assume that the newline at the beginning of the string in question is there for formatting purposes?

    <!ENTITY restartRequired.description "
    <p>We have just installed an update in the background. Click Restart &brandShortName; to finish applying it.</p>
    <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
    ">

In FTL, would we want to format it as follows?

    restart-required-description =
        <p>We have just installed an update in the background. Click Restart { -brand-short-name } to finish applying it.</p>
        <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
(Reporter)

Comment 2

7 months ago
(In reply to Staś Małolepszy :stas (back on Sept 17) from comment #1)
> Do you think it's safe to assume that the newline at the beginning of the
> string in question is there for formatting purposes?

Yes, that's the only explanation.

See a different example, where the 2nd and 3rd line are indented to align with the 1st one
https://hg.mozilla.org/l10n/gecko-strings/file/default/toolkit/chrome/global/aboutAbout.dtd#l6

In that case we carry over a ton of indentation (bug 1486934), but the FTL is valid.

> In FTL, would we want to format it as follows?
> 
>     restart-required-description =
>         <p>We have just installed an update in the background. Click Restart
> { -brand-short-name } to finish applying it.</p>
>         <p>We will restore all your pages, windows and tabs afterwards, so
> you can be on your way quickly.</p>

I think so, yes.
(Assignee)

Comment 3

7 months ago
So, part of this bug is going to be fixed by bug 1491833, where we strip and normalize white-space if the migration wants that.

But in other cases, we shouldn't create invalid Fluent. Resummarizing to reflect my research:

The problem is that transforms.extract_whitespace extracts \s, instead of ' '. And it should insert an empty {""} to protect leading and trailing newlines, if needed.

Question, should we normalize \t to ' '? Normalize \r -> \n? No idea if we have current cases for that, though.
Summary: Fluent migration generates broken FTL when DTD string starts with a new line → Fluent migration generates broken FTL when migrated pattern starts or ends with a new line
(Assignee)

Comment 4

6 months ago
flod, should we generally switch on `trim` when migrating DTD strings?

Trying to find out if trim would have fixed `lij` in bug 1491677.
Flags: needinfo?(francesco.lodolo)
(Reporter)

Comment 5

6 months ago
I don't think we should, unless we add yet another switch to disable trimming for some scenarios.

There are case (start+link+end) where trailing and leading whitespaces are meaningful, and to migrate those with trimmed strings we would need to make a lot of guessing.
Flags: needinfo?(francesco.lodolo)
(Assignee)

Comment 6

6 months ago
OK, let's not change the default right now. Talking to flod, we'll start observing what's more common, and then see if we should make the trim logic default, or maybe just default for DTDs.
(Assignee)

Updated

6 months ago
Assignee: nobody → l10n
(Assignee)

Comment 7

6 months ago
This is in line with our changes to Fluent to make inline
indents just be spaces.
It also changes the leading/trailing white-space regexes to be
strictly about start to end of the string, which is good for
multi-line transforms.
The trickery around the text vs block_text regexes is mostly
to make the \n be part of text, but it needs to be an
alternative to whitespace.
(Assignee)

Comment 8

6 months ago
Landed as https://hg.mozilla.org/l10n/fluent-migration/rev/3877312dc1f42d01aa7451bb3c00a01192200e7c.
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.