Closed Bug 1491738 Opened 6 years ago Closed 6 years ago

Fluent migration generates broken FTL when migrated pattern starts or ends with a new line

Categories

(Localization Infrastructure and Tools :: Fluent Migration, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: flod, Assigned: Pike)

References

Details

Attachments

(1 file)

Migration from bug 1488788 generates a broken FTL.

https://searchfox.org/mozilla-central/rev/bdc89dfd7869e418d788b28eb60ab8d94e708a15/browser/locales/en-US/chrome/browser/aboutRestartRequired.dtd#8

Results in

restart-required-description =
    { "
    " }<p>We have just installed an update in the background. Click Restart { -brand-short-name } to finish applying it.</p>
    <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
Summary: Fluent migration generates broken FTL → Fluent migration generates broken FTL when DTD string starts with a new line
Do you think it's safe to assume that the newline at the beginning of the string in question is there for formatting purposes?

    <!ENTITY restartRequired.description "
    <p>We have just installed an update in the background. Click Restart &brandShortName; to finish applying it.</p>
    <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
    ">

In FTL, would we want to format it as follows?

    restart-required-description =
        <p>We have just installed an update in the background. Click Restart { -brand-short-name } to finish applying it.</p>
        <p>We will restore all your pages, windows and tabs afterwards, so you can be on your way quickly.</p>
(In reply to Staś Małolepszy :stas (back on Sept 17) from comment #1)
> Do you think it's safe to assume that the newline at the beginning of the
> string in question is there for formatting purposes?

Yes, that's the only explanation.

See a different example, where the 2nd and 3rd line are indented to align with the 1st one
https://hg.mozilla.org/l10n/gecko-strings/file/default/toolkit/chrome/global/aboutAbout.dtd#l6

In that case we carry over a ton of indentation (bug 1486934), but the FTL is valid.

> In FTL, would we want to format it as follows?
> 
>     restart-required-description =
>         <p>We have just installed an update in the background. Click Restart
> { -brand-short-name } to finish applying it.</p>
>         <p>We will restore all your pages, windows and tabs afterwards, so
> you can be on your way quickly.</p>

I think so, yes.
So, part of this bug is going to be fixed by bug 1491833, where we strip and normalize white-space if the migration wants that.

But in other cases, we shouldn't create invalid Fluent. Resummarizing to reflect my research:

The problem is that transforms.extract_whitespace extracts \s, instead of ' '. And it should insert an empty {""} to protect leading and trailing newlines, if needed.

Question, should we normalize \t to ' '? Normalize \r -> \n? No idea if we have current cases for that, though.
Summary: Fluent migration generates broken FTL when DTD string starts with a new line → Fluent migration generates broken FTL when migrated pattern starts or ends with a new line
flod, should we generally switch on `trim` when migrating DTD strings?

Trying to find out if trim would have fixed `lij` in bug 1491677.
Flags: needinfo?(francesco.lodolo)
I don't think we should, unless we add yet another switch to disable trimming for some scenarios.

There are case (start+link+end) where trailing and leading whitespaces are meaningful, and to migrate those with trimmed strings we would need to make a lot of guessing.
Flags: needinfo?(francesco.lodolo)
OK, let's not change the default right now. Talking to flod, we'll start observing what's more common, and then see if we should make the trim logic default, or maybe just default for DTDs.
Assignee: nobody → l10n
This is in line with our changes to Fluent to make inline
indents just be spaces.
It also changes the leading/trailing white-space regexes to be
strictly about start to end of the string, which is good for
multi-line transforms.
The trickery around the text vs block_text regexes is mostly
to make the \n be part of text, but it needs to be an
alternative to whitespace.
Landed as https://hg.mozilla.org/l10n/fluent-migration/rev/3877312dc1f42d01aa7451bb3c00a01192200e7c.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: