Support localization using L10nRegistry+Fluent in unprivileged content

RESOLVED FIXED

Status

enhancement
P3
normal
RESOLVED FIXED
2 years ago
7 months ago

People

(Reporter: gandalf, Unassigned)

Tracking

(Blocks 4 bugs)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 obsolete attachment)

At the moment, localization using the new API requires access to L10nRegistry and at least MessageContext, both of which are JSMs in the chrome context.

The current workflow is to call L10nRegistry.generateContexts, which is an async generator that lazily returns MessageContext objects.

Then some l10n API (either MessageContext, Localization, DOMLocalization or fluent-react) uses this generator for translations.

We can probably do without access to platform's Fluent from content and ask those projects to bundle their own Fluent if needed, but we do need access to L10nRegistry.

:mossop - you said that it should be possible to create some WebIDL API for unprivileged content to get access to L10nRegistry. Can you elaborate?
(Reporter)

Updated

2 years ago
Blocks: 1365426
Priority: -- → P3
(Reporter)

Updated

2 years ago
Flags: needinfo?(dtownsend)
I guess there are a couple of options.

1. Create a webidl interface that the page can call to get itself localised.
2. Create a framescript that listens for page loads and does the localisation based on info in the page.

The latter might be the most straightforward and doesn't require DOM sign-offs. I can imagine something that listens for about: page loads and then checks them for some meta tags that describe where l20n resources come from and then does the localisation with direct access to the DOM.

Does that make sense?
Flags: needinfo?(dtownsend)
(Reporter)

Comment 2

2 years ago
Thank you Dave!

We'll look into using a framescript for that. One question that came up is that a framescript seems able to communicate only serializable data, as JSON structures.

Serializing is probably possible, but it would require us to dig into the internal APIs of Fluent. What we'd like to do here is pass MessageContext objects around, or even a generator of them.
Is there any way we could use WebIDL or some other means to describe the MessageContext object structure, and as a result allow such objects to be passed between the framescript and the content?
Flags: needinfo?(dtownsend)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #2)
> Thank you Dave!
> 
> We'll look into using framescript for that. One question that popped to us
> is that it seems that framescript can only communicate serializable data via
> JSON structure.

The other alternative is to just load the l10n in the child processes rather than serialising them across the process boundaries. 

> This is probably possible but require us to dig into internal APIs of
> Fluent. What we'd like to do here is pass MessageContext objects around, or
> even a generator of them.
> Is there any way that we could design some WebIDL or other mean to describe
> the MessageContext object structure and in result allow such objects to be
> passed between framescript and the content?

I don't know how to do that, billm or mconley might be able to help you there.
Flags: needinfo?(dtownsend)
(Reporter)

Comment 4

2 years ago
The new l10n infrastructure uses async generators to create a lazy fallback chain of objects called `MessageContext`.

Each object is, if you squint, a list of localization entries and some metadata for their formatting.

So, in chrome process all we do is:

1) Take a list of resource IDs that are required to localize a given piece (say: ["./main.ftl", "./common/brand.ftl"])
2) Build a locale list negotiated between locales requested by the user and available in L10nRegistry
3) Call L10nRegistry.generateContexts(locales, ["./main.ftl", "./common/brand.ftl"])
4) This returns an asynchronous generator over those MessageContext objects
5) We take the first of those objects that has the right translation and use it to localize

This model allows us to lazily fallback on fallback locales thanks to using a generator, and thanks to said generator being async we can do async I/O to build `MessageContext` objects.
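
The five steps above can be sketched with a mock registry. This is only an illustration: `FakeMessageContext`, the `source` table and the `log` parameter are hypothetical stand-ins, and the `generateContexts` function here merely mimics the shape of L10nRegistry.generateContexts (the real one does async I/O and parses .ftl files).

```javascript
// Hypothetical stand-in for MessageContext: a locale tag plus a map of
// message ids to ready-made strings. The real class parses .ftl source.
class FakeMessageContext {
  constructor(locale, messages) {
    this.locale = locale;
    this.messages = new Map(Object.entries(messages));
  }
  hasMessage(id) {
    return this.messages.has(id);
  }
  format(id) {
    return this.messages.get(id);
  }
}

// Mimics the shape of L10nRegistry.generateContexts: yields one context per
// locale and builds it only when the consumer asks for the next item.
// `resourceIds` is unused by this mock; `source` and `log` are assumptions.
async function* generateContexts(locales, resourceIds, source, log = []) {
  for (const locale of locales) {
    log.push(locale); // records which locales were actually built
    // A real implementation would do async I/O for resourceIds here.
    const messages = await Promise.resolve(source[locale] || {});
    yield new FakeMessageContext(locale, messages);
  }
}

// Walk the fallback chain and use the first context that has the message.
async function formatValue(contexts, id) {
  for await (const ctx of contexts) {
    if (ctx.hasMessage(id)) {
      return ctx.format(id);
    }
  }
  return id; // no locale had a translation, so fall back to the id itself
}
```

Because `formatValue` returns from inside the `for await` loop, the generator is closed early and contexts for later fallback locales are never built unless they are needed.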

One thing to notice is that when we build the object, we parse the resolved .ftl files using our parser.

=====================

Now, here comes the challenge - some of our front-end lives without chrome privileges.

So the starting position we're in is that we have no access to Localization.jsm, DOMLocalization.jsm, MessageContext.jsm or L10nRegistry.jsm.

In comment 1, Dave suggested trying to use a framescript that listens for page loads and does localization based on info from the page. My concern is that such an approach would result in a visible delay before the localized UI shows up, since we'd be waiting for the page load event before starting the back and forth between content and parent process to ask for localization, and even before initializing the I/O for it.

Is there any way we could do something like this:

1) Have a thin glue code in JS per-document for content process, that, as early as possible (before DOMContentLoaded) sends a signal to parent process with the list of resource IDs from <link> elements in the head (`["./main.ftl", "./common/brand.ftl"]`)
2) The parent process kicks off I/O based on the known negotiated locales for Firefox and the list of resources
3) When it receives the async iterator over MessageContext objects, it either sends it down to content somehow (?)
4) Or it sends the first MessageContext and keeps the iterator alive, so that the content process can request the next element out of it in case an error or missing translation is encountered
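
Option 4 in the list above could look roughly like the sketch below. Everything here is hypothetical: `ParentL10nBroker`, `translate`, the `docId` key and the plain-object context shape are invented for illustration, and the direct method calls stand in for whatever cross-process messaging would actually be used.

```javascript
// Parent side (hypothetical): keeps one live async iterator per document and
// hands out one serializable context at a time.
class ParentL10nBroker {
  constructor(generateContexts) {
    // Injected async generator function, standing in for the registry.
    this.generateContexts = generateContexts;
    this.iterators = new Map();
  }
  // Message 1: the document announces its resource ids; I/O starts here.
  start(docId, resourceIds) {
    this.iterators.set(docId, this.generateContexts(resourceIds));
    return this.next(docId);
  }
  // Message 2: the document asks for the next fallback context.
  async next(docId) {
    const iterator = this.iterators.get(docId);
    if (!iterator) {
      return null;
    }
    const { value, done } = await iterator.next();
    if (done) {
      this.iterators.delete(docId); // chain exhausted, drop the iterator
      return null;
    }
    return value;
  }
}

// Content side (hypothetical): uses the current context and asks the parent
// for another one only when a translation is missing.
async function translate(broker, docId, resourceIds, id) {
  let ctx = await broker.start(docId, resourceIds);
  while (ctx) {
    if (Object.prototype.hasOwnProperty.call(ctx.messages, id)) {
      return ctx.messages[id];
    }
    ctx = await broker.next(docId);
  }
  return id; // nothing in the chain had the message
}
```

The trade-off in this shape is that the parent has to hold on to an iterator for every document until the chain is exhausted or the document goes away; the sketch only handles the first case.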

If needed, we could try to make MessageContext serializable by adding "toJSON" and "fromJSON", but that would probably be inefficient, so if there's any way we could transfer either the whole iterator or a MessageContext object, that would probably be preferred.
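
For reference, the "toJSON"/"fromJSON" idea could be as small as the sketch below. `SerializableContext` is a hypothetical class, not the real MessageContext: it stores plain strings, whereas the real object holds parsed entries, so a real round trip would either have to serialize those parsed structures or re-parse the source on the receiving side.

```javascript
// Hypothetical, simplified context: locales plus a Map of id -> string.
class SerializableContext {
  constructor(locales, messages) {
    this.locales = locales;
    this.messages = new Map(messages); // accepts [id, value] pairs
  }
  // JSON.stringify calls toJSON automatically; a Map is not JSON-friendly,
  // so it is flattened into an array of [id, value] pairs.
  toJSON() {
    return { locales: this.locales, messages: [...this.messages] };
  }
  // Rebuild an equivalent object from the parsed JSON data.
  static fromJSON(data) {
    return new SerializableContext(data.locales, data.messages);
  }
}

// Round trip across a (simulated) process boundary.
function roundTrip(ctx) {
  const wire = JSON.stringify(ctx);
  return SerializableContext.fromJSON(JSON.parse(wire));
}
```

The inefficiency mentioned above shows up here as a full copy of every message on each transfer.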

Mike, can you help us here? Or NI someone who you think may know how it should work?
Flags: needinfo?(mconley)
Out of curiosity, why is it that we must ask the parent process for these things?

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #4)
> 
> So the starting position we're in is that we have no access to
> Localization.jsm, DOMLocalization.jsm, MessageContext.jsm or
> L10nRegistry.jsm.

Why not? We can use JSMs from the content process... is there a reason these have to be restricted to the parent process?

It sounds like, if we can restrict the file reads so that we don't freak out the sandbox people, we should be able to find and access the .ftl files from the content process, and then parse and apply them, without having to explicitly message the parent to do that work.

Or am I misunderstanding?
Flags: needinfo?(mconley) → needinfo?(gandalf)
(Reporter)

Comment 6

a year ago
How can we get access to the JSM files (and resource://) from an unprivileged content process, for example from within about:robots?
Flags: needinfo?(gandalf) → needinfo?(mconley)
I guess that's what we could use a frame script for. Have a frame script in the content process notice when we load one of these internal pages that we translate, and then work with the JSM's.
Flags: needinfo?(mconley)
I just hackeried Cu.import for MessageContext.jsm, and ended up with

Security wrapper denied access to property (void 0) on privileged Javascript object. Support for exposing privileged objects to untrusted content via __exposedProps__ has been removed - use WebIDL bindings or Components.utils.cloneInto instead. Note that only the first denied property access from a given global object will be reported.

So I guess the question is WebIDL or cloneInto? Do we have an existing usecase in our code that we could follow?
(Reporter)

Comment 9

a year ago
Here's an example of new code that uses unprivileged content that we'd like to support with Fluent - https://firefox-source-docs.mozilla.org/toolkit/components/payments/docs/index.html#dialog-architecture
Dave, Mike - can you take a look at the POC I created to verify if I'm on the right track?

The result works, and aboutRobots.xhtml handles localization via Fluent+L10nRegistry using the same DOM API as its privileged brother.

The code has two pieces:

 - l10n-framescript is a small handler that for each request creates or retrieves from cache a `Localization` object and handles `formatValues` and `formatMessages` calls from the content

 - l10n-unpriv has three parts:

1) A thin wrapper called `Localization` that is intended to have the same API as the real `Localization` class in Localization.jsm, but instead of performing any operations itself, it dispatches an event that the framescript picks up, and then handles the response.
2) A direct copy of DOMLocalization.jsm which handles all DOM operations, MutationObserver for `data-l10n-id` attributes etc.
3) A direct copy of the runtime `l10n.js` which initializes `document.l10n` to be an instance of `DOMLocalization` and hooks up observers.

(2) and (3) are not very interesting I guess, since it's the same code that we use for privileged content. The interesting part is (1) and how it uses l10n-framescript to relay all the calls to the real `Localization` from Localization.jsm.

Is that a reasonable approach from the perspective of performance/memory impact and Gecko design?
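
A minimal sketch of the split described in (1), for readers who don't want to open the patch. This is not the code from the attachment: the event names `l10n-request` and `l10n-response`, the `L10nEvent` class, `LocalizationProxy`, `installHandler` and the `createLocalization` factory are all made up here, and a plain `EventTarget` stands in for the document/framescript boundary.

```javascript
// Event carrying a payload; stands in for CustomEvent so the sketch depends
// only on the standard EventTarget and Event classes.
class L10nEvent extends Event {
  constructor(type, detail) {
    super(type);
    this.detail = detail;
  }
}

// Unprivileged side: exposes formatValues like a Localization object would,
// but delegates the work by dispatching an event and awaiting the reply.
class LocalizationProxy {
  constructor(target, resourceIds) {
    this.target = target;
    this.resourceIds = resourceIds;
    this.nextRequestId = 0;
  }
  formatValues(keys) {
    const requestId = this.nextRequestId++;
    return new Promise(resolve => {
      const onReply = event => {
        if (event.detail.requestId !== requestId) {
          return; // reply to a different call
        }
        this.target.removeEventListener("l10n-response", onReply);
        resolve(event.detail.values);
      };
      this.target.addEventListener("l10n-response", onReply);
      this.target.dispatchEvent(new L10nEvent("l10n-request", {
        requestId,
        resourceIds: this.resourceIds,
        keys,
      }));
    });
  }
}

// Privileged side: creates or retrieves from cache one backend per resource
// list, then answers each request with a response event.
function installHandler(target, createLocalization) {
  const cache = new Map();
  target.addEventListener("l10n-request", async event => {
    const { requestId, resourceIds, keys } = event.detail;
    const cacheKey = resourceIds.join("|");
    if (!cache.has(cacheKey)) {
      cache.set(cacheKey, createLocalization(resourceIds));
    }
    const values = await cache.get(cacheKey).formatValues(keys);
    target.dispatchEvent(new L10nEvent("l10n-response", { requestId, values }));
  });
}
```

Matching replies by `requestId` lets several `formatValues` calls be in flight at once over the same event channel.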
Flags: needinfo?(mconley)
Flags: needinfo?(dtownsend)
Hey gandalf,

Thanks for this. This approach seems okay, but I think we might need to do more things to ensure that the localize event is coming from a trusted source. I know that Localization.jsm isn't doing anything _too_ privileged, but better to be safe than sorry.

We do similar things here for the old about:home:

https://searchfox.org/mozilla-central/rev/b0098afaeaad24a6752aa418ca3e86eade5fab17/browser/base/content/tab-content.js#96

and about:privatebrowsing:

https://searchfox.org/mozilla-central/rev/b0098afaeaad24a6752aa418ca3e86eade5fab17/browser/base/content/tab-content.js#214

We might want something similar inside tab-content.js, which checks "localize" events against a whitelist of internal pages that make sense for us to localize, and then uses the cloneInto mechanism that you're using to send the data from the JSM back down to the .xhtml file.
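
As a rough sketch of that check (not the tab-content.js code linked above): the page list, the `documentURI` field on the event and the `respond` callback are all assumptions for illustration. In a real frame script the URI would be read from the document that fired the event rather than trusted from the event payload, and `respond` is where the cloneInto step would happen.

```javascript
// Hypothetical list of internal pages that are allowed to ask for strings.
const LOCALIZABLE_PAGES = new Set(["about:robots", "about:privatebrowsing"]);

// True only for documents whose URI, ignoring query and fragment, is listed.
function isTrustedLocalizeSource(documentURI) {
  const base = documentURI.split(/[?#]/)[0].toLowerCase();
  return LOCALIZABLE_PAGES.has(base);
}

// Handle a "localize" request: drop it unless the source page is listed.
// `event.documentURI` and `respond` are placeholders for this sketch.
function handleLocalizeEvent(event, respond) {
  if (!isTrustedLocalizeSource(event.documentURI)) {
    return false; // ignore requests from regular web content
  }
  respond(event.detail);
  return true;
}
```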

Does that make sense?
Flags: needinfo?(mconley)
Comment on attachment 8936317 [details]
Bug 1407418: Localize non-privileged content with Fluent.

https://reviewboard.mozilla.org/r/207048/#review230136

::: browser/base/content/aboutRobots.xhtml:60
(Diff revision 2)
>          transform: scaleX(-1);
>        }
>      ]]></style>
> +
> +    <link rel="localization" href="browser/aboutRobots.ftl"/>
> +    <script type="text/javascript" src="chrome://global/content/l10n-unpriv.js"></script>

One thing that I don't see addressed and that is misleading in this WIP is: At what URI will l10n-unpriv.js be referenced? It will need to be at a URI that unprivileged pages can use it (IIUC, not chrome://… ) while avoiding exposing it to regular (http:/https:) web content.
> It will need to be at a URI that unprivileged pages can use it (IIUC, not chrome://… )

I didn't know of this limitation. The unprivileged content I've been targeting (like about:robots) does seem to have access to chrome:// protocol - https://searchfox.org/mozilla-central/source/browser/base/content/aboutRobots.xhtml#29

Am I missing some constraint here?
Flags: needinfo?(MattN+bmo)
IIUC, an HTML file loaded over a resource: URI cannot access chrome: URIs unless they are contentaccessible (specified in the chrome/jar manifest). I believe you could make your l10n-unpriv.js contentaccessible, but then I think normal webpages could reference it. We do have quite a few resource/chrome URIs which are still contentaccessible, so I'm not sure if that's a problem. I believe the main concern is fingerprinting, but if that code is consistent for a given Fx version then I'm not sure that's an issue. IIUC one benefit of making it contentaccessible is that it would support development over file:, which we use for Web Payments, so a build isn't even required.
Flags: needinfo?(MattN+bmo)
>  I believe the main concern is fingerprinting but if that code is consistent for a given Fx version then I'm not sure that's an issue

Yeah, I expect it to stabilize pretty quickly and remain stable throughout versions.

> IIUC one benefit of making it contentaccessible is that it would support development over file: which we use for Web Payments so a build isn't even required.

Cool, yeah, I see no issue with making it contentaccessible. I'll look into it when I get back to this work.

My main concern is now to work out with :stas the unified vision for it since we have different opinions on whether `MessageContext` should act on the unprivileged or privileged side of things here.
Comment on attachment 8936317 [details]
Bug 1407418: Localize non-privileged content with Fluent.

https://reviewboard.mozilla.org/r/207048/#review231086

::: browser/base/content/browser.js:1252
(Diff revision 2)
>      let mm = window.getGroupMessageManager("browsers");
>      mm.loadFrameScript("chrome://browser/content/tab-content.js", true);
>      mm.loadFrameScript("chrome://browser/content/content.js", true);
>      mm.loadFrameScript("chrome://browser/content/content-UITour.js", true);
>      mm.loadFrameScript("chrome://global/content/manifestMessages.js", true);
> +    mm.loadFrameScript("chrome://global/content/l10n-framescript.js", true);

This would only work for the "browsers" message manager group which basically means: only for browsers associated with tabs. I think this API should be in all <browser>s and <iframe mozbrowser>  so that it's available for things like sidebars, prefs subdialogs, and the Payment Request dialog frame. For <browser> I think you would need to add your event listener in toolkit/content/browser-child.js and for mozbrowser it looks like dom/browser-element/BrowserElementChildPreload.js is the place but I'm not sure.
(Reporter)

Updated

a year ago
Summary: Support localization using L10nRegistry+Fluent in non-privileged content → Support localization using L10nRegistry+Fluent in unprivileged content
We discussed an alternative approach and I'm going to experiment with getting it working in bug 1455649
Flags: needinfo?(dtownsend)
Just want to note that the issue isn't about the API alone, it's also about how we expose strings to specific contexts so unprivileged pages can read their required strings but regular webpages can't. Being able to read strings is a problem for privacy and fingerprinting as OS settings and the user's locale can be exposed.
(Reporter)

Updated

a year ago
Attachment #8936317 - Attachment is obsolete: true
Blocks: 1484955
Blocks: 1494039
(Reporter)

Comment 23

7 months ago
We now have Fluent in unprivileged, and the remaining items are for non-system-principal which is bug 1488973. Closing this one!
Status: NEW → RESOLVED
Last Resolved: 7 months ago
Resolution: --- → FIXED
No longer blocks: 1494039
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #23)
> We now have Fluent in unprivileged, and the remaining items are for
> non-system-principal which is bug 1488973. Closing this one!

In my mind unprivileged == non-system-principal…
(Reporter)

Comment 25

7 months ago
> In my mind unprivileged == non-system-principal…

Yeah, I thought the same and then I encountered https://developer.mozilla.org/en-US/docs/Mozilla/Gecko/Script_security . Security is such a rabbit hole.