Closed Bug 1365031 Opened 7 years ago Closed 7 years ago

Turn langpacks into webextension-like packages

Categories

(Core :: Internationalization, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED

People

(Reporter: zbraniecki, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

At the moment, Firefox langpacks are regular old-style extensions.

They are "type 8" extensions (locale) and they register their resources via ChromeRegistry.

With the shift toward the new Localization API we're also moving away from ChromeRegistry and will use the new L10nRegistry (bug 1333980) for managing language resources.

In order to register a langpack in L10nRegistry, some code needs to perform the following:

```
  const src = new FileSource(
    'langpack-pl@mozilla.org',
    ['pl'],
    'resource:///localization/{locale}/',
  );
  L10nRegistry.registerSource(src);
```

We can either add this to some AddonManager code that will index all langpacks in app package and in the profile dir and register available language resources, or we can turn langpacks into bootstrapped extensions that execute this code on startup() in bootstrap.js.

:Pike suggested to do the latter.

For that to happen, I'll need to:
1) Make type:8 extensions be bootstrappable
2) Modify the build system that produces langpacks to produce bootstrap.js

I'm willing to work on (2), but I'm not sure how to get (1) and would like to get an opinion on the approach from people working on startup/extensions.
NI on :kmag, and :rhelmer and :aswan per kmag's suggestion.
Flags: needinfo?(rhelmer)
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(aswan)
Depends on: 1333980
Maybe this is what you already meant, but lets please not pull bootstrap.js from the xpi and execute it.
For various extension types we have static bootstrap files that get executed automatically, see:
http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/toolkit/mozapps/extensions/internal/XPIProvider.jsm#5192-5198
I'm not familiar with all the requirement of langpacks but just following the pattern described above for them sounds like a fine idea...
Flags: needinfo?(aswan)
I think my preference would be to turn them into WebExtension-like add-ons with a manifest.json file, and execute whatever startup code is necessary based on the contents of the manifest.

Since we're hoping to eventually get rid of the RDF service as soon as possible, which means getting rid of install.rdf files, that probably makes the best long-term sense, anyway.
Flags: needinfo?(kmaglione+bmo)
The maximum amount of data I'd like to provide to L10nRegistry is:

1) What languages the langpacks has resources for (example: ['sr-Cyrl', 'sr-Latn'])
2) What's the base path to the resources (example 'resource:///localization/{locale}/')
3) What resources are within the langpack (example: ['resource:///localization/sr-Cyrl/browser.ftl', 'resource:///localization/sr-Latn/browser.ftl'])

The (3) is optional, but would help with performance.

I can produce all the data and bundle it into a web extension, but I'll need help in getting such webextension to be registered in Gecko.

It's also critical for this to be registered early to prevent Firefox UI to show up in one locale before the langpack webextension is registered and causes lang switch on fly.
For that reason, if webextensions are registered on startup later than old-school extensions, it may be a no-go.

For any other case, I'm totally fine with WebExtensions approach and since there won't be any custom per-langpack bootstrap, I'll be happy to have WebExtensions Manager in Gecko boostrap the langpacks and register them into L10nRegistry using manifest.json information.

Pike, will this approach fly with you?
Flags: needinfo?(l10n)
I don't mind where the code is that triggers the registration hooks.

I suggested making bootstrap work because it sounded like the easiest way to get something working for testing, I'm not all that strictly bound to where that code lives.

Note, I'm not sure we can drop chrome registry lookup for legacy l10n firefox content any time soon, so I'm not sure how viable going directly for web extensions would be.
Flags: needinfo?(l10n)
Ok, so we'll probably need some esoteric combination of chrome extension and web extension for the langpack for the time being until we can get rid of the chrome part.

I was able to produce a langpack that can be registed in L10nRegistry and works correctly. I had to modify the current .xpi file with adding:

```
resource langpackpl .
```

to xpi/browser/chrome.manifest

and then registering the following new source:

```
  const src = new IndexedFileSource(
    'langpack-pl@mozilla.org',
    ['pl'],
    'resource://langpackpl/localization/{locale}/',
    [
      'resource://langpackpl/localization/pl/browser/aboutDialog.ftl'
    ]
  );
  L10nRegistry.registerSource(src);
```

This triggered language negotiation with the proper new 'pl' MessageContexts created!

So, in order for the current old-style xpi to work I need:

1) Add the `resource` entry to chrome manifest
2) Fix compare-locales to work with FTL (currently it just packages en-US FTL file)
3) Somehow register the langpack into L10nRegistry. (I did it manually from the browser console so far).

And for WebExtensions, I'll need

4) Transition to produce the manifest.json for the webextension
5) Make Gecko register this type of webextension as langpack into L10nRegistry

And to top it up, for the mix of old l10n and new l10n to work I'll need to:

6) figure out if it's possible to produce a hybrid extension that is registered via WebExtensions and WebExtensions register it into L10nRegistry, but also registered into ChromeRegistry for the time being.
We can register the appropriate resource URLs and chrome registry entries based on a WebExtension manifest fairly easily.
Summary: Turn langpacks into bootstrapped addons → Turn langpacks into webextension-like packages
Here's the first idea on how such manifest.json could look like: https://pastebin.mozilla.org/9021829

It contains everything needed to register the langpack for L10nRegistry and an example of chrome.manifest entries for ChromeRegistry.

When I look into an XPI I see the following data that we may want/have to convert into manifest.json entries and register somehow:

1) browser/crashreport-override.ini
2) defaults/preferences/firefox-l10n.js

everything else seems to be based on chrome.manifest entries and are already planned for in the proposal.

Pike,  do we need (1) and (2) and is there a way for WebExtensions to read this data from manifest.json and programmatically instruct the code that needs it?
Flags: needinfo?(l10n)
Attached file gandalf's manifest.json (obsolete) —
Attaching the manifest proposal here so that we can comment on it...

Re your question, neither of those files should be in languagepacks, that's a bug.
Flags: needinfo?(l10n)
Comment on attachment 8868075 [details]
gandalf's manifest.json

  "resources": [
    "/localization/pl/browser/aboutDialog.ftl",
  ],

This looks heavy, like a looooong list in practice.

  "chrome_manifests": [
    "locale alerts pl pl/locale/pl/alerts/",
    "locale autoconfig pl pl/locale/pl/autoconfig/",
    "locale global pl pl/local/pl/global/",
  ]

Not sure how the addons folks feel about this, I'd be OK with making this way more l10n specific.
Can you add some explanation of what the base_path, resources, and chrome_manifests properties do?
To the extent that its possible, I would prefer that we remove boilerplate and avoid having to validate these properties.  For instance, if the only type of chrome resource that can be registered is "locale", lets only put the parts that actually can vary into the manifest and make the rest implied.  It looks like the same might be applicable to "resources"...
> This looks heavy, like a looooong list in practice.

Yep, it's a tradeoff between runtime speed and bytes in the manifest.json.

> Not sure how the addons folks feel about this, I'd be OK with making this way more l10n specific.

Agree!

> Re your question, neither of those files should be in languagepacks, that's a bug.

This is excellent news!

> Can you add some explanation of what the base_path, resources, and chrome_manifests properties do?

==============

base_path - this is a string that is used in L10nRegistry to construct the `resource:` path to the given resource.
All calls to L10nRegistry say something like:

```
L10nRegistry.generateContexts(['pl', 'de', 'fr'], [
  '/browser/menu.ftl',
  '/toolkit/brand.ftl'
]);
```

As we look through the sources (langpacks and app bundled resources), L10nRegistry is constructing the real path for each resource by taking the base_path and combining it with the resource path:

base_path: 'resource:///localization/{locale}',
resource_path: '/browser/menu.ftl'

result_path: 'resource:///localization/pl/browser/menu.ftl'

==============

resources:

In L10nRegistry we have two types of sources that can be registered (packages in ChromeRegistry vocabulary) - FileSource and IndexedFileSource.

FileSource just says "I have resources for ab-CD and my base path is X". That means that L10nRegistry has to perform I/O attempt the first time it tries to load a resource from a given source. That costs.
After the first attempt, it caches the result so the next time anything asks for the resource from that source, the L10nRegistry knows that, for example `langpack-pl` does not have the resource for `/browser/menu.ftl`.

IndexedFileSource works the same way except that it is provided a list of resources packaged into it, which means that when L10nRegistry is loading `/browser/menu.ftl` it can just check with the index if the file is present in the given source.

That gives us a tradeoff - we either build this index at build time and carry it around, but then L10nRegistry doesn't have to do I/O on each source for resources that are not available, or it builds the index at runtime at the cost of first I/O attempt having to be made for each resource.

The decision on which is better depends on the use case. If all our langpacks will usually contain all the files, then the index is not needed, because majority of I/O attempts will be for existing resources.
If, on the other hand, we want to have "partial langpacks" like, "es-MX" storing only resources that differ from "es", and falling back on "es" when resource is missing, then having an index is useful to avoid a lot of I/O calls to es-MX langpack for resources that it doesn't carry around.

I'm totally fine with starting with regular `FileSource` and adding the index later if we'll see the performance win out of it.

======

chrome_manifests - this is based on Kris' suggestion to provide the chrome manifest entries via manifest.json and make WebExtensions register those packages in Chrome Registry.

We can simplify that list as all entries will be of type "locale alias ab-CD path".

Hope that helps!

> It looks like the same might be applicable to "resources"...

While I agree on the chrome_manifests field, I don't think there's much more we can cut out of resources field.

"/localization/pl/browser/menu.ftl" is the internal path that the langpack carries around. If we cut out the base_path and just leave "/browser/menu.ftl" we'd need additionally to carry information about which languages the resource is available in (it may be that in the future we'll want a multi-lingual langpack with resources for several locales).
Looking at the explanation of resources, that sounds like an optimization we should do at registry-quick-start hook-up time.

As in, at the point where we upload cached info about a langpack into the startup json, we can just iterate the resources in the langpack and cache the list for an IndexedFileSource.

I thing making that a build-time thing is counter-productive, as we then need to validate the information.
I filed bug 1365440 for build config changes to produce the new type of language packs. We should file a similar bug for the WebExtensions Gecko code to pick up the new langpacks and register/update/remove them.
Depends on: 1365440
Attached file manifest.json
Here's a manifest I was able to produce using a small stub of a script that scraped through chrome manifests in a langpack.

It has mostly "locale" entries, but it has several "override" and one "resource" entry.

I preserved the format from chrome.manifest because it's very compact, but if you prefer each line to be a dict in json, like:

{
  "chrome_entries": [
    {"type": "locale", "alias": "alarms", "locale": "pl", "path": "..."}
  ]
}

then I can do that as well.
Attachment #8868075 - Attachment is obsolete: true
Attachment #8868470 - Attachment mime type: application/json → text/plain
Depends on: 1365709
I filed bug 1365440 for building the new langpacks and bug 1365709 for consuming them
Flags: needinfo?(rhelmer)
Priority: -- → P3
Depends on: 1377911
Per conversation on IRC triggered by :bsmedberg in bug 1384689 comment 4:

We have two uses of the chromeregisty in the new langpack:

1) We need a protocol to retrieve resources

For that, we'd like to use a new protocol, "mozl10n:".
Mossop gave me https://dxr.mozilla.org/mozilla-central/source/toolkit/components/thumbnails/PageThumbsProtocol.js as an example and I'll add it.

The expected paths will be as follows:

a) for the platform it will be "mozl10n://toolkit@mozilla.org/path" which roughly corresponds to resource://gre/localization
b) for the app it will be "mozl10n://firefox@mozilla.org/path" which roughly corresponds to resource://app/localization
c) for language packs it will be "mozl10n://{langpackID}/path" which may use resource://{langpackID} or moz-extension:// or manually navigate between file: and jar: (we don't know yet).

2) We need to register old-type resources into chrome registry for StringBundle/DTD.

This is needed temporarily until we migrate everything to the new L10n API.
:kmag, :mossop and :bsmedberg are going to debate if it's better to build a chrome.manifest file into the langpack, or add entries to manifest.json and register it programmatically.

For now, :kmag suggested that he'll write the patch for bug 1384689 and, and I quote, "and then we can argue over it later".
Depends on: 1385544
I'm pretty sure that wee don't want the chrome registry for new-style resources.

I don't see how using a different path for language packs helps us doing language packs, re c).

This all sounds really close into falling for the "single resource for path" problem again. We need multiple resources in reality, for a single resource identifier in the calling code.
> I'm pretty sure that wee don't want the chrome registry for new-style resources.

We're not doing that. I filed bug 1385544 and we're going for our own protocol.
I'll try to summarize the latest discussion we're having and positions collected so far:

Stakeholders: :kmag, :pike, :bsmedberg, :mossop, :aswan, :gandalf

1) We need to register "old style" l10n resources into ChromeRegistry.

That's a temporary measure until we transition the whole Firefox to the new L10n API. The timeline is not guaranteed but we'd hope to do the migration before the end of the year.

:kmag would prefer us to add the entries to manifest.json and use programmable API for ChromeRegistry to register them.
:bsmedberg would prefer us to not add a programmable API to ChromeRegistry and instead just add a chrome.manifest file at build time.
:kmag would prefer not to do that because he believes that the programmable API approach gives us better ability to limit what entries are registered for security reasons.
:bsmedberg and :pike point out that that's what we're doing now, so at worst we're staying on par.

2) We need to register path to the new style l10n resources into L10nRegistry

At the moment, we have three places where the new localization resources will live: in platform's omni.ja, app's omni.ja and langpacks.

Currently, we route all three through resource: protocol which means ChromeRegistry. We use `resource://gre/localization`, `resource://app/localization` and `resource://langpack-ID/` - the last one requires WebExtensions API to add a chrome registry entry when the langpack is installed and at startup.

:bsmedberg suggested that we add a custom `mozl10n:` protocol that would handle that for us.
:pike's position is that instead, when registering Source to L10nRegistry the registrar (WebExtensions) should point to a file: or jar: protocol path and then we don't need a custom protocol.

===============

I hope I represented people's positions properly.
:mossop - I believe that ultimate it's your call now (since :bsmedberg is offline).
Do you think you have enough information to make a decision? Would you like to organize a call for all stakeholders?
Flags: needinfo?(dtownsend)
My personal take is that the selection between jar:|file: is a bit of a PITA and I would love to not have to do this for each callsite registering FileSource in L10nRegistry.

In other words it seems that I'd personally prefer to either have the mozl10n: protocol that does it for me, or use the resource: (or moz-extension:) protocol.
I have a hard time finding moz-extension specified, but from googling and guessing, it sounds like that's the privacy-proof way of referring to data in web extensions. Also abstracts away jar: vs file:.

My point is that moz-l10n as a protocol would have to implement the same things, and I'd much rather use an existing protocol than inventing a new one, that has to do things already covered by other protocols. And as it'd never be user- nor developer-facing, it'd just be dead overhead. I don't care which existing protocol we use really, just that I really don't want to invent a new one.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #20)
> I'll try to summarize the latest discussion we're having and positions
> collected so far:
> 
> Stakeholders: :kmag, :pike, :bsmedberg, :mossop, :aswan, :gandalf
> 
> 1) We need to register "old style" l10n resources into ChromeRegistry.
> 
> That's a temporary measure until we transition the whole Firefox to the new
> L10n API. The timeline is not guaranteed but we'd hope to do the migration
> before the end of the year.
> 
> :kmag would prefer us to add the entries to manifest.json and use
> programmable API for ChromeRegistry to register them.
> :bsmedberg would prefer us to not add a programmable API to ChromeRegistry
> and instead just add a chrome.manifest file at build time.
> :kmag would prefer not to do that because he believes that the programmable
> API approach gives us better ability to limit what entries are registered
> for security reasons.
> :bsmedberg and :pike point out that that's what we're doing now, so at worst
> we're staying on par.

This is where I fall too. Adding a bunch of code to something that is already crufty to support a use case that we expect to last less than half a year is overkill. Let's take the simple path for now and register manifest files for webextension locales

> 2) We need to register path to the new style l10n resources into L10nRegistry
> 
> At the moment, we have three places where the new localization resources
> will live: in platform's omni.ja, app's omni.ja and langpacks.
> 
> Currently, we route all three through resource: protocol which means
> ChromeRegistry.

The resource protocol doesn't use the chrome registry.

> We use `resource://gre/localization`,
> `resource://app/localization` and `resource://langpack-ID/` - the last one
> requires WebExtensions API to add a chrome registry entry when the langpack
> is installed and at startup.
> 
> :bsmedberg suggested that we add a custom `mozl10n:` protocol that would
> handle that for us.
> :pike's position is that instead, when registering Source to L10nRegistry
> the registrar (WebExtensions) should point to a file: or jar: protocol path
> and then we don't need a custom protocol.

If the latter works then let's just do that.
Flags: needinfo?(dtownsend)
Kris, can you confirm if you're ok with :mossop's decision?

If Kris gives a go-ahead, my next steps would be:

1) Switch webext-langpack generation code to produce chrome.manifest
2) Ping :aswan to get his consumption patch to generate path to jar: or file: for FileSource registered into L10nRegistry
Flags: needinfo?(kmaglione+bmo)
I'm pretty strongly against bundling chrome.manifest files in langpacks. Even if this is only a short term solution, we have to handle it somewhere, and I think we're better off handling in the browser than in the langpack generation machinery. That gives us a lot fewer moving parts, a lot less validation to do on the browser side, and a lot fewer security concerns.

I also don't think that it would be particularly complicated to implement, and actually pretty strongly suspect that it would be simpler than the alternative.
Flags: needinfo?(kmaglione+bmo)
Mossop, are you ok with this approach proposed by :kmag?
Flags: needinfo?(dtownsend)
I suspect I'm missing something about the complexity here.

(In reply to Kris Maglione [:kmag] from comment #25)
> I'm pretty strongly against bundling chrome.manifest files in langpacks.
> Even if this is only a short term solution, we have to handle it somewhere,
> and I think we're better off handling in the browser than in the langpack
> generation machinery. That gives us a lot fewer moving parts, a lot less
> validation to do on the browser side, and a lot fewer security concerns.

What moving parts and validation are you envisioning here?

Can you give me more detail about these security concerns?

> I also don't think that it would be particularly complicated to implement,
> and actually pretty strongly suspect that it would be simpler than the
> alternative.

I'm not sure I understand this. The alternative is already implemented no?
Flags: needinfo?(dtownsend) → needinfo?(kmaglione+bmo)
(In reply to Dave Townsend [:mossop] from comment #27)
> What moving parts and validation are you envisioning here?
>
> Can you give me more detail about these security concerns?

There aren't a lot of limits to what chrome manifests can do. We put some
restrictions on theme manifest, but that's about it. In the case of langpacks,
all we want them to be able to do is register locale resources for the
particular locales that they provide, and register 3 static overrides that are
the same for all langpacks.

If we go the chrome.manifest route, we need to validate them, either just
prior to loading, or during the loading process. The former, aside from being
complicated, gives me some performance concerns, especially since we already
have to synchronously load 7 chrome.manifest files for each langpack at
startup.

> > I also don't think that it would be particularly complicated to implement,
> > and actually pretty strongly suspect that it would be simpler than the
> > alternative.
>
> I'm not sure I understand this. The alternative is already implemented no?

No, the machinery for generating these langpacks is going to have to change
one way or the other for accommodating the new format. And we need completely
new machinery for loading and validating them.
Flags: needinfo?(kmaglione+bmo)
(In reply to Kris Maglione [:kmag] from comment #28)
> (In reply to Dave Townsend [:mossop] from comment #27)
> > What moving parts and validation are you envisioning here?
> >
> > Can you give me more detail about these security concerns?
> 
> There aren't a lot of limits to what chrome manifests can do. We put some
> restrictions on theme manifest, but that's about it. In the case of
> langpacks,
> all we want them to be able to do is register locale resources for the
> particular locales that they provide, and register 3 static overrides that
> are
> the same for all langpacks.

If everything is the same for all langpacks then this becomes a very simple problem...

> If we go the chrome.manifest route, we need to validate them, either just
> prior to loading, or during the loading process. The former, aside from being
> complicated, gives me some performance concerns, especially since we already
> have to synchronously load 7 chrome.manifest files for each langpack at
> startup.

They could also be validated before signing time, no performance concerns there.
Wait, where does 7 chrome.manifest files come from?

> > > I also don't think that it would be particularly complicated to implement,
> > > and actually pretty strongly suspect that it would be simpler than the
> > > alternative.
> >
> > I'm not sure I understand this. The alternative is already implemented no?
> 
> No, the machinery for generating these langpacks is going to have to change
> one way or the other for accommodating the new format. And we need completely
> new machinery for loading and validating them.

Ok, but if the manifest is basically the same for every language pack then this seems like a pretty trivial problem...
Flags: needinfo?(kmaglione+bmo)
(In reply to Dave Townsend [:mossop] from comment #29)
> (In reply to Kris Maglione [:kmag] from comment #28)
> > There aren't a lot of limits to what chrome manifests can do. We put some
> > restrictions on theme manifest, but that's about it. In the case of
> > langpacks, all we want them to be able to do is register locale resources
> > for the particular locales that they provide, and register 3 static
> > overrides that are the same for all langpacks.
> 
> If everything is the same for all langpacks then this becomes a very simple
> problem...

The three static overrides are the same for all langpacks, but the individual
locale entries are not.

> > If we go the chrome.manifest route, we need to validate them, either just
> > prior to loading, or during the loading process. The former, aside from being
> > complicated, gives me some performance concerns, especially since we already
> > have to synchronously load 7 chrome.manifest files for each langpack at
> > startup.
>
> They could also be validated before signing time, no performance concerns
> there.

We don't currently enforce signatures for langpacks. Even if we did, though,
implementing the validation would be at least as complicated as implementing
dynamic registration, and would be once more thing we need to keep in sync
between AMO and Firefox.

> Wait, where does 7 chrome.manifest files come from?

It's just the way langpacks are currently built. The root manifest includes a
couple of other manifests, and the pattern continues a few levels deep.

> Ok, but if the manifest is basically the same for every language pack then
> this seems like a pretty trivial problem...

The three overrides are the same, but the other entries are variable.
Especially if we're going to be building langpacks with multiple, partial
locales.
Flags: needinfo?(kmaglione+bmo)
> It's just the way langpacks are currently built. The root manifest includes a
couple of other manifests, and the pattern continues a few levels deep.

fwiw - My patch in bug 1365440 collects all entries into a single list. I don't really care if I end up placing it inside manifest.json or a single chrome.manifest. Definitely not gonna do 7 :)
Depends on: 1389397
Depends on: 1393848
This is now completed as of Gecko 58!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Depends on: 1411951
Depends on: 1413322
Depends on: 1425689
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: