Closed Bug 1304350 Opened 5 years ago Closed 5 years ago

Rethink naming of classes and concepts

Categories

(L20n :: JS Library, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stas, Assigned: Pike)

References

Details

(Whiteboard: [gecko-l20n])

We're currently using the word 'bundle' to describe too many things :)  There are also a few other terms which I'd like to make sure are named consistently with the rest of the codebase.

The list of terms for us to consider includes:

  - MessageContext
  - Localization
  - LocalizationObserver
  - language vs. locale
  - ResourceBundle on the client
  - raw "bundles" in the IO module
  - data-l10n-bundle to reference Localization objects by name
To start the discussion, here is one way we could rename things:

  - MessageContext       → no change
  - Localization         → LocalizationBundle
  - LocalizationObserver → DocumentLocalization
  - language vs. locale  → use language for string codes, reserve locale for Locale objects
  - ResourceBundle       → ResourceSpec(ification)
  - data-l10n-bundle     → no change

Or:

  - MessageContext       → no change
  - Localization         → no change
  - LocalizationObserver → DocumentLocalization
  - language vs. locale  → use language for string codes, reserve locale for Locale objects
  - ResourceBundle       → no change
  - data-l10n-bundle     → data-l10n-name
Pike, Zibi -- thoughts?
Flags: needinfo?(l10n)
Flags: needinfo?(gandalf)
I'm lost in LocalizationObserver and DocumentLocalization. I'm tempted to lobby for things in libs/observer to be in libs/dom, and then to think in terms of DOM.Foo.

I like language and locale as string vs Object.

data-l10n-foo is about -bundle referencing Bundle and -name referencing Thing? I wonder if data-localization is the right way to reference a Localization?
Flags: needinfo?(l10n)
My proposal:

  - MessageContext       → no change
  - Localization         → LocalizationContext
  - LocalizationObserver → {XUL|HTML}Bindings
  - language vs. locale  → use language for language code ('pl', 'en-US'), use locale for language tag + extension keys ('pl-u-hc-h12', 'en-US-u-hc-h24')
  - ResourceBundle       → no change
  - data-l10n-bundle     → data-l10n-ctx
Flags: needinfo?(gandalf)
(In reply to Axel Hecht [:Pike] from comment #3)
> I'm lost in LocalizationObserver and DocumentLocalization. I'm tempted to
> lobby for things in libs/observer to be in libs/dom, and then to think in
> terms of DOM.Foo.

The concept remains: LocalizationObserver is a level higher than Localizations: one observer can create  multiple Localizations (depending on <link name="…">).  Localization's from XBL can hook into the main document's Observer by calling its observeRoot method.

I really think that DocumentLocalization is the way to go.  It's exposed as document.l10n and there's only one per document, whereas there may be many Localizations.
 
> data-l10n-foo is about -bundle referencing Bundle and -name referencing
> Thing? I wonder if data-localization is the right way to reference a
> Localization?

Currently data-l10n-bundle="foo" will look up the element's data-l10n-id in the Localization called 'foo' (as per <link name="…">).  My initial reaction to data-localization is that it's not consistent with the rest of data-l10n-* attributes.

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #4)
> My proposal:
> 
>   - MessageContext       → no change
>   - Localization         → LocalizationContext
>   - LocalizationObserver → {XUL|HTML}Bindings
>   - language vs. locale  → use language for language code ('pl', 'en-US'),
> use locale for language tag + extension keys ('pl-u-hc-h12',
> 'en-US-u-hc-h24')
>   - ResourceBundle       → no change
>   - data-l10n-bundle     → data-l10n-ctx

I'd like to avoid re-using terms, so if we go for a LocalizationContext, can you suggest a new name for MessageContext?

I thought 'pl-u-hc-h12' is just the result of Locale.toString() and that we'd prefer to use the objects instead of the string-representation?
(In reply to Staś Małolepszy :stas from comment #5)
> (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #4)
> > My proposal:
> > 
> >   - MessageContext       → no change
> >   - Localization         → LocalizationContext
> >   - LocalizationObserver → {XUL|HTML}Bindings
> >   - language vs. locale  → use language for language code ('pl', 'en-US'),
> > use locale for language tag + extension keys ('pl-u-hc-h12',
> > 'en-US-u-hc-h24')
> >   - ResourceBundle       → no change
> >   - data-l10n-bundle     → data-l10n-ctx
> 
> I'd like to avoid re-using terms, so if we go for a LocalizationContext, can
> you suggest a new name for MessageContext?

I think I disagree here. MessageContext is a context for messages (tying into ICU's API), LocalizationContext is a broader type of context that uses MessageContexts internally.

Both are contexts.

> I thought 'pl-u-hc-h12' is just the result of Locale.toString() and that
> we'd prefer to use the objects instead of the string-representation?

Once we're done with Intl.Locale, it will be, but it doesn't matter that much. What is important for me is that  in our internal vocabulary, I'd like to refer to "language", or "lang" when we talk about a language tag.

For example - URI templates should have {lang} instead of {locale}, because we're resolving `/foo/pl/main.ftl', and not '/foo/pl-u-hc-h12/main.ftl'.
LocalizationObserver → LocalizationManager ?

I'd love to see these terms used in sentences that use at least two of the named concepts, to get a better idea of relationships.
(In reply to Axel Hecht [:Pike] from comment #7)
> LocalizationObserver → LocalizationManager ?
> 
> I'd love to see these terms used in sentences that use at least two of the
> named concepts, to get a better idea of relationships.

From https://github.com/l20n/l20n.js/blob/6b727018e523d7d868d3efb372edc725d75dc261/docs/localization.md#localizationobserver:

"The LocalizationObserver class is responsible for localizing DOM trees. It also implements the iterable protocol which allows iterating over and retrieving available Localization objects. Each document will have its corresponding LocalizationObserver instance created automatically on startup, as document.l10n."

and from https://github.com/l20n/l20n.js/blob/6b727018e523d7d868d3efb372edc725d75dc261/docs/localization.md#localization:

"The Localization class is responsible for fetching resources and formatting translations. It implements the fallback strategy in case of errors encountered during the formatting of translations. In HTML and XUL, l20n.js will create an instance of Localization for the default set of <link rel="localization"> elements.  Different names can be specified via the name attribute on the <link> elements. One document can have more than one Localization instance, but one Localization instance can only be assigned to a single document. HTMLLocalization and XULLocalization extend Localization and provide document-specific methods for sanitizing translations containing markup before they're inserted into the DOM."
LocalizationObserver → DOMLocalizationBinding?

I've read this bug and hacking.rst in parallel, and I didn't understand either ad-hoc.

I think it'd be good to have naming conventions that make it clear what's a l20n thing, and what's a platform binding thing. Which is why my suggestion above is using standard DOM naming prefix.
I don't think the division is as strict as we'd like it to be.  The Localization class itself is part of the bindings to DOM, and the LocalizationObserver is a manager for instances of the platform-specific subclasses of Localization: HTMLLocalization or XULLocalization.

In fact, there's no layer between low-level Intl API and the high-level binding API.  The Localization class uses the Intl API directly to create MessageContext instances for the resource group specs it received from the L10nRegistry.

Axel, you suggested in the call yesterday to merge HTMLLocalization and XULLocalization, and it sounds like we could call it DOMLocalization then, although at this point just the plain Localization would be enough.

I'm not a huge fan of the word Manager but I'm leaning towards your suggestion from comment 7.  We'd then have a LocalizationManager managing Localization instances which in turn use the Intl.MessageContext API to format translations.

It's widely understood what a manager is and there's a lot of prior-art in Mozilla:

https://developer.mozilla.org/en-US/docs/Web/API/BatteryManager
https://developer.mozilla.org/en-US/docs/Web/API/PushManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/ContactManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/CameraManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/MozNetworkStatsManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/MozSmsManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/WifiManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/MozPowerManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/SettingsManager
https://developer.mozilla.org/en-US/docs/Mozilla/B2G_OS/API/MozTimeManager

Admittedly many of these B2G-specific APIs will now become obsolete, but you can clearly spot a trend and assume that developers understand where any kind of Manager fits into the global picture.

With that, the following summarizes my current thinking:

  - MessageContext       → no change
  - Localization         → LocalizationBundle
  - LocalizationObserver → LocalizationManager
  - language vs. locale  → use language for string codes, reserve locale for Locale objects
  - ResourceBundle       → ResourceSpec
  - data-l10n-bundle     → no change

The LocalizationManager class would have a getBundle() method which returns LocalizationBundle instances ('main' etc.).
That looks good to me Stas!

The only one I'm not fond of is `ResourceSpec`, but I'm not sure if I have a better name for it.
:)
Quick question. With your proposal, what say you to replace the word "Bundle" (which indicates something dummy and non algorithmic) with "Context"?

So, LocalizationContext and data-l10n-ctx ?
I think it's confusing to have two different contexts doing different things: MessageContext and LocalizationContext.  If we made the change you suggest I'd like to find a new name for MessageContext.  A good candidate would be MessageBundle, which I like but I'm concerned about its current meaning:

http://docs.oracle.com/javaee/5/tutorial/doc/bnasp.html
http://www.ibm.com/support/knowledgecenter/SSLKUM_9.0.0/com.ibm.pim.java.doc/com/ibm/pim/utils/MessageBundle.html
https://commons.apache.org/sandbox/commons-i18n/apidocs/org/apache/commons/i18n/bundles/MessageBundle.html

Would be a feature or a bug to call our thing MessageBundle? :)

Do you have other suggestions in mind?  Or am I the only one to have an issue with two contexts?
So, the way it works in my mind is that:

{
  a: 1,
  b: 2,
  c: 3
}

is a bundle of numbers, but


{
  a: 1,
  b: "a + 1",
  c: "a + 2"
} 

is a context in which values are resolved. And we not only carry inter-dependencies but also arguments provided to it are resolved.

That's why I think about them as MessageContext and LocalizationContext.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #15)

> That's why I think about them as MessageContext and LocalizationContext.

Except that Localization isn't really a context -- it manages MessageContexts: downloads and orders them, and uses them to format requested strings.  It's a _localization strategy_, but that's a terrible name, so perhaps it's better to say that it's a bundle of message contexts which binds to HTML or XUL.

I currently have three proposals in mind: 

  1. MessageContext - Localization - LocalizationManager.
  2. MessageContext - LocalizationBundle - LocalizationManager.
  3. MessageContext - LocalizationBinding - LocalizationManager.

TBH I like Localization as the name and I'd vote for #1 if it wasn't for the data-l10n-* attribute which I have no idea how to call.  data-l10n-name is too similar to data-l10n-id and data-l10n-localization is weird (or maybe it isn't and I'm thinking too much about it?)

The reason I like LocalizationBundle is that "bundle" is what people naturally tend to use to describe groups of translation resources.  Also, if you consider our API, the resource links do form bundles.  The following makes sense to me:

    <link rel="localization" name="sidebar" href="…">
    <link rel="localization" name="sidebar" href="…">

    <h1 data-l10n-id="title" data-l10n-bundle="sidebar">…</h1>

Here, "sidebar" is a bundle of two resources.
Yeah, let's stay with (2).
Before we go in and refactor the code, I'd love to see hacking.rst rewritten in the proposed nomenclature.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> Oh, you'll love it :stas -
> http://www.unicode.org/reports/tr35/#Language_and_Locale_IDs

That's an interesting read.  What would your recommendation be for the nomenclature we use?

If I'm reading this right, we should use "language" whenever we refer to a set of translations, and "locale" when we add i18n data as preferred by the user, region or the language.

For instance, the resource bundles are in a *language* but they should also store the *locale* they were returned for, so that we can construct the Intl formatters properly.

For instance, should we rename requestLanguages to requestLocales?  Will the APIs with "Locale" be polymorphic, accepting locale code strings as well as Locale objects?  Similar to certain Date APIs.
I talked to Pike on Friday and he convinced me that Localization is a good name.  Here's is the current proposal which we discussed:

  - Entity               → Message
  - MessageContext       → MessageContext (no change)
  - Localization         → Localization (no change)
  - LocalizationObserver → LocalizationManager
  - ResourceBundle       → ResourceBundle (no change)
  - data-l10n-bundle     → data-l10n-with

Let's do another round of feedback. Pike, Zibi -- how does the above look to you?
Flags: needinfo?(l10n)
Flags: needinfo?(gandalf)
I would also like to take this opportunity and change the layout of our code:

  - intl/     - the parser and the MessageContext
  - lib/      - the Localization class
  - bindings/ - the LocalizationManager class + the overlay logic (bug 1308893)
  - runtime/  - the runtime code which instantiates classes and binds events.
> (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> > Oh, you'll love it :stas -
> > http://www.unicode.org/reports/tr35/#Language_and_Locale_IDs
> 
> That's an interesting read.  What would your recommendation be for the
> nomenclature we use?
> 
> If I'm reading this right, we should use "language" whenever we refer to a
> set of translations, and "locale" when we add i18n data as preferred by the
> user, region or the language.

Yes, that's the proposal I gave in comment 4.

>  Will the
> APIs with "Locale" be polymorphic, accepting locale code strings as well as
> Locale objects?  Similar to certain Date APIs.

Yes.

See https://github.com/zbraniecki/proposal-intl-locale

> Let's do another round of feedback. Pike, Zibi -- how does the above look to you?

a) I have not received a similar treatment of convincing and still find Localization a wrong name
b) Even if Localization would be a good name, that doesn't make the other names we suggested not good.

You say "Pike convinced you that Localization is a good name" - that's not very well documented procedure, is it? We came to a consensus (comment 10, comment 11, comment 16 and comment 17) and if you want to give up on it, it would be good to describe the reasons.

To top it up, you added two new change proposals in the latest list that were not discussed at all.
It so happens that I don't find them good (Entity to Message and data-l10m-bundle to data-l10n-with), but I believe the larger concern is the format in which you did this...

> I would also like to take this opportunity and change the layout of our code:

That looks good.
Flags: needinfo?(gandalf)
In comment 16 I said that "TBH I like Localization as the name and I'd vote for #1 if it wasn't for the data-l10n-* attribute which I have no idea how to call."  I liked Pike's idea to call it data-l10n-with which is why Localization is back on the table for me.  This is what I meant by "convinced me."

While we were talking about the MessageContext API (in particular, addMessages), it occurred to us that we use Entity vs. Message inconsistently too.  This bug is about revisiting many different naming choices and it seems fitting to add "Entity" to the lits.  Both Pike and I thought that Entity was a relic of DTD and that Message will be a better name. (IIRC Pike wasn't overly enthusiastic about it but thought it was at least better than Entity.)

Your comment seems to be about the process of my proposing these changes.  Please note that my proposals are not final and that I did ask for your and Pike's input.  If you don't like them, it would be helpful to know why.

What is the second change that I added to the list?
> Your comment seems to be about the process of my proposing these changes.

Correct, I still don't see any reason to move away from the consensus we reached and back into the proposal we don't have an agreement on.

>  Please note that my proposals are not final and that I did ask for your and Pike's input.  If you don't like them, it would be helpful to know why.

First of all, I believe it should be on the party that is stepping away from the consensus to provide reasons, you still omitted that.
Secondly, I believe it should be on the party that is introducing change proposals to give reasons why, you didn't do this either.

>  - Entity               → Message

I'm concerned about Message being a multi-value structure. Message is a great term for ICU to use in MessageFormat since the unit which it operates with is a single Message.

But a canonical example of an L20n unit:

key = Value
 [html/title] Title for the value

is not a single message. And I find the concept of "associating a message with a widget by localizing its content and attributes"

>  - Localization         → Localization (no change)

I don't like using generic terms that can be a noun or a verb and are one of the two names of the whole industry to name a particular structure in our code.

It's like calling something "data", "piece", or "science".

It's even worse than "bundle", because at least there is a single type of bundles we may have, but localization is absolutely everything related to our project. From the industry, department, area of interest, through community, science, to tech layer we operate on.

Also, there are at least 5 distinct items we work with in our code that could be named "Localization" - Entity, MessageContext, ResourceBundle to name a few.

For that reason I find this name ultimately cryptic and indistinguishable from "Foo".

>  - data-l10n-bundle     → data-l10n-with

I don't like names that are meant to indicate sentence-like structure into code. (like, <label for="">), I prefer semantic names that describe the type of content they hole (like, <label class="">, <label style="">, <label id="">, <label l10n-bundle="">)
Blocks: 1291693
Last week Axel suggested an interesting alternative to the current three-tier layout and naming.

If we claim the document.l10n property, it would make sense for the thing it points to to be called Localization, and in particular, DOMLocalization.  Axel suggested then that instead of a owner-ownee relationship like in LocalizationManager <-> Localization, the document.l10n Localization could delegate some translations to other Localizations which registered with it.

I'm not yet sure if that should be a feature specific to the DOMLocalization or maybe inherent to all Localizations.

There's an important caveat to this approach:  we currently pass requestBundles to the Localization constructor which requires the list of resource identifiers to be known before the Localization can be created.  If document.l10n is a Localization instance and if we still want to make sure it's always available for other scripts, we'd need to either:

  1) enforce the requirement that <script src="l20n.js"> always comes after the <links> and is before any other scripts, XBL bindings and web components, or

  2) make requestBundles wait for DOMContentLoaded or something similar before it queries for <links>.

I think #1 will make it harder for l20n.js to be part of bundles like the ones produced by Webpack.  #2 sounds easy enough.  Whatever we choose we can only implement it in selected runtimes, too, so for Gecko we could go with #1 and for the Web with #2, although I'd probably prefer to keep it consistent.

Thoughts?
Flags: needinfo?(gandalf)
As Axel was right to point out, I had asked to support each new suggestion with a full summary of changes, and then didn't do it above.  Sorry.

The changes described in comment 26 would be:


  - Entity               → Message
  - MessageContext       → MessageContext
  - Localization         → Localization
  - LocalizationObserver → DOMLocalization, subclass of Localization
  - ResourceBundle       → ResourceBundle
  - data-l10n-bundle     → data-l10n-with
The conversation is not progressing and we're approaching landing time with no resolution and I feel like my objections are of stylistic nature, isolated and when I raise them (like with `data-l10n-with`) there's no response at all except of a reiteration of the proposal.

I don't like how this discussion looks like but and I don't like aspects of the proposal in comment 27, but I have nothing new to add and what I stated so far doesn't seem to affect the proposal so please, don't block on me if my objections are not impacting your proposal.

At this point I don't see any way forward except of the driver of this project to just make an authoritative decision and move forward, and since I don't have hopes that more time will allow us to reach a consensus, I don't see a reason to wait.
Flags: needinfo?(gandalf)
What are your thoughts on DOMLocalization being a subclass of Localization?
Flags: needinfo?(gandalf)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #28)
> The conversation is not progressing and we're approaching landing time with
> no resolution and I feel like my objections are of stylistic nature,
> isolated and when I raise them (like with `data-l10n-with`) there's no
> response at all except of a reiteration of the proposal.

Yeah, a lot of that is stylistic and it's hard to reach a full consensus.   As for `data-l10n-with` you suggested `data-l10n-bundle` as an alternative (that's how I read your comment 25) but both Axel and I said before that we'd like to avoid using the word Bundle in LocalizationBundle.  Are you suggesting we use `data-l10n-bundle` nonetheless?

> At this point I don't see any way forward except of the driver of this
> project to just make an authoritative decision and move forward, and since I
> don't have hopes that more time will allow us to reach a consensus, I don't
> see a reason to wait.

Yes, we need Axel to make the call.  He said he would work with the current documentation and see how different proposals fit there.  This should bring us closer to fixing this bug.
> What are your thoughts on DOMLocalization being a subclass of Localization?

I'm ok with "DOMLocalization" name, but I believe that "Localization" as a name is extremely vague and in result cryptic.

Looking into Gecko, we have JSContext, StyleContext and I believe we should have LocalizationContext/L10nContext. 
Instead your proposal names it "Localization" which would be on par with "Style" for CSS and "Script" for JS.
Names that exist in HTML syntax, but not in JS/CPP and would be, imo, confusing to use in conversation or debugging.

> Are you suggesting we use `data-l10n-bundle` nonetheless?

Nope, data-l10n-ctx as stated in comment 4, comment 6 and comment 13.

So, my proposal (to stick to the policy):

  - Entity               → Message
  - MessageContext       → MessageContext
  - Localization         → L10nContext
  - LocalizationObserver → L10nBindings (*)
  - ResourceBundle       → ResourceBundle
  - data-l10n-bundle     → data-l10n-ctx

*) Or "L10nContext" if we merge them. In that case I imagine `document.l10n` to be an instance of L10nContext with `DOMBindings` on it which would imho work even better but would require code changes.

I believe that whatever is on 'document.l10n' is the most important name we should have and for the external audience should be the name that represents the entry point for them - much like JSContext and StyleContext are for JS and CSS.

I also believe that duplication between MessageContext and L10nContext is a smaller issue than naming a core class in the library with the name of the industry and field in which the project that creates this class exists.

Everything else is more internal and should be just a consequence.

As I said, nothing new to add.
Flags: needinfo?(gandalf)
I've looked at which documents are affected by the renames, and it seems that only hacking.rst is really. bindings.md is, too, but that's generated documentation, so I didn't feel like going down to edit that.

https://github.com/l20n/l20n.js/compare/master...Pike:DOMLocalization-hacking is what I have for comment 27, and the changes go a good deal beyond renaming things.

Some of the edits are bugs in the orignal hacking.rst, I think. Where they are, they're probably that the current version of hacking.rst overly generalizes what an environment binding should look like.

There's a left-over thought for hacking.rst, where I think that some of the arguments for async should be "in environments that have an event loop and where synchronous IO is frowned upon, this API is async".

I'm not yet making a call here, though I might soon.
Flags: needinfo?(l10n)
Pike, have you had a chance to think about this?
Flags: needinfo?(l10n)
We're going with:

  - Entity               → Message
  - MessageContext       → MessageContext
  - Localization         → Localization
  - LocalizationObserver → DOMLocalization, subclass of Localization
  - ResourceBundle       → ResourceBundle
  - data-l10n-bundle     → data-l10n-with

I did think about gandalf's concerns, and his reference of JSContext and StyleContext. At the level of APIs like document.l10n, we do have script elements, document.stylesheets, StyleRule, etc. The only API at that level I came across with Context is canvas. Thus I think just going directly for Localization is in line how those other web standards expose themselves at the DOM level.

I did entertain the idea of doing ResourceBundle -> L10nContext, but looking at the code, it's not enough code to deserve such a name. It's really just a bag of resources, without any smarts. Thus keep it as it is.

Gandalf filed bug 1316083 to implement this.
Assignee: nobody → l10n
Blocks: 1316083
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(l10n)
Resolution: --- → FIXED
We have bug 1316083 to deal with the DOMLocalization refactor and bug 1317461 to track the renaming of Entity to Message.
You need to log in before you can comment on or make changes to this bug.