Implement an L20n API for Gecko

RESOLVED INCOMPLETE

Status

()

Core
Localization
RESOLVED INCOMPLETE
6 years ago
a month ago

People

(Reporter: gandalf, Assigned: gandalf)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Comment hidden (empty)
OS: Mac OS X → All
Hardware: x86 → All
Version: unspecified → Trunk
(Assignee)

Comment 1

6 years ago
Created attachment 576340 [details] [diff] [review]
WIP 1

initial API for L20n that aims to live in intl/l20n for now. It'll be used by HTML5Parser, and XUL/XBL parsers initially.

It loads resources asynchronously but it localizes by injecting content/attributes directly into the DOM tree.
According to sicking we will want to investigate shadow DOM trees in the future.

Henri: can you take a look at the structure of the code, data types and APIs first? I'll clean up the code and remove the printfs and improve the variable namings on the way, but I'm mostly trying to make sure that I'm on the right track with this code here.
Assignee: nobody → gandalf
Status: NEW → ASSIGNED
Attachment #576340 - Flags: feedback?(hsivonen)
(Assignee)

Comment 2

6 years ago
oh, one things that I regressed here is that I'm not caching .j20n resources, nor JSContexts nor L20nContexts. (see bug 614536)
We can revisit those concepts later.
Comment on attachment 576340 [details] [diff] [review]
WIP 1


>+L20nResourceLoadRequest::L20nResourceLoadRequest(L20nResource* aResource)
>+    : mLoading(true),
>+      mL20nCtx(0)
>+{
>+  mResource = aResource;
>+}

Please line , under :. Please use nsnull instead of 0 for cases like this. Also, the mResource assignment can be in the list between : and {.

 : mResource(aResource)
 , mLoading(true)
 , mL20nCtx(nsnull)

>+L20nContext* L20n::GetContext(nsIDocument* aDocument) {

Please put the return type on the previous line.

>+  L20nContext* ctx;
>+
>+  printf("====>> creating new L20nContext!\n");
>+  ctx = new L20nContext(aDocument);
>+  return ctx;
>+}

This should most likely return already_AddRefed<L20nContext> to make sure that callers don't end up leaking the returned object.

>+void L20n::GetResource(L20nResource* resource, L20nContext* ctx)
>+{
>+  printf("=====>> L20n::GetResource\n");
>+  L20nResourceLoadRequest* request = new L20nResourceLoadRequest(resource);
>+  request->mURI = resource->mURI;
>+  request->mL20nCtx = ctx;
>+
>+  StartLoad(request, ctx->mDocument);
>+  printf("=====>> L20n::GetResource done\n");
>+}

This leaks request if an ENSURE_SUCCESS in StartLoad returns early. Please use nsRefPtr or nsCOMPtr as a habit for local variables and members unless you know a plain pointer is fine.

>+/* static */ nsresult
>+L20n::ConvertToUTF16(const PRUint8* aData, PRUint32 aLength, nsString& aString)
>+{
>+  if (!aLength) {
>+    aString.Truncate();
>+    return NS_OK;
>+  }
>+
>+  nsCAutoString characterSet;
>+
>+  nsresult rv = NS_OK;
>+
>+  if (characterSet.IsEmpty()) {
>+    // fall back to ISO-8859-1, see bug 118404
>+    characterSet.AssignLiteral("ISO-8859-1");
>+  }

This should probably pick up the charset parameter from HTTP earlier and fall back to UTF-8. Or ignore HTTP and always use UTF-8.
>+  L20nResource* mResource;
>+  bool mLoading;             // Are we still waiting for a load to complete?
>+  nsString mScriptText;      // Holds script for loaded scripts
>+  nsIURI* mURI;
>+  L20nContext* mL20nCtx;

Why are these plain pointers instead of nsCOMPtrs?

>+L20nResource::L20nResource(L20nResource* const& res) {
>+  mURI = res->mURI;
>+  mLoading = res->mLoading;
>+}

Why do you need a copy constructor?

>+void L20nContext::AddResource(const nsAString& aData, nsIDocument *aDocument) {
>+  printf("====>> Adding resource\n");
>+  nsresult rv;
>+  nsCOMPtr<nsIURI> uri;
>+  rv = NS_NewURI(getter_AddRefs(uri), aData);

Shouldn't this use the URL of aDocument as the base URL?

>+  L20n* l20n = new L20n();
>+  l20n->GetResource(resource, this);

This looks odd. Wasn't L20n supposed to be a singleton?

>+nsAutoString L20nContext::Get(const nsAString& aId) {

You should not return nsAutoString. Instead, you should have an nsAString& aOut argument that you assign to.

>+nsDataHashtable<nsStringHashKey, nsAutoString>* L20nContext::GetAttributes(const nsAString& aId) {

This should probably return already_AddRefed<nsDataHashtable<nsStringHashKey, nsAutoString> > to make it easier to avoid leaking.

>+void L20nContext::AttemptToLocalize() {
>+  if (!this->mDocumentLoaded)
>+    return;
>+
>+  /*
>+   * No resources, no localization
>+   */
>+  if (mResources.Length() == 0)
>+    return;
>+
>+  bool ready = true;
>+
>+  for (PRInt32 i = mResources.Length() - 1; i >= 0 ; i--)
>+    if (mResources[i]->mLoading)
>+      ready = false;

You could return early here instead of using a boolean flag.

>+static PLDHashOperator ActuallyLocalizeAttr(const nsAString &key,
>+                                            nsAutoString value,
>+                                            void *userArg)
>+{
>+  mozilla::dom::Element* elem = static_cast<mozilla::dom::Element*>(userArg);
>+  nsCOMPtr<nsIAtom> localName = do_GetAtom(key);
>+  printf("===> Adding an attribute %s with value %s\n", NS_LossyConvertUTF16toASCII(key).get(), NS_LossyConvertUTF16toASCII(value).get());
>+  elem->SetAttr(0, localName, value, false);

Please use the constant for no namespace instead of 0.

>+void L20nContext::ActuallyLocalize(mozilla::dom::Element* aElement)
>+{
>+  printf("====> Actually localizing an Entity. Sooo coool!\n");
>+  nsIAtom* l10nId = NS_NewAtom("l10n-id");

You should add a static atom for l10n-id in nsGkAtomList and use that.

>+  protected:

If you aren't planning for subclassing, just make these private instead of protected.

Having separate classes for L20nResourceLoadRequest and L20nResource is slightly weird--especially when mURI of L20nResource is never used after initializing its corresponding L20nResourceLoadRequest.

The general approach here makes sense for chrome only if we aren't doing shadow trees yet. I think we shouldn't expose the shadow treeless iteration to Web content. feedback+ on the general approach. (feedback- on the details mentioned above.)
Attachment #576340 - Flags: feedback?(hsivonen) → feedback+
(Assignee)

Comment 4

6 years ago
Created attachment 577287 [details] [diff] [review]
WIP 2

WIP updated to feedback
Attachment #576340 - Attachment is obsolete: true
Attachment #577287 - Flags: feedback?(hsivonen)
(Assignee)

Comment 5

6 years ago
(In reply to Henri Sivonen (:hsivonen) from comment #3)
> This leaks request if an ENSURE_SUCCESS in StartLoad returns early. Please
> use nsRefPtr or nsCOMPtr as a habit for local variables and members unless
> you know a plain pointer is fine.

Added it everywhere I was able to :)

> >+  L20nResource* mResource;
> >+  bool mLoading;             // Are we still waiting for a load to complete?
> >+  nsString mScriptText;      // Holds script for loaded scripts
> >+  nsIURI* mURI;
> >+  L20nContext* mL20nCtx;
> 
> Why are these plain pointers instead of nsCOMPtrs?

If I make L20nResource nsRefPtr it crashes in StreamProcess :(
 
> >+  L20n* l20n = new L20n();
> >+  l20n->GetResource(resource, this);
> 
> This looks odd. Wasn't L20n supposed to be a singleton?

It was, but since it's also nsIStreamLoaderObserver I hit the wall and got it crashing whenever the same singleton has been used as a stream loader observer for XUL context and then for HTML5 context so I temporarily switched back to having L20n as a normal class.

I still need it to be singleton so that I can cache contexts as members, but it may require me to go for a separate class that will server as a nsIStreamLoaderObserver unless I can figure out why does it crash.

> >+nsDataHashtable<nsStringHashKey, nsAutoString>* L20nContext::GetAttributes(const nsAString& aId) {
> 
> This should probably return
> already_AddRefed<nsDataHashtable<nsStringHashKey, nsAutoString> > to make it
> easier to avoid leaking.

The problem here is that nsDataHashable does not implement refcounting so I can't go for
nsRefPtr<nsDataHashable<> > = GetAttributes(); :(

It seems that I'd have to extend nsDataHashable to support refcounting and use my class instead. Should I go for it?


> 
> Having separate classes for L20nResourceLoadRequest and L20nResource is
> slightly weird--especially when mURI of L20nResource is never used after
> initializing its corresponding L20nResourceLoadRequest.

The goal here is to have L20nResourceLoadRequest hold two bits - L20nResource url and L20nContext it'll be bind to after loading.
L20nResource object should be independent of the L20nContext and cached in L20n singleton and reused later.
I can't do this now since L20n is not a singleton :(
Does it sound fair?
 
> The general approach here makes sense for chrome only if we aren't doing
> shadow trees yet. I think we shouldn't expose the shadow treeless iteration
> to Web content. feedback+ on the general approach. (feedback- on the details
> mentioned above.)

Yeah, how should I make it chrome-only? ifdefs or do we have sth smarter? (I'd like to go for pref if that makes sense so that I can manually turn it on for the web for testing purposes)
Comment on attachment 577287 [details] [diff] [review]
WIP 2

(In reply to Zbigniew Braniecki [:gandalf] from comment #5)
> > >+  L20nResource* mResource;
> > >+  bool mLoading;             // Are we still waiting for a load to complete?
> > >+  nsString mScriptText;      // Holds script for loaded scripts
> > >+  nsIURI* mURI;
> > >+  L20nContext* mL20nCtx;
> > 
> > Why are these plain pointers instead of nsCOMPtrs?
> 
> If I make L20nResource nsRefPtr it crashes in StreamProcess :(

How does it crash? Does Necko end up calling Release so that refcount goes to zero? You need to make sure that you hold a reference to the object from the main thread so that the object stays alive regardless of what Necko does. (I think it's dodgy that Necko wants to refcount objects that always need to be kept alive by the main thread anyway, but we'll just need to live with that design for now.)

> > >+nsDataHashtable<nsStringHashKey, nsAutoString>* L20nContext::GetAttributes(const nsAString& aId) {
> > 
> > This should probably return
> > already_AddRefed<nsDataHashtable<nsStringHashKey, nsAutoString> > to make it
> > easier to avoid leaking.
> 
> The problem here is that nsDataHashable does not implement refcounting so I
> can't go for
> nsRefPtr<nsDataHashable<> > = GetAttributes(); :(
> 
> It seems that I'd have to extend nsDataHashable to support refcounting and
> use my class instead. Should I go for it?

No. I didn't realize that nsDataHashtable doesn't support refcounting.

> > 
> > Having separate classes for L20nResourceLoadRequest and L20nResource is
> > slightly weird--especially when mURI of L20nResource is never used after
> > initializing its corresponding L20nResourceLoadRequest.
> 
> The goal here is to have L20nResourceLoadRequest hold two bits -
> L20nResource url and L20nContext it'll be bind to after loading.
> L20nResource object should be independent of the L20nContext and cached in
> L20n singleton and reused later.
> I can't do this now since L20n is not a singleton :(
> Does it sound fair?

Sorry, I don't follow. Why does it make sense to cache objects if those objects don't cache the whole L20nContext?

> > The general approach here makes sense for chrome only if we aren't doing
> > shadow trees yet. I think we shouldn't expose the shadow treeless iteration
> > to Web content. feedback+ on the general approach. (feedback- on the details
> > mentioned above.)
> 
> Yeah, how should I make it chrome-only? ifdefs or do we have sth smarter?
> (I'd like to go for pref if that makes sense so that I can manually turn it
> on for the web for testing purposes)

A pref would work in nightlies. For the release channel, it might be prudent not to have a pref for turning it on, but maybe a pref wouldn't be too dangerous on the release channel, either.

>+already_AddRefed<L20nContext>
>+L20n::GetContext(nsIDocument* aDocument) {
>+  L20nContext* ctx;
>+
>+  printf("====>> creating new L20nContext!\n");
>+  ctx = new L20nContext(aDocument);
>+  return ctx;
>+}

You aren't actually addrefing the object.

Something like:
nsRefPtr<L20nContext> ctx = new L20nContext(aDocument);
return ctx.forget();

>+void
>+L20n::GetResource(L20nResource* resource, L20nContext* ctx)
>+{
>+  printf("=====>> L20n::GetResource\n");
>+  nsRefPtr<L20nResourceLoadRequest> request = new L20nResourceLoadRequest(resource);
>+  request->mURI = resource->mURI;

Is there a reason why the constructor of L20nResourceLoadRequest doesn't grab mURI from resource?

>+  request->mL20nCtx = ctx;

>+  StartLoad(request, ctx->mDocument);

Request holds ctx which holds mDocument. Is there a reason why the document is passed to StartLoad as a separate argument? E.g. the URL isn't passed separately and StartLoad takes if from request.

>+    channelPolicy->SetLoadType(nsIContentPolicy::TYPE_SCRIPT);

I wonder if eventually we'll need a new type for L20n files, considering that they aren't as dangerous as scripts.

>+nsresult
>+L20n::OnStreamComplete(nsIStreamLoader* aLoader,
>+                                 nsISupports* aContext,
>+                                 nsresult aStatus,
>+                                 PRUint32 aStringLen,
>+                                 const PRUint8* aString)

The four last arguments should line up with the first one.

>+nsresult
>+L20n::ConvertToUTF16(const PRUint8* aData, PRUint32 aLength, nsString& aString)
>+{
>+  if (!aLength) {
>+    aString.Truncate();
>+    return NS_OK;
>+  }
>+
>+  nsCAutoString characterSet;
>+
>+  nsresult rv = NS_OK;
>+
>+  characterSet.AssignLiteral("UTF-8");
>+
>+  nsCOMPtr<nsICharsetConverterManager> charsetConv =
>+    do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv);
>+
>+  nsCOMPtr<nsIUnicodeDecoder> unicodeDecoder;
>+
>+  if (NS_SUCCEEDED(rv) && charsetConv) {
>+    rv = charsetConv->GetUnicodeDecoder(characterSet.get(),
>+                                        getter_AddRefs(unicodeDecoder));
>+    if (NS_FAILED(rv)) {
>+      // fall back to ISO-8859-1 if charset is not supported. (bug 230104)
>+      rv = charsetConv->GetUnicodeDecoderRaw("ISO-8859-1",
>+                                             getter_AddRefs(unicodeDecoder));
>+    }
>+  }

UTF-8 will always be supported. You can assert that instead of falling back to ISO-8859-1.

>+    void GetResource(L20nResource* resource, L20nContext* ctx);

Maybe change this to use some other verb than "Get", since this isn't a getter method. LoadResource maybe?

>+void
>+L20nContext::AddResource(const nsAString& aData, nsIDocument *aDocument) {

Calling the argument aUrl instead of aData would make it clearer what the argument takes.

>+  printf("====>> Adding resource\n");
>+  nsresult rv;
>+  nsCOMPtr<nsIURI> uri;
>+  rv = NS_NewURI(getter_AddRefs(uri), aData, nsnull, aDocument->GetDocumentURI());

The third argument should be aDocument-GetDocumentCharacterSet().get().

>+    if (s) {
>+      nsAutoString val(NS_ConvertUTF8toUTF16(
>+        JS_EncodeString(jsctx, s)));

How come the string ends up as being UTF-8 encoded as an intermediate step?

>+  nsRefPtr<nsIAtom> l10nId = NS_NewAtom("l10n-id");

This line isn't needed anymore.

feedback+ on the general approach, but especially the non-owning pointers as fields of L20nResourceLoadRequest need more attention.
Attachment #577287 - Flags: feedback?(hsivonen) → feedback+
(Assignee)

Comment 7

a month ago
Seven years later, we're making another attempt to refactor our l10n layer.

The new tracking bug is bug 1365426 and I'll mark the previous effort as "INCOMPLETE".
Status: ASSIGNED → RESOLVED
Last Resolved: a month ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.