Closed Bug 1543493 Opened 6 months ago Closed 6 months ago

Make XUL <image> tags work in fluent translations

Categories

(Core :: Localization, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: Gijs, Assigned: Gijs)

References

Details

Attachments

(1 file)

Right now, because Fluent uses the tree sanitizer implicitly when calling node.innerHTML = value in DOMLocalization:

https://searchfox.org/mozilla-central/source/intl/l10n/DOMLocalization.jsm#100

and because the tree sanitizer strips out any non-HTML, non SVG, non MathML nodes:

https://searchfox.org/mozilla-central/rev/dd7e27f4a805e4115d0dbee70e1220b23b23c567/dom/base/nsTreeSanitizer.cpp#965-996

it's effectively impossible to have XUL <image> nodes in a translation.

This is annoying because html:img ones behave subtly differently (e.g. not the same support for CSS list-style-image).

Nodes that share node local names with HTML (like <label>) are OK because DOMLocalization apparently doesn't care about namespaces, inserts stuff into an HTML namespaced <template>, so they just get interpreted as HTML:label on one side and shoved into corresponding XUL <label> elements on the other side... which is sort of wrong, but oh well, it works...

<image> breaks because our HTML parser seems to magically transform it into <img> which then doesn't map onto a XUL <image> anymore because the nodenames mismatch, as per

https://searchfox.org/mozilla-central/rev/dd7e27f4a805e4115d0dbee70e1220b23b23c567/intl/l10n/DOMLocalization.jsm#217-224 .

I see 2 solutions:

  1. make the sanitizer allow some XUL elements only in XUL documents (I have no appetite for it allowing XUL anywhere else, for obvious reasons), without any attributes (except the always-allowed _underscore, aria-* and data-* ones)
  2. hack up the DOMLocalization code to allow the img=image mismatch

(2) is less work but liable to break, so we should probably write an automated test to ensure it works. It also feels like it acquiesces to fluent not caring about namespaces which feels like the wrong direction longer term.

(1) is scarier because the sanitizer is used for other things, but I do have a patch in hand for that change.

Opinions on what to do?

Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(jaws)
Flags: needinfo?(gandalf)
See Also: → 1503657

Gandalf is working on migrating this code to C++, and in the process will hopefully stop using the tree sanitizer, since Fluent already does its own form of sanitization. That's probably the best solution.

I really don't want to make the tree sanitizer start allowing XUL elements...

Flags: needinfo?(kmaglione+bmo)

I think there are more options here, and more problems.

In particular when we're going for custom elements as "functional" children in a DOM overlay, we'll have an open ended problem.

Given we sanitize after parsing anyway, we might want to disable our innerHTML sanitization?

This is also related to tests that zibi is writing for moving dom overlays into C++.

(In reply to Axel Hecht [:Pike] from comment #2)

I think there are more options here, and more problems.

In particular when we're going for custom elements as "functional" children in a DOM overlay, we'll have an open ended problem.

OK, but I have a bug right now that I need to get unblocked, and I have a working fix for this specific problem. If porting this stuff to C++ makes the issue go away longer-term (next week or whatever), so much the better.

Given we sanitize after parsing anyway, we might want to disable our innerHTML sanitization?

There is no straightforward way to do this, we removed unsafesetinnerhtml or whatever it was called (or at least, I sure hope we did), and making this code a lot more ugly to use some other solution (DOMParser?) seems not worth it, especially if we're going to be rm -rf 'ing it soon.

I've attached a patch with a short-term fix for this specific type of node + a test to ensure that the C++ porting will not break this. Given comment #1, I don't think the sanitizer route is any good, and the workarounds required to switch to html:img have moved us so far into the weeds (like, layout apparently makes different decisions on whether to load the image if it's in a hidden bit of content between html:img and xul <image> with list-style-image, which means we spuriously load the images in this popup when we switch to html:img, which is a perf issue and breaks the startup_images test...). Let's just get this out of the way and hope the C++ port fixes this "properly".

Flags: needinfo?(jaws)
Flags: needinfo?(gandalf)
Blocks: 1523757

(In reply to :Gijs (he/him) from comment #4)

There is no straightforward way to do this, we removed unsafesetinnerhtml or whatever it was called (or at least, I sure hope we did)

We did, yes. The only way to do this from JS now is to parse the nodes in another document (e.g., with DOMParser) and import them. It's intentionally difficult.

When the code is migrated to C++, there will be more options.

Let's just get this out of the way and hope the C++ port fixes this "properly".

I'd like to better understand what do we hope to see out of the C++ fix.

My patches in bug 1503657 do not skip the sanitization, just like the JS implementation doesn't skip it today. If we want to, Kris helped me craft a way to make us skip it, but I believe that we'd have to hold a conversation with the security team about it.

The patch that Gijs wrote seems pretty straightforward and could be ported to C++ as well, but I'm not sure how it'll handle localization in the long run.

If I understand correctly, the patch allows named element matching between source and translation, so the following code would work:

key0 = This is an <img data-l10n-name="picture"/> image.
<div data-l10n-id="key0">
  <xul:image data-l10n-name="picture"/>
</div>

This would work with Gijs patch in rev. 2, and with my current proposal POC patches in rev. 3 of DOMOverlays, but depending on the final decision with rev. 3 it may or may not work.
We'd need to commit to reusing the matched element from the source in the translation. Code-wide, that's easy. Spec-wise, I don't know. Unfortunately, this seems to fall into the gray area between Fluent design work (L10n Drivers group), localization management for Firefox (L10n Drivers group), and Fluent in Gecko bindings (Browser Architecture group).

I'm comfortable taking the responsibility for the last and in result accepting the patch into the tree and committing to maintain the quirk of ftl:img==xul:image, but since the result may affect DOMOverlays rev. 3, I'd prefer to get consensus with whoever will lead DOMOverlays rev. 3 spec work, and I'm not sure if that'll be me :(

Axel - do you have any concerns about it in the context of rev. 3, or just the sanitization?

Flags: needinfo?(l10n)

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #6)

We'd need to commit to reusing the matched element from the source in the translation.

I don't understand why this is related. The checks today only verify that the node name is the same. I added an escape clause that treats 'img' and 'image' the same. If we recreated the element and e.g. copied attributes from the source to the translation, that would be fine for my usecase - it's not tied to element identity, only to being able to use the <image> element at all.

Really, the "right" fix is changing the default namespace inside the HTML template element to match the default namespace of the enclosing document, so namespaceless elements get parsed into the right namespace, and for fluent's sanitization to not remove all XUL elements, and check namespaces when comparing elements. But the indications so far are that we don't want the sanitizer to have XUL exceptions. So then the only real answer is to remove the tree sanitizer's sanitization and use fluent's own, which we can't do very easily in the JS case.

As far as I'm concerned this is just a short-term workaround, and there are multiple ways of resolving this after we switch to C++, and this solution doesn't unduly prejudice what we go for in the C++ case, except to say that XUL <image> should be supported, which I think (from the comments here) is uncontroversial.

I think we should have the test case in the patch here for sure, and with that the quirk right now. I've tried to scan the spec Gijs had on irc, https://html.spec.whatwg.org/multipage/parsing.html, and I didn't find any other tag name redirects. And I'm scared of asking. Let's assume that this is the only quirk we need right now.

I re-read the C++ part in bug 1503657, and I think it actually does change the namespace context when parsing the DOM fragment. That's also why I want us to have the test case in tree. We might not need the quirk, or we might need something else.

So, yes, in the world of gecko, we should make sure that XUL images work. Nice that XUL doesn't matter outside of that.

I'll add <image>-in-html-parsing to my list of things to keep in mind for overlays V3. I bet fluent-react overlays are affected by this, too.

Flags: needinfo?(l10n)
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/0cee09cf4016
ensure DOM localization in XUL docs can localize content that includes an <image>, r=zbraniecki
Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.