KumaScript changed escaping of HTML entities



2 years ago


(Reporter: fscholz, Unassigned)



(Whiteboard: [specification][type:bug])



2 years ago
What did you do?
Went to https://developer.mozilla.org/en-US/docs/Web/JavaScript

What happened?
Saw "Expressions &amp; operators" (the literal entity text) in the sidebar navigation.

The string output is from KumaScript, in the jsSidebar.ejs macro:

What should have happened?
"Expressions & operators" in the sidebar.

A workaround would be to now output it like this: <%-text['Operators']%>

Is there anything else we should know?
The JS pages get a lot of traffic, so this might affect quite a lot of pages, and especially their localizations.

I suspect this is due to our module updates. A quick search on the ejs repo suggests it might be this change: https://github.com/tj/ejs/pull/165

Questions that I have:

Is this new escaping behavior for HTML entities with "<%=" by design and should we use <%- when we need HTML entities in the output?

Or, do we want to fix the new "<%=" so that the old behavior is restored?
It is a good idea to be explicit about when you are adding text that should be escaped, and when you are adding text that contains vetted HTML. This helps avoid XSS vectors. "<%-" should be the exception, and should set off reviewers' warning bells.

I think the JsSidebar template should be updated:


The strings should use plain ampersands, in the English and other translations:

'Operators': 'Expressions & operators',

And the EJS should continue to use the "escape and render" version '<%=' :

<li data-default-state="<%=state('Operators')%>"><a href="/<%=locale%>/docs/Web/JavaScript/Reference/Operators"><%=text['Operators']%></a>
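To illustrate why the plain-ampersand strings matter, here is a rough sketch of what happens when an escaping step runs over each form of the string. The escapeHtml function is a hypothetical, simplified stand-in for what "<%=" does; it is not the actual ejs implementation, and the variable names are made up for this example.

```javascript
// Hypothetical helper: a simplified stand-in for the escaping that "<%=" applies.
// This is an assumption for illustration, not the real ejs code.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// String stored with an HTML entity (the old style).
const withEntity = 'Expressions &amp; operators';
// String stored with a plain ampersand (the proposed style).
const withPlainAmpersand = 'Expressions & operators';

// Escaping the entity form double-escapes it; a browser would show "&amp;" literally.
const doubleEscaped = escapeHtml(withEntity);
// Escaping the plain form yields a single entity; a browser would show "&".
const escapedOnce = escapeHtml(withPlainAmpersand);

console.log(doubleEscaped); // Expressions &amp;amp; operators
console.log(escapedOnce);   // Expressions &amp; operators
```

With plain ampersands in the strings, the escaped output is exactly the markup the sidebar needs, so "<%-" is not required.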
&nbsp; can be converted to the equivalent JS string hexadecimal escape:

'&nbsp;' -> '\xa0' or '\xA0'

This site has a table of HTML hexadecimal codes and the equivalent HTML entities:


So the copyright sign © ('&copy;') is '&#x00A9;' in HTML and '\xA9' or '\u00A9' in a JS string.

Code points above U+00FF, like € ('&euro;' or '&#x20AC;' in HTML), need the four-digit JS escape: '\u20AC'. It may be better to standardize on the Unicode escapes ('\u00A9' and '\u20AC'), since the shorter '\xA9' form is not allowed in JSON.

Of course, for things like €, you should be able to just use the character in the string.
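As a quick sanity check of the escapes above, the following sketch compares the two-digit and four-digit forms and shows how JSON.parse treats each. The variable names are only for illustration.

```javascript
// The \x and \u escape forms denote the same character in a JS string literal.
const nbsp = '\xa0';
const copyright = '\xA9';
const euro = '\u20AC';

console.log(nbsp === '\u00A0');      // true
console.log(copyright === '\u00A9'); // true
console.log(copyright === '©');      // true
console.log(euro === '€');           // true

// JSON accepts the four-digit \u form...
const fromJson = JSON.parse('"\\u00A9"');
console.log(fromJson === copyright); // true

// ...but rejects the two-digit \x form with a SyntaxError.
let xEscapeRejected = false;
try {
  JSON.parse('"\\xA9"');
} catch (err) {
  xEscapeRejected = err instanceof SyntaxError;
}
console.log(xEscapeRejected); // true
```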
PR 96 merged, deployed to staging and production. After a force refresh, the sidebar is back to normal.
Last Resolved: 2 years ago
Resolution: --- → FIXED

Comment 5

2 years ago
Thanks for the quick review and deployment!
See Also: → bug 1335006