Bug 1482422 - Use UTF-8 character set instead of ASCII when reading Activity Stream scripts with `mozJSSubScriptLoader`.
Non-standard configuration: browser.tabs.remote.separatePrivilegedContentProcess;true
German umlauts are busted.
(Jan Andre Ikenmeyer [:darkspirit] from bug 1482404 comment 2)
> https://screenshotscdn.firefoxusercontent.com/images/9b74c55a-8acd-48ed-9cfc-e5dc36d4ad91.jpg
> I just noticed text encoding bugs of German words (center bottom context menu + section title "Overview") and will try to narrow it down. (Should this block the fission meta bug or something else instead?)
Hey jfkthame, can you think of any reason why the separate Activity Stream content process might be rendering this character differently?
At some level, somewhere, there's confusion between a UTF-8 string and Latin-1. So the code units that are meant to make up a single multi-byte UTF-8 character are being interpreted as separate 8-bit characters. I don't know offhand where this is happening, though. Interestingly, the titles of the actual tiles don't seem to be affected (see the bottom-right one), although the "Overview" heading is. FWIW, after enabling separatePrivilegedContentProcess on macOS, I see a similar glitch when I go to manually add a "New Top Site": beneath the URL field, there's a link that reads "Use a custom imageâ¦". (It's supposed to be an ellipsis, U+2026, which in UTF-8 is the three bytes <E2 80 A6>; in Latin-1, those are <a-circumflex, ctrl-char-80, broken bar>, which is what we see.)
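To make the byte-level confusion above concrete, here is a minimal sketch that decodes the same three bytes both ways. It is only an illustration of the symptom, not code from Firefox; the variable names are made up for this example. Each byte is mapped directly to a code point to emulate Latin-1, because `TextDecoder("latin1")` is an alias for windows-1252 and would turn byte 0x80 into a euro sign instead.

```javascript
// The three UTF-8 bytes of U+2026 HORIZONTAL ELLIPSIS, as quoted above.
const bytes = new Uint8Array([0xe2, 0x80, 0xa6]);

// Correct reading: the three bytes form one multi-byte UTF-8 character.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

// Wrong reading: each byte is taken as its own 8-bit character (Latin-1).
const asLatin1 = String.fromCharCode(...bytes);

// Helper (illustrative only): list the code points of a string in hex.
const codePoints = (s) => [...s].map((c) => c.codePointAt(0).toString(16));

console.log(asUtf8.length, codePoints(asUtf8));     // 1 [ '2026' ]
console.log(asLatin1.length, codePoints(asLatin1)); // 3 [ 'e2', '80', 'a6' ]
```

The second result is a-circumflex, the control character U+0080, and a broken bar, which matches what shows up after "Use a custom image".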
I guess one key difference between the text that gets mangled and the tile titles (which work OK) is the source of the text: in the first case, it's part of the browser UI, which I guess gets loaded from resources in omni.jar or somewhere like that, while the titles originate from web content (and presumably get stashed in the profile). So.... I wonder if the new process is not using the right encoding when it reads UI strings from the application package, for some reason?
Interesting. This should be the fix: https://searchfox.org/mozilla-central/rev/dc28b8bddfbb7bbc89de5d0fd6448589aa6a2991/browser/components/newtab/aboutNewTabService.js#164. At the moment, it is using the default character set, which is "ASCII", to read the strings. We will need to pass in "UTF-8" as the character set when we call `loadSubScript`: https://searchfox.org/mozilla-central/rev/dc28b8bddfbb7bbc89de5d0fd6448589aa6a2991/js/xpconnect/idl/mozIJSSubScriptLoader.idl#24-26. Tested the fix locally and the strings were displayed correctly: https://imgur.com/SnjNuFR.
The default character set used to read sub-scripts with `mozJSSubScriptLoader` is ASCII. Scripts for Activity Stream contain strings encoded in UTF-8, so reading them as ASCII displays those strings incorrectly.
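A rough sketch of the difference the charset argument makes, runnable outside Firefox. Everything here is an assumption for illustration: `stubLoader` is a stand-in that only mirrors the (url, target object, charset) argument order of `loadSubScript`, the `resource://example/strings.js` URL and the `files` map are hypothetical, and the "no charset" branch simply treats each byte as one character to reproduce the symptom. It is not how `mozJSSubScriptLoader` is implemented.

```javascript
// Hypothetical in-memory package: URL -> raw script bytes, stored as UTF-8.
const files = new Map([
  [
    "resource://example/strings.js",
    new TextEncoder().encode('target.label = "Überblick…";'),
  ],
]);

// Stand-in loader (NOT the real mozJSSubScriptLoader).
const stubLoader = {
  loadSubScript(url, target, charset) {
    const bytes = files.get(url);
    const source =
      charset === "UTF-8"
        ? new TextDecoder("utf-8").decode(bytes) // decode multi-byte sequences
        : String.fromCharCode(...bytes); // one character per byte
    // Evaluate the decoded script text against the target object.
    new Function("target", source)(target);
    return target;
  },
};

// Before the fix: no charset passed, so non-ASCII strings come out mangled.
const broken = stubLoader.loadSubScript("resource://example/strings.js", {});

// After the fix: "UTF-8" passed explicitly as the character set.
const fixed = stubLoader.loadSubScript("resource://example/strings.js", {}, "UTF-8");

console.log(broken.label.length, fixed.label.length);
```

The mangled label is three characters longer than the correct one: the umlaut occupies two bytes in UTF-8 and the ellipsis three, and each byte turns into its own character when no charset is given.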
Comment on attachment 8999413 [details] Bug 1482422 - Use UTF-8 character set instead of ASCII when reading Activity Stream scripts with `mozJSSubScriptLoader`. Mike Conley (:mconley) (:⚙️) has approved the revision.
Attachment #8999413 - Flags: review+
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/ccba4bf58b0f Use UTF-8 character set instead of ASCII when reading Activity Stream scripts with `mozJSSubScriptLoader`. r=mconley
Verified fixed in Nightly 63 x64 20180813220525 de_DE @ Debian Testing (KDE, Xorg). Thank you!
Status: RESOLVED → VERIFIED
Commit pushed to master at https://github.com/mozilla/activity-stream https://github.com/mozilla/activity-stream/commit/c9679fea2a5768cde043e3c0ee802cb2565a85bf chore(mc): Port Bug 1482422 - Use UTF-8 character set instead of ASCII when reading Activity Stream scripts with `mozJSSubScriptLoader`. r=mconley (#4335)