Closed Bug 1987357 Opened 5 months ago Closed 4 months ago

planner.ikea.com.tw - The page is blank

Categories

(Web Compatibility :: Site Reports, defect, P2)

Tracking

(Webcompat Priority:P2, Webcompat Score:5, firefox144 verified)

VERIFIED FIXED

People

(Reporter: rbucata, Unassigned)

Details

(Keywords: webcompat:needs-sitepatch, webcompat:site-report, Whiteboard: [webcompat-source:web-bugs][webcompat:diagnosis:site-bug])

User Story

platform:windows,mac,linux,android
impact:site-broken
configuration:general
affects:all
branch:release
diagnosis-team:dom
user-impact-score:200
outreach-assignee:hsinyi
outreach-contact-date:2025-09-10

Attachments

(3 files, 2 obsolete files)

Environment:
Operating system: Ubuntu
Firefox version: Firefox 142.0

Steps to reproduce:

  1. Navigate to: https://planner.ikea.com.tw/addon-app/space/platform/latest/tw/en/#/room/workspace
  2. Observe

Expected Behavior:
The page loads

Actual Behavior:
The page is blank

Notes:

  • Reproduces regardless of the status of ETP
  • Reproduces in firefox-nightly, and firefox-release
  • Does not reproduce in chrome

Created from https://github.com/webcompat/web-bugs/issues/175997

OS: Unspecified → All
Hardware: Unspecified → All

Since nightly and release are affected, beta will likely be affected too.
For more information, please visit BugBot documentation.

We see an error in the console: Uncaught SyntaxError: unmatched ) in regular expression internal.9f59ec3deed82adffff4.js:3944:6388

Severity: -- → S2
User Story: (updated)
Webcompat Priority: --- → P2
Webcompat Score: --- → 6
Priority: -- → P2

The regular expression issue seems to occur because we're rendering the document with the Big5 charset for some reason. This can be seen by checking document.characterSet in the Web Console, or by doing File|Save-As with "Web Page (Complete)" and then inspecting the resulting HTML file for the <meta> tag that specifies the charset, which is:

<meta http-equiv="content-type" content="text/html; charset=Big5">

If I craft a local testcase with that^ meta tag and with a script pointing at the JS file in question, then that testcase trips a regex error in both Firefox and Chrome. So I think the issue here is the fact that we're inferring this Big5 charset (while Chrome is not).

So how do we end up establishing a Big5 encoding for this doc? It happens in nsHtml5StreamParser::ProcessLookingForMetaCharset which does:
https://searchfox.org/firefox-main/rev/3c23ce1368431d49bae08e8e211f7f2bf4e4829d/parser/html/nsHtml5StreamParser.cpp#2203-2205

auto [encoding, source] = GuessEncoding(true);
mNeedsEncodingSwitchTo = encoding;
mEncodingSwitchSource = source;

That results in encoding.mBasePtr = 0x73ea6ef32148 <encoding_rs::BIG5_INIT>

hsivonen, do you know why we're inferring Big5 here and Chrome is apparently not?

Flags: needinfo?(hsivonen)
Attached file testcase 1 (obsolete) —

Whoops, sorry, I used the wrong JS file. I think this is the right one.

Attachment #9512127 - Attachment is obsolete: true
Attachment #9512128 - Attachment is obsolete: true

(In reply to Daniel Holbert [:dholbert] from comment #3)

If I craft a local testcase with [a Big5] meta tag and with a script pointing at the JS file in question, then that testcase trips a regex error in both Firefox and Chrome. So I think the issue here is the fact that we're inferring this Big5 charset (while Chrome is not).

Testcase 1 is an example of that^. It's got the Big5 encoding explicitly specified, and that triggers this error in Firefox:

Uncaught SyntaxError: unmatched ) in regular expression
...and this error in Chrome, which I suspect is their way of reporting the same issue:
Uncaught SyntaxError: Invalid regular expression flags

Whereas if I instead use charset=windows-1252 (which is what I see in the HTML if I save the Ikea page from Chrome), then I get no such error from either Chrome or Firefox.

So: bottom-line, this guessed Big5 encoding seems to be the source of the regular-expression SyntaxError (which we're assuming for now is the reason that the page fails to render).

Attachment #9512130 - Attachment description: testcase 1 (with explicit Big5 encoding) → testcase 1 (with explicit Big5 encoding, resulting in SyntaxError in both Firefox and Chrome)
Attachment #9512130 - Attachment description: testcase 1 (with explicit Big5 encoding, resulting in SyntaxError in both Firefox and Chrome) → testcase 1 (with explicit Big5 encoding, resulting in SyntaxError in Web Console, in both Firefox and Chrome)
User Story: (updated)

This is an all-ASCII HTML document without an explicit encoding declaration, so 1) there's no declaration to use and 2) the detector has no non-ASCII bytes to work with. In such a case, Chrome guesses windows-1252 from a networking-side-effect-dependent prefix of the page. Safari falls back to a pref, which I believe to differ in its default depending on the macOS install-time language (or perhaps the Safari first-run language?). Firefox falls back to the TLD, in this case .tw, which Firefox considers to be associated with Big5.
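The fallback order described above can be sketched roughly as follows. This is only an illustration and not Firefox's actual code: the function name, its parameters, and the table are hypothetical, and the table holds only the single TLD association mentioned in this bug. The windows-1252 default is likewise an assumption made for the sketch.

```python
from urllib.parse import urlsplit

# Hypothetical table: only the association mentioned in this bug.
TLD_FALLBACK = {"tw": "Big5"}


def guess_fallback_encoding(url, declared=None, detected=None,
                            default="windows-1252"):
    """Illustrative fallback order for choosing a document encoding."""
    if declared:   # an HTTP header or <meta> declaration wins
        return declared
    if detected:   # the detector saw non-ASCII bytes it could classify
        return detected
    # All-ASCII, undeclared document: fall back to the top-level domain.
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_FALLBACK.get(tld, default)


print(guess_fallback_encoding("https://planner.ikea.com.tw/addon-app/"))
print(guess_fallback_encoding("https://www.ikea.com/addon-app/"))
```

With a declared encoding, the TLD is never consulted, which is why declaring the encoding sidesteps the whole problem.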

If you configure Safari to use Big5 as the fallback, which I believe to be the configuration you get if you use Traditional Chinese as the UI language during macOS install / first setup / Safari first run, Safari shows a blank page, too.

I believe that using a TLD-based guess instead of a global windows-1252 guess generally reduces the probability of Firefox having to reload the page later. The relevant case is when the head or the first 1024 bytes (whichever is larger) is all ASCII, but the detector later sees non-ASCII bytes that fit the most common legacy encoding for the TLD. So I'd prefer not to change the Firefox behavior due to this isolated counterexample. The TLD-based approach also nicely approximates compat with Safari without making the behavior dependent on the browser settings.

I'm guessing that the problem is the page containing a backslash followed by a byte that forms a Big5 character. See also https://github.com/whatwg/encoding/issues/171 . (Due to the popularity of the encodings that have an issue of this nature, we can't just categorically defend against the XSS risk for sites that fail to declare their encoding.)

I suggest that we contact Ikea and point out that declaring the encoding would be more compatible and more XSS-safe.

Flags: needinfo?(hsivonen)
Webcompat Score: 6 → 5

Thanks, Henri!

twisniewski, do you know if it's possible to ship an intervention here? I think it would need to be something that has the effect of inserting <meta charset="utf-8" /> (or windows-1252) into the HTML here, before we start parsing the external <script> references. Not sure if we have that ability right now.

(If we have interventions that have to spoof a meta viewport tag, maybe we could use a similar mechanism here?)

Flags: needinfo?(twisniewski)

As a quick local exploration into what an intervention might look like: unfortunately I'm not able to get a script-inserted meta charset tag to take effect (which is maybe [?] by-design). Same results in Firefox and Chrome. Here are three documents with Chinese characters -- (1) is what we want to avoid, (2) is a reference, and (3) is a strawman-intervention which doesn't work:

(1) No meta charset declared:
data:text/html,你好
--> renders ä½ å¥½ because there's no encoding declared.

(2) meta charset declared (normal/good):
data:text/html,<meta charset="utf-8">你好
--> properly renders the Chinese characters 你好 because the encoding is declared.

(3) meta charset tag constructed via script and inserted:
data:text/html,<head></head><script>let m = document.createElement("meta");m.setAttribute("charset", "utf-8");document.head.appendChild(m);</script>你好
--> renders ä½ å¥½ because the encoding that gets inserted via script is apparently not honored/recognized.
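The garbled rendering in cases (1) and (3) can be reproduced outside the browser. The sketch below assumes the undeclared document ends up decoded as windows-1252 (which matches the output shown above) and uses Python's cp1252 codec as a stand-in for it.

```python
text = "你好"
utf8_bytes = text.encode("utf-8")        # E4 BD A0 E5 A5 BD

# Misread the UTF-8 bytes as windows-1252: every byte becomes one character.
misread = utf8_bytes.decode("cp1252")
print(misread)

# Decoding with the encoding the bytes were written in restores the text.
print(utf8_bytes.decode("utf-8"))
```

Note that the third byte, A0, becomes a no-break space, which is why the garbled output appears to contain a space.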

(In reply to Daniel Holbert [:dholbert] from comment #13)

as a quick local exploration into what-an-intervention-might-look-like... Unfortunately I'm not able to get a script-inserted meta charset tag to take effect (which is maybe [?] by-design).

By design, yes.

If we want an intervention for this, it seems to me that modifying the Content-Type response header would be a better approach than modifying the byte stream of the response to inject a meta tag.

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived

I looked into this a bit more. internal.1c4100ed6327a97c53fc.js is intended to be UTF-8 and has regular expressions that contain a curly quotation mark as the last character of a regular expression character class, which is denoted with square brackets, like this: ”]

The curly quote is E2 80 9D in UTF-8. The closing square bracket is 5D. 9D 5D is a Big5 byte pair for 㷷 (see https://encoding.spec.whatwg.org/big5.html), so the square bracket gets eaten, which makes the regular expression not compile.
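To make the byte pairing concrete, here is a small sketch that groups bytes the way a Big5 decoder pairs lead and trail bytes. The function name is made up for illustration. The byte ranges (lead 0x81-0xFE, trail 0x40-0x7E or 0xA1-0xFE) are taken from the Big5 decoder in the WHATWG Encoding Standard. Only the grouping is modelled; no code points are looked up, so it says nothing about which character a pair maps to.

```python
def split_big5_units(data: bytes):
    """Group bytes the way a Big5 decoder pairs lead and trail bytes."""
    units = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            trail = data[i + 1]
            if 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE:
                units.append(data[i:i + 2])  # lead + trail form one unit
                i += 2
                continue
            # Invalid trail: an ASCII trail is reprocessed on its own,
            # a non-ASCII trail is consumed together with the lead.
            step = 1 if trail < 0x80 else 2
            units.append(data[i:i + step])
            i += step
            continue
        units.append(data[i:i + 1])          # ASCII byte or stray byte
        i += 1
    return units


# "[", a right curly quote (U+201D), "]" as UTF-8: 5B E2 80 9D 5D
units = split_big5_units("[\u201d]".encode("utf-8"))
print([u.hex(" ") for u in units])
```

The closing bracket never shows up as a unit of its own: 5D is absorbed as the trail byte of 9D, so the character class is left unterminated.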

This suggests that the regular expressions aren't working in Chrome, either, but at least they compile in Chrome, since the character class ends up being syntactically valid garbage.

Ikea could fix stuff in Firefox, Safari, and Chrome by using the HTTP header Content-Type: text/html; charset=utf-8 on the HTML resource (currently missing the ; charset=utf-8 part).
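A quick way to check whether a given Content-Type value actually declares a charset is sketched below. The helper name is hypothetical; it relies on Python's standard email.message module to parse the header parameters.

```python
from email.message import Message


def declared_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


print(declared_charset("text/html"))                 # no charset declared
print(declared_charset("text/html; charset=utf-8"))  # charset declared
```

A result of None corresponds to the situation in this bug, where the browser has to guess the encoding.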

It appears that the Web app is global, with https://www.ikea.com/addon-app/space/platform/latest/fi/fi/#/room/workspace using even the same JavaScript bundle build as the .com.tw instance. (It is also served without the appropriate ; charset=utf-8, but .com results in windows-1252, so the regexps become syntactically valid garbage.)

It probably makes sense to contact whatever development headquarters Ikea has for their Web apps instead of a .tw point of contact.

https://www.ikea.co.jp/ redirects to https://www.ikea.com/jp/ja/ , which is why the Japan site doesn't have the same problem as .tw. (9D 5D is a two-byte character also in Shift_JIS.)

I had already sent an outreach email to ikea.tw before reading comment 16, which suggested contacting the global development headquarters. Once I hear back, I will share the suggestion from comment 16 with them.

User Story: (updated)

I just made a simple intervention here to add the charset to the content-type, but that just gives me an error page:

Unexpected Application Error!
WebGL not supported

This is the intervention, in case anyone has time to debug and wants to see what might be going on:

  "1987357": {
    "label": "planner.ikea.com.tw",
    "bugs": {
      "1987357": {
        "issue": "page-fails-to-load",
        "matches": ["*://planner.ikea.com.tw/addon-app/space/platform/latest/*"]
      } 
    },
    "interventions": [
      {
        "platforms": ["all"],
        "alter_response_headers": [
          {
            "headers": ["content-type"],
            "replace": "text/html",
            "replacement": "text/html; charset=utf-8"
          } 
        ] 
      } 
    ] 
  },

(just add that somewhere in the middle of interventions.json, mach rebuild, and it should let you get past the empty page).

Flags: needinfo?(twisniewski)

With the intervention (also tried changing matches to ["*://planner.ikea.com.tw/*"]), ctrl/cmd-i still shows Big5, so I think the header replacement isn't happening.

Thanks for the debugging here everyone. The team has added the correct encoding to the header (as suggested by Henri) so this is working again in Firefox.

Nice, thanks!

Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED

Verified, the issue no longer reproduces.

Tested with:

  • Browser / Version: Firefox 144.0-candidate build 1
  • Operating System: Windows 10 / Ubuntu
Status: RESOLVED → VERIFIED
Whiteboard: [webcompat-source:web-bugs] → [webcompat-source:web-bugs][webcompat:diagnosis:site-bug]