Firefox 19.0 does not support BIG5-HKSCS encoding. Version 18.0 does

RESOLVED FIXED in Firefox 20

Status

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: arbiant, Assigned: emk)

Tracking

({regression})

19 Branch
mozilla22
x86
Windows 7
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox19 wontfix, firefox20+ verified, firefox21+ verified)

Details

Attachments

(2 attachments, 1 obsolete attachment)

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0
Build ID: 20121231071231

Steps to reproduce:

Firefox does not show the HK special character even though the webpage is set to charset=BIG5-HKSCS.


Actual results:

鰂 show as 鰂


Expected results:

Should show as 鰂 under charset=BIG5-HKSCS. It works in Firefox 18.0 but not in version 19.0.
Version: 18 Branch → 19 Branch
Posted file Testcase (obsolete)
WFM here with the testcase.
Component: Untriaged → Layout: Text
Product: Firefox → Core
The 3rd character is a Big5-HKSCS character: 鰂 (shown correctly in Firefox 18.0 and Google Chrome). Firefox 19 shows it as 裵 (Big5).

Sorry, my earlier explanation is not quite right. When we enter 鰂 in a form (charset=Big5-HKSCS), the server receives it translated as 鰂 because Firefox 19.0 treats Big5-HKSCS as Big5. Since the character is not part of Big5, the server receives it as 'unicode' 鰂.
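
The form-submission behavior described in this comment can be sketched in Python. This is only an illustration under stated assumptions: Python's built-in `big5` and `big5hkscs` codecs stand in for the browser's encoders (they are not Gecko's tables), and the `xmlcharrefreplace` error handler mimics how a browser substitutes a decimal numeric character reference for a character that the form's encoding cannot represent. The helper name `encode_form_value` is hypothetical.

```python
def encode_form_value(text: str, encoding: str) -> bytes:
    """Encode a form field value.

    Characters the target encoding cannot represent are replaced with
    decimal numeric character references (&#NNNNN;), roughly as a
    browser does when submitting a form.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


ch = "鰂"

# With Big5-HKSCS treated as its own encoding, the character is sent as bytes.
as_hkscs = encode_form_value(ch, "big5hkscs")

# With the label treated as plain Big5, the character is not encodable,
# so the server receives a numeric character reference instead.
as_big5 = encode_form_value(ch, "big5")

print(as_hkscs)
print(as_big5)
```

A server that expects raw Big5-HKSCS bytes and receives the reference instead will store the literal `&#...;` text, which would match the symptom reported here.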
Attachment #718907 - Attachment mime type: text/plain → text/html
Comment on attachment 718896 [details]
Testcase

This testcase isn't useful, as the file is actually in UTF-16LE (despite its charset claim), and moreover it represents the Chinese character with a Unicode numeric entity, so the page's own encoding is not relevant.
Attachment #718896 - Attachment is obsolete: true
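
The point about numeric entities can be illustrated with a short Python sketch. It assumes only the standard library; the markup string is a made-up stand-in for the obsolete testcase, not its actual contents. Because a numeric character reference is written in ASCII, it survives a round trip through any ASCII-compatible encoding (and UTF-16) and always resolves to the same character.

```python
import html

ch = "鰂"
# ASCII-only markup: the character appears only as a numeric reference.
markup = f"<p>&#{ord(ch)};</p>"

decoded = []
for encoding in ("utf-16-le", "big5", "windows-1252"):
    raw = markup.encode(encoding)    # bytes as they would be served
    text = raw.decode(encoding)      # what the decoder hands to the parser
    decoded.append(html.unescape(text))  # resolve the character reference

# Every encoding yields the same character, so the declared charset
# has no effect on how this particular testcase renders.
print(decoded)
```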
Comment on attachment 718907 [details]
3rd character is a big5-hkscs.

This testcase displays correctly for me (with FF18 on Windows, or with Nightly on OS X) but only if I explicitly choose the Big5-HKSCS encoding from the Character Encoding submenu; when it is initially opened, it displays complete junk: "­»´ä‘o³½¯F ". Maybe because bugzilla serves the testcase with

  Content-Type:text/html; name="test-big5-hkscs.html"; charset=windows-1252

which is clearly the wrong charset for the file.

(Possibly the behavior varies if you're running a localized Chinese version of Firefox, or have your system locale set to Chinese?)
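
As a rough sketch of why the header matters, the following Python snippet extracts the `charset` parameter from the Content-Type value quoted above and decodes Big5-HKSCS bytes with it. It uses the standard library's `email.message.Message` as a convenient header parser; the helper name `charset_from_content_type` is hypothetical, and Python's codecs are only an approximation of what a browser does.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, if any."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


header = 'text/html; name="test-big5-hkscs.html"; charset=windows-1252'
charset = charset_from_content_type(header)
print(charset)

# Bytes that are meaningful as Big5-HKSCS...
payload = "鰂".encode("big5hkscs")

# ...come out as unrelated Latin characters when decoded with the
# charset the server declared.
junk = payload.decode(charset, errors="replace")
print(repr(junk))
```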
IIRC, the Encoding Standard unifies big5 and big5-hkscs based on research about existing content and based on IE not recognizing the big5-hkscs label.

Reporter, is this breaking a real site somewhere that works in other browsers?

(If this is about a page you authored yourself, please use UTF-8 instead of a legacy encoding.)
You just need to open the attached file test-big5-hkscs.html in FF19.
The question is: why does it work in FF18 but not in FF19?
Something might have been set wrongly there.

If Firefox accepts this as a bug but does not want to fix it, we will just use Google Chrome and forget about FF.

Jonathan Kew, as explained, it works in FF18 but not in FF19.
Please try it in FF19. You will see that the 3rd character is wrong.
Attachment #718907 - Attachment mime type: text/html → text/html; charset=big5-hkscs
Component: Layout: Text → Internationalization
Reporter, is this breaking a Web site somewhere? Which site? Is that site maintained by you?

Does the page work in a Hong Kong-localized version of IE? Does it work in Taiwan-localized or en-US-localized IE?
Flags: needinfo?(arbiant)
It's on our internal CRM. It has a long history; the data originated as Big5-HKSCS from an old desktop database system.
Today our users realized something was wrong and reported this problem.
We don't use IE; we only use FF.
If FF doesn't support HKSCS, we have no choice but to move to Google Chrome.
Yes, we are looking at converting the data to UTF.
Until then, we will just use whatever solution is available.

Just one question: why did FF19 suddenly stop supporting HKSCS?
Flags: needinfo?(arbiant)
(In reply to Henri Sivonen (:hsivonen) from comment #9)
> Probably caused by bug 801402.

Yes, I changed it because the Encoding Standard defines big5-hkscs as just an alias of big5.

Reporter, please file a spec bug and convince a spec editor of the demand for big5-hkscs as a separate encoding.
https://www.w3.org/Bugs/Public/enter_bug.cgi?product=WHATWG&component=Encoding
I think we should fix this and not wait for the Encoding Standard.
I'd agree.

(FWIW, if we were going to "unify big5 and big5-hkscs" - which may not be a good idea after all - it seems a bit odd to still have Big5-HKSCS available in the menu, distinct from Big5, yet -not- to recognize it in the charset metadata. I'd have expected that either we support that charset just as we used to, or we eliminate it completely. Continuing to "support" it via the UI yet mapping it to Big5 when it occurs in charset metadata seems a bit confusing.)
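
The inconsistency described in this comment can be made concrete with a small Python sketch. The two label tables and the menu list below are hypothetical simplifications for illustration only; they are not Gecko's actual data structures or the Encoding Standard's full label list.

```python
from typing import Dict, List, Optional

# Hypothetical label tables: charset label -> encoding actually used.
UNIFIED: Dict[str, str] = {"big5": "big5", "big5-hkscs": "big5"}
SEPARATE: Dict[str, str] = {"big5": "big5", "big5-hkscs": "big5-hkscs"}

# Hypothetical contents of the Character Encoding menu.
MENU: List[str] = ["big5", "big5-hkscs"]


def resolve_label(label: str, table: Dict[str, str]) -> Optional[str]:
    """Map a charset label from page metadata to an encoding name."""
    return table.get(label.strip().lower())


def menu_is_consistent(menu: List[str], table: Dict[str, str]) -> bool:
    """True if every encoding offered in the menu is also reachable
    through some charset label."""
    return set(menu) <= set(table.values())


print(resolve_label("BIG5-HKSCS", UNIFIED))    # label collapses to big5
print(resolve_label("BIG5-HKSCS", SEPARATE))   # label keeps its own encoding
print(menu_is_consistent(MENU, UNIFIED))
print(menu_is_consistent(MENU, SEPARATE))
```

With the unified table, the menu offers an encoding that no page can select through its metadata, which is the mismatch pointed out above.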
Annevk is following this bug, so the spec will be updated shortly.
Assignee: nobody → VYV03354
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #718991 - Flags: review?(smontagu)
(In reply to Jonathan Kew (:jfkthame) from comment #13)
> (FWIW, if we were going to "unify big5 and big5-hkscs" - which may not be a
> good idea after all - it seems a bit odd to still have Big5-HKSCS available
> in the menu, distinct from Big5, yet -not- to recognize it in the charset
> metadata. I'd have expected that either we support that charset just as we
> used to, or we eliminate it completely. Continuing to "support" it via the
> UI yet mapping it to Big5 when it occurs in charset metadata seems a bit
> confusing.)

I filed bug 805374 to match them.
This revert should be propagated as soon as possible.
thanks
Comment on attachment 718991 [details] [diff] [review]
Support Big5-HKSCS as a separate encoding again

Review of attachment 718991 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks
Attachment #718991 - Flags: review?(smontagu) → review+
If there are sightings of broken public Web content or information about other broken intranets, please make it known here instead of just concluding that the bug has been reported already.
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/4cc81e04c718
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Comment on attachment 718991 [details] [diff] [review]
Support Big5-HKSCS as a separate encoding again

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 801402
User impact if declined: Some web pages encoded in Big5-HKSCS will not work correctly.
Testing completed (on m-c, etc.): on m-c
Risk to taking this patch (and alternatives if risky): very low
String or UUID changes made by this patch: none
Attachment #718991 - Flags: approval-mozilla-beta?
Attachment #718991 - Flags: approval-mozilla-aurora?
(In reply to Masatoshi Kimura [:emk] from comment #22)
> User impact if declined: Some web pages

Well, evidence so far is of one intranet app, only.
http://w3techs.com/technologies/details/en-b5hkscs/all/all lists a few public websites using Big5-HKSCS.

Not every such page is necessarily affected, as most codepoints in Big5-HKSCS will be the same when interpreted as Big5 (AIUI), but there may be occasional characters that appear incorrectly.
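
To get a feel for how much of a page might be affected, a script along these lines could classify each character by whether Big5 and Big5-HKSCS agree on it. This is a sketch that assumes Python's `big5` and `big5hkscs` codecs are a reasonable proxy; browsers' Big5 tables differ in places, so results are indicative only. The sample characters other than 鰂 are arbitrary common characters chosen for illustration.

```python
from typing import Optional


def try_encode(ch: str, codec: str) -> Optional[bytes]:
    """Encode one character, returning None if the codec lacks it."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


def classify(ch: str) -> str:
    """Compare how plain Big5 and Big5-HKSCS encode a character."""
    big5 = try_encode(ch, "big5")
    hkscs = try_encode(ch, "big5hkscs")
    if big5 is None and hkscs is None:
        return "neither"
    if big5 is None:
        return "hkscs-only"
    if hkscs is None:
        return "big5-only"
    return "same" if big5 == hkscs else "different"


report = {ch: classify(ch) for ch in "香港鰂魚"}
for ch, kind in report.items():
    print(ch, kind)
```

Only characters that do not classify as "same" are at risk of rendering incorrectly when a Big5-HKSCS page is decoded as Big5.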
Those pages were studied as far as I know. See also:

http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Apr/thread.html#msg42

It's not entirely clear to me how we support big5 though, if we implemented the changes suggested in the Encoding Standard or if we kept our old weird variant.
Attachment #718991 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #718991 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Verified the issue for Firefox 19.0 (20130215130331). By using the testcase from comment 2, the 3rd character is 裵. 

Verified the fix for Firefox 20.0 Beta 3 (20130305164032), the 3rd character is 鰂, as expected.
Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0
Build ID: 20130423212553

Verified as fixed on Firefox 21 beta 3: the shown character is:  鰂.
mass remove verifyme requests greater than 4 months old
Keywords: verifyme