Closed Bug 145199 Opened 22 years ago Closed 22 years ago

GB18030 has been improved tooooooo much~ killing Big5 pages

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
major

Tracking

()

VERIFIED DUPLICATE of bug 132006

People

(Reporter: piaip, Assigned: tetsuroy)

References

()

Details

(Keywords: intl)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; zh-TW; rv:1.0rc2) Gecko/20020510
BuildID:    2002051006 (RC2)

Since 1.0RC1, Mozilla has had "# Improved GB18030 support".
Well, GB18030 support has been improved more than it should have been.

If we set Mozilla's charset encoding to Autodetect - Chinese,
Mozilla now (this did not happen prior to 0.99) treats ALL Chinese pages,
both Traditional and Simplified, as Simplified Chinese
GB18030, even if the page declares itself as Big5.

The pages http://udnnews.com/ and http://linuxfab.cx/ have
<meta http-equiv='Content-Type' content='text/html; charset=big5'>
but are still recognized as GB18030. I think the improvement goes too far.
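
For illustration, the expected precedence (a declared charset wins over autodetection) can be sketched in Python as below. This is a minimal sketch and not Mozilla's actual parser code: the names `declared_charset` and `choose_charset`, the regular expression, and the 1024-byte scan window are all assumptions made for this example.

```python
import re

# Hypothetical helper, not Mozilla code: pull the charset out of a
# <meta http-equiv="Content-Type" ...> declaration near the top of a page.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def declared_charset(html: bytes, scan_bytes: int = 1024):
    """Return the declared charset in lowercase, or None if none is found."""
    match = _META_CHARSET.search(html[:scan_bytes])
    return match.group(1).decode("ascii").lower() if match else None


def choose_charset(html: bytes, autodetected: str) -> str:
    """Prefer the page's own declaration; fall back to the detector's guess."""
    return declared_charset(html) or autodetected


page = b"<meta http-equiv='Content-Type' content='text/html; charset=big5'>"
print(choose_charset(page, autodetected="gb18030"))  # big5
```

With this ordering, the detector's guess is only used for pages that declare nothing.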

This is painful for Big5 users because we can no longer set Autodetect
to Chinese and easily view both Traditional and Simplified Chinese pages.

Reproducible: Always
Steps to Reproduce:
1. Set Charset Encoding to Autodetect - Chinese
   (neither Traditional nor Simplified)
2. Browse most Big5 (Traditional) pages, such as http://udnnews.com/ or
   http://linuxfab.cx/

Actual Results:  Charset encoding was recognized as GB18030

Expected Results:  Should be Big5
   I've tried with RC2 on Win98SE.

   http://linuxfab.cx/  is *ok* with your steps.
   http://udnnews.com/  *failed* in some places on the right side.
   
   Did you try on other platforms?
It is said that this bug also happens in Mail/News.
If we don't specify a charset and rely on charset autodetection,
then Composer/Mail & News will pick the wrong encoding
and display unrecognizable characters.
Here too, Mozilla uses GB encodings.
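
To see why a wrong guess shows up as unreadable characters rather than as an error, here is a small Python sketch. It uses Python's built-in codecs as a stand-in for Mozilla's converters, and the sample string is an assumption chosen for this example.

```python
# Illustration only: Python's codecs stand in for Mozilla's converters.
text = "中文"                  # sample Traditional Chinese text (assumed)
raw = text.encode("big5")      # the bytes a Big5 page or message would carry

as_big5 = raw.decode("big5")                      # correct interpretation
as_gb = raw.decode("gb18030", errors="replace")   # what a wrong guess does

print(as_big5 == text)  # True: the bytes round-trip cleanly as Big5
print(as_gb == text)    # False: same bytes, different characters
```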
Summary: GB18030 is improved tooooooo much~ killed Big5 pages → GB18030 is improved tooooooo much~ killing Big5 pages
In reply to comment #2,
this bug also appears on Win2000, FreeBSD and Linux platforms.
One thing to note is that once you have manually specified the
charset encoding (even if autodetect is still enabled),
the detection result may change.
E.g., after you have specified Big5, detection will sometimes
switch to Big5.
Trying to reproduce this bug on multilingual websites may lead
to unpredictable results.

What do you mean by "failed" and "ok"? I don't understand~
Displayed correctly, or recognizable, or...?
Summary: GB18030 is improved tooooooo much~ killing Big5 pages → GB18030 has been improved tooooooo much~ killing Big5 pages
This bug bothers me a lot.
Autodetection used to work well. I had set my Mozilla to detect all CJK
(East Asian) encodings, and it almost always succeeded in finding the correct
encoding before this bug. But now all Chinese pages that do not explicitly
specify their encoding fall back to GB. It drives me crazy, since most Chinese
pages I browse are in Traditional Chinese (Big5).
cc shanjain (autodetect owner)
For a few reasons, I think the URL
http://www.csie.ntu.edu.tw/~b7506051/mozilla/145199.html
describes this bug better.
This page has no <meta> tag to specify the charset and is
written in Big5. Auto-detect recognizes this page
as GB18030.
The page also has 2 screenshots showing the correct (Big5) display
and the incorrect (GB) display.

Note that if you have used Autodetect and MANUALLY selected Big5
on some pages, the result might be unpredictable.
CJK Autodetect fails to recognise UTF-8. It gives only GB18030.
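
A widely used way to separate UTF-8 from legacy CJK encodings is to try a strict UTF-8 decode first, since UTF-8 rejects most byte sequences produced by other encodings. The Python sketch below illustrates the idea; `looks_like_utf8` is a hypothetical name, and nothing here reflects how Mozilla's detector is actually written.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes are non-ASCII and decode strictly as UTF-8."""
    try:
        data.decode("utf-8")  # strict mode raises on any malformed sequence
    except UnicodeDecodeError:
        return False
    # Pure ASCII is valid in nearly every encoding, so it tells us nothing.
    return any(byte >= 0x80 for byte in data)


print(looks_like_utf8("中文".encode("utf-8")))  # True
print(looks_like_utf8("中文".encode("big5")))   # False
```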
*** This bug has been confirmed by popular vote. ***
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: intl
QA Contact: ruixu → ylong
I think it's a dup of bug 132006, which has been fixed in the trunk build and will
be checked into the branch build very soon.

I checked it on the 05-17 trunk build/WinME-JA: auto-detect Chinese and Universal
both detect the page http://www.csie.ntu.edu.tw/~b7506051/mozilla/145199.html
as Big5.
marking as dup

*** This bug has been marked as a duplicate of 132006 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Marking as verified.  Please reopen if you still see it on the latest trunk build or the latest
branch build after bug 132006 is checked in.
Status: RESOLVED → VERIFIED