Closed Bug 291107 Opened 20 years ago Closed 9 years ago

Automatic Character Encoding fails on some websites

Categories

(Core :: Internationalization, defect)

PowerPC
All
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: michael.graubart7, Assigned: smontagu)

References

()

Details

(Keywords: testcase)

Attachments

(4 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b2) Gecko/20050419
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b2) Gecko/20050419

With some websites or web pages, the character encoding gets automatically set
to 'Chinese Simplified' and has to be reset manually under 'View'. In addition,
on the particular website above, if one selects -> Herbal Direct Online-Shop ->
P -> 'Psorasolv' -> Buy, one line of text appears as a complete listing of the
music characters (presumably because I have a music typesetting application
called Finale with its music fonts on my hard disk). And this line of music
symbols cannot be changed by resetting the character encoding.

Reproducible: Always

Steps to Reproduce:
1. Select English as first language and Western ISO-8859-1 as default encoding
in Preferences.
2. Go to the above website/web page.
3. Select -> Herbal Direct Online-Shop -> P -> 'Psorasolv' -> Buy
4. Note wrong symbols for £ (or $?).
5. Note line of text which (presumably depending on which fonts are installed)
shows as a complete listing of a font.
6. Under 'View', change character encoding to Western. 

Actual Results:  
At 4., symbols are wrong. At 5., line of all symbols from some font such as
Petrucci or Maestro (if installed). At 6., currency symbols are corrected, but
line of font characters remains.

Expected Results:  
The character encoding should have been correct (Western ISO-8859-1) from the
start, and the mysterious line of font characters should have been a line of text.

eMac G4, OS X 10.3.9. Classic theme.
Attached file testcase (obsolete)
testcase contains £4.35 and the unescaped pound character.
This bug is either invalid or Tech Evangelism.
The unescaped pound characters trigger the detection logic and ask you to
install Chinese fonts.
Keywords: testcase
OS: MacOS X → All
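A minimal sketch of the byte-level ambiguity behind this comment, assuming the page stores the pound sign as the single ISO-8859-1 byte 0xA3 with no declared charset. That byte is a pound sign in ISO-8859-1 but falls in the lead-byte range of the multi-byte GB18030 encoding, so the same bytes read differently depending on which encoding is guessed. This is not the Mozilla detector; the names `PRICE_BYTES` and `decode_as` are hypothetical and only illustrate the decoding step.

```python
# Illustrative only: this is not Mozilla's detection code.
# Assumption: "£4.35" is stored as raw ISO-8859-1 bytes, as in the testcase.
PRICE_BYTES = b"\xa34.35"


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD wherever they are invalid."""
    return raw.decode(encoding, errors="replace")


# Interpreted as Western (ISO-8859-1), 0xA3 is the pound sign.
western = decode_as(PRICE_BYTES, "iso-8859-1")

# Interpreted as GB18030, 0xA3 is treated as the start of a multi-byte
# sequence, so the pound sign is lost.
chinese = decode_as(PRICE_BYTES, "gb18030")

print(repr(western))
print(repr(chinese))
```

Comparing the two printed values shows why a wrong guess by the detector garbles the currency symbols while the rest of the ASCII text is unaffected.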
Re Comment #2: (a) I have Chinese fonts installed. (b) The website belongs to a
British company (situated in the UK) and has no Chinese connections as far as I
know. (c) Mr. Schwab seems not to have gone on to the further page (see
my instructions for reproducing the bug) to see whether my line of music
characters is in any way reproduced.
Moreover, I have just tried the website with Internet Explorer. All the £
(pound) signs are correct, and the line about shipping to Europe, etc., which has
zeros in Mozilla, reads thus in IE:

'For overseas customers, there is an extra Post & Packaging cost of:
Europe - £7.50
Rest of the World - £14.00 (Excludes USA)'

and on the next page the line of (in my case) music characters reads

'[For USA, we are unable to process your order through our automative system,
please click here to send your order via email and we will contact you, or you
can purchase our products here. Telephone order please ring our orders hotline
on: +44(0)191 523 6578]'
The attached testcase is made from the original code. Do you see question marks
there?
Please try changing the setting 
View->Character Encoding->Auto-Detect->(Off)

(Off) renders the unencoded pound characters normally;
Universal sets the encoding to Chinese Simplified (GB18030).
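As a rough illustration of the "unencoded pound characters" point, the sketch below shows one way a page author could avoid depending on auto-detection at all: emitting the pound sign as a numeric character reference so the page bytes stay pure ASCII. This is an assumption about a possible site-side workaround, not something the site or Mozilla did; the helper name `escape_non_ascii` is hypothetical.

```python
def escape_non_ascii(text: str) -> bytes:
    """Return ASCII-only bytes, with non-ASCII characters written as
    numeric character references (for example, the pound sign as &#163;)."""
    return text.encode("ascii", errors="xmlcharrefreplace")


line = "Europe - £7.50"
escaped = escape_non_ascii(line)

# The escaped bytes are plain ASCII, so they decode identically under
# ISO-8859-1 and GB18030; an HTML parser then resolves the reference.
print(escaped)
```

Because every byte is below 0x80, the result no longer depends on whether Auto-Detect is set to (Off) or Universal.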
Attached file Screenshot - PDF
Screenshot converted to black & white to reduce size to below 300 kB
Attached file screenshot - PDF
Screenshot converted to black & white to reduce size to below 300 kB.

NB: Could this bug be connected with #287675?
Attached file better testcase
testcase made from original:
load http://www.herbal-direct.com/psorasolv_oint_product.html and click buy.

Michael, can you see the bug on this testcase?
Can you see it also if Auto-detect is off?
Attachment #181264 - Attachment is obsolete: true
Hermann, the line of music characters is there, and remains completely unchanged
in every detail, when I open your test-case attachment and try every possible
setting under 'Auto-Detect', including 'Off'. What does change is the 'Overseas
customers…' sentence in large, bold red letters.
(In reply to comment #9)
Hermann, I can see this only when Auto-Detect is On (and the character set
detected is Chinese Simplified (GB18030)).
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8b2) Gecko/20050423
Firefox/1.0+

In fact, I sometimes see this on many French websites (www.boursorama.com,
for instance), and it makes me mad.
Assignee: general → smontagu
Component: General → Internationalization
Product: Mozilla Application Suite → Core
QA Contact: general → amyy
Version: unspecified → Trunk
*** Bug 293356 has been marked as a duplicate of this bug. ***
I can't help suspecting that my music-characters bug (293356) is not just a
matter of failure to respond to automatic encoding, because setting character
encoding to Western makes the main sentence and the £ (pound sterling) signs
come out right, but the line in small print is still in music characters on my
computer. But I have made a further intriguing discovery about this bug: if one
selects the line of music characters, copies it and pastes it into a text editor
or word processor, it comes out as the correct text in alphabetic script.
QA Contact: amyy → i18n
Attachment #181363 - Attachment mime type: text/html → text/html; charset=
The "universal" detector is gone.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX