Closed
Bug 233474
Opened 22 years ago
Closed 8 years ago
Symbol fonts only work on ISO-8859-1 pages
Categories
(Core :: Layout: Text and Fonts, defect)
VERIFIED
WONTFIX
People
(Reporter: paultolk, Unassigned)
Attachments
(1 file: 10.27 KB, image/jpeg)
User-Agent:
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040113
Open the page
http://www.mccme.ru/mmmf-lectures/books/books/books.php?book=20&page=2 in
Mozilla. Its correct character coding is Cyrillic (Windows-1251). Select this
character coding from the View|Character Coding|More|East European menu if it is not
selected automatically.
Look at the last characters of the first paragraph. They appear as
A M B. In fact, the M should be the well-known math symbol meaning "is a subset of". To
see it correctly, change the Character Coding to Western (ISO-8859-1).
Unfortunately, the rest of the text then turns into gibberish.
I noticed that in the page source, the character is marked as
<font face="symbol">Ì</font>
It seems that, when Mozilla switches to Windows-1251, it displays this
character in its default font instead of "symbol". (The symbol font does contain a
glyph that looks similar to 'M', but it differs from the one Mozilla displays.)
Internet Explorer shows this page just fine in Windows-1251.
Reproducible: Always
Steps to Reproduce:
1. Browse http://www.mccme.ru/mmmf-lectures/books/books/books.php?book=20&page=2
2. Switch encoding to Cyrillic (Windows 1251)
3. Observe A M B at the end of the first paragraph of normal text (in Russian).
Actual Results:
I see A M B
Expected Results:
I should see A "is a subset of" B, where "is a subset of" is a glyph from the 'symbol'
font face.
Comment 1•21 years ago
Hello
I have visited your site using IE and cannot agree with your statement.
The HTML sequence <i>A</i> <font face="symbol">М</font> <i>B</i> only works on
your computer and on those with font sets similar to yours.
Best regards
Wolfgang
Comment 2•21 years ago
The <font face="symbol"> quirk only works on pages in ISO-8859-1 (or compatible)
encoding. In Windows-1251 0xCC is decoded to the Unicode codepoint U+041C,
CYRILLIC CAPITAL LETTER EM, which isn't represented in the symbol font.
I believe that IE ignores the encoding of the page for <font face="symbol">, but
I doubt if we want to extend the quirk to do that.
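To make the decoding step concrete, here is a minimal Python sketch (Python is only used for illustration, it is not how the layout code works). It decodes the single octet 0xCC under the two encodings discussed above and prints the resulting code point and its Unicode name.

```python
import unicodedata

# The octet that appears inside <font face="symbol">...</font> on the page.
octet = bytes([0xCC])

# Decode the same octet under the two encodings discussed in this bug.
as_latin1 = octet.decode("iso-8859-1")
as_cp1251 = octet.decode("cp1251")  # Python's codec name for Windows-1251

for label, ch in (("ISO-8859-1", as_latin1), ("Windows-1251", as_cp1251)):
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Under ISO-8859-1 the octet keeps the numeric value 0xCC as its code point, which is why the symbol-font quirk can still find the intended glyph; under Windows-1251 it becomes U+041C.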
Summary: When selecting proper encoding, math. symbols are not right → Symbol fonts only work on ISO-8859-1 pages
From the Standard:
"
face = cdata [CI]
Deprecated. This attribute defines a comma-separated list of font names the
user agent should search for in order of preference.
"
My reading of it is that the agent must look through the list of fonts. If the
font is not found, or the character's numeric value is not defined in this
font face, the standard does not say what to do; however, displaying "CYRILLIC
CAPITAL LETTER EM" seems at least questionable and, of course, arbitrary.
Displaying a question mark in this case (say, in the default font of the page)
would work much better.
However, I do not believe this is the case here. The font is defined, and the
numeric value of the character is legal in the encoding (0xCC, and the value of
HTTP header "Encoding" is "windows-1251"), so the character's glyph should be
taken from the said font. The standard never implies that the 'face' attribute is a
quirk, or that it must only work if the encoding is "ISO-8859-1 (or compatible)", or
that this strange procedure (taking the character's name in Unicode, followed by
a search for that character name in the specified font) should be applied. By
the way, the latter procedure would probably not work even if one wanted
to change the current font from "Courier" to "Helvetica" for Latin "A", because the
descriptions of this character in the font, even where available, would probably
not match the Unicode code point name.
My approach to following the standard in this case would be like this:
1. Use the current encoding rules (an encoding being the combination of a character set with
a reversible method of serializing its characters to a sequence of bits) to get a numeric
representation of a character. (Please note that the standard does not even
require all document characters to be representable by the encoding: it allows
using character entity references to represent other characters; so the
document's encoding is more an optimization than something that must affect the
behavior of the agent)
2. Apply the current font to the obtained number to get a glyph
3. Display the glyph
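The three steps above can be sketched roughly as follows. This is only an illustration of the proposal, not of what Mozilla does: `SYMBOL_GLYPHS` and `symbol_glyphs` are hypothetical names, the table holds just the one entry relevant here (0xCC, shown as its Unicode equivalent U+2282 SUBSET OF), and the sketch assumes a single-byte page encoding.

```python
# Hypothetical excerpt of the Symbol font's byte-to-glyph table.
# Only the entry needed for this bug is listed (an assumption for brevity).
SYMBOL_GLYPHS = {0xCC: "\u2282"}  # SUBSET OF


def symbol_glyphs(decoded_text: str, page_encoding: str) -> str:
    """Map text inside <font face="symbol"> to glyphs by its byte values."""
    # Step 1: go back to the numeric (byte) representation in the page encoding.
    raw = decoded_text.encode(page_encoding)
    # Steps 2 and 3: look each number up in the font; "?" marks a missing glyph.
    return "".join(SYMBOL_GLYPHS.get(b, "?") for b in raw)


for enc in ("iso-8859-1", "cp1251"):
    text = bytes([0xCC]).decode(enc)  # what the parser sees after decoding
    print(enc, f"U+{ord(text):04X}", "->", symbol_glyphs(text, enc))
```

With this lookup the result no longer depends on the page encoding, because the font is indexed by the original byte value rather than by the decoded code point.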
Comment 4•21 years ago
http://www.w3.org/TR/1999/REC-html401-19991224/charset.html#h-5.2.1 :
"Conforming user agents must correctly map to ISO 10646 all characters in any
character encodings that they recognize (or they must behave as if they did)."
In other words, if the document encoding is Windows-1251, the octet 0xCC
represents CYRILLIC CAPITAL LETTER EM. This is neither questionable nor
arbitrary, it's compulsory.
(In reply to comment #3)
> Please note that the standard does not even
> require all document characters to be representable by the encoding: it allows
> using character entity references to represent other characters
Quite so: this is exactly the way to represent the SUBSET OF character: a
numeric reference |&#8834;| or |&#x2282;| or an entity reference |&sub;|. Using
the Symbol font is unnecessary, non-standard, and non-portable.
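As a quick check that character references do not depend on the document encoding, the following Python sketch resolves the decimal, hexadecimal, and named references for U+2282 SUBSET OF with the standard library's `html.unescape`. It only illustrates reference resolution, not how a browser parses the page.

```python
import html

# Decimal, hexadecimal, and named references for U+2282 SUBSET OF.
references = ("&#8834;", "&#x2282;", "&sub;")

for ref in references:
    ch = html.unescape(ref)
    print(f"{ref} -> U+{ord(ch):04X}")
```

All three forms resolve to the same code point, whatever encoding the page bytes are stored in.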
"
5.4 Undisplayable characters
...
we recommend the following behavior for user agents:
1. Adopt a clearly visible, but unobtrusive mechanism to alert the user of
missing resources.
"
'M' is hardly such an alert. Until I decided to browse the page in IE, I was looking
for a definition of the operator 'M' and was feeling sick...
Comment 6•21 years ago
I think you are missing the point here. A better argument for fixing this would
be simple consistency: if we have decided to implement a quirk for <font
face="symbol">, there isn't any good reason (other than ease of implementation)
for it to be dependent on the encoding of the document.
Well, if you put it this way I will certainly not argue :-) -- as long as the
behavior is going to change, one way or the other.
Comment 8•20 years ago
This is an automated message, with ID "auto-resolve01".
This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.
While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.
If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.
The latest beta releases can be obtained from:
Firefox: http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey: http://www.mozilla.org/projects/seamonkey/
Somebody has to confirm the reported behavior. It has not changed, and M is
still displayed instead of the 'is a subset of' symbol in Mozilla 1.7.10
Hardware: PC → Other
Updated•20 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 10•8 years ago
I agree that we should not add the hack outlined in comment 2. I doubt Edge implements it.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Comment 11•8 years ago
Agreed WONTFIX, but for the record do we support <font face="symbol"> under any circumstances these days?
Status: RESOLVED → VERIFIED
Comment 12•8 years ago
I don't know if there are any special code paths, but if there's a font named "symbol", it should work just like "verdana" works. (So if the font decides that certain code points map to glyphs that don't really represent those code points, it'd work.)