Closed
Bug 37330
Opened 24 years ago
Closed 16 years ago
Do not use '?' as fallback gylph
Categories
(Core :: Internationalization, defect, P3)
RESOLVED
FIXED
People
(Reporter: BenB, Assigned: smontagu)
References
Details
(Keywords: helpwanted)
Attachments
(2 files)
143 bytes, text/html
1.77 KB, patch
BUG
If a character cannot be displayed on Unix because no corresponding glyph is found in the font(s), a "?" is displayed.
RELEVANCE
This is *very* irritating. You don't want to know what "explanations" I came up with while trying to figure out why there are always question marks instead of quote chars in 4.x. Even now that I know the reason, it still irritates me.
SUGGESTION
Use a glyph that doesn't usually appear in text. The convention on Windows is a non-filled rectangle.
Reporter
Comment 1•24 years ago
Comment 2•24 years ago
Thanks for the report. Feel free to submit the fix. I have other priorities for the foreseeable future.
Status: NEW → ASSIGNED
Target Milestone: --- → M30
Comment 3•24 years ago
Perhaps more useful than a filled rectangle would be to print [XYZ] in some special style, where XYZ is the ASCII (or whatever) number for the character? The string would, of course, need to be treated as one character (for selection, :first-letter etc purposes). The solution for this bug should be the same as what you do for an unknown entity (&foobar;).
Keywords: helpwanted
There are some other possibilities here. U+FFFD is the "replacement character" glyph, and Mozilla on Linux will use it if it can be found. If it can't be found, Mozilla tries a transliteration string. Try adding "entity.65533=FOO" to the end of entityTables/transliterate.properties and view a page with some invalid entity like "". Mozilla uses '?' only if nothing else is available. The transliteration strings could indeed be rendered in some distinct way or, perhaps, Mozilla could have a built-in U+FFFD character.
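The lookup order described in this comment (glyph in the font, then a transliteration string, then '?') can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Mozilla's code: `render_char`, `font_glyphs`, and the dictionary standing in for transliterate.properties are hypothetical names. Only the "entity.65533=FOO" key format comes from the comment above.

```python
def render_char(ch, font_glyphs, transliterations):
    """Return the text to draw for one character.

    font_glyphs      -- set of characters the (hypothetical) font can draw
    transliterations -- dict standing in for transliterate.properties,
                        keyed as "entity.<decimal code point>"
    """
    # 1. Use the real glyph if the font has it.
    if ch in font_glyphs:
        return ch
    # 2. Otherwise try a transliteration string for this code point.
    key = f"entity.{ord(ch)}"
    if key in transliterations:
        return transliterations[key]
    # 3. Last resort: the question mark this bug complains about.
    return "?"


ascii_only_font = {chr(c) for c in range(0x20, 0x7F)}
print(render_char("\uFFFD", ascii_only_font, {"entity.65533": "FOO"}))  # FOO
print(render_char("\uFFFD", ascii_only_font, {}))                       # ?
```

With this ordering the '?' only shows up when both earlier steps fail, which matches the behaviour described in the comment.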
Comment 5•24 years ago
There's another, *better* solution. See bug #12662. (And bug #454 (yes, that's a very old bug).)
Reporter
Comment 7•24 years ago
The other bugs do not help in all cases. And fixing this bug should be incredibly easy if you know where this char is set. Erik, where is that? I could fix it myself, then.
Comment 8•24 years ago
mozilla/gfx/src/windows/nsFontMetricsWin.cpp, near top of file, macro called NS_REPLACEMENT_CHAR. mozilla/gfx/src/gtk/nsFontMetricsGTK.cpp, look for nsFontGTKSubstitute::Convert. For the Mac, please ask ftang@netscape.com (Cc'ed). Please let me review any changes before you check them in.
Reporter
Comment 9•24 years ago
Erik, thanks for the hint. Taking bug.
> For the Mac, please ask ftang@netscape.com (Cc'ed).
*ask*
Comment 10•24 years ago
BenB, before you spend a lot of time coding, would you please tell us what you intend to do? Let's agree on that before putting a lot of effort into this. Thanks.
Reporter
Comment 11•24 years ago
Erik, I intend to just replace '?' with some other char (one that usually doesn't appear in text; see initial description, comments/suggestions welcome). Replacing '?' with some other char in the places you mentioned didn't work: I still see '?' in the attached testcase.
Summary: Do not use "?" as fallback gylph → Do not use '?' as fallback gylph
Reporter
Comment 12•24 years ago
I'm using Linux and just changed the *GTK* file.
Comment 13•24 years ago
Take another look at nsFontGTKSubstitute::Convert. Look for "QuestionMark".
Reporter
Updated•24 years ago
Keywords: mozilla1.0
Comment 14•24 years ago
take out TM and reassign to bstell
Assignee: erik → bstell
Status: ASSIGNED → NEW
Target Milestone: M30 → ---
Updated•24 years ago
Status: NEW → ASSIGNED
Reporter
Comment 15•24 years ago
I had something working, but:
- I don't know which char to use. It must not be used in normal text and should be available in all relevant charsets. Which chars are safe? Only ASCII, or can I use any ISO-8859-1 char? (The latter would make the choice *substantially* easier.)
- The source change is not as trivial as it seems.
- Apparently, anything can select over an IDL interface which replacement strategy to use ("?", char code number and some other options). Unfortunately, the identifier for "?" is actually called 'UseQuestionMark' or similar. :-( I.e. either we have a misnamed API, or we change it.
- In addition to that, there is at least one fallback in each of the platform-specific implementations. The problem is to catch them all, or we might have the new replacement char in most cases and the old one, the question mark, in some exceptional cases.
Comment 16•24 years ago
We could display the Unicode value, e.g. "\x5E3A" or (5E3A). At least this way the user could tell us the code points that are failing instead of "I saw a bunch of <fallback> characters".
Reporter
Comment 17•24 years ago
Brian, we have that option already; it's one of the other options in the interface. Using it would mean changing the caller, but I would prefer not to do that. Assume you have the following original text:
Die Bäuerin ißt die größten Törtchen.
(The text makes no sense.) What is easier to read:
Die B#uerin i#t die gr##ten T#rtchen.
or
Die B\x5463uerin i\x4683t die gr\x5734\x4683ten T\x5734rtchen.
? The former.
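The two styles compared in this comment can be reproduced with a short sketch. It is only an illustration in Python, with hypothetical helper names (`render_with_marker`, `render_with_codes`) and an assumed ASCII-only font. Unlike the made-up numbers in the comment, it prints the characters' actual code points.

```python
# Assumed font coverage: printable ASCII only (0x20-0x7E).
ASCII = {chr(c) for c in range(0x20, 0x7F)}


def render_with_marker(text, available, marker="#"):
    """Replace every unavailable character with a single marker char."""
    return "".join(ch if ch in available else marker for ch in text)


def render_with_codes(text, available):
    r"""Replace every unavailable character with its code point as \xNNNN."""
    return "".join(
        ch if ch in available else f"\\x{ord(ch):04X}" for ch in text
    )


# "Die Bäuerin ißt die größten Törtchen." written with escapes so the
# example does not depend on the source file's encoding.
text = "Die B\u00e4uerin i\u00dft die gr\u00f6\u00dften T\u00f6rtchen."

print(render_with_marker(text, ASCII))  # Die B#uerin i#t die gr##ten T#rtchen.
print(render_with_codes(text, ASCII))
```

The marker version keeps word shapes intact, while the code-point version is more useful in a bug report; which one is preferable depends on who is reading.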
Comment 18•24 years ago
Unicode has a fallback glyph. If any font on the system has this fallback glyph (checking in the order in which fonts are given in the font-family list), then we should use that before any other character.
Reporter
Comment 19•24 years ago
> if any font on the system has this fallback glyph, [...]
> then we should use that before any other character.
Doesn't it look odd, if you have a Times glyph in a Courier font block?
Comment 20•24 years ago
> UNICODE has a fallback glyph
What's its name/number?
Comment 21•24 years ago
Ben: That's why you try all the specified fonts first. This is the same algorithm as is used to find each glyph in the first place.
Karl: U+FFFD REPLACEMENT CHARACTER
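The "try all the specified fonts first" idea can be sketched as a plain ordered search. This is a hedged Python illustration rather than the real font-matching code: `find_font_for` and the `glyph_tables` mapping (font name to the set of characters it covers) are hypothetical, and the font coverage below is invented for the example.

```python
def find_font_for(ch, font_family, glyph_tables):
    """Return the first font in font_family that can draw ch, else None.

    font_family  -- font names in the order the author specified them
    glyph_tables -- hypothetical mapping: font name -> set of characters
    """
    for name in font_family:
        if ch in glyph_tables.get(name, ()):
            return name
    return None


# Invented coverage: only Times carries U+FFFD here.
glyph_tables = {
    "Courier": set("abc"),
    "Times": set("abc") | {"\uFFFD"},
}

# The replacement character is looked up exactly like any other glyph,
# so a Times glyph only appears in Courier text if Courier lacks it.
print(find_font_for("a", ["Courier", "Times"], glyph_tables))       # Courier
print(find_font_for("\uFFFD", ["Courier", "Times"], glyph_tables))  # Times
```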
Reporter
Comment 22•24 years ago
> This is the same algorithm as is used to find each glyph in the first place.
Oh, OK. So, all we need to do is replace the |'?'| in the source with
|REPLACEMENT_CHAR| or however it is called? (Assuming we leave the misnamed API
alone for now.)
Comment 23•24 years ago
The API is not really "misnamed", since it *does* give you question marks, as advertized. You may argue that the API should also (or instead) have given you the option to choose something other than the question mark, but that is a different argument. Anyway, it is not sufficient to replace '?' with the Unicode replacement char, since (at least on Windows and Unix) the "substitute" font machinery assumes that the substitute font only guarantees ASCII availability (0x20-0x7E), and so you can only use ASCII fallbacks (e.g. "?", "EUR", "...", etc). If you wanted to have a more general fallback mechanism, you would have to implement another level of font switching (in addition to the one that is already in nsRenderingContext{Win,GTK}). I.e. you would need to switch from font to font inside the nsFont{Win,GTK}Substitute.{GetWidth,DrawString,etc} methods too.
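The constraint described here, that a substitute font only guarantees 0x20-0x7E, amounts to a simple range check on any candidate fallback string. A minimal Python sketch, with the hypothetical helper name `is_safe_substitute`:

```python
def is_safe_substitute(s):
    """True if every character of s lies in the guaranteed range 0x20-0x7E."""
    return len(s) > 0 and all(0x20 <= ord(ch) <= 0x7E for ch in s)


# The ASCII fallbacks named in the comment pass the check.
for candidate in ("?", "EUR", "...", "#"):
    print(candidate, is_safe_substitute(candidate))  # True each time

# Anything outside the range may be missing from the substitute font.
print(is_safe_substitute("\uFFFD"))  # False
print(is_safe_substitute("\u00b0"))  # False (degree sign)
```

Under this assumption, any non-ASCII replacement would need the extra level of font switching the comment goes on to describe.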
Reporter
Comment 24•24 years ago
> The API is not really "misnamed", since it *does* give you question marks, as
> advertized.
This bug is about changing that -> The API (name) was not general enough -> It *will be* "misnamed" once we fix this bug (and not the API).
> it is not sufficient to replace '?' with the Unicode replacement char, [...]
I guessed so :-(.
Proposal: 2 steps for this bug. Step 1 just changes the current replacement char ('?') to some other, more uncommon, but still ASCII, char, e.g. '|'. Step 2 adds an additional layer on top of that. It tries to get the Unicode replacement char via the normal font mechanisms (Erik, would that work in the substitution code?) and, failing that, uses the ASCII replacement char.
Comment 25•24 years ago
> Karl: U+FFFD REPLACEMENT CHARACTER
This is actually very nice. It looks (could look) like a white question mark on
a rotated black square. The description reads:
"used to replace an incoming character
whose value is unknown or unrepresentable
in Unicode"
Comment 26•24 years ago
If you submit a patch that simply changes the '?' to something else but does not change the name used in the API (attr_FallbackQuestionMark), then I will disapprove. Also, the old API should probably stay, so you need to add something. I actually don't feel so strongly about the '?', but if many people want to change it, then I'd rather use '#' than '|', since the latter looks too much like an 'l' or a '1' or an 'I'. Yes, the additional font switching would be implemented inside nsFontGTKSubstitute::{GetWidth,DrawString,GetBoundingMetrics}.
Reporter
Comment 27•24 years ago
I'll attach what I have. Erik already said, he will disapprove it, but I don't care to add a new interface (I would have changed the existing one).
Reporter
Comment 28•24 years ago
Reporter
Comment 29•24 years ago
Oh, cool, no, the fix doesn't work too well: Schumacher Clean on Linux doesn't have the replacement char I chose (°) :-(. I should have used #. Anyway, you can see from the patch what to change.
Updated•24 years ago
Target Milestone: --- → Future
Comment 32•22 years ago
BenB, in response to comment #19, the replacement character has an appearance unique enough that I don't think it really would look different in different fonts. It's a black diamond with a white question mark inside. Looks like this if your browser can render it: �
Most (all?) versions of Netscape 6 used this character for many missing glyphs, and it's the only method of calling out missing characters I've ever seen that didn't make the text confusing. Empty squares, question marks, and # signs can all look like legitimate text in some situations. I'd argue that no US-ASCII character will ever avoid these problems, and that some distinct character must be used.
Either use Replacement Character (and include a font with just that char in all moz distributions if you have to), or create a GIF or PNG image that looks like it, and insert that in the event that Replacement Character is not available.
Readers need something that makes it clear they're going to have to guess (from context, likely) what the character was supposed to be. Authors also need something bold and clear, so they know they're using a character that's either incorrect or not present in all fonts. Using any other character is going to hide that fact. � is easily seen when scanning text, and though it can have a jarring appearance, this is a necessary effect.
Comment 33•19 years ago
-> to default owner (rather than ftang's WONTFIX)
Assignee: ftang → smontagu
Status: ASSIGNED → NEW
QA Contact: teruko → amyy
Target Milestone: Future → ---
Assignee
Comment 34•16 years ago
Fixed by bug 372629.