Closed Bug 33032 Opened 24 years ago Closed 13 years ago

Display Unicode space characters as space

Categories

(Core :: Layout: Text and Fonts, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla6

People

(Reporter: erik, Assigned: jfkthame)

References

(Blocks 1 open bug)

Details

(Keywords: intl)

Attachments

(2 files, 2 obsolete files)

Taken from the following bug:

  http://bugzilla.mozilla.org/show_bug.cgi?id=6585

+ ------- Additional Comments From ftang@netscape.com  2000-03-23 09:40 -------
+ However, we should do special process for
+ "space" characters in the range of U+2000-U+206F. (some of them, not all of
+ them). There are no reason we cannot "render"/"measure" a space characters, 
even
+ non of the font have these glyph (a glyph for a space character is simply a
+ matrix without any countor....)
Status: NEW → ASSIGNED
Target Milestone: --- → M20
Change the summary to 
Display Unicode space characters as space
Should we split this to 3 bugs. one for each platform ?
Summary: implement the Unicode space characters → Display Unicode space characters as space
Mark M24 and reassign to bstell for linux part 
Assignee: erik → bstell
Status: ASSIGNED → NEW
Target Milestone: M20 → M24
Status: NEW → ASSIGNED
Target Milestone: --- → Future
bstell: This bug seems to have acquired 37 votes. There is obviously demand here
:-) Is it on your radar at all?

Gerv
ftang: can you set the priority on this?
--> ftang
Assignee: bstell → ftang
Status: ASSIGNED → NEW
bulk move NEW FUTURE bug to ASSIGN
Status: NEW → ASSIGNED
over to shanjian
Assignee: ftang → shanjian
Status: ASSIGNED → NEW
accepting. 
We can probably do something in subsititute font. 
Status: NEW → ASSIGNED
Does anybody still care about this bug? If the evolution of font on various
systems make this bug unnecessary, I will close it. If not, please post your
comment and a test case. 
Keywords: mozilla1.0
*** Bug 215700 has been marked as a duplicate of this bug. ***
testcase from bug 215700

Actual Results:  
It shows a single line of text that looks like this:
First paragraph.?Second paragraph?Third paragraph, first line.?Third paragraph,

second line?


Expected Results:  
It should instead display as:

First paragraph.
Second paragraph
Third paragraph, first line.
Third paragraph, second line
We have to deal with some of them (Unicode paragraph separator, line separator,
etc)  an XP-way in nsTextTransformeran while others have to be dealt with in
each Gfx port (em space, en space, quad-space?, etc). I guess em/en/quad spaces
are already handled 'correctly' by our trasliterator.
Keywords: intl
Component: Layout → Layout: Fonts and Text
shanjian is no longer working on mozilla for 2 years and these bugs are still
here. Mark them won't fix. If you want to reopen it, find a good owner first. 
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
Mass Reassign Please excuse the spam
Assignee: shanjian → nobody
Mass Re-opening Bugs Frank Tang Closed on Wensday March 02 for no reason, all
the spam is his fault feel free to tar and feather him
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Reassigning Franks old bugs to Jungshik Shin for triage - Sorry for spam
Assignee: nobody → jshin1987
Status: REOPENED → NEW
Blocks: 215700
QA Contact: chrispetersen → layout.fonts-and-text
>I guess em/en/quad spaces are already handled 'correctly' by our trasliterator.

Would fixing this fix Bug 551734 - Unicode hair space (U+200A) is too wide if style="font: x-small sans-serif" (has test case)?
Blocks: 653911
Taking this bug - I think I can cook up a patch to make things work better.

(In reply to comment #17)
> >I guess em/en/quad spaces are already handled 'correctly' by our trasliterator.
> 
> Would fixing this fix Bug 551734 - Unicode hair space (U+200A) is too wide if
> style="font: x-small sans-serif" (has test case)?

That may depend on your system and browser font configuration. Specifically, it would depend whether the sans-serif font concerned actually has a (too-wide) glyph for U+200A. If so, we'll still use that glyph; but if what's happening is that the font does _not_ support U+200A, and so you're seeing fallback to a different font that is too wide in comparison, then I think we can fix that.
Assignee: jshin1987 → jfkthame
This resolves bug 653911 on my Android device, in addition to avoiding potentially expensive font fallback.

We could omit the part of this patch that short-circuits font search (in gfxFontGroup::FindFontForChar), and _only_ use the "fake" spaces in gfxFontGroup::InitScriptRun if font-matching fails to find any font at all (as happens on Android), but I don't see any good reason to fall back through the font list and system fonts for these characters; if they're not supported in the first-choice font, we might as well go directly to the option of using a standard space glyph with an appropriately-modified width.
Attachment #529341 - Flags: review?(jdaggett)
This tests a few of the defined-width spaces in the U+200x range to check that they render as the expected amount of space even when the current font does not include them.
Attachment #529381 - Flags: review?(jdaggett)
Comment on attachment 529341 [details] [diff] [review]
patch, synthesize various Unicode spaces rather than falling back to other font or missing-glyph rendering

>      // 1. check fonts in the font group
>      for (PRUint32 i = 0; i < FontListLength(); i++) {
>          nsRefPtr<gfxFont> font = GetFontAt(i);
> -        if (font->HasCharacter(aCh))
> +        if (font->HasCharacter(aCh)) {
>              return font.forget();
> +        }
> +        if (i == 0 &&
> +            gfxUnicodeProperties::GetGeneralCategory(aCh) ==
> +                HB_CATEGORY_SPACE_SEPARATOR)
> +        {
> +            return nsnull;
> +        }
>      }

I really don't think we should apply this at this level, if we're going to do this we should apply it after pref fonts are referenced, before system font fallback occurs.  And doesn't this effectively apply to any codepoint with category == Zs?  That would appear to include to include characters like normal
spaces, full-width spaces:

  0020;SPACE;Zs;0;WS;;;;;N;;;;;
  00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
  1680;OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
  180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;
  205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
  3000;IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;
  
Maybe I'm not guessing correctly as to how GetGeneralCategory functions (reading the code in gfxUnicodePropertyData.cpp certainly doesn't help!).  Won't this patch also affect shaping behavior for Mongolian if U+180e is used, since with this patch we won't hit system fallback for that character?
Attachment #529341 - Flags: review?(jdaggett) → review-
Argh, yes, I was initially thinking I'd do _all_ the GC=Zs characters, but then backed off a bit from that as some of them are less clear-cut.

Here is an alternative version that only steps in before the final system-fallback search, and only for the characters that GetUnicodeSpaceWidth actually knows about.
Attachment #529341 - Attachment is obsolete: true
Attachment #529381 - Attachment is obsolete: true
Attachment #529434 - Flags: review?(jdaggett)
Attachment #529381 - Flags: review?(jdaggett)
Comment on attachment 529434 [details] [diff] [review]
patch, v2 - synthesize unicode spaces - less aggressive approach

+gfxFont::GetUnicodeSpaceWidth(PRUint32 aCh)

A better name here would be 'SynthesizeSpaceWidth', since you're not really "getting" the width in the sense of fetching it from the font.

+    // for known "space" characters, don't do a full system-fallback search;
+    // we'll synthesize appropriate-width spaces instead of missing-glyph boxes
+    if (GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0) {
+        return nsnull;
+    }

The switch statement in GetUnicodeSpaceWidth is relatively expensive, I think it might be better to use:

if (gfxUnicodeProperties::GetGeneralCategory(aCh) == HB_CATEGORY_SPACE_SEPARATOR && 
    GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0)
{
   ...
}
Attachment #529434 - Flags: review?(jdaggett) → review+
(In reply to comment #23)
> A better name here would be 'SynthesizeSpaceWidth'

OK, I like that.

> The switch statement in GetUnicodeSpaceWidth is relatively expensive, I think
> it might be better to use:
> 
> if (gfxUnicodeProperties::GetGeneralCategory(aCh) ==
> HB_CATEGORY_SPACE_SEPARATOR && 
>     GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0)
> {
>    ...
> }

I'm not sure it matters much once we're down to this point in FindFontForChar, but FWIW I've included this, as GetGeneralCategory is designed to be a very fast table lookup.

Pushed:
http://hg.mozilla.org/mozilla-central/rev/9b312736f4ca
Status: NEW → RESOLVED
Closed: 19 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: Future → mozilla6
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: