Display Unicode space characters as space

RESOLVED FIXED in mozilla6

Status

()

Core
Layout: Text
P3
normal
RESOLVED FIXED
18 years ago
6 years ago

People

(Reporter: Erik van der Poel, Assigned: jfkthame)

Tracking

(Blocks: 1 bug, {intl})

Trunk
mozilla6
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

18 years ago
Taken from the following bug:

  http://bugzilla.mozilla.org/show_bug.cgi?id=6585

+ ------- Additional Comments From ftang@netscape.com  2000-03-23 09:40 -------
+ However, we should do special process for
+ "space" characters in the range of U+2000-U+206F. (some of them, not all of
+ them). There are no reason we cannot "render"/"measure" a space characters, 
even
+ non of the font have these glyph (a glyph for a space character is simply a
+ matrix without any countor....)
(Reporter)

Updated

18 years ago
Status: NEW → ASSIGNED
Target Milestone: --- → M20

Comment 1

17 years ago
Change the summary to 
Display Unicode space characters as space
Should we split this to 3 bugs. one for each platform ?
Summary: implement the Unicode space characters → Display Unicode space characters as space

Comment 2

17 years ago
Mark M24 and reassign to bstell for linux part 
Assignee: erik → bstell
Status: ASSIGNED → NEW
Target Milestone: M20 → M24

Updated

17 years ago
Status: NEW → ASSIGNED

Updated

17 years ago
Target Milestone: --- → Future
bstell: This bug seems to have acquired 37 votes. There is obviously demand here
:-) Is it on your radar at all?

Gerv
Keywords: mozilla0.9.1, mozilla1.0

Comment 4

17 years ago
ftang: can you set the priority on this?

Comment 5

16 years ago
--> ftang
Assignee: bstell → ftang
Status: ASSIGNED → NEW

Comment 6

16 years ago
bulk move NEW FUTURE bug to ASSIGN
Status: NEW → ASSIGNED

Comment 7

15 years ago
over to shanjian
Assignee: ftang → shanjian
Status: ASSIGNED → NEW

Comment 8

15 years ago
accepting. 
We can probably do something in subsititute font. 
Status: NEW → ASSIGNED

Comment 9

15 years ago
Does anybody still care about this bug? If the evolution of font on various
systems make this bug unnecessary, I will close it. If not, please post your
comment and a test case. 
Keywords: mozilla1.0

Comment 10

14 years ago
*** Bug 215700 has been marked as a duplicate of this bug. ***

Comment 11

14 years ago
Created attachment 129682 [details]
UTF-8 plaintext  testcase with paragraph and line separator characters

testcase from bug 215700

Actual Results:  
It shows a single line of text that looks like this:
First paragraph.?Second paragraph?Third paragraph, first line.?Third paragraph,

second line?


Expected Results:  
It should instead display as:

First paragraph.
Second paragraph
Third paragraph, first line.
Third paragraph, second line

Comment 12

14 years ago
We have to deal with some of them (Unicode paragraph separator, line separator,
etc)  an XP-way in nsTextTransformeran while others have to be dealt with in
each Gfx port (em space, en space, quad-space?, etc). I guess em/en/quad spaces
are already handled 'correctly' by our trasliterator.
Keywords: intl
Component: Layout → Layout: Fonts and Text

Comment 13

13 years ago
shanjian is no longer working on mozilla for 2 years and these bugs are still
here. Mark them won't fix. If you want to reopen it, find a good owner first. 
Status: ASSIGNED → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → WONTFIX

Comment 14

13 years ago
Mass Reassign Please excuse the spam
Assignee: shanjian → nobody

Comment 15

13 years ago
Mass Re-opening Bugs Frank Tang Closed on Wensday March 02 for no reason, all
the spam is his fault feel free to tar and feather him
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---

Comment 16

13 years ago
Reassigning Franks old bugs to Jungshik Shin for triage - Sorry for spam
Assignee: nobody → jshin1987
Status: REOPENED → NEW

Updated

13 years ago
Blocks: 215700
QA Contact: chrispetersen → layout.fonts-and-text

Comment 17

8 years ago
>I guess em/en/quad spaces are already handled 'correctly' by our trasliterator.

Would fixing this fix Bug 551734 - Unicode hair space (U+200A) is too wide if style="font: x-small sans-serif" (has test case)?
(Assignee)

Updated

6 years ago
Blocks: 653911
(Assignee)

Comment 18

6 years ago
Taking this bug - I think I can cook up a patch to make things work better.

(In reply to comment #17)
> >I guess em/en/quad spaces are already handled 'correctly' by our trasliterator.
> 
> Would fixing this fix Bug 551734 - Unicode hair space (U+200A) is too wide if
> style="font: x-small sans-serif" (has test case)?

That may depend on your system and browser font configuration. Specifically, it would depend whether the sans-serif font concerned actually has a (too-wide) glyph for U+200A. If so, we'll still use that glyph; but if what's happening is that the font does _not_ support U+200A, and so you're seeing fallback to a different font that is too wide in comparison, then I think we can fix that.
Assignee: jshin1987 → jfkthame
(Assignee)

Comment 19

6 years ago
Created attachment 529341 [details] [diff] [review]
patch, synthesize various Unicode spaces rather than falling back to other font or missing-glyph rendering

This resolves bug 653911 on my Android device, in addition to avoiding potentially expensive font fallback.

We could omit the part of this patch that short-circuits font search (in gfxFontGroup::FindFontForChar), and _only_ use the "fake" spaces in gfxFontGroup::InitScriptRun if font-matching fails to find any font at all (as happens on Android), but I don't see any good reason to fall back through the font list and system fonts for these characters; if they're not supported in the first-choice font, we might as well go directly to the option of using a standard space glyph with an appropriately-modified width.
Attachment #529341 - Flags: review?(jdaggett)
(Assignee)

Comment 20

6 years ago
Created attachment 529381 [details] [diff] [review]
reftest for synthetic Unicode space characters

This tests a few of the defined-width spaces in the U+200x range to check that they render as the expected amount of space even when the current font does not include them.
Attachment #529381 - Flags: review?(jdaggett)

Comment 21

6 years ago
Comment on attachment 529341 [details] [diff] [review]
patch, synthesize various Unicode spaces rather than falling back to other font or missing-glyph rendering

>      // 1. check fonts in the font group
>      for (PRUint32 i = 0; i < FontListLength(); i++) {
>          nsRefPtr<gfxFont> font = GetFontAt(i);
> -        if (font->HasCharacter(aCh))
> +        if (font->HasCharacter(aCh)) {
>              return font.forget();
> +        }
> +        if (i == 0 &&
> +            gfxUnicodeProperties::GetGeneralCategory(aCh) ==
> +                HB_CATEGORY_SPACE_SEPARATOR)
> +        {
> +            return nsnull;
> +        }
>      }

I really don't think we should apply this at this level, if we're going to do this we should apply it after pref fonts are referenced, before system font fallback occurs.  And doesn't this effectively apply to any codepoint with category == Zs?  That would appear to include to include characters like normal
spaces, full-width spaces:

  0020;SPACE;Zs;0;WS;;;;;N;;;;;
  00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
  1680;OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
  180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;
  205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
  3000;IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;
  
Maybe I'm not guessing correctly as to how GetGeneralCategory functions (reading the code in gfxUnicodePropertyData.cpp certainly doesn't help!).  Won't this patch also affect shaping behavior for Mongolian if U+180e is used, since with this patch we won't hit system fallback for that character?
Attachment #529341 - Flags: review?(jdaggett) → review-
(Assignee)

Comment 22

6 years ago
Created attachment 529434 [details] [diff] [review]
patch, v2 - synthesize unicode spaces - less aggressive approach

Argh, yes, I was initially thinking I'd do _all_ the GC=Zs characters, but then backed off a bit from that as some of them are less clear-cut.

Here is an alternative version that only steps in before the final system-fallback search, and only for the characters that GetUnicodeSpaceWidth actually knows about.
Attachment #529341 - Attachment is obsolete: true
Attachment #529381 - Attachment is obsolete: true
Attachment #529434 - Flags: review?(jdaggett)
Attachment #529381 - Flags: review?(jdaggett)

Comment 23

6 years ago
Comment on attachment 529434 [details] [diff] [review]
patch, v2 - synthesize unicode spaces - less aggressive approach

+gfxFont::GetUnicodeSpaceWidth(PRUint32 aCh)

A better name here would be 'SynthesizeSpaceWidth', since you're not really "getting" the width in the sense of fetching it from the font.

+    // for known "space" characters, don't do a full system-fallback search;
+    // we'll synthesize appropriate-width spaces instead of missing-glyph boxes
+    if (GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0) {
+        return nsnull;
+    }

The switch statement in GetUnicodeSpaceWidth is relatively expensive, I think it might be better to use:

if (gfxUnicodeProperties::GetGeneralCategory(aCh) == HB_CATEGORY_SPACE_SEPARATOR && 
    GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0)
{
   ...
}
Attachment #529434 - Flags: review?(jdaggett) → review+
(Assignee)

Comment 24

6 years ago
(In reply to comment #23)
> A better name here would be 'SynthesizeSpaceWidth'

OK, I like that.

> The switch statement in GetUnicodeSpaceWidth is relatively expensive, I think
> it might be better to use:
> 
> if (gfxUnicodeProperties::GetGeneralCategory(aCh) ==
> HB_CATEGORY_SPACE_SEPARATOR && 
>     GetFontAt(0)->GetUnicodeSpaceWidth(aCh) >= 0.0)
> {
>    ...
> }

I'm not sure it matters much once we're down to this point in FindFontForChar, but FWIW I've included this, as GetGeneralCategory is designed to be a very fast table lookup.

Pushed:
http://hg.mozilla.org/mozilla-central/rev/9b312736f4ca
Status: NEW → RESOLVED
Last Resolved: 13 years ago6 years ago
Resolution: --- → FIXED

Updated

6 years ago
Target Milestone: Future → mozilla6
You need to log in before you can comment on or make changes to this bug.