Closed Bug 339513 Opened 13 years ago Closed 13 years ago

neutral characters rendering problem (refactoring of gfxPangoTextRun)

Categories

(Core :: Graphics, defect, P1, major)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: masayuki, Assigned: masayuki)

References

()

Details

(Whiteboard: Should fix: itemizing for neutral character, font switching like Windows, honor the font prefs)

Attachments

(1 file, 27 obsolete files)

71.48 KB, patch
Details | Diff | Splinter Review
Sorry for the no good summary.

This is pango's issue. When pango chooses the font for the lang neutral characters(e.g., numbers), it decides the font from around characters. This is a good idea for the *segment*. But it's not valid for our rendering. In the test case, the number '4's are rendered different fonts on its position. This is not beautiful. We should always render same font irrespective of the place of the characters.
Attached patch Patch rv1.0 (obsolete) — Splinter Review
This patch separates the text to some segments. They are non-neutral(language dependent) characters segment and neutral(language independent) characters.

Therefore, the neutral characters are rendered by default font.
Attachment #223606 - Flags: review?(pavlov)
Attached image screenshot of overlapping problem (obsolete) —
And this bug has another problem. This bug makes overlapping problem.(But I don't understand the logic.)
Flags: blocking1.9a1?
Priority: -- → P1
Target Milestone: --- → mozilla1.9alpha
I found another problem that being fixed by this patch.
See this testcase:
http://bugzilla.mozilla.gr.jp/attachment.cgi?id=3169
This is rendered as following screenshot:
http://bugzilla.mozilla.gr.jp/attachment.cgi?id=3170

When the justifying, the characters are mesured per character. So, the neutral characters('(' and ')') don't depend on the around characters at measuring. But at rendering, the neutral characters depend on the around characters. But this patch fixes the difference.
Attachment #223606 - Flags: review?(pavlov) → review?(vladimir)
Comment on attachment 223606 [details] [diff] [review]
Patch rv1.0

I cancel this review.

If a string has two or more languages, each languages are rendered by each system default fonts. I think that we may be able to fix both problem in a patch. I'll try it.
# And maybe also the font.name-list is.
Attachment #223606 - Flags: review?(vladimir)
I think that if my plan will be success, Bug 334299 and Bug 334565 will be fixed by it. If I confirm it, I'll block these bugs by this bug.
Depends on: 343454
Flags: blocking1.9a1?
Summary: neutral characters rendering problem → neutral characters rendering problem (refactoring of gfxPangoTextRun)
Whiteboard: Should fix: itemizing for neutral character, font switching like Windows, honor the font prefs
Masa, the problem in c#1, I think we need a larger perspective on whether what you try to implement is actually desirable. I believe the behavior was implemented knowingly  in Pango, and their point of view was the advantage outweighs the inconvenients. It would be good to have a feedback from the initial Pango implementers. It think that is a worst case, and in many real life situations it will look nicer to use lang neutral characters that come from the same font as the surrounding characters.

Do you have real life examples of this going bad ? (I understand you transferred this bug from bugzilla.mozilla.gr.jp ) Doesn't the problem only happen on UTF-8 pages ? (that really should use the lang attribute to avoid random language change)
I think that if CSS style sheet has system font specify for font-family, we may need to discuss for the best approach. But the testcase is not having system font specify, so, it should be rendered by CSS rules.
(In reply to comment #11)
> Do you have real life examples of this going bad ? (I understand you
> transferred this bug from bugzilla.mozilla.gr.jp ) Doesn't the problem only
> happen on UTF-8 pages ? (that really should use the lang attribute to avoid
> random language change)

No, the problem is occurred on many pages(including non-UTF-8 pages), so that doesn't occur only on specially testcases. The current pango code doesn't honor the lang attribute.
This patch isn't not final, it is draft.
This shows my idea for new structure of gfxPangoTextRun.
Attachment #223606 - Attachment is obsolete: true
Attachment #223607 - Attachment is obsolete: true
Attachment #223608 - Attachment is obsolete: true
Attachment #223610 - Attachment is obsolete: true
Attachment #223611 - Attachment is obsolete: true
This patch fixes very many problems.

1. This chooses the good font in every segments.
2. Even if Japanese and Chinese characters are mixed in a text, this can render the one.
3. This fixes the bug 310746.
4. This fixes the bug 334299.

remains works:
1. Measure the performance.
2. Optimize and cleanup the code.
3. fix RTL problem(in RTL page, this hangs up.)
Attachment #234466 - Attachment is obsolete: true
I tested the performance of this patch. That will give the damage for the tp value in tinderbox. In my environment, the page loading scores are:

current:
Avg. Median : 846 msec
Average     : 856 msec
Minimum     : 405 msec
Maximum     : 2864 msec

patched:
Avg. Median : 906 msec
Average     : 962 msec
Minimum     : 426 msec
Maximum     : 2716 msec

So, we need fast function for simple text like as gfxWindowsTextRun.
RTL problem is fixed.
The final patch coming soon.
Attachment #234961 - Attachment is obsolete: true
This patch may be pre-final.
This patch fixes:
* adding "x-central-euro" for pango-lang to moz-lang table.
* use "x-unicode" if aLangGroup of ForEachFont is empty.
* fix bustage on Win/Mac.

I'll optimize in next patch.
Attachment #235260 - Attachment is obsolete: true
Attachment #235277 - Attachment description: Itemize patch rv10.0 (work in progress) → Itemize patch rv0.10 (work in progress)
Attached patch Itemizing patch rv1.0 (obsolete) — Splinter Review
The performances:
current build
Avg. Median : 851 msec
Average     : 866 msec
Minimum     : 442 msec
Maximum     : 3097 msec

patched build
Avg. Median : 821 msec
Average     : 835 msec
Minimum     : 415 msec
Maximum     : 2130 msec

The patched build is little faster. But it may not be correct. However, there is no regression in performance :-)

The detail of the patch will be written in next comment.
Attachment #235277 - Attachment is obsolete: true
Attachment #235587 - Flags: review?(vladimir)
This patch has very many features:

1. bug 310746, this fixes the non-ASCII font name problem.
2. bug 334299, this uses font prefs for font switching if we cannot find the glyphs in default font.
3. This guarantees the same character is always rendered by same font. Therefore, this fixes comment 0, comment 1 and comment 7 problems.
4. This can render the CJK complex text(e.g., the Japanese text has Chinese only character). Because if we cannot find glyph in current font list, it adds the all CJK pref fonts to the list.
5. Expanding the moz-lang to pango-lang table.
6. Adding the pango-lang to moz-lang table and function.
7. Now, we are released from layout module of pango!

This is not using layout module of pango. We can get glyphs by |pango_shape| that needs |PangoAnalysis| struct. We have two ways for initializing |PangoAnalysis|.
1. using |pango_itemize|:
This way is used in |MeasureOrDrawItemizing|. This is not good for performance. But we can get the hints for missing glyphs by the language information of the text.
2. ourselves:
This way is used in |MeasureOrDrawFast|. In this case, this patch uses font language for the initializing. In most cases, we can render by this function. But if there are missing glyphs, we need to fall back to |MeasureOrDrawItemizing|.
Attached patch Itemizing patch rv1.1 (obsolete) — Splinter Review
The previous patch has a nit.
Attachment #235587 - Attachment is obsolete: true
Attachment #235647 - Flags: review?(vladimir)
Attachment #235587 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.2 (obsolete) — Splinter Review
fix perf issue by non-drawable characters.
Attachment #235647 - Attachment is obsolete: true
Attachment #235741 - Flags: review?(vladimir)
Attachment #235647 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.3 (obsolete) — Splinter Review
Sorry for the spam. This fixes invalid length if the missing glyphs found in non-left-most of the segment.
Attachment #235741 - Attachment is obsolete: true
Attachment #235742 - Flags: review?(vladimir)
Attachment #235741 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.3 (obsolete) — Splinter Review
Updating for latest trunk. I'll attach rv1.4 for minor change.
Attachment #235742 - Attachment is obsolete: true
Attachment #235742 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.4 (obsolete) — Splinter Review
This fixes bug 343262 too, see the diff.
Attachment #236081 - Attachment is obsolete: true
Attachment #236086 - Flags: review?(vladimir)
Comment on attachment 236086 [details] [diff] [review]
Itemizing patch rv1.4

I found a hang up bug. Please wait.
Attachment #236086 - Flags: review?(vladimir) → review-
Every time I start to review you post a new patch up :)  I'll look at the next patch as soon as you post it; so far everything looked ok though I had a few questions.
Attached patch Itemizing patch rv1.5 (obsolete) — Splinter Review
Sorry for the delay.

In Arabic text, sometimes, pango hung-upped if I sent the font that doesn't have Arabic glyphs to the pango shaper. This patch fixes this problem. (If the text has Arabic character(s), the FontSelector doesn't send non-Arabic fonts to the pango shaper.)
Attachment #236086 - Attachment is obsolete: true
Attachment #236116 - Flags: review?(vladimir)
(In reply to comment #29)
> In Arabic text, sometimes, pango hung-upped if I sent the font that doesn't
> have Arabic glyphs to the pango shaper. This patch fixes this problem. (If the
> text has Arabic character(s), the FontSelector doesn't send non-Arabic fonts to
> the pango shaper.)

Ah, sorry. Of course, this way is not a best way. We should check whether the font has Arabic glyphs before calling the pango shaper. But I don't know how to do. So, I cannot use best way, now. But I'll try to fix it for better implementation.
(In reply to comment #29)
> Created an attachment (id=236116) [edit]
> Itemizing patch rv1.5
> 
> Sorry for the delay.
> 
> In Arabic text, sometimes, pango hung-upped if I sent the font that doesn't
> have Arabic glyphs to the pango shaper. This patch fixes this problem. (If the
> text has Arabic character(s), the FontSelector doesn't send non-Arabic fonts to
> the pango shaper.)
> 

I believe we want to send arabic text through the pango shaper.  We should figure out why pango is hanging -- it certainly doesn't hang for other applications using arabic.  Are we passing it in funny data?
(In reply to comment #31)
> (In reply to comment #29)
> > Created an attachment (id=236116) [edit]
> > Itemizing patch rv1.5
> > 
> > Sorry for the delay.
> > 
> > In Arabic text, sometimes, pango hung-upped if I sent the font that doesn't
> > have Arabic glyphs to the pango shaper. This patch fixes this problem. (If the
> > text has Arabic character(s), the FontSelector doesn't send non-Arabic fonts to
> > the pango shaper.)
> > 
> 
> I believe we want to send arabic text through the pango shaper.  We should
> figure out why pango is hanging -- it certainly doesn't hang for other
> applications using arabic.  Are we passing it in funny data?
> 
I don't know what happens in pango. If |mDontTryDifferentLangFont| is always PR_FALSE, I can get this error message:

(Gecko 4159): Pango-WARNING **: shape engine failure, expect ugly output. the offending font is 'Sazanami Gothic Bold 11.19921875'

If other applications uses pango-layout-*, pango always sends the Arabic font to Arabic text segment at shaping even if any font is specified.
And this happens when I selected the Arabic text in many cases.(Of course, it happen at page loading time.)
Attached patch Itemizing patch rv1.6 (obsolete) — Splinter Review
This fixes (but it's adhoc...) the problem of bug 334064 comment 20.
Attachment #236116 - Attachment is obsolete: true
Attachment #236197 - Flags: review?(vladimir)
Attachment #236116 - Flags: review?(vladimir)
Comment on attachment 236197 [details] [diff] [review]
Itemizing patch rv1.6

Wow, sorry. This patch cannot fix the problem completely. We should not use this way.
Attachment #236197 - Flags: review?(vladimir) → review-
Comment on attachment 236116 [details] [diff] [review]
Itemizing patch rv1.5

Sorry. I confirmed that the hang in Gecko, so not on pango. I'll find the cause and recreate the patch.
Attachment #236116 - Flags: review?(vladimir) → review-
Attached patch Itemizing patch rv1.7 (obsolete) — Splinter Review
This fixes Arabic text problem.
The cause of the hung-up is the context direction setting.
Even if the gfxTextRun is LTR, the text may have RTL text. In that case, this patch changes the context setting.
Attachment #236116 - Attachment is obsolete: true
Attachment #236239 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.8 (obsolete) — Splinter Review
removing the non-used member. sorry for the spam.
Attachment #236239 - Attachment is obsolete: true
Attachment #236246 - Flags: review?(vladimir)
Attachment #236239 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.9 (obsolete) — Splinter Review
Oops. sorry. the previous patch has a test code.
Attachment #236246 - Attachment is obsolete: true
Attachment #236247 - Flags: review?(vladimir)
Attachment #236246 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.10 (obsolete) — Splinter Review
the regression of bug 334064 comment 20 is fixed.
Attachment #236247 - Attachment is obsolete: true
Attachment #236369 - Flags: review?(vladimir)
Attachment #236247 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.11 (obsolete) — Splinter Review
fix a memory leak.
Attachment #236369 - Attachment is obsolete: true
Attachment #236709 - Flags: review?(vladimir)
Attachment #236369 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.12 (obsolete) — Splinter Review
Improving the font cache mechanism.
Now, we can get actual font name that is resolved the alias.
Attachment #236709 - Attachment is obsolete: true
Attachment #236726 - Flags: review?(vladimir)
Attachment #236709 - Flags: review?(vladimir)
Comment on attachment 236726 [details] [diff] [review]
Itemizing patch rv1.12

I have an idea for better code, please wait, sorry.
Attachment #236726 - Flags: review?(vladimir) → review-
Attached patch Itemizing patch rv1.13 (obsolete) — Splinter Review
The previous patch sometimes failed the font switching.
But this patch fixes the problem, maybe, the cause is a wrong font index management.
And this patch removes some warnings.

Maybe, this is a final if there are no bugs.
Attachment #236726 - Attachment is obsolete: true
Attachment #236745 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.14 (obsolete) — Splinter Review
Optimizing the font cache.
Attachment #236745 - Attachment is obsolete: true
Attachment #236783 - Flags: review?(vladimir)
Attachment #236745 - Flags: review?(vladimir)
Attached patch Itemizing patch rv1.14.1 (obsolete) — Splinter Review
Sorry, the previous patch has a nit.
Attachment #236783 - Attachment is obsolete: true
Attachment #236788 - Flags: review?(vladimir)
Attachment #236783 - Flags: review?(vladimir)
I think that we need more work for alias font name. Because the current patch still has some problem on some environments. However, I think that the code may be large. So, I don't like to merge the codes in the patch. And I'll continue to work on gfxPangoFonts for some issues.

Vlad, would you review the patch? Sorry for the delay.
Ok.. Can we let 1.14.1 sit there a bit so that I can review it?  :)  Any followup stuff can come as separate patches, otherwise I'll never have a stable version of this to look over (it's getting pretty big as it is).
(In reply to comment #48)
> Can we let 1.14.1 sit there a bit so that I can review it?

Vlad, sorry, I cannot understand this sentence, what you mean?
(In reply to comment #49)
> (In reply to comment #48)
> > Can we let 1.14.1 sit there a bit so that I can review it?
> 
> Vlad, sorry, I cannot understand this sentence, what you mean?

I mean that it'll take me at least a day to review, and if the patch keeps changing then I have a hard time doing that (and Stuart will need to review as well, so both of us together will need a few days).  If the patch is good enough as-is, give me a day or two to review, and then assuming everything is good, let's get this checked in and do other work as additional patches.  There's a bunch of (good!) changes here already, so it would be better to review this as it is.
Comment on attachment 236788 [details] [diff] [review]
Itemizing patch rv1.14.1

This looks fine; sorry the review took so long.  Stuart may have some comments on the core gfxFont changes for font-family splitting, but I'll let him chime in here.
Attachment #236788 - Flags: review?(vladimir) → review+
Comment on attachment 236788 [details] [diff] [review]
Itemizing patch rv1.14.1

Thank you, vlad for your review! I'm asking to stuart. Do you have some comments for this patch?
Attachment #236788 - Flags: review?(pavlov)
Comment on attachment 236788 [details] [diff] [review]
Itemizing patch rv1.14.1

this patch really needs a lot more comments so that you can follow what is going on.
(In reply to comment #54)
> (From update of attachment 236788 [details] [diff] [review] [edit])
> this patch really needs a lot more comments so that you can follow what is
> going on.

Where needs the comments? If you point, I'll attach the new patch.
Comment on attachment 236788 [details] [diff] [review]
Itemizing patch rv1.14.1

Please rewrite GetMozLanguage without gotos.  There isn't really any need for them in this function.
Attachment #236788 - Flags: review?(pavlov) → review+
checked-in.

Thank you.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 354349
Depends on: 354551
Comment on attachment 240086 [details] [diff] [review]
Itemizing patch rv1.15

>+/* static */
>+void
>+GetMozLanguage(const PangoLanguage *aLang, nsACString &aMozLang)
>+{
>+    aMozLang.Truncate();
>+    if (!aLang)
>+        return;
>+
>+    nsCAutoString lang(pango_language_to_string(aLang));
>+    if (lang.Equals("xx"))
>+        return;
>+
>+    do {
>+        for (PRUint32 i = 0; i < NUM_PANGO_ALL_LANG_GROUPS; ++i) {
>+            if (lang.Equals(PangoAllLangGroup[i].PangoLang)) {
>+                if (PangoAllLangGroup[i].mozLangGroup)
>+                    aMozLang.Assign(PangoAllLangGroup[i].mozLangGroup);
>+                return;
>+            }
>+        }
>+
>+        PRInt32 hyphen = lang.FindChar('-');
>+        if (hyphen != kNotFound) {
>+            lang.Cut(hyphen, lang.Length());
>+            continue;
>+        }
>+    } while (0);
>+}

This revised version of GetMozLanguage looks like it won't actually do the repeat if it fails in the first pass (unlike the version before comment 56), since |continue| still rechecks the loop condition before re-entering the loop (so in a while(0) there's no difference between |continue| and |break|).  Shouldn't you replace this:

        if (hyphen != kNotFound) {
            lang.Cut(hyphen, lang.Length());
            continue;
        }

with something like this:

        if (hyphen == kNotFound)
            break;
        lang.Cut(hyphen, lang.Length());

and then change while(0) to while(1) ?
Depends on: 357637
You need to log in before you can comment on or make changes to this bug.