Last Comment Bug 703100 - remove the existing text-run word cache and replace with a simpler and more efficient scheme
: remove the existing text-run word cache and replace with a simpler and more e...
Status: RESOLVED FIXED
[Snappy]
:
Product: Core
Classification: Components
Component: Layout: Text (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla12
Assigned To: Jonathan Kew (:jfkthame)
:
Mentors:
Depends on: 716229 717175 717852 726539 728133 728462 737942 745555 745699 751129 791953 909264 909344 970891
Blocks: 614476 694205 707959 708075 715473
  Show dependency treegraph
 
Reported: 2011-11-16 14:28 PST by Jonathan Kew (:jfkthame)
Modified: 2014-02-11 06:34 PST (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
pt 1 - eliminate gfxTextRunWordCache and gfxTextRunCache. (110.76 KB, patch)
2011-12-06 07:23 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.1 - implement gfxShapedWord caching for gfxFont instances. (135.55 KB, patch)
2011-12-06 07:24 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.2 - adapt Mac font code to work with gfxShapedWord caches. (24.09 KB, patch)
2011-12-06 07:24 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.3 - adapt Windows font code to work with gfxShapedWord caches. (52.49 KB, patch)
2011-12-06 07:25 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.4 - adapt Linux/Pango font code to work with gfxShapedWord caches. (35.60 KB, patch)
2011-12-06 07:25 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.5 - adapt Android/FT2 font code to work with gfxShapedWord caches. (7.50 KB, patch)
2011-12-06 07:25 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 3 - remove copy of original characters from gfxTextRun. (60.61 KB, patch)
2011-12-06 07:26 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 4 - add timed expiration of cached gfxShapedWord records. (11.26 KB, patch)
2011-12-06 07:26 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 5 - optimize allocation of gfxTextRun objects to avoid separate allocation for CompressedGlyph records. (19.79 KB, patch)
2011-12-06 07:26 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 6 - remove pango-specific todo()s in test_backspace_delete, now that it passes on all platforms. (10.44 KB, patch)
2011-12-06 07:27 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 7 - fix fragile reftests that depend on metrics of fallback fonts used for invisible chars. (3.66 KB, patch)
2011-12-06 07:27 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 4, v2 - add timed expiration of cached gfxShapedWord records (13.90 KB, patch)
2011-12-08 02:35 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.1 v2 - implement gfxShapedWord caching for gfxFont instances. (150.82 KB, patch)
2012-01-04 00:02 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.2 v2 - adapt Mac font code to work with gfxShapedWord caches. (24.08 KB, patch)
2012-01-04 00:03 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.3 v2 - adapt Windows font code to work with gfxShapedWord caches. (52.93 KB, patch)
2012-01-04 00:04 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.4 v2 - adapt Linux/Pango font code to work with gfxShapedWord caches. (36.18 KB, patch)
2012-01-04 00:05 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 2.5 v2 - adapt Android/FT2 font code to work with gfxShapedWord caches. (8.24 KB, patch)
2012-01-04 00:06 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 3 v2 - remove copy of original characters from gfxTextRun. (62.07 KB, patch)
2012-01-04 00:07 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Review
pt 4 v3 - add timed expiration of cached gfxShapedWord records (13.76 KB, patch)
2012-01-04 00:08 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 5 v2 - optimize allocation of gfxTextRun objects to avoid separate allocation for CompressedGlyph records. (19.80 KB, patch)
2012-01-04 00:09 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.5 v3 - adapt Android/FT2 font code to work with gfxShapedWord caches. (10.45 KB, patch)
2012-01-04 23:37 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.4 v3 - adapt Linux/Pango font code to work with gfxShapedWord caches. (33.74 KB, patch)
2012-01-04 23:40 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 2.4.1 - make gfxPangoFontGroup font-matching behavior more similar to generic gfxFontGroup version. (2.77 KB, patch)
2012-01-04 23:51 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review
pt 3 v3 - remove copy of original characters from gfxTextRun. (63.71 KB, patch)
2012-01-05 02:48 PST, Jonathan Kew (:jfkthame)
roc: review+
Details | Diff | Review

Description Jonathan Kew (:jfkthame) 2011-11-16 14:28:24 PST
(Not sure whether to file this under Layout:Text or Graphics; it seems to straddle the boundary...)

The existing textrun word cache implementation has a number of problems:

- Cache entries refer to a range within a textrun, which means the glyph data remains owned by a particular textrun, and the lifetime of the cache entry is tied to the lifetime of the textrun where the word was first encountered.

- Destroying a textrun requires iterating over its text to find all the words and check whether there are cache entries that refer to them; if so, those entries need to be removed. This makes textrun destruction unnecessarily expensive.

- The presence of a single character outside the 8-bit range when a textrun is constructed means that every word in that run will be handled as 16-bit text, even if most of them could be handled as 8-bit.

- The "same" word in an 8-bit and a 16-bit textrun is shaped and cached separately.

- Textruns have to keep a copy of the original text for the cache entries to refer to, even though for drawing and measurement all we need is the shaped glyph data.

I'm working on a set of patches to replace this global cache with a scheme where each gfxFont instance will own the text and glyph data of the words that it has shaped, so that shaped-word lifetime can be decoupled from the lifetime of the textrun where the word happened to be seen first. This will allow us to manage the expiration of cached words better.

This will also unify the caching of words from 8-bit and 16-bit textruns on a per-word basis, so that we aren't forced to cache lots of ASCII-only words in a 16-bit form (in addition to an 8-bit form) just because a single non-ASCII character such as a dash or curly quote happened to occur somewhere in the text.

I believe this will also make it possible to avoid copying the original text into (most) textruns, which should be a perf and memory win.

Once the ownership of shaped-word data is moved from textruns to the gfxFont instances, it may be possible to change the structure of textruns so that we don't copy glyph data into them at all, but instead hold an array of references to shaped words. This could be a further memory win except for extreme edge cases such as a textrun containing a series of single-character words, but it is not entirely clear whether it can be done without hurting performance.
Comment 1 Jonathan Kew (:jfkthame) 2011-12-06 07:23:56 PST
Created attachment 579285 [details] [diff] [review]
pt 1 - eliminate gfxTextRunWordCache and gfxTextRunCache.
Comment 2 Jonathan Kew (:jfkthame) 2011-12-06 07:24:31 PST
Created attachment 579286 [details] [diff] [review]
pt 2.1 - implement gfxShapedWord caching for gfxFont instances.
Comment 3 Jonathan Kew (:jfkthame) 2011-12-06 07:24:55 PST
Created attachment 579287 [details] [diff] [review]
pt 2.2 - adapt Mac font code to work with gfxShapedWord caches.
Comment 4 Jonathan Kew (:jfkthame) 2011-12-06 07:25:14 PST
Created attachment 579288 [details] [diff] [review]
pt 2.3 - adapt Windows font code to work with gfxShapedWord caches.
Comment 5 Jonathan Kew (:jfkthame) 2011-12-06 07:25:35 PST
Created attachment 579289 [details] [diff] [review]
pt 2.4 - adapt Linux/Pango font code to work with gfxShapedWord caches.
Comment 6 Jonathan Kew (:jfkthame) 2011-12-06 07:25:50 PST
Created attachment 579290 [details] [diff] [review]
pt 2.5 - adapt Android/FT2 font code to work with gfxShapedWord caches.
Comment 7 Jonathan Kew (:jfkthame) 2011-12-06 07:26:08 PST
Created attachment 579291 [details] [diff] [review]
pt 3 - remove copy of original characters from gfxTextRun.
Comment 8 Jonathan Kew (:jfkthame) 2011-12-06 07:26:30 PST
Created attachment 579292 [details] [diff] [review]
pt 4 - add timed expiration of cached gfxShapedWord records.
Comment 9 Jonathan Kew (:jfkthame) 2011-12-06 07:26:49 PST
Created attachment 579293 [details] [diff] [review]
pt 5 - optimize allocation of gfxTextRun objects to avoid separate allocation for CompressedGlyph records.
Comment 10 Jonathan Kew (:jfkthame) 2011-12-06 07:27:09 PST
Created attachment 579294 [details] [diff] [review]
pt 6 - remove pango-specific todo()s in test_backspace_delete, now that it passes on all platforms.
Comment 11 Jonathan Kew (:jfkthame) 2011-12-06 07:27:25 PST
Created attachment 579295 [details] [diff] [review]
pt 7 - fix fragile reftests that depend on metrics of fallback fonts used for invisible chars.
Comment 12 Jonathan Kew (:jfkthame) 2011-12-06 08:00:36 PST
The patches above implement the revised caching scheme for shaped glyph data, as outlined in comment #0. Tryserver results suggest that on some platforms, we may see a Tp improvement of up to about 2%, while on others the effect is insignificant compared to the existing noise on the measurements.

In real-world browsing, we should see an improvement in the cache hit rate (as we won't throw away cached words that are still being frequently used just because the original textrun where they were created has been discarded). We may want to add telemetry that reports the cache statistics, to help us fine-tune the gfxShapedWord and gfxFont cache expiration parameters; I propose to file a followup on this.

Regarding the individual patches:

1 - Removes the existing caching scheme in preparation for the new implementation. Users of gfxTextRunCache and gfxTextRunWordCache are revised to call gfxFontGroup::MakeTextRun directly, as the new caches will operate below this level. This patch alone would be expected to regress performance, but it simplifies/cleans up the code in readiness for the new cache.

2.1-2.5 - Implement caching within each gfxFont instance, based on "words" (space-separated runs of characters) found when constructing text-runs. Note that all the 2.x parts _must_ land together, as the API of the platform-specific shaper classes is changed so that the tree won't even build until those backends are updated. The patch is split into platform-specific pieces to provide more manageable chunks for review. Part 2.1 (the generic gfxFont support for word-caching, and gfxHarfBuzzShaper adaptation) seems bigger than it really is because a bunch of code moves from gfxTextRun to the new gfxShapedWord class, but is essentially unchanged.

3 - The text-run no longer needs to keep a copy of its original text, as we no longer cache "words" by pointing at segments of a text-run. However, some layout code does need to know where specific characters (space, tab, newline) occurred, so this is recorded in the CompressedGlyph record using new flag bits.

4 - The word caches will automatically disappear when their owning gfxFont instances are deleted, but in the case of a long-lived gfxFont, the cache might grow quite large. So this adds timed expiration of cached words that have not been recently used. (Once we have telemetry to monitor cache behavior, we can try tuning the expiration time both here and for gfxFontCache.)

5 - Optimize text-run creation to allocate the gfxTextRun object and its array of CompressedGlyph records in a single operation; this is easier to do now that gfxTextRun has been simplified by removal of the original text.

6 - The new implementation avoids relying on Pango's clustering support, which avoids the problem discussed in bug 474068 and means we can remove the pango-version-specific todo() stuff in this test.

7 - Fix up a couple of reftests that are sensitive to the metrics of fonts chosen during fallback for zero-width invisible characters like ZWNJ.
Comment 13 Jonathan Kew (:jfkthame) 2011-12-06 08:54:09 PST
(In reply to Jonathan Kew (:jfkthame) from comment #12)
> 6 - The new implementation avoids relying on Pango's clustering support,
> which avoids the problem discussed in bug 474068 and means we can remove the
> pango-version-specific todo() stuff in this test.

As a side benefit, this results in the correct rendering (as per attachment 495460 [details], mentioned in bug 474068 comment 66) of :first-letter with Thai examples such as

data:text/html;charset=utf-8,<style>:first-letter{font-size:4em;}</style>เมื่อ

which currently (because of the Pango cluster issue) applies :first-letter to too many characters on Linux.
Comment 14 Jonathan Kew (:jfkthame) 2011-12-06 09:07:55 PST
Filed bug 707959 as a followup on adding telemetry to help us tune the caching behavior.
Comment 15 Jeff Muizelaar [:jrmuizel] 2011-12-07 12:48:53 PST
I'm not going to have time to review all of these, so I suggest getting other reviewers for the details. jdagget and bas are good candidates as they have touched more of the shaping code then I ever have.

I have however looked over the code to get a general idea of what's going on. I really like the direction that it's going except for cache expiry. I don't really like the idea of each gfxFont having it's own timer. Further, I feel that it would be better to have cache size/expiry managed globally instead of for each gfxFont. 

In the recent past we used to have a timer per image to do expiry and that was bad. I know that this isn't as extreme as that, but I'd like to avoid getting close.

What do you think?

Overall though, the new code seems much easier to understand! Thanks for doing it.
Comment 16 Jonathan Kew (:jfkthame) 2011-12-07 13:09:05 PST
(In reply to Jeff Muizelaar [:jrmuizel] from comment #15)
> don't really like the idea of each gfxFont having it's own timer. Further, I
> feel that it would be better to have cache size/expiry managed globally
> instead of for each gfxFont. 

I've gone back and forth on that while working on this, and tried both approaches (with no clear indication of any difference in performance one way or the other). The per-font timers seemed like the "cleanest" design to me, but a single global timer for aging the cached-word entries is marginally less memory overhead. So given your feeling, I'll post an alternative patch taking that approach; I don't really have a preference either way.
Comment 17 Jet Villegas (:jet) 2011-12-07 17:03:23 PST
Adding [Snappy] search string. This fix aims to improve actual (and hopefully, perceived) layout and rendering performance on pages with large blocks of text (eg. wikipedia.)
Comment 18 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-12-08 01:22:33 PST
I wrote a lot of this code and I'm happy to review some or all of these patches.
Comment 19 Jonathan Kew (:jfkthame) 2011-12-08 02:35:12 PST
Created attachment 579988 [details] [diff] [review]
pt 4, v2 - add timed expiration of cached gfxShapedWord records

Alternate implementation of cached-word expiration, using a single global timer instead of per-font timers to age the words.
Comment 20 Jonathan Kew (:jfkthame) 2011-12-08 04:06:24 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #18)
> I wrote a lot of this code and I'm happy to review some or all of these
> patches.

You're on!

It's not really quite as much as it looks - there are a bunch of semi-mechanical changes, basically moving functionality from gfxTextRun to gfxShapedWord and adapting the various gfxFontShaper subclasses to implement ShapeWord() instead of InitTextRun(). So there's not nearly as much actual new code or logic as the quantity of patch might imply. :)
Comment 21 John Daggett (:jtd) 2011-12-08 04:55:36 PST
It would be nice to have instrument the code in gfxFont::GetShapedWord to record when hits/misses occur, then record telemetry data on that (miss ratio per text run?).
Comment 22 Jonathan Kew (:jfkthame) 2011-12-08 05:20:18 PST
(In reply to John Daggett (:jtd) from comment #21)
> It would be nice to have instrument the code in gfxFont::GetShapedWord to
> record when hits/misses occur, then record telemetry data on that (miss
> ratio per text run?).

I already filed bug 707959 as a followup for this.
Comment 23 John Daggett (:jtd) 2011-12-08 05:43:43 PST
Another thing to consider is skipping the word cache altogether for scripts like CJK where I doubt word caching is buying us anything, since words are not generally delineated by spaces in CJK text and the miss ratio is probably very high.
Comment 24 Jonathan Kew (:jfkthame) 2011-12-08 06:46:35 PST
(In reply to John Daggett (:jtd) from comment #23)
> Another thing to consider is skipping the word cache altogether for scripts
> like CJK where I doubt word caching is buying us anything, since words are
> not generally delineated by spaces in CJK text and the miss ratio is
> probably very high.

That's an interesting idea - might well be a win. (Hmm, that should be "CJ", not "CJK", as Korean _does_ use word spaces. OTOH, things like Thai, Lao and Khmer would likely fall into the not-worth-caching category.)

Another possible tweak that might have a similar overall effect would be to skip the cache for "words" above a certain (to-be-determined) length threshold, as excessively long "words" are unlikely to be used repeatedly.

In both cases, though, we should check how often we end up reconstructing textruns more than once for the same document text. If this happens frequently, then the cache could be giving us some benefit even if it ends up caching each complete CJK paragraph separately.

Maybe we could get telemetry to report cache miss rates on a per-script basis, to see if the word cache is clearly not proving useful for certain scripts?
Comment 25 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-12-08 14:55:44 PST
(In reply to Jonathan Kew (:jfkthame) from comment #24)
> That's an interesting idea - might well be a win. (Hmm, that should be "CJ",
> not "CJK", as Korean _does_ use word spaces. OTOH, things like Thai, Lao and
> Khmer would likely fall into the not-worth-caching category.)
> 
> Another possible tweak that might have a similar overall effect would be to
> skip the cache for "words" above a certain (to-be-determined) length
> threshold, as excessively long "words" are unlikely to be used repeatedly.

This would require careful measurement.

Web pages often aren't prose. The same terms, even whole sentences, may be repeated in UI elements or headings for example. We also benefit from the word cache when navigating between pages that share a lot of the same text content.

> In both cases, though, we should check how often we end up reconstructing
> textruns more than once for the same document text. If this happens
> frequently, then the cache could be giving us some benefit even if it ends
> up caching each complete CJK paragraph separately.

That too.
Comment 26 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-12-15 15:43:22 PST
Comment on attachment 579286 [details] [diff] [review]
pt 2.1 - implement gfxShapedWord caching for gfxFont instances.

Review of attachment 579286 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFont.cpp
@@ +1543,5 @@
> +
> +    entry = mWordCache.PutEntry(key);
> +    if (!entry) {
> +        NS_WARNING("failed to create cache entry for gfxShapedWord - expect missing text");
> +        return nsnull;

I think these are infallible now, so no need to check results here or in our callers.

@@ +1565,5 @@
> +        const PRUint8* src = (const PRUint8*)aText;
> +        PRUnichar *dest = utf16.BeginWriting();
> +        while (dest < utf16.EndWriting()) {
> +            *dest++ = *src++;
> +        }

Use LossyConvertEncoding8to16 or AppendASCIItoUTF16.

@@ +1646,5 @@
> +
> +    for (PRUint32 i = 0; i <= aRunLength; ++i) {
> +        T ch = i < aRunLength ? text[i] : '\n';
> +        bool boundary = IsBoundarySpace(ch);
> +        bool invalid = boundary ? false : gfxFontGroup::IsInvalidChar(ch);

!boundary && gfxFontGroup::IsInvalidChar(ch)

Also I think we should declar ch, boundary and invalid and then do an "if (i < aRunLength) ... else { ch = '\n'; boundary = false; invalid = true; }"

@@ +1670,5 @@
> +                        PRUint8 cat =
> +                            gfxUnicodeProperties::GetGeneralCategory(ch);
> +                        if (cat < HB_CATEGORY_COMBINING_MARK ||
> +                            cat > HB_CATEGORY_NON_SPACING_MARK)
> +                        {

{ on previous line

@@ +1696,5 @@
> +            continue;
> +        }
> +
> +        // We've decided to break here (i.e. we're at the end of a "word",
> +        // of the word is becoming excessively long): shape the word and

"or the word"

@@ +1703,5 @@
> +            gfxShapedWord *sw = nsnull;
> +            if (sizeof(T) == sizeof(PRUnichar) && wordIs8Bit) {
> +                nsAutoTArray<PRUint8,256> bytes;
> +                PRUint8 *bp = bytes.AppendElements(length);
> +                if (bp) {

This is an infallible array (default) so you don't need to check the result.

Converting to 8-bit text here seems undesirable since on a cache hit, the 16-bit version of GetShapedWord would have done just as well, and on a cache miss, the 8-bit version of GetShapedWord has to reconvert the text to 16-bit for shaping.

I guess you're doing it this way because the 8-bit flag is part of the cache key, and we want 8-bit words in 16-bit strings to hit 8-bit words from 8-bit strings. Can't we do that without converting the text? Pass the wordIs8Bit flag into GetShapedWord and then into CacheHashKey, and when it compares the strings, use an 8-to-16 comparison path if necessary?

@@ +1730,5 @@
> +
> +        if (boundary) {
> +            // word was terminated by a space: add that to the textrun
> +            if (!aTextRun->SetSpaceGlyphIfSimple(this, aContext,
> +                                                 aRunStart + i))

Better add a test font for non-simple space glyphs, since otherwise I bet the following code would never be tested.

@@ +2344,5 @@
>  
>  bool 
> +gfxFontGroup::IsInvalidChar(PRUint8 ch)
> +{
> +    return ((ch & 0x7f) < 0x20);

What's the reasoning behind changing IsInvalidChar? I thought it was better to be conservative, because there might be fonts that render glyphs for some of the control characters.

@@ +2764,5 @@
> +
> +    if (sizeof(T) == sizeof(PRUnichar) && aLength > 0) {
> +        gfxTextRun::CompressedGlyph *glyph = aTextRun->GetCharacterGlyphs();
> +        if (!glyph->IsSimpleGlyph()) {
> +            glyph->SetClusterStart(true);

Why is this needed? Can't we get rid of it?

::: gfx/thebes/gfxFont.h
@@ +2411,5 @@
>                     const DetailedGlyph *aGlyphs);
>      void SetMissingGlyph(PRUint32 aCharIndex, PRUint32 aUnicodeChar);
>      void SetSpaceGlyph(gfxFont *aFont, gfxContext *aContext, PRUint32 aCharIndex);
>  
> +    bool SetSpaceGlyphIfSimple(gfxFont *aFont, gfxContext *aContext,

Document this!

::: gfx/thebes/gfxPlatform.h
@@ +70,5 @@
>  class gfxTextRun;
>  class nsIURI;
>  class nsIAtom;
>  
> +class CompressedGlyph;

Why add this here? It doesn't seem to be used.
Comment 27 Jonathan Kew (:jfkthame) 2012-01-04 00:02:27 PST
Created attachment 585663 [details] [diff] [review]
pt 2.1 v2 - implement gfxShapedWord caching for gfxFont instances.
Comment 28 Jonathan Kew (:jfkthame) 2012-01-04 00:03:33 PST
Created attachment 585664 [details] [diff] [review]
pt 2.2 v2 - adapt Mac font code to work with gfxShapedWord caches.
Comment 29 Jonathan Kew (:jfkthame) 2012-01-04 00:04:18 PST
Created attachment 585665 [details] [diff] [review]
pt 2.3 v2 - adapt Windows font code to work with gfxShapedWord caches.
Comment 30 Jonathan Kew (:jfkthame) 2012-01-04 00:05:36 PST
Created attachment 585667 [details] [diff] [review]
pt 2.4 v2 - adapt Linux/Pango font code to work with gfxShapedWord caches.
Comment 31 Jonathan Kew (:jfkthame) 2012-01-04 00:06:20 PST
Created attachment 585668 [details] [diff] [review]
pt 2.5 v2 - adapt Android/FT2 font code to work with gfxShapedWord caches.
Comment 32 Jonathan Kew (:jfkthame) 2012-01-04 00:07:07 PST
Created attachment 585669 [details] [diff] [review]
pt 3 v2 - remove copy of original characters from gfxTextRun.
Comment 33 Jonathan Kew (:jfkthame) 2012-01-04 00:08:06 PST
Created attachment 585670 [details] [diff] [review]
pt 4 v3 - add timed expiration of cached gfxShapedWord records
Comment 34 Jonathan Kew (:jfkthame) 2012-01-04 00:09:17 PST
Created attachment 585671 [details] [diff] [review]
pt 5 v2 - optimize allocation of gfxTextRun objects to avoid separate allocation for CompressedGlyph records.
Comment 35 Jonathan Kew (:jfkthame) 2012-01-04 00:28:08 PST
Updated pt 2.1 as per comment #26, and further updated this and the remaining patches to adjust for the landing of Graphite and other changes on trunk.

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #26)
> Comment on attachment 579286 [details] [diff] [review]
> pt 2.1 - implement gfxShapedWord caching for gfxFont instances.

> @@ +1730,5 @@
> > +
> > +        if (boundary) {
> > +            // word was terminated by a space: add that to the textrun
> > +            if (!aTextRun->SetSpaceGlyphIfSimple(this, aContext,
> > +                                                 aRunStart + i))
> 
> Better add a test font for non-simple space glyphs, since otherwise I bet
> the following code would never be tested.

Actually, it is tested already by existing crashtests: we have some tests that use huge font sizes, and in this case the advance of the space glyph will not fit within the available number of bits of a "simple" glyph record, and so SetSpaceGlyphIfSimple bails and we use the fallback path.

> @@ +2344,5 @@
> >  
> >  bool 
> > +gfxFontGroup::IsInvalidChar(PRUint8 ch)
> > +{
> > +    return ((ch & 0x7f) < 0x20);
> 
> What's the reasoning behind changing IsInvalidChar?

Simplicity/efficiency, reducing the number of separate comparison operations. This is a hot code path, so it seemed worth optimizing somewhat.

> I thought it was better
> to be conservative, because there might be fonts that render glyphs for some
> of the control characters.

There's no reason to render glyphs for these, given that (as per Unicode spec) they're not printable graphic characters. It's conceivable that fonts using private, non-Unicode encodings could put visible glyphs at these codepoints (and in fact that used to be done, in the old pre-Unicode world of "hacked 8-bit Indic" fonts, etc), but we simply don't support such fonts - hence some of the "bug reports" we used to get re Indian sites that required "custom" fonts - but this practice seems to be dying anyway.

If we ever wanted to implement a "show invisibles" mode that renders something visible for the various control characters, we'd need to handle that differently anyhow, as we can't rely on normal fonts having such glyphs, and we already make assumptions about glyph rendering based on standard Unicode semantics - e.g. we render &#xA0; using the font's "space" glyph, rather than its "nonbreakingspace" glyph. This is just an extension of that optimization: we know what Unicode specifies about the rendering of these characters - they're non-printing - and so we can take a shortcut.

> @@ +2764,5 @@
> > +
> > +    if (sizeof(T) == sizeof(PRUnichar) && aLength > 0) {
> > +        gfxTextRun::CompressedGlyph *glyph = aTextRun->GetCharacterGlyphs();
> > +        if (!glyph->IsSimpleGlyph()) {
> > +            glyph->SetClusterStart(true);
> 
> Why is this needed? Can't we get rid of it?

I don't think so, because we expect the first character of a textrun to be marked as a cluster start; but that is not true for individual ShapedWords (it'll normally be true, of course, but there are edge cases where it isn't - usually when font fallback causes a font switch between a base letter and a diacritic).

> ::: gfx/thebes/gfxPlatform.h
> @@ +70,5 @@
> >  class gfxTextRun;
> >  class nsIURI;
> >  class nsIAtom;
> >  
> > +class CompressedGlyph;
> 
> Why add this here? It doesn't seem to be used.

Duh. Residue of older version.
Comment 36 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:19:24 PST
Comment on attachment 585663 [details] [diff] [review]
pt 2.1 v2 - implement gfxShapedWord caching for gfxFont instances.

Review of attachment 585663 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFont.cpp
@@ +1548,5 @@
> +    if (entry) {
> +        return entry->mShapedWord;
> +    }
> +
> +    entry = mWordCache.PutEntry(key);

Oops, one thing I didn't notice before. Just use PutEntry instead of GetEntry/PutEntry. If mShapedWord is nonnull we know it was a hit.
Comment 37 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:30:19 PST
Comment on attachment 585664 [details] [diff] [review]
pt 2.2 v2 - adapt Mac font code to work with gfxShapedWord caches.

Review of attachment 585664 [details] [diff] [review]:
-----------------------------------------------------------------

use mozilla::ArrayLength instead of NS_ARRAY_LENGTH.
Comment 38 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:40:15 PST
Comment on attachment 585665 [details] [diff] [review]
pt 2.3 v2 - adapt Windows font code to work with gfxShapedWord caches.

Review of attachment 585665 [details] [diff] [review]:
-----------------------------------------------------------------

Do we still need gfxGDIShaper for anything? Presumably Harfbuzz completely supercedes it? Why not get rid of gfxGDIShaper in a followup?

How do you know that the IDWriteTextAnalyzer is reusable this way?

::: gfx/thebes/gfxUniscribeShaper.cpp
@@ +322,5 @@
>                          gfxTextRun::DetailedGlyph *details = &detailedGlyphs[i];
>                          details->mGlyphID = mGlyphs[k + i];
> +                        details->mAdvance = mAdvances[k + i] * appUnitsPerDevUnit;
> +                        details->mXOffset = float(mOffsets[k + i].du) * appUnitsPerDevUnit *
> +                            (aShapedWord->IsRightToLeft() ? -1.0 : 1.0);

Probably worth adding GetDirection to gfxShapedWord and using it here.
Comment 39 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:48:00 PST
Comment on attachment 585667 [details] [diff] [review]
pt 2.4 v2 - adapt Linux/Pango font code to work with gfxShapedWord caches.

Review of attachment 585667 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxPangoFonts.cpp
@@ +2050,5 @@
>  {
> +    // if this character is a join-control or the previous is a join-causer,
> +    // use the same font as the previous range if we can
> +    if (gfxFontUtils::IsJoinControl(aCh) || gfxFontUtils::IsJoinCauser(aPrevCh)) {
> +        if (aPrevMatchedFont && aPrevMatchedFont->HasCharacter(aCh)) {

Seems like you're changing some behavior here. Would be better to separate that from the gfxShapedWord changes.
Comment 40 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:50:23 PST
Comment on attachment 585668 [details] [diff] [review]
pt 2.5 v2 - adapt Android/FT2 font code to work with gfxShapedWord caches.

Review of attachment 585668 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFT2Fonts.cpp
@@ +459,5 @@
>  
> +    if (IsSyntheticBold()) {
> +        float synBoldOffset =
> +            GetSyntheticBoldOffset();//FIXME * CalcXScale(aContext);
> +        aShapedWord->AdjustAdvancesForSyntheticBold(synBoldOffset);

So how are we going to fix the FIXME?
Comment 41 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 02:55:30 PST
Comment on attachment 585669 [details] [diff] [review]
pt 3 v2 - remove copy of original characters from gfxTextRun.

Review of attachment 585669 [details] [diff] [review]:
-----------------------------------------------------------------

Can we avoid having to create detailedGlyphs for tabs and newlines by overloading the glyph field of a SimpleGlyph to store the character data? We know there aren't going to be glyphs for those characters.

::: gfx/thebes/gfxFont.cpp
@@ +1521,5 @@
> +    PRUint8 category = gfxUnicodeProperties::GetGeneralCategory(aUSV);
> +    return ((category >= HB_CATEGORY_COMBINING_MARK &&
> +             category <= HB_CATEGORY_NON_SPACING_MARK) ||
> +            (aUSV >= 0x200c && aUSV <= 0x200d) || // ZWJ, ZWNJ
> +            (aUSV >= 0xff9e && aUSV <= 0xff9f));

Comment this range
Comment 42 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 03:00:19 PST
In particular we could use a bit to represent "is this a control or space character" instead of "is this a space character", encode the character code in the glyph field, and if we want to support rendering non-blank space glyphs we can check for a space in gfxFont::Draw and use the font's cached space glyph if necessary.
Comment 43 Jonathan Kew (:jfkthame) 2012-01-04 03:23:39 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #38)
> Comment on attachment 585665 [details] [diff] [review]
> pt 2.3 v2 - adapt Windows font code to work with gfxShapedWord caches.

> Do we still need gfxGDIShaper for anything? Presumably Harfbuzz completely
> supercedes it? Why not get rid of gfxGDIShaper in a followup?

At present, we don't attempt to use harfbuzz for fonts that are not sfnt-based (truetype/opentype) - i.e. legacy formats such as .fon bitmaps or .pfb type1 fonts - we just send them through the old platform shaping code. We might need to do some work on the callbacks harfbuzz uses to access font data, to ensure they work properly with non-sfnt fonts.

So I don't think we're in a position to do this right now, but could be worth doing as a followup.

> How do you know that the IDWriteTextAnalyzer is reusable this way?

Because it works? :) No, seriously, it's OK because this is just a stateless interface object, it just exists to provide access to the various methods - in effect, it's like a namespace for a bunch of global functions that we need to access. It doesn't have instance-specific state that would mean we need separate instances for each use. "State" is maintained by the client in the various structures and arrays that are passed to and from the IDWriteTextAnalyzer methods.
Comment 44 Jonathan Kew (:jfkthame) 2012-01-04 03:35:32 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #39)
> Comment on attachment 585667 [details] [diff] [review]
> pt 2.4 v2 - adapt Linux/Pango font code to work with gfxShapedWord caches.

> ::: gfx/thebes/gfxPangoFonts.cpp
> @@ +2050,5 @@
> >  {
> > +    // if this character is a join-control or the previous is a join-causer,
> > +    // use the same font as the previous range if we can
> > +    if (gfxFontUtils::IsJoinControl(aCh) || gfxFontUtils::IsJoinCauser(aPrevCh)) {
> > +        if (aPrevMatchedFont && aPrevMatchedFont->HasCharacter(aCh)) {
> 
> Seems like you're changing some behavior here. Would be better to separate
> that from the gfxShapedWord changes.

The main change this introduces is that format-controls which aren't supported in a particular font will now fall back to a font that does support them - which is the same thing we do elsewhere. (I.e., it's changing to bring it more into line with the code in gfxFont.cpp that we use on other platforms.)

I think I probably had a specific reason for doing this, but I can't recall offhand - I could try without it, and see whether something breaks.
Comment 45 Jonathan Kew (:jfkthame) 2012-01-04 03:40:40 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #40)
> Comment on attachment 585668 [details] [diff] [review]
> pt 2.5 v2 - adapt Android/FT2 font code to work with gfxShapedWord caches.

> ::: gfx/thebes/gfxFT2Fonts.cpp
> @@ +459,5 @@
> >  
> > +    if (IsSyntheticBold()) {
> > +        float synBoldOffset =
> > +            GetSyntheticBoldOffset();//FIXME * CalcXScale(aContext);
> > +        aShapedWord->AdjustAdvancesForSyntheticBold(synBoldOffset);
> 
> So how are we going to fix the FIXME?

Oops! (I'm surprised that didn't result in any test failures.... should try to create a test that covers it.)

The fix will be to make CalcXScale a method on gfxFont instead of just a static function in gfxFont.cpp, so that the gfxFT2Font subclass can also use it.
Comment 46 Jonathan Kew (:jfkthame) 2012-01-04 04:49:08 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #41)
> Comment on attachment 585669 [details] [diff] [review]
> pt 3 v2 - remove copy of original characters from gfxTextRun.

> Can we avoid having to create detailedGlyphs for tabs and newlines by
> overloading the glyph field of a SimpleGlyph to store the character data? We
> know there aren't going to be glyphs for those characters.

I don't think this is needed, and it would probably hurt drawing performance as it would make spaces more expensive to process - we'd no longer be able to treat them just like all the other simple glyphs in a typical textrun. Note that for tabs and newlines, the CompressedGlyph record will *not* normally be a simple glyph, it will be a blank missing glyph record, and so we won't need to "convert" a simple glyph to a DetailedGlyph, we'll just set the relevant flag within the CompressedGlyph, but its glyph count will remain zero and no DetailedGlyph will be allocated.

The code in gfxTextRun::SetIs{Tab,Newline} to convert a simple glyph record to a detailed glyph is there as a precaution, to avoid the risk that calling these methods could result in garbled records, but should rarely if ever be needed in practice.
Comment 47 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 12:28:17 PST
Comment on attachment 585665 [details] [diff] [review]
pt 2.3 v2 - adapt Windows font code to work with gfxShapedWord caches.

Review of attachment 585665 [details] [diff] [review]:
-----------------------------------------------------------------

> No, seriously, it's OK because this is just a stateless interface object

Sure, it looks like it, but I couldn't find any documentation to that effect and I was hoping you had.
Comment 48 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 13:09:49 PST
Comment on attachment 585669 [details] [diff] [review]
pt 3 v2 - remove copy of original characters from gfxTextRun.

Review of attachment 585669 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFont.h
@@ +1717,5 @@
>                      (FLAG_NOT_LIGATURE_GROUP_START | FLAG_NOT_MISSING);
>          }
>  
> +        bool CharIsSpace() const {
> +            return (mValue & FLAG_CHAR_IS_SPACE) != 0;

Document that this is a breakable/trimmable space (0x20) only.

@@ +2543,5 @@
>      // the font shaper and go through the shaped-word cache for most spaces.
>      //
> +    // The parameter aTrimmableSpace is set to true for "normal" space
> +    // characters, false if this was a no-break space (which should not
> +    // be trimmed if it falls at a run end).

Boolean parmaeters suck. Here you could just pass the PRUnichar character code, which would be a little clearer.

::: layout/generic/nsTextFrameThebes.cpp
@@ +849,1 @@
>                                   bool aSuppressSink);

Let's make these bools into a flags word.

@@ +1372,5 @@
> +      if (bufferSize < mMaxTextLength || bufferSize == PR_UINT32_MAX ||
> +          !buffer.AppendElements(bufferSize)) {
> +        return;
> +      }
> +      SetupLineBreakerContext(textRun, buffer.Elements());

Why is the buffer a parameter to SetupLineBreakerContext? Can't SetupLineBreakerContext set up the buffer itself?

@@ +2018,5 @@
> +// So it does the same walk over the mMappedFlows, but doesn't actually
> +// build a new textrun.
> +void
> +BuildTextRunsScanner::SetupLineBreakerContext(gfxTextRun *aTextRun,
> +                                              void *aTextBuffer)

Hmm ... if the textrun contained a list of shaped words, couldn't we get the text from the words? Might be simpler and faster than this.

If we end up not copying glyph data into the textrun and reference it via words instead, then we'd be getting the text via those words as well.

Having an intermediate state where we copy the glyph data and keep a list of the words seems OK since it should still be an improvement --- or at least not much worse than --- keeping the glyph data and the text in most cases.
Comment 49 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 13:10:02 PST
(In reply to Jonathan Kew (:jfkthame) from comment #44)
> The main change this introduces is that format-controls which aren't
> supported in a particular font will now fall back to a font that does
> support them - which is the same thing we do elsewhere. (I.e., it's changing
> to bring it more into line with the code in gfxFont.cpp that we use on other
> platforms.)

Even if we take this change in this bug, it would still be good to have it in a separate patch.

(In reply to Jonathan Kew (:jfkthame) from comment #45)
> The fix will be to make CalcXScale a method on gfxFont instead of just a
> static function in gfxFont.cpp, so that the gfxFT2Font subclass can also use
> it.

So you're going to rework that patch to do that, right?

(In reply to Jonathan Kew (:jfkthame) from comment #46)
> Note that for tabs and newlines, the CompressedGlyph record will *not*
> normally be a simple glyph, it will be a blank missing glyph record, and so
> we won't need to "convert" a simple glyph to a DetailedGlyph, we'll just set
> the relevant flag within the CompressedGlyph, but its glyph count will
> remain zero and no DetailedGlyph will be allocated.

I didn't think of that. Great.
Comment 50 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 13:15:33 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #48)
> Hmm ... if the textrun contained a list of shaped words, couldn't we get the
> text from the words? Might be simpler and faster than this.

Oh, you can't do that because you're expiring the words...
Comment 51 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 13:17:47 PST
Comment on attachment 585670 [details] [diff] [review]
pt 4 v3 - add timed expiration of cached gfxShapedWord records

Review of attachment 585670 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxFont.cpp
@@ +1147,5 @@
> +    if (!aEntry->mShapedWord) {
> +        NS_ASSERTION(aEntry->mShapedWord, "cache entry has no gfxShapedWord!");
> +        return PL_DHASH_REMOVE;
> +    }
> +    if (aEntry->mShapedWord->IncrementAge() == 3) {

Make this a constant somewhere instead of a magic number.
Comment 52 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-04 13:18:30 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #50)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #48)
> > Hmm ... if the textrun contained a list of shaped words, couldn't we get the
> > text from the words? Might be simpler and faster than this.
> 
> Oh, you can't do that because you're expiring the words...

Although refcounting the shaped words would be no problem.
Comment 53 Jonathan Kew (:jfkthame) 2012-01-04 23:37:14 PST
Created attachment 585988 [details] [diff] [review]
pt 2.5 v3 - adapt Android/FT2 font code to work with gfxShapedWord caches.

This addresses the FIXME that I'd forgotten earlier.
Comment 54 Jonathan Kew (:jfkthame) 2012-01-04 23:40:54 PST
Created attachment 585989 [details] [diff] [review]
pt 2.4 v3 - adapt Linux/Pango font code to work with gfxShapedWord caches.

Split the gfxPangoFontGroup::FindFontForChar modification out of this patch as requested.
Comment 55 Jonathan Kew (:jfkthame) 2012-01-04 23:51:36 PST
Created attachment 585991 [details] [diff] [review]
pt 2.4.1 - make gfxPangoFontGroup font-matching behavior more similar to generic gfxFontGroup version.

This modifies the initial part of gfxPangoFontGroup::FindFontForChar, where the code is aiming to propagate a previously-matched font to the current character in certain cases, to mimic the implementation in gfxFontGroup as used on other platforms.

The significant behavior change this introduces, which is needed in this bug, is that space characters will no longer inherit the font of a preceding character that used fallback, but will revert to the font-group's primary font instead. The old code allowed fallback to extend across spaces, and had a comment to the effect that this didn't matter as the font chosen for the space would be ignored anyway, but that is no longer true in the gfxShapedWord-based version of textrun construction where the intervening word spaces are not included in the shaping process but handled separately.

Specifically, without this change we get a handful of reftest failures in cases where we have <pre> or monospaced text that contains mixed English and Hebrew, and the space characters of the English (primary) font and Hebrew (fallback) fonts chosen have different widths.
Comment 56 Jonathan Kew (:jfkthame) 2012-01-05 02:48:17 PST
Created attachment 586018 [details] [diff] [review]
pt 3 v3 - remove copy of original characters from gfxTextRun.
Comment 57 Jonathan Kew (:jfkthame) 2012-01-05 03:02:51 PST
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #48)
> Comment on attachment 585669 [details] [diff] [review]
> pt 3 v2 - remove copy of original characters from gfxTextRun.

> Boolean parmaeters suck. Here you could just pass the PRUnichar character
> code, which would be a little clearer.

OK, makes sense.

> ::: layout/generic/nsTextFrameThebes.cpp
> @@ +849,1 @@
> >                                   bool aSuppressSink);
> 
> Let's make these bools into a flags word.

Done.

> @@ +1372,5 @@
> > +      if (bufferSize < mMaxTextLength || bufferSize == PR_UINT32_MAX ||
> > +          !buffer.AppendElements(bufferSize)) {
> > +        return;
> > +      }
> > +      SetupLineBreakerContext(textRun, buffer.Elements());
> 
> Why is the buffer a parameter to SetupLineBreakerContext? Can't
> SetupLineBreakerContext set up the buffer itself?

Yes, that's reasonable - I was just copying the pattern used with BuildTextRunForFrames, but there's no reason for that.

Actually, I think we should do a followup to review the existing buffer allocations in this code, and switch to fallible arrays in some cases where we're allocating space for a potentially huge string of text, and the code include checks for failure - it was clearly written assuming fallible arrays, but we've since changed the default behavior. The buffer passed to BuildTextRunForFrames, for example, should be allocated fallibly, as should the temporary buffer used when we need to "expand" 8- to 16-bit text. Basically, wherever we're doing "nsAutoTArray<T,BIG_TEXT_NODE_SIZE>", we probably want FallibleAutoTArray.

> @@ +2018,5 @@
> > +// So it does the same walk over the mMappedFlows, but doesn't actually
> > +// build a new textrun.
> > +void
> > +BuildTextRunsScanner::SetupLineBreakerContext(gfxTextRun *aTextRun,
> > +                                              void *aTextBuffer)
> 
> Hmm ... if the textrun contained a list of shaped words, couldn't we get the
> text from the words? Might be simpler and faster than this.

We could, though it probably wouldn't be significantly faster, as we'll be needing to work with a mapping between textrun offsets and the corresponding words.

I'd prefer not to start on that within this bug, however. The intention is to try restructuring gfxTextRun to have a list of references to shaped words instead of its current mCharacterGlyphs array, but I want to do that in a separate bug, as I'm not confident of how performance will work out - iterating over the glyphs (e.g. for drawing) will become slightly more complex than with the existing flat array of CompressedGlyphs in the textrun itself. OTOH, that model should allow us to optimize construction better (no copying of glyph data), and may be friendlier towards off-main-thread shaping. But it seems sufficiently risky/unknown that it should be clearly separate, and I don't want to start adding a list of shaped words to the textrun at this stage, as by itself that would be a net loss (more memory, effort to maintain the offset-to-word mapping, refcount the words, etc).

If that followup does work out well, we can then drop the extra character-identifying flags that are being stored in the CompressedGlyph here, as we'll have access to the actual character codes again. But in the meantime I think this is the simplest and safest way to maintain the data we need.
Comment 59 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-05 04:25:55 PST
I had an idea about storing lists of words in textruns which I'll write down here for posterity... Something like this:

Store three arrays in each textrun:
1) The "word list array", an array of pairs, one for each word that occurs (nsRefPtr<gfxShapedWord> word, PRUint32 character offset of the start of the word in the textrun)
2) The "base word index array", an array of (#characters + 127)/128 PRUint32 indexes into the word list array. The value at index i is the index of the first word that includes a character with offset >= i*128.
3) The "word index array", an array of #characters byte. The value at index i is either some magic value (e.g. 0xFF) to indicate character i is a space, or another value which is added to the "base word index array"[i/128] to yield the index into the "word list array" for the word containing character i.

This is a) compact (slightly over 1 byte per character when words are very long, 9 bytes per character for the pessimal case of 1-character words) and b) fairly efficient (finding the word containing a given character, and the offset of that character within the word, is three array lookups and a conditional branch on whether it's a space).
Comment 60 Jonathan Kew (:jfkthame) 2012-01-05 04:41:15 PST
Filed followup bug 715473 on re-working the structure of textruns, and bug 715471 on using fallible allocation where appropriate in the textframe code.
Comment 61 Jonathan Kew (:jfkthame) 2012-01-05 06:48:15 PST
Pushed a followup to fix an intermittent crash (see https://tbpl.mozilla.org/php/getParsedLog.php?id=8345059&tree=Mozilla-Inbound) because the text passed to gfxFont::GetShapedWord() is not (usually) null-terminated, so we need to wrap it in an nsDependentCString when passing it to AppendASCIItoUTF16, to avoid reading beyond the end of what's valid:

https://hg.mozilla.org/integration/mozilla-inbound/rev/26d7324c8d37
Comment 62 Jeff Muizelaar [:jrmuizel] 2012-01-05 22:10:53 PST
This seems to have caused a large performance regression of svg opacity.

Regression :( SVG, Opacity increase 79.4% on Linux x64 Mozilla-Inbound
----------------------------------------------------------------------
   Previous: avg 36.400 stddev 1.094 of 30 runs up to revision 3c970a5c173c
   New     : avg 65.300 stddev 0.274 of 5 runs since revision c0b62edd2917
   Change  : +28.900 (79.4% / z=26.420)
   Graph   : http://mzl.la/wd9hQZ
Comment 63 Jonathan Kew (:jfkthame) 2012-01-06 01:34:27 PST
Which makes no apparent sense, as Tsvg-opacity doesn't even involve text, AFAICS.

Also strange that it should affect Linux only.

I'm beginning some investigation locally to try and understand what's going on here, but will be away from my main Linux machine over the weekend so it may be a few days before we have answers to this.
Comment 65 Jeff Muizelaar [:jrmuizel] 2012-01-06 13:20:40 PST
Clang warns about this expression:
        if (!aFinish && mNumGlyphs < GLYPH_BUFFER_SIZE || !mNumGlyphs) {
where it's not clear if it should be:
        if ((!aFinish && mNumGlyphs < GLYPH_BUFFER_SIZE) || !mNumGlyphs) {
or:
        if (!aFinish && (mNumGlyphs < GLYPH_BUFFER_SIZE || !mNumGlyphs)) {

can you add the parenthesis to make it clear which one you mean.
Comment 66 Jonathan Kew (:jfkthame) 2012-01-09 11:54:08 PST
(In reply to Jeff Muizelaar [:jrmuizel] from comment #65)
> Clang warns about this expression:
>         if (!aFinish && mNumGlyphs < GLYPH_BUFFER_SIZE || !mNumGlyphs) {
> where it's not clear if it should be:
>         if ((!aFinish && mNumGlyphs < GLYPH_BUFFER_SIZE) || !mNumGlyphs) {
> or:
>         if (!aFinish && (mNumGlyphs < GLYPH_BUFFER_SIZE || !mNumGlyphs)) {
> 
> can you add the parenthesis to make it clear which one you mean.

I think you may have added this comment to the wrong bug by mistake? I don't believe I wrote anything like that in these patches....
Comment 67 Jeff Muizelaar [:jrmuizel] 2012-01-09 12:29:24 PST
(In reply to Jonathan Kew (:jfkthame) from comment #66)
> I think you may have added this comment to the wrong bug by mistake? I don't
> believe I wrote anything like that in these patches....

Quite so.
Comment 68 Jonathan Kew (:jfkthame) 2012-01-10 09:11:02 PST
(In reply to Jonathan Kew (:jfkthame) from comment #63)
> Which makes no apparent sense, as Tsvg-opacity doesn't even involve text,
> AFAICS.
> 
> Also strange that it should affect Linux only.
> 
> I'm beginning some investigation locally to try and understand what's going
> on here, but will be away from my main Linux machine over the weekend so it
> may be a few days before we have answers to this.

I've been trying to investigate this locally with Linux Opt builds from before/after this bug landed, but have not been able to reproduce the apparent regression either using a simple page-load test with the Tsvg-opacity testcases or by running standalone talos.
Comment 69 Jonathan Kew (:jfkthame) 2012-01-10 10:53:44 PST
I still don't understand this, but from the tryserver results at https://tbpl.mozilla.org/?tree=Try&rev=1a04d12654c0, it looks as though bug 716229 may help with the mysterious Tsvg-opacity regression on Linux. Let's see what talos says once that has landed.
Comment 70 Karl Tomlinson (ni?:karlt) 2012-01-10 18:23:31 PST
Comment on attachment 585989 [details] [diff] [review]
pt 2.4 v3 - adapt Linux/Pango font code to work with gfxShapedWord caches.

>-    pango_break(aUTF8, aUTF8Length, aAnalysis,
>-                buffer.Elements(), buffer.Length());

>         pango_shape(text, len, aAnalysis, glyphString);
>-        SetupClusterBoundaries(aTextRun, text, len, utf16Offset, aAnalysis);
>-        SetGlyphs(aTextRun, text, len, &utf16Offset, glyphString,
>-                  aOverrideSpaceWidth);
>+        SetGlyphs(aShapedWord, text, len, &utf16Offset, glyphString,
>+                  aOverrideSpaceWidth, aFont);

Did you decide we don't need or want to use the PangoEngineLang for cursor positions in indic scripts (bug 617203 comment 5)?

>-    cairo_scaled_font_t *cairoFont = CreateScaledFont(renderPattern, face);
>-
>-    nsRefPtr<gfxFcFont> font = static_cast<gfxFcFont*>
>-        (cairo_scaled_font_get_user_data(cairoFont, &sGfxFontKey));
>-
>+    gfxFontStyle style(*aFontStyle);
>+    style.size = GetPixelSize(renderPattern);
>+    style.style = gfxFontconfigUtils::GetThebesStyle(renderPattern);
>+    style.weight = gfxFontconfigUtils::GetThebesWeight(renderPattern);
>+
>+    nsRefPtr<gfxFont> font = gfxFontCache::GetCache()->Lookup(fe, &style);
>     if (!font) {
>-        gfxFloat size = GetPixelSize(renderPattern);
>-
>-        // Shouldn't actually need to take too much care about the correct
>-        // name or style, as size is the only thing expected to be important.
>-        PRUint8 style = gfxFontconfigUtils::GetThebesStyle(renderPattern);
>-        PRUint16 weight = gfxFontconfigUtils::GetThebesWeight(renderPattern);
>-
>-        // The LangSet in the FcPattern does not have an order so there is no
>-        // one particular language to choose and converting the set to a
>-        // string through FcNameUnparse() is more trouble than it's worth.
>-        nsIAtom *language = gfxAtoms::en; // TODO: get the correct language?
>-        // FIXME: Pass a real stretch based on renderPattern!
>-        gfxFontStyle fontStyle(style, weight, NS_FONT_STRETCH_NORMAL,
>-                               size, language, 0.0,
>-                               true, false,
>-                               NS_LITERAL_STRING(""),
>-                               NS_LITERAL_STRING(""));
>-
>         // Note that a file/index pair (or FT_Face) and the gfxFontStyle are
>         // not necessarily enough to provide a key that will describe a unique
>         // font.  cairoFont contains information from renderPattern, which is a
>         // fully resolved pattern from FcFontRenderPrepare.
>         // FcFontRenderPrepare takes the requested pattern and the face
>         // pattern as input and can modify elements of the resulting pattern
>         // that affect rendering but are not included in the gfxFontStyle.
>-        font = new gfxFcFont(cairoFont, fe, &fontStyle);
>+        cairo_scaled_font_t *cairoFont = CreateScaledFont(renderPattern, face);
>+        font = new gfxFcFont(cairoFont, fe, &style);
>+        gfxFontCache::GetCache()->AddNew(font);
>+        cairo_scaled_font_destroy(cairoFont);

Can you explain the reason for this change, please?
Comment 71 Nicholas Nethercote [:njn] 2012-01-11 02:35:58 PST
jkew, can you briefly summarize the memory wins in this bug?
Comment 72 Jonathan Kew (:jfkthame) 2012-01-11 06:03:22 PST
(In reply to Karl Tomlinson (:karlt) from comment #70)
> Did you decide we don't need or want to use the PangoEngineLang for cursor
> positions in indic scripts (bug 617203 comment 5)?

Some, at least, of the "special breaking rules" that pango implements via pango_break are specifically _not_ desired; see bug 474068 and http://bugzilla.gnome.org/show_bug.cgi?id=576156. The undesired behavior in pango has still not been fixed, AFAIK; in bug 474068 we hacked our test to ignore the problem, but it would also give incorrect first-letter behavior (for example), if we were to have Thai testcases for that.

If we determine that there are specific languages/scripts for which using pango would give us _improved_ behavior, we could consider re-enabling it there, but at this point I'm not aware of cases where we need that.

> Can you explain the reason for this change, please?

The old code was bypassing the global gfxFontCache, which left us without any easy way to keep track of the thebes fonts in order to report their memory usage or to age or flush their ShapedWord caches.
Comment 73 Jonathan Kew (:jfkthame) 2012-01-11 06:08:21 PST
(In reply to Nicholas Nethercote [:njn] from comment #71)
> jkew, can you briefly summarize the memory wins in this bug?

This bug doesn't by itself substantially affect memory usage; however, it does make it possible for us to tune caching behavior more readily, reducing churn due to discarding and then re-creating data for common words, and it allows us to flush the (potentially large) shaped-word caches on memory-pressure notifications (bug 708075).

It's also the first step towards the restructuring of gfxTextRun (bug 715473), which might (if successful) be a significant memory win on text-heavy pages.
Comment 74 Karl Tomlinson (ni?:karlt) 2012-01-11 13:12:02 PST
(In reply to Jonathan Kew (:jfkthame) from comment #72)
> The old code was bypassing the global gfxFontCache, which left us without
> any easy way to keep track of the thebes fonts in order to report their
> memory usage or to age or flush their ShapedWord caches.

The old code did use the gfxFontCache for gfxFonts with a reference count of zero, but the method that was used to track gfxFonts may not have caught that.

I gather that word caching has now moved from a FontGroup to Font, and so CSS details such as the language now need to exist on the Font and we now need to have a separate gfxFont for each language (and override).

I expect there would be fontconfig corner cases where gfxFontCache::GetCache()->Lookup(fe, &style) doesn't give us the correct cairo_scaled_font, but it looks like it is working with standard configurations.

I guess we now no longer have word-based FindFontForChar font-selection caching now that caching is no longer FontGroup based.  I wonder what impact that has.

Note You need to log in before you can comment on or make changes to this bug.