The few pixels of a 'j' or 'J' at the beginning of a link don't change color on
highlight. The number of pixels varies from 0 to about 3, depending on
capitalization, font face, and whether the text is italicized.
Created attachment 46409 [details]
I see this on linux. OS->All
On the URL, I also see this on the 'M' of the Jesse Malone link (bottom right)
Created attachment 46411 [details]
more obvious testcase
Bulk moving Mozilla1.01 bugs to future-P1. I will pull from the future-P1 list
to schedule bugs for post Mozilla1.0 milestones
Created attachment 83343 [details]
a testcase with a hint
It seems that part of the letter escapes the bounding box...
Created attachment 108229 [details]
an other testcase with letter 'f'
The same happens with the rightmost pixel(s) at the end of a link with the
letter 'f' or 'F'.
*** Bug 207926 has been marked as a duplicate of this bug. ***
*** Bug 261915 has been marked as a duplicate of this bug. ***
This may itself be a duplicate of bug 15116
When text frames have overflowing text they probably need to set their
mCombinedArea / overflow area during reflow.
I don't know anything about our font APIs. rbs, do the MathML bounding metrics
give us what we want here?
Yes, that info is contained in the MathML bounding metrics:
- the so-called .leftBearing tells how far the text protrudes on the left;
- the so-called .rightBearing tells how far the text protrudes on the right.
However, as they come at a sizeable extra cost, I feel that they are overkill
for widespread use in all textframes.
I am under the impression that IE uses a very simple mechanism:
- if the containing block is positionned, then they clip its area from the
start. (That is, the text will never protrude outside. So bug 79732 doesn't
happen because the descender of j is cut right from the start.) The problem in
Gecko here seems to be that the containing block is not clipped (see screenshots
- if the containing block is not positionned, then they don't clip its area. The
text can protrude, and likewise when the damage area is re-painted, the bits of
text that stick outside are repainted (i.e., DrawString overflows, but who
cares). The problem in Gecko here seems to be that the damage area gets
explicitly clipped, and so the overflow of DrawString are cut out.
It sounds like what IE does is convenient but isn't really correct.
> I feel that they are overkill for widespread use in all textframes.
Is there a way we can make it fast? Even via some ugly hack like special-casing
ASCII characters that we know "should not" overflow in non-oblique text?
> It sounds like what IE does is convenient but isn't really correct.
Notice however that Gecko is not consistent with the second case (this bug). At
the first paint, DrawString is allowed to protrude. But at the next repaint, it
is prevented from protruding (due to the clipping). And that's why the bits that
stick outside go out-of-sync and look outdated.
Also note that in the first case (bug 79732), it may be that an explicit
clipping is only needed when painting the foreground text layer, thus mostly
affecting DrawString only.
> Is there a way we can make it fast?
Sure there could be a special-casing or something (e.g., a crude estimate that
doesn't involve a GDI call/measurement). But...
> Notice however that Gecko is not consistent with the second case (this bug).
I realize that what we're currently doing is wrong. But if we accurately
computed the overflow for text frames then everything should work as you'd expect.
>> Is there a way we can make it fast?
> Sure there could be a special-casing or something (e.g., a crude estimate that
> doesn't involve a GDI call/measurement). But...
It seems to me that only additional work required is to measure the left bearing
on the first character of the string and the right bearing on the last character
of the string. Would it still be expensive to get just those from GDI?
And it may sound ridiculous but I bet we could get a decent speedup by
identifying characters that never have left or right bearing in any
Two GDI calls (for the left and right) will be needed to get all the info. The
actual GDI function returns a struct with other data such as the width --
necessary to infer the overflow of the string: rbearing_string = width_string -
width_lastchar + rbearing_lastchar. GetBoundingMetrics will expose the info in a
unified way that hides the coordinate system that varies per platform. Hence no
further action is needed here. It already has a fast path for a single char
since this was a common case in MathML.
An extra perf might indeed be to skip characters that can't protrude (such as
'r', 's', ... as I type this, I wonder if they really can't protrude in all
I am not sure precisely what the extra cost of all this will be. I guess it can
be tried (shouldn't be hard to implement, except that those platforms that still
don't have GetBoundMetrics will miss out...). I understand that this fits nicely
with the current way that overflows are handled and that you prefer this route.
If the perf degenerates badly or storing the overflow areas creates much
overhead in memory, then we can revisit the other approaches that I mentioned.
> An extra perf might indeed be to skip characters that can't protrude (such as
> 'r', 's', ... as I type this, I wonder if they really can't protrude in all
> possible fonts?!?)
Yeah :-). Someone with a really big font collection could probably do an
experiment to figure out which characters satisfy.
By "not correct" I mean
> - if the containing block is positionned, then they clip its area from the
A relatively positioned inline span with an offset of 0,0 should look identical
to if it didn't have relative positioning. Period.
Also note that even Opera does that clipping (bug 79732 comment 31). I
understand that Gecko's overflowing handling together with the
GetBoundingMetrics can fit the bill nicely. So until we see the perf hit,
considering such a clipping (or other trickeries with 'r', 's', etc) is
premature optimization however.
*** Bug 197129 has been marked as a duplicate of this bug. ***
*** Bug 280621 has been marked as a duplicate of this bug. ***
I hope the changed bug summary reflects the more general nature of this bug.
This isn't really an obvious bug in Firefox, and as a result probably hasn't been treated with much concern. But it really stands out in Thunderbird mail composition, where backspacing over / deleting an f (in Times font, which is default) often leaves remnants of the characters in the composition window (similar to bug 197129 reported for Composer). While it's obviously not serious, turds like that can give an average user the (false) impression that the product is cheap or buggy.
It'd be nice if at least a temporary patch could be put in before Thunderbird 2 was released, or soon thereafter.
I believe this is most frequently noticed with hover coloring in Firefox. The summary change will likely lead to more duplicates. However, the old summary was also somewhat misleading as characters could overflow on either the right or left. After looking at the dupe summaries and these, I've combined them with more keywords to hopefully make this easier to find.
Would require too invasive a change for 1.8.1.
*** Bug 317974 has been marked as a duplicate of this bug. ***
We need to compute accurate glyph bounds so we get the right overflow area for text frames whose glyphs overflow the font box. For fancy fonts such as Zapfino this is essential for dynamic effects to work right.
The new textframe knows how to do this. We need to fix gfxTextRun to get the glyph metrics. The tricky part is doing it without regressing performance. See
Created attachment 266438 [details] [diff] [review]
work in progress
This should work on all platforms but I've only tested it on Mac. It requires new-textframe. I don't plan to finish and land this until new textframe has landed.
The main idea is to aggressively cache glyph extents on gfxFonts. We take advantage of a common case: the glyph lies within the ascent and descent of the font, and has no before-bearing, and tight glyph bounds are not required (since we're going to take the union of the glyph extents and the font-box anyway). In this case we need only store the offset to the glyph's right edge. We can commonly do this in 16 bits. So we use an array of PRUint16s indexed by glyph ID, with a backup of a hashtable mapping glyph IDs to full extent rects.
Because we are doing our own glyph extents caching this way, we avoid entering cairo at all for glyph extents on Mac. We just get them directly from ATSUI.
The extents cache depends on the appUnitsToDevUnits ratio, since the glyph widths may be fractional and we're actually rounding them to appunits. Since fonts don't know the appunitstodevunits ratio, we have to handle multiple extents caches per gfxFont. It would be nice to find a way to avoid this.
I've measured the performance on Mac and observed no significant performance impact over 40 runs of Tp2.
-- use ceil() when we convert glyph widths to appunits
-- check RTL situations
-- get this working on Windows and Linux, with performance testing for those platforms
Created attachment 272445 [details] [diff] [review]
Okay, this patch is updated to trunk. There is cairo-based code that should work on Windows, Linux and anywhere else --- although performance may suck. In fact, since GetGlyphOutline is said to suck badly, I'm pretty sure this will be a performance problem on Windows. Not sure of the best way to address that. Anyway, this seems to work with RTL, and I've added the NS_ceil I mentioned in the previous comment.
What I'd like to do now is get someone to profile Windows and Linux and we'll see what the perf impact is. One option would be to hardcode particular common font names where we know no glyphs overflow font boxes. Another option would be to depend on the text-rendering CSS property and lie about the glyph bounds when that is optimizing for speed.
Actually I think I'll test Linux performance right now...
Your patch is 61 bytes :)
Created attachment 272586 [details] [diff] [review]
Sorry about that.
Created attachment 272882 [details] [diff] [review]
patch with Pango/Windows changes
Added FetchGlyphExtents calls to Pango and Windows (in addition to Atsui) to set up GlyphExtents objects with cairo-based code. Tested successfully on Linux/Pango.
Measured mean effect on Linux Tp2 page load time: 1.3%
95% confidence interval: -0.3 to 2.8%
Obtained from 50 cycles of the 40 Tp2 pages.
20070716 trunk: 137.7 0.5
with patch: 139.4 0.6
Karl's patch has become old and won't apply to the current tree. An updated patch (with other mathml fixes) is posted on bug 324857. If you need a stripped-off version for this bug alone, please let me know.
Marking blocking since it's blocking other things (:first-letter, etc.)
Yamashita-san, please attach your updated patch (with the MathML stuff removed). THanks!
Created attachment 276046 [details] [diff] [review]
Tempting as it is, I can't review my own code. I need someone who can produce trustworthy Windows performance numbers to get some for this patch.
I can do that; will do so either later today or monday.
On windows... not sure how to fix this in a straightforward way; basically nsAutoTArray<nsAutoPtr<gfxGlyphExtents>,1> is causing the problem.
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(194) : error C2558: class 'nsAutoPtr<T>' : no copy constructor available or copy constructor is declared 'explicit'
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(712) : see reference to function template instantiation 'void nsTArrayElementTraits<E>::Construct<Item>(E *,const A &)' being compiled
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(446) : see reference to function template instantiation 'void nsTArray<E>::AssignRange<Item>(nsTArray_base::index_type,nsTArray_base::size_type,const Item *)' being compiled
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(291) : see reference to function template instantiation 'nsAutoPtr<T> *nsTArray<E>::ReplaceElementsAt<nsAutoPtr<T>>(nsTArray_base::index_type,nsTArray_base::size_type,const Item *,nsTArray_base::size_type)' being compiled
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(290) : while compiling class template member function 'nsTArray<E> &nsTArray<E>::operator =(const nsTArray<E> &)'
c:\proj\mozilla-cvs\firefox-opt\dist\include\xpcom\nsTArray.h(718) : see reference to class template instantiation 'nsTArray<E>' being compiled
c:\proj\mozilla-cvs\firefox-opt\dist\include\thebes\gfxFont.h(509) : see reference to class template instantiation 'nsAutoTArray<E,N>' being compiled
- using nsTArray instead of nsAutoTArray
- using nsTArray<gfxGlyphExtents,1> (which worked, as far as that goes, but the rest of the code didn't)
- adding a template<typename T> nsTArrayElementTraits<nsAutoPtr<T> > specialization with an undefined Construct() variant that used the copy constructor; this didn't work -- that version of Construct is called (failed at link time)
- adding a template<typename T> nsTArrayElementTraits<nsAutoPtr<T> > that implemented the copying-construct by allocating a new T and copying the old pointed-to- thing there. This compiled in gfx, but failed in layout/generic due to some odd nsTHashtable error.
Also, nsTextFrameBase.h was missing from the latest patch; it also needs an updated license block (e.g. it says 1998 in there or whatever).
Created attachment 277014 [details] [diff] [review]
This updates to trunk (fixing some conflicts) and fixes the Windows build issues. It also includes nsTextFrameBase which was missing from the previous patch. It works just fine on Windows, although performance is still a big question...
Created attachment 277174 [details] [diff] [review]
alternative patch that enables accurate glyph bounds only when the textrun is in "quality" mode
Created attachment 277179 [details] [diff] [review]
updated patch with a NeedsGlyphExtents abstraction
Created attachment 278363 [details] [diff] [review]
okay, this patch is based on 277179 and adds a fix in cairo for Windows bitmap fonts; GetGlyphBounds only works for Truetype fonts so for non-Truetype fonts we just assume that glyphs have no left bearing or right bearing, and vertically the bounds are the font ascent and descent.
I profiled this carefully on Windows and see no significant performance impact over 40 cycles of Tp.
Created attachment 278364 [details]
These are the before and after Tp results I got.
Comment on attachment 278363 [details] [diff] [review]
I think we should go for this now.
Comment on attachment 278363 [details] [diff] [review]
oops, there's a bug with MakeSpaceTextRun, we're not setting up glyph extents for the space in some cases.
Also IIRC Stuart requested that we pass around gfxContext* instead of cairo_t*. I'll do that.
Created attachment 278456 [details] [diff] [review]
updated patch v2
Okay, this is updated with a fix for the space-glyph problem and the changes that Stuart wanted, we don't pass around cairo_ts anymore.
Comment on attachment 278456 [details] [diff] [review]
updated patch v2
The layout changes need review from dbaron. These changes are to properly recompute the overflow rect for text frames when justification has alters the positions of glyphs.
Comment on attachment 278456 [details] [diff] [review]
updated patch v2
Looks ok to me. We need to get our win32 font changes upstream into cairo, though; there's some others as well.
Created attachment 278495 [details] [diff] [review]
These numbers seem more accurate. The results for the as-is patch seem to show a 0-2% slowdown on most pages, with some outliers that are likely as not just noise. Turning on NeedsGlyphBounds for all text adds an additional very slight slowdown, except for hotwired.lycos.com which shows a significant jump as we saw before. Not sure what's going on there, possibly we're getting an overflow rect added which is causing extra repainting or something.
I'll add an extra optimization to FetchGlyphExtents to see if that helps the default "disable bounds on fast path" case.
Created attachment 278709 [details] [diff] [review]
updated patch v3
This is a slightly improved patch with an additional optimization to MeasureText so that we don't get/create a gfxGlyphExtents if it's not going to be needed. This doesn't have any noticeable impact on the results though.
I profiled the Tp2 hotwired.lycos.com page, 100 cycles alternating with about:blank, with VTune. With patch v3, it doesn't show any of the new functions in the patch showing up in any way. And in fact there is no difference in the before-patch and after-patch results for the page when I test it this way.
I've run out of ideas for now; I think we should just land it.
Created attachment 281145 [details] [diff] [review]
updated patch v4
Updated to trunk (-NS_STATIC_CAST-).
Included nsTextFrameBase.h from attachment in comment #44.
I'm a bit concerned that the #ifdef DEBUG in nsLineLayout::RelativePositionFrames may one day end up causing different behaviour for debug versions.
Created attachment 281245 [details] [diff] [review]
updated patch v5
removed #ifdef DEBUG section in nsLineLayout::RelativePositionFrames.
Comment on attachment 281245 [details] [diff] [review]
updated patch v5
Updated patch ... still need dbaron review on the non-gfx bits
Why do we want nsTextFrameBase and the ensuing virtual functions?
In nsCanvasRenderingContext2D.cpp, it seems like you may as well fix the indentation of both of the lines that you're changing, rather than just one.
Why does nsTextFrame::HitTest use nsSize(0, 0)? That seems like it would make it always return null.
mTextRun->MeasureText is returning a cached result in RecomputeOverflowRect, right?
I don't think that long list of people contributed to nsTextFrameBase.
You might want a blank line to separate these comments:
// XXX what about adjusting bounding metrics?
+ // probably should not measure text in ::Reflow, but wait for
+ // RecomputeOverflowRect to be called
(In reply to comment #58)
> Why do we want nsTextFrameBase and the ensuing virtual functions?
So nsLineLayout can call RecomputeOverflowRect.
> In nsCanvasRenderingContext2D.cpp, it seems like you may as well fix the
> indentation of both of the lines that you're changing, rather than just one.
> Why does nsTextFrame::HitTest use nsSize(0, 0)? That seems like it would make
> it always return null.
Er, yeah. I don't know what I was thinking... It should be mFrame->GetSize().
> mTextRun->MeasureText is returning a cached result in RecomputeOverflowRect,
The glyph metrics will have been cached in the textrun, yes, so we won't compute them again. The overall result must be recomputed (it varies here because we may have added justification spacing since we measured the text in Reflow). Note, though, that RecomputeOverflowRect only gets called when text-align:justify is in effect, so it should not be a significant performance issue anyway.
> I don't think that long list of people contributed to nsTextFrameBase.
Oh yeah, I'll fix that.
> You might want a blank line to separate these comments:
> // XXX what about adjusting bounding metrics?
> + // probably should not measure text in ::Reflow, but wait for
> + // RecomputeOverflowRect to be called
OK. Actually I might just delete that comment since it might not be right.
Created attachment 281783 [details] [diff] [review]
updated patch v6
updated to those comments
Comment on attachment 281783 [details] [diff] [review]
updated patch v6
The alternative to having nsTextFrameBase seems like having an nsTextFrame.h header file. In the past we've had some ugly hacks to work around the lack of such a header file -- seems like we may as well have it.
r+sr=dbaron either way
Sorry for the long delay (again).
checked in. Watching for performance impact.
It caused various platforms to go orange so I backed it out.
The problems were reftest failures on all platforms plus a startup failure on fxdbug-linux-tbox. The good news is that performance metrics didn't budge anywhere, as far as I can tell from the one cycle I got.
Not sure if the failed reftests revealed this:
with this patch, all type of list-markers (<ul>,<ol>) are completely missing.Tested with the 2007092205 Minefield/3.0a9pre Mac OS X build
Yeah, they did. But thanks for mentioning it.
Created attachment 282043 [details] [diff] [review]
updated patch v7
Okay, there are four small changes here:
1) nsTextFrameBase.h converted to nsTextFrame.h as David suggested
2) change in nsBulletFrame.cpp to set mOverflowArea (fixes bullets not appearing, and fixes lots of assertions about overflow area not including glyph bounds; these assertions fire with this patch because we never called FinishAndStoreOverflow on bullet frames before). We set the trivial overflow area, equal to the frame size; handling overflowing bullet glyphs is future work (and anyway, not a regression)
3) Reftest 392435-1.html/392435-1-ref.html needs to be fixed; first-letter is allowed to change the text ascent/descent values, so we should not require that the text background and strike-through be drawn the same way in test vs reference
4) Don't change the advance of first-letters:
- textMetrics.mAdvanceWidth = textMetrics.mBoundingBox.XMost();
This was causing reftest failures when we compare first-letter vs explicit markup of the first letter, and there is no need for it as far as I can tell.
These fix all the reftest failures as far as I can tell. fxdbug-linux-tbox probably went orange due to a fatal assertion, hopefully fixed by change #2.
(In reply to comment #67)
> probably went orange due to a fatal assertion, hopefully fixed by change #2.
You should be able to tell whether it was due to a fatal assertion by looking at the log.
Comment on attachment 282043 [details] [diff] [review]
updated patch v7
File a bug on the XXX comment in nsBulletFrame.cpp and cite it in the comment, and r+sr=dbaron on the content/layout parts.
Thanks! Relanded, we'll see how this one goes.
Okay, tests are green on all platforms. All performance numbers look normal so far except for about a 1% Tp2 regression on bl-bldlnx03. I'll recheck the numbers later when we have more data. This is one of those situations where we sometimes just have to do more work.
This seems to have caused about a 1MB increase in memory usage on Talos/XP:
I'll try to figure out why.
The issue seems to be getting glyph extents for CJK characters. Even if we need extents for just a few glyphs in a font, if the glyphID is, say, 30,000, then we need at least 60K of memory for the font's simple glyph-width table. We're getting glyph extents for at least two reasons:
1) image alt text, which uses nsIRenderingContext APIs and therefore does not set OPTIMIZE_SPEED, which currently is taken as a cue that glyph extents will be needed
2) prefilled text in textfields
The first reason is bogus. nsIRenderingContext APIs currently do not use glyph extents. Therefore I've done a patch which resurrects the NEED_BOUNDING_BOX textrun flag to provide callers with direct control over whether glyph extents will be needed or not. nsTextFrameThebes and SVG set it, currently no-one else does.
The second reason is not bogus. We need a better strategy for storing glyph widths when we have widths for only a few glyphs out of thousands.
After fixing the first issue, here are some numbers using instrumentation I added, over about 1.5 runs of the top500 pageset:
Total number of fonts=7958
Total glyph extents allocated=317 (size 12680)
Average simple array size: 6433.116699
Average tight-bounds table size: 13.022082
Number of simple glyph extents eagerly requested=2254
Number of tight glyph extents eagerly requested=128
Number of tight glyph extents lazily requested=0
Number of simple glyph extent setups that fell back to tight=0
This means we needed some glyph extents for about 1 font in every 26 (7958/317). The average capacity of the simple glyph width array was about 3200 elements, so it's still 2MB of storage just for those arrays --- although not all those will be alive at the same time, so the actual footprint impact will be considerably less. But only 2254 glyph extents were requested, so that's only about 7 glyphs per font (2254/317) on average, so the simple glyph width array occupancy is about 0.22%, which is rather sad.
Designing a more efficient data structure is a bit tricky, because sometimes the glyph IDs are sparse, but sometimes they're dense, sometimes both in different regions of glyph ID space in the same font. So a hashtable isn't efficient, but neither is the array we have now.
An array of pointers to arrays of N glyph widths, where null pointers can be used to indicate that all N glyph widths are unknown, is one way to go, but choosing N is tricky. Given G sparse glyphs with a max glyphID of M, this will use about S = M/N*4 + G*N*2 bytes on a 32-bit machine. Minimising S/G with example values of G = 8 and M = 30000, the minimum value of S/G for powers-of-2 N is N=64, which still uses about 362 bytes per glyph, which is better than 7500 with our current plain array, but still not great.
One possible optimization to the block scheme would be to encode a block containing a single glyph width directly overloaded into the block pointer (we only need the bottom log2(N) bits of the glyph ID plus the 16-bit glyph width). Then we could up the block size a bit; say choosing N=128, if there are still no two glyphs in the same block, we end up about 29 bytes per glyph in the above example. That's probably a pretty good approach.
I'll capture the glyph IDs we're actually using in the top500 tests and determine what the optimal parameters are.
Created attachment 282496 [details]
This file contains an instrumentation patch, the results from a performance run, and an analysis script to compute estimated storage requirements for the simple glyph widths using different schemes. The results:
Trunk check: 1794226
Simple blocks with 1-element block optimization:
So basically, using simple 64-glyph blocks reduces the memory requirements by a factor of 10. Using the additional optimization for blocks containing only one glyph reduces memory by another factor of 2. I think I'll go with that optimization and N=128 until/unless we get better data.