Performance issue on large bidi blocks due to excessive destruction and creation of continuations

RESOLVED FIXED in mozilla1.9alpha1

Status

()

Core
Layout: Text
RESOLVED FIXED
12 years ago
10 years ago

People

(Reporter: Uri Bernstein (Google), Assigned: Uri Bernstein (Google))

Tracking

({perf})

Trunk
mozilla1.9alpha1
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Assignee)

Description

12 years ago
This is a follow-up on bug 319930.

Bidi resolution re-uses the first available continuations for a frame as bidi continuations (in nsBidiPresUtils::EnsureBidiContinuation). It then frees the remaining continuations for use by the inline reflow code, by ensuring they are all fluid.
For the case of a large block (spanning many lines) with only some bidiness in it, this is wasteful, as many fluid continuations need to be re-created between the bidi continuations, whereas many of the remaining fluid continuations at the end need to be removed.

This causes a significant performance problem on large bidi plaintext files (e.g. attachment 205680 [details] on bug 319930).
See bug 319930 comment 24 and 25 for profiling data.
(Assignee)

Comment 1

12 years ago
Looking into those profiling results and reading bz's comments more carefully, I note that the destruction and creation of continuations only accounts for about 30% of the total time (nsTextFrame::Reflow accounts for an additional 35% on GTK, and I suspect that the percentage on Mac is even higher).

So while fixing this would still be nice, saying that "*this* causes a significant performance problem on large bidi plaintext files" was not really accurate.
Well... if we're creating/destroying text frame continuations, we have to reflow them too...

But yes, that could be a separate (still likely bidi-related) issue.
(Assignee)

Comment 3

12 years ago
(In reply to comment #2)
> Well... if we're creating/destroying text frame continuations, we have to
> reflow them too...
> 
> But yes, that could be a separate (still likely bidi-related) issue.
> 

Hmmm... I thought we have to reflow them anyway. But I guess that's not the case.
(Assignee)

Comment 4

12 years ago
Created attachment 214557 [details] [diff] [review]
patch

In EnsureBidiContinuation, leave fluid continuations alone (just skip them), and only re-use non-fluid continuations as bidi continuations.

This means that if nothing actually changed, the normal inline reflow will be able to use all the existing fluid continuations, and none will have to be destroyed or created.
Assignee: mozilla → uriber
Status: NEW → ASSIGNED
Attachment #214557 - Flags: superreview?(bzbarsky)
Attachment #214557 - Flags: review?(smontagu)
Comment on attachment 214557 [details] [diff] [review]
patch

sr=bzbarsky.

On the testase from bug 319930 with this patch I get:


Total hit count: 1060036
Count %Total  Function Name
66144   6.2     FcCharSetDestroy
38268   3.6     XftCharIndex
34843   3.3     nsFontMetricsXft::FindFont(unsigned int)
33373   3.1     XftGlyphExtents
23433   2.2     nsBidiPresUtils::CalculateCharType(int&, int, int&, int&, int&, unsigned char&, unsigned char&) const
20160   1.9     FcCharSetHasChar

That's about 40% faster than the same testcase without this patch.
Attachment #214557 - Flags: superreview?(bzbarsky) → superreview+
Comment on attachment 214557 [details] [diff] [review]
patch

r=me, assuming you've done regression testing (I would look at bug 248447 bug 297401 and bug 321745)
Attachment #214557 - Flags: review?(smontagu) → review+
(In reply to comment #5)
> Total hit count: 1060036
> Count %Total  Function Name
> 66144   6.2     FcCharSetDestroy
> 38268   3.6     XftCharIndex
> 34843   3.3     nsFontMetricsXft::FindFont(unsigned int)
> 33373   3.1     XftGlyphExtents
> 23433   2.2     nsBidiPresUtils::CalculateCharType(int&, int, int&, int&, int&,
> unsigned char&, unsigned char&) const
> 20160   1.9     FcCharSetHasChar

Am I reading this correctly? RemoveBidiContinuation has dropped completely out of the picture and CalculateCharType is now looming large?
> Am I reading this correctly? RemoveBidiContinuation has dropped completely out

All of the following is with this patch applied.

Time spent under RemoveBidiContinuation itself is about 0.25% of total.  So yeah, it's more or less gone.  ;)

The times in comment 5 are percentages of total time in the function, not under it.  For bidi stuff, about 12% of total load time is spent under nsBidiPresUtils::Resolve.  It breaks down as:

113880 nsBidiPresUtils::Resolve(nsPresContext*, nsIFrame*, nsIFrame*, int&, int)
  52088 nsBidi::SetPara(unsigned short const*, int, unsigned char,
                        unsigned char*)
  28471 nsBidiPresUtils::CalculateCharType(int&, int, int&, int&, int&,
                                           unsigned char&, unsigned char&) const
  12074 nsBidiPresUtils::CreateBlockBuffer(nsPresContext*)
   6634 nsBidiPresUtils::InitLogicalArray(nsPresContext*, nsIFrame*, nsIFrame*,
                                          int)
   5375 nsBidi::CountRuns(int*)
   3095 nsBidi::GetLogicalRun(int, int*, unsigned char*)
   2013 nsBidiPresUtils::EnsureBidiContinuation(nsPresContext*, nsIFrame*,
                                                nsIFrame**, int&)
   1255 nsPropertyTable::SetPropertyInternal(nsPropertyOwner, unsigned int,
                                             nsIAtom*, void*, void (*)(void*,
                                             nsIAtom*, void*, void*), void*,
                                             void**)
   1242 nsBidi::GetCharTypeAt(int, nsCharType*)
    258 nsBidiPresUtils::RemoveBidiContinuation(nsPresContext*, nsIFrame*, int,
                                                int, int&) const
    203 nsContinuingTextFrame::GetPrevContinuation() const
    181 nsStyleContext::GetStyleData(nsStyleStructID)

That said, the total time spent in reflow is still about 15x more than on the same file as EUC-JP.  So there's some sort of whole-system effect here, it seems to me.
(Assignee)

Comment 10

12 years ago
I'm away from home until tomorrow evening, so I couldn't check in the patch, but thinking about it, I realized that I probably need to set the embeddingLevel, baseLevel, and charType attributes on the fluid continuations that I'm skipping. So I'll have to make a new patch anyway. I'll also test the bugs Simon pointed to for regressions (I did do some tests, but not specifically those bugs).

Boris, thanks for the profiling data. I'm trying to think why this is still so much slower than in the no-bidi case. It seems that in the bidi case we're doing a full reflow of everything for each "incremental reflow". Is this true also for the  non-bidi case? If not, that might explain the difference.
(Assignee)

Comment 11

12 years ago
Created attachment 214843 [details] [diff] [review]
patch v2

Same as before, but set the attributes correctly on the reused continuations.
I did some more testing (including the bugs Simon mentioned above), whech led me to finding bug 330269 (which didn't surprize me) and bug 330268 (which did) - but no regressions from this patch.
Attachment #214843 - Flags: superreview?(bzbarsky)
Attachment #214843 - Flags: review?(smontagu)
Comment on attachment 214843 [details] [diff] [review]
patch v2

So are these frames that we're setting the properties on guaranteed to be on the in-flow list of aFrame or something?

If so, could we add an assert to that effect?
Attachment #214843 - Flags: superreview?(bzbarsky) → superreview+
(Assignee)

Comment 13

12 years ago
(In reply to comment #12)
> (From update of attachment 214843 [details] [diff] [review] [edit])
> So are these frames that we're setting the properties on guaranteed to be on
> the in-flow list of aFrame or something?
> 
> If so, could we add an assert to that effect?
> 

Well, this time I think the local code really guarantees this, so assertions are really not necessary:
We're only reaching the attribute-setting code if the |if (frame->GetPrevInFlow() != aFrame)| condition was not satisfied, meaning that aFrame *is* |frame|'s prev-in-flow, which (we can assume) means |frame| is aFrame's next-in-flow. And then we set aFrame to frame, etc., meaning that |frame| will always be in the in-flow chain of the original aFrame.

But as always, you might show me that I either misunderstood or am wrong :)

Updated

12 years ago
Attachment #214843 - Flags: review?(smontagu) → review+
(Assignee)

Comment 15

12 years ago
Checked in:

Checking in layout/base/nsBidiPresUtils.h;
/cvsroot/mozilla/layout/base/nsBidiPresUtils.h,v  <--  nsBidiPresUtils.h
new revision: 1.20; previous revision: 1.19
done
Checking in layout/base/nsBidiPresUtils.cpp;
/cvsroot/mozilla/layout/base/nsBidiPresUtils.cpp,v  <--  nsBidiPresUtils.cpp
new revision: 1.70; previous revision: 1.69
done
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9alpha
(Assignee)

Comment 16

12 years ago
On my machine (iMac G5 1.8GHz), using hourly builds, the improvement I see is from 99 sec. (before) to 72 sec. (after), that is, 27%. This is consistent with comment #1.

Boris, do we want to file another follow up, for dealing with the fact that this is still a lot slower than the non-bidi case? This time I'm not sure where to look for improvement, though.

Comment 17

12 years ago
(In reply to comment #16)
> Boris, do we want to file another follow up, for dealing with the fact that
> this is still a lot slower than the non-bidi case? This time I'm not sure where
> to look for improvement, though.

How does vi load this file in 1 sec? It knows the line count, displays all of the Unicode caharacters and can jump to the last page in that time. 
(Assignee)

Comment 18

12 years ago
(In reply to comment #17)
> (In reply to comment #16)
> > Boris, do we want to file another follow up, for dealing with the fact that
> > this is still a lot slower than the non-bidi case? This time I'm not sure where
> > to look for improvement, though.
> 
> How does vi load this file in 1 sec? It knows the line count, displays all of
> the Unicode caharacters and can jump to the last page in that time. 
> 

Perhaps I should make myself clear: I'm not saying that I don't think this can (and should) be dramatically improved. All I'm saying is that I, currently, don't know where to look in order to do that.
> How does vi load this file in 1 sec?

If it does what emacs does, it doesn't treat it as HTML. ;)   That allows some interesting optimizations that would significantly complicate out code (we'd have  to have a separate layout engine for plaintext, basically).

Uri, I think we should definitely file another followup, still blocking bug 317334, and probably depending on bug 263359.  I think the next interesting thing to do is to investigate what lines are marked dirty when content is appended in this case.  Normally it would just be the last line of the <pre> and maybe the one before that.  Is that what happens with bidi?  If not, why not?
(Assignee)

Comment 20

12 years ago
(In reply to comment #19)
> Uri, I think we should definitely file another followup, still blocking bug
> 317334, and probably depending on bug 263359.

Filed bug 330350 for this.
(Assignee)

Updated

12 years ago
Depends on: 330373
(Assignee)

Updated

12 years ago
Depends on: 333433

Updated

10 years ago
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
You need to log in before you can comment on or make changes to this bug.