Closed Bug 98118 Opened 19 years ago Closed 18 years ago

nsJISx4501LineBreaker::Next is worse than O(N^2) in number of elements in line

Categories

(Core :: Internationalization, defect, P2)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: dbaron, Assigned: smontagu)

References

(Blocks 2 open bugs)

Details

(Keywords: perf)

Attachments

(2 files, 1 obsolete file)

nsJISx4501LineBreaker::Next takes time proportional to worse than the square of
the number of elements in a line of text.  This is demonstrated by the testcases
in bug 97229.  It causes reflow to take *extremely* long times when there are
large numbers of elements in a line.  See bug 97229 for more details (see the
information on the Cn testcases).

With N elements on a line, it takes the following numbers of seconds for a
one-line testcase to display on my machine:

N      time
8192   13
16384  62
32768  400
jprof shows that over 80% of the time loading the 32768-element testcase was
spentin nsJISx4501LineBreaker::Next itself.  About 6% was spent in the memmove
called from nsVoidArray::RemoveElementsAt called from nsLineLayout::ForgetWordFrame.

The testcase can be constructed by doing the following:

1st line: "<pre>"
2d line: "<span>0</span><span>1</span>" ... "<span>f</span>" repeated so as to
give the correct number of elements.
3d line: "</pre>"
To clarify, I haven't yet investigated how the overall rate of increase in time
spent in this function is split between the number of times the function is
called and the amount of time a single call takes.
assigning to bstell: and cc'ing self
Assignee: yokoyama → bstell
dave: please work with shanjian.
Shanjian- can you take a look at the performance there?
I don't believe the Next routines will spend a lot of time....
Assignee: bstell → shanjian
I made 3 testcase, each with 330, 3300 and 33000 elements. The approximate 
number of call to nsJISx4501LineBreaker::Next is:
330 -> 329,  -> 1 call per element
3300 -> 9464 -> 3 calls per element
33000 -> 895043 -> 27 calls per element

The performance of a single call of nsJISx4501LineBreaker::Next is O(n), but as 
the number of call increase exponentially, the mentioned behavior is within 
explaination. 
Status: NEW → ASSIGNED
Now the problem is clear. In the testcase proposed by dbaron, there are a lot of 
elements, each has a frame. When it's time to decide line break, frame is added 
one by one and sent to nsJISx4501LineBreaker::Next. Since all those element are 
in a single word, no break can be found and this procedure repeat again and 
again, with the text buffer sent to nsJISx4501LineBreaker::Next increasing as 
well.

I believe this is a very special scenario. Normally it is breakable between 
frame and frame, and only in very few situation several frames need to be 
treated as one unbreakable word. IMO we probably don't need to do anything here.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
Wontfix isn't appropriate here -- we should fix this.  It may or may not be your
bug, but it should be fixed -- we shouldn't have quadratic growth on input size.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(The fix may be as simple as not calling the linebreaker at all for preformatted
output.)
<pre> is not necessary to reproduce this problem. I still don't think this is 
something we should sweat on. But if people want to improve this block of code, 
that is always welcome. So let me reassign this bug to dbaron. 

Assignee: shanjian → dbaron
Status: REOPENED → NEW
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla0.9.7
Target Milestone: mozilla0.9.7 → mozilla0.9.8
*** Bug 114420 has been marked as a duplicate of this bug. ***
http://www.toronto.com is a real-world page; when we call view-source on it it
takes forever and 92% of that time is in (not under) nsJISx4501LineBreaker::Next

Of course there are no newline characters in the source, so this is to be
expected...
Blocks: 114584
Target Milestone: mozilla0.9.8 → mozilla0.9.9
*** Bug 122810 has been marked as a duplicate of this bug. ***
*** Bug 111784 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla0.9.9 → Future
dbaron, are you planning to work on this? If not, please assign it to me.
Keywords: nsbeta1
->smontagu, per comment 14.  (But see comment 8, although I'm not sure what's
easier.)
Assignee: dbaron → smontagu
Status: ASSIGNED → NEW
Target Milestone: Future → ---
QA Contact: andreasb → ylong
Attached patch Something to try (obsolete) — Splinter Review
So... this patch tries to not call the linebreaker when aTextData.mIsBreakable
is false in MeasureText(). Two things that could be wrong with the patch:

1)  I may misunderstand how the whole thing works.... I assumed that
    nsJISx4501LineBreaker::Next can't return anything useful if the line is not

    breakable

2)  I'm not sure whether the 

   if((*aStop) && (wordLen == 0))
     return dimensions; // 0;

lines in ComputeWordFragmentDimensions() should always be executed or only
executed when the text is breakable....

Are there any wrapping regression tests this could be tested on?  I'm not quite
sure under what circumstances the block of code I'm skipping will actually
produce usable output.	I've been browsing with this patch for a bit with no
ill effects.
I just tested that patch on the testcase in bug 104517 (attachment 71611 [details]) and
the patched and unpatched builds act exactly alike (and yes, there _are_ places
in that testcase where the changes I made change whether code runs or not).

Shanjian?  Frank?  What do you think?
Keywords: review
Attached file testcase
Attachment #71698 - Attachment mime type: text/plain → text/html
Your patch make sense to me. I tried several testcases, and they all looks fine. 
>>1)  I may misunderstand how the whole thing works.... I assumed that
>>    nsJISx4501LineBreaker::Next can't return anything useful if the line is not
>>    breakable
This seems truth, I could not find a testcase to defeat it. 

>>2)  I'm not sure whether the 
>>   if((*aStop) && (wordLen == 0))
>>     return dimensions; // 0;
This statement should always be called in either case. 

There is another thing that I am not absolutely sure. In case "aIsBreakable" is
not true, will we ever suffer in performance in certain situation? If there does
exists such a case, "aLineBreaker->Next" will be called again and again with
expanding buffer. The following patch should fix the problem if it does happen. 
+#define MAX_LOOK_BACK_FOR_WORDBREAK 80^M
   if(wordLen > 0)^M
   {^M
     memcpy((void*)&(aWordBuf[aRunningWordLen]), bp, sizeof(PRUnichar)*wordLen);
^M
 ^M
     PRUint32 breakP=0;^M
     PRBool needMore=PR_TRUE;^M
+    PRUint32 startPos = 0;^M
+    if (aRunningWordLen > MAX_LOOK_BACK_FOR_WORDBREAK)^M
+      startPos = aRunningWordLen - MAX_LOOK_BACK_FOR_WORDBREAK;^M
     nsresult lres = aLineBreaker->Next(aWordBuf, aRunningWordLen+wordLen,^M
-                                       0, &breakP, &needMore);^M
+                                       startPos, &breakP, &needMore);^M
     if(NS_SUCCEEDED(lres)) ^M


> There is another thing that I am not absolutely sure. In case "aIsBreakable" is
> not true

You mean in case it _is_ true but there are no actual places where one can
break?   How does that situation arise?  Would it be triggered by a long run of
letter with no spaces?

Note that I _really_ don't know this code...
Keywords: mozilla1.0+
>>You mean in case it _is_ true but there are no actual places where one can
>>break?   
Yes. Sorry for my typo. 
>>How does that situation arise?  Would it be triggered by a long run of
>>letter with no spaces?
I don't know if such situation exist or not. But clearly,  a long run of
letters with no spaces within the same frame will not trigger such a problem. 
For the time being, probably your patch is enough. When we do meet such a testcase
in future, we can put my additional patch in. 
Comment on attachment 71533 [details] [diff] [review]
Something to try

This patch improve the performance greatly in my testing. But we are still
sufferring in such a situation, so further optimization is needed. Please open
another bug when this bug is closed. Future optimization probably will have
nothing to do with linebreaker.
Attachment #71533 - Flags: review+
I runned the layout regression tests with the patch, they passed, only the usual
suspects (aka false postives) did appear.
I have some questions about the patch.... part of it seems to be missing. Where
are the callers to these functions that you have added aIsBreakable to? Plus you
have some fprintfs in there that probably shouldn't be there.
Comment on attachment 72653 [details] [diff] [review]
Same as first patch but -u10 so the callers are obvious and with fprintfs removed

sr=roc+moz
Attachment #72653 - Flags: superreview+
nsbeta1+, please work with IQA to test the line breaking test suite
Keywords: nsbeta1nsbeta1+
Is the suite available outside the firewall?  ylong?
The test cases in comment #18, #12, #13 not helping?
Comment on attachment 72653 [details] [diff] [review]
Same as first patch but -u10 so the callers are obvious and with fprintfs removed

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #72653 - Flags: approval+
Patch checked in.  Leaving bug open per shanjian's comments, since memmove is
now the biggest offender in the various testcases....  Simon, Shanjian, should
we open a new bug on that?
Yes, I suggest to open a new bug and close this one. One bug should only 
focusing on one issue for ease of regression and management control.
Bug 129167 and bug 129166 filed.  Marking this one fixed.
Status: NEW → RESOLVED
Closed: 19 years ago18 years ago
Resolution: --- → FIXED
Mark as verified for this one cause we have seperate bugs filed.

Should the new bugs be nsbeta1? cause "nsbeta1+" for this one.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.