Closed Bug 98118 Opened 23 years ago Closed 23 years ago

nsJISx4501LineBreaker::Next is worse than O(N^2) in number of elements in line

Categories

(Core :: Internationalization, defect, P2)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: dbaron, Assigned: smontagu)

References

Details

(Keywords: perf)

Attachments

(2 files, 1 obsolete file)

nsJISx4501LineBreaker::Next takes time proportional to worse than the square of the number of elements in a line of text. This is demonstrated by the testcases in bug 97229. It causes reflow to take *extremely* long times when there are large numbers of elements in a line. See bug 97229 for more details (see the information on the Cn testcases). With N elements on a line, it takes the following numbers of seconds for a one-line testcase to display on my machine: N time 8192 13 16384 62 32768 400
jprof shows that over 80% of the time loading the 32768-element testcase was spentin nsJISx4501LineBreaker::Next itself. About 6% was spent in the memmove called from nsVoidArray::RemoveElementsAt called from nsLineLayout::ForgetWordFrame. The testcase can be constructed by doing the following: 1st line: "<pre>" 2d line: "<span>0</span><span>1</span>" ... "<span>f</span>" repeated so as to give the correct number of elements. 3d line: "</pre>"
To clarify, I haven't yet investigated how the overall rate of increase in time spent in this function is split between the number of times the function is called and the amount of time a single call takes.
assigning to bstell: and cc'ing self
Assignee: yokoyama → bstell
dave: please work with shanjian. Shanjian- can you take a look at the performance there? I don't believe the Next routines will spend a lot of time....
Assignee: bstell → shanjian
I made 3 testcase, each with 330, 3300 and 33000 elements. The approximate number of call to nsJISx4501LineBreaker::Next is: 330 -> 329, -> 1 call per element 3300 -> 9464 -> 3 calls per element 33000 -> 895043 -> 27 calls per element The performance of a single call of nsJISx4501LineBreaker::Next is O(n), but as the number of call increase exponentially, the mentioned behavior is within explaination.
Status: NEW → ASSIGNED
Now the problem is clear. In the testcase proposed by dbaron, there are a lot of elements, each has a frame. When it's time to decide line break, frame is added one by one and sent to nsJISx4501LineBreaker::Next. Since all those element are in a single word, no break can be found and this procedure repeat again and again, with the text buffer sent to nsJISx4501LineBreaker::Next increasing as well. I believe this is a very special scenario. Normally it is breakable between frame and frame, and only in very few situation several frames need to be treated as one unbreakable word. IMO we probably don't need to do anything here.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → WONTFIX
Wontfix isn't appropriate here -- we should fix this. It may or may not be your bug, but it should be fixed -- we shouldn't have quadratic growth on input size.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(The fix may be as simple as not calling the linebreaker at all for preformatted output.)
<pre> is not necessary to reproduce this problem. I still don't think this is something we should sweat on. But if people want to improve this block of code, that is always welcome. So let me reassign this bug to dbaron.
Assignee: shanjian → dbaron
Status: REOPENED → NEW
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla0.9.7
Target Milestone: mozilla0.9.7 → mozilla0.9.8
*** Bug 114420 has been marked as a duplicate of this bug. ***
http://www.toronto.com is a real-world page; when we call view-source on it it takes forever and 92% of that time is in (not under) nsJISx4501LineBreaker::Next Of course there are no newline characters in the source, so this is to be expected...
Blocks: 114584
Target Milestone: mozilla0.9.8 → mozilla0.9.9
*** Bug 122810 has been marked as a duplicate of this bug. ***
*** Bug 111784 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla0.9.9 → Future
dbaron, are you planning to work on this? If not, please assign it to me.
Keywords: nsbeta1
->smontagu, per comment 14. (But see comment 8, although I'm not sure what's easier.)
Assignee: dbaron → smontagu
Status: ASSIGNED → NEW
Target Milestone: Future → ---
QA Contact: andreasb → ylong
Blocks: 87787
Attached patch Something to try (obsolete) — Splinter Review
So... this patch tries to not call the linebreaker when aTextData.mIsBreakable is false in MeasureText(). Two things that could be wrong with the patch: 1) I may misunderstand how the whole thing works.... I assumed that nsJISx4501LineBreaker::Next can't return anything useful if the line is not breakable 2) I'm not sure whether the if((*aStop) && (wordLen == 0)) return dimensions; // 0; lines in ComputeWordFragmentDimensions() should always be executed or only executed when the text is breakable.... Are there any wrapping regression tests this could be tested on? I'm not quite sure under what circumstances the block of code I'm skipping will actually produce usable output. I've been browsing with this patch for a bit with no ill effects.
I just tested that patch on the testcase in bug 104517 (attachment 71611 [details]) and the patched and unpatched builds act exactly alike (and yes, there _are_ places in that testcase where the changes I made change whether code runs or not). Shanjian? Frank? What do you think?
Keywords: review
Attached file testcase
Attachment #71698 - Attachment mime type: text/plain → text/html
Your patch make sense to me. I tried several testcases, and they all looks fine. >>1) I may misunderstand how the whole thing works.... I assumed that >> nsJISx4501LineBreaker::Next can't return anything useful if the line is not >> breakable This seems truth, I could not find a testcase to defeat it. >>2) I'm not sure whether the >> if((*aStop) && (wordLen == 0)) >> return dimensions; // 0; This statement should always be called in either case. There is another thing that I am not absolutely sure. In case "aIsBreakable" is not true, will we ever suffer in performance in certain situation? If there does exists such a case, "aLineBreaker->Next" will be called again and again with expanding buffer. The following patch should fix the problem if it does happen. +#define MAX_LOOK_BACK_FOR_WORDBREAK 80^M if(wordLen > 0)^M {^M memcpy((void*)&(aWordBuf[aRunningWordLen]), bp, sizeof(PRUnichar)*wordLen); ^M ^M PRUint32 breakP=0;^M PRBool needMore=PR_TRUE;^M + PRUint32 startPos = 0;^M + if (aRunningWordLen > MAX_LOOK_BACK_FOR_WORDBREAK)^M + startPos = aRunningWordLen - MAX_LOOK_BACK_FOR_WORDBREAK;^M nsresult lres = aLineBreaker->Next(aWordBuf, aRunningWordLen+wordLen,^M - 0, &breakP, &needMore);^M + startPos, &breakP, &needMore);^M if(NS_SUCCEEDED(lres)) ^M
> There is another thing that I am not absolutely sure. In case "aIsBreakable" is > not true You mean in case it _is_ true but there are no actual places where one can break? How does that situation arise? Would it be triggered by a long run of letter with no spaces? Note that I _really_ don't know this code...
Keywords: mozilla1.0+
>>You mean in case it _is_ true but there are no actual places where one can >>break? Yes. Sorry for my typo. >>How does that situation arise? Would it be triggered by a long run of >>letter with no spaces? I don't know if such situation exist or not. But clearly, a long run of letters with no spaces within the same frame will not trigger such a problem. For the time being, probably your patch is enough. When we do meet such a testcase in future, we can put my additional patch in.
Comment on attachment 71533 [details] [diff] [review] Something to try This patch improve the performance greatly in my testing. But we are still sufferring in such a situation, so further optimization is needed. Please open another bug when this bug is closed. Future optimization probably will have nothing to do with linebreaker.
Attachment #71533 - Flags: review+
I runned the layout regression tests with the patch, they passed, only the usual suspects (aka false postives) did appear.
I have some questions about the patch.... part of it seems to be missing. Where are the callers to these functions that you have added aIsBreakable to? Plus you have some fprintfs in there that probably shouldn't be there.
Comment on attachment 72653 [details] [diff] [review] Same as first patch but -u10 so the callers are obvious and with fprintfs removed sr=roc+moz
Attachment #72653 - Flags: superreview+
nsbeta1+, please work with IQA to test the line breaking test suite
Keywords: nsbeta1nsbeta1+
Is the suite available outside the firewall? ylong?
The test cases in comment #18, #12, #13 not helping?
Comment on attachment 72653 [details] [diff] [review] Same as first patch but -u10 so the callers are obvious and with fprintfs removed a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #72653 - Flags: approval+
Patch checked in. Leaving bug open per shanjian's comments, since memmove is now the biggest offender in the various testcases.... Simon, Shanjian, should we open a new bug on that?
Yes, I suggest to open a new bug and close this one. One bug should only focusing on one issue for ease of regression and management control.
Bug 129167 and bug 129166 filed. Marking this one fixed.
Status: NEW → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Mark as verified for this one cause we have seperate bugs filed. Should the new bugs be nsbeta1? cause "nsbeta1+" for this one.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: