Closed Bug 49772 Opened 24 years ago Closed 24 years ago

Selecting in large documents is very slow

Categories

(Core :: DOM: Selection, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla0.9

People

(Reporter: dbaron, Assigned: mjudge)

References

()

Details

(Keywords: perf, regression, Whiteboard: [rtm-][p:1][dogfood-] Fix reviewed and approved, checked into trunk)

Attachments

(4 files)

DESCRIPTION: Making a selection in very large documents has recently (I think!) become very slow. STEPS TO REPRODUCE: * load http://lxr.mozilla.org/seamonkey/source/layout/html/style/src/nsCSSFrameConstructor.cpp * drag the mouse over a few lines of text * wait for the browser to become responsive again ACTUAL RESULTS: * waiting takes 2 minutes and 40 seconds EXPECTED RESULTS: * waiting takes under 100 milliseconds DOES NOT WORK CORRECTLY ON: * Linux, mozilla, 2000-08-21 optimized build WORKS CORRECTLY ON: * older builds, I think ADDITIONAL INFORMATION: This is dogfood. Most of the documents I look at in daily mozilla work (leak logs, profiles, tinderbox logs) are this size (within an order of magnitude or two). I often accidentally drag the mouse over some text. I've stopped using the browser for real work. Period.
The delay in 2000082108 win98 on my PII 450 mHZ w/ 128 MB of 100mhz SDRAM isn't quite as long as you noted, but still significant enough to be noticeable. Will check on linux shortly.
OS: Linux → All
Hardware: PC → All
I agree and will make this [dogfood+]..but not sure this can be fixed ASAP. beppe? mjudge? what do you think? more of a nsbeta3+ for you?
Whiteboard: [dogfood+]
Sorry, got sidetracked earlier. On Linux this is painstakingly slow, as described. Specs: Redhat 6.2, kernel 2.3.99pre9, SMP i686 w/ dual 866 XEON processors. Mozilla 2000082121 Yes, definitely nsbeta3 if can't be done for dogfood.
Keywords: nsbeta3
Target Milestone: --- → M18
I tried to reproduce this, and had confusing results. (NT, 300Mhz, 256MB ram, optimized commercial build from 8/23 late afternoon on sweetlou) The page took so long to load, that I eventually hit stop. This made me a little skeptical about the test case... but I kept going. I was able to select lines of text, and the delay was typically on the order of a second from mouse move, to highlight catching up. Not knowing for sure what a "cenzoic era" is, I can only note that it is less than the 160 seconds reported. Finally, I tried hitting back (to return to this bug), and then the processor went into an infinite loop (100% CPU utilization, and it never got out). (no talkback report... it never crashed... I had to kill it). I suspect there are a couple of bugs in this area... I suspect that the page size is hurting something... although the memory usage for the page load was "only" about 10-15 megs :-/ (I'm not sure how large the annotated source on this page is). This bug has been on the dogfood list for a while, could we get an update on the progress??
i finally got abulid after the zip/perl changes. i am runnig some tests now
Status: NEW → ASSIGNED
Fix summary
Summary: making selection in large document longer than cenozoic era → Selecting in large documents is very slow
I profiled two cases: 1. Selecting in the URL bar 2. Selecting in a complex page (mozilla.org). URL bar ------- For the URL bar, 40% of the time was going into updating text editing commands (cut, copy, clear etc) when the selection changed from collapsed to non- collapsed, which happened 3 times in my trial. 8% went in nsDOMSelection::selectFrames, which filters down into a bunch of CreateInstance calls (subtree iterator, content iterator). We can use NS_NewFoo calls here. nsTextFrame::SetSelected contributes 3.5%, and this is mostly in nsFrame::Invalidate. A flat listing of this URL bar selection stuff, including command updating, shows that GetPrimaryFrameFor/FindPrimaryFrameFor take 6.4/6.3% of the total time respectively. nsHashtable::Get() is next with 3.6%, nsTextFrame::GetPosition() has 3%, and text width measuring code 2.2%. Whole document selection ------------------------ Here the scenario is different. To profile, I shift-clicked a couple of times on the mozilla.org page to extend the selection. 80% of the time is spent in nsDOMSelection::Extend; that's split between nsDOMSelection::selectFrames (38% [nsIPresContext*,nsIDOMRange*,PRBool]+ 10%[nsIPresContext*,nsIContentIterator* etc]) and nsGeneratedSubtreeIterator::Next (28.3%). Burrowing into these routines shows that they recursively call eachother, with most of the time spent in nsGeneratedSubtreeIterator::Next(). And most of the time in nsGeneratedSubtreeIterator::Next() is attributable to nsRange::GetAncestorsAndOffsets(). This nsRange function takes 34% of the entire selection time. This is way ahead of the next most time-consuming function, nsFrame::IsSelectable (3.4%).
ok here is the deal... Its not selection that is slow. bear with me... ok non-pc version (The event system is totaly whacked and doesnt scale for crap.) dressed up version.. *The event system calls GetFrameForPoint many times more than is necessary for EVERY event that hits PresShell::HandleEvent. key-events,mouse-events,ect. *The bigger the file you are looking at the worse it is. *If you have many frames in the same container. Take a PRE tag in the quoted example first reported and add about 6000 lines to it then just move the mouse around the webpage. you will see the same problem as if you had had the mouse button down and were selecting. *From the attachment 49772 [details]prof1.html results: for every call to PresShell::HandleEvent you get 6731.37 calls to nsFrame::GetFrameForPoint. There is allot of recursion happening here so some of the total time numbers have a hard time adding up but it is easy to see where the time is. *we should cache the last rect/frame so that the next event could possibly short circuit the whole system. *we should also see why the search for a line in the container becomes N*M re3gardless what point it is searching for. M=number of lines in container, N= number of events. *I am going to talk to some people then probably zap this over to Joki
moving this over to joki so he can start looking at this one
Assignee: mjudge → joki
Status: ASSIGNED → NEW
Hmmm... what I was seeing when I profiled this with jprof around when I filed the bug was similar to what simon saw in his second comment. I saw most of the time spent in selection code, and I think there is something wrong there unless things have changed (in which case this is no longer as bad as it once was). anthonyd: I think GetFrameForPoint is O(N) on the document size. Do you think it's worse? I have no idea how one would implement your caching suggestion without breaking just about everything. I agree that we probably could make some improvements by starting the search at the last-child and stopping when we find a match.
Removing nsbeta3 nomination because this is already a dogfood bug.
Keywords: nsbeta3
Chris, joki is moving and will not be online for the rest of this week. Please take a look at this bug while he's away. It seems from David Baron's comment that this bug might not be an event system issue and should probably go back to the owner of selection code.
Assignee: joki → saari
CCing mjudge who more or less owns selection and has been working on GetFrameForPoint slowness.
Status: NEW → ASSIGNED
please read anthony's comments on 8/29 -- he and Mike spent several hours on this bug
I spent some time here too, and my data show that it's selection itself that is slow (range operations). dbaron agrees with me. I really think that the editor group need to own this bug.
ok Simon, please sit with Anthony and go through the process with him. Mike and Anthony looked at how the operations work and you saw Anthony's input about that. Since you and David are able to find other data to support selection is the issue, then please jump in and assist with the fix.
Assignee: saari → sfraser
Status: ASSIGNED → NEW
entering url
Status: NEW → ASSIGNED
Testing the scenario of clicking near the top of the document, and shift-clicking near the end, I see a very long delay. Most of the time here seems to be being spent in nsCSSFrameConstructor::FindFrameWithContent(), which is called from GetPrimaryFrameFor(). The generated content iterator is calling this.
i was told that performace work is for m19 is this not still the case? if we are looking at this now just let me know.
Adding kin@netscape.com to Cc list.
perf work is slotted for m19, will move this to that milestone -- we will start with this issue then.
Target Milestone: M18 → M19
I broke this out into 2 other bugs: 1. 51938 -- getting frame from point is slow. Covers the large number of calls to nsFrame::GetFrameForPoint. 2. 51939 -- output is slow on large documents. covers the problem of slow auto- copy on linux after selecting. So this bug should remain to cover the stuff under nsFrame::HandleEvent, i.e. the inefficiencies in range, and selection iterating through content, calling GetPrimaryFrame for a whole bunch of times.
Depends on: 51938, 51939
removed dogfood+, would prefer that this be rtm+ because there are several issues at hand: 51938 needs to be resolved 51939 needs to be resolved 43008 needs to be resolved when those issues are resolved, then most of this issue will be resolved, the remaing work can also be resolved
Keywords: rtm
Priority: P3 → P1
Whiteboard: [dogfood+] → [rtm+][p:1]
Per request, changing to dogfood-minus. Hopefully folks are not being blocked as much by this performance... if so... please re-nominate.
Whiteboard: [rtm+][p:1] → [rtm+][p:1][dogfood-]
Simon, please include the required information per the rtm checkin rules
Whiteboard: [rtm+][p:1][dogfood-] → [rtm+ NEED INFO][p:1][dogfood-]
Giving this to mjudge, since its his selection code that is slow. We still need to wait on resolutions to the two dependent bugs though.
Assignee: sfraser → mjudge
Status: ASSIGNED → NEW
We iterate across the dom and hit frames along the way to select them. if there are 10,000 nodes we will be slow since GetPrimary frame is slow (especially for text frames). we can reduce the number of calls to GetPrimaryFrame by a good factor (maybe 2-3 times) if we just ignore the generated content and NOT highlite it when we cross it. #1 the output system doesnt handle it right now anyway so selecting has no REAL benefit to generated content. #2 there is no available way given to me by troy that can give back the necessary data without a frame to work from. any changes allong the lines of rewriting generated content selection i believe is too late. this fix to stop checking alltogether for generated content while we highlite things is a no brainer. ( like 2 lines) I will make fix and put patch on bug report.
Status: NEW → ASSIGNED
this patch removes all references to nsIGeneratedIterator. the range iterator now iterates over nodes only in the original document. (see explanation in comment above) the large size of diff is only because unnecessary conditionals have been removed. risk estimation: low risk time estimation: easy checkin 1 file low probability of collision gain estimation: off top of head with no quantify data yet (but basing on smfrs profile data) I would say a 40% gain. this gain is ONLY on the shift-click selection aspect of selection. this will not affect the events ect of a drag selection.
adding buster to CC to review patch as a super-reviewer.
a=buster. mike, are you just going to check this into the branch as a hack, or are you going to check this into the trunk as well as "the way things are supposed to be?" I think we should check this into the branch only, and leave this bug open TM=future so we can get generated content and selection working together correctly. But I agree we're out of time for rtm.
I will look into getting it just on the branch and leaving bug open for trunk. with the new non-XIF stuff going in it looks like generated content could be a go for 6.5 quite easily. (that still doesnt help the speed factor of calling getprimary frame however :( )
ok i have an a=buster. now I need an r=. kin? can you take a look for me?
removing + per pdt sw rules
Whiteboard: [rtm+ NEED INFO][p:1][dogfood-] → [rtm NEED INFO][p:1][dogfood-]
ok r=kin. a=buster. checking into branch only. please rtm++
removing rtm NEED INFO to get attention. this is low risk. simple class id change really.
Whiteboard: [rtm NEED INFO][p:1][dogfood-] → [rtm][p:1][dogfood-]
What are the implications for selecting generated content with this change? Where is the significant generated content that is affected by this?
I've already forgotten how to read these reports (where does it show cumulative time spent in a function?), but nonetheless, it's worth noting the portion of the profile routed in nsAutoCopyService::NotifyselectionChanged(). This used to trigger a very inefficient copy operation that would involve lots of expensive range operations. I (with lots of help from jst and others) have rewritten the underlying copy algorithm, and it's a lot faster now. This was part of the noxif landing that occurred Tuesday 10/10/00. I'll wager that selection on linux is a lot faster now in the scenerio measured here. Other platforms do not copy during selection, so they will not be affected by the change.
Nothing has changed here except what is visible. now the selection that could not be copy any way will no longer be displayed as having the selected ability. Post 6.0 we may be able to turn back on the generated iterators since the no-xif stuff will make it so that we may be able to copy the generated content. for now. we simply dont draw what we cant copy. that is the only implication here. this is to fix the problem with scanning for generated content only. simple fix. 1. remove use of generatedcontent iterators. 2. replace them with standard iterators. 3. post 6.0 fix non-xif output system to output the generated content that DOES look selected. a=buster r=kin can we pleeease rtm++ this i am running out of time
hmmm... this is just on the borderline of being too big. The usual solution is to land this on the trunk, and see how it does cooking for a day. Do you have approval to do so? If so, please land there, and then post the results in this bug when you re-nominate (market it rtm+ to atract PDT attention).
Whiteboard: [rtm][p:1][dogfood-] → [rtm need info][p:1][dogfood-]
mjudge and I were talking earlier about converting his patch to use ifdefs instead of just removing the code on the branch. This would allow us to keep the trunk and branch a little more in sync, generate diffs that were easier to look at when we diff'd between trunk and branch, and allow us to turn use of generated content on and off easily. If we have to land on the trunk first, I suggest we use our ifdef version of the patch.
Depends on: 56432
a=buster pending mike's review
fine looks good lets just check this in pleeeeze.
i am told now that i should check this into the trunk. i am taking kins patch and applying it and checking it into the trunk.
ok fix in trunk
changing to rtm+ since a reviewed, approved fix is in hand and has been checked into the trunk. I'll pull your changes now and report back if I find any regressions / problems with the patch.
Whiteboard: [rtm need info][p:1][dogfood-] → [rtm+][p:1][dogfood-] Fix reviewed and approved, checked into trunk
With this having landed on the trunk, I have several questions: a) Can you quantify the perforamnce difference (include platform specs as well as doc spects). Is the change exactly as suggested at the start of the bug (2 minutes becomes less than 1 second?). b) Were there any regressions reported as a consequence of this landing? Marking need-info. Please re-nominated after getting this data.
Whiteboard: [rtm+][p:1][dogfood-] Fix reviewed and approved, checked into trunk → [rtm need info][p:1][dogfood-] Fix reviewed and approved, checked into trunk
is this related? try drag-selecting text on http://www.uptimes.net/bottom.html?show=sactive&start=0&sort=1&refresh=0&showall=1 you should notice heavy cpu usage when making the selection.
no cpu usage WHILE selecting is another bug that was spun off from this one having to do with events. this bug is trying to solve slowness when selecting large chunks of the document at a time. aka click top, shift click on bottom of page. that speed should have increased 25% from simon's profiles he ran. if you want you can think of this as a back out of the generated iterator usage in selection. they didnt use to show generated content selected. with this patch they wont again. 1. simpler codepath 2. no confusion as to what will be copied 3. allready tested and used iterators substituted. i hate to have to run all sorts of profiles again when this isnt a different codepath but a subset. plus there is not a good chance this will not be allowed in anyway. on monday i will run more profiles and maybe i will get lucky.
I'm not asking you to run Quntify or a profiler. I'm asking if the 2 plus minute task became less than 1 second. You should not need a profiler to get this info. I'd like some statement of the user visible performance improvement. Can anyone comment on the performance differential?? There are plenty of builds of the trunk before vs after to compare. The second question is about regressions.
Jim: only those who are building can test this, because the patch that was checked in requires that you build with a certain env var. I'm building now to test.
hmm... mjudge said "fix in trunk," but now blake has said (to parapharse) "...but no one has tested it, because it is not active in standard build." The whole point of "landing it on the trunk" was to test it, and see what kinds of regressions (if any) it caused when the mozilla community used the trunk builds. If we don't have that active testing, then we don't have the feedback we were after. :-( :-( Mike: Can you confirm that you didn't really land the code on the trunk as active code??
Just to be clear. The patch that landed turns off the use of Generated Content iterators in selection with an ifdef. To use Generated content iterators again (the old slow way) you must define USE_SELECTION_GENERATED_CONTENT_ITERATOR_CODE and then rebuild. As jar pointed out it would be easy to compare a trunk build from Friday morning (before this checkin) with Monday mornings trunk build (with checkin) to see the performance gain.
just to recap, yes the environment variable is only nevessary if you want to turn the generated content BACK on. I will check performance but I dont believe this will make 2 min= 100ms. It should still be a good improvement however. i will post less subjective numbers in a little bit.
moving to future
Target Milestone: M19 → Future
Hold on. PDT, the fix for this also fixes a crash which occurs simply as a result of selecting some simple text (see bug 56704). This patch, which has been reviewed, approved, and tested in the trunk for days now, also isn't that big (look at the 10/5/00 patch, which is the patch would be go in the branch -- NOT the 10/13/00 patch). And it does indeed speed up the selection in large documents greatly. We'd be killing two birds with one stone here with a patch known to have little or known risk. Could we please reconsider this?
Whiteboard: [rtm need info][p:1][dogfood-] Fix reviewed and approved, checked into trunk → [rtm+][p:1][dogfood-] Fix reviewed and approved, checked into trunk
Blake, this was moved out to future after our group meeting today, and is in the trunk. Now that I know which milestone to put this in, I am moving this to mozilla0.9 -- which then means it will be marked fixed -- go figure that one!.
Target Milestone: Future → mozilla0.9
Beth, but does this mean there's no hope for the branch? Was the 10/05/00 patch ever evaluated in the context that it also fixes a crasher?
based on jars comment I seriously doubt they will let it get checked in, and I honestly don't know if the crasher bug was tested with this, there are no dependencies set on this bug against 56704, if this fix is really the culprit for that one, then it needs to be made a blocker, and the crash keyword needs to be added
PDT says rtm-, since we didn't see sufficient benefit to offset the risk.
Whiteboard: [rtm+][p:1][dogfood-] Fix reviewed and approved, checked into trunk → [rtm-][p:1][dogfood-] Fix reviewed and approved, checked into trunk
Blocks: 56704
this should be marked fixed since the fix is in the tree right now. trunk = waaaay after this bug patch was completed. if this is still too slow lemme know. open a new bug or if you have to reopen this one.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: