Closed Bug 97157 Opened 23 years ago Closed 23 years ago

Find in page with a large <pre> is very very slow

Categories

(SeaMonkey :: UI Design, defect, P3)

x86
All

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.9

People

(Reporter: jesup, Assigned: akkzilla)

References

()

Details

(Keywords: perf)

Attachments

(3 files, 2 obsolete files)

This is a companion to bug 31770 and to bug 95461. Try to do a Find in the lxr page above....
Severity: normal → major
Keywords: mozilla1.0
Keywords: perf
Blocks: 91351
Cc sfraser.
Priority: -- → P3
Target Milestone: --- → mozilla0.9.5
Status: NEW → ASSIGNED
Posting a URL for reference: http://www.mozilla.org/performance/mallocsites.html Right on mozilla org ;-) On an older machine (PII-233 mhz) on Linux 2001083016 this page causes the browser to black out the windows and become unresponsive for several minutes. At several intervals the page renders and appears for a few seconds intermittently.
0.9.6 for now.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
This works for me now that the fix for 31770 was checked in.
This bug is still there; Kin is planning on a rewrite of TextServicesDocument to solve it. On an lxr of nsCSSFrameConstructor.cpp (monster file), Finding the second occurance of "HTMLAttribute" (after finding the first) takes ~20 seconds on my build. (The first occurance is on line 20 or so, the second is on line 1200ish. Finding the first takes about as long as the second.) The fix for 31770 will have made this faster than it was, perhaps, but it's still REALLY slow on large <pre>s.
Huh. It only takes two seconds to go from the first instance, at line 32, to the second instance, at line 1256. I'm using 2001092403 on Win2K, on a PII-500. Considering this is a 12000-line file, I find this speed to be acceptable. Not ideal, a vast improvement. If there's more work coming to speed this up even more then great, I guess.
Blocks: 106961
--> Mozilla0.9.9 Trying to load-balance bug list across milestones. I will pull this bug in to a sooner milestone if I can.
Target Milestone: mozilla0.9.6 → mozilla0.9.9
On Kin's request, I'm looking into this.
Assignee: kin → akkana
Status: ASSIGNED → NEW
Target Milestone: mozilla0.9.9 → mozilla0.9.8
It's time to start soliciting comments. It seems to be working fairly well for my small tests so far. In a debug build, my worst-case test (search to the end of the lxr page for nsCSSFrameConstructor.cpp), it takes 1/4 the time of the old find (3 seconds vs. 12 on an Athlon 1600); for incremental finds the gain is substantially larger. As is, this replaces the find used by the browser. XUL/JS code to make the editor use the new find will be showing up momentarily in a separate bug. Comment about the implementation, which reviewers need to know: First, the code I used for IsBlock gets that information from the parser (technically the "right" owner of this information). However, to do this introduces a dependency from embedding/components/find to parser and necko. This may not be kosher. It's easy enough to do what everybody else does and just duplicate the list of nodes thought to be block, and maintain it ourselves; this is ugly but introduces no dependencies, and I can easily rewrite it that way if the dependencies are a problem. All the TEXT_SVCS_TEST stuff in nsWebBrowserFind is code I'm using for performance testing -- I can switch back and forth between old and new find by setting an environment variable (the timing stuff will probably only work on Unix). My plan is to remove it all before checking in, but until then I want to leave it in to be able to get easy performance comparison numbers. Other limitations/issues as noted in comments in the code. Simon, I know it's a lot of code, but do you have any cycles to glance at this? Even if you don't have time for a code review, I'd appreciate comments on the embedding dependency issue and any other concerns you might have; if you do have time to look at the code it would be much appreciated. Of course, anyone else on the cc list is also welcome to review or offer comments.
Status: NEW → ASSIGNED
I will be glad to review this when I get back in early January.
Note: the string APIs have changed again, so line 357 of nsFind must change to read: ToLowerCase(patStr); I'll assume I don't have to make a new patch just for that. Still looking for reviews ... This needs to go in by next Wednesday if it's going to make the milestone. Please?
How final is the patch? I'll review when you say it's ready to go in ;)
Thanks! As far as I know, it's ready (modulo the #ifdef'ed testing code which should probably be removed). There may be issues I've missed, though, hence the plea for reviews to get more sets of eyes on it.
I'll try to look at it also this weekend
Here's another patch which I think combines Kin's changes and mine. I've included the mods to the macbuild xml file, and Kin's changes to make it build on windows. Kin says it's not working right on Windows. Unfortunately my build isn't working today (for possibly unrelated reasons) so we haven't gotten to the bottom of this yet. Reviews are ongoing and I do still need reviews, and would love to have an indication of whether these XML files do the right thing on the mac build.
Attachment #62247 - Attachment is obsolete: true
I want to see this in, if at all possible, for 0.9.8. Comments: -- Use nsnull instead of 0 in things like "mRange = 0" (for any pointer, especially an nsCOMPtr). -- nsFind::SetRange() doesn't create an iterator if it wasn't already created (could crash). Since you have code elsewhere for creating the iterator, you may want to break that into a private method. -- Minor: don't we have simpler ways of defining straightforward attribute methods? -- In AddIterNode(), you drop whitespace-only nodes. Won't this cause problems if you have something like this: <b>foo</b> <i>bar</i> and search for "foo bar"? Similarly for cases of searching for multiple whitespaces where something breaks the whitespace into a separate node. + // nsIContentIterator.h says Next() will return error at end, + // but it doesn't really, so we have to check: -- You should patch the nsContentIterator.h file if it lies -- There are a couple of tabs; kill them Other than that, looks quite sane. Correct those, and r=rjesup@wgate.com. We should get r/sr's from Kin and JFrancis too I think.
Blocks: 116405
I had some trouble with the mac project canges, so I created a new diff for the .xml file.
Attached patch Latest patchSplinter Review
Here's a new patch, based on inputs from rjesup, kin, jfrancis. It addresses: - the issues rjesup brought up - some build issues on mac and windows - Joe's mac build xml file - check pref "browser.new_find" (off by default) and only use the new behavior if that's set - don't save state between finds; instead, find from the selection position - climb up to find nearest block parent and compare with previous one, and clearQ if they're different (because iterator doesn't give us the parent node in both directions) - comment out iterator pre order since it doesn't currently work, and so don't SkipNode in forward direction I think this one should be safe to check in. Joe's going to look at it after dinner.
Attachment #64913 - Attachment is obsolete: true
Looks good. What's the answer to my paranoid worry: -- In AddIterNode(), you drop whitespace-only nodes. Won't this cause problems if you have something like this: <b>foo</b> <i>bar</i> and search for "foo bar"? Similarly for cases of searching for multiple whitespaces where something breaks the whitespace into a separate node. So long as you have an answer for that (I'm probably mis-understanding the effect fo dropping whitespace nodes), r=rjesup@wgate.com
Attachment #65181 - Flags: review+
r=jfrancis for initial landing, with preference set to disable new find code pending further fixes.
i meant to add to above comment that i did build testing and runtime testing on mac for this.
A thought. You need to find whether the _node_ is a block or whether the node's _frame_ is a block? Would it not make more sense to get the primary frame for the content and check whether it's a percentage base?
Blocks: 80805
The patch (with a few last-minute tweaks for problems Joe discovered) has gone in, preffed out by default (because it has some problems that still need to be worked out). To turn it on, user_pref("browser.new_find", true). On Unix platforms it will print out timing information, so you can do timing tests by flipping the pref and noting the times. Bumping the bug to the next milestone for getting it turned on, getting all the issues worked out and removing the old code and the old files.
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Blocks: 95461
Some problems I'm seeing with new find turned on: 1. It seems to only work with the page that exists when you *first* bring up the find dialog ... if you visit another page and leave the find dialog up, it stops working, probably due to the fact that nsWebBrowserFind::SearchInFrame() doesn't reset the doc used in mFind. 2. Searching for a string like "read Mozilla" on www.mozilla.org finds the sub string in: For more info about us, read <a href="foo">Mozilla at a Glance</a> but only the word "read" is hilited. 3. If I load the lxr page for nsCSSFrameConstructor.cpp, and search for "pseudo", it seems to find the first occurrence, but subsequent finds do nothing ... no not found dialog, no hiliting or scrolling of the next "pseudo" word. The first "pseudo" word remains hilited. This is probably because find is not starting it's search after the end of the current selection when going forward. (Note it should start before the current selection when searching backwards too.)
Just jotting down some of the notes I mentinoed to akk when looking at the first pass impl of the nsIFind interface ... - I really think we should remove dependencies on doc, presshell, and selection from this low-level interface, it would allow us to use the implementation on different types of DOM based trees like the ones used in text and other types of widgets. It would also allow things like spellcheckers to find all occurrences of a specific word, without disturbing the current selection. It would also open up a bunch of possibilities for using find via JS for things that we haven't thought of yet. - We should remove wrapping from nsIFind and let the app/web-browser-find handle it, or provide an app-level convenience interface that implemented it. I say this because some apps want more control or notification before a search actually wraps within a particular doc. The example that comes to mind would be searching through a multi-frame set of docs ... some apps may prefer to search from a certain point within a one of docs in a frame, let the find proceed through all the frames after it, then wrap to the top frame, continuing till you hit the doc we started in. - The find method should probably take 2 ranges, one that says only search between these 2 endpoints, and the other an indicator of where to start the search from. - Lose the references to replace and let the app handle it.
Randell: your paranoid worry is probably correct. In fact, that attempt at optimization is probably not safe and should be removed. I'll do that in the next iteration while tracking down the other issues that have come up. Re the document not being changed: nsWebBrowserFind has code that supposedly changes the document if the frame isn't the same as before ... but it doesn't work (that != never actually fires). I wonder if that's what Simon was trying to debug when he put the big #ifdef DEBUG_smfr right near there? The old code works because it resets the document for each find anyway; I hope to avoid that and only reset when the document has actually changed (which I can probably do by storing a weak pointer to the document in nsWebBrowserFind if storing the frame doesn't do the trick). Agree with Kin that we should offer a low-level interface that doesn't require knowing about the selection, pres shell, etc. I'll be doing that in the next iteration (there wasn't time last night). I also agree about removing the wrapping from the low-level interface. The tentative plan is that nsIWebBrowserFind will be the easy-to-use interface that takes docshells and handles wrapping, and nsIFind will be the low-level interface that just returns a range.
Boris: I've been warned that GetPrimaryFrameFor is very slow and should be avoided when possible. But of course, one should always test assertions like that rather than believing them blindly, so it may be worth trying.
GetPrimaryFrameFor can be slow, yes.... Especially on table elements. My concern is what happens if the page contains something like <span style="display:block"> Some random </span> <span style="display:block"> text in here </span> Will Find end up finding "random text" in such a page? :)
just documenting some old news so we don't forget: for a good time, search www.mozilla.org/start for "a page". The behavior here may shed some light on difficulties with the find algorithm.
Could this in some way be releated to a bug i reported here: http://bugzilla.mozilla.org/show_bug.cgi?id=98835 Some of the behaviour between this and my bug is identical. The high CPU usage, eating up a _lot_ of ram. Also after visiting the page that was given as test case for this bug, i also noticed the same behaviour as with my bug. Memory is not released, lots of CPU cycles spend on seemingly nothing (nothing visual). Also when going to the test page, back here, and to the test page again, keeps increasing the memory usage of the browser. As i wrtite this, mozilla is using 140Mb of memory, and i've only visited 2 pages plus twice the test page here, plus once the test page i included in my bug report.
mass moving open bugs pertaining to find in page/frame to pmac@netscape.com as qa contact. to find all bugspam pertaining to this, set your search string to "AppleSpongeCakeWithCaramelFrosting".
QA Contact: sairuh → pmac
Blocks: 122061
The problem with the mozilla start page was a problem with the way the algorithm skipped multiple adjacent whitespace. When we hit the first non-space after a block of whitespace, we weren't advancing to the next non-whitespace character in the pattern string. While I was there, I noticed a possible buffer over/underrun in the "search to the last space" code, so I also fixed that. Can I get a review for that? I'll work separately on the problems moving between pages -- I need to restructure the high level/low level code as Kin and I discussed.
For anyone interested: issues with the API splitting will be addressed in newly opened bug 123087 (this bug was getting cumbersome and it was hard to keep track of what needed to be done).
Keywords: nsbeta1+
The API split is in. This bug is now tracking turning the feature on by default.
It's on by default!
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Clarification: it's on by default in the browser. Working on the editor find/replace ... bug 80805.
is there another bug on ripping out the old find code then?
Just filed bug 126312 -- thanks for the reminder.
Akkana, is this one really fixed? 1. Click on the url that lists on comment#2, http://www.mozilla.org/performance/mallocsites.html it still hangs. (windows 2000, commercial build: 2002-02-22-10-trunk). 2. Click on the orginal url: http://lxr.mozilla.org/seamonkey/source/layout/html/style/src/nsCSSFrameConstructor.cpp Then, quickly, hit ^g for several times in "find in this page" dialog with this text: nsobject, it crashes. This probably is a different issue so I already filed it: http://bugzilla.mozilla.org/show_bug.cgi?id=127295 Let me know. Thanks!
Patty: can you check your prefs (prefs.js and user.js) and make sure you don't have "browser.new_find" set to false somewhere? With a debug build from today, nsCSSFrameCnstructor.cppI get: new find, first nsobject: 0.1 sec new find, second (not found): 1.3 sec old find, first nsobject: 5.8 sec old find, second (not found): 11.1 sec I don't see the crash, but I'm not sure I'm trying the right way to reproduce it -- accel-G doesn't seem to do anything for me, either in the find dialog or in the browser window when the find dialog is up, so I just tried hitting return in the find dialog several times. Any chance you can get a stack trace for the crash? Does it happen repeatably when new_find is true (the default now) but not when it's false, or does it happen either way? Does it happen on earlier builds?
Thanks Akkana! Verified on windows (commercial netscape build: 2002-02-27-06-trunk) new find, first nsobject: 0.01 sec new find, second (not found): 0.09 sec
Status: RESOLVED → VERIFIED
Product: Core → Mozilla Application Suite
Component: XP Apps: GUI Features → UI Design
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: