Closed Bug 97157 Opened 24 years ago Closed 24 years ago

Find in page with a large <pre> is very very slow

Categories

(SeaMonkey :: UI Design, defect, P3)

Product:

Component:

Platform:

x86

All

Type:

defect

Priority:

P3

Severity:

major

Tracking

(Not tracked)

Status:

VERIFIED FIXED

Milestone:

mozilla0.9.9

People

(Reporter: jesup, Assigned: akkzilla)

References

(
URL
)

Details

(Keywords: perf)

Attachments

(3 files, 2 obsolete files)

Preliminary patch: Much faster find 24 years ago Akkana Peck 41.99 KB, patch		Details \| Diff \| Splinter Review
Better patch: should build on windows, includes mac build xml 24 years ago Akkana Peck 47.73 KB, patch		Details \| Diff \| Splinter Review
updated diffs for EmbedComponents.xml 24 years ago Joe Francis 10.17 KB, patch		Details \| Diff \| Splinter Review
Latest patch 24 years ago Akkana Peck 61.29 KB, patch	jesup : review+	Details \| Diff \| Splinter Review
Patch: fix problems skipping over whitespace 24 years ago Akkana Peck 2.01 KB, patch		Details \| Diff \| Splinter Review

Randell Jesup [:jesup] (needinfo me)

Reporter

Description

•

24 years ago

This is a companion to bug 31770 and to bug 95461. Try to do a Find in the lxr page above....

Randell Jesup [:jesup] (needinfo me)

Reporter

Updated

•

24 years ago

Severity: normal → major

Keywords: mozilla1.0

Kathleen :Brade

Updated

•

24 years ago

Keywords: perf

Updated

•

24 years ago

Blocks: 91351

Comment 1

•

24 years ago

Cc sfraser.

Priority: -- → P3

Target Milestone: --- → mozilla0.9.5

Updated

•

24 years ago

Status: NEW → ASSIGNED

Comment 2

•

24 years ago

Posting a URL for reference: http://www.mozilla.org/performance/mallocsites.html Right on mozilla org ;-) On an older machine (PII-233 mhz) on Linux 2001083016 this page causes the browser to black out the windows and become unresponsive for several minutes. At several intervals the page renders and appears for a few seconds intermittently.

Comment 3

•

24 years ago

0.9.6 for now.

Target Milestone: mozilla0.9.5 → mozilla0.9.6

Comment 4

•

24 years ago

This works for me now that the fix for 31770 was checked in.

Randell Jesup [:jesup] (needinfo me)

Reporter

Comment 5

•

24 years ago

This bug is still there; Kin is planning on a rewrite of TextServicesDocument to solve it. On an lxr of nsCSSFrameConstructor.cpp (monster file), Finding the second occurance of "HTMLAttribute" (after finding the first) takes ~20 seconds on my build. (The first occurance is on line 20 or so, the second is on line 1200ish. Finding the first takes about as long as the second.) The fix for 31770 will have made this faster than it was, perhaps, but it's still REALLY slow on large <pre>s.

Comment 6

•

24 years ago

Huh. It only takes two seconds to go from the first instance, at line 32, to the second instance, at line 1256. I'm using 2001092403 on Win2K, on a PII-500. Considering this is a 12000-line file, I find this speed to be acceptable. Not ideal, a vast improvement. If there's more work coming to speed this up even more then great, I guess.

Updated

•

24 years ago

Blocks: 106961

Comment 7

•

24 years ago

--> Mozilla0.9.9 Trying to load-balance bug list across milestones. I will pull this bug in to a sooner milestone if I can.

Target Milestone: mozilla0.9.6 → mozilla0.9.9

Assignee

Comment 8

•

24 years ago

On Kin's request, I'm looking into this.

Assignee: kin → akkana

Status: ASSIGNED → NEW

Target Milestone: mozilla0.9.9 → mozilla0.9.8

Assignee

Comment 9

•

24 years ago

Attached patch Preliminary patch: Much faster find (obsolete) — Details — Splinter Review

Assignee

Comment 10

•

24 years ago

It's time to start soliciting comments. It seems to be working fairly well for my small tests so far. In a debug build, my worst-case test (search to the end of the lxr page for nsCSSFrameConstructor.cpp), it takes 1/4 the time of the old find (3 seconds vs. 12 on an Athlon 1600); for incremental finds the gain is substantially larger. As is, this replaces the find used by the browser. XUL/JS code to make the editor use the new find will be showing up momentarily in a separate bug. Comment about the implementation, which reviewers need to know: First, the code I used for IsBlock gets that information from the parser (technically the "right" owner of this information). However, to do this introduces a dependency from embedding/components/find to parser and necko. This may not be kosher. It's easy enough to do what everybody else does and just duplicate the list of nodes thought to be block, and maintain it ourselves; this is ugly but introduces no dependencies, and I can easily rewrite it that way if the dependencies are a problem. All the TEXT_SVCS_TEST stuff in nsWebBrowserFind is code I'm using for performance testing -- I can switch back and forth between old and new find by setting an environment variable (the timing stuff will probably only work on Unix). My plan is to remove it all before checking in, but until then I want to leave it in to be able to get easy performance comparison numbers. Other limitations/issues as noted in comments in the code. Simon, I know it's a lot of code, but do you have any cycles to glance at this? Even if you don't have time for a code review, I'd appreciate comments on the embedding dependency issue and any other concerns you might have; if you do have time to look at the code it would be much appreciated. Of course, anyone else on the cc list is also welcome to review or offer comments.

Status: NEW → ASSIGNED

Comment 11

•

24 years ago

I will be glad to review this when I get back in early January.

Assignee

Comment 12

•

24 years ago

Note: the string APIs have changed again, so line 357 of nsFind must change to read: ToLowerCase(patStr); I'll assume I don't have to make a new patch just for that. Still looking for reviews ... This needs to go in by next Wednesday if it's going to make the milestone. Please?

Simon Fraser [no longer active]

Comment 13

•

24 years ago

How final is the patch? I'll review when you say it's ready to go in ;)

Assignee

Comment 14

•

24 years ago

Thanks! As far as I know, it's ready (modulo the #ifdef'ed testing code which should probably be removed). There may be issues I've missed, though, hence the plea for reviews to get more sets of eyes on it.

Randell Jesup [:jesup] (needinfo me)

Reporter

Comment 15

•

24 years ago

I'll try to look at it also this weekend

Assignee

Comment 16

•

24 years ago

Attached patch Better patch: should build on windows, includes mac build xml (obsolete) — Details — Splinter Review

Here's another patch which I think combines Kin's changes and mine. I've included the mods to the macbuild xml file, and Kin's changes to make it build on windows. Kin says it's not working right on Windows. Unfortunately my build isn't working today (for possibly unrelated reasons) so we haven't gotten to the bottom of this yet. Reviews are ongoing and I do still need reviews, and would love to have an indication of whether these XML files do the right thing on the mac build.

Attachment #62247 - Attachment is obsolete: true

Randell Jesup [:jesup] (needinfo me)

Reporter

Comment 17

•

24 years ago

I want to see this in, if at all possible, for 0.9.8. Comments: -- Use nsnull instead of 0 in things like "mRange = 0" (for any pointer, especially an nsCOMPtr). -- nsFind::SetRange() doesn't create an iterator if it wasn't already created (could crash). Since you have code elsewhere for creating the iterator, you may want to break that into a private method. -- Minor: don't we have simpler ways of defining straightforward attribute methods? -- In AddIterNode(), you drop whitespace-only nodes. Won't this cause problems if you have something like this: <b>foo</b> <i>bar</i> and search for "foo bar"? Similarly for cases of searching for multiple whitespaces where something breaks the whitespace into a separate node. + // nsIContentIterator.h says Next() will return error at end, + // but it doesn't really, so we have to check: -- You should patch the nsContentIterator.h file if it lies -- There are a couple of tabs; kill them Other than that, looks quite sane. Correct those, and r=rjesup@wgate.com. We should get r/sr's from Kin and JFrancis too I think.

Assignee

Updated

•

24 years ago

Blocks: 116405

Comment 18

•

24 years ago

Attached patch updated diffs for EmbedComponents.xml — Details — Splinter Review

I had some trouble with the mac project canges, so I created a new diff for the .xml file.

Assignee

Comment 19

•

24 years ago

Attached patch Latest patch — Details — Splinter Review

Here's a new patch, based on inputs from rjesup, kin, jfrancis. It addresses: - the issues rjesup brought up - some build issues on mac and windows - Joe's mac build xml file - check pref "browser.new_find" (off by default) and only use the new behavior if that's set - don't save state between finds; instead, find from the selection position - climb up to find nearest block parent and compare with previous one, and clearQ if they're different (because iterator doesn't give us the parent node in both directions) - comment out iterator pre order since it doesn't currently work, and so don't SkipNode in forward direction I think this one should be safe to check in. Joe's going to look at it after dinner.

Attachment #64913 - Attachment is obsolete: true

Randell Jesup [:jesup] (needinfo me)

Reporter

Comment 20

•

24 years ago

Looks good. What's the answer to my paranoid worry: -- In AddIterNode(), you drop whitespace-only nodes. Won't this cause problems if you have something like this: <b>foo</b> <i>bar</i> and search for "foo bar"? Similarly for cases of searching for multiple whitespaces where something breaks the whitespace into a separate node. So long as you have an answer for that (I'm probably mis-understanding the effect fo dropping whitespace nodes), r=rjesup@wgate.com

Randell Jesup [:jesup] (needinfo me)

Reporter

Updated

•

24 years ago

Attachment #65181 - Flags: review+

Comment 21

•

24 years ago

r=jfrancis for initial landing, with preference set to disable new find code pending further fixes.

Comment 22

•

24 years ago

i meant to add to above comment that i did build testing and runtime testing on mac for this.

Boris Zbarsky [:bzbarsky]

Comment 23

•

24 years ago

A thought. You need to find whether the _node_ is a block or whether the node's _frame_ is a block? Would it not make more sense to get the primary frame for the content and check whether it's a percentage base?

Assignee

Updated

•

24 years ago

Blocks: 80805

Assignee

Comment 24

•

24 years ago

The patch (with a few last-minute tweaks for problems Joe discovered) has gone in, preffed out by default (because it has some problems that still need to be worked out). To turn it on, user_pref("browser.new_find", true). On Unix platforms it will print out timing information, so you can do timing tests by flipping the pref and noting the times. Bumping the bug to the next milestone for getting it turned on, getting all the issues worked out and removing the old code and the old files.

Target Milestone: mozilla0.9.8 → mozilla0.9.9

Assignee

Updated

•

24 years ago

Blocks: 95461

Comment 25

•

24 years ago

Some problems I'm seeing with new find turned on: 1. It seems to only work with the page that exists when you *first* bring up the find dialog ... if you visit another page and leave the find dialog up, it stops working, probably due to the fact that nsWebBrowserFind::SearchInFrame() doesn't reset the doc used in mFind. 2. Searching for a string like "read Mozilla" on www.mozilla.org finds the sub string in: For more info about us, read <a href="foo">Mozilla at a Glance</a> but only the word "read" is hilited. 3. If I load the lxr page for nsCSSFrameConstructor.cpp, and search for "pseudo", it seems to find the first occurrence, but subsequent finds do nothing ... no not found dialog, no hiliting or scrolling of the next "pseudo" word. The first "pseudo" word remains hilited. This is probably because find is not starting it's search after the end of the current selection when going forward. (Note it should start before the current selection when searching backwards too.)

Comment 26

•

24 years ago

Just jotting down some of the notes I mentinoed to akk when looking at the first pass impl of the nsIFind interface ... - I really think we should remove dependencies on doc, presshell, and selection from this low-level interface, it would allow us to use the implementation on different types of DOM based trees like the ones used in text and other types of widgets. It would also allow things like spellcheckers to find all occurrences of a specific word, without disturbing the current selection. It would also open up a bunch of possibilities for using find via JS for things that we haven't thought of yet. - We should remove wrapping from nsIFind and let the app/web-browser-find handle it, or provide an app-level convenience interface that implemented it. I say this because some apps want more control or notification before a search actually wraps within a particular doc. The example that comes to mind would be searching through a multi-frame set of docs ... some apps may prefer to search from a certain point within a one of docs in a frame, let the find proceed through all the frames after it, then wrap to the top frame, continuing till you hit the doc we started in. - The find method should probably take 2 ranges, one that says only search between these 2 endpoints, and the other an indicator of where to start the search from. - Lose the references to replace and let the app handle it.

Assignee

Comment 27

•

24 years ago

Randell: your paranoid worry is probably correct. In fact, that attempt at optimization is probably not safe and should be removed. I'll do that in the next iteration while tracking down the other issues that have come up. Re the document not being changed: nsWebBrowserFind has code that supposedly changes the document if the frame isn't the same as before ... but it doesn't work (that != never actually fires). I wonder if that's what Simon was trying to debug when he put the big #ifdef DEBUG_smfr right near there? The old code works because it resets the document for each find anyway; I hope to avoid that and only reset when the document has actually changed (which I can probably do by storing a weak pointer to the document in nsWebBrowserFind if storing the frame doesn't do the trick). Agree with Kin that we should offer a low-level interface that doesn't require knowing about the selection, pres shell, etc. I'll be doing that in the next iteration (there wasn't time last night). I also agree about removing the wrapping from the low-level interface. The tentative plan is that nsIWebBrowserFind will be the easy-to-use interface that takes docshells and handles wrapping, and nsIFind will be the low-level interface that just returns a range.

Assignee

Comment 28

•

24 years ago

Boris: I've been warned that GetPrimaryFrameFor is very slow and should be avoided when possible. But of course, one should always test assertions like that rather than believing them blindly, so it may be worth trying.

Boris Zbarsky [:bzbarsky]

Comment 29

•

24 years ago

GetPrimaryFrameFor can be slow, yes.... Especially on table elements. My concern is what happens if the page contains something like <span style="display:block"> Some random </span> <span style="display:block"> text in here </span> Will Find end up finding "random text" in such a page? :)

Comment 30

•

24 years ago

just documenting some old news so we don't forget: for a good time, search www.mozilla.org/start for "a page". The behavior here may shed some light on difficulties with the find algorithm.

Comment 31

•

24 years ago

Could this in some way be releated to a bug i reported here: http://bugzilla.mozilla.org/show_bug.cgi?id=98835 Some of the behaviour between this and my bug is identical. The high CPU usage, eating up a _lot_ of ram. Also after visiting the page that was given as test case for this bug, i also noticed the same behaviour as with my bug. Memory is not released, lots of CPU cycles spend on seemingly nothing (nothing visual). Also when going to the test page, back here, and to the test page again, keeps increasing the memory usage of the browser. As i wrtite this, mozilla is using 140Mb of memory, and i've only visited 2 pages plus twice the test page here, plus once the test page i included in my bug report.

sairuh (rarely reading bugmail)

Comment 32

•

24 years ago

mass moving open bugs pertaining to find in page/frame to pmac@netscape.com as qa contact. to find all bugspam pertaining to this, set your search string to "AppleSpongeCakeWithCaramelFrosting".

QA Contact: sairuh → pmac

Assignee

Updated

•

24 years ago

Blocks: 122061

Assignee

Comment 33

•

24 years ago

The problem with the mozilla start page was a problem with the way the algorithm skipped multiple adjacent whitespace. When we hit the first non-space after a block of whitespace, we weren't advancing to the next non-whitespace character in the pattern string. While I was there, I noticed a possible buffer over/underrun in the "search to the last space" code, so I also fixed that. Can I get a review for that? I'll work separately on the problems moving between pages -- I need to restructure the high level/low level code as Kin and I discussed.

Assignee

Comment 34

•

24 years ago

Attached patch Patch: fix problems skipping over whitespace — Details — Splinter Review

Assignee

Comment 35

•

24 years ago

For anyone interested: issues with the API splitting will be addressed in newly opened bug 123087 (this bug was getting cumbersome and it was hard to keep track of what needed to be done).

Updated

•

24 years ago

Keywords: nsbeta1+

Assignee

Comment 36

•

24 years ago

The API split is in. This bug is now tracking turning the feature on by default.

Assignee

Comment 37

•

24 years ago

It's on by default!

Status: ASSIGNED → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

Assignee

Comment 38

•

24 years ago

Clarification: it's on by default in the browser. Working on the editor find/replace ... bug 80805.

Comment 39

•

24 years ago

is there another bug on ripping out the old find code then?

Assignee

Comment 40

•

24 years ago

Just filed bug 126312 -- thanks for the reminder.

Comment 41

•

24 years ago

Akkana, is this one really fixed? 1. Click on the url that lists on comment#2, http://www.mozilla.org/performance/mallocsites.html it still hangs. (windows 2000, commercial build: 2002-02-22-10-trunk). 2. Click on the orginal url: http://lxr.mozilla.org/seamonkey/source/layout/html/style/src/nsCSSFrameConstructor.cpp Then, quickly, hit ^g for several times in "find in this page" dialog with this text: nsobject, it crashes. This probably is a different issue so I already filed it: http://bugzilla.mozilla.org/show_bug.cgi?id=127295 Let me know. Thanks!

Assignee

Comment 42

•

24 years ago

Patty: can you check your prefs (prefs.js and user.js) and make sure you don't have "browser.new_find" set to false somewhere? With a debug build from today, nsCSSFrameCnstructor.cppI get: new find, first nsobject: 0.1 sec new find, second (not found): 1.3 sec old find, first nsobject: 5.8 sec old find, second (not found): 11.1 sec I don't see the crash, but I'm not sure I'm trying the right way to reproduce it -- accel-G doesn't seem to do anything for me, either in the find dialog or in the browser window when the find dialog is up, so I just tried hitting return in the find dialog several times. Any chance you can get a stack trace for the crash? Does it happen repeatably when new_find is true (the default now) but not when it's false, or does it happen either way? Does it happen on earlier builds?

Comment 43

•

24 years ago

Thanks Akkana! Verified on windows (commercial netscape build: 2002-02-27-06-trunk) new find, first nsobject: 0.01 sec new find, second (not found): 0.09 sec

Status: RESOLVED → VERIFIED

Myk Melez [:myk] [@mykmelez]

Updated

•

21 years ago

Product: Core → Mozilla Application Suite

Updated

•

17 years ago

Component: XP Apps: GUI Features → UI Design

You need to log in before you can comment on or make changes to this bug.