Open Bug 453605 Opened 14 years ago Updated 1 month ago
getAttributes() can take an extraordinarily long time to return given a sufficiently large text object.
Steps to reproduce:
1. Load the test case
2. Launch Accerciser
3. In Accerciser's tree of accessibles, highlight the document frame (note, Accerciser might take a few seconds to respond)
4. In Accerciser's iPython console type the following:
    text = acc.queryText()<Return>
    text.getAttributes(0)<Return>

Expected results: getAttributes() would return instantly.

Actual results: getAttributes() takes at least 6 seconds and sometimes 8 seconds to return.

Specs of the box where I tried this: AMD Athlon 64 X2 (Dual Core) 4200+ with 4GB RAM.

Granted, to highlight the issue I did place all of the text from "War and Peace" in the document frame object, which is rather extreme. :-) But please bear with me BECAUSE:
1. There aren't even any text attributes present other than the default. All I did was copy and paste the contents and add <br /> tags. It's taking 6-8 seconds on a box with a dual core processor and 4 GB RAM to find out that, sure enough, there are no non-default text attributes. :-)
2. If you paste all of the content into a Gedit document and repeat the test, getAttributes() returns *instantly.*
3. If you open the test case in Gedit (with syntax highlighting enabled) and spot-check different offsets, getAttributes() returns *instantly* (correctly reporting the attributes).

Thus it would seem to be a Firefox/Mozilla bug.

That leaves the question: does this actually occur "in the wild"? The answer seems to be "yes and no." Michael Pedersen (who I believe is on the browser-china-atf alias, and who should feel free to chime in) has pointed out examples of eBooks (Bookshare as I recall) and also certain search results from the Library of Congress' National Library Service (aka the Braille and Talking Book folks) digital download catalog. Due to various and sundry issues with passwords, copyright, and the like, I couldn't make a test case based on those.

However, it seems that there are at least a healthy number of eBooks with sufficiently large text objects to make this a problem. Not a 6-8 second problem, but, say, a 1-2 second problem. Still too long for getAttributes() to return, I would argue.

In addition, considering real-world cases, given a line with a single bold word in the middle, there are three segments of text for which attributes must be obtained rather than one. Add another bold word somewhere else in the middle of that line, and now we're up to five segments. If getAttributes() took only a quarter of a second to return, that's still over a full second to get the text attributes for one line of text with a couple of bold words. :-(

Sorry to be long-winded in my report, and thanks VERY much in advance for taking a look at it!!
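The segment arithmetic in the report can be sketched as follows. This is only an illustration: FakeText is a hypothetical stand-in for an AT-SPI text object, and it assumes that getAttributes(offset) hands back the attribute string together with the start and end offsets of the run containing that offset. collect_runs and the sample line are likewise made up for the example, and the 0.25 seconds per call is the hypothetical figure from the report, not a measurement.

```python
# Illustrative only: FakeText stands in for an AT-SPI text object.
class FakeText:
    """Mimics getAttributes(offset) -> (attrs, start, end) over fixed runs."""

    def __init__(self, runs):
        # runs: list of (text, attrs) pairs in document order
        self.runs = []
        pos = 0
        for text, attrs in runs:
            self.runs.append((pos, pos + len(text), attrs))
            pos += len(text)
        self.characterCount = pos
        self.calls = 0  # how many times getAttributes() was asked

    def getAttributes(self, offset):
        self.calls += 1
        for start, end, attrs in self.runs:
            if start <= offset < end:
                return attrs, start, end
        raise IndexError(offset)


def collect_runs(text, start, end):
    """Walk [start, end) one attribute run at a time (hypothetical helper)."""
    found = []
    offset = start
    while offset < end:
        attrs, run_start, run_end = text.getAttributes(offset)
        found.append((attrs, max(run_start, start), min(run_end, end)))
        offset = max(run_end, offset + 1)  # always move forward
    return found


# One line of text with two bold words in the middle.
line = FakeText([
    ("It was a ", ""),
    ("bold", "weight:bold"),
    (" claim, and a ", ""),
    ("bolder", "weight:bold"),
    (" one followed.", ""),
])
runs = collect_runs(line, 0, line.characterCount)
seconds_per_call = 0.25  # hypothetical per-call cost taken from the report
print(len(runs), "segments,", line.calls, "calls,",
      line.calls * seconds_per_call, "seconds")
```

With two bold words the walk needs five calls, so at a quarter of a second each the line costs 1.25 seconds, which is the "over a full second" case described above.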
Heh. Bugzilla wouldn't let me upload it as html -- or even zipped up -- due to size. :-) So I put it on my server instead. Your test case choices are:
* http://grain-of-salt.com/foo/wnp.html.zip
* http://grain-of-salt.com/foo/wnp.html

Thanks again!!
No longer blocks: textattra11y
Thanks Alexander. My plan was to comment on those bugs instead of filing this one. But I had initially turned the 274 instances of the string "war" into "<b>war</b>" and in doing so managed to hang Accerciser, hang Firefox, and cause the AT-SPI registry daemon to segfault by asking for the text attributes. :-) After that, I figured it seemed worthy of a dedicated bug. ;-)
Confirmed also on Windows. It takes 4 to 5 seconds on my Intel Core2Duo 2.67 GHz with 3 GB of RAM and Windows XP using AccProbe.
OS: Linux → All
Hardware: PC → All
Do I understand correctly that you call text.getAttributes on the document accessible?
Yes and no.... We call text.getAttributes on the accessible which contains the text of interest. Often that's a paragraph or a section or a heading or .... However, we still find plenty of pages where the text of interest is not inside any such object but instead directly in the page's body, i.e. document frame. In that case, yes indeed we'd wind up calling text.getAttributes on the document accessible.
Ok. So
1) Run through the accessible tree (may help a bit because the a11y tree is a subset of the DOM tree)
2) Do not walk into embedded char objects (should help a lot, but see bug 445677 comment #5)
3) Do not walk the whole subtree every time for every text attribute (I suppose this should bring the main performance win, severalfold)
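Point 3 can be illustrated with a toy model. This is a sketch under stated assumptions, not Gecko's implementation: the text is modelled as a flat list of leaves, each leaf is a dict of attribute values, and ATTR_NAMES, range_per_attribute and range_single_pass are hypothetical names. The first function walks every leaf once per attribute and intersects the results; the second compares all attributes at once while expanding outward from the leaf of interest.

```python
# Illustrative model only; not Gecko's actual data structures.
ATTR_NAMES = ("font-weight", "font-style", "color")


def range_per_attribute(leaves, index):
    """One full walk over all leaves for each attribute, then intersect."""
    visits = 0
    lo, hi = 0, len(leaves) - 1
    for name in ATTR_NAMES:
        value = leaves[index][name]
        same = []
        for leaf in leaves:  # whole walk, repeated for every attribute
            visits += 1
            same.append(leaf[name] == value)
        start = index
        while start > 0 and same[start - 1]:
            start -= 1
        end = index
        while end < len(leaves) - 1 and same[end + 1]:
            end += 1
        lo, hi = max(lo, start), min(hi, end)
    return lo, hi, visits


def range_single_pass(leaves, index):
    """Expand outward from the leaf at index; stop at the first difference."""
    visits = 1
    target = leaves[index]
    start = index
    while start > 0:
        visits += 1
        if leaves[start - 1] != target:
            break
        start -= 1
    end = index
    while end < len(leaves) - 1:
        visits += 1
        if leaves[end + 1] != target:
            break
        end += 1
    return start, end, visits


plain = {"font-weight": "normal", "font-style": "normal", "color": "black"}
bold = {**plain, "font-weight": "bold"}

# Three bold leaves surrounded by plain text.
mixed = [plain] * 1000 + [bold] * 3 + [plain] * 1000
print(range_per_attribute(mixed, 1001))
print(range_single_pass(mixed, 1001))

# The case from this bug: no attribute ever changes.
uniform = [plain] * 2003
print(range_per_attribute(uniform, 1001)[2], range_single_pass(uniform, 1001)[2])
```

Both functions report the same range. In the uniform case the single pass still has to touch every leaf once, so in this model the saving is a factor equal to the number of attributes, which matches the "severalfold" expectation rather than an instant return.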
I don't see this as blocking 1.9.1. Please re-nom if you disagree. 1.9.1 is a time-based release with a clear set of features defining the release. This seems like an edge case that wouldn't hold back the release.
Flags: blocking1.9.1? → blocking1.9.1-
Joanie, can you explain again why you have to consume all text attributes at the doc level for support of text attributes in Orca?
What's the status of this bug, since bug 475522 was fixed?
(In reply to comment #10)
> What's the status of this bug, since bug 475522 was fixed?

NEW, we need to fix bug 445677 firstly.
(In reply to comment #11)
> (In reply to comment #10)
> > What's the status of this bug, since bug 475522 was fixed?
>
> NEW, we need to fix bug 445677 firstly.

The patch for bug 445677 is ready; it makes this test case 8 times faster. Originally it took 4 seconds; with the patch it takes 0.5 seconds on my machine.
It still takes ~4 secs with trunk from today. I did some simple profiling with sysprof, and most of the ff4 CPU time is spent in call(s) to nsTextAttrsMgr::GetRange; internally, that function seems to spend most of its CPU time in the Equal and TextLength funcs. I'm attaching the sysprof XML log and a quick screenshot.
(In reply to comment #15)
> Created attachment 505071 [details]
> Screenshot of profiling data for quick view

Note that those numbers are for the whole CPU time, and Firefox's CPU time is 87,19
Fer, did you profile on a non-debug build?
Good catch. Repeating with an optimized non-debug build.
Severity: major → S4
Priority: -- → P3