Bug 76722 - Gecko blocks the main thread for > 1 sec processing events
Status: VERIFIED FIXED
Whiteboard: [pdt+][nsBranch+][fixed and verified ...
Keywords: perf
Product: Core
Classification: Components
Component: Layout
Version: Trunk
Hardware: All, OS: All
Importance: -- normal with 1 vote
Target Milestone: mozilla0.9.3
Assigned To: harishd
QA Contact: Chris Petersen
URL: www.cnn.com
Depends on: 87370
Blocks: 71668 83989
Reported: 2001-04-19 12:28 PDT by Phil Peterson
Modified: 2001-07-16 16:12 PDT


Attachments
Debug patch file which alerts when a single reflow or content flush exceeds 1/4 of a second (4.30 KB, patch)
2001-04-23 17:55 PDT, Kevin McCluskey (gone)
Debug patch which includes nsDelayAlarm.h (8.24 KB, patch)
2001-04-25 15:50 PDT, Kevin McCluskey (gone)
nsDelayAlarm.h by itself just in case the patch will not apply (2.47 KB, text/plain)
2001-04-25 15:51 PDT, Kevin McCluskey (gone)
Work in progress - do not check in. Limits processing of tokens by interrupting the parser. www.disney.com does not load completely with this patch (8.06 KB, patch)
2001-05-07 12:20 PDT, Kevin McCluskey (gone)
Work in progress. Better patch, a lot more sites load correctly but banner ad doesn't appear on www.disney.com. Contains numerous hacks that need to be cleaned up. (27.50 KB, patch)
2001-05-14 17:06 PDT, Kevin McCluskey (gone)
patch includes cleanup, dummy parser request to delay the reporting of document done until parsing is complete, prefs to control interruptible parsing (36.78 KB, patch)
2001-05-18 20:48 PDT, Kevin McCluskey (gone)
Interruptible parsing patch which samples the system clock after every 20 tokens are processed (tunable with pref) instead of after every one. (37.21 KB, patch)
2001-05-31 09:23 PDT, Kevin McCluskey (gone)
patch with tuned values for NotificationInterval and MaxTokenProcessing + cleaned up code (34.43 KB, patch)
2001-06-01 16:23 PDT, Kevin McCluskey (gone)
patch with calls to PR_Now replaced with PR_IntervalNow to improve performance as suggested by waterson@netscape.com. Also tuned values to work well with 200 MHz WIN95 machines (34.10 KB, patch)
2001-06-08 11:40 PDT, Kevin McCluskey (gone)
patch incorporating suggestions from attinasi, rickg, and harishd (38.52 KB, patch)
2001-06-13 21:14 PDT, Kevin McCluskey (gone)
patch incorporating additional suggestions from harishd (39.62 KB, patch)
2001-06-14 19:29 PDT, Kevin McCluskey (gone)
Patch which dynamically switches between fast-page load and user-interactive based on user events (9.59 KB, patch)
2001-06-26 18:42 PDT, Kevin McCluskey (gone)
Minor tweaks to patch 40220, removed some compiler warnings, removed nsDelayTimer class, added some comments (10.63 KB, patch)
2001-06-27 12:16 PDT, Kevin McCluskey (gone)
patch which includes the call to GetNotificationInterval() mentioned above + small tweak of defaults for the dynamic switching. (10.78 KB, patch)
2001-06-28 21:06 PDT, Kevin McCluskey (gone)
Patch which prevents dynamic switching until 2 seconds after the document load has begun, to fix an event issue which was causing a performance hit when the pointer is over the window during I-BENCH tests (2.05 KB, patch)
2001-06-29 10:55 PDT, Kevin McCluskey (gone)
patch to delay lowering of the interval, incorporating Marc's suggestions (3.34 KB, patch)
2001-06-29 18:56 PDT, Kevin McCluskey (gone)

Description Phil Peterson 2001-04-19 12:28:28 PDT
One of our embedding customers tells us that they see Gecko blocking the main
thread while processing events. Sometimes one event takes more than 1 sec to
process. kmcclusk, can you look into this and see what you can find?
Comment 1 chris hofmann 2001-04-19 13:04:30 PDT
related to hyatt/darin work on flow control analysis????

do we have bugs filed on that analysis and possible fixes yet?
Comment 2 Kevin McCluskey (gone) 2001-04-19 16:16:59 PDT
Is there a particular site where this is happening? 
Comment 3 Kevin McCluskey (gone) 2001-04-23 17:55:22 PDT
Created attachment 31936 [details] [diff] [review]
Debug patch file which alerts when a single reflow or content flush exceeds 1/4 of a second
Comment 4 Kevin McCluskey (gone) 2001-04-23 18:02:52 PDT
A single content flush and a single reflow on www.cnn.com typically exceed 1
second on a 433 MHz PC running WINNT. While a single content flush or reflow is
being processed, we do not return to the message pump until it has completed.

There are two possible solutions:

1) Implement interruptible reflow and content creation (a big job)
2) Limit the amount of data that is processed by a single content sink flush

I'll explore alternative 2. 
Comment 5 Brendan Eich [:brendan] 2001-04-23 19:10:43 PDT
If you've cvs add'ed nsDelayAlarm.*, you can include them in the patch via cvs
diff -u -N ....

/be
Comment 6 Kevin McCluskey (gone) 2001-04-25 15:50:18 PDT
Created attachment 32209 [details] [diff] [review]
Debug patch which includes nsDelayAlarm.h
Comment 7 Kevin McCluskey (gone) 2001-04-25 15:51:29 PDT
Created attachment 32210 [details]
nsDelayAlarm.h by itself just in case the patch will not apply
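For reference, here is a minimal sketch of the shape such a delay-alarm helper could take. The class name matches the attachment and HasExceeded() matches the usage quoted later in this bug, but the implementation details are assumptions, not the attached nsDelayAlarm.h:

#include "prinrval.h"   // PR_IntervalNow, PR_IntervalToMilliseconds
#include "prtypes.h"    // PRBool, PRUint32

// Sketch only: records a start time and reports whether a wall-clock
// threshold (in milliseconds) has been exceeded since then.
class nsDelayAlarm {
public:
  nsDelayAlarm() : mStart(PR_IntervalNow()) {}

  void Start() { mStart = PR_IntervalNow(); }

  // Returns PR_TRUE if more than aMaxMilliseconds have elapsed since Start().
  PRBool HasExceeded(PRUint32 aMaxMilliseconds) {
    PRIntervalTime elapsed = PR_IntervalNow() - mStart;
    return PR_IntervalToMilliseconds(elapsed) > aMaxMilliseconds;
  }

private:
  PRIntervalTime mStart;
};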
Comment 8 David Hyatt 2001-04-29 18:02:00 PDT
In my profiles, it isn't content creation that takes a lot of time.  It's frame 
creation (which right now happens in the same event as content creation).  It 
is during frame creation that style resolution and image load kickoffs happen, 
and these are the expensive operations.

I don't see a problem with doing many more ContentAppended and ContentInserted 
operations than we do now as long as we fix the O(n^2) problems with appending 
content and get a little smarter about coalescing reflows that occur as a 
result of appending and inserting content.

Comment 9 Kevin McCluskey (gone) 2001-04-30 16:47:36 PDT
I instrumented:

nsParser::OnDataAvailable
nsHTMLContentSink::FlushTags
nsPresShell::ProcessReflowCommand
nsCSSFrameConstructor.cpp   => frame construction.

On www.cnn.com I saw this sequence:

OnDataAvailable -> Chunk of Markup
FlushTags

OnDataAvailable -> Chunk of Markup
FlushTags

....

This repeats until almost the entire page loads; then I finally saw almost all
of the frames constructed as the result of a single call to FlushTags, followed
by a single reflow. Almost all of the frames are created and reflowed in one
shot, which leads to frame construction and reflow delays for a single event
that are greater than one second.

If I save www.cnn.com locally and remove the links to external style sheets, I
see the following sequence:

OnDataAvailable -> Chunk of Markup
FlushTags
Frame Construction -> Frames corresponding to the Markup created
Process Reflow Command

...

OnDataAvailable -> Chunk of Markup
FlushTags
Frame Construction -> Frames corresponding to the Markup created
Process Reflow Command


I discussed this with rickg@netscape.com and attinasi@netscape.com.
The inclusion of the external sheets forces the parser to block until all of the
external style sheets have been loaded. Once they have loaded, the parser is
unblocked and it sends all of the data it was caching while in the blocked state
over to the content sink. This is what causes the majority of the frame
construction and reflowing to be done in one shot.

We also seem to spend too much time away from the message pump when processing
long documents. I will try modifying how much data Necko passes to the parser's
stream listener by changing nsStorageTransport.cpp's MAX_IO_CHUNK from 8192
to 1024.
Comment 10 Kevin McCluskey (gone) 2001-04-30 16:52:40 PDT
To prevent the parser from sending, in one shot, all of the data that was cached
while it was blocked, we would need to limit the amount of data that it can pass
to the content sink and schedule messages to complete the flushing of the
parser's cache.
Comment 11 Kevin McCluskey (gone) 2001-04-30 17:01:48 PDT
"In my profiles, it isn't content creation that takes a lot of time".

I agree. The content sink flush which results in the creation of the frames is
what takes the majority of the time. That's what I meant when I referred to
content flush, not the creation of the content itself, although I erroneously
referred to content creation in solution 1)  :-)
Comment 12 David Hyatt 2001-04-30 18:27:05 PDT
Changing the buffer size is dangerous, since you don't want to stall Necko.  If 
Necko can keep loading that's a good thing.  IMO it's better to make the 
content sink only flush small amounts to the frame constructor and still allow 
the parser to queue up 8192-byte blocks.  I think your original idea of tuning the 
content sink is the way to go here.

Comment 13 David Hyatt 2001-04-30 18:31:37 PDT
Although I guess we should tune the parser a little bit to make sure it doesn't 
synchronously parse a whole huge amount of content. :)  I assume the parser can 
literally grab the entire page from Necko (e.g., 80K) while blocked, and that 
it does something like parse the entire 80K and send a bunch of synchronous 
notifications to the content sink?

If that's what it does, you're right that we should tune the parser to post 
events instead of parsing in one fell swoop.
Comment 14 rickg 2001-04-30 18:54:07 PDT
As I told Kevin and Attinasi, it would be very easy to gate the parser on how
much content to toss to the content model in any one cycle. We will be
discussing this a bit more tomorrow if folks want to join in.
Comment 15 Kevin McCluskey (gone) 2001-05-07 12:20:44 PDT
Created attachment 33398 [details] [diff] [review]
Work in progress - do not check in. Limits processing of tokens by interrupting the parser. www.disney.com does not load completely with this patch
Comment 16 Kevin McCluskey (gone) 2001-05-14 17:06:56 PDT
Created attachment 34433 [details] [diff] [review]
Work in progress. Better patch, a lot more sites load correctly but banner ad doesn't appear on www.disney.com. Contains numerous hacks that need to be cleaned up.
Comment 17 Kevin McCluskey (gone) 2001-05-18 20:48:04 PDT
Created attachment 35267 [details] [diff] [review]
patch includes cleanup, dummy parser request to delay the reporting of document done until parsing is complete, prefs to control interruptible parsing
Comment 18 Cathleen 2001-05-29 11:48:19 PDT
kevin, how are we doing with your patch?  Are we able to run through I-BENCH
test or jrgm's page-load test? 
Comment 19 Kevin McCluskey (gone) 2001-05-31 09:23:47 PDT
Created attachment 36659 [details] [diff] [review]
Interruptible parsing patch which samples the system clock after every 20 tokens are processed (tunable with pref) instead of after every one.
Comment 20 Kevin McCluskey (gone) 2001-05-31 09:30:02 PDT
"Are we able to run through I-BENCH test or jrgm's page-load test?"

I was able to run through I-BENCH and jrgm's page loader test on Linux with a
release build. The page load times with jrgm's test were within my margin of
error of approx. 4%, so I couldn't see any impact from the patch. The I-BENCH
scores for the initial page load time were not consistent. I need to tune the
prefs which control how frequently the parser is interrupted and how often the
system clock is sampled. After I have done some tuning I'll re-run I-BENCH and
jrgm's page loader test.

Comment 21 Kevin McCluskey (gone) 2001-05-31 15:23:33 PDT
Just to clarify: the solution that rickg and I have been working on is really
not the correct solution to this problem in the long term. The correct solution
would be to make reflow and frame construction interruptible. Making them
interruptible would be a large task, so we have been exploring using the
parser's pre-existing incremental/interruptible capability to limit the number
of frames that are created and reflowed in one shot. If both reflow and frame
construction were interruptible, there would be no need to interrupt the
parser's processing of tokens. The patch I have attached interrupts the parser
based on how much time has been spent processing tokens. In addition, the patch
sets up a PLEvent to call the parser to continue processing if the document has
been completely loaded but the parser has been interrupted.
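A hedged sketch of that continue-parsing event, using the PLEvent API of the era. PostContinueEvent and the mEventQueue member are named in the review comments further down this bug, but the bodies here are assumptions, not the actual patch (which uses an nsParserContinueEvent class):

#include "plevent.h"    // PLEvent, PL_InitEvent, PL_GetEventOwner

// Runs on the UI thread when the event loop reaches our event.
static void* PR_CALLBACK HandleParserContinueEvent(PLEvent* aEvent) {
  nsParser* parser = (nsParser*)PL_GetEventOwner(aEvent);
  if (parser)                    // null-check per rickg's review nit below
    parser->ContinueParsing();   // resume token processing; may interrupt again
  return nsnull;
}

static void PR_CALLBACK DestroyParserContinueEvent(PLEvent* aEvent) {
  delete aEvent;
}

nsresult nsParser::PostContinueEvent() {
  if (!mEventQueue)              // check before allocating, per attinasi's review below
    return NS_ERROR_FAILURE;
  PLEvent* ev = new PLEvent;
  if (!ev)
    return NS_ERROR_OUT_OF_MEMORY;
  PL_InitEvent(ev, this, HandleParserContinueEvent, DestroyParserContinueEvent);
  return mEventQueue->PostEvent(ev);  // handled later, back in the message pump
}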
Comment 22 David Hyatt 2001-05-31 15:31:37 PDT
So you can do something like process some tokens, schedule an event, then 
during the processing of that event you parse some more tokens and possibly 
schedule another event, and so on?  That sounds good to me.
Comment 23 Kevin McCluskey (gone) 2001-05-31 15:45:17 PDT
That's right. The continue-parsing PLEvent only needs to be scheduled if the
parser's OnDataAvailable has been called for the last time. If Necko is still
loading a document, the tokens that were left around when the parser was
interrupted will be processed the next time the parser's OnDataAvailable is
called (provided it doesn't interrupt again).

I also added a dummy parser request to the load group to prevent JavaScript
onload handlers and other document done processing from happening until the
final chunk of data has been tokenized and passed over to the content sink.  
Comment 24 David Hyatt 2001-05-31 15:51:31 PDT
Let me know if/when you're ready for a review.  I'd be happy to look at this.
Comment 25 rickg 2001-05-31 16:28:20 PDT
Hyatt: you're close. As KM said, the parser is already incremental and 
interruptible. All we're really doing is using that in conjunction with a gating 
function that tells us to quit scanning/tokenizing/model-building once a given 
threshold has been exceeded. The trick now is to find the right threshold that: 
1) provides improved user responsiveness; 2) doesn't slow normal page 
construction; 3) performs the max possible work per iteration.
Comment 26 Tomi Leppikangas 2001-06-01 15:34:32 PDT
I am not sure if I'm missing some info, but I think the logic in
HTMLContentSink::DidProcessAToken is wrong. First, that == should
be >=, and if you want to check the time after every 20 tokens, it should be
more like this:

nsresult
HTMLContentSink::DidProcessAToken(void) {
  mTokenCount++;
  // Only sample the clock once every mMaxTokenCount tokens.
  if (mInterruptParsing && (mTokenCount >= mMaxTokenCount)) {
    mTokenCount = 0;
    if (mDelayTimer.HasExceeded(mMaxProcessingTime))
      return NS_ERROR_HTMLPARSER_INTERRUPTED;
  }

  return NS_OK;
}
Comment 27 David Hyatt 2001-06-01 15:56:56 PDT
Do not be discouraged if this patch causes a slight slowdown on page load 
times.  I would expect it to.  By not hogging the event queue, we're going to 
make paint suppression really unsuppress painting at the right time, and we're 
going to get a few more cycles of animation and progress.  The tradeoff is that 
pages with CSS should show some content earlier, so you may have a positive 
perception gain even though the overall time slows.

Comment 28 Kevin McCluskey (gone) 2001-06-01 16:23:40 PDT
Created attachment 36855 [details] [diff] [review]
patch with tuned values for NotificationInterval and MaxTokenProcessing + cleaned up code
Comment 29 Darin Fisher 2001-06-01 16:29:43 PDT
actually, from the work bbaetz has been doing, you'll most likely see a big win
on pages with many images.  (turns out that this is related to why we seem to do
better with fewer connections per server per page.)  because the ui thread is 
hogged for so long (and the ui thread's event queue essentially blocked), http
connections are not reclaimed by the http handler (since this happens on the ui
thread) as early as they really could be.  new connections are not started until
the old ones are reclaimed.

in fact, bbaetz has found that this contributes to about a 35% slowdown on pages
like www.netscape.com.  overall, it amounts to about a 5% slowdown on jrgm's
page-loader test.  dropping the number of http connections actually helped with
this problem because it essentially reduced the number of active consumers of
http, and hence kept the ui thread's event queue freer... this is just an
educated guess, though.

we're working on a patch that will enable http to reclaim connections on the
socket transport thread to get around this problem.
Comment 30 David Hyatt 2001-06-01 18:32:21 PDT
Well, good.  Let's hope for the best! :)
Comment 31 Bradley Baetz (:bbaetz) 2001-06-02 00:48:21 PDT
My bug is bug 83772. Comments welcome...
Comment 32 Bradley Baetz (:bbaetz) 2001-06-02 20:54:58 PDT
I tried this patch, and I crash on the ibench tests. I can't seem to get a 
backtrace out of gdb though. I can run ibench without the patch applied.
Comment 33 Kevin McCluskey (gone) 2001-06-08 11:40:47 PDT
Created attachment 37679 [details] [diff] [review]
patch with calls to PR_Now replaced with PR_IntervalNow to improve performance as suggested by waterson@netscape.com. Also tuned values to work well with 200 MHz WIN95 machines
Comment 34 Kevin McCluskey (gone) 2001-06-08 12:26:05 PDT
The patch is ready for review/super-review.

I will check it in disabled by setting mCanInterruptParsing to false. This will
prevent the content sink from ever returning NS_ERROR_HTMLPARSER_INTERRUPTED, so
it will never schedule ParserContinue events, and the dummy parser requests will
not get added to the load group.


I tuned the values for:
mMaxTokenProcessingTime - new throttle I added to control the maximum time spent
processing tokens.
mNotificationInterval - Existing throttle for controlling how often the content
sink flushes.


I tuned the values by loading: 
http://lxr.mozilla.org/seamonkey/source/view/src/nsViewManager.cpp and
http://lxr.mozilla.org/seamonkey/find?string=nsCSSFrameConstructor.cpp

then reloaded them from the cache to get consistent load times.


I set the default values for mMaxTokenProcessingTime and mNotificationInterval
as low as possible without affecting page load time. I set
mMaxTokenProcessingTime to be a multiple of mNotificationInterval by default
because the two are linearly related. Both default values can be overridden
using prefs.

I did this on the following machines with release builds:
500 MHz Linux system
750 MHz WINNT system
433 MHz WINNT system
200 MHz WIN95 system


I also ran I-BENCH and jrgm's page-load test.

The current settings do not affect page load times, but they DO NOT make Mozilla
super-responsive during the load of large pages.
They make Mozilla more responsive, but it's not as responsive as I would like.
We need to improve the performance of incremental reflows so we can reduce the
values for mNotificationInterval and mMaxTokenProcessingTime without affecting
page performance.


As a test to see that it is working, you can set the user pref for
mNotificationInterval to an absurdly low value:

user_pref("content.notify.interval", 1);

and load: 

http://lxr.mozilla.org/seamonkey/find?string=nsCSSFrameConstructor.cpp

Mozilla will be extremely responsive during the load. During the page load, you
can click on menus and scroll as if the document had completely loaded. You will
see small chunks of content added and reflowed.

The default values are:
user_pref("content.max.tokenizing.time", 360000);
user_pref("content.notify.interval", 120000);


You can turn off the interrupting of the parser using the following pref:

user_pref("content.interrupt.parsing", false);
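For illustration, a sketch of how these prefs might be read with the pref API of this era (nsIPref). The pref names and member names come from this bug; the helper name ReadTuningPrefs and the surrounding code are assumptions:

#include "nsCOMPtr.h"
#include "nsIPref.h"             // nsIPref, NS_PREF_CONTRACTID
#include "nsIServiceManager.h"   // do_GetService

// Sketch: pull the tuning values at content sink init, keeping the
// defaults quoted above if the prefs are unset.
void HTMLContentSink::ReadTuningPrefs()  // hypothetical helper name
{
  nsCOMPtr<nsIPref> prefs(do_GetService(NS_PREF_CONTRACTID));
  if (prefs) {
    prefs->GetIntPref("content.notify.interval", &mNotificationInterval);        // default 120000
    prefs->GetIntPref("content.max.tokenizing.time", &mMaxTokenProcessingTime);  // default 360000
    prefs->GetBoolPref("content.interrupt.parsing", &mCanInterruptParsing);      // whether interrupting is enabled
  }
}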
Comment 35 Kevin McCluskey (gone) 2001-06-08 15:53:37 PDT
"I tried this patch, and I crash on the ibench tests. I can't seem to get a 
backtrace out of gdb though. I can run ibench without the patch applied."

I just applied the latest patch to a pull from this morning on Linux.
I ran the I-Bench performance tests with a release build and it worked fine; it
didn't crash.

I ran both the specific HTML page load test

http://i-bench.zdnet.com/ibench/performance_tests/perf_loadspeed.php

as well as running the performance tests with all of the default settings on

http://i-bench.zdnet.com/ibench/testlist/run.php

If you think the parser interruption code is causing a problem, you can turn off
the interruption either through the pref

user_pref("content.interrupt.parsing", false);

or you can modify 

mozilla/content/html/document/src/nsHTMLContentSink.cpp at line 2492

and change

mCanInterruptParsing = PR_TRUE; 

to:

mCanInterruptParsing = PR_FALSE; 
Comment 36 John Morrison 2001-06-08 16:07:47 PDT
I should have noted earlier: I had applied the patch 'attach_id=36855' after 
bbaetz's comment, to a win2k build, and could run the ibench stuff with no 
problem.
Comment 37 Bradley Baetz (:bbaetz) 2001-06-08 17:16:14 PDT
Drat. I thought I updated my comment here, but obviously not.

The crash was repeatable, going away when I reverted the patch, and then
returning when I reapplied it, but then (without updating from cvs) it just
disappeared. I had no luck in getting a backtrace out of gdb, so I have no idea
what caused it.

I was seeing slower ibench times from home -> ibench.zdnet.com (i.e. higher
latency than to dogspit internally) with the patch, though, but if no one else is
seeing those then I'll just blame that on pilot error.
Comment 38 Kevin McCluskey (gone) 2001-06-08 18:06:33 PDT
With the latest patch 37679, I improved performance by replacing PR_Now calls
with PR_IntervalNow. I ran I-BENCH on Linux with a release build and I am not
seeing any change in page load times.
Comment 39 Tomi Leppikangas 2001-06-09 16:10:28 PDT
I see one problem with this patch; steps to reproduce:
Go to the Bugzilla query page, enter "perf" in the keywords field, and submit the
query -> the page loads OK, but the throbber keeps spinning forever.

When I set the pref user_pref("content.interrupt.parsing", false);
the problem goes away.  This is on Linux.

I have some other changes in my tree, so I'm not 100% sure it's this patch.
Comment 40 Marc Attinasi 2001-06-12 11:44:33 PDT
I don't even have the patch installed (and never have) and I have the problem of
the throbber never stopping, but only with the classic skin. CVS build from
06/11, RedHat 7.1
Comment 41 Peter Lubczynski 2001-06-12 12:17:47 PDT
I tried Kevin's patch on my Mac OS 9.1 400 MHz w/ 384 MB of memory and I did not
see any side effects. However, using the default settings, I could not tell any
change in responsiveness. Then again, it wasn't that bad without it. More
performance tuning needs to be done, as cranking the setting up to the max did
show more responsiveness. Let's get this checked in to get more eyes looking at it.
Comment 42 Marc Attinasi 2001-06-12 17:27:47 PDT
Comments on patch:
1) in nsParser::PostContinueEvent() you should check for (nsnull != mEventQueue) 
before creating the new event - if null, ev will leak (and it is a waste of 
time creating it).
2) new data members in nsParser.h / nsHTMLContentSink.cpp: should we use 
PRPackedBool instead of PRBool? Probably no big deal since there are very few 
parser or content sink instances.
3) Initialize the statics for DummyParserRequest?

These are all really minor comments. The code looks good, and it looks like the 
testing has been thorough too. I did not see any problems with the use of the 
mCanInterruptParsing data member, but did you check to make sure that disabling 
interrupt-of-parsing does indeed disable it? If so, then sr=attinasi
Comment 43 rickg 2001-06-13 15:01:24 PDT
The patch works fine for me too (but I'm biased). :)

Couple of small nits:

1. In nsParserContinueEvent::HandleEvent(), make sure the parser argument is 
non-null.
2. This whole process needs to be documented. The parser was always incremental,
   but this is a new kind of incrementalism, and should be described.

To attinasi: packed bool won't help. If you need 1 bit, it still costs you 
eight. 

Rick
Comment 44 Chris Waterson 2001-06-13 16:09:22 PDT
[Nit on a nit: PRPackedBool is eight bits (unsigned char). PRBool is 32 (int).]
Comment 45 Kevin McCluskey (gone) 2001-06-13 21:14:31 PDT
Created attachment 38364 [details] [diff] [review]
patch incorporating suggestions from attinasi, rickg, and harishd
Comment 46 Simon Fraser 2001-06-14 17:31:17 PDT
This sounds very related to bug 15122.
Comment 47 Kevin McCluskey (gone) 2001-06-14 19:29:31 PDT
Created attachment 38536 [details] [diff] [review]
patch incorporating additional suggestions from harishd
Comment 48 harishd 2001-06-19 14:43:37 PDT
patch id=36536 looks good. r=harishd
Comment 49 harishd 2001-06-19 14:44:26 PDT
patch id=38536 looks good. r=harishd
Comment 50 Kevin McCluskey (gone) 2001-06-19 14:59:16 PDT
sr=vidur@netscape.com per phone conversation with Vidur
Comment 51 chris hofmann 2001-06-19 19:55:17 PDT
a=chofmann
Comment 52 Kevin McCluskey (gone) 2001-06-20 19:18:43 PDT
Checked in patch 38536 with the default for content.interrupt.parsing set to false. 

To test interrupting the parser to improve user interactivity during long page
loads set the pref

user_pref("content.interrupt.parsing", true);

You can increase the responsiveness by changing content.notify.interval.

The default is
user_pref("content.notify.interval", 120000);
if you set it to
user_pref("content.notify.interval", 1000);

Mozilla will get very responsive during long page loads, but overall page load
times will increase.  The current defaults are tuned to not affect overall page
load times.
Comment 53 Tomi Leppikangas 2001-06-22 09:53:39 PDT
I am still seeing the problem with Bugzilla queries. When
setting user_pref("content.interrupt.parsing", true);
the throbber just keeps spinning after a Bugzilla query. When I set the
pref to false, the throbber stops after the query.

Tested with build 2001062121 on Linux and a new profile.
Comment 54 chris hofmann 2001-06-22 11:06:12 PDT
I think we have what we need for Mozilla 0.9.2 with the addition of the
prefs to get us some help with testing.  Moving this off the critical
list for 0.9.2. Adding the nsbranch keyword so we can track whether we want
to take it for the Netscape release later on down the road.
Comment 55 Chris Waterson 2001-06-22 12:16:46 PDT
Bugzilla queries are likely suffering from another problem. See bug 81524.
Comment 56 Tomi Leppikangas 2001-06-22 13:01:15 PDT
No, it's not that the UI freezes; Mozilla works fine, the throbber just keeps
spinning, AND this only happens when I enable content.interrupt.parsing.
Comment 57 Kevin McCluskey (gone) 2001-06-22 14:31:30 PDT
Harish and I looked at the problem with the Bugzilla query. The problem is that
Necko loads the entire document and then starts a new load of the same document
for some reason. Necko never passes the flag to indicate the document has
completed for the second load, so the DummyParserRequest that is added when
content.interrupt.parsing is enabled never gets removed from the load group, and
the throbber keeps going.
Comment 58 Bradley Baetz (:bbaetz) 2001-06-22 14:39:44 PDT
The bugzilla query page is multipart/x-mixed-replace, so the document should be
loading twice, shouldn't it? Or something like that. 
Comment 59 Kevin McCluskey (gone) 2001-06-22 15:55:14 PDT
With Darin's help I traced into the multi-part converter code and the problem is 
that the nsMultiMixedConv implementation doesn't pass the OnStopRequest through 
to the parser for the second document.

I filed bug 87370 for this problem. 
Comment 60 Kevin McCluskey (gone) 2001-06-26 18:42:33 PDT
Created attachment 40220 [details] [diff] [review]
Patch which dynamically switches between fast-page load and user-interactive based on user events
Comment 61 Kevin McCluskey (gone) 2001-06-26 18:49:18 PDT
The patch I attached will dynamically switch between two different settings for
content.notify.interval and max.token.processing.interval based on the user's
actions. The basic algorithm: if a user event (mouse move, keypress, etc.) has
occurred in the last second, use values which maximize interactive
responsiveness. If a user event has not occurred in the last second, return
to the values which maximize page-load performance. The idea is that if the user
is going to grab the thumb of the scrollbar or make a menu selection, they would
prefer to emphasize the processing of the user events over page load.
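A hedged sketch of that switching logic. GetNotificationInterval() is named later in this bug, but mLastUserEventTime, mInteractiveNotificationInterval, and the exact time window are assumptions (PR_IntervalNow and PR_IntervalToMilliseconds come from prinrval.h, as in the earlier sketch):

// Sketch: choose the flush interval based on how recently the user
// interacted. mLastUserEventTime would be updated on mouse/key events.
PRUint32
HTMLContentSink::GetNotificationInterval()
{
  PRIntervalTime sinceUserEvent = PR_IntervalNow() - mLastUserEventTime;
  if (PR_IntervalToMilliseconds(sinceUserEvent) < 750) {
    // Recent user activity: flush more often so the UI stays responsive.
    return mInteractiveNotificationInterval;   // low value, e.g. 1000 usec
  }
  // No recent user events: favor raw page-load speed.
  return mNotificationInterval;                // default 120000 usec
}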
Comment 62 Marc Attinasi 2001-06-26 22:13:32 PDT
sr=attinasi for the patch ID 40220

The patch makes interaction really sweet, and actually makes the app look 
'smart' when a large page is loading and you reach for the scroll thumb, because 
the mouse movement slows down the load and makes grabbing the thumb easier. 
Also, the menus are way more responsive when you want to access them, but page 
load is full speed when the mouse is left alone. This is a great improvement 
over the static interval!
Comment 63 Kevin McCluskey (gone) 2001-06-27 12:16:15 PDT
Created attachment 40315 [details] [diff] [review]
Minor tweaks to patch 40220, removed some compiler warnings, removed nsDelayTimer class, added some comments
Comment 64 harishd 2001-06-27 13:28:45 PDT
Remove

+  printf("Enable interrupting parser %d\n", enableInterruptParsing);


with that r=harishd.
Comment 65 John Morrison 2001-06-28 01:39:12 PDT
I did some runs with a 06/26 trunk build on linux and win98. (I didn't
get to mac, since I was chasing down a different issue [bug 88166] at
the same time).

What I did was to create a new profile and set it up with sidebar collapsed,
in modern skin, and then after a few restarts, run the page-loader test. 
The first run was with the default settings, which makes this turned off
at the moment. Between each test, I deleted NewCache, history.dat, cookies.txt
and restarted a few times to get to the same initial state. I ran tests 
with the pref set to true and timing values of 120000, 10000, 40000, 80000,
and then back to 120000, and finally again as 'off' (disabled). 


Linux/500MHz/128MB/rh9.1
 times in msec.
              cached    uncached 
1)    off      1775       1949 
2) 120000      1795       1925 
3)  10000      1998       2107 
4)  40000      1795       1914 
5)  80000      1777       1896 
6) 120000      1776       1933 
7)   off       1765       1890 


win98/500MHz/128MB
 times in msec.
              cached    uncached 
1)    off      1404       1680
2) 120000      1442       1656
3)  10000      1471       1713
4)  40000      1456       1684
5)  80000      1431       1675
6) 120000      1454       1665
7)   off       1413       1640


So, it is effectively neutral (according to my testing) on Linux
[i.e., can't differentiate the changes from 1% noise]. However, I do
see a measurable difference of about ~3% for cached load times on
win98, for a setting of 120000 (the default). 

(I should note that I think this patch has much different dynamics on 
win98 than on win NT variants: when I ran on win2k with a setting of 
10000 usec, the DOM L2 spec shot up from ~4 seconds to ~30 seconds to 
finish loading. But on win98, I only went from ~4 to 5.5 seconds (I 
think).)

It would still be good to run the ibench test, and to check the Mac
(my bad).  Sorry for not getting to this earlier. (P.S. I am away until
Monday afternoon.)
Comment 66 Kevin McCluskey (gone) 2001-06-28 14:09:44 PDT
I found a problem:

When loading http://i-bench.zdnet.com/ibench/i-bench.htm, if I have the latest
patch installed and I move the pointer around rapidly to cause the parser to be
interrupted very early in the load process, the page (sometimes) doesn't load
at all.

At the beginning of the i-bench document it does a document.write of a script
tag whose src points to an external file containing JavaScript to execute. I
believe this triggers the problem. I also get an assertion:

0[ba2c40]: ###!!! ASSERTION: nsHTMLDocument::Reset() - dummy doc write request s
till exists!: 'mDocWriteDummyRequest == nsnull', file S:\mozilla\content\html\do
cument\src\nsHTMLDocument.cpp, line 371

The following is the stack that triggers the assertion:


nsDebug::Assertion(const char * 0x01cf2a9c, const char * 0x01cf2a7c, const char
* 0x01cf2a44, int 371) line 290 + 13 bytes
nsHTMLDocument::Reset(nsHTMLDocument * const 0x028f7a40, nsIChannel *
0x02de5920, nsILoadGroup * 0x00ed4ee0) line 371 + 49 bytes
nsHTMLDocument::OpenCommon(nsIURI * 0x028e0760) line 2137 + 39 bytes
nsHTMLDocument::Open(nsHTMLDocument * const 0x028f7b80, nsIDOMDocument * *
0x0012e390) line 2244 + 18 bytes
nsHTMLDocument::Open(nsHTMLDocument * const 0x028f7b7c) line 2212 + 40 bytes
nsHTMLDocument::WriteCommon(const nsAString & {...}, int 0) line 2309 + 31 bytes
nsHTMLDocument::ScriptWriteCommon(int 0) line 2425 + 19 bytes
nsHTMLDocument::Write(nsHTMLDocument * const 0x028f7b80) line 2452
XPTC_InvokeByIndex(nsISupports * 0x028f7b80, unsigned int 19, unsigned int 0,
nsXPTCVariant * 0x0012e83c) line 139
XPCWrappedNative::CallMethod(XPCCallContext & {...}, XPCWrappedNative::CallMode
CALL_METHOD) line 1881 + 42 bytes
XPC_WN_CallMethod(JSContext * 0x00f6f420, JSObject * 0x00dab2b0, unsigned int 1,
long * 0x024636ec, long * 0x0012ea70) line 1252 + 11 bytes
js_Invoke(JSContext * 0x00f6f420, unsigned int 1, unsigned int 0) line 807 + 23
bytes
js_Interpret(JSContext * 0x00f6f420, long * 0x0012f888) line 2702 + 15 bytes
js_Execute(JSContext * 0x00f6f420, JSObject * 0x00dab138, JSScript * 0x02dffe40,
JSStackFrame * 0x00000000, unsigned int 0, long * 0x0012f888) line 986 + 13 bytes
JS_EvaluateUCScriptForPrincipals(JSContext * 0x00f6f420, JSObject * 0x00dab138,
JSPrincipals * 0x028f60b0, const unsigned short * 0x02420028, unsigned int 741,
const char * 0x02dfe630, unsigned int 1, long * 0x0012f888) line 3273 + 25 bytes
nsJSContext::EvaluateString(nsJSContext * const 0x00f6e920, const nsAString &
{...}, void * 0x00dab138, nsIPrincipal * 0x028f60ac, const char * 0x02dfe630,
unsigned int 1, const char * 0x00000000, nsAString & {...}, int * 0x0012f8f4)
line 608 + 85 bytes
nsScriptLoader::EvaluateScript(nsScriptLoadRequest * 0x029bde90, const
nsAFlatString & {...}) line 565
nsScriptLoader::ProcessRequest(nsScriptLoadRequest * 0x029bde90) line 477 + 22 bytes
nsScriptLoader::OnStreamComplete(nsScriptLoader * const 0x028f50c4,
nsIStreamLoader * 0x029bdbd0, nsISupports * 0x029bde90, unsigned int 0, unsigned
int 741, const char * 0x02dfe750) line 756
nsStreamLoader::OnStopRequest(nsStreamLoader * const 0x029bdbd4, nsIRequest *
0x029bd8d0, nsISupports * 0x00000000, unsigned int 0) line 120 + 81 bytes
nsStreamListenerTee::OnStopRequest(nsStreamListenerTee * const 0x02dfeac0,
nsIRequest * 0x029bd8d0, nsISupports * 0x00000000, unsigned int 0) line 25
nsHttpChannel::OnStopRequest(nsHttpChannel * const 0x029bd8d4, nsIRequest *
0x029b9bf0, nsISupports * 0x00000000, unsigned int 0) line 2111
nsOnStopRequestEvent::HandleEvent() line 161
nsARequestObserverEvent::HandlePLEvent(PLEvent * 0x029cc9e4) line 64
PL_HandleEvent(PLEvent * 0x029cc9e4) line 590 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00f32f50) line 520 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x00050778, unsigned int 49363, unsigned int 0,
long 15937360) line 1071 + 9 bytes
USER32! 77e7124c()
00f32f50()


My guess is that we have some code that assumes that when the stream is
complete, all of the document has been tokenized and the content model is
complete. The interrupting of the parser is disabled if we are in a sub-parser
context, so it will not interrupt the parser when it's in the middle of parsing
the result of a document.write, but there may be some other places where we need
to disallow the interrupting of the parser.
Comment 67 Marc Attinasi 2001-06-28 14:28:49 PDT
Kevin, does that page fail to load with the interval set to a very low number
_without_ the patch that makes it dynamic? I would think that it would, since
the dynamic stuff just sets the interval really low...
Comment 68 Kevin McCluskey (gone) 2001-06-28 16:28:37 PDT
If I set the initial value of mNotificationInterval to be the same as the value
that it is lowered to when there is a user event, it does not fail. The assertion
only appears when mNotificationInterval is dynamically switched between two
values.
Comment 69 Kevin McCluskey (gone) 2001-06-28 17:22:46 PDT
Ok, I found the problem. I missed one of the places where I should have replaced
mNotificationInterval with GetNotificationInterval:

nsHTMLContentSink.cpp

PRBool
HTMLContentSink::IsTimeToNotify()
{
 ...

  LL_I2L(interval, mNotificationInterval);

Should be

  LL_I2L(interval, GetNotificationInterval());

Comment 70 Marc Attinasi 2001-06-28 17:26:13 PDT
Great, I'm glad you found it. [s]r=attinasi on the inline fix.
Comment 71 Kevin McCluskey (gone) 2001-06-28 21:06:25 PDT
Created attachment 40583 [details] [diff] [review]
patch which includes the call to GetNotificationInterval() mentioned above + small tweak of defaults for the dynamic switching.
Comment 72 Kevin McCluskey (gone) 2001-06-28 21:09:42 PDT
I checked patch 40583 into the trunk. Dynamic interrupting of the parser is now
turned on by default on the trunk. To test this, try loading a large document
such as http://lxr.mozilla.org/seamonkey/find?string=nsCSSFrameConstructor.cpp.
As you move the mouse to grab the scrollbar or make a menu selection Mozilla
should be extremely responsive. If you stop moving the mouse Mozilla will go
back to the defaults to enable fast page load.

 
Comment 73 Kevin McCluskey (gone) 2001-06-29 00:55:22 PDT
I ran I-BENCH with a release build on a 750 MHz WINNT machine on a cable modem,
and the times were unchanged with the dynamic switching. However, the times are
slower with dynamic switching turned on if the mouse pointer is inside the
window when the I-BENCH tests are run. This is because Windows generates a mouse
move event when a new window is constructed under the mouse pointer; this mouse
move event causes the lower content.notify.interval and max.token.processing
intervals to be used for 3/4 of a second when the page begins loading.

I can improve the performance on I-BENCH when the pointer is in the window by
increasing the lower values returned for the content.notify.interval and
max.token.processing intervals in GetNotificationInterval() and
GetMaxTokenProcessingTime(). However, doing this makes Mozilla less interactive
during long page loads.  Suppressing the mouse move events generated when the
mouse just happens to be over a window when it is created would be a better
solution.

Comment 74 Kevin McCluskey (gone) 2001-06-29 10:55:06 PDT
Created attachment 40637 [details] [diff] [review]
Patch which prevents dynamic switching until 2 seconds after the document load has begun, to fix an event issue which was causing a performance hit when the pointer is over the window during I-BENCH tests
Comment 75 Simon Fraser 2001-06-29 11:02:13 PDT
cc: saari for the mouse event on page load issue.
Comment 76 David Hyatt 2001-06-29 12:02:07 PDT
3/4 second added on to the total i-bench time, or to the per-run time?
Comment 77 Kevin McCluskey (gone) 2001-06-29 12:23:43 PDT
With the latest patch installed there isn't any impact on I-Bench load times.

The 3/4 of a second is how long the dynamic switching waits after a user event
before it switches to the lower settings to improve user responsiveness. If
another user event occurs within 3/4 of a second of the previous user event, it
will stick with the lower settings. If 3/4 of a second goes by without a user
event, it switches back to the higher settings which maximize page load
performance. The problem I was having with I-BENCH was that when each page was
loaded and the mouse pointer was in the window, it was jumping to the lower
settings because the creation of the window was generating a mouse move event.
It should have been sticking with the higher settings which maximize page load,
since I wasn't actually moving the mouse or typing. The latest patch waits 2
seconds after the beginning of the document load before it looks at the user
event time when determining if it should jump to the lower settings. This
eliminates the jumping to the lower values when Mozilla cycles through pages. It
also eliminates the jumping to the lower values immediately after the user hits
return in the URL bar, or clicks on a link, if it results in the loading of a
new document.
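Folding the 2-second guard into the earlier GetNotificationInterval() sketch; mBeginLoadTime is an assumed member recording when the document load started, and the other members are the same assumptions as before:

PRUint32
HTMLContentSink::GetNotificationInterval()
{
  // Ignore user events for the first 2 seconds of the load: creating a
  // window under the pointer can generate a synthetic mouse move (see
  // above), which would otherwise drop us into the slower interactive
  // settings as soon as the page starts loading.
  if (PR_IntervalToMilliseconds(PR_IntervalNow() - mBeginLoadTime) < 2000)
    return mNotificationInterval;              // stick with fast page-load settings

  if (PR_IntervalToMilliseconds(PR_IntervalNow() - mLastUserEventTime) < 750)
    return mInteractiveNotificationInterval;   // favor responsiveness

  return mNotificationInterval;                // favor page-load speed
}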
Comment 78 Marc Attinasi 2001-06-29 14:58:28 PDT
I'd like to suggest that the comment be reworded a little (it is a confusing
sentence). From:
+    // should be used. but only consider using the lower
+    // value if the document has already been loading 
+    // for 2 seconds. 2 seconds was chosen because it is
+    // greater than the default 3/4 of second that is used
+    // to determine when to switch between the modes and it
+    // gives the document a little time to create windows
+    // result in user events because on some systems creating
+    // a window under the mouse pointer will generate a
+    // mouse move event.

to

+ // should be used. But only consider using the lower
+ // value if the document has already been loading
+ // for 2 seconds. 2 seconds was chosen because it is
+ // greater than the default 3/4 of second that is used
+ // to determine when to switch between the modes and it
+ // gives the document a little time to create windows.
+ // This is important because on some systems (Windows, 
+ // for example) when a window is created and the mouse
+ // is over it, a mouse move event is sent, which will kick
+ // us into interactive mode otherwise. It also supresses
+ // reaction to pressing the ENTER key in the URL bar...

Or something like that. Anyway, the information is there, I think you just had a
typo or something. 

Also, since the value 2 seconds is really related to the 3/4 second default
value (2X + fuzz-factor), why not make them consts or defines and put the
relationship into the code (in the odd event that the default interval is
changed, the wait-time would be changed automatically that way). Not critical,
but it seems cleaner to me.

sr=attinasi.
Comment 79 harishd 2001-06-29 15:20:44 PDT
r=harishd
Comment 80 Kevin McCluskey (gone) 2001-06-29 18:56:04 PDT
Created attachment 40719 [details] [diff] [review]
patch to delay lowering of the interval, incorporating Marc's suggestions
Comment 81 Kevin McCluskey (gone) 2001-06-29 19:06:15 PDT
Checked patch 40719 into the trunk. I-BENCH scores should be back to what 
they were previously.
Comment 82 Kevin McCluskey (gone) 2001-06-29 19:11:57 PDT
If you want to bring the Mozilla 0.9.2 branch up to date with the patches I've
checked into the trunk, apply patches 40583 and 40719.
Comment 83 Kevin McCluskey (gone) 2001-06-29 19:13:27 PDT
Reassigning this bug to Harish since I will be gone until Aug 29.
Comment 84 Kevin McCluskey (gone) 2001-06-29 19:50:51 PDT
If you encounter problems and you want to turn off the interrupting of the
parser as the default: change line 2509 in nsHTMLContentSink.cpp from:

PRBool enableInterruptParsing = PR_TRUE;

to be

PRBool enableInterruptParsing = PR_FALSE;


Comment 85 lchiang 2001-07-02 14:15:43 PDT
Adding the vtrunk keyword to indicate the bug has been fixed on the trunk.  Bug
left open to track the checkin to the branch (nsbranch) when appropriate.

Once the bug has been fixed on the branch also, pls remove the vtrunk keyword.
Comment 86 Nisheeth Ranjan 2001-07-05 15:38:31 PDT
This is a high-impact fix for user experience during long page loads.  It has
also been on the trunk for a week now.  Marking nsBranch+.
Comment 87 harishd 2001-07-06 14:00:33 PDT
Ran browser buster, with build ID 2001070604, and did not see anything abnormal.
Comment 88 Heikki Toivonen (remove -bugzilla when emailing directly) 2001-07-06 15:19:27 PDT
Verified FIXED on today's trunk builds on Win2k and Linux. I checked this by
comparing the interactivity of the application while loading
nsCSSFrameConstructor.cpp via LXR. 
Comment 89 Nisheeth Ranjan 2001-07-09 20:20:20 PDT
Checked into branch on behalf of Harish.  Marking fixed.
Comment 90 Nisheeth Ranjan 2001-07-09 20:20:51 PDT
Really marking fixed.
Comment 91 lchiang 2001-07-13 18:51:34 PDT
chris - pls verify this on the branch builds. Thanks.
Comment 92 Chris Petersen 2001-07-16 16:12:25 PDT
Marking verified with the July 16th branch builds (20010716) with Linux Redhat,
Windows ME, and Mac OS 9.1.
