Closed Bug 271050 Opened 15 years ago Closed 15 years ago
Improve Camino's handling of Gecko Events
10.52 KB, patch
|Details | Diff | Splinter Review|
40.98 KB, text/plain
9.97 KB, patch
|Details | Diff | Splinter Review|
4.63 KB, patch
|Details | Diff | Splinter Review|
9.13 KB, patch
|Details | Diff | Splinter Review|
Camino currently handles gecko events by setting up an NSTimer which fires off every 0.005 seconds and checks the gecko event queue. Not only does this make me feel dirty, it's been theorized that this is the source of some of Camino's performance problems, particularly although not exclusively where flash is concerned.
This patch makes camino's gecko event handling behave like other UNIX platforms; rather than polling at a timed interval, the strategy is: 1. Events are posted to a pipe. 2. An NSFileHandle is used to monitor the pipe for posted events. 3. When posted events are detected pending events are processed. Additionally, this is rate-limited, so that events are processed at a maximum frequency. In my testing, browsing with this patch "feels" snappier; I notice this the most on pages that make heavy use of flash. Could others please look at this patch and test? Specifically, I'd like to know: 1. Does this patch truly improve Camino's behavior? 2. Does this strategy seem better than the old one? This patch probably isn't really ready for review yet; specifically, I'd say it needs some documentation before it can go in. But please test and give your opinions on whether this is a good idea or not.
I'd be glad to supply a build with this patch applied to anyone who doesn't want to build it. Either post to the bug or send me an email requesting a build and I'll put one online.
As an example, CPU usage on the test URL is dramatically less for me with the patch: http://www.homestarrunner.com/sbemail118.html
Comment on attachment 166655 [details] [diff] [review] event handling change >+#define MIN_PLEVENT_INTERVAL_MILLISECONDS 50 >+ I thought consts were prefered over #defines >-#define MAC_USE_WAKEUPPROCESS >+#define MAC_USE_UNIX Why the name change ?
(In reply to comment #2) > I'd be glad to supply a build with this patch applied to anyone who doesn't want to build it. Either post > to the bug or send me an email requesting a build and I'll put one online. While I can't currently build, It would be a good idea to post two builds from the same source - o,e with the patch and one without, then post to the forums and mailing list about it and have people comment on the mailing list if they see any difference.
re:comment#5 I build and put this location: http://caminofreak.hp.infoseek.co.jp/subset/caminol10nJP.html Flash animation is improved dramatically. Brilliant patch!
(In reply to comment #4) > (From update of attachment 166655 [details] [diff] [review]) > >+#define MIN_PLEVENT_INTERVAL_MILLISECONDS 50 > >+ > > I thought consts were prefered over #defines > You are quite correct. I used that from something similar that was there before... that said, a define there may in fact be desirable... > >-#define MAC_USE_WAKEUPPROCESS > >+#define MAC_USE_UNIX > > Why the name change ? It's doing something entirely different than it was before. (Now it's using the UNIX platform specific code instead of the mac code. Previously it was changing a bit of the mac code based on that flag.)
I canalso confirm that there is an incredible difference in the handling of cpu intensive sites with the build waverider posted. Websites that previously would bring my G4 1,25 GHZ to a grinding hault now work without even showing a beachball cursor on a G3 400mhz. That is what I'd like to call a HUGE improvement for this browser. Not that this patch might also fix the bug which caused camino to even use cpu cycles when all windows wher closed. I'm typing this message in one window while in the other a pretty heavy flash movie is playing in another window on the G3 400 mhz iMac. And so I'd like to nominate this bug to be put in 0.8.2 if we can. This is a major improvement.
(In reply to comment #5) > While I can't currently build, It would be a good idea to post two builds from > the same source - o,e with the patch and one without, then post to the forums > and mailing list about it and have people comment on the mailing list if they > see any difference. Excellent idea. I've placed a disk image with the two builds here: http://www.speakeasy.org/~gbeier/Camino-perftest.dmg.bz2 and am posting to the list and to the forum on mozillazine.org. If you have any other forums in mind, please post there :-)
Bug 225397 is a flash performance Meta bug which might be of interest further on in the development of thi sbug. I tested both bugs in the meta bug and looked at the url in those bugs that where giving us trouble. With this patch those websites didn't skip frames anymore on a G3 400 iMac, sure the mac wen't a little slow but the beachball cusors didn't even show.
Ok so I really tested this patch and these are my findings: - we are not much faster than before, in some cases we are even slower than we where. In some cases I get the feeling flash is held back instead of given full speed. This page for instance: http://yugop.com/ver3/stuff/12/index.html Is much faster in the build without the patch, but in that build it will skip frames and hang on slower machines. With the build with the patch the movie even plays on slow machines but feels way slower. On a fast machine I can see that camino has a lot of free cpu but doesn't use it to speed up the movie. - however when a page is very cpu intensive we are MUCH more graceful when it comes to showing the beach-ball. We just don't and we still keep to be responsive on mouse click of the user. - web-page containing multiple flash movies are responding much better. if one movie is in full motion and you do something in the other movie they both continue without a hitch. While before we would skip frames. This page will show that: http://yugop.com/ver3/ Just choose any of the wonderful heavy flash animations and you will see that while the top movie is playing and the using the menu movie at the bottom in the patch-less build will result in frame skipping. In the build with patch no frames are skipped and everything works as it should. I encountered using a build with a lot of flash (as that is the only good way I have found to test the patch) is some kind of memory crash for which I attached the crash log. The crash happens when ShockWave content is being loaded. I also had some trouble with input boxes f*cking up when I was typing in them with you patched build.
It seems almost Flash movies becomes snappier, but transfer speed that Speed Test Site shows becomes much slower. For example; 8 Mbps (with New event handling version) 20 Mbps (with Old event handling version)
As someone who's played in the event handling code before, let me take a moment to describe how things were before, what they are with this patch, and whether it can be improved on. Everywhere: Events are posted to the main thread via the plevents code. They may be posted from other threads. The main thread needs to wake up and handle these events in a timely fasion for good performance. The old world (Carbon/Mozilla): Carbon Events are used to notify the main thread that there are events waiting to be processed; these get processed at WaitNextEvent time. The Camino world (MOZ_WIDGET_COCOA) Carbon events were not used (because they don't get processed in some run loop modes perhaps), so the plevents code does nothing to notify the UI thread that there are events waiting to be processed. Instead, nsToolkit sets up a timer which polls the main thread's event queue to see if there are plevents to be processed. With this patch: plevents used the Unix pipe code, and nsToolkit sets up a notification that will get called when there's data on the pipe. A timer is used to ensure that plevents are processed at most every 50ms (why?). In an ideal world, nsToolkit wouldn't have to do anything to get plevents to be processed (as is the case in the Carbon world). I think plevent.c should, for Mac OS X, use CFMessagePorts to notify the UI thread that there's work to do. I've written the code before, and could come up with a patch once my cvs access gets resolved.
(In reply to comment #13) > With this patch: > plevents used the Unix pipe code, and nsToolkit sets up a notification > that will get called when there's data on the pipe. A timer is used to > ensure that plevents are processed at most every 50ms (why?). > The timer is a misguided attempt to keep Camino's CPU usage down. While it does that, the increased apparent CPU usage is OK as it only gets used if available and doesn't adversely impact system performance. I'm taking the timer out. > In an ideal world, nsToolkit wouldn't have to do anything to get plevents to be > processed (as is the case in the Carbon world). I think plevent.c should, for > Mac OS X, use CFMessagePorts to notify the UI thread that there's work to do. > I've written the code before, and could come up with a patch once my cvs access > gets resolved. And actually, there's absolutely no need for me to do this in nsToolkit anymore; the only reason i did is intertia. I should probably move it to MainController. If you've got code for using CFMessagePorts (Is that better than the BSD pipe for some reason? I thought since this was just a single byte of data going one direction a pipe might be best...) I'd be more than willing to do any necessary cleanup to get a landable patch. Feel free to send it to me or post it here if you'd like. Thanks for the lucid explanation!
I should be able to get a CFMessagePort patch out this week. There are a couple of things that I wonder about. First, should the plevent code just notify the UI thread that there are any events in the queue (which means that the UI thread would then go and process all, or N of them in one go), or should each and every plevent get proxied over to the UI thread via CFMessagePort or similar? The advantage of proxying each plevent separately is that their processing is interleaved with UI events, which might ease some of the CPU hogging we see on flash-heavy sites (more on that in a moment). However, that means more inter-thread communication, which might be expensive (though I'm sure trivial compared to the amount of work done for many of those events). Previous attempts to reduce plevent-based CPU hogging have put a cap on the number of events processed from the queue in one go (hence the "N" above). Now the reason I believe that Flash-heavy sites cause these CPU hogging problems that is that we set up a 50ms timer for each and every plugin, to feed the plugin null events (so it can do idle processing). Some flash movies use considerable CPU time each time they are sent an idle event. Because timers are implemented cross-platform using a thread, they use plevents to notify the calling thread when the timer fires. I think this is why this event handling tweaking has such an impact on flash-heavy sites.
Note that nsITimers use PLEvents to fire, at least in some cases, so we absolutely need to support PLEvents firing at least as often as DOM setTimeout calls can fire (every 10ms, possibly more often).
> If you've got code for using CFMessagePorts (Is that better than the BSD pipe > for some reason? The advantage of CFMessagePort is that you can create a run loop source for one. In other words, unlike a BSD pipe, when a request is sent to a CFMessagePort, it wakes up your CFRunLoop.
Here's what I was thinking. It would be good if someone could do an opt build with this for testing.
(In reply to comment #18) > Created an attachment (id=166833) > Patch to change plevents to use CFMessagePorts, and remove nsToolkit timer > stuff. > > Here's what I was thinking. It would be good if someone could do an opt build > with this for testing. Thank you! Here is an opt build for everyone's testing enjoyment: http://www.speakeasy.org/~gbeier/Camino-CFMessage.dmg.bz2
i did some pageload testing between the 11/22 nightly and the cfMessagePort build that geoff posted and there is some difference: cache cleared, ran 1 cycle 11/22: 1842ms avg CFMP: 1887ms avg MessagePort build is 2% slower in the totally uncached case. after filling the cache, re-ran test, 3 cycles 11/22: 819ms avg CFMP: 856ms avg MP build is 5% slower in the cached case. this is the case where the network makes the least amount of difference as all images are already cached.
i tested the first patch (geoff's) and got about a 12% slowdown (in the cached case) from an equivalent build w/out his patch. it was pointed out that the CFMP build is a dynamic opt build while the 11/22 is a static opt build. that could account for some of the difference.
That second URL is definitely noticeably better using the experimental build. Without wandering into the pointless realm of "it feels snappier" it would be interesting if we could also quantifiably test things like Flash performance and UI responsiveness.
Well shut my mouth, I'm guilty of exactly what I was describing, overzealous performance-increase reporting. On subsequent testing performance is visually indistinguishable between the experimental build and 0.8.1. There must have been some other factor influencing performance initially. Simple Flash FPS testing using <> is close, probably within the margin of error (whatever that may be): 23--24 for 0.8.1, 24--25 for the experimental build.
(In reply to comment #24) > Well shut my mouth, I'm guilty of exactly what I was describing, overzealous > performance-increase reporting. On subsequent testing performance is visually > indistinguishable between the experimental build and 0.8.1. There must have been > some other factor influencing performance initially. > > Simple Flash FPS testing using <> is close, probably within the margin of error > (whatever that may be): 23--24 for 0.8.1, 24--25 for the experimental build. Maybe, but I *can* say that on pages with lots of Flash embeds I can actually scroll without waiting 10 seconds and getting constant beachballs, and that on full Flash sites (e.g., http://www.metroid.com/), I don't need to hold the mouse button down on the movie (old Camino-fu) to have it play back decently.
(In reply to comment #25) > Maybe, but I *can* say that on pages with lots of Flash embeds I can actually > scroll without waiting 10 seconds and getting constant beachballs, and that on > full Flash sites (e.g., http://www.metroid.com/), I don't need to hold the mouse > button down on the movie (old Camino-fu) to have it play back decently. Exactly, more so people using old machines like G3 iMacs and such will be the ones who will notice a BIG performance changes. I had a hands on experience where the yugop pages would skip frames like hell and camino would be very unresponsive while with the patches Camino had no troubles what so ever. Most of us here in bugzulla own big beasts of machines so we won't really notice the changes. Find an old one and you will see.
*** Bug 197889 has been marked as a duplicate of this bug. ***
*** Bug 182087 has been marked as a duplicate of this bug. ***
If anyone is interested I have too build up static optimized at http://softkid.nerim.net Camino_clean.zip is patch free and Camino_patch.zip contains smfr's patch.
It seems the CFMP static build makes Flash movies snappier and doesn't slow down transfer speed.
BenchJS results <http://www.24fun.com/downloadcenter/benchjs/benchjs.html> using test builds (opt, static) from http://softkid.nerim.net. All numbers are averages of 3 runs. Mac OS X 10.3 Camino_clean: 8.87sec Camino_patched: 9.24sec (4.1% slower) Mac OS X 10.2: Camino_clean: 9.24sec Camino_patched: 9.69sec (4.8% slower) This shows that the build with the CFMessagePort patch is apparently slower. However, I believe that this is because the build is doing additional drawing during the tests that is missed by the vanilla build. For example, on the second (popup window) test, the patched build shows the contents of the popup windows to be red. On the vanilla build, they close before the contents are drawn. On test 6 (the animated letters), the patched build shows a smoother animation, drawing every frame. The animation in the vanilla build is jerky, drawing fewer frames. So I belive that the CFMessagePort patch makes Camino behave more correctly in terms of drawing.
(In reply to comment #31) > So I belive that the CFMessagePort patch makes Camino behave more correctly in > terms of drawing. I think that this is definitely the correct thing to do. Taken to an extreme case, Camino could have superb rendering performance if it just skipped that darn drawing step altogether ;) Obviously if the patch can be improved to match the performance of the old system that's great, but I don't think that this really counts as a performance regression, since the only way the old behavior is so fast is because it's wrong.
Simon - are you working on the patch any more or would you like to start getting reviews? I think the latest patch from you (using CFMessagePorts) works really well. It makes us behave correctly in a few situations we don't now, positive performance impacts are huge in some cases, negative perf impacts are minor and acceptable (usually for the sake of correctness) in others.
I think the patch is ready to go.
Assignee: pinkerton → sfraser_bugs
my only concern here is that this is causing us to do too much work in some cases (overpainting) unrelated to plugins. I totally agree that painting all frames or an animation are important, but it sounds like from some of the examples given that we're painting when we weren't before. Anything we can do to tweak this? 5% isn't a trivial slowdown. That said, I'm all for this patch, it's gonna make things a lot better, just trying to make it the best it can be up front.
> we're painting when we weren't before I think we were wrong to not paint before. The pageload numbers are the real test -- were you able to run those for static plain and patched builds?
Status: NEW → ASSIGNED
The pageload tests get 6.4% slower with this patch (mean 393ms to 419ms).
+ SInt32 err = CFMessagePortSendRequest(self->mRemoteMessagePort, 0, NULL, 1.0, 0.0, NULL, NULL); what's the 1.0 and 0.0? either a comment or constants would help here + eventQueue->mLocalMessagePort = CFMessagePortCreateLocal(kCFAllocatorDefault, portName, + _md_EventReceiverProc, &portContext, NULL); can this ever fail? i guess there can't ever be a duplicate (can't dupe the process id) but could you ever be "out" of machine-wide ports? what would be the behavior if it did fail? no events? crash? looks good otherwise. damn, 6.4%. ouchies. :(
This patch made a whole world of a difference.. Im on an iMac 233, 10.3.7, and before it I could'nt enter a page which had a large flash-animation on it. It made Camino totally unresponsible and I had to use command-W, and wait ~20 seconds for it to respond. Now I can scroll up & down with the same animation running. Just my 0.02¢
simon? cleaned up patch?
has anyone tested this on a 10.1 system (read: could someone test for us)? it would be awesome if we could also take this patch on the branch for 0.8.3.
(In reply to comment #41) > has anyone tested this on a 10.1 system (read: could someone test for us)? it > would be awesome if we could also take this patch on the branch for 0.8.3. Providing Branch build containing the patch would help. Is this needed ?
This is a complete patch for an alternative version of the CFMessagePort patch taht just makes a plain CFRunLoopSource, which it notifies using CFRunLoopSourceSignal(). This should be just as good, and is simpler.
Test branch builds with and without the last CFRunLoop patch are at: <ftp://ftp.accesscom.com/pub/users/s/smfr/camino/>
Pageload results look pretty good: vanilla build: mean 445ms CFRunLoop build: mean 443ms.
(In reply to comment #46) > Pageload results look pretty good: > > vanilla build: mean 445ms > CFRunLoop build: mean 443ms. Testing the patched build with the Flash video here: http://www.atomfilms.com/af/content/gangsta_rap_se yields satisfactory results. All previous builds of not only Camino, but also Fx and Mozilla drop frames and quickly lock up, requiring a Force Quit. This is looking good.
I should add I'm testing on a 12" PB 1.25GHz with 1GB of RAM. OSX 10.3.7
bumping up to 0.9
Target Milestone: --- → Camino0.9
The checkin for this bug looks like it caused an 8% improvement in startup time but a 19% regression in pageload time.
19% is much worse than my testing showed, alas. I'll try checking in the CFMessagePort version.
With some basic test I can indeed see umbres dropping for pageload tests. Howe ever more importantly, when rendering pages the UI always stays responsive and progress in the rendering of very large pages is always updated. All in all ths creates a smoother and more responsive feel when using Camino. I do agree that 19% is a lot to loose, maybe to much?
Pageload results from 500MHz laptop: Original branch: 803ms CFRunLoop build: 955ms When running the tests you can see more painting; on some pages, the background draws first, followed by the content. On faster machines this probably won't happen, explaining the divergence in results.
so what do we want to do with this? mark it done? see if we can't get the times down more?
I need to swap the CFRunLoop patch back into the trunk.
CFRunLoop patch is back in: Checking in plevent.c; /cvsroot/mozilla/xpcom/threads/plevent.c,v <-- plevent.c new revision: 1.48; previous revision: 1.47 Should we just close this out?
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
This bug is marked fixed, but I never saw a check-in on bonsai. Am I missing something?
Look at the CVS history for plevents.c.
Comment on attachment 169688 [details] [diff] [review] Alternative simpler patch using CFRunLoopSource asking for approval just to be safe. this has been on the trunk for a long time and is ifdef'd for cocoa. only camino is affected by these changes.
Did the pageload performance issues (comment 50) ever get worked out? It is not clear from the remaining comments....
(In reply to comment #61) > Did the pageload performance issues (comment 50) ever get worked out? It is not > clear from the remaining comments.... No unfortunatly not. See below for the numbres. http://axolotl.mozilla.org/graph/query.cgi?tbox=pawn.office.mozilla.org&testname=pageload&autoscale=1&size=&days=0&units=<ype=&points=&showpoint=2004:01:13:11:39:21,777&avg=1
the page load numbers on pawn are out of whack with our own testing, but we decided to take the perf hit anyway for substantially improved flash performance and to fix other bugs, such as chewing up the CPU when no windows were open that we would occasionally get into.
Comment on attachment 169688 [details] [diff] [review] Alternative simpler patch using CFRunLoopSource a=caillon for 1.7.6
Attachment #169688 - Flags: approval1.7.6? → approval1.7.6+
landed on branch, will do an 083 test spin with this in it so we can get some 10.1 luvin'
> the page load numbers on pawn are out of whack with our own testing To be more precise, I think what's happening is that on slower machines, the patch has a pageload impact because some pages redraw more than once when loading. On faster machines there are no extra redraws, so the patch has little impact on performance (but does help fastly with responsiveness on Flash-heavy pages).
You need to log in before you can comment on or make changes to this bug.