Closed Bug 389683 Opened 18 years ago Closed 18 years ago

System slowdown following overactive windowserver, when Camino running

Categories

(Camino Graveyard :: General, defect)

Hardware: PowerPC
OS: macOS
Type: defect
Priority: Not set
Severity: blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: asolkar, Assigned: smichaud)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en; rv:1.9a7pre) Gecko/2007072400 Camino/2.0a1pre (like Firefox/3.0a7pre)
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en; rv:1.9a7pre) Gecko/2007072400 Camino/2.0a1pre (like Firefox/3.0a7pre)

I have noticed over the past week that when Camino has been running for a while, my system starts to slow down drastically. Upon observing Activity Monitor, I found that a WindowServer job is hogging in excess of 80% of CPU. It took a few iterations to notice a possible Camino connection. In addition, I have also noticed that this happens more when I am surfing a site with Flash content. In #Camino, I was told to note the number of ports Camino uses when the slowdown occurs. I'll add that data as a comment soon. In my normal use, Camino uses about 240+ ports at startup (iGoogle as home page with lots of feeds) and about 140+ afterwards.

Reproducible: Sometimes

Steps to Reproduce:
1. Open a website (possibly with Flash content)
2. Wait for a while

Actual Results:
System starts to slow down, with WindowServer starting to use much more CPU than it usually does. Just quitting Camino mostly doesn't help. Logout-login helps.

Expected Results:
There should be no system performance impact when Camino is running.
Wevah noticed a pretty extreme case of this in the first build after the "consolidated Cocoa popups patch" landed (which is why you're cc'd, Steven, but we're still trying to collect data).
Hooray; I'm not the only one! Some stuff to add: The "consolidated Cocoa popups patch" landing was the first build I'd used in months, since the system menus not working when Camino was active bothered me too much. At any rate, it may not be this patch specifically, but it's definitely possible. Also, in my case, not only did the window server hog CPU, but it would also eat VM until it could no longer allocate memory for new windows (it normally uses about 400-500 MB of VM; it was capping out at 4.5 GB after/while using Camino). In one case I actually got VRAM corruption (if I interpret the symptoms correctly), tossing funny colors all over my screen. A logout fixes the issue, provided I can catch it before it starts crashing applications.
What you still need to do, of course, is give the rest of us enough information to reproduce your problem.
Step 1: Run a Camino trunk nightly. Step 2: There is no step 2. There's really not much more to it than that. It could very well be system dependent, but it does happen.
I was going to add something about architecture, but the reporter is on PPC and I'm on Intel, so that's probably out...
> Step 1: Run a Camino trunk nightly.
> Step 2: There is no step 2.
Doesn't work for me, or I suspect for most other people.
Mahesh and Wevah, Here's some information that may help you figure out what's triggering your problem with the window server: If this problem isn't just a coincidence (that is if it's really caused by something in my consolidated popup patch), the culprit is most likely Apple functionality called an "event tap", which only started being used when my popup patch was landed. http://developer.apple.com/documentation/Carbon/Reference/QuartzEventServicesRef/Reference/reference.html#//apple_ref/c/func/CGEventTapCreate I installed an "event tap" on all mouseDown events in the current login session (i.e. in the window server itself). This is how I fixed the "system menus not working" problem (bug 381448). While playing with event taps I ran into a serious problem, which I worked around (clicking on any program's main menu froze the entire login session!). But it's entirely possible that there are other, more subtle problems (i.e. other Apple event tap bugs) that I missed. If so, what triggers your problem with the window server is likely to be some sequence of mouseDown events, possibly entirely outside of the browser.
> If so, what triggers your problem with the window server is likely
> to be some sequence of mouseDown events, possibly entirely outside
> of the browser.
What I called "mouseDown events" includes "left", "right" and "other" (middle button) mouseDown events.
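Editorial illustration of the point above: an event tap is registered with a bitmask that names the event types it wants, one bit per type. The sketch below is plain Python rather than the CoreGraphics C API the patch actually calls, and it is only a model of the mask arithmetic. The constant and function names are hypothetical; the numeric values are assumed to match CoreGraphics' `CGEventType` numbering, and `event_mask_bit` is assumed to behave like the `CGEventMaskBit` macro.

```python
# Event type numbers, assumed to match CoreGraphics' CGEventType values.
# The names here are hypothetical and exist only for this sketch.
LEFT_MOUSE_DOWN = 1
LEFT_MOUSE_UP = 2
RIGHT_MOUSE_DOWN = 3
OTHER_MOUSE_DOWN = 25  # middle button and any further buttons


def event_mask_bit(event_type: int) -> int:
    """Return a mask with the single bit for event_type set."""
    return 1 << event_type


# A tap on "all mouseDown events" asks for left, right and other.
MOUSE_DOWN_MASK = (
    event_mask_bit(LEFT_MOUSE_DOWN)
    | event_mask_bit(RIGHT_MOUSE_DOWN)
    | event_mask_bit(OTHER_MOUSE_DOWN)
)


def tap_receives(mask: int, event_type: int) -> bool:
    """True if a tap registered with mask would be handed event_type."""
    return bool(mask & event_mask_bit(event_type))
```

With this mask, every left, right or middle click in the login session would reach the tap's callback regardless of which application is in front, which is why the trigger could be mouse activity entirely outside the browser.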
That's very possible; I did notice some other menu wonkiness (menus popping up again right after they were dismissed and inability to click application menus sometimes) as well. I did a quick read-over of your patch, but I didn't know anything about the event tap stuff (and thereby didn't know what I should be looking for). I will try to read up on that stuff in the next few days, though, to see if I can figure anything else out that's actually useful.
Just to chime in, it happens on my Power Mac G5, but doesn't seem to happen on my MacBook Pro. I use both machines equally.
For the record, I don't (yet) see the problem on either a dual 1GHz PowerPC G4 or an early model 2GHz MacBook Pro.
For the record, I've seen this happen exactly twice. In both cases, I had a blogspot.com/ [1] page open for a while in a background tab, and more exactly a page with that blogger search bar at the top (that is loaded in an iframe: [src*="blogger.com/navbar"] ). I nuked that iframe in my userContent.css, and I haven't experienced the issue anymore - I even forgot about it :-). All that was shortly after Steven's patch landed. PowerBook G4 1.5Ghz, 1.25Gb ram, 10.4.10 ppc, as always a truckload of other apps (browsers) open. [1] like this one http://arablinks.blogspot.com/
I've had the 2007-07-23-04 Camino nightly running now for three hours with four tabs open in a single window. One of the tabs contains http://espn.go.com/ (_lots_ of Flash objects, many of which change every few seconds). Another contains philippe's http://arablinks.blogspot.com/, with the search bar containing the keyboard focus and visible (when the tab is visible). I've switched back and forth between tabs and fiddled with their contents (e.g. scrolling them up and down) every half-hour or so. I've had no problems of any kind. And the WindowServer stats are (basically) unchanged since when I started:
60 WindowServ 0.0% 14:15.76 3 226 431 3.80M 21.8M 23.8M 213M
It seems that this bug (the window server problem) has only been reported on recent trunk versions of Camino. And (in fact) I don't need to use an event tap on Camino at all (because Camino uses native context menus). (What fixed the "system menus not working" problem (bug 381448) was no longer using an event monitor on mouseDown events -- the event tap on a mouseDown event was what replaced it.) So here's a patch that doesn't install either an event tap or an event monitor if the "ui.use_native_popup_windows" preference is "true". The same logic is used in nsCocoaWindow.mm's nsCocoaWindow::StandardCreate() to avoid creating a Cocoa window for Camino. I've tested this in both Minefield and Camino, and it seems to work fine in both browsers. If possible I'd like to get this reviewed and landed pretty quickly ... but _not_ before A7 lands (apparently the tree is still closed for A7). I figure it's easier to get this into the trunk, and then into the hands of people who have seen this problem, than it is to get you guys to notice what's triggered the problem on your systems :-)
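Editorial illustration of the guard described above, as a minimal Python model rather than the Objective-C++ of the actual patch (attachment 274202, which is not reproduced in this thread). Only the preference name "ui.use_native_popup_windows" comes from the comment; the function names, the dictionary standing in for the preference service, and the choice to treat a missing preference as "false" are all assumptions made for this sketch.

```python
from typing import Callable, Mapping

# Preference name quoted in the comment above.
NATIVE_POPUPS_PREF = "ui.use_native_popup_windows"


def needs_event_hooks(prefs: Mapping[str, bool]) -> bool:
    """Decide whether the mouseDown tap/monitor is needed at all.

    Assumption for this sketch: a missing preference counts as false,
    i.e. the embedding app draws custom popups and needs the hooks.
    """
    return not prefs.get(NATIVE_POPUPS_PREF, False)


def setup_event_hooks(prefs: Mapping[str, bool],
                      install: Callable[[], None]) -> bool:
    """Call install() only when the hooks are needed.

    install is a hypothetical stand-in for whatever creates the event
    tap or event monitor. Returns True if it was called.
    """
    if not needs_event_hooks(prefs):
        return False  # native popups: skip the tap entirely
    install()
    return True
```

Under this model a browser that sets the preference to true (as Camino is described as doing) never installs the tap, while one that leaves it unset or false (as Minefield is described as needing) still does.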
Assignee: nobody → smichaud
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 274202
Patch to stop using event taps on Camino
I'm reluctant to ask Josh to review this, since he'll have a huge pile of stuff to do when he gets back from vacation next week. Stuart, is it alright if I ask you to review? If not, please pick someone else you think is appropriate.
Attachment #274202 - Flags: review?(stuart.morgan)
FYI, Stuart's away through/until Sunday.
Thanks for letting me know. Has Stuart been on vacation too? In that case I'll just ask Josh to review ... since they'll both have huge piles to wade through.
Unless you'd like to do the review, Mark ...
Attachment #274202 - Flags: review?(stuart.morgan) → review?(mark)
(In reply to comment #14)
> I figure it's easier to get this into the trunk, and then into the
> hands of people who have seen this problem, than it is to get you guys
> to notice what's triggered the problem on your systems :-)
I can't tell you exactly what caused the problem... :-). The first time WindowServer went mad I was working in Vienna (feed reader). That is mostly using the keyboard (arrows for navigation, return key to view a post in the browser, occasionally mouse or trackpad to scroll). The second time, iirc, I was working on a template for a site. That is going back and forth between text editor (SubEthaEdit) and browsers (again, mostly keyboard interaction/shortcuts - I might have used the mouse to click on a browser dock-icon). I'll apply your patch to my home-made build and report any issues.
> I'll apply your patch to my home-made build and report any issues.
Thanks. I look forward to hearing your results. But you've only experienced the problem twice, so I'll probably learn more from the people who seem to see it more often (once they get my patch in a nightly, or if I'm lucky once they add it to their own builds).
For the record, I've hit this twice in the last 24 hours, and quitting Camino completely fixed it both times. (The first time I logged out as well, because I needed to anyway, but when I hit it about an hour ago, quitting Camino appears to have totally fixed it without logging out.) That your patch changes behaviour of context menus, Steven, suggests that triggering context menus in a page might be a way to trigger the bug. Do you have any thoughts on that? It would be a lot easier for me to test the patch if I knew a reliable way to trigger this other than "open a bunch of feeds in NNWLite and browse for a while," which (obviously) doesn't trigger it all the time :-p
Comment on attachment 274202
Patch to stop using event taps on Camino
r=smorgan
Attachment #274202 - Flags: review?(mark) → review+
(In reply to comment #21)
> quitting Camino appears to have totally fixed it without logging out
Then your problem is new, and isn't the same as the one that was reported here.
> It would be a lot easier for me to test the patch if I knew a
> reliable way to trigger this other than "open a bunch of feeds in
> NNWLite and browse for a while," which (obviously) doesn't trigger
> it all the time :-p
In several of this bug's comments I've complained (loudly) that I can't reproduce this bug's problem at all. This situation hasn't changed in the last few days. And I haven't seen your problem either, not even once. (Though I think, from something he said to me on IRC chat, that sdwilsh might have seen it.) I know it's difficult. But if you want to be helpful, what you need to do is keep observing what happens just before your problem is triggered, and try to come up with a reasonably reliable procedure to reproduce it. I'd be happy with something that worked even 50% of the time. (Keep in mind that other software running on your system may make your problem more likely to happen. So (if you don't already have a plain-vanilla system) try disabling plugins and other (non-Apple) background processes, to see if this makes any difference.) If you have your own build, you might want to apply my patch, just to see what happens ... or what doesn't happen. But I doubt that my patch will fix your problem.
Attachment #274202 - Flags: superreview?(mark)
(Following up comment #21) As for this bug's problem, all I can suggest is what I said in comment #7. Notice that my "sequence of mouseDown" events could be _anywhere_. If my hunch is correct, there's no reason to think that opening a context menu has anything to do with it.
Comment on attachment 274202
Patch to stop using event taps on Camino
I'm not a superreviewer, I can't help here. http://www.mozilla.org/hacking/reviewers.html
Attachment #274202 - Flags: superreview?(mark) → superreview?
Attachment #274202 - Flags: superreview? → superreview?(mikepinkerton)
> I'm not a superreviewer, I can't help here.
Oops. Sorry.
For posterity's sake, I'm bumping this up to a blocker, since there is absolutely no way we have any business shipping a release with a bug like this in it. (Not that we're planning on shipping a release based on this code any time soon, but still.) As an update, it appears that quitting Camino fixes *most* of the slowdown, but not all of it, and logging out is the only way to completely fix the problem. I've now triggered this in trunk builds without ever actually *using* Camino, just launching it (with a blank home page which I subsequently closed) and using the mouse a lot in another app. I suppose that lends credence to Steven's event tap theory. cl
Severity: normal → blocker
Oh, and "use the mouse a lot in another app" has been fairly reliable in triggering it for me. Pick an app that involves using the mouse -- a game like Iago or Risk works great -- and have Camino running in the background, optionally with a page or two open. Keep at it for half an hour or so and you're pretty likely to trigger this without ever bringing Camino to the foreground. cl
Ok - since my last comment (comment 19) I've used home made builds with that patch applied. I haven't had any issue, especially no crazy WindowServer. Based on Steven's comments I've forced myself to use the mouse more than I normally would, including clicking or control-clicking on app icons in the dock. Also, my 5-year-old daughter spent an hour or so playing a game (mouse mouse mouse) - with Camino idle in the background. No problems. Then I fetched the latest hourly build from Tinderbox to see what that new tabosé looks like. Camino with one tab open in the background, working in Mail.app: sorting emails and the like. Again, using the mouse way more than I normally do. Within half an hour WindowServer went through the roof (450% CPU...). The sequence of events: command-clicked a link in Mail.app, finished reading the message, clicked on the Camino icon in the dock to bring it forward. Checked that page, hid Camino, continued in Mail.app (typing and some copy-paste). End. Quit everything as fast as possible, logged out and back in. Back to my patched home made build. In short: that patch fixed that issue on my side. And I haven't found any problems with it.
This sounds like the way to go, for sure, since we don't need anything to do with custom popup code at all. What CL said about having to log out is true; once the window server (for me) bloats and can no longer allocate windows, the only way to fix it is to log out. Thanks a ton for working to fix this for those of us that experience the bug.
Thanks for the additional information. I'll work through the recent comments to see if I can find out how to reproduce this problem on one of my systems. If event taps are causing all this trouble in Camino (or while Camino is running), this doesn't bode well for Minefield/Firefox, where they're actually needed :-( Could those of you who've found reasonably easy ways to reproduce this problem try them with Minefield running in the background (and not Camino)?
Last night I used the nightly trunk build of Minefield for a couple of hours. I did see similar behavior. I hit the slowdown twice in two hours. Once I had enough time to log out and log back in. The second time it was too late. The UI was almost frozen. I could go between windows by clicking on them, but could not close them. The Dock was stuck. The keyboard shortcut for logout wouldn't work. Very similar to what the Camino build does. In both cases, Minefield was running with maybe 2 tabs - sometimes in the foreground, but mostly in the background. The other activity was either browsing email in Mail.app or IRC in Colloquy. I tried to use the context menu more often than I usually do.
I'd like to land my patch for this (attachment 274202) on the trunk before I go on vacation next week. I've asked Mike Pinkerton to superreview ... but he hasn't yet responded, and I'm not sure he's available. Stuart, if need be could you make your review also count as a superreview?
Comment on attachment 274202
Patch to stop using event taps on Camino
sr=pink.
Attachment #274202 - Flags: superreview?(mikepinkerton) → superreview+
Comment on attachment 274202
Patch to stop using event taps on Camino
This bug seems pretty urgent. I'd like to land my patch as soon as possible (before I go on vacation next week).
Attachment #274202 - Flags: approval1.9?
(Following up comment #35) The patch (attachment 274202) has almost no risk -- it's very simple, and just stops event taps from being used where they're not needed (in browsers that, like Camino, don't use custom context menus).
Stuart or Pav, can we get approval for this patch? See comment 36 for a risk analysis.
Comment on attachment 274202
Patch to stop using event taps on Camino
a1.9=dbaron
Attachment #274202 - Flags: approval1.9? → approval1.9+
FIXED per comment 40 (but please holler if for some reason this persists in tomorrow's trunk builds); thanks Steven!
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Blocks: 436897