Closed Bug 190158 Opened 22 years ago Closed 22 years ago

100% CPU usage and mozilla staying in memory after exit

Categories

(SeaMonkey :: General, defect)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 189672

People

(Reporter: bernard.alleysson, Assigned: darin.moz)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3b) Gecko/20030114
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3b) Gecko/20030114

After a while (a few minutes of browsing any URL, for instance mozilla.org or
bugzilla.mozilla.org), mozilla takes 100% CPU.
It also stays in memory after exiting, so the process has to be killed.

Reproducible: Always

Steps to Reproduce:
1. browse for a while (few minutes, any web site)



Expected Results:  
0% CPU usage, exit properly

I've attached the VC++ debugger to the process and I see that all the CPU time is
spent in one particular thread, running this code:

NS_METHOD nsAppShell::Run(void)
{

....
....

  gKeepGoing = 1;
  // Process messages
  do {
    // Give priority to system messages (in particular keyboard, mouse,
    // timer, and paint messages).
     if (PeekKeyAndIMEMessage(&msg, NULL) ||
#ifdef MOZ_UNICODE
         nsToolkit::mPeekMessage(&msg, NULL, WM_MOUSEFIRST, WM_MOUSELAST, PM_REMOVE) ||
         nsToolkit::mPeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) { // <======== THIS ALWAYS RETURNS 1
#else
         ::PeekMessage(&msg, NULL, WM_MOUSEFIRST, WM_MOUSELAST, PM_REMOVE) ||
         ::PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
#endif /* MOZ_UNICODE */
      keepGoing = (msg.message != WM_QUIT); // <======= keepGoing IS ALWAYS 1 HERE

      if (keepGoing != 0) {
//#ifdef MOZ_AIMM // not need?
//      if (!nsToolkit::gAIMMMsgPumpOwner || (nsToolkit::gAIMMMsgPumpOwner->OnTranslateMessage(&msg) != S_OK))
//#endif
        TranslateMessage(&msg);
#ifdef MOZ_UNICODE
        nsToolkit::mDispatchMessage(&msg);
#else
        ::DispatchMessage(&msg);
#endif /* MOZ_UNICODE */
        if (mDispatchListener)
          mDispatchListener->AfterDispatch();
      }
    } else {

      PRBool hasTimers;
      timerManager->HasIdleTimers(&hasTimers);
      if (hasTimers) {
        do {
          timerManager->FireNextIdleTimer();
          timerManager->HasIdleTimers(&hasTimers);
#ifdef MOZ_UNICODE
        } while (hasTimers && !nsToolkit::mPeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE));
#else
        } while (hasTimers && !::PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE));
#endif
      } else {

        if (!gKeepGoing) {
          // In this situation, PostQuitMessage() was called, but the WM_QUIT
          // message was removed from the event queue by someone else -
          // (see bug #54725).  So, just exit the loop as if WM_QUIT had been
          // received...
          keepGoing = 0;
        } else {
          // Block and wait for any posted application message
          ::WaitMessage();
        }
      }
    }

  } while (keepGoing != 0);
  Release();
  return msg.wParam;
}
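
The effect described above (the peek branch is taken on every pass, so the loop never reaches ::WaitMessage()) can be illustrated with a small portable simulation. This is only a sketch under stated assumptions: MsgKind, PumpStats and RunPump are hypothetical names, a std::deque stands in for the Win32 message queue, and nothing here is taken from the Mozilla tree. It does not claim to identify which message keeps the real queue non-empty.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical stand-ins for the Win32 queue; not Mozilla or Win32 names.
enum MsgKind { kWork, kQuit };

struct PumpStats {
  int dispatched = 0;  // messages handed to the handler
  int idleWaits = 0;   // times the loop reached the blocking "wait" branch
  bool sawQuit = false;
};

// Runs a peek/dispatch loop over `queue`. `onDispatch` may post new messages.
// `maxIterations` exists only so that the simulation always terminates.
PumpStats RunPump(std::deque<MsgKind> queue,
                  const std::function<void(std::deque<MsgKind>&)>& onDispatch,
                  int maxIterations) {
  PumpStats stats;
  for (int i = 0; i < maxIterations; ++i) {
    if (!queue.empty()) {  // corresponds to "PeekMessage returned 1"
      MsgKind msg = queue.front();
      queue.pop_front();
      if (msg == kQuit) {
        stats.sawQuit = true;
        break;
      }
      ++stats.dispatched;
      onDispatch(queue);
    } else {  // idle branch: a real loop would block here and use no CPU
      ++stats.idleWaits;
      break;
    }
  }
  return stats;
}
```

With a handler that posts nothing, two queued messages are dispatched and the loop then goes idle. With a handler that posts a new message on every dispatch, the queue is never empty, the idle branch is never reached, and the loop spins until the iteration cap, which is the shape of a "soft" 100% CPU loop.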
reassigning to yokoyama according to CVS blame

it is a home build with MOZ_UNICODE defined
and I can reproduce the bug on 2 machines (Windows 2000 and XP Pro)
Assignee: asa → yokoyama
I've been running moz 1.3a (Gecko/20021212) for some time on my Win XP,
but I don't see this behavior.

Anybody else seeing this?
This sounds like a dupe of bug 190003 - although I didn't think that it started
happening as early as the 1/14 builds...

Does turning off pipelining make the problem go away?
Bernard, can you try the schedule fix in bug 189718 and see if that fixes the
problem?
In addition to comment #2:
Since Nov 2002, the mozilla trunk build has included the MOZ_UNICODE flag
by default on Windows platforms. No other platforms are affected.
in response to comment #2, I noticed this problem too with recent builds (I
started seeing it with last monday's builds but it could have been there before)
Could someone remove the MOZ_UNICODE flag and rebuild the trunk
to see if this bug still occurs?
(See http://bugzilla.mozilla.org/show_bug.cgi?id=104934 for unicode enable patch)

Unfortunately, I don't have a moz tree anymore. (off to other project...)
i got this and my debugging shows that mozilla spends the time in networking/NNTP.
(bug 189718)
Jason (Comment #3): pipelining is off and has always been off, so it doesn't look
like a dup of that bug

Warner (Comment #4): I don't use mail/news at all

Pascal (Comment #6): YES, this is definitely a problem with recent builds (the last
few days, maybe since monday, yes). I waited a few days before submitting this bug.

Matti (Comment #8):

What is special about this bug is that it is a "soft" "100% CPU bug".
I mean that even if mozilla takes 100% CPU, the machine is still usable and you
have to look at task manager to see the problem. I think that the time is really
spent in a GetMessage()/TranslateMessage()/DispatchMessage() loop with no
CPU-intensive processing, so basically the machine is still usable (even mozilla,
i.e. you can continue browsing, etc.).

What is also special compared to other "100% CPU" bugs is that mozilla doesn't
exit cleanly. It stays in memory taking 100% CPU and has to be killed.

I didn't find any special steps to reproduce other than "use it !". In fact
I am experiencing the bug right now, and it happened while I was just reading some text !!
it seems that it might be related to news?
My NNTP (1GB+) log is flooded with:
0[2c3ef0]: (2f06f80) Next state: NEWS_FREE
0[2c3ef0]: (2f06f80) CleanupAfterRunningUrl()
0[2c3ef0]: (2f06f80) setting busy to 0
0[2c3ef0]: (4667078) Next state: NEWS_FREE
0[2c3ef0]: (4667078) CleanupAfterRunningUrl()
0[2c3ef0]: (4667078) setting busy to 0

and it keeps growing!
Henrik,

That's bug 189718
Probably related to bug 176919.
I don't think this performance degradation is due to Unicode enabling.
Because 
- degradation started with recent builds
- can't reproduce with 1.3a trunk build

asa: would you suggest someone to own this bug?  Perhaps performance team?
I saw this for the first time today on WinXP SP1 with 2003012105 Trunk build.

As for comment #9 (referring to Comment #8), WinXP runs Mozilla at a "Normal"
priority which allows other tasks to interrupt Mozilla relatively easily.

However, in my case, I have a "Low" priority task which normally uses >90% CPU,
but when I saw this bug, it was down at 0%, and Mozilla was up at >90%.
*** Bug 190348 has been marked as a duplicate of this bug. ***
if you are seeing this bug with a build starting 20030118, then you are most
likely seeing a regression from bug 176919.  this bug was filed before the patch
for bug 176919 landed, so please be careful to distinguish this bug from bugs
having to do with bug 176919... if that's even possible :-/
Darin, if I understand you correctly, if a post 1/18 build has this 100% CPU
problem, then it is caused by Bug #176919 (which has been marked as fixed).
Thus, if only pre 1/18 builds are part of this bug, then isn't this bug fixed?
david: no, i meant that if you see a bug like this with a build from 1/18 or
later, then it may be difficult to determine if it is _this_ bug or a regression
from bug 176919.
I don't know if this helps -- or is even related -- but if I browse sites that
use a lot of JPEGs, Mozilla seems to stop loading webpages and JPEGs after a few
minutes. I didn't notice this in 2003012008, but in 2003012308 it happens all
the time.

The good thing is that I tried turning off pipelining, as was suggested here,
and that seems to have solved the problem.
100% CPU usage and mozilla staying in memory after exit:

I have the same problem. I use nightly builds, updated weekly.
My OS: windows 2000 sp3

I've rebuilt a DEBUG build
and I found that all the CPU time is spent in nsInputStreamPump.cpp
I'll attach some code paths I went through
so I think that it may be a regression from bug 176919
... and I hope this will be fixed for 1.3b
For now it is difficult for me to use Mozilla, because after 10-20 minutes of use it
takes 100% CPU until I kill the process
Flags: blocking1.3b?
assign to darin as per #21
Assignee: yokoyama → darin
should be a dupe of bug 189672
I can reproduce bug 189672 and this is not the same bug, because
for 189672 the CPU goes back to 0% (normal situation) as soon as you leave the
particular web site. Here what happens is that once you get 100% CPU you are stuck
forever until you kill the process.
I agree that this is separate from Bug #189672, for two reasons. One is that Bug
#189672 has been fixed (on 1/25) and I am going to assume we will still see this
100% CPU problem from builds after 1/25. Second, while this bug may be caused by
the same thing as bug #189672, we won't really know for a few days, so this should
remain separate just in case it is a different problem.

Thus, if anyone with a post 1/25 build sees this particular problem again,
please post your build details and your OS details. NOTE: Only a couple of
people will need to do this, so please don't bury this bug in "Me too's".
with build 2003012517 (includes fix for bug 189672), an assertion pops up on
ftp://ftp.mozilla.org/pub/data/crash-data/Trunk-topcrashers.html after the
document is loaded and while the browser is idle:

NS_ERROR("OnDataAvailable implementation consumed no data");
nsInputStreamPump.cpp, line 420

and the infinite loop is broken,
so far this looks good and fix for bug 189672 seems to fix this bug too (thx for
the comments in the code)
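
The behavior reported here (a listener that consumes no data used to keep the pump spinning, and the new assertion breaks the loop) can be sketched as a portable simulation. The names Listener, PumpResult and PumpStream are hypothetical, and the progress check is only an illustration of the idea behind the quoted NS_ERROR; this is not the actual code in nsInputStreamPump.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical listener: given the number of bytes available,
// returns how many bytes it actually consumed.
using Listener = std::function<std::size_t(std::size_t)>;

enum class PumpResult { Drained, ListenerStalled, IterationLimit };

// Feeds `available` bytes to `listener` until the data is drained.
// With `checkProgress` set, a listener that consumes nothing stops the pump.
// `maxIterations` exists only so that the simulation always terminates.
PumpResult PumpStream(std::size_t available, const Listener& listener,
                      bool checkProgress, int maxIterations) {
  for (int i = 0; i < maxIterations && available > 0; ++i) {
    std::size_t consumed = listener(available);
    if (consumed > available) consumed = available;  // never go below zero
    if (checkProgress && consumed == 0) {
      // Analogous in spirit to "OnDataAvailable implementation consumed
      // no data": give up instead of calling the listener again.
      return PumpResult::ListenerStalled;
    }
    available -= consumed;
  }
  return available == 0 ? PumpResult::Drained : PumpResult::IterationLimit;
}
```

Without the check, a stalled listener is called again and again with the same pending data and only the artificial iteration cap ends the run. With the check, the first zero-byte read ends the loop.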

Flags: blocking1.3b? → blocking1.3b-
good news, I can't reproduce this anymore
feel free to reopen if you see this for recent builds (date > 1/26)

*** This bug has been marked as a duplicate of 189672 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Product: Browser → Seamonkey