100% CPU usage after resume from hibernate or suspend --- now with clue

RESOLVED DUPLICATE of bug 376643

Status

()

defect
--
major
RESOLVED DUPLICATE of bug 376643
12 years ago
a year ago

People

(Reporter: whorfin, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Reporter

Description

12 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2

There are many, many posts and forum notices of firefox chewing CPU mysteriously.
I've not, however, seen exactly this issue reported, nor anything as specific
as the debug trail i'm reporting.  Admins should of course feel free to
reclassify this report should it turn out i missed something.

Upon resume from suspend on a laptop, or resume from hibernate on a a
laptop or desktop, i often see Firefox chew 100% CPU [literally; it's
using one CPU on multiprocs].  After some amount of time, sometimes
ten minutes or more, sometimes only a few minutes, eventually it simmers
down and returns to normal, ie minimal CPU.  This seems to be more likely
if many firefox windows and tabs were active, and again more likely when
a virtual desktop software is in usage [i favor Virtual Dimension]. It has
happened on all laptops i have tried [three], and three desktop machines.

This happens quite regularly, and even without touching firefox.  in other
words, the web not being actively browsed, firefox quiescent, suspend.
upon resume, CPU is pegged for many minutes, but ultimately does resolve
itself if one waits long enough.  If one kills firefox and then resumes
exactly as before [using session recovery], then CPU usage will be normal.
this indicates that it it not malformed site content, ie
javascript, as the CPU load is not evidenced on first time load, only on
system hibernate/suspend/resume, and further ultimately settles down.


This has been going on since i first started using firefox, certainly since
1.5 days.

While i don't know exactly what's happening, recent frustration got me to
delve into a debugger [on the no-symbols release version], so i have some
information to contribute.  Here's a snippet of the code i saw stuck
in infinite loop in the debugger.


0055C24A   .  B8 395E9600   MOV EAX,firefox.00965E39
0055C24F   .  E8 AC6F4000   CALL firefox.00963200
0055C254   .  83EC 30       SUB ESP,30
0055C257   .  53            PUSH EBX
0055C258   .  8B5D 08       MOV EBX,DWORD PTR SS:[EBP+8]
0055C25B   .  56            PUSH ESI
0055C25C   .  57            PUSH EDI
0055C25D   .  8B03          MOV EAX,DWORD PTR DS:[EBX]
0055C25F   .  53            PUSH EBX
0055C260   .  FF50 04       CALL NEAR DWORD PTR DS:[EAX+4]
0055C263   .  33FF          XOR EDI,EDI
0055C265   .  8D45 E8       LEA EAX,DWORD PTR SS:[EBP-18]
0055C268   .  47            INC EDI
0055C269   .  33F6          XOR ESI,ESI
0055C26B   .  897D EC       MOV DWORD PTR SS:[EBP-14],EDI
0055C26E   .  C745 E0 1483A>MOV DWORD PTR SS:[EBP-20],firefox.00AF83>;  ASCII "@mozilla.org/timer/manager;1"
0055C275   .  8945 E4       MOV DWORD PTR SS:[EBP-1C],EAX
0055C278   .  8975 08       MOV DWORD PTR SS:[EBP+8],ESI
0055C27B   .  8D45 E0       LEA EAX,DWORD PTR SS:[EBP-20]
0055C27E   .  68 D8D9A300   PUSH firefox.00A3D9D8
0055C283   .  50            PUSH EAX
0055C284   .  8D4D 08       LEA ECX,DWORD PTR SS:[EBP+8]
0055C287   .  8975 FC       MOV DWORD PTR SS:[EBP-4],ESI
0055C28A   .  FF15 BC919600 CALL NEAR DWORD PTR DS:[<&xpcom_core.?as>;  xpcom_co.?assign_from_gs_contractid_with_error@nsCOMPtr_base@@QAEXABVnsGetServiceByContractIDWithError@@ABUnsID@@@Z
0055C290   .  F645 EB 80    TEST BYTE PTR SS:[EBP-15],80
0055C294   .  897D FC       MOV DWORD PTR SS:[EBP-4],EDI
0055C297   .  74 08         JE SHORT firefox.0055C2A1
0055C299   .  8B75 E8       MOV ESI,DWORD PTR SS:[EBP-18]
0055C29C   .  E9 D3000000   JMP firefox.0055C374
0055C2A1   >  8B45 08       MOV EAX,DWORD PTR SS:[EBP+8]
0055C2A4   .  57            PUSH EDI
0055C2A5   .  50            PUSH EAX
0055C2A6   .  8B08          MOV ECX,DWORD PTR DS:[EAX]
0055C2A8   .  FF51 10       CALL NEAR DWORD PTR DS:[ECX+10]
0055C2AB   .  893D 1083AF00 MOV DWORD PTR DS:[AF8310],EDI
0055C2B1   >  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]            ;  top of spin loop
0055C2B4   .  56            PUSH ESI
0055C2B5   .  50            PUSH EAX
0055C2B6   .  E8 22FFFFFF   CALL firefox.0055C1DD
0055C2BB   .  59            POP ECX
0055C2BC   .  85C0          TEST EAX,EAX
0055C2BE   .  59            POP ECX
0055C2BF   .  75 7D         JNZ SHORT firefox.0055C33E               ;  not taken
0055C2C1   .  57            PUSH EDI                                 ; /RemoveMsg
0055C2C2   .  68 0A020000   PUSH 20A                                 ; |MsgFilterMax = WM_MOUSEWHEEL
0055C2C7   .  68 00020000   PUSH 200                                 ; |MsgFilterMin = WM_MOUSEMOVE
0055C2CC   .  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]            ; |
0055C2CF   .  56            PUSH ESI                                 ; |hWnd
0055C2D0   .  50            PUSH EAX                                 ; |pMsg
0055C2D1   .  FF15 80FFB300 CALL NEAR DWORD PTR DS:[B3FF80]          ; \PeekMessageW
0055C2D7   .  85C0          TEST EAX,EAX
0055C2D9   .  75 63         JNZ SHORT firefox.0055C33E               ;  not taken
0055C2DB   .  57            PUSH EDI                                 ; /RemoveMsg
0055C2DC   .  56            PUSH ESI                                 ; |MsgFilterMax
0055C2DD   .  56            PUSH ESI                                 ; |MsgFilterMin
0055C2DE   .  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]            ; |
0055C2E1   .  56            PUSH ESI                                 ; |hWnd
0055C2E2   .  50            PUSH EAX                                 ; |pMsg
0055C2E3   .  FF15 80FFB300 CALL NEAR DWORD PTR DS:[B3FF80]          ; \PeekMessageW
0055C2E9   .  85C0          TEST EAX,EAX
0055C2EB   .  75 51         JNZ SHORT firefox.0055C33E               ;  taken
0055C2ED   .  8B45 08       MOV EAX,DWORD PTR SS:[EBP+8]
0055C2F0   .  8D55 F0       LEA EDX,DWORD PTR SS:[EBP-10]
0055C2F3   .  52            PUSH EDX
0055C2F4   .  50            PUSH EAX
0055C2F5   .  8B08          MOV ECX,DWORD PTR DS:[EAX]
0055C2F7   .  FF51 14       CALL NEAR DWORD PTR DS:[ECX+14]          ;  an xpcom_core.dll call
0055C2FA   .  3975 F0       CMP DWORD PTR SS:[EBP-10],ESI
0055C2FD   .  74 2F         JE SHORT firefox.0055C32E
0055C2FF   >  8B45 08       MOV EAX,DWORD PTR SS:[EBP+8]
0055C302   .  50            PUSH EAX
0055C303   .  8B08          MOV ECX,DWORD PTR DS:[EAX]
0055C305   .  FF51 18       CALL NEAR DWORD PTR DS:[ECX+18]
0055C308   .  8B45 08       MOV EAX,DWORD PTR SS:[EBP+8]
0055C30B   .  8D55 F0       LEA EDX,DWORD PTR SS:[EBP-10]
0055C30E   .  52            PUSH EDX
0055C30F   .  50            PUSH EAX
0055C310   .  8B08          MOV ECX,DWORD PTR DS:[EAX]
0055C312   .  FF51 14       CALL NEAR DWORD PTR DS:[ECX+14]
0055C315   .  3975 F0       CMP DWORD PTR SS:[EBP-10],ESI
0055C318   .  74 48         JE SHORT firefox.0055C362
0055C31A   .  56            PUSH ESI                                 ; /RemoveMsg
0055C31B   .  56            PUSH ESI                                 ; |MsgFilterMax
0055C31C   .  56            PUSH ESI                                 ; |MsgFilterMin
0055C31D   .  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]            ; |
0055C320   .  56            PUSH ESI                                 ; |hWnd
0055C321   .  50            PUSH EAX                                 ; |pMsg
0055C322   .  FF15 80FFB300 CALL NEAR DWORD PTR DS:[B3FF80]          ; \PeekMessageW
0055C328   .  85C0          TEST EAX,EAX
0055C32A   .^ 74 D3         JE SHORT firefox.0055C2FF
0055C32C   .  EB 34         JMP SHORT firefox.0055C362
0055C32E   >  3935 1083AF00 CMP DWORD PTR DS:[AF8310],ESI
0055C334   .  74 35         JE SHORT firefox.0055C36B
0055C336   .  FF15 6C779600 CALL NEAR DWORD PTR DS:[<&USER32.WaitMes>; [WaitMessage
0055C33C   .  EB 24         JMP SHORT firefox.0055C362
0055C33E   >  33C0          XOR EAX,EAX                              ;  notice how we just skipped the Wait; CPU is consumed unless we do that wait!
0055C340   .  837D C8 12    CMP DWORD PTR SS:[EBP-38],12
0055C344   .  0F95C0        SETNE AL
0055C347   .  3BC6          CMP EAX,ESI
0055C349   .  8945 EC       MOV DWORD PTR SS:[EBP-14],EAX
0055C34C   .  74 1D         JE SHORT firefox.0055C36B                ;  not taken
0055C34E   .  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]
0055C351   .  50            PUSH EAX                                 ; /pMsg
0055C352   .  FF15 CC769600 CALL NEAR DWORD PTR DS:[<&USER32.Transla>; \TranslateMessage
0055C358   .  8D45 C4       LEA EAX,DWORD PTR SS:[EBP-3C]
0055C35B   .  50            PUSH EAX                                 ; /pMsg
0055C35C   .  FF15 7CFFB300 CALL NEAR DWORD PTR DS:[B3FF7C]          ; \DispatchMessageW
0055C362   >  3975 EC       CMP DWORD PTR SS:[EBP-14],ESI
0055C365   .^ 0F85 46FFFFFF JNZ firefox.0055C2B1                     ;  to top of repeating spin loop
0055C36B   >  8B03          MOV EAX,DWORD PTR DS:[EBX]
0055C36D   .  53            PUSH EBX
0055C36E   .  FF50 08       CALL NEAR DWORD PTR DS:[EAX+8]
0055C371   .  8B75 CC       MOV ESI,DWORD PTR SS:[EBP-34]
0055C374   >  834D FC FF    OR DWORD PTR SS:[EBP-4],FFFFFFFF
0055C378   .  8D4D 08       LEA ECX,DWORD PTR SS:[EBP+8]
0055C37B   .  FF15 90929600 CALL NEAR DWORD PTR DS:[<&xpcom_core.??1>;  xpcom_co.??1nsCOMPtr_base@@QAE@XZ
0055C381   .  8B4D F4       MOV ECX,DWORD PTR SS:[EBP-C]
0055C384   .  8BC6          MOV EAX,ESI
0055C386   .  5F            POP EDI
0055C387   .  5E            POP ESI
0055C388   .  5B            POP EBX
0055C389   .  64:890D 00000>MOV DWORD PTR FS:[0],ECX
0055C390   .  C9            LEAVE
0055C391   .  C2 0400       RETN 4

What i observed when single-stepping was that this loop runs typically as
commented [with taken/not taken for the jumps] when the CPU is pegged in
the above-discussed symptoms.  When things settle down, WaitMessage gets
called, and CPU usage drops to normal.

I _think_ [but do not know] that this code is from the main message
handler loop of nsAppShell::Run from nsAppShell.cpp.  similar code
is in nsWinMain.cpp in webshell/tests/viewer

It would appear that something must be going haywire
with the idle timers.  Is it possible something goes nuts when
a timer firing or message delivery happens "way too late" because
of the suspend/resume?

Reproducible: Sometimes

Steps to Reproduce:
1. Have firefox running with many open windows and tabs
2. Hibernate or suspend the machine
3. Wait an hour or more
4. Resume the machine

Actual Results:  
Note that firefox is now pegging a CPU at 100%
Eventually, firefox simmers back down to normal

Expected Results:  
Machine should restore to state it was in before --- ie firefox
quietly idle, not using 100% of a CPU!

I classify this as Major since it renders a single-proc machine
almost unusable.  Because it's a tight spin-loop, performance
is severely degraded in firefox itself even on multi-core machines.
Do you have the latest flash version installed? 
Reporter

Comment 2

12 years ago
(In reply to comment #1)
> Do you have the latest flash version installed? 

Yes.  the About Flash page reports:
You have version 9,0,28,0 installed

I had this problem with the previous version of flash, and
flash v8 before that.

Comment 3

12 years ago
I've noticed this on SeaMonkey.  I think the problem is that the page is using a setInterval and when you resume it runs every interval between when you suspended and the current time.
Reporter

Comment 4

12 years ago
Right now, firefox is eating 100% of one of my cpus [ie it's doing this
bug right now].  Disabling javascript from the options menu does not change this;
javascript is disabled, but CPU usage is still pegged.  For what it's worth...
Reporter

Comment 5

12 years ago
I'd have to say i'm fairly convinced this isn't due to setInterval or javascript.
I quit firefox [in a 100% cpu state]
Restart, and resume session.
After much loading, normal cpu idle.
Disable javascript [and java]
Suspend
Resume 4.5 hours later
CPU pegs, javascript [and java] still disabled.
Ergo, can't possibly be javascript or setInterval related, right?
Reporter

Comment 6

12 years ago
...perhaps i spoke too soon.  approximately five minutes later, cpu settled
down to normal.  So, that's bad [there should be no cpu peg due to firefox
just because i resumed], but that's better than the normal scenario,
where it takes tens of minutes or hours to quiet down.

Comment 7

12 years ago
Do you maybe have an extension loaded that uses setInterval?  I can reproduce this using a page with setInterval.  And I don't see it otherwise.  Btw, I'm using Linux.  If you're sure this isn't the problem you're seeing, then I'll probably open a new bug for setInterval. :-)
Reporter

Comment 8

12 years ago
Good point, a number of extensions i use have setInterval calls in them.
AdBlockPlus, in particular, seems to reload prefs every 2 seconds via
setInterval.  This could explain the behavior i'm seeing even with
javascript disabled.

The current summary is --- with javascript active, firefox chews 100% cpu
upon suspend/overnight/resume for from hours to days.  With javascript
disabled, firefox chews 100% cpu upon suspend/overnight/resume for between five
and twenty minutes.  The amount of time spent in this churn state does seem directly related [though perhaps not linearly] to exactly how long the machine was in suspended state.

Maybe the thing to do is to open a bug for setInterval as you describe, 
and make my bug a dependent.  When setInterval is fixed, I can test, and we'll
see either that "it's fixed" or it isn't, and if it isn't, how the path through
the code referenced above has changed [if at all].

Comment 9

12 years ago
Could you disable the extensions or try it with a new profile and see if it still happens?
Reporter

Comment 10

12 years ago
I will try, it takes some time to go through suspend/resume cycles etc.
I just disabled all extensions and shut down firefox, causing a crash:
TB30867552Z
Interesting - firefox was in its usual 100% CPU, and the symptoms were all
windows had closed but CPU usage was still 100%.  Eventually this crash...

At present javascript and all extensions are disabled.  I can't work like
this but i'll leave the machine suspended and see what happens tomorrow.
Reporter

Comment 11

12 years ago
OK.

With all extensions AND javascript disabled, this bug does not show itself.
With a single setInterval-using extension enabled [AdBlockPlus for instance,
which sets a 2-second interval check], firefox exhibits the problem reported.

So, I think Samuel Sieb has identified the problem --- 
    "when you resume it runs every interval between when you
     suspended and the current time."
This sure seems consistent with what I observe.  Moreover, disabling javascript
doesn't immediately turn-off this bogus servicing of zillions of pending
intervals, if javascript had been active upon suspend/resume.

The extent of how horrible this problem is depends on how many extensions 
one has which use setInterval, what those setIntervals are, how many currently
active web pages and their setInterval usage [assuming javascript enabled]
and how long a machine has been suspended before resume.  As those factors
can lead to absurdly long numbers of intervals to service, firefox becomes
unusable.

Reporter

Updated

12 years ago
Depends on: 376643
Reporter

Comment 12

12 years ago
Logged Bug 376643, making this bug dependent

Comment 13

12 years ago
BTW, to test this you don't have to go through a full Hibernate/Resume cycle; For Windows you can use SysInternals PsSuspend: http://www.microsoft.com/technet/sysinternals/utilities/pssuspend.mspx
...and for Unix you can send STOP and later CONT signals to the process. Hope this helps simplify testing

Comment 14

12 years ago
Ah sorry, I now checked the other bug and see you're already aware.

Comment 15

12 years ago
Bug 376643 was just fixed.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 12 years ago
No longer depends on: 376643
Resolution: --- → DUPLICATE
Duplicate of bug: 376643
You need to log in before you can comment on or make changes to this bug.