87861 - Mac does not idle.

Assignee

Description

•

23 years ago

The counter for in-use file transports is going negative. This causes the Mac to 
think that the browser is continually busy, which causes permanent high CPU load.

This is a bad thing on both Mac OS 9 and X (it drains my PowerBook batteries and 
causes the machine to run hot, for one thing).

Simon Fraser [no longer active]

Assignee

Comment 1

•

23 years ago

cc: gagan. I think this is serious enough that we need to fix it on the branch 
(and it should be an easy fix)

Simon Fraser [no longer active]

Assignee

Comment 2

•

23 years ago

The problem seems to be that the in-use count is decremented even for files that 
were not opened correctly. Here's part of the log:

22[4be7398]: nsFileTransport: OPEN_FOR_READ [this=4e01a04 jar:resource:///chrome/
modern.jar!/skin/modern/navigator/back-hover.gif] status=80520006
22[4be7398]: nsFileTransport: Leaving Process [this=4e01a04 state=4 status=
80520006]
22[4be7398]: nsFileTransport: Inside Process [this=4e01a04 state=4 status=
80520006]
22[4be7398]: nsFileTransport: END_READ [this=4e01a04 jar:resource:///chrome/
modern.jar!/skin/modern/navigator/back-hover.gif, 0] status=80520006

So there is no START_READ call (which increments the in-use count), but END_READ 
is still called, and decrements the count.

It seems that jar:resource:///chrome/modern.jar!/skin/modern/navigator/back-
hover.gif is missing.

Simon Fraser [no longer active]

Assignee

Comment 3

•

23 years ago

Attached patch Potential patch — Details — Splinter Review

Simon Fraser [no longer active]

Assignee

Comment 4

•

23 years ago

This patch ensures that the in-use count is only decremented in END_READ/
END_WRITE if START-READ/START_WRITE was called. It also adds a debug-only printf 
when there was an error reading the file, so that skin developers can know when 
they screwed up.

Gagan

Comment 5

•

23 years ago

this bug does sound serious to me. nominating for the branch. simon you want to 
own this?

Keywords: nsBranch

Priority: -- → P2

Whiteboard: need sr/a

Target Milestone: --- → mozilla0.9.3

Darin Fisher

Comment 6

•

23 years ago

sr=darin, though there appears to be an indentation problem here:

+        if (mStartWriteCalled)
         PR_AtomicDecrement(&mService->mInUseTransports);

Doug Turner (:dougt)

Comment 7

•

23 years ago

why is the state going into END_READ without passing througt START_READ?

Simon Fraser [no longer active]

Assignee

Comment 8

•

23 years ago

Because of this:

      case OPEN_FOR_READ: { 
        mStatus = mStreamIO->Open(&mContentType, &mTotalAmount);
       ...
        mXferState = NS_FAILED(mStatus) ? END_READ : START_READ;

Doug Turner (:dougt)

Comment 9

•

23 years ago

How about incrementing the count in OPEN_FOR_* instead?

Simon Fraser [no longer active]

Assignee

Comment 10

•

23 years ago

That might work, as long as we can guarantee that we get matched increment/
decrements.

Darin Fisher

Comment 11

•

23 years ago

we have bigger problems if OnStartRequest is not paired with OnStopRequest, so
you should be able to match up the increment/decrement with the call sites of those.

Simon Fraser [no longer active]

Assignee

Comment 12

•

23 years ago

Attached patch New patch; moves increment into OPEN_FOR_ — Details — Splinter Review

Simon Fraser [no longer active]

Assignee

Comment 13

•

23 years ago

New patch attached, that moves the increment of the in-use count into 
OPEN_FOR_READING/OPEN_FOR_WRITING.

Doug Turner (:dougt)

Comment 14

•

23 years ago

looks good to me.  sr/r=dougt.

simon, I have a branch here if you would like me to seek approval and check it 
in.

Simon Fraser [no longer active]

Assignee

Comment 15

•

23 years ago

Dougt: please do. However, I'm seeing another bug where the in-use file transport 
count goes bad, this time when the user clicks on a javascript:history.back() 
link. Let's wait until I've debugged that one.

Simon Fraser [no longer active]

Assignee

Comment 16

•

23 years ago

Another cause of bad file transport counts is in bug 87901.

Simon Fraser [no longer active]

Assignee

Comment 17

•

23 years ago

For javascript:history.back(), we seem to have a case where we can go from 
OPEN_FOR_READ to CLOSING without an END_READ. This happens if mRunState is set to 
CANCELED.

So we need a better patch.

Darin Fisher

Comment 18

•

23 years ago

simon: even if the file transport is canceled, it should still fire OnStopRequest.
if not, then that's a very serious bug.

just a quick reading of the code seems to suggest that a call to Cancel will only
cause mXferState to become CLOSING if it was not READING or WRITING.  so, how can
we be in the middle of reading or writing?  oh.. wait.. ic why!  it's because
we could be in the OPEN_FOR_READ or START_READ states.  the cancel logic needs to
take into account these other states.

         // Change transfer state if canceled.
         //
         if (mRunState == CANCELED) {
-            if (mXferState == READING)
+            if (mXferState == OPEN_FOR_READ ||
+                mXferState == START_READ ||
+                mXferState == READING)
                 mXferState = END_READ;
-            else if (mXferState == WRITING)
+            else if (mXferState == OPEN_FOR_WRITE ||
+                     mXferState == START_WRITE ||
+                     mXferState == WRITING)
                 mXferState = END_WRITE;
             else
                 mXferState = CLOSING;

Simon Fraser [no longer active]

Assignee

Comment 19

•

23 years ago

That patch still doesn't help. Here's an edited part of the log, with the patch.

0[7094efc]: nsFileTransport: AsyncRead [this=7847d98 javascript:history.back()] 
mOffset=0 mTransferAmount=-1
39[74f9c68]: nsFileTransport: Inside Run
39[74f9c68]: nsFileTransport: Inside Process [this=7847d98 state=1 status=0]
0[7094efc]: nsFileTransport: Cancel [this=7847d98 javascript:history.back()]
39[74f9c68]: nsFileTransport: OPEN_FOR_READ [this=7847d98 
javascript:history.back()] status=805303f5
39[74f9c68]: nsFileTransport: Leaving Process [this=7847d98 state=4 status=
805303f5]
39[74f9c68]: nsFileTransport: Inside Process [this=7847d98 state=9 status=
804b0002]
39[74f9c68]: nsFileTransport: CLOSING [this=7847d98 javascript:history.back()] 
status=804b0002
39[74f9c68]: nsFileTransport: Leaving Process [this=7847d98 state=0 status=
804b0002]
39[74f9c68]: nsFileTransport: Leaving Run [xferState=0 runState=2]

So we do an AsyncRead on the javascript: url. Cancel() gets called on it (I've no 
idea why), setting mStatus. Then, when we hit the OPEN_FOR_READ state, we set the 
state to END_READ. Then, the next time through Process(), we change this state to 
CLOSING without ever having processed the END_READ state.

Darin Fisher

Comment 20

•

23 years ago

 // Change transfer state if canceled.
         //
         if (mRunState == CANCELED) {
-            if (mXferState == READING)
+            if (mXferState == OPEN_FOR_READ ||
+                mXferState == START_READ ||
+                mXferState == READING ||
+                mXferState == END_READ)
                 mXferState = END_READ;
-            else if (mXferState == WRITING)
+            else if (mXferState == OPEN_FOR_WRITE ||
+                     mXferState == START_WRITE ||
+                     mXferState == WRITING ||
+                     mXferState == END_WRITE)
                 mXferState = END_WRITE;
             else
                 mXferState = CLOSING;

Simon Fraser [no longer active]

Assignee

Comment 21

•

23 years ago

Yup, that's better, and fixes this javascript: problem.

However, I can still get into a state where we mess up the in-use count by 
running the iBench tests, and cancelling half-way through.

Darin Fisher

Comment 22

•

23 years ago

ugh

Simon Fraser [no longer active]

Assignee

Comment 23

•

23 years ago

OK, cancelling the iBench test in mid-flow sometimes (not always) can mess up the 
nsSocketTransport in-use count. Maybe we need some similar fixes there.

Darin Fisher

Comment 24

•

23 years ago

it sounds like there is a thread-safety problem then.  can you get a socket
transport log?  NSPR_LOG_MODULES=nsSocketTransport:5

Doug Turner (:dougt)

Comment 25

•

23 years ago

the 0.9.2 boat is leaving in a few minutes.  If there is any hope to get
something on this branch, we need to act now.  Simon, does the patch darin and
you came up with solve "most" of the problems?  if so, lets go for it...

Simon Fraser [no longer active]

Assignee

Comment 26

•

23 years ago

Attached patch nsFileTransport patch — Details — Splinter Review

Darin Fisher

Comment 27

•

23 years ago

r/sr=darin

Simon Fraser [no longer active]

Assignee

Comment 28

•

23 years ago

Let's get that patch in. I'll file a separate bug on the socket transport issues, 
since it's likely to be a different kind of problem (it's actually 
mSelectFDSetCount that we track in the socket transport service).

Doug Turner (:dougt)

Comment 29

•

23 years ago

looks good. r=.  Thanks darin and simon for nailing this problem.


Chris, do you want to approve this for the branch?  I can check this in to the
trunk when it opens.

Simon Fraser [no longer active]

Assignee

Comment 30

•

23 years ago

It looks like we don't care too much about having this for 0.9.2, but we 
certainly want it right afterwards.

Doug Turner (:dougt)

Updated

•

23 years ago

Whiteboard: need sr/a → ready to checkin to branch

Doug Turner (:dougt)

Comment 31

•

23 years ago

Checked in patch:

Checking in nsFileTransport.cpp;
/cvsroot/mozilla/netwerk/base/src/nsFileTransport.cpp,v  <--  
nsFileTransport.cpp
new revision: 1.109; previous revision: 1.108
done

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Simon Fraser [no longer active]

Assignee

Updated

•

23 years ago

Blocks: 67409

Simon Fraser [no longer active]

Assignee

Comment 32

•

23 years ago

Reopening for branch consideration.

Status: RESOLVED → REOPENED

Keywords: vtrunk

Resolution: FIXED → ---

Simon Fraser [no longer active]

Assignee

Comment 33

•

23 years ago

You can use Peek-A-Boo to view application CPU usage on Mac. Get it from:

http://www.clarkwoodsoftware.com/peekaboo/

Mozilla (branch) will show a red bar when using lots of CPU. If this bug is 
fixed, CPU usage should be normal.

Tom Everingham

Comment 34

•

23 years ago

This doesn't appear to be fixed on the trunk.  CPU usage is still almost pegged 
on my os9 machine.

Simon Fraser [no longer active]

Assignee

Comment 35

•

23 years ago

I installed the same build as tever, and we're seeing different behaviour on two 
machines. It sucks CPU on his, and not on mine. So I think the patch is working, 
but there is something else happening.

Simon Fraser [no longer active]

Assignee

Comment 36

•

23 years ago

Things work on Sairuh's Mac OS 9.0 machine, so tever's Mac is weird somehow.

Doug Turner (:dougt)

Comment 37

•

23 years ago

Lets not put this on the trunk, but only the branch if you get approval from 
pdt2.

Simon Fraser [no longer active]

Assignee

Comment 38

•

23 years ago

Huh? I thought you already checked it into the trunk? --

> Checked in patch:
> 
> Checking in nsFileTransport.cpp;
> /cvsroot/mozilla/netwerk/base/src/nsFileTransport.cpp,v  <--  
> nsFileTransport.cpp
> new revision: 1.109; previous revision: 1.108
> done

selmer (gone)

Updated

•

23 years ago

Whiteboard: ready to checkin to branch → nsbranch+,pdt+ ready to checkin to branch

Doug Turner (:dougt)

Comment 39

•

23 years ago

crap.  that comment was for another bug.  sorry simon,

Doug Turner (:dougt)

Comment 40

•

23 years ago

checked into branch:

Checking in nsFileTransport.cpp;
/cvsroot/mozilla/netwerk/base/src/nsFileTransport.cpp,v  <--  
nsFileTransport.cpp
new revision: 1.107.26.2; previous revision: 1.107.26.1

Status: REOPENED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

lchiang

Comment 41

•

23 years ago

Tom - pls verify.  Do you still see the strange behavior on your Mac?  If so,
perhaps try sairuh's again?

Keywords: vtrunk

Tom Everingham

Comment 42

•

23 years ago

yes, ns61 still has my cpu usage at about 90% when idle - I will have to check
sairah's machine tomorrow.  07/16/01 branch.

Tom Everingham

Comment 43

•

23 years ago

I checked this on Prashant's os9 system and ns6 didn't seem to be loading the
cpu that much.  On mine, it does.   (7/17 branch)

Tom Everingham

Comment 44

•

23 years ago

checked this on Kritzer's os9.1 system.  The peek-a-boo utility is showing the
cpu is pegged.  So, I think there is still a problem.  re-open

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Doug Turner (:dougt)

Comment 45

•

23 years ago

gordon, pink, do you have any cycles to look at this today?

Doug Turner (:dougt)

Comment 46

•

23 years ago

bulk move to 0.9.4

Target Milestone: mozilla0.9.3 → mozilla0.9.4

Doug Turner (:dougt)

Comment 47

•

23 years ago

I don't doubt that the mac is not idling, but I do not see the file transport
count going negative....  adjusting summary.

Summary: In use file transport count is going negative → Mac does not idle.

Doug Turner (:dougt)

Updated

•

23 years ago

Whiteboard: nsbranch+,pdt+ ready to checkin to branch → nsbranch+,pdt+

Doug Turner (:dougt)

Comment 48

•

23 years ago

back to simon for investigation.

Assignee: dougt → sfraser

Status: REOPENED → NEW

Simon Fraser [no longer active]

Assignee

Comment 49

•

23 years ago

We need to test this again on tever's machine. I think it's fixed everywhere 
else.

Status: NEW → ASSIGNED

Tom Everingham

Comment 50

•

23 years ago

it happens on my own and Chris Kritzer's machine.  2 out of 5 that I've looked 
at so far.

Simon Fraser [no longer active]

Assignee

Comment 51

•

23 years ago

Can you please run Interarchy, and look for active mail or news connections, on 
the machines that appear not to idle.

Tom Everingham

Comment 52

•

23 years ago

I am not seeing any active mail or news connections using interarchy.  What I am
seeing though is that peek-a-boo now shows interarchy as the application hogging
the cpu.   Interarchy is using approximately 95% of the cpu for monitoring
network connections regardless of whether it is the app in focus.  When I shut
down interarchy, NS6 will then show that it is hogging 95% of the cpu while it
idles.

Simon Fraser [no longer active]

Assignee

Comment 53

•

23 years ago

I looked at tever's G3 (Mac OS 9.0) and ccritzer's G4 (9.1), and both show 
different idling behaviour to my Macs. However, debugging with MacsBug show that 
we are calling WaitNextEvent with a sleep time of 2 ticks, which indicates that 
Mozilla thinks that it's idle, so my code is working. I do not know why Peek-a-
boo shows such high CPU usage.

Tom Everingham

Comment 54

•

23 years ago

Do you know of any other tools for monitoring cpu usage for the mac?  I am not
sure if I am trusting peek-a-boo.

rubydoo123

Comment 55

•

23 years ago

removing nsBranch and pdt keywords, nsBranch+ is assigned by mgr and pdt+ is
assigned by PDT members

Whiteboard: nsbranch+,pdt+

Simon Fraser [no longer active]

Assignee

Comment 56

•

23 years ago

MacsBug says that the code is working, and our CPU usage is low on most
machines. The code is working as designed.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

benc

Updated

•

23 years ago

QA Contact: tever → benc

benc

Comment 57

•

23 years ago

VERIFIED:
based on data provided above.

Status: RESOLVED → VERIFIED

Potential patch 23 years ago Simon Fraser [no longer active] 2.52 KB, patch		Details \| Diff \| Splinter Review
New patch; moves increment into OPEN_FOR_ 23 years ago Simon Fraser [no longer active] 1.75 KB, patch		Details \| Diff \| Splinter Review
nsFileTransport patch 23 years ago Simon Fraser [no longer active] 2.38 KB, patch		Details \| Diff \| Splinter Review