Closed Bug 101838 Opened 23 years ago Closed 23 years ago

overall slower performance after nspr checkin [see bug 71718]

Categories

(SeaMonkey :: General, defect)

PowerPC
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
mozilla0.9.9

People

(Reporter: bugzilla, Assigned: sfraser_bugs)

References

Details

(Keywords: perf)

Attachments

(4 files)

spun off from bug 71718 --pls reassign, change component, if needed. thx!

basically i've been noticing a slowdown in performance on my Mac G3 running
10.0.4 [384M ram]. for example, Preferences results:

trunk testing of Preferences panel: compared with my 2001-09-20 16:31 results.
same configuration, with a new profile [modern, no sidebar].

AFTER checkin, using 2001.09.25.20-trunk
----------------------------------------
Preferences dialog open: 3.39sec
Switching from Navigator to Mail & Newsgroups panel: 0.97sec


BEFORE checkin, using 2001.09.18.20-trunk
-----------------------------------------
Preferences dialog open: 2.50sec
Switching from Navigator to Mail & Newsgroups panel: 0.70sec
i'm currently running page load tests, and will attach that info here when done.

stephend, have you had a chance to run the "after" mailnews tests? afaict, the
2001.09.25.20-trunk build would contain simon's fix...
Keywords: perf
QA Contact: doronr → jrgm
* *9-18 and 9-21 builds were tested with 10.0.4, 9-15 was tested with 10.1

That should read '9-25 was tested with 10.1'.
setup: i quit and restarted btwn each run [separate sessions] --the main reason
being that subsequent runs in the same session got *noticeably* slower, as well
as memory usage increasing. tested on OS 10.0.4.

BEFORE checkin, using 2001.09.18.20-trunk
-----------------------------------------
1st run: ave page load=2300ms, mem usage (RPRVT)=79.6M
2nd run: ave page load=2332ms, mem usage (RPRVT)=77.1M
3rd run: ave page load=2300ms, mem usage (RPRVT)=79.6M


AFTER checkin, using 2001.09.25.20-trunk
----------------------------------------
1st run: ave page load=2879ms, mem usage (RPRVT)=82.0M
2nd run: ave page load=2864ms, mem usage (RPRVT)=83.9M
3rd run: ave page load=2890ms, mem usage (RPRVT)=84.6M
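
To put a number on the gap between the two sets of runs above, here is a minimal sketch in C that averages the three runs of each build and computes the relative slowdown. The helper names (mean_ms, percent_change) and the arrays are made up for illustration; only the millisecond figures come from the results above.

```c
#include <assert.h>
#include <stddef.h>

/* Page load averages quoted above, in milliseconds. */
const double before_ms[] = { 2300.0, 2332.0, 2300.0 };  /* 2001.09.18.20-trunk */
const double after_ms[]  = { 2879.0, 2864.0, 2890.0 };  /* 2001.09.25.20-trunk */

/* Arithmetic mean of n samples. */
double mean_ms(const double *samples, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)n;
}

/* Percent change from 'before' to 'after'; positive means slower. */
double percent_change(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With these figures, percent_change(mean_ms(before_ms, 3), mean_ms(after_ms, 3)) comes out at roughly 24.5%, i.e. about 2311 ms before versus about 2878 ms after.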
for the "progressively slower" issue i mentioned in my 2001-09-26 18:28
comments, i filed bug 101870.
Summary: slower performance after nspr checkin [see bug 71718] → overall slower performance after nspr checkin [see bug 71718]
for comparison, i ran three page load tests [in three separate browser sessions]
using 2001.09.25.10-branch comm bits:

1st run, ave page load=2588ms, mem usage (RPRVT)=61.1M
2nd run, ave page load=2583ms, mem usage (RPRVT)=67.0M
3rd run, ave page load=2627ms, mem usage (RPRVT)=61.7M

slower than the BEFORE trunk build, but faster than the AFTER trunk builds
[roughly in the middle]. however, the mem usage is significantly ***less with
the branch***, than it is with either of the trunk builds [by about 20M less].
The speed and memory usage diffs on the branch are at least partially 
attributable to the change to use the OS native memory allocators in the Carbon 
build (#97211).
Adding Startup Results here per Sairuh's request:
I take an average of three for launch and relaunch on both builds. I tested 
these manually via a stopwatch. Times are in seconds. 
Build 2001-09-18-20-trunk
Launch: 19.25 
Re-launch: 12.49
Build 2001-09-25-20-trunk
Launch: 24.17
Re-launch: 16.62

So, as you can see.. a fairly large jump in times in the 9-25 build :( 
There were no peaks or weird jumps, all times on both builds were fairly 
consistent. 
Blocks: 102998
setting to 0.9.7
simon, if you can fix it before 0.9.7, go ahead and do it!
Target Milestone: --- → mozilla0.9.7
Notes:
* Looking at the page load results, no specific site got slower in the 'after'
results. This is a general slowdown, not content-specific.
* I dumped stack traces for all the times MD_EnterCriticalSection is called for
  a period of startup (180Mb log file!). For this period of early startup, 
  MD_EnterCriticalSection is called:
    115956 times for PR_Lock/PR_Unlock exclusive of atomic ops
     36565 times for atomic operations
      8561 times from the TimerCallback
      6522 times from AsyncIOCompletion and descendants
      3270 times from PR_NotifyCondVar (exclusive of atomic ops)
      3252 times from WaitOnThisThread
           and misc others.
    174177 times in total

So PR_Lock/PR_Unlock are the biggest culprits, and I think it will be worthwhile
trying to reduce their call counts from Mozilla code. Converting Mac's atomic
ops to use native calls will save those 36565 calls.
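
To illustrate why native atomic ops avoid those critical-section entries, here is a rough sketch contrasting a lock-based increment with a native one. This is not the NSPR code: the pthread mutex is only a stand-in for the global critical region, C11 <stdatomic.h> stands in for whatever native primitive the platform offers, and all function names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the global critical region (not NSPR code). */
static pthread_mutex_t g_region = PTHREAD_MUTEX_INITIALIZER;
static long g_region_entries = 0;   /* number of times the region was entered */

/* Lock-based increment: every call enters and leaves the critical region. */
long locked_increment(long *val)
{
    long result;
    pthread_mutex_lock(&g_region);
    g_region_entries++;
    result = ++(*val);
    pthread_mutex_unlock(&g_region);
    return result;
}

/* Native increment: a single atomic read-modify-write, no region entered. */
long native_increment(atomic_long *val)
{
    return atomic_fetch_add(val, 1) + 1;
}

/* Report how many times the critical region has been entered so far. */
long region_entries(void)
{
    long n;
    pthread_mutex_lock(&g_region);
    n = g_region_entries;
    pthread_mutex_unlock(&g_region);
    return n;
}
```

In this sketch, N calls to locked_increment enter the region N times, while N calls to native_increment never touch it, which is the saving being described for the atomic-op call sites.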
How about using the atomic routines of Open Transport?
I was getting to that ;)
It's bug 106999.
Status: NEW → ASSIGNED
Each call to PR_Lock calls EnterCriticalRegion 3 times:
  Twice in _PR_INTSOFF(is)
which expands to
  do { EnterCritialRegion();
    (is) = (_pr_intsOff); _MD_SetIntsOff(1); LeaveCritialRegion();
  } while (0) ;
where _MD_SetIntsOff() calls EnterCritialRegion()

and once in _PR_FAST_INTSON(is).

I wonder if we can reduce that to 2.
So _PR_INTSOFF(is) expands to:

do {
	EnterCritialRegion();
	(is) = (_pr_intsOff);
	_MD_SetIntsOff(1);
	LeaveCritialRegion();
} while (0);

Expanding _MD_SetIntsOff(1) then gives:

do {
	EnterCritialRegion();
	(is) = (_pr_intsOff);
	
	{
		EnterCritialRegion() ;
		gCriticalRegionEntryCount ++;
		_pr_intsOff = 1;
		if (!1)
		{
			PRInt32 i = gCriticalRegionEntryCount;
			gCriticalRegionEntryCount = 0;
			for ( ;i > 0; i --) {
				LeaveCritialRegion() ;
			}
		}
	}
	
	LeaveCritialRegion();
} while (0);

which can be optimized to

do {
	EnterCritialRegion();
	(is) = (_pr_intsOff);
	
	{
		EnterCritialRegion() ;
		gCriticalRegionEntryCount ++;
		_pr_intsOff = 1;
	}
	
	LeaveCritialRegion();
} while (0);

and further to


do {
	EnterCritialRegion();
	(is) = (_pr_intsOff);
	
	gCriticalRegionEntryCount ++;
	_pr_intsOff = 1;
	
} while (0);

This saves 1 enter and 1 leave call.
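
As a sanity check on the transformation above, here is a small simulation in C that replaces the enter/leave primitives with counting stubs and compares the original expansion with the fully reduced one. Everything here (SimState, sim_enter, sim_leave, the two functions) is a hypothetical stand-in, not NSPR code; it only models call counts and the final state.

```c
/* Simulated state: call counters plus the two globals used by the macro. */
typedef struct {
    int enters;       /* calls to the enter primitive */
    int leaves;       /* calls to the leave primitive */
    int ints_off;     /* models _pr_intsOff */
    int entry_count;  /* models gCriticalRegionEntryCount */
} SimState;

static void sim_enter(SimState *s) { s->enters++; }
static void sim_leave(SimState *s) { s->leaves++; }

/* Original expansion of _PR_INTSOFF(is), with _MD_SetIntsOff(1) inlined
 * and the dead "if (!1)" branch dropped. Returns the saved 'is' value. */
int intsoff_original(SimState *s)
{
    int is;
    sim_enter(s);
    is = s->ints_off;
    sim_enter(s);          /* from _MD_SetIntsOff(1) */
    s->entry_count++;
    s->ints_off = 1;
    sim_leave(s);
    return is;
}

/* Reduced form: the inner enter and the outer leave cancel out. */
int intsoff_optimized(SimState *s)
{
    int is;
    sim_enter(s);
    is = s->ints_off;
    s->entry_count++;
    s->ints_off = 1;
    return is;
}
```

Starting from a zeroed SimState, the original form makes 2 enter calls and 1 leave call, the reduced form makes 1 enter call and no leave call, and both end with the same net nesting depth (enters minus leaves) and the same values for the two globals.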
My patch (attachment 50017 [details] [diff] [review]), which I couldn't get
to work before your checkin deadline, reduced the
number of calls to EnterCriticalRegion() to 2.

I think your rewrite transformation of _PR_INTSOFF(is)
makes the same number of calls except that my patch
does without gCriticalRegionEntryCount -- I use the
stack of the 'is' local variables to remember the
critical region entry count.
I'll look at that again. Unfortunately I don't have a dual CPU machine to test 
with yet.
Some page load results:

Current opt build                            1886 ms
+ _PR_INTSOFF change                         1862 ms
+ native atomic ops                          1734 ms
+ atomic ops & _PR_INTSOFF change            1735 ms

So this is rather surprising. Going to native atomic ops gives a 9% boost, but 
hacking _PR_INTSOFF to call EnterCriticalRegion fewer times has little impact.
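
For reference, the percentages behind these page load numbers can be worked out two ways. The sketch below is only illustrative arithmetic with made-up helper names; which convention the quoted 9% uses is not stated, so both are shown.

```c
/* Percent reduction in load time relative to the baseline. */
double time_saved_pct(double baseline_ms, double new_ms)
{
    return (baseline_ms - new_ms) / baseline_ms * 100.0;
}

/* Percent speedup, i.e. how much more work per unit time the new build does. */
double speedup_pct(double baseline_ms, double new_ms)
{
    return (baseline_ms / new_ms - 1.0) * 100.0;
}
```

Going from 1886 ms to 1734 ms is about 8.1% less load time, or about 8.8% as a speedup ratio, which is in line with the rough 9% quoted. The _PR_INTSOFF change alone (1886 ms to 1862 ms) is about 1.3%.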
->0.9.8
Target Milestone: mozilla0.9.7 → mozilla0.9.8
No longer blocks: 102998
Blocks: 102998
->0.9.9
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Simon, does this one merit the nsbeta1 keyword?
No. The slowdown was necessary for correctness. There is no magic fix in NSPR; 
any improvements will be independent of the actual fix.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → WONTFIX
Product: Browser → Seamonkey