Closed Bug 106999 Opened 23 years ago Closed 23 years ago

Implement native atomic operations on Mac OS 9/Mac OS X

Categories

(NSPR :: NSPR, defect)

PowerPC
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sfraser_bugs, Assigned: sfraser_bugs)

Details

Attachments

(1 file, 3 obsolete files)

We need to implement native atomic operations for Mac OS, to avoid the PRLock/ PRUnlock overhead (which increased with the dual CPU support fixes). Note that http://developer.apple.com/technotes/tn/tn1137.html recommends not using the PPC Load Reserved and Store Conditional instructions, but rather the atomic operation APIs provided by Open Transport and the Driver Services lib.
It's quite safe to use these instructions as long as one understands exactly what they do, how they interact with load/store reordering and cache coherency, etc. These instructions are part of the core PowerPC architecture and must work in the same way on all current and future PowerPC processors compatible with the spec.
waldemar: do you have any sample code that implements atomic operations in PPC asm?
I read the tech note TN1137 that Simon cited. I think NSPR for traditional Mac OS (including Carbon) should use Open Transport's atomic routines. OTAtomicAdd32 can be used to implement PR_AtomicIncrement, PR_AtomicDecrement, and PR_AtomicAdd. OTCompareAndSwap32 can be used to implement PR_AtomicSet. (In fact, this function can be used to implement all of the PR_AtomicXXX routines.) I believe that OTLIFO can be used to implement PRStack. On the other hand, since the NSPR threads are user-level threads running on one kernel thread, these OT atomic routines probably won't help, unless the PR_AtomicXXX routines are called by the OT notifier routines. For Darwin (the "mach-o" build), we should find out if there are atomic routines in some system library.
The mach-o build will use whatever atomic ops the Unix build uses, no?
> The mach-o build will use whatever atomic ops the > Unix build uses, no? The Unix build is not using any atomic routines.
> On the other hand, since the NSPR threads are > user-level threads running on one kernel thread, > these OT atomic routines probably won't help, > unless the PR_AtomicXXX routines are called by > the OT notifier routines. On Mac OS X, I believe that we don't have to run into lock contention to take a hit from the PRLock/PRUnlock that the atomic routines call. I think the additional overhead of the critical sections that are entered/left when PRLock turns interrupts on and off is measurable, so even without lock contention, removing these is a win.
This patch implements atomic operations using the Open Transport calls. It also contains an optimization of _MD_INTSOFF() (see bug 101838). This patch gives about a 9% improvement in page load tests for me on Mac OS X.
Comment on attachment 58378 [details] [diff] [review] OT atomic ops for Mac OS/Mac OS X 1. In _macos.h, the old definition of _MD_INTSOFF should be deleted. 2. In macthr.c, _PR_MD_ATOMIC_SET must be implemented with an OT atomic routine too. I suggest OTCompareAndSwap32. The idea is to repeat in a loop of reading the current value and comparing with it and swapping in the new value until the OTCompareAndSwap32 call succeeds.
Attachment #58378 - Flags: needs-work+
This patch implements PR_AtomicSet as well as the other atomic operations. It's ready for review. On my dual CPU Mac OS X machine, this patch reduces the mean page load time from 1840ms to 1740ms.
Attachment #58378 - Attachment is obsolete: true
Note that the patch will also apply for Mac classic builds, but there appears to be little perceptible performance benefit there; it certainly does not get slower.
Status: NEW → ASSIGNED
OS: Mac System 8.5 → MacOS X
Comment on attachment 60449 [details] [diff] [review] Full implementation of atomic ops >Index: pr/src/md/mac/macthr.c >=================================================================== >RCS file: /cvsroot/mozilla/nsprpub/pr/src/md/mac/macthr.c,v >retrieving revision 3.14.2.2 >diff -b -u -4 -r3.14.2.2 macthr.c >--- pr/src/md/mac/macthr.c 26 Sep 2001 02:34:13 -0000 3.14.2.2 >+++ pr/src/md/mac/macthr.c 5 Dec 2001 02:47:48 -0000 >@@ -42,8 +42,9 @@ > > #include <LowMem.h> > #include <Multiprocessing.h> > #include <Gestalt.h> >+#include <OpenTransport.h> > > #include "mdcriticalregion.h" > > TimerUPP gTimerCallbackUPP = NULL; >@@ -432,8 +433,62 @@ > > PR_SetError(PR_NOT_IMPLEMENTED_ERROR, unimpErr); > return PR_FAILURE; > } >+ >+//############################################################################## >+//############################################################################## >+#pragma mark - >+#pragma mark ATOMIC OPERATIONS >+ >+#if _PR_HAVE_ATOMIC_OPS A stylistic nit: In NSPR, we usually say #ifdef _PR_HAVE_ATOMIC_OPS >+/* >+ * We use a single lock for all the emulated atomic operations. >+ * The lock contention should be acceptable. >+ */ >+static PRLock *atomic_lock = NULL; >+void _PR_MD_INIT_ATOMIC(void) >+{ >+ if (atomic_lock == NULL) { >+ atomic_lock = PR_NewLock(); >+ } >+} The 'atomic_lock' should be removed and _PR_MD_INIT_ATOMIC should either have an empty function body or be defined as a macro that expands to nothing. Come to think of it, _PR_MD_ATOMIC_INCREMENT, _PR_MD_ATOMIC_DECREMENT, and _PR_MD_ATOMIC_ADD can all be defined as macros. Use mozilla/nsprpub/pr/include/md/_osf1.h as an example. >+PRInt32 >+_PR_MD_ATOMIC_SET(PRInt32 *val, PRInt32 newval) >+{ >+ PRInt32 rv; >+ >+dprintf("Doing _PR_MD_ATOMIC_SET\n"); >+ >+ do >+ { >+ rv = *val; >+ } while (!OTCompareAndSwap32(rv, newval, (UInt32*)val)); >+ >+ return rv; >+} The dprintf statement should be deleted. Another stylistic nit: use do { /* put opening brace here */ /* indent by four spaces */ } while (...);
Attachment #60449 - Flags: needs-work+
Attachment #60520 - Attachment is obsolete: true
Comment on attachment 60521 [details] [diff] [review] Same as previous, changing a #if to #ifdef in macthr.c I like this patch! Sorry I still have a couple of stylistic nits. >Index: pr/include/md/_macos.h >=================================================================== >RCS file: /cvsroot/mozilla/nsprpub/pr/include/md/_macos.h,v >retrieving revision 3.32.2.2 >diff -b -u -4 -r3.32.2.2 _macos.h >--- pr/include/md/_macos.h 8 Oct 2001 22:41:25 -0000 3.32.2.2 >+++ pr/include/md/_macos.h 5 Dec 2001 21:47:07 -0000 >@@ -675,7 +678,23 @@ > #define ENTER_CRITICAL_REGION() > #define LEAVE_CRITICAL_REGION() > > #endif >+ >+ >+/* >+ * Atomic operations >+ */ >+#ifdef _PR_HAVE_ATOMIC_OPS >+ >+extern PRInt32 MD_AtomicSet(PRInt32 *val, PRInt32 newval); >+ >+#define _MD_INIT_ATOMIC() >+#define _MD_ATOMIC_INCREMENT(val) (OTAtomicAdd32(1, (SInt32 *)val)) >+#define _MD_ATOMIC_ADD(ptr, val) (OTAtomicAdd32(val, (SInt32 *)ptr)) >+#define _MD_ATOMIC_DECREMENT(val) (OTAtomicAdd32(-1, (SInt32 *)val)) >+#define _MD_ATOMIC_SET(val, newval) (MD_AtomicSet(val, newval)) >+ >+#endif /* _PR_HAVE_ATOMIC_OPS */ > > > #endif /* prmacos_h___ */ The parentheses around the values of these macros are not necessary. Change MD_AtomicSet to _MD_AtomicSet. (Make the same change in macthr.c. You can go ahead and check this patch into the trunk and client branch of NSPR after making these changes. Thanks!
Attachment #60521 - Flags: review+
Attachment #60521 - Flags: needs-work+
Attachment #60521 - Flags: approval+
wtc: do you think we need to assert that 'val' is on a 4-byte boundary? The OT docs say that the OTAtomic routines can only be called on 4-byte aligned addresses.
Simon, Since *correct* code using the PR_Atomic routines passes PRInt32 (which is int on the Mac) to the OTAtomic routines, we can safely assume that the CodeWarrior compiler aligns int on a 4-byte boundary. So the assertion should not be necessary.
Checked into NSPR tip and NSPRPUB_PRE_$_@_CLIENT_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: