Closed
Bug 106999
Opened 23 years ago
Closed 23 years ago
Implement native atomic operations on Mac OS 9/Mac OS X
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: sfraser_bugs, Assigned: sfraser_bugs)
Details
Attachments
(1 file, 3 obsolete files)
2.53 KB,
patch
|
wtc
:
review+
wtc
:
approval+
|
Details | Diff | Splinter Review |
We need to implement native atomic operations for Mac OS, to avoid the PRLock/
PRUnlock overhead (which increased with the dual CPU support fixes).
Note that http://developer.apple.com/technotes/tn/tn1137.html recommends not
using the PPC Load Reserved and Store Conditional instructions, but rather the
atomic operation APIs provided by Open Transport and the Driver Services lib.
Comment 1•23 years ago
|
||
It's quite safe to use these instructions as long as one understands exactly
what they do, how they interact with load/store reordering and cache coherency,
etc. These instructions are part of the core PowerPC architecture and must work
in the same way on all current and future PowerPC processors compatible with the
spec.
Assignee | ||
Comment 2•23 years ago
|
||
waldemar: do you have any sample code that implements atomic operations in PPC
asm?
Comment 3•23 years ago
|
||
I read the tech note TN1137 that Simon cited.
I think NSPR for traditional Mac OS (including
Carbon) should use Open Transport's atomic
routines.
OTAtomicAdd32 can be used to implement
PR_AtomicIncrement, PR_AtomicDecrement, and
PR_AtomicAdd.
OTCompareAndSwap32 can be used to implement
PR_AtomicSet. (In fact, this function can
be used to implement all of the PR_AtomicXXX
routines.)
I believe that OTLIFO can be used to implement
PRStack.
On the other hand, since the NSPR threads are
user-level threads running on one kernel thread,
these OT atomic routines probably won't help,
unless the PR_AtomicXXX routines are called by
the OT notifier routines.
For Darwin (the "mach-o" build), we should find
out if there are atomic routines in some system
library.
Assignee | ||
Comment 4•23 years ago
|
||
The mach-o build will use whatever atomic ops the Unix build uses, no?
Comment 5•23 years ago
|
||
> The mach-o build will use whatever atomic ops the
> Unix build uses, no?
The Unix build is not using any atomic routines.
Assignee | ||
Comment 6•23 years ago
|
||
> On the other hand, since the NSPR threads are
> user-level threads running on one kernel thread,
> these OT atomic routines probably won't help,
> unless the PR_AtomicXXX routines are called by
> the OT notifier routines.
On Mac OS X, I believe that we don't have to run into lock contention to take a
hit from the PRLock/PRUnlock that the atomic routines call. I think the
additional overhead of the critical sections that are entered/left when PRLock
turns interrupts on and off is measurable, so even without lock contention,
removing these is a win.
Assignee | ||
Comment 7•23 years ago
|
||
This patch implements atomic operations using the Open Transport calls. It also
contains an optimization of _MD_INTSOFF() (see bug 101838). This patch gives
about a 9% improvement in page load tests for me on Mac OS X.
Comment 8•23 years ago
|
||
Comment on attachment 58378 [details] [diff] [review]
OT atomic ops for Mac OS/Mac OS X
1. In _macos.h, the old definition of _MD_INTSOFF should be
deleted.
2. In macthr.c, _PR_MD_ATOMIC_SET must be implemented with
an OT atomic routine too. I suggest OTCompareAndSwap32.
The idea is to repeat in a loop of reading the current
value and comparing with it and swapping in the new value
until the OTCompareAndSwap32 call succeeds.
Attachment #58378 -
Flags: needs-work+
Assignee | ||
Comment 9•23 years ago
|
||
This patch implements PR_AtomicSet as well as the other atomic operations. It's
ready for review.
On my dual CPU Mac OS X machine, this patch reduces the mean page load time
from 1840ms to 1740ms.
Attachment #58378 -
Attachment is obsolete: true
Assignee | ||
Comment 10•23 years ago
|
||
Note that the patch will also apply for Mac classic builds, but there appears to
be little perceptible performance benefit there; it certainly does not get slower.
Status: NEW → ASSIGNED
OS: Mac System 8.5 → MacOS X
Comment 11•23 years ago
|
||
Comment on attachment 60449 [details] [diff] [review]
Full implementation of atomic ops
>Index: pr/src/md/mac/macthr.c
>===================================================================
>RCS file: /cvsroot/mozilla/nsprpub/pr/src/md/mac/macthr.c,v
>retrieving revision 3.14.2.2
>diff -b -u -4 -r3.14.2.2 macthr.c
>--- pr/src/md/mac/macthr.c 26 Sep 2001 02:34:13 -0000 3.14.2.2
>+++ pr/src/md/mac/macthr.c 5 Dec 2001 02:47:48 -0000
>@@ -42,8 +42,9 @@
>
> #include <LowMem.h>
> #include <Multiprocessing.h>
> #include <Gestalt.h>
>+#include <OpenTransport.h>
>
> #include "mdcriticalregion.h"
>
> TimerUPP gTimerCallbackUPP = NULL;
>@@ -432,8 +433,62 @@
>
> PR_SetError(PR_NOT_IMPLEMENTED_ERROR, unimpErr);
> return PR_FAILURE;
> }
>+
>+//##############################################################################
>+//##############################################################################
>+#pragma mark -
>+#pragma mark ATOMIC OPERATIONS
>+
>+#if _PR_HAVE_ATOMIC_OPS
A stylistic nit: In NSPR, we usually say
#ifdef _PR_HAVE_ATOMIC_OPS
>+/*
>+ * We use a single lock for all the emulated atomic operations.
>+ * The lock contention should be acceptable.
>+ */
>+static PRLock *atomic_lock = NULL;
>+void _PR_MD_INIT_ATOMIC(void)
>+{
>+ if (atomic_lock == NULL) {
>+ atomic_lock = PR_NewLock();
>+ }
>+}
The 'atomic_lock' should be removed and _PR_MD_INIT_ATOMIC
should either have an empty function body or be defined as
a macro that expands to nothing.
Come to think of it, _PR_MD_ATOMIC_INCREMENT,
_PR_MD_ATOMIC_DECREMENT, and _PR_MD_ATOMIC_ADD
can all be defined as macros. Use
mozilla/nsprpub/pr/include/md/_osf1.h as an
example.
>+PRInt32
>+_PR_MD_ATOMIC_SET(PRInt32 *val, PRInt32 newval)
>+{
>+ PRInt32 rv;
>+
>+dprintf("Doing _PR_MD_ATOMIC_SET\n");
>+
>+ do
>+ {
>+ rv = *val;
>+ } while (!OTCompareAndSwap32(rv, newval, (UInt32*)val));
>+
>+ return rv;
>+}
The dprintf statement should be deleted.
Another stylistic nit: use
do { /* put opening brace here */
/* indent by four spaces */
} while (...);
Attachment #60449 -
Flags: needs-work+
Assignee | ||
Comment 12•23 years ago
|
||
Attachment #60449 -
Attachment is obsolete: true
Assignee | ||
Comment 13•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #60520 -
Attachment is obsolete: true
Comment 14•23 years ago
|
||
Comment on attachment 60521 [details] [diff] [review]
Same as previous, changing a #if to #ifdef in macthr.c
I like this patch!
Sorry I still have a couple of stylistic nits.
>Index: pr/include/md/_macos.h
>===================================================================
>RCS file: /cvsroot/mozilla/nsprpub/pr/include/md/_macos.h,v
>retrieving revision 3.32.2.2
>diff -b -u -4 -r3.32.2.2 _macos.h
>--- pr/include/md/_macos.h 8 Oct 2001 22:41:25 -0000 3.32.2.2
>+++ pr/include/md/_macos.h 5 Dec 2001 21:47:07 -0000
>@@ -675,7 +678,23 @@
> #define ENTER_CRITICAL_REGION()
> #define LEAVE_CRITICAL_REGION()
>
> #endif
>+
>+
>+/*
>+ * Atomic operations
>+ */
>+#ifdef _PR_HAVE_ATOMIC_OPS
>+
>+extern PRInt32 MD_AtomicSet(PRInt32 *val, PRInt32 newval);
>+
>+#define _MD_INIT_ATOMIC()
>+#define _MD_ATOMIC_INCREMENT(val) (OTAtomicAdd32(1, (SInt32 *)val))
>+#define _MD_ATOMIC_ADD(ptr, val) (OTAtomicAdd32(val, (SInt32 *)ptr))
>+#define _MD_ATOMIC_DECREMENT(val) (OTAtomicAdd32(-1, (SInt32 *)val))
>+#define _MD_ATOMIC_SET(val, newval) (MD_AtomicSet(val, newval))
>+
>+#endif /* _PR_HAVE_ATOMIC_OPS */
>
>
> #endif /* prmacos_h___ */
The parentheses around the values of these macros are
not necessary.
Change MD_AtomicSet to _MD_AtomicSet. (Make the same
change in macthr.c.
You can go ahead and check this patch into the trunk
and client branch of NSPR after making these changes.
Thanks!
Attachment #60521 -
Flags: review+
Attachment #60521 -
Flags: needs-work+
Attachment #60521 -
Flags: approval+
Assignee | ||
Comment 15•23 years ago
|
||
wtc: do you think we need to assert that 'val' is on a 4-byte boundary? The OT
docs say that the OTAtomic routines can only be called on 4-byte aligned
addresses.
Comment 16•23 years ago
|
||
Simon,
Since *correct* code using the PR_Atomic routines passes
PRInt32 (which is int on the Mac) to the OTAtomic routines,
we can safely assume that the CodeWarrior compiler aligns
int on a 4-byte boundary. So the assertion should not be
necessary.
Assignee | ||
Comment 17•23 years ago
|
||
Checked into NSPR tip and NSPRPUB_PRE_$_@_CLIENT_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•