Closed Bug 405992 Opened 18 years ago Closed 18 years ago

Implement native atomic operations for Linux/ARM.

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

4.7.1

People

(Reporter: ilpo.ruotsalainen, Assigned: wtc)

References

Details

(Keywords: mobile)

Attachments

(1 file, 4 obsolete files)

Patch to add atomic operations for Linux/ARM 18 years ago Ilpo Ruotsalainen 2.04 KB, patch		Details \| Diff \| Splinter Review
Patch v2 18 years ago Ilpo Ruotsalainen 4.57 KB, patch	wtc : review- nelson : review-	Details \| Diff \| Splinter Review
Ilpo's patch as a cvs diff, to aid review with bugzilla diffs 18 years ago Nelson Bolyard (seldom reads bugmail) 5.80 KB, patch		Details \| Diff \| Splinter Review
updated patch with review comments 18 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) 2.28 KB, patch		Details \| Diff \| Splinter Review
updated patch with review comments (as checked in) 18 years ago Wan-Teh Chang 3.06 KB, patch	nelson : review+	Details \| Diff \| Splinter Review

Ilpo Ruotsalainen

Reporter

Description

•

18 years ago

User-Agent: Opera/9.24 (X11; Linux i686; U; en) Build Identifier: NSPR_HEAD_20071016 Native atomic operations for Linux/ARM should be implemented to increase performance. Reproducible: Always Steps to Reproduce: 1. Download sources. 2. Notice missing code.

Ilpo Ruotsalainen

Reporter

Comment 1

•

18 years ago

Attached patch Patch to add atomic operations for Linux/ARM (obsolete) — Details — Splinter Review

This patch implements atomic operations for Linux/ARM. However, this implementation requires at kernel version 2.6.13 or later.

Wan-Teh Chang

Assignee

Updated

•

18 years ago

Severity: normal → enhancement

Status: UNCONFIRMED → NEW

Ever confirmed: true

Wan-Teh Chang

Assignee

Comment 2

•

18 years ago

Brad, Chris, Doug, could you help me review the patch in this bug? Ilpo, we need to add a compile-time check (either an autoconf test or a C preprocessor test using macros) to determine when we can call the __kernel_cmpxchg function at the hardcoded address 0xffff0fc0. It might be easier to implement NSPR's own cmpxchg function. It should be a couple of lines of inline assembly code if ARM has the right instruction.

Ilpo Ruotsalainen

Reporter

Comment 3

•

18 years ago

(In reply to comment #2) > Ilpo, we need to add a compile-time check (either an autoconf test > or a C preprocessor test using macros) to determine when we can > call the __kernel_cmpxchg function at the hardcoded address 0xffff0fc0. Any suggestions what the check should be based on? We can't test if we can run the code in cross-compiling cases (does anyone do that?) and even checking kernel version on the build host is not really a good check for this. > It might be easier to implement NSPR's own cmpxchg function. It should > be a couple of lines of inline assembly code if ARM has the right instruction. Oversimplifying, there's no way to implement atomic cmpxchg on ARMv5 (and older, I think, but those might be outside the scope anyway).

Christian Sejersen

Updated

•

18 years ago

Keywords: mobile

Brad Lassey [:blassey] (use needinfo?)

Comment 4

•

18 years ago

We could ignore cross compiles for now and just check the host. Many embedded teams are working with scratchbox anyway. In the case of a cross compiling we could have a config flag for now

Doug Turner (:dougt)

Comment 5

•

18 years ago

wtc, a configure check would be hard. we are building in "scratchbox" which is a cross-compilation toolkit, not the real target device. It might need to just be a configure option that is passed by the developer (and we hope that they know what they are doing) on the patch, I do wonder what the address is all about; a nice big comment was to why we need to do it this way would be very helpful. Especially the requirement to call __kernel_cmpxchg_t repeatedly until success. For example, the text: The ARM __kuser_cmpxchg routine is meant to implement an atomic cmpxchg in user space. It however can produce spurious false negative if a processor exception occurs in the middle of the operation. works very nice as a comment (stolen from commit Nicolas Pitre checkin comment to the linux 2.6 kernel - b49c0f24cf6744a3f4fd09289fe7cade349dead5). You should probably change the routine name above to __kernel_cmpxchg_t. (I assume both of these fuctions are the same thing) I also wonder about the need for the do loop? Is it sufficient to simply set |ov| and/or |nv| once, then loop waiting for a successful __kernel_cmpxchg_t?

Ilpo Ruotsalainen

Reporter

Comment 6

•

18 years ago

(In reply to comment #5) > on the patch, I do wonder what the address is all about; a nice big comment was > to why we need to do it this way would be very helpful. I'll post an updated patch with added comments soon. > I also wonder about the need for the do loop? Is it sufficient to simply set > |ov| and/or |nv| once, then loop waiting for a successful __kernel_cmpxchg_t? If you only set them once you end up in an (more or less) infinite loop the first time some other thread gets in between setting |ov| and calling the cmpxchg since then |ov| will not match the value of the atomic variable anymore and the cmpxchg will not succeed unless some other thread changes the atomic value back to what it was previously.

Doug Turner (:dougt)

Comment 7

•

18 years ago

any update Ilpo?

Ilpo Ruotsalainen

Reporter

Comment 8

•

18 years ago

Attached patch Patch v2 (obsolete) — Details — Splinter Review

Attachment #290686 - Attachment is obsolete: true

Doug Turner (:dougt)

Updated

•

18 years ago

Attachment #310211 - Flags: review?(wtc)

Doug Turner (:dougt)

Comment 9

•

18 years ago

Ilpo, is see this in my nsprpub/config/autoconf.mk USE_ARM_KUSER = @USE_ARM_KUSER@ I am pretty sure that this should have either been set or not set depending on my configure flags.

Ilpo Ruotsalainen

Reporter

Comment 10

•

18 years ago

(In reply to comment #9) > Ilpo, is see this in my nsprpub/config/autoconf.mk > > USE_ARM_KUSER = @USE_ARM_KUSER@ > > I am pretty sure that this should have either been set or not set depending on > my configure flags. Did you remember to regenerate nsprpub/configure?

Mark Finkle (:mfinkle) (use needinfo?)

Updated

•

18 years ago

Blocks: n800

Doug Turner (:dougt)

Comment 11

•

18 years ago

yes, thanks. I have been using this for a few weeks now without any problem. Ilpo, did you happen to get any perf numbers? Wtc, any chance to look at this new patch?

Guillaume Letellier

Comment 12

•

18 years ago

(In reply to comment #11) > did you happen to get any perf numbers? I have some on an ARM1176 development platform (CPU 400 MHz) Linux version 2.6.21 (gcc version 4.2.1 (CodeSourcery Sourcery G++ Lite 2007q3-51)) Xulrunner (1.9b4pre) with a very simple UI ## Browser test suite based on iBench: No patch Ilpo's patch gain Total Time: 1713.095 sec 1635.665 sec 4.5% First time: 215.58 sec 205.89 sec 4.5% Cached time: 213.935 sec 204.255 sec 4.5% ## Dromaeo.com: With Ilpo's patch: http://dromaeo.com/?id=5263 Without patch: http://dromaeo.com/?id=6490 http://dromaeo.com/?id=6490,5263 Total: 259234.00ms ±4.86% 218554.80ms ±2.67% winning 19%

Nelson Bolyard (seldom reads bugmail)

Comment 13

•

18 years ago

Attached patch Ilpo's patch as a cvs diff, to aid review with bugzilla diffs (obsolete) — Details — Splinter Review

Because Ilpo's patch was not a CVS diff, a number of the patch reviewing tools on bugzilla don't work with it. This patch is exactly the same patch as the previous one, but in the form of a CVS unified diff.

Nelson Bolyard (seldom reads bugmail)

Comment 14

•

18 years ago

Comment on attachment 310211 [details] [diff] [review] Patch v2 This patch evidently uses some non-standard non-portable extensions of the c language that (I gather) must be common to all gcc compilers, but are certainly not accepted by most other c compilers. There's precedent for doing this in this same header file, for alpha. I guess we don't foresee using any other compiler on Linux any time soon. I would think that using the single-variable expression statement at the end of the basic block would (or could) be optimized away. This code appears to rely on the assumption that it will not be optimized away. The alpha code does not appear to rely on that assumption. Perhaps declaring that variable as volatile would avoid that possibility. There's a chance of variable name collision causing unexpected results. If (for example) someone invoked one of these macros with a variable named vp or ov (or nv), the results would not be as expected. I recommend prepending the names of the local variables in these basic blocks with some number of underscores (e.g. __ov, __nv), as was done in the alpha code, to reduce the chance of such collisions. Fixing the potential variable name collision problem is a must. If you're confident that the single variable expression statement cannot be optimized away, then that can stay as is.

Attachment #310211 - Flags: review-

Nelson Bolyard (seldom reads bugmail)

Comment 15

•

18 years ago

After thinking about this some more, it seems to me that there is another way to do this that does not rely on so many assumptions about optimizers or non-standard c language extensions. It is to declare these functions as __inline__ functions instead of them being macros. I'm thinking of something roughly like this: __inline__ PRInt32 _MD_ATOMIC_ADD(PRInt32 *ptr, PRInt32 n) { PRInt32 ov, nv; volatile PRInt32 *vp = (ptr); /* The whole sequence must be restarted if __kernel_cmpxchg() fails * because some other thread may have changed the value of *vp. */ do { ov = *vp; nv = ov + n; } while (__kernel_cmpxchg(ov, nv, vp)); return nv; } #define _MD_ATOMIC_INCREMENT(ptr) _MD_ATOMIC_ADD(ptr, 1) #define _MD_ATOMIC_DECREMENT(ptr) _MD_ATOMIC_ADD(ptr, -1) This approach doesn't depend on allowing executable basic blocks inside of expressions (e.g. ({ code }) and doesn't rely on a single-variable expression statement that is the last statement in a basic block not being optimized away or being the "return value" of the block/expression. And it avoids the variable name collision potential completely. If this approach produces comparable performance to the approach taken in the previous patch, I'd much prefer it.

Wan-Teh Chang

Assignee

Comment 16

•

18 years ago

Comment on attachment 310211 [details] [diff] [review] Patch v2 Thanks for the patch. Could you skip the USE_ARM_KUSER makefile variable and just do AC_DEFINE(_PR_ARM_KUSER) directly in configure.in? (It's fine to use the shell variable USE_ARM_KUSER in configure.in.) In NSPR, we put the closing brace } and while on the same line: } while (__kernel_cmpxchg(ov, (nv), vp));

Attachment #310211 - Flags: review?(wtc) → review-

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Comment 17

•

18 years ago

Attached patch updated patch with review comments (obsolete) — Details — Splinter Review

I'd like to see this checked in as soon as we can; here's an updated patch based on the review comments. The generated code in my simple tests is identical to the macro version; I haven't run actual perf tests, though. Also fixed the configure/makefile issue; I just AC_DEFINE if the configure flag is specified, don't think we need the shell var anywhere.

Attachment #317821 - Flags: review?(nelson)

Wan-Teh Chang

Assignee

Comment 18

•

18 years ago

Attached patch updated patch with review comments (as checked in) — Details — Splinter Review

I changed --enable-arm-kuser to --with-arm-kuser. Let me know if --enable is more appropriate. I fixed the indentation offset (which is 4 in NSPR and NSS). I would have defined __kernel_cmpxchg_t and __kernel_cmpxchg as: typedef int (*__kernel_cmpxchg_t)(int oldval, int newval, volatile int *ptr); #define __kernel_cmpxchg ((__kernel_cmpxchg_t)0xffff0fc0) But I found that the definitions in this patch come from the example in the Linux/ARM kernel source code, so I kept them. I checked in the patch on the NSPR trunk (NSPR 4.7.1). Checking in configure.in; /cvsroot/mozilla/nsprpub/configure.in,v <-- configure.in new revision: 1.232; previous revision: 1.231 done Checking in configure; /cvsroot/mozilla/nsprpub/configure,v <-- configure new revision: 1.228; previous revision: 1.227 done Checking in pr/include/md/_linux.h; /cvsroot/mozilla/nsprpub/pr/include/md/_linux.h,v <-- _linux.h new revision: 3.49; previous revision: 3.48 done

Attachment #310211 - Attachment is obsolete: true

Attachment #316713 - Attachment is obsolete: true

Attachment #317821 - Attachment is obsolete: true

Attachment #317821 - Flags: review?(nelson)

Wan-Teh Chang

Assignee

Updated

•

18 years ago

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Target Milestone: --- → 4.7.1

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Comment 19

•

18 years ago

(In reply to comment #18) > I changed --enable-arm-kuser to --with-arm-kuser. Let me > know if --enable is more appropriate. with should be fine. > I fixed the indentation offset (which is 4 in NSPR and NSS). Note that http://mxr.mozilla.org/mozilla/source/nsprpub/pr/include/md/_linux.h#1 indicates 2 -- my editor's just doing what the file says. :) You may want to update the modelines. > I checked in the patch on the NSPR trunk (NSPR 4.7.1). Thanks! I'll do a separate patch to get this flag passed down from Gecko's toplevel configure.

Wan-Teh Chang

Assignee

Comment 20

•

18 years ago

Vlad, I will fix the Emacs mode line in _linux.h. The mode line was given to us when Netscape released its browser source code in 1998. Since none of the NSPR developers use Emacs, we just added the mode line to our files without looking closely. In comment #18, I wrote: > I would have defined __kernel_cmpxchg_t and __kernel_cmpxchg as: > typedef int (*__kernel_cmpxchg_t)(int oldval, int newval, volatile int *ptr); > #define __kernel_cmpxchg ((__kernel_cmpxchg_t)0xffff0fc0) I now understand why __kernel_cmpxchg has to be defined as: #define __kernel_cmpxchg (*(__kernel_cmpxchg_t *)0xffff0fc0) I believe that 0xffff0fc0 is not the address of the __kernel_cmpxchg function but rather the address of a word that stores the address of the function.

Nelson Bolyard (seldom reads bugmail)

Comment 21

•

18 years ago

Comment on attachment 317838 [details] [diff] [review] updated patch with review comments (as checked in) Thanks, Vlad and Wan-Teh, for incorporating my comments.

Attachment #317838 - Flags: review+

You need to log in before you can comment on or make changes to this bug.