Closed Bug 296878 Opened 15 years ago Closed 6 years ago

PowerPC Linux should support native PowerPC atomic instructions (lwarx/stwcx.)

Categories

(NSPR :: NSPR, enhancement, P2)

PowerPC
Linux
enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kevdig, Assigned: wtc)

References

(Blocks 1 open bug, )

Details

Attachments

(5 files, 8 obsolete files)

10.19 KB, patch
mark
: review+
Details | Diff | Splinter Review
6.39 KB, patch
Details | Diff | Splinter Review
2.90 KB, patch
Details | Diff | Splinter Review
9.04 KB, patch
Details | Diff | Splinter Review
3.29 KB, text/plain
Details
User-Agent:       Mozilla/5.0 (X11; U; Linux ppc; en-US; rv:1.8b2) Gecko/20050601
Build Identifier: Mozilla/5.0 (X11; U; Linux ppc; en-US; rv:1.8b2) Gecko/20050601

x86 Linux has code in js/src/jslock.c (in the form of inline assembly) and a
assembly source file in nsprpub/pr/src/md/unix called os_Linux_x86.s to support
native atomic instructions. My PowerPC box felt left out.

Reproducible: Always

Actual Results:  
N/A


Patches to fix this are forthcoming.
Depends on: 296879
Attached patch remaining code changes (obsolete) — Splinter Review
This is a sample mozconfig file to create a build that will run on a 386.
Attachment #185530 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 185531 [details] [diff] [review]
remaining code changes

I suggest filing a separate set of bugs for targeting i386.  Right now, we do
not target any specific x86 processor and I'm pretty sure that we don't want to
use that particular env variable to start.

Is there any purpose for the #if 0 case in jslock.c?
Attachment #185531 - Flags: review-
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #5)
> (From update of attachment 185531 [details] [diff] [review] [edit])
> I suggest filing a separate set of bugs for targeting i386.  Right now, we do
> not target any specific x86 processor and I'm pretty sure that we don't want to
> use that particular env variable to start.
> 
> Is there any purpose for the #if 0 case in jslock.c?
> 
I would prefer not to file a separate bug. But if I am violating some kind of
"Thou Shall not put multiple fixes in a single bug" rule then I will. Does your
bug process allow you to select parts of submitted patches? Or to modify them
(i.e. to remove the debug printf()s)?

The correct decision that was made to use more modern atomic instructions does
target specific x86 processors, namely all 32-bit but the 386. If you run the
standard build on a 386 you get and illegal instruction crash.

One of the things I like about open source is that the decision to obsolete my
old computers is mine. I tried to add the 386 support in as unobtrusive a
fashion as possible. It only shows up as a few lines of code in a few
configuration scripts. And in the #ifdef in jslock.c.

As for the env variable I used to enable a 386 build, is it the fact that I used
an env varaible or the name I chose?
mostly because the bug summary is PowerPC and the people who would review one
patch aren't the people who'd review the other, nor would the people interested
in one necessarily be interested in the other.
Status: NEW → ASSIGNED
(In reply to comment #6)

> I would prefer not to file a separate bug. But if I am violating some kind of
> "Thou Shall not put multiple fixes in a single bug" rule then I will. Does your
> bug process allow you to select parts of submitted patches? Or to modify them
> (i.e. to remove the debug printf()s)?

Multiple *unrelated* fixes is the key part.  And as timeless points out, a
person capable/interested in review on isn't necessarily capable/interested in
reviewing the other.  Typically when there's a problem with a patch, a new one
is uploaded and the old one is marked as obsolete.
  
> As for the env variable I used to enable a 386 build, is it the fact that I used
> an env varaible or the name I chose?

Both.  I dislike the idea of using env variables when a pre-existing configure
option will do.  And the env variable is distinctly Mozilla specific though the
patch touches NSPR which is a separate product.  

I would prefer to use the standard $target_cpu variable to determine the level
of optimization used.  If configure is given or autodetects i386, then that's
what's used.  If it's given something else, then it uses the current default. 
That does change our current assumptions of grouping all i*86 builds together
although it should only affect people who are mistakingly explicitly configuring
their systems as i386-linux. 
(In reply to comment #7)
> mostly because the bug summary is PowerPC and the people who would review one
> patch aren't the people who'd review the other, nor would the people interested
> in one necessarily be interested in the other.

That is an excellent point. I had not thought of that. It probably also needs
its own bug for bug trail purposes anyway.

Do I need to fix the patches for this bug or can the powers that be select the
relevant parts?
386 code changes removed from previous patch
Attachment #185531 - Attachment is obsolete: true
Comment on attachment 185529 [details]
386 version of NSPR atomic operations (nsprpub/pr/src/md/unix)

will be moved to separate bug
Attachment #185529 - Attachment is obsolete: true
Comment on attachment 185532 [details]
sample mozconfig for 386 compatible build

will be moved to separate bug
Attachment #185532 - Attachment is obsolete: true
386 stuff moved to 298103
Blocks: 296879
No longer depends on: 296879
Comment on attachment 186346 [details] [diff] [review]
de-386-ized version of "remaining code changes"

Mark, does the js_CompareAndSwap implementation here look correct for OS X?
Attachment #186346 - Flags: superreview?(brendan)
Attachment #186346 - Flags: review?(mark)
Comment on attachment 186346 [details] [diff] [review]
de-386-ized version of "remaining code changes"

> Mark, does the js_CompareAndSwap implementation here look correct for OS X?

It doesn't even get touched on OS X, which doesn't define __powerpc__:

+#elif defined(__GNUC__) && defined(__powerpc__)

With minimal changes, this will work on OS X too.  I'm minusing it because it
doesn't work with "as" on Mac OS X.  That's sort of unfair, because the patch
never purported to work for Mac OS X.  The problems are purely syntactical, the
asm itself is fine although I've got a couple of nits.

+  "2: lwarx 6,0,%0\n"
+  "cmpw 7,6,%1\n"

Registers need to be identified with an "r" prefix, or the OS X as won't
understand them.  I also like to see condition register fields identified as
"cr."  I believe both of these changes will be understood by PPC Linux'
assembler, but I don't have a working PPC Linux system to test on at the
moment.  Someone, please verify that as on Linux understands r6 and cr7!

References to %0 should be %3 when "w" is being used, it makes the intent
clearer.

+  : "cc", "memory","r6");

This doesn't clobber memory.

The author gets bonus points for doing branch predicition properly!

os_Linux_ppc.s doesn't contain anything that will work on OS X, which already
has os_Darwin_ppc.s.  I glanced at the asm and it looks correct and
functionally identical to the Darwin version.  I'd find it easier to read if
the registers were identified with "r" prefixes, which would also make it
syntactically identical to the Darwin version.	If that's kosher on Linux, then
with some minor #ifdef use, it would be possible to use a single .s for both
Darwin and Linux.
Attachment #186346 - Flags: superreview?(brendan)
Attachment #186346 - Flags: review?(mark)
Attachment #186346 - Flags: review-
Attachment #185530 - Attachment is obsolete: true
Attachment #186346 - Attachment is obsolete: true
Attachment #195701 - Flags: superreview?(brendan)
Attachment #195701 - Flags: review+
Comment on attachment 195701 [details] [diff] [review]
Also handle Darwin/OS X for JS lock

kevdig, could you give this a spin on PPC Linux to make sure that registers can
be identified with "r" and "cr"?
Attachment #195701 - Flags: review?(kevdig)
Comment on attachment 195701 [details] [diff] [review]
Also handle Darwin/OS X for JS lock

Oops, contained portions from bug 302212.
Attachment #195701 - Flags: superreview?(brendan)
Attachment #195701 - Flags: review?(kevdig)
Attachment #195701 - Attachment is obsolete: true
Attachment #195703 - Flags: superreview?(brendan)
Attachment #195703 - Flags: review?(kevdig)
(In reply to comment #17)
> (From update of attachment 195701 [details] [diff] [review] [edit])
> kevdig, could you give this a spin on PPC Linux to make sure that registers can
> be identified with "r" and "cr"?
> 

I added some "r"s to the assembly output of hello world:

[kevdig@PowerMac8600B /tmp]$ diff -U3 hello.s hello.red.s 
--- hello.s     Mon Sep 12 11:55:13 2005
+++ hello.red.s Mon Sep 12 11:39:55 2005
@@ -9,11 +9,11 @@
        .globl main
        .type    main,@function
 main:
-       stwu 1,-32(1)
-       mflr 0
-       stw 31,28(1)
-       stw 0,36(1)
-       mr 31,1
+       stwu r1,-32(1)
+       mflr r0
+       stw r31,28(1)
+       stw r0,36(1)
+       mr r31,r1
        stw 3,8(31)
        stw 4,12(31)
        li 0,0

Trying to link this gives:

[kevdig@PowerMac8600B /tmp]$ gcc -ohello hello.red.s 
hello.red.s: Assembler messages:
hello.red.s:12: Error: unsupported relocation against r1
hello.red.s:13: Error: unsupported relocation against r0
hello.red.s:14: Error: unsupported relocation against r31
hello.red.s:15: Error: unsupported relocation against r0
hello.red.s:16: Error: unsupported relocation against r31
hello.red.s:16: Error: unsupported relocation against r1

This is on YellowDog 2.1 with compiler 2.95.4.

[kevdig@PowerMac8600B /tmp]$ as -version
GNU assembler 2.11.93.0.2 20020207
Copyright 2002 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License.  This program has absolutely no warranty.
This assembler was configured for a target of `ppc-yellowdog-linux'.

[kevdig@PowerMac8600B /tmp]$ rpm -qf `which as`
binutils-2.11.93.0.2-3b

What does the "References to %0 should be %3 when "w" is being used" mean?

I have read through some of the errata for a few PPC processors. Some of them
have some scary comments on the reservation instructions. Are there some
processors that this will cause serious trouble with?
Attachment #195703 - Flags: superreview?(brendan)
Attachment #195703 - Flags: review?(kevdig)
So registers must be identified with the "r" prefix on OS X, and without the
prefix on Linux.  Considering how most IBM docs I've seen leave the "r" out,
I'd say that OS X is the exception.  We could go back to your #ifdefs that
leave OS X/PPC out, but I'd like to get this on OS X/PPC too.  This patch uses
the "r" on OS X and Darwin, and leaves it out on anything else.

> What does the "References to %0 should be %3 when "w" is being used" mean?

The __asm__ uses variables res, ov, nv, and w (%0 through %3) in registers; res
and w are in the same register.  When that register is being used to store res,
refer to it as %0.  When it's used to store w, use %3.

> I have read through some of the errata for a few PPC processors. Some of
> them have some scary comments on the reservation instructions. Are there
> some processors that this will cause serious trouble with?

They're safe on the PPCs used in Macs, where we already use the reservation
instructions in atomic operations.  The system provides some of its own atomic
primitives that also rely on reservation ops.  This is a pretty standard
compare-and-swap for PPC, so I'd be surprised if this sequence was bad on some
CPU.  The only way to make this safer would be to kill the reservation if ov !=
*w, but a leftover reservation shouldn't present a problem.  Are there specific
CPUs or specific errata that bug you?
Attachment #195703 - Attachment is obsolete: true
Attachment #195825 - Flags: superreview?(brendan)
Attachment #195825 - Flags: review?(kevdig)
Comment on attachment 195825 [details] [diff] [review]
Handle Darwin/OS X with platform-dependent register identification

My r+ covers the patch as modified here, but I still want kevdig to verify on
Linux.
Attachment #195825 - Flags: review+
(In reply to comment #21)
> 
> So registers must be identified with the "r" prefix on OS X, and without the
> prefix on Linux.  Considering how most IBM docs I've seen leave the "r" out,
> I'd say that OS X is the exception.  We could go back to your #ifdefs that
> leave OS X/PPC out, but I'd like to get this on OS X/PPC too.  This patch uses
> the "r" on OS X and Darwin, and leaves it out on anything else.
> 
I will check and see what YellowDog 4.0.1 does with the register stuff sometime
today and let you know. Surely we can find someone somewhere who is more
knowledgeable about the PPC tool chain who can tell me if there is a valid
compiler I can install on YDL 2.1 that fixes this?

> 
> The __asm__ uses variables res, ov, nv, and w (%0 through %3) in registers; res
> and w are in the same register.  When that register is being used to store res,
> refer to it as %0.  When it's used to store w, use %3.
> 
So you are saying I can refer to those overloaded registers as either their
sequential position or the overloaded position? I did not know that.

> 
> They're safe on the PPCs used in Macs, where we already use the reservation
> instructions in atomic operations.  The system provides some of its own atomic
> primitives that also rely on reservation ops.  This is a pretty standard
> compare-and-swap for PPC, so I'd be surprised if this sequence was bad on some
> CPU.  The only way to make this safer would be to kill the reservation if ov !=
> *w, but a leftover reservation shouldn't present a problem.  Are there specific
> CPUs or specific errata that bug you?

Don't quote me but I could have sworn reading in the errata that some 745xs can
report that the reservation was stolen even though it succeeded. And I found a
document on IBM's web site that said that reservations should be cancelled in
all int handlers?
> And I found a
> document on IBM's web site that said that reservations should be cancelled in
> all int handlers?

I think this was for the 970.
(In reply to comment #22)
> (From update of attachment 195825 [details] [diff] [review] [edit])
> My r+ covers the patch as modified here, but I still want kevdig to verify on
> Linux.
> 

I assume by this you mean you want me to apply the new jslock.c patch to a
pristine jslock.c and see if it still compiles. Ok.
(In reply to comment #23)
> I will check and see what YellowDog 4.0.1 does with the register stuff
> sometime today and let you know. Surely we can find someone somewhere who is
> more knowledgeable about the PPC tool chain who can tell me if there is a
> valid compiler I can install on YDL 2.1 that fixes this?

I'd rather not bump the build requirements just for this - your original asm was
valid, just not on OS X.  The whole reason I wanted you to test was to make sure
that the "r" was friendly with a supported Linux PPC assembler.

> So you are saying I can refer to those overloaded registers as either their
> sequential position or the overloaded position? I did not know that.

Sure, the compiler maps %3 back to %0.  The next argument after "w", if there
was one, would be %4, not %3.

> Don't quote me but I could have sworn reading in the errata that some 745xs
> can report that the reservation was stolen even though it succeeded.

7400/7410?  Some processors do unexpected and inconsistent things when the lwarx
that created a reservation and a stwcx. operate on different addresses.  This is
one reason it's important for the interrupt handler to kill the reservation, and
why the lwarx and stwcx. should be as close as possible to one another.

(In reply to comment #24)
> > And I found a
> > document on IBM's web site that said that reservations should be cancelled
> > in all int handlers?
> 
> I think this was for the 970.

We can't do anything about interrupts (and your OS should kill reservations
while handling one), but a recent note for the 970 family advises not leaving
reservations around.  I've stuck it in the URL field of this bug.  This change
makes sure the reservation is killed even when *w != ov:

  "3: lwarx " _REG "6,0,%3\n"
  "cmpw " _CONDREG "7," _REG "6,%1\n"
  "bne- " _CONDREG "7,1f\n"
  "mr " _REG "6,%2\n"
  "1: stwcx. " _REG "6,0,%3\n"
  "bne- " _CONDREG "7,2f\n"
  "bne- 3b\n"
  "2: mfcr %0\n"
  "rlwinm %0,%0,31,31,31\n"

I'd be happy to go with this, it's slightly safer although it probably doesn't
make a difference for our purposes.
We can ask Philip Warren to ask a PowerPC expert at
IBM to review the PowerPC assembly.

See bug 193969 comment 2 for the disassembly of OTCompareAndSwap32
on Mac OS X.
CCing Philip.
(In reply to comment #26)
> (In reply to comment #23)
> > I will check and see what YellowDog 4.0.1 does with the register stuff
> > sometime today and let you know. Surely we can find someone somewhere who is
> > more knowledgeable about the PPC tool chain who can tell me if there is a
> > valid compiler I can install on YDL 2.1 that fixes this?
> 
YellowDog 4.0.1 (a 2.6 based dist) does not like the r and cr either.

> I'd rather not bump the build requirements just for this - your original asm was
> valid, just not on OS X.  The whole reason I wanted you to test was to make sure
> that the "r" was friendly with a supported Linux PPC assembler.
> 
I applied this patch to my jslock and it seemed to build ok.

> We can't do anything about interrupts (and your OS should kill reservations
> while handling one), but a recent note for the 970 family advises not leaving
> reservations around.  I've stuck it in the URL field of this bug.  This change
> makes sure the reservation is killed even when *w != ov:
> 
>   "3: lwarx " _REG "6,0,%3\n"
>   "cmpw " _CONDREG "7," _REG "6,%1\n"
>   "bne- " _CONDREG "7,1f\n"
>   "mr " _REG "6,%2\n"
>   "1: stwcx. " _REG "6,0,%3\n"
>   "bne- " _CONDREG "7,2f\n"
>   "bne- 3b\n"
>   "2: mfcr %0\n"
>   "rlwinm %0,%0,31,31,31\n"
> 
> I'd be happy to go with this, it's slightly safer although it probably doesn't
> make a difference for our purposes.

ok???

I looked at that disassembly. What is with the li 0,0? Isn't that part of the
PPC arch definition? And the sync before the stwcx.? They required? Maybe for
some errata?
> I looked at that disassembly. What is with the li 0,0? Isn't that part of the
> PPC arch definition?

r0 is hardwired to 0 in the contexts in which it's used in that asm.  I have no
idea why it loads 0.

> And the sync before the stwcx.? They required? Maybe for
> some errata?

Possibly.  But I note that in the current release (10.4.2), OTCompareAndSwap32
doesn't implement atomicity directly, it seems to wrap pthreads instead.

Now, we've got OSAtomicCompareAndSwap32, which just branches to
__compare_and_swap32.  It looks almost exactly like what we've got here, without
 sync or isync, and without killing the reservation at <+28>.

http://www.opensource.apple.com/darwinsource/10.4.2/xnu-792.2.4/osfmk/ppc/commpage/atomic.s

0xffff8080 <___compare_and_swap32+0>:   lwarx   r7,r0,r5
0xffff8084 <___compare_and_swap32+4>:   cmplw   r7,r3
0xffff8088 <___compare_and_swap32+8>:   bne-    0xffff809c
<___compare_and_swap32+28>
0xffff808c <___compare_and_swap32+12>:  stwcx.  r4,r0,r5
0xffff8090 <___compare_and_swap32+16>:  bne-    0xffff8080 <___compare_and_swap32>
0xffff8094 <___compare_and_swap32+20>:  li      r3,1
0xffff8098 <___compare_and_swap32+24>:  blr
0xffff809c <___compare_and_swap32+28>:  li      r3,0
0xffff80a0 <___compare_and_swap32+32>:  blr
How can I (or anyone) help to get this bug fixed?

/be
wtc suggested having someone from IBM look it over.  I'm happy with the patch in the most recent attachment.
Can we get some traction here?  This is fully functional on Mac/ppc and Linux/ppc.  If we want someone else to look it over, we should find that "someone else," otherwise, I don't see how this is going to get any more "done."

Wan-Teh?
Comment on attachment 195825 [details] [diff] [review]
Handle Darwin/OS X with platform-dependent register identification

I remember that I wrote NSPR's os_Darwin_ppc.s (bug 193969)
by consulting the book "The PowerPC Architecture: A Specification
for A New Family of RISC Processors".  Since the code was
written according to the architecture manual, it doesn't
handle any processor bugs.  (I don't know if any PowerPC
processor has bugs in this area though.)  This is fine for
Mac OS X because it only runs on Apple computers.  But the
hardware Linux runs on is much more diverse.  Since kevdig
mentioned some errata for certain PowerPC processors in
comment 23, I suggested that someone knowledgeable on PowerPC
should review this patch.  The correctness of this kind of
assembly code usually needs to be determined by code review
because a bug may be difficult to be discovered by testing.
Since we have trouble finding an expert to review the code,
I am fine with just using the "textbook" assembly code in
this patch.

I looked at the NSPR changes in this patch and have some
comments.

In mozilla/nsprpub/configure.in, we have:

>+    ppc|powerpc)
>+        PR_MD_ASFILES=os_Linux_ppc.s
>+        ;;

Note that ppc|powerpc refers to 32-bit binary code.  There
is a powerpc64 case below in this case statement for 64-bit
binary code.

In mozilla/nsprpub/pr/include/md/_linux.h, we have:

>+#if defined(__powerpc__)
>+#define _PR_HAVE_ATOMIC_OPS
...
>+#endif

To be consistent with the change to mozilla/nsprpub/configure.in,
the #if should be written as:

#if defined(__powerpc__) && !defined(__powerpc64__)

Alternatively, we should make os_Linux_ppc.s usable for
both 32-bit and 64-bit binary code.  Is this possible?

In the new file mozilla/nsprpub/pr/src/md/unix/os_Linux_ppc.s,
I found that you changed the "rd" in os_Darwin_ppc.s to "d",
where d is a digit (0, 3, 4, 5).  I guess that's due to the
different syntax used by the Linux assembler.  I have a question
about the duplicate .Lfdx labels at the beginning of the
functions.  Can we just use the function name labels
_PR_linuxppc_AtomicXXX instead?

Remember to add the following at the end of the file:
# Magic indicating no need for an executable stack
.section .note.GNU-stack, "", @progbits ; .previous
I have asked a few people at IBM to review this patch and hope to have some comments soon. One person recommended using the -mregnames argument to GNU AS to let you remove the _REG and _CONDREG defines to be more like OS X.
(In reply to comment #34)
  But the
> hardware Linux runs on is much more diverse.  Since kevdig
> mentioned some errata for certain PowerPC processors in
> comment 23, I suggested that someone knowledgeable on PowerPC
> should review this patch.  The correctness of this kind of
> assembly code usually needs to be determined by code review
> because a bug may be difficult to be discovered by testing.
> Since we have trouble finding an expert to review the code,
> I am fine with just using the "textbook" assembly code in
> this patch.
> 
I have spent enough time searching for a job, watching my teams lose on ESPN and CBSsportline, getting snipped on eBay, and watching my stocks drop on yahoo to feel confident that this patch is stable. At least on a 750GX equipped PowerMac 8600. All of my PPC machines are Macs. The processors are all upgradeable. If I find any of them with sync primitive problems I will, money permitting, replace the processors and see if Intel will loan me their Pentium grinder.

I would hope that anyone designing nuclear reactor control systems is already aware of any problems that some PPCs may have and will not use any susceptible chips. Or they will use an extremely thick containment vessel.

Thus I vote we throw caution to the wind and bloody check it in.

> I looked at the NSPR changes in this patch and have some
> comments.
> 
> In mozilla/nsprpub/configure.in, we have:
> 
> >+    ppc|powerpc)
> >+        PR_MD_ASFILES=os_Linux_ppc.s
> >+        ;;
> 
> Note that ppc|powerpc refers to 32-bit binary code.  There
> is a powerpc64 case below in this case statement for 64-bit
> binary code.
> 
> In mozilla/nsprpub/pr/include/md/_linux.h, we have:
> 
> >+#if defined(__powerpc__)
> >+#define _PR_HAVE_ATOMIC_OPS
> ...
> >+#endif
> 
> To be consistent with the change to mozilla/nsprpub/configure.in,
> the #if should be written as:
> 
> #if defined(__powerpc__) && !defined(__powerpc64__)
> 
> Alternatively, we should make os_Linux_ppc.s usable for
> both 32-bit and 64-bit binary code.  Is this possible?
> 

Excellent point. I don't have a 64-bit PPC machine. Did not even think of this. os_Linux_ppc.s will work on 64-bit, right? We don't need the 64-bit variants of this reservation stuff, do we? I would think the 32-bit variants will still work fine. Anyone out there with a 64-bit PPC box willing to see what happens?

> In the new file mozilla/nsprpub/pr/src/md/unix/os_Linux_ppc.s,
> I found that you changed the "rd" in os_Darwin_ppc.s to "d",
> where d is a digit (0, 3, 4, 5).  I guess that's due to the
> different syntax used by the Linux assembler.  I have a question
> about the duplicate .Lfdx labels at the beginning of the
> functions.  Can we just use the function name labels
> _PR_linuxppc_AtomicXXX instead?
> 

No, you can not remove the duplicate labels. I spent days trying to figure out a linker problem. It (or the assembler) creates an unsupported relocation type when you branch to a location that is also a public entry point. At least on Linux 2.4.x.
(In reply to comment #35)
> I have asked a few people at IBM to review this patch and hope to have some
> comments soon. One person recommended using the -mregnames argument to GNU AS
> to let you remove the _REG and _CONDREG defines to be more like OS X.
> 

Are you saying that there may be a command line switch to get the Linux AS to accept the OSX reg naming format? I am all for that. As much as I admire the skill needed to create those macros to use the same code for both, PPC assembly is hard enough to read and understand. I will investigate this and get back to you.
(In reply to comment #37)

> 
> Are you saying that there may be a command line switch to get the Linux AS to
> accept the OSX reg naming format? I am all for that. As much as I admire the
> skill needed to create those macros to use the same code for both, PPC assembly
> is hard enough to read and understand. I will investigate this and get back to
> you.
> 

I took this program:

#include <stdio.h>

int main(int argc, char *argv[])
{
int i;

        for(i=0; i<10000000; i++)
                ;

        return 0;
}

and generated assembly with gcc -S hello.c:

        .file   "hello.c"
gcc2_compiled.:
        .section        ".text"
        .align 2
        .globl main
        .type    main,@function
main:
        stwu 1,-32(1)
        stw 31,28(1)
        mr 31,1
        stw 3,8(31)
        stw 4,12(31)
        li 0,0
        stw 0,16(31)
.L3:
        lwz 0,16(31)
        lis 9,0x98
        ori 9,9,38527
        cmpw 0,0,9
        bc 4,1,.L5
        b .L4
.L6:
.L5:
        lwz 9,16(31)
        addi 0,9,1
        stw 0,16(31)
        b .L3
.L4:
        li 3,0
        b .L2
.L2:
        lwz 11,0(1)
        lwz 31,-4(11)
        mr 1,11
        blr
.Lfe1:
        .size    main,.Lfe1-main
        .ident  "GCC: (GNU) 2.95.4 20010319 (prerelease/franzo/20011204)"

I then edited it as follows:

[kevdig@PowerMac8600B kevdig]$ diff -U3 hello.s hello-mod.s 
--- hello.s     Thu Jan  5 03:39:52 2006
+++ hello-mod.s Thu Jan  5 03:26:31 2006
@@ -5,17 +5,17 @@
        .globl main
        .type    main,@function
 main:
-       stwu 1,-32(1)
-       stw 31,28(1)
-       mr 31,1
-       stw 3,8(31)
-       stw 4,12(31)
+       stwu 1,-32(r1)
+       stw r31,28(r1)
+       mr r31,r1
+       stw r3,8(r31)
+       stw r4,12(r31)
        li 0,0
-       stw 0,16(31)
+       stw r0,16(r31)
 .L3:
-       lwz 0,16(31)
-       lis 9,0x98
-       ori 9,9,38527
+       lwz r0,16(r31)
+       lis r9,0x98
+       ori r9,r9,38527
        cmpw 0,0,9
        bc 4,1,.L5
        b .L4

I then ran:

gcc -Wa,-mregnames hello-mod.s

It built something that seemed to run.

Anyone know how to add this option into the build stuff for this file?
You can add -Wa,-mregnames or -mregnames to ASFLAGS,
depending on whether AS is the compiler or the assembler.

I have some other PowerPC assembly code for PRStack
(an atomic LIFO singly-linked list) in bug 194339.
As LXR shows (http://lxr.mozilla.org/seamonkey/ident?i=PRStack),
inside the Mozilla clients, PRStack is only used
internally by NSPR to implement the PRFileDesc freelist.
So it may not be as important to the Mozilla clients
as to the servers.  But if anyone is interested in
reviewing the PRStack PowerPC assembly code, I'd appreciate
it.
Two comments on the assembly code:
1. 64-bit instruction is ldarx/stdcx/cmpd
2. it may need isync after stwcx instruction in a SPM machine
Blocks: 20357
QA Contact: wtchang → nspr
(In reply to comment #39)
> You can add -Wa,-mregnames or -mregnames to ASFLAGS,
> depending on whether AS is the compiler or the assembler.
> 
Is it possible in the build system to add these on a per file basis? Won't adding these screw up the assembly of compiled code? And what about the inline assembly?
(In reply to comment #40)
> Two comments on the assembly code:
> 1. 64-bit instruction is ldarx/stdcx/cmpd
> 2. it may need isync after stwcx instruction in a SPM machine
> 

Is mozilla built on a ppc64 box a 64 bit program (pointers are 64 bit)? If so we would need a separate set of files, no?

Probably safer to just add it and be done with it.

I ain't not got no G5 and can't not be of no help here.

This has been open long enough. Let's get this checked in. If it triggers armageddon so be it. Maybe we can get it done before the next Finals game Sunday night?
kevdig: can you mark r+, or do you want to clear that r? review request from Mark to you? I wait to sr+ until I see an r+, which is one excuse for my not helping this bug get fixed. Sorry, I'll help now. Cc'ing crowder, who can help too I hope.

/be
(In reply to comment #43)
> kevdig: can you mark r+, or do you want to clear that r? review request from
> Mark to you? I wait to sr+ until I see an r+, which is one excuse for my not
> helping this bug get fixed. Sorry, I'll help now. Cc'ing crowder, who can help
> too I hope.
> 
> /be
> 

I have been looking at the -mregnames option to as and it does not work like I expected. It does not prevent compiles from working like I expected. It apparently allows either syntax.

So tomorrow I'll look at using the Mac assembler syntax along with -mregnames and see if it will build.
(In reply to comment #44)
> 
> So tomorrow I'll look at using the Mac assembler syntax along with -mregnames
> and see if it will build.
> 

I ran:

CFLAGS="-Wa,-mregnames" time nice -19 make -f client.mk build

and the build seemed to complete.

Would the best thing be to add "-Wa,-mregnames" to ... something in js/src/Makefile.in? CFLAGS? HOST_CFLAGS? _COMPILE_CFLAGS?

Maybe something like:

#
# To enable MacOS X and Linux to use the same PowerPC thinlock code
# (js_CompareAndSwap()), pass the -Wa,-mregnames flag for Linux
#
ifneq (,$(filter powerpc% ppc%,$(TARGET_CPU)))
ifeq ($(OS_ARCH),Linux)
...CFLAGS += -Wa,-mregnames
endif
endif
I'm trying to generate a new patch:

cvs -z3 diff -u8pN -r FIREFOX_2_0_0_1_RELEASE .|tee ../cvsdiff.txt

It does not appear to be doing anything. What am I doing wrong?
Comment on attachment 195825 [details] [diff] [review]
Handle Darwin/OS X with platform-dependent register identification

Clearing old requests.

/be
Attachment #195825 - Flags: superreview?(brendan)
Attachment #195825 - Flags: review?(kevdig)
is this ready for check in?
I can't seem to generate a new patch. The current version of the compare and swap routing is:

    __asm__ __volatile__ (
                          "2: lwarx r6,r0,%3\n"
                          "cmpw cr7,r6,%1\n"
                          "bne- cr7,1f\n"
                          "stwcx. %2,r0,%3\n"
                          "sync\n"
                          "bne- 2b\n"
                          "1: mfcr %0\n"
                          "rlwinm %0,%0,31,31,31\n"
                          : "=r" (res)
                          : "r" (ov), "r" (nv), "0" (w)
                          : "cc", "memory","r6");

kevin
Attached patch NSPR patch (obsolete) — Splinter Review
I extracted the NSPR portions of Kevin/Mark's patch here.

Kevin, does this assembly code in this patch work when we
are compiling for powerpc64 (i.e., __powerpc64__ as opposed
to __powerpc__ is defined)?
I don't have a G5 (aka 64-bit ppc box) and don't know.

Since I appear to be the only one on this planet who surfs the web using mozilla or firefox using a ppc linux box I say lets just check it in.

Unless someone who has access to a G5 steps up just leave atomic support out for them?

kevin
Kevin: the issue is that __powerpc__ is also defined for 64-bit
ppc build target, so if the assembly code doesn't work for that
build target, the ifdef we use in your patch for
mozilla/nsprpub/pr/include/md/_linux.h would need to say:
#if defined(__powerpc__) && !defined(__powerpc64__)
I checked in the NSPR patch on the NSPR trunk for NSPR 4.7.
I assume that the assembly code works for powerpc64, too.

Checking in configure.in;
/cvsroot/mozilla/nsprpub/configure.in,v  <--  configure.in
new revision: 1.225; previous revision: 1.224
done
Checking in configure;
/cvsroot/mozilla/nsprpub/configure,v  <--  configure
new revision: 1.222; previous revision: 1.221
done
Checking in pr/include/md/_linux.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_linux.h,v  <--  _linux.h
new revision: 3.47; previous revision: 3.46
done
RCS file: /cvsroot/mozilla/nsprpub/pr/src/md/unix/os_Linux_ppc.s,v
done
Checking in pr/src/md/unix/os_Linux_ppc.s;
/cvsroot/mozilla/nsprpub/pr/src/md/unix/os_Linux_ppc.s,v  <--  os_Linux_ppc.s
initial revision: 1.1
done
Attachment #298068 - Attachment is obsolete: true
Kevin, this bug is filed against NSPR.  Ideally you should
open two new bugs against the following products/components:

- Directory: LDAP C SDK
  (https://bugzilla.mozilla.org/enter_bug.cgi?product=Directory)
- Core: JavaScript Engine
  (https://bugzilla.mozilla.org/enter_bug.cgi?product=Core)

and attach the remaining changes of your patch (attachment 195825 [details] [diff] [review])
to the respective bugs.

At least, the owners of those two modules need to check in those
changes for you.  I don't mind just using this bug report to finish
the job.
Target Milestone: --- → 4.7
Wan-Teh asked me to test the new code on a ppc64 machine.
I used a PS3 with Linux installed.

It seems the code does not work.
I see a crash when running the "atomic", "cvar2" and "perf" test programs.

$ gdb ./atomic
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ppc64-yellowdog-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) r
Starting program: /home/kaie/nspr/obj64/pr/tests/atomic
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0x100000
[Thread debugging using libthread_db enabled]
[New Thread 4398046692064 (LWP 16661)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4398046692064 (LWP 16661)]
0x7c80182838040000 in ?? ()
(gdb) bt
#0  0x7c80182838040000 in ?? ()
#1  0x000004000008513c in PR_AtomicIncrement (val=0x400000d4124) at ../../../../mozilla/nsprpub/pr/src/misc/pratom.c:306
#2  0x00000400000811b0 in PR_NewThreadPrivateIndex (newIndex=0x400000d4118,
    dtor=@0x400000cdf68: 0x40000080df4 <_PR_RELEASE_LOCK_STACK>) at ../../../../mozilla/nsprpub/pr/src/threads/prtpd.c:140
#3  0x0000040000080c6c in _PR_InitRWLocks () at ../../../../mozilla/nsprpub/pr/src/threads/prrwlock.c:398
#4  0x000004000008db1c in _PR_InitStuff () at ../../../../mozilla/nsprpub/pr/src/misc/prinit.c:242
#5  0x000004000008db68 in _PR_ImplicitInitialization () at ../../../../mozilla/nsprpub/pr/src/misc/prinit.c:251
#6  0x00000400000a3a7c in PR_GetSpecialFD (osfd=PR_StandardOutput) at ../../../../mozilla/nsprpub/pr/src/pthreads/ptio.c:1213
#7  0x00000000100008c0 in main (argc=1, argv=0xfffff85e458) at ../../../mozilla/nsprpub/pr/tests/atomic.c:45
$ uname -a
Linux localhost.localdomain 2.6.23-9.ydl5.1 #1 SMP Tue Nov 27 19:09:32 MST 2007 ppc64 ppc64 ppc64 GNU/Linux

$ cat /proc/cpuinfo
processor       : 0
cpu             : Cell Broadband Engine, altivec supported
clock           : 3192.000000MHz
revision        : 5.1 (pvr 0070 0501)

processor       : 1
cpu             : Cell Broadband Engine, altivec supported
clock           : 3192.000000MHz
revision        : 5.1 (pvr 0070 0501)

timebase        : 79800000
platform        : PS3
(In reply to comment #54)
 
> - Directory: LDAP C SDK
>   (https://bugzilla.mozilla.org/enter_bug.cgi?product=Directory)
> - Core: JavaScript Engine
>   (https://bugzilla.mozilla.org/enter_bug.cgi?product=Core)
> 
There is already a bug opened against the JavaScript Engine - 296879.

> At least, the owners of those two modules need to check in those
> changes for you.  I don't mind just using this bug report to finish
> the job.
> 
As soon as we get this 64-bit mess straightened out I'll open a bug against the other one.

Will this let me use a more restricted (i.e. smaller module) cvs diff that MIGHT actually work for me?
(In reply to comment #55)
> It seems the code does not work.
> I see a crash when running the "atomic", "cvar2" and "perf" test programs.
> 
Forgive my ignorance, but what are these? Are they part of enable-tests? Can you  run the file command on them and on mozilla-bin in lib/firefox. When I run it on my firefox-bin I get:

/opt/firefox2.0-gtk2-atomic/lib/firefox-2.0.0.1/firefox-bin: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 4398046692064 (LWP 16661)]
> 0x7c80182838040000 in ?? ()
This is a 16 digit hex number which suggests a 64-bit number

> (gdb) bt
> #0  0x7c80182838040000 in ?? ()
> #1  0x000004000008513c in PR_AtomicIncrement (val=0x400000d4124) at
The argument val here is an 11 digit hex number - 44-bit??? It is supposed to be a pointer to a PRInt32 - hopefully a 32-bit int. The address at the start of the line has the leading zeros.

Do you know if you can create both 32-bit and 64-bit executables on the PS3? I'm gonna try to figure out what compile options are used on 64-bit ppc are and what they do. If someone already knows that would be helpful.
I have a patch to turn off the assembly code if __powerpc64__
is defined.  I also found an IBM article
http://www.ibm.com/developerworks/linux/library/l-powasm4.html?S_TACT=105AGX03&S_CMP=EDU
that explains how to write a function in 64-bit PowerPC assembly
language.
This patch disables the use of os_Linux_ppc.s for 64-bit
builds, and add the nonexecutable stack directive to
os_Linux_ppc.s.

I checked in this patch on the NSPR trunk for NSPR 4.7.

Checking in configure.in;
/cvsroot/mozilla/nsprpub/configure.in,v  <--  configure.in
new revision: 1.226; previous revision: 1.225
done
Checking in configure;
/cvsroot/mozilla/nsprpub/configure,v  <--  configure
new revision: 1.223; previous revision: 1.222
done
Checking in pr/include/md/_linux.h;
/cvsroot/mozilla/nsprpub/pr/include/md/_linux.h,v  <--  _linux.h
new revision: 3.48; previous revision: 3.47
done
Checking in pr/src/md/unix/os_Linux_ppc.s;
/cvsroot/mozilla/nsprpub/pr/src/md/unix/os_Linux_ppc.s,v  <--  os_Linux_ppc.s
new revision: 1.2; previous revision: 1.1
done
I have finally stumbled around and got this change to build on a G5 running YDL 6. This is against 2.0.0.15pre. After I test the latest patches on some 32-bit systems, I'll TRY to generate new patches.

kevin

P.S.:  Even on a G5, Firefox is a 32-bit program.
This set of patches (less os_Linux_ppc.s) is the current patch that I have used to build this with. It works on both ppc (i.e. 32-bit - Yellow Dog 4.0) and ppc64 (i.e. 64-bit - Yellow Dog 6.0).

As previously noted it builds a 32-bit application on ppc64.

kevin
These patches do not address whatever the --enable-64bit configure option is supposed to do. Seems to me that that is kinda out of scope for this patch?

kevin
Attachment #319691 - Attachment mime type: application/octet-stream → text/plain
Blocks: 432553
(In reply to comment #57)
> (In reply to comment #54)
> 
> > - Directory: LDAP C SDK
> >   (https://bugzilla.mozilla.org/enter_bug.cgi?product=Directory)
> > - Core: JavaScript Engine
> >   (https://bugzilla.mozilla.org/enter_bug.cgi?product=Core)
> > 
> There is already a bug opened against the JavaScript Engine - 296879.
> 
Bug 432553 was filed against mozilla application suite for the Directory product. (Should this have been filed against firefox?)

What product do I file against for the xpcom/ change?

kevin
I'm sorry that I didn't finish the 64-bit PowerPC part of this bug.
I remember I ported the assembly code to 64-bit. I even got an
account on the PowerPC Linux computers at University of Oregon's
open source labs. But I didn't attach my patch to this bug and now
I don't know where that patch is.

I'm going to mark this bug as fixed in NSPR 4.7. Here is a summary
of what was done: the PowerPC assembly code in os_Linux_ppc.s is
used in 32-bit PowerPC builds only.

I opened bug 986745 for the NSPR atomic functions for 64-bit PowerPC
builds.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Priority: -- → P2
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.