Closed Bug 501881 Opened 15 years ago Closed 14 years ago

Mozilla Firefox 64bit is crashing on AIX @js_AtomizeString

Categories

(Core :: JavaScript Engine, defect)

1.9.1 Branch
PowerPC
AIX
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- .7-fixed
status1.9.1 --- .11-fixed

People

(Reporter: shailen.n.jain, Assigned: ul-mcamafia)

Details

(Keywords: crash, verified1.9.1, verified1.9.2)

Attachments

(3 files, 15 obsolete files)

772 bytes, patch
jimb
: review+
wtc
: review+
Details | Diff | Splinter Review
772 bytes, patch
Details | Diff | Splinter Review
735 bytes, patch
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11

Below is the stack trace for the coredump

# dbx ./firefox-bin ./core
Type 'help' for help.
[using memory image in ./core]
reading symbolic information ...warning: jsatom.cpp is newer than ./libmozjs.so


IOT/Abort trap in pthread_kill at 0x9000000004786bc ($t1)
0x9000000004786bc (pthread_kill+0x88) e8410028          ld   r2,0x28(r1)
(dbx) where
pthread_kill(??, ??) at 0x9000000004786bc
_p_raise(??) at 0x9000000004780d0
nsProfileLock::FatalSignalHandler(int)(signo = 150994944), line 212 in "nsProfileLock.cpp"
pthread_kill(??, ??) at 0x9000000004786bc
_p_raise(??) at 0x9000000004780d0
raise.raise(??) at 0x900000000057298
abort() at 0x900000000082dac
JS_Assert(s = "", file = warning: Unable to access address 0x10000001 from core
(invalid char ptr (0x0000000010000001)), ln = 268435455), line 69 in "jsutil.cpp"
js_AtomizeString(cx = 0x00000001101f0eb0, str = 0x0fffffffffffdba8, flags = 9), line 743 in "jsatom.cpp"
js_Atomize(cx = 0x00000001101f0eb0, bytes = "", length = 0, flags = 1), line 786 in "jsatom.cpp"
js_InitCommonAtoms(cx = 0x00000001101f0eb0), line 490 in "jsatom.cpp"
js_NewContext(rt = 0x0fffffffffffde20, stackChunkSize = 1152921504606838672), line 439 in "jscntxt.cpp"
JS_NewContext(rt = 0x090000002970fd50, stackChunkSize = 0), line 1048 in "jsapi.cpp"
mozJSComponentLoader::ReallyInit()(this = 0x0fffffffffffe000), line 583 in "mozJSComponentLoader.cpp"
mozJSComponentLoader.~nsCOMPtr()(??, ??, ??), line 668 in "mozJSComponentLoader.cpp"
unnamed block in nsComponentManager.nsComponentManagerImpl::AutoRegisterComponent(nsILocalFile*,nsTArray<DeferredModule>&,int)(this = 0x0900000029713758, aComponentFile = 0x09001000a853fa18, aDeferred = &(...), minLoader = 1), line 3040 in "nsComponentManager.cpp"
nsComponentManager.nsComponentManagerImpl::AutoRegisterComponent(nsILocalFile*,nsTArray<DeferredModule>&,int)(this = 0x0900000029713758, aComponentFile = 0x09001000a853fa18, aDeferred = &(...), minLoader = 1), line 3040 in "nsComponentManager.cpp"
unnamed block in nsComponentManagerImpl::LoadLeftoverComponents(nsCOMArray<nsILocalFile>&,nsTArray<DeferredModule>&,int)(this = 0x0000000000000008, aLeftovers = &(...), aDeferred = &(...), minLoader = 0), line 3094 in "nsComponentManager.cpp"
nsComponentManagerImpl::LoadLeftoverComponents(nsCOMArray<nsILocalFile>&,nsTArray<DeferredModule>&,int)(this = 0x0000000000000008, aLeftovers = &(...), aDeferred = &(...), minLoader = 0), line 3094 in "nsComponentManager.cpp"
nsComponentManager.nsComponentManagerImpl::AutoRegister(nsIFile*)(this = 0x0000000000000020, aSpec = 0x090000002984f630), line 3356 in "nsComponentManager.cpp"
NS_InitXPCOM3_P(result = 0x0fffffffffffe860, binDirectory = 0x00000000a8f0d220, appFileLocationProvider = 0x09001000a8eff5e0, staticComponents = (nil), componentCount = 150994944), line 700 in "nsXPComInit.cpp"
ScopedXPCOMStartup::Initialize()(this = 0x0000000110080020), line 1054 in "nsAppRunner.cpp"
XRE_main(argc = -1159983106, argv = 0xbadc0ffee0ddf00d, aAppData = 0xbadc0ffee0ddf00d), line 3176 in "nsAppRunner.cpp"
main(argc = 0, argv = (nil)), line 156 in "nsBrowserApp.cpp"




Reproducible: Always

Steps to Reproduce:
1. Build Mozilla Firefox Trunk build on AIX with Gnome RPMS of 64 bit
2. 
3.
Actual Results:  
Run the Trunk build and dumps core at startup.

Expected Results:  
Build fails with dumping core.
OS: Other → AIX
Attached patch Patch V 1 (obsolete) — Splinter Review
After analyzing the coredump, it is found that it is crashing due to empty string (Invalid character pointer) at the function 

JS_Assert(s = "", file = warning: Unable to access address 0x10000001 from core
(invalid char ptr (0x0000000010000001)), ln = 268435455), line 69 in "jsutil.cpp"

Further investigation shows that this empty string is the first element of the array 'js_common_atom_names' and it is being passed to js_Atomize function in the file jsatom.cpp

I have created a patch to replace the ""(empty string) with 'js_null_str'. This time it does not crash but it goes into infinite loop inside the function 'js_AtomicSetMask'(invoked by JSFLATSTR_SET_ATOMIZED) in jslock.cpp.
Attachment #386457 - Flags: review?(mrbkap)
Can you reproduce this in Firefox 3.5?
Keywords: crash
Version: unspecified → 3.0 Branch
Comment on attachment 386457 [details] [diff] [review]
Patch V 1

This isn't the right fix. I'm not entirely sure what it's fixing, either. Is this another AIX compiler bug?
Attachment #386457 - Flags: review?(mrbkap) → review-
Kurt Schultz,

Yes. This is reproducible with Firefox 3.5 and it is crashing at the same place.
PLZ post your .mozconfig and the output of 
"lslpp -l vacpp.\*"
 and 
"rpm -qa | sort"
Also your patches to mozilla/configure (which you will need).

I have found that *only* VisualAge C++ 6 produces stable code and supports the needed C99 language level.
With XLC/C++ 7 I get random crashes too, even on MOZILLA_1_8_BRANCH which is else rock stable on AIX 5.1 / 32-bit.
lslpp -l | grep -i vacpp
  vacpp.Bnd                  9.0.0.0  COMMITTED  IBM XL C/C++ Media Defined
  vacpp.cmp.aix52.lib        9.0.0.9  COMMITTED  IBM XL C/C++ Libraries for AIX
  vacpp.cmp.aix52.tools      9.0.0.9  COMMITTED  IBM XL C/C++ Tools for AIX 5.2
  vacpp.cmp.core             9.0.0.9  COMMITTED  IBM XL C/C++ Compiler
  vacpp.cmp.include          9.0.0.9  COMMITTED  IBM XL C/C++ Compiler Include
  vacpp.cmp.lib              9.0.0.9  COMMITTED  IBM XL C/C++ Libraries
  vacpp.cmp.rte              9.0.0.9  COMMITTED  IBM XL C/C++ Compiler
  vacpp.cmp.tools            9.0.0.9  COMMITTED  IBM XL C/C++ Tools
  vacpp.html.common          9.0.0.9  COMMITTED  IBM XL C/C++ Documentation
  vacpp.html.en_US           9.0.0.9  COMMITTED  IBM XL C/C++ Documentation
  vacpp.lic                  9.0.0.0  COMMITTED  IBM XL C/C++ Licence Files
  vacpp.licAgreement         9.0.0.0  COMMITTED  IBM XL C++ Electronic License
  vacpp.man.en_US            9.0.0.9  COMMITTED  IBM XL C/C++ Compiler Man
  vacpp.memdbg.aix52.lib     9.0.0.9  COMMITTED  IBM XL C/C++ User Heap/Memory
  vacpp.memdbg.aix52.rte     9.0.0.9  COMMITTED  IBM XL C/C++ User Heap/Memory
  vacpp.memdbg.lib           9.0.0.9  COMMITTED  IBM XL C/C++ User Heap and
  vacpp.memdbg.rte           9.0.0.9  COMMITTED  IBM XL C/C++ User Heap and
  vacpp.msg.en_US.cmp.core   9.0.0.9  COMMITTED  IBM XL C/C++ Compiler
  vacpp.msg.en_US.cmp.tools  9.0.0.9  COMMITTED  IBM XL C/C++ Tools
  vacpp.ndi                  9.0.0.9  COMMITTED  IBM XL C/C++ Non-Default
  vacpp.pdf.en_US            9.0.0.9  COMMITTED  IBM XL C/C++ Documentation
  vacpp.samples.ansicl       9.0.0.9  COMMITTED  IBM XL C/C++ Compiler ANSI
  vacpp.cmp.core             9.0.0.0  COMMITTED  IBM XL C/C++ Compiler


I did not have any patches to mozilla/configure.

I tried with XLC version 10 and I see the same result.
I can confirm this bug for Firefox 3.5.4, same behaviour of debug build.

After building in a real clean objdir using IBM XLC/C++ 7.0 with Jun2009 fixpack:

Building deps for /home/ulink/Src/mozilla-1.9.1/js/src/jsatom.cpp
xlC_r -qlanglvl=extended -qeh -o jsatom.o -c -DOSTYPE=\"AIX5.1\" -DOSARCH=AIX -DEXPORT_JS_API -DJS_USE_SAFE_ARENA -I/home/ulink/Src/mozilla-1.9.1/js/src -I. -I./../../dist/include -I./../../dist/include/js -I/home/ulink/Src/AIX-51/fx35-dbg/dist/include/nspr -I/sdk/include -I/home/ulink/Src/mozilla-1.9.1/js/src -qflag=w:w -DDEBUG -D_DEBUG -DDEBUG_ulink -DTRACING -g -DHAVE_64BIT_OS=1 -DAIX=1 -DHAVE_SYS_INTTYPES_H=1 -DNSCAP_DISABLE_DEBUG_PTR_TYPES=1 -DD_INO=d_ino -DSTDC_HEADERS=1 -DHAVE_ST_BLKSIZE=1 -DHAVE_SIGINFO_T=1 -DJS_INT8_TYPE=char -DJS_INT16_TYPE=short -DJS_INT32_TYPE=int -DJS_INT64_TYPE=long -DJS_INTPTR_TYPE=long -DJS_BYTES_PER_WORD=8 -DJS_BITS_PER_WORD_LOG2=6 -DJS_ALIGN_OF_POINTER=8 -DJS_BYTES_PER_DOUBLE=8 -DHAVE_INT16_T=1 -DHAVE_INT32_T=1 -DHAVE_INT64_T=1 -DHAVE_INT64=1 -DHAVE_UINT=1 -DHAVE_UINT_T=1 -DHAVE_UINT16_T=1 -DHAVE_DIRENT_H=1 -DHAVE_MEMORY_H=1 -DHAVE_UNISTD_H=1 -DHAVE_NL_TYPES_H=1 -DHAVE_MALLOC_H=1 -DHAVE_X11_XKBLIB_H=1 -DHAVE_SYS_STATVFS_H=1 -DHAVE_SYS_STATFS_H=1 -DHAVE_SYS_VFS_H=1 -DNEW_H=\<new\> -DHAVE_LIBC_R=1 -DHAVE_LIBM=1 -DHAVE_LIBDL=1 -DHAVE_LIBC_R=1 -DHAVE_ARM_SIMD=1 -D_REENTRANT=1 -DHAVE_RANDOM=1 -DHAVE_STRERROR=1 -DHAVE_LCHOWN=1 -DHAVE_FCHMOD=1 -DHAVE_SNPRINTF=1 -DHAVE_STATVFS=1 -DHAVE_MEMMOVE=1 -DHAVE_RINT=1 -DHAVE_STAT64=1 -DHAVE_LSTAT64=1 -DHAVE_TRUNCATE64=1 -DHAVE_FLOCKFILE=1 -DHAVE_GETPAGESIZE=1 -DHAVE_LOCALTIME_R=1 -DHAVE_STRTOK_R=1 -DHAVE_WCRTOMB=1 -DHAVE_MBRTOWC=1 -DHAVE_RES_NINIT=1 -DHAVE_ICONV=1 -DHAVE_ICONV_WITH_CONST_INPUT=1 -DHAVE_CPP_EXPLICIT=1 -DHAVE_CPP_TYPENAME=1 -DHAVE_CPP_MODERN_SPECIALIZE_TEMPLATE_SYNTAX=1 -DHAVE_CPP_PARTIAL_SPECIALIZATION=1 -DHAVE_CPP_AMBIGUITY_RESOLVING_USING=1 -DHAVE_CPP_NAMESPACE_STD=1 -DHAVE_CPP_UNAMBIGUOUS_STD_NOTEQUAL=1 -DHAVE_CPP_NEW_CASTS=1 -DHAVE_CPP_DYNAMIC_CAST_TO_VOID_PTR=1 -DNEED_CPP_UNUSED_IMPLEMENTATIONS=1 -DHAVE_I18N_LC_MESSAGES=1 -DCPP_THROW_NEW=throw\(\) -DMOZ_DLL_SUFFIX=\".so\" -DXP_UNIX=1 -DUNIX_ASYNC_DNS=1 -DJS_THREADSAFE=1 -DMOZ_REFLOW_PERF=1 -DMOZ_REFLOW_PERF_DSP=1 -D_MOZILLA_CONFIG_H_ -DMOZILLA_CLIENT /home/ulink/Src/mozilla-1.9.1/js/src/jsatom.cpp
"/home/ulink/Src/mozilla-1.9.1/js/src/jsatom.cpp", line 66.1: 1540-0231 (S) The array bound must be specified and must be a positive integral constant expression.
gmake[3]: *** [jsatom.o] Error 1
gmake[3]: Leaving directory `/home/ulink/Src/AIX-51/fx35-dbg/js/src'
gmake[2]: *** [libs_tier_js] Error 2
gmake[2]: Leaving directory `/home/ulink/Src/AIX-51/fx35-dbg'
gmake[1]: *** [tier_js] Error 2
gmake[1]: Leaving directory `/home/ulink/Src/AIX-51/fx35-dbg'
gmake: *** [default] Error 2

which is:
JS_STATIC_ASSERT(sizeof(JSAtom *) == JS_BYTES_PER_WORD);

The comment in js/src/jsutil.h starting line 81.

Don't blame other compilers: 
It's a GCC bug accepting this code, as a sizeof() isn't const int without const cast. Even if sizeof() should ever return the same result.
Bug 429105 is related, same fix for IBM __IBMCPP__ instead
Attached patch Revised patch (obsolete) — Splinter Review
This patch resolves the coredump/hang issue on AIX
Attachment #386457 - Attachment is obsolete: true
Attachment #418489 - Flags: review?(lw)
I believe someone else more qualified with locking on AIX should review this patch.
Luke :

 Can you please suggest who i can set the reviewer as ?

Thanks,
Shailen
Attachment #418489 - Flags: review?(lw) → review?(brendan)
Attachment #418489 - Flags: review?(brendan) → review?(jim)
This should be fixed by Bug 516667.
I don't need https://bugzilla.mozilla.org/attachment.cgi?id=418489 for SeaMonkey 2.0.x or Thunderbird 3.0.x nor Firefox 3.5.x on AIX 5L.
Shailen: if you disagree, I will reopen this bug.
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
Uli:

  I disagree as the fix for Bug 516667 did not fix this bug.

Regards,
Shailen
Shailen: I have built 1.9.0 branch and 1.9.1 branch working stable without this patch as 32bit applications on AIX.
Which version should I try to build to reproduce the error as 64bit?
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---
Assignee: nobody → shailen.n.jain
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
ULi :

  This patch is required for 64 bit build. The version is Mozilla Firefox 3.5
Severity: critical → normal
Hardware: Other → PowerPC
Version: 3.0 Branch → unspecified
I could reproduce compiling as 64bit, compiling the same as 32bit works flawlessly.

bash-3.00$ ./firefox
*** Registering components in: Apprunner
*** Registering components in: application
*** Registering components in: nsFindComponent
*** Registering components in: nsChromeModule
*** Registering components in: nsAutoConfigModule
*** Registering components in: nsXPIntlModule
*** Registering components in: CommandLineModule
*** Registering components in: nsI18nModule
*** Registering components in: nsBrowserCompsModule
*** Registering components in: nsAuthModule
*** Registering components in: nsChardetModule
*** Registering components in: nsJarModule
*** Registering components in: nsCookieModule
*** Registering components in: JavaScript_Debugger
*** Registering components in: BrowserDirProvider
*** Registering components in: nsIconDecoderModule
*** Registering components in: nsSecurityManagerModule
*** Registering components in: nsParserModule
*** Registering components in: nsGfxModule
*** Registering components in: nsImageLib2Module
*** Registering components in: nsPluginModule
*** Registering components in: embedcomponents
*** Registering components in: nsComposerModule
*** Registering components in: nsLayoutModule
*** Registering components in: docshell_provider
*** Registering components in: nsFileViewModule
*** Registering components in: necko
*** Registering components in: nsPermissionsModule
*** Registering components in: nsPrefModule
*** Registering components in: nsRDFModule
*** Registering components in: mozStorageModule
*** Registering components in: nsTransactionManagerModule
*** Registering components in: nsUConvModule
*** Registering components in: nsUCvMathModule
*** Registering components in: nsWidgetGtk2Module
*** Registering components in: xpconnect
*** Registering components in: ZipWriterModule
*** Registering components in: PKI
*** Registering components in: NSS
*** Registering components in: BOOT
*** Registering components in: Browser_Embedding_Module
*** Registering components in: appshell
*** Registering components in: nsUniversalCharDetModule
*** Registering components in: nsCJVMManagerModule
*** Registering components in: nsWindowDataSourceModule
*** Registering components in: RemoteServiceModule
*** Registering components in: nsPlacesModule
*** Registering components in: tkAutoCompleteModule
*** Registering components in: satchel
*** Registering components in: nsToolkitCompsModule
*** Registering components in: mozSpellCheckerModule
*** Registering components in: nsUnixProxyModule
*** Registering components in: nsSystemPrefModule
*** Registering components in: nsSoftwareUpdate
Assertion failure: JSSTRING_IS_ATOMIZED(key), at /home/ulink/src/comm-1.9.1/mozilla/js/src/jsatom.cpp:743
./run-mozilla.sh[13]: 589954 IOT/Abort trap(coredump)
bash-3.00$
Version: unspecified → 3.5 Branch
Assignee: shailen.n.jain → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Version: 3.5 Branch → 1.9.1 Branch
Assignee: general → shailen.n.jain
export OBJECT_MODE=64
export CC="xlc_r -q64"
export CXX="xlC_r -q64"
export AUTOCONF=/opt/freeware/bin/autoconf-2.13
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-fx35-64
ac_add_options --disable-dbus
ac_add_options --disable-accessibility
ac_add_options --with-system-zlib
ac_add_options --enable-system-cairo
#ac_add_options --enable-optimize="-O2 -qmaxmem=-1 -qalias=noansi"
#ac_add_options --disable-debug
ac_add_options --disable-optimize
ac_add_options --enable-debug
ac_add_options --disable-tests

IBM XLC/C++ 8.0.0.10 (Fixpack Aug 2007) / AIX 5.1ML9
mozilla-1.9.1 pulled from hg today.

A 64bit build of Firefox 3.0.19 from CVS works both optimized and debug.
Attachment #418489 - Flags: review?(christophe.ravel.bugs)
Attachment #418489 - Attachment is obsolete: true
Attachment #418489 - Flags: review?(jim)
Attachment #418489 - Flags: review?(christophe.ravel.bugs)
Attachment #432527 - Flags: review?(jim)
Attachment #432527 - Flags: review?(christophe.ravel.bugs)
I noticed that the IBM documentation for compare_and_swaplp says:
---
Note:
If the compare_and_swap or the compare_and_swaplp subroutine is used as a locking primitive, insert an isync at the start of any critical sections.
---
Is this something to be concerned about?
Don't have a multi CPU machine for stress testing, on single CPU a build from FIREFOX_3_5_9_RELEASE tag works fine, no other patches under js/ dir :-)
Uli: why did you cc me on this bug?  I don't see any comment that mentions
NSPR or NSS.  What would you like me to look at?
Comment on attachment 432527 [details] [diff] [review]
recreated Shailen's patch, context adjusted, for trunk

The AIX 6.1 documentation suggests this patch is wrong. If it is indeed the right thing to do, we need a comment explaining in detail why it works.

The documentation for _check_lock says: "The _check_lock subroutine performs an atomic (uninterruptible) sequence of operations. The compare_and_swap subroutine is similar, but does not issue synchronization instructions and therefore is inappropriate for updating lock words."

SM's NativeCompareAndSwap is indeed used for updating lock words.

The documentation for compare_and_swaplp says: "If the compare_and_swap or the compare_and_swaplp subroutine is used as a locking primitive, insert an isync at the start of any critical sections."

This patch does not add an isync instruction anywhere, so there's nothing preventing the compiler and processor from moving loads to before the compare-and-swap instruction, and violating the semantics required of a lock.

If you can provide a comment in the patch explaining why _check_lock doesn't work here, then we can move forward.
Attachment #432527 - Flags: review?(jim) → review-
(In reply to comment #23)
> Uli: why did you cc me on this bug?  I don't see any comment that mentions
> NSPR or NSS.  What would you like me to look at?

wtc: I cc'ed you because we need locking expertise here. Similiar to what is done in NSPR.

 (In reply to comment #24)
> (From update of attachment 432527 [details] [diff] [review])
> The AIX 6.1 documentation suggests this patch is wrong. If it is indeed the
> right thing to do, we need a comment explaining in detail why it works.
...
> If you can provide a comment in the patch explaining why _check_lock doesn't
> work here, then we can move forward.

Just my 2ct educated guess, as this is slightly beyond my scope:

_check_lock needs an atomic_p as first parameter.

For 32bit AIX: jsword == long == int32_t which fits perfectly atomic_p or atomic_l, the _check_lock works.

For 64bit AIX jsword == long == int64_t which doesn't fit atomic_p
100% reproducable as per comment c# 16

For the mozilla 1.9.0 C language version of jslock.c the _check_lock works for both 32- and 64bit variants of AIX.
Attachment #432527 - Flags: review?(christophe.ravel.bugs)
/* @(#)85       1.7.1.6  src/bos/kernel/sys/atomic_op.h, sysproc, bos510 9/8/00
14:10:02 */
/*
 *   COMPONENT_NAME: SYSPROC
 *
 *   FUNCTIONS: none
 *
 *   ORIGINS: 27, 83
 *
 *
 *   (C) COPYRIGHT International Business Machines Corp. 1993
 *   All Rights Reserved
 *   Licensed Materials - Property of IBM
 *   US Government Users Restricted Rights - Use, duplication or
 *   disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
 */

/*
 * LEVEL 1,  5 Years Bull Confidential Information
 */

#ifndef _H_ATOMIC_OP
#define _H_ATOMIC_OP

#include <sys/types.h>

typedef int     *atomic_p;
typedef ushort  *atomic_h;
typedef long    *atomic_l;

#ifdef _NO_PROTO

int  fetch_and_add();
uint fetch_and_and();
uint fetch_and_or();
boolean_t compare_and_swap();
long  fetch_and_addlp();
ulong fetch_and_andlp();
ulong fetch_and_orlp();
boolean_t compare_and_swaplp();
ushort fetch_and_add_h();

#ifdef _KERNEL
boolean_t test_and_set();
int fetch_and_nop();
#endif

#else /* _NO_PROTO */

#ifdef _KERNSYS
#if defined(__64BIT_KERNEL) || defined(__FULL_PROTO)
boolean_t test_and_setlp(atomic_l, unsigned long);
boolean_t test_and_setsp(atomic_h, short);
#endif
#endif

int  fetch_and_add(atomic_p,int);
uint fetch_and_and(atomic_p,uint);
uint fetch_and_or(atomic_p,uint);
boolean_t compare_and_swap(atomic_p,int *,int);
long  fetch_and_addlp(atomic_l,long);
ulong fetch_and_andlp(atomic_l,ulong);
ulong fetch_and_orlp(atomic_l,ulong);
boolean_t compare_and_swaplp(atomic_l, long *, long);
ushort fetch_and_add_h(atomic_h,int);

#ifdef _KERNEL
boolean_t test_and_set(atomic_p, int);
int fetch_and_nop(atomic_p);
void * get_from_list(void *,long);
void * get_all_from_list(void *,long);
void   put_onto_list(void *,void *,long);
void   put_chain_onto_list(void *,void *,void *,long);
#endif

#endif /* _NO_PROTO */

/* Atomic Locking primitives--User mode routines */

#ifndef _KERNEL
#ifdef _NO_PROTO

boolean_t _check_lock();
void _clear_lock();
void _clear_lock_mem();
int _safe_fetch();

#else /* _NO_PROTO */

boolean_t _check_lock(atomic_p, int, int);
void _clear_lock(atomic_p, int);
void _clear_lock_mem(atomic_p, int);
int _safe_fetch(atomic_p);

#endif /* _NO_PROTO */

#ifndef __ia64
/* Atomic Locking primitive pragmas */

#pragma mc_func _clear_lock { \
 "48003403" /* bla    ._clear_lock */ \
}
#pragma mc_func _clear_lock_mem { \
 "48003413" /* bla    ._clear_lock_mem */ \
}
#pragma mc_func _check_lock { \
 "48003423" /* bla    ._check_lock */ \
}
#pragma mc_func _safe_fetch {          \
 "80630000" /* l      r3,0(r3)      */ \
 "5463003F" /* rlinm. r3, r3, 0, -1 */ \
 "41820004" /* beq    $+4           */ \
}
#pragma reg_killed_by _safe_fetch cr0,gr3
#endif /* ! __ia64 */

#endif /* !_KERNEL */

#endif /* _H_ATOMIC_OP */
The 64bit kernel does NOT support the _check_lock / _clear_lock kernel services,
as the 64bit kernel is always a multiprocessor kernel.
Only the 32bit kernels were delivered in separate multi or single CPU versions.
Starting with AIX 5.3 the 32bit single processor kernel was dropped.

http://www.ibm.com/developerworks/aix/library/au-32to64.html

When looking at the Solaris/Sparc/Sun Compiler implementation the compare_and_swap seems sufficiant.

Shailen: anything to add???
Comment on attachment 432527 [details] [diff] [review]
recreated Shailen's patch, context adjusted, for trunk

Jim: I think Uli's comment 25 is right on.  Using the AIX header
Uli provided in comment 26, we see that the (atomic_p) cast in
the _check_lock call below is definitely wrong for 64-bit builds:

> static JS_ALWAYS_INLINE int
> NativeCompareAndSwap(jsword *w, jsword ov, jsword nv)
> {
>-    return !_check_lock((atomic_p)w, ov, nv);
>+    return compare_and_swaplp((atomic_l)w, &ov, nv);
> }

jsword is intptr_t, which is 64-bit in 64-bit builds.
atomic_p is a pointer to int, which is 32-bit in 64-bit builds.

The prototype of compare_and_swaplp happens to match NativeCompareAndSwap
in both 32-bit and 64-bit builds on AIX.

If NativeCompareAndSwap is not documented to execute a memory
barrier, this fix looks good.
Assignee: shailen.n.jain → ul.mcamafia
Attachment #432527 - Attachment is obsolete: true
Attachment #432528 - Attachment is obsolete: true
Attachment #432529 - Attachment is obsolete: true
Summary: Mozilla Firefox trunk build is crashing on AIX @js_AtomizeString → Mozilla Firefox 64bit is crashing on AIX @js_AtomizeString
Attachment #433298 - Attachment is obsolete: true
Attachment #433299 - Flags: review?
Attachment #433299 - Flags: review? → review?(jim)
Attachment #433299 - Attachment is obsolete: true
Attachment #433299 - Flags: review?(jim)
Attached patch revised patch (obsolete) — Splinter Review
the new patch is tested on AIX 5.3 and AIX 6.1 multi processor machines 64bit and does not change the code for the known working 32bit build.
as per comment# 28
Attachment #437631 - Flags: review?(jim)
Uli, why did you abandon the __check_lockd_mp patch in comment 30?
Is __check_lockd_mp an IBM compiler intrinsic function, and therefore
not available to GCC?
(In reply to comment #32)
> Uli, why did you abandon the __check_lockd_mp patch in comment 30?
> Is __check_lockd_mp an IBM compiler intrinsic function, and therefore
> not available to GCC?

Yes. And the patch produced a crashing 64bit build with IBM XLC++ 7.
(In reply to comment #28)
> If NativeCompareAndSwap is not documented to execute a memory
> barrier, this fix looks good.

I'm pretty sure it is supposed to be a memory barrier.  Consider the definition of js_Lock, which calls ThinLock, which on the fast path calls NativeCompareAndSwap, and nothing else.
The question we need to answer here is: What function is available to GCC on AIX to do a compare-and-swap with a memory barrier on a pointer-sized value (32 bits or 64 bits)?

Is there such an API?  I wasn't able to find one.
Jim: we can use GCC's built-in functions for atomic memory access:

  __sync_bool_compare_and_swap or __sync_val_compare_and_swap
  __sync_synchronize

These built-in functions were added in GCC 4.1:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html#Atomic-Builtins
(In reply to comment #35)
> The question we need to answer here is: What function is available to GCC on
> AIX to do a compare-and-swap with a memory barrier on a pointer-sized value (32
> bits or 64 bits)?
> 
> Is there such an API?  I wasn't able to find one.

The configure script checks for the IBM compiler on AIX. 
only NSPR is ported to GCC on AIX, but Core is not possible to compile with GCC on AIX today as there are no xpcom/reflect/xptcall stubs for AIX with GCC.
So I can add a comment in the patch pointing to comment# 36 if it ever comes to porting AIX/GCC.
Uli: why don't you just add an "asm" statement for the "isync"
instruction after the compare_and_swaplp call?  If the IBM
compiler doesn't support "asm" statements, use the code in Appendix A
at http://www.ibm.com/developerworks/systems/articles/powerpc.html
Uli, could you give this patch a try?
Attachment #437907 - Flags: review?(ul.mcamafia)
Comment on attachment 437907 [details] [diff] [review]
Bug 501881: Use correct compare-and-swap primitives on AIX.

Jim: please try JS_STATIC_ASSERT instead of JS_ASSERT in your patch.
It should work.
Attachment #437914 - Flags: review?(ul.mcamafia)
Attachment #437907 - Attachment is obsolete: true
Attachment #437907 - Flags: review?(ul.mcamafia)
Attachment #437914 - Flags: review?(wtc)
To be clear --- I can't test this patch, as I have no access to AIX.  I need Uli to try it out.
First run 32bit:

=1 -DHAVE_ICONV_WITH_CONST_INPUT=1 -DHAVE_CPP_EXPLICIT=1 -DHAVE_CPP_TYPENAME=1 -DHAVE_CPP_MODERN_SPECIALIZE_TEMPLATE_SYNTAX=1 -DHAVE_CPP_PARTIAL_SPECIALIZATION=1 -DHAVE_CPP_AMBIGUITY_RESOLVING_USING=1 -DHAVE_CPP_NAMESPACE_STD=1 -DHAVE_CPP_UNAMBIGUOUS_STD_NOTEQUAL=1 -DHAVE_CPP_NEW_CASTS=1 -DNEED_CPP_UNUSED_IMPLEMENTATIONS=1 -DHAVE_I18N_LC_MESSAGES=1 -DCPP_THROW_NEW=throw\(\) -DMOZ_DLL_SUFFIX=\".so\" -DXP_UNIX=1 -DUNIX_ASYNC_DNS=1 -DJS_THREADSAFE=1  -D_MOZILLA_CONFIG_H_ -DMOZILLA_CLIENT /home/ulink/src/comm-1.9.1/mozilla/js/src/jslock.cpp
"/home/ulink/src/comm-1.9.1/mozilla/js/src/jslock.cpp", line 177.12: 1540-0274 (S) The name lookup for "__compare_and_swap" did not find a declaration.
gmake[3]: *** [jslock.o] Error 1

Second run 32bit + including <builtins.h> header

E_STATVFS=1 -DHAVE_MEMMOVE=1 -DHAVE_RINT=1 -DHAVE_STAT64=1 -DHAVE_LSTAT64=1 -DHAVE_TRUNCATE64=1 -DHAVE_FLOCKFILE=1 -DHAVE_GETPAGESIZE=1 -DHAVE_LOCALTIME_R=1 -DHAVE_STRTOK_R=1 -DHAVE_WCRTOMB=1 -DHAVE_MBRTOWC=1 -DHAVE_RES_NINIT=1 -DHAVE_ICONV=1 -DHAVE_ICONV_WITH_CONST_INPUT=1 -DHAVE_CPP_EXPLICIT=1 -DHAVE_CPP_TYPENAME=1 -DHAVE_CPP_MODERN_SPECIALIZE_TEMPLATE_SYNTAX=1 -DHAVE_CPP_PARTIAL_SPECIALIZATION=1 -DHAVE_CPP_AMBIGUITY_RESOLVING_USING=1 -DHAVE_CPP_NAMESPACE_STD=1 -DHAVE_CPP_UNAMBIGUOUS_STD_NOTEQUAL=1 -DHAVE_CPP_NEW_CASTS=1 -DNEED_CPP_UNUSED_IMPLEMENTATIONS=1 -DHAVE_I18N_LC_MESSAGES=1 -DCPP_THROW_NEW=throw\(\) -DMOZ_DLL_SUFFIX=\".so\" -DXP_UNIX=1 -DUNIX_ASYNC_DNS=1 -DJS_THREADSAFE=1  -D_MOZILLA_CONFIG_H_ -DMOZILLA_CLIENT /home/ulink/src/comm-1.9.1/mozilla/js/src/jslock.cpp
"/home/ulink/src/comm-1.9.1/mozilla/js/src/jslock.cpp", line 178.30: 1540-2878 (S) The built-in function "__compare_and_swap" is not valid for this architecture.

The 64bit build + including <builtins.h> at least compiled jslock.cpp so far.
It will last about three hours until I can test the resulting build. So will report tomorrow.

AIX 5.1ML9 with IBM XLC++ 8.0.0.16
Comment on attachment 437914 [details] [diff] [review]
Bug 501881: Use correct compare-and-swap primitives on AIX.

r=wtc.

I would put the GCC code in a comment because we can't test
it.  (Firefox can't be built with GCC on AIX yet.)

Does NativeCompareAndSwap not need a memory barrier?
_compare_and_swaplp doesn't include a memory barrier.
Attachment #437914 - Flags: review?(wtc) → review+
Comment on attachment 437914 [details] [diff] [review]
Bug 501881: Use correct compare-and-swap primitives on AIX.

as per comment# 43 the 32bit/__IBMCPP__ code path does not compile.

the needed #include <builtins.h> must be #ifdef'ed because the header wouldn't be available in GCC.

The 64bit code path was working so far.

I'm testing a new version of a patch adding __asm__ ("isync");
after the compare_and_swaplp call.
So we don't need a separate GCC code path, using the O/S routines instead of compiler builtins.
Attachment #437914 - Flags: review?(ul.mcamafia) → review-
Attached patch compare_and_swap for AIX (obsolete) — Splinter Review
this version of the patch eliminates the different code paths for IBM Compiler vs. GCC and 32bit/64bit.

The O/S routine compare_and_swaplp() is available to every compiler,
we don't need additional casts, as for 32bit compilation sizeof(int32_t) == sizeof(long)
the single inline assembly completes the memory barrier.

This patch is branch 1.9.1 as this is what could be tested on AIX today.
I will provide context adjusted patches for cleanly pushing.

THX to wtc's comment# 38 I finally understood, why in my testing environment everything worked fine even without the "isync" statement.
Attachment #438112 - Attachment is obsolete: true
See comment# 46 but attached the right patch file ;)
Attachment #437631 - Attachment is obsolete: true
Attachment #437631 - Flags: review?(jim)
Attachment #438115 - Flags: review?(jim)
Attachment #438115 - Flags: review?(wtc)
Comment on attachment 438115 [details] [diff] [review]
simplified compare_and_swap patch for AIX

How can that "isync" make a difference?  It's after the 'return'.
Attachment #438115 - Flags: review?(jim) → review-
Attachment #438115 - Flags: review?(wtc) → review-
Comment on attachment 438115 [details] [diff] [review]
simplified compare_and_swap patch for AIX

Uli, I suggest adding JS_STATIC_ASSERT to verify the types are the same size.

  JS_STATIC_ASSERT(sizeof(jsword) == sizeof(atomic_l));
  JS_STATIC_ASSERT(sizeof(jsword) == sizeof(long));
The isync won't be needed on older PowerPC CPU, and mine is a POWER3 which is old enough to not need it.
Now the isync statement will be reached *before* the return.
With sizeof assertions.
Attachment #437914 - Attachment is obsolete: true
Attachment #438115 - Attachment is obsolete: true
Comment on attachment 438220 [details] [diff] [review]
fixed to comment# 48 and comment# 49

The __asm__ is supported from IBM XLC++ 7 onwards.
The oldest compiler version still supported by IBM is XLC++ 8.
Even on VisualAge 6 with AIX 4.3.3 the source compiles, and produces a working libmozjs.so, but without the "isync" as the v6 compiler doesn't support inline asm for C++
1.9.2 and trunk will need XLC++ 9 or newer.
Attachment #438220 - Flags: review?(jim)
Attachment #438220 - Flags: review?(wtc)
Comment on attachment 438220 [details] [diff] [review]
fixed to comment# 48 and comment# 49

r=wtc.

I believe you only need to isync when compare and swap succeeded.
People usually call compare and swap in a loop until it succeeds.
Doing an isync after each failed attempt could be expensive.

>+    res=compare_and_swaplp((atomic_l)w, &ov, nv);
>+    __asm__ ("isync");

Formatting nits:

Add a space before and after the equal sign '='.

Remove the space between __asm__ and the opening parenthesis '('.
Attachment #438220 - Flags: review?(wtc) → review+
Attachment #438225 - Attachment is obsolete: true
Attachment #438220 - Attachment is obsolete: true
Attachment #438220 - Flags: review?(jim)
Attachment #438228 - Flags: review?(jim)
Comment on attachment 438228 [details] [diff] [review]
fixed to comment# 48 and comment# 49

I'm pretty sure that first assert should be:

JS_STATIC_ASSERT(sizeof(jsword) == sizeof(*atomic_l));

because we're trying to make sure that the thing being swapped is the right size. That is, what we're thinking is:

JS_STATIC_ASSERT(sizeof(*w) == sizeof(*atomic_l));

Isn't that right, WTC?
Attachment #438228 - Flags: review?(jim) → review-
From "/usr/include/sys/atomic_op.h"

typedef int         *atomic_p;
typedef long        *atomic_l;

So if I understand comment# 55 correctly the assertion should be:
JS_STATIC_ASSERT(sizeof(*jsword) == sizeof(atomic_l))
else we compare a long with a **long

or better the second assert alone is sufficiant.
wtc: ???
Comment on attachment 438228 [details] [diff] [review]
fixed to comment# 48 and comment# 49

Jim, you're right.  The first assertion should read:

    JS_STATIC_ASSERT(sizeof(jsword) == sizeof(*atomic_l));

I am not sure if *atomic_l is legal C code though.

Given the definition of atomic_l that Uli provided:

typedef long        *atomic_l;

it is equivalent to the second assertion:

    JS_STATIC_ASSERT(sizeof(jsword) == sizeof(long));

So the second assertion alone is sufficient.
(In reply to comment #57)
> Jim, you're right.  The first assertion should read:
> 
>     JS_STATIC_ASSERT(sizeof(jsword) == sizeof(*atomic_l));
> 
> I am not sure if *atomic_l is legal C code though.

after getting trapped by ugly C in my false comment# 56 I looked up in K&R:
it is legal C and the asteriks here is the derefencing operator (and not a double pointer). C/C++ is not my daily business. Sorry.

> typedef long        *atomic_l;
> 
> it is equivalent to the second assertion:
> 
>     JS_STATIC_ASSERT(sizeof(jsword) == sizeof(long));
> 
> So the second assertion alone is sufficient.

If jimb agrees, I will rework (and test) the patch according to wtc's last comment.
(In reply to comment #58)
> after getting trapped by ugly C in my false comment# 56 I looked up in K&R:
> it is legal C and the asteriks here is the derefencing operator (and not a
> double pointer). C/C++ is not my daily business. Sorry.

Well, then you're doing very well --- this kind of stuff is way in the weeds.

> If jimb agrees, I will rework (and test) the patch according to wtc's last
> comment.

I agree.
Reworked to comment# 57 and comment# 59
context for trunk.
Attachment #438228 - Attachment is obsolete: true
Attachment #438916 - Flags: review?
Attachment #438916 - Flags: review?(wtc)
Attachment #438916 - Flags: review?(jim)
Attachment #438916 - Flags: review?
Comment on attachment 438916 [details] [diff] [review]
Implement NativeCompareAndSwap compatible for both 64 and 32bit on AIX

r=wtc.
Attachment #438916 - Flags: review?(wtc) → review+
Attachment #438916 - Flags: review?(jim) → review+
Keywords: checkin-needed
Pushed changeset 5ac3bdffab9a to mozilla-central.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Comment on attachment 438917 [details] [diff] [review]
Same patch for 1.9.2 branch, context adjusted only

No risk, as the code change is only within #idef (AIX) / #endif.
The patch is well tested on AIX.
Attachment #438917 - Flags: approval1.9.2.5?
Comment on attachment 438918 [details] [diff] [review]
Same patch for 1.9.1 branch, context adjusted only

No risc patch.
Well tested on AIX 4.3.3 through 5.3 with different versions of IBM's compiler.
Attachment #438918 - Flags: approval1.9.1.10?
Attachment #438918 - Flags: approval1.9.1.10? → approval1.9.1.11?
Comment on attachment 438917 [details] [diff] [review]
Same patch for 1.9.2 branch, context adjusted only

Approved for 1.9.2.6, a=dveditz for release-drivers
Attachment #438917 - Flags: approval1.9.2.5? → approval1.9.2.6+
Comment on attachment 438918 [details] [diff] [review]
Same patch for 1.9.1 branch, context adjusted only

Approved for 1.9.1.11, a=dveditz for release-drivers
Attachment #438918 - Flags: approval1.9.1.11? → approval1.9.1.11+
Keywords: checkin-needed
Whiteboard: checkin to mozilla-1.9.1 and mozilla-1.9.2
Whiteboard: checkin to mozilla-1.9.1 and mozilla-1.9.2 → [needs checkin to mozilla-1.9.1][needs checkin to mozilla-1.9.2]
Verified on AIX 5.1 building with IBM XLC/C++ 7
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: