strsclnt crashes with SIGSEGV on solaris 10, i386

RESOLVED DUPLICATE of bug 331164

Status

Product: NSS
Component: Libraries
Priority: P1
Severity: critical
Status: RESOLVED DUPLICATE of bug 331164
12 years ago
12 years ago

People

(Reporter: Alexei Volkov, Assigned: Alexei Volkov)

Tracking

Version: 3.11
Target Milestone: 3.11.1
Platform: x86
OS: Solaris
Keywords: crash

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

12 years ago
This looks like a problem with server cert validity. -o -o was not used, so the client should not continue with the rest of testing. Continuing anyway could create the conditions for a segmentation violation.

Also, I cannot reproduce the problem by running the same command on
mandela, where the original crash occurred.


Stress client test output:

strsclnt -q -p 8444 -d ../ext_client -B -s -w nss -c 10 -C :C004 -N \
          mandela.red.iplanet.com
strsclnt started at Wed Mar  8 03:18:09 PST 2006
strsclnt: -- SSL: Server Certificate Invalid, err -8187.
security library: invalid arguments.
selfserv: HDX PR_Read returned error -12271:
SSL peer cannot verify your certificate.
Segmentation Fault - core dumped
strsclnt completed at Wed Mar  8 03:18:10 PST 2006


Thread stack: 

(dbx) thread t@8
t@8 (l@8) stopped in __lwp_park at 0xd0ccd2a9
0xd0ccd2a9: __lwp_park+0x0019:  jae    __lwp_park+0x27 <0xd0ccd2b7>
(dbx) where     
current thread: t@8
=>[1] __lwp_park(0xd07c1400, 0xd07c14), at 0xd0ccd2a9
  [2] mutex_lock_queue(0xd07c1400, 0xd07c14, 0xd07c, 0xd0), at 0xd0cc6342
  [3] slow_lock(0xd07c1400, 0xb0d07c14, 0xbfb0d07c, 0xabfb0d0), at 0xd0cc6bde
  [4] mutex_lock_impl(0x80abfb0, 0x80abf, 0x80a, 0x8), at 0xd0cc6cd4
  [5] pthread_mutex_lock(0x80abfb0, 0x20080abf, 0xd720080a, 0x3bd72008), at 0xd0cc6de0
  [6] PR_Lock(0x80abfb0, 0xf0080abf, 0x20f0080a, 0xf320f008), at 0xd0e21509
  [7] nssSession_EnterMonitor(0x80a3f20, 0x88080a3f, 0xd888080a, 0x3bd88808), at 0xd0f1605e
  [8] find_objects_by_template(0x80b04d0, 0x20080b04, 0x3f20080b, 0xa3f2008), at 0xd0f167d1
  [9] nssToken_FindCertificateByIssuerAndSerialNumber(0x80b04d0, 0x20080b04, 0x3f20080b, 0xa3f2008), at 0xd0f17645
  [10] nssTrustDomain_FindCertificateByIssuerAndSerialNumber(0x80a3e20, 0x80080a3e, 0xd880080a, 0x3bd88008), at 0xd0f0efb5
  [11] NSSTrustDomain_FindCertificateByEncodedCertificate(0x80a3e20, 0xb8080a3e, 0xd8b8080a, 0x3bd8b808), at 0xd0f0f1b9
  [12] __CERT_NewTempCertificate(0x80a3e20, 0x14080a3e, 0xd914080a, 0x3bd91408), at 0xd0ef955a
  [13] ssl3_HandleCertificate(0x81387c8, 0x91081387, 0x3910813, 0x14039108), at 0xd0fa0512
  [14] ssl3_HandleHandshakeMessage(0x81387c8, 0xe081387, 0xfc0e0813, 0x13fc0e08), at 0xd0fa1dfd
  [15] ssl3_HandleHandshake(0x81387c8, 0x8081387, 0x8a080813, 0x138a0808), at 0xd0fa1fa2
  [16] ssl3_HandleRecord(0x81387c8, 0x74081387, 0xda740813, 0x3bda7408), at 0xd0fa2764
  [17] ssl3_GatherCompleteHandshake(0x81387c8, 0x81387, 0x813, 0x8), at 0xd0fa3c60
  [18] ssl_GatherRecord1stHandshake(0x81387c8, 0x94081387, 0x52940813, 0x6529408), at 0xd0fa512b
  [19] ssl_SecureWrite(0x81387c8, 0x94081387, 0x52940813, 0x6529408), at 0xd0facc1d
  [20] ssl_Write(0x8080218, 0x94080802, 0x52940808, 0x6529408), at 0xd0fb13fc
  [21] PR_Write(0x8080218, 0x94080802, 0x52940808, 0x6529408), at 0xd0e0a313
  [22] 0x8056333(0x8046dd0, 0x9808046d, 0xff980804, 0x7ff9808), at 0x8056332
  [23] 0x8054f3c(0x807df60, 0x807df, 0x807, 0x8), at 0x8054f3b
  [24] _pt_root(0x80bf170, 0x80bf1, 0x80b, 0x8), at 0xd0e281cc
  [25] _thr_setup(0xd07c1400, 0xd07c14, 0xd07c, 0xd0), at 0xd0cccf3f
dbx: core file read error: address 0xd03be000 not in data space

So, it was a strsclnt crash, not selfserv, on platform Solaris x86?
Was it 32-bit or 64-bit?
I'd guess that the thread whose stack is shown above is not the thread
that crashed.  Please examine all the thread stacks until you find one or more
that don't appear to be blocked on a lock (as this one does) or in some system
call.  Let us have that stack (or those stacks).  Thanks.
(Assignee)

Comment 2

12 years ago
Created attachment 214486 [details]
thread stack at SEGV
(Assignee)

Comment 3

12 years ago
It is the 32-bit version, and I posted the wrong thread stack in the bug report.

The crash is probably in thread 6:

current thread: t@6
=>[1] ssl3_HandleCertificate(0x80dcaa8, 0xeb080dca, 0x74eb080d), at 0xd0fa0579
  [2] ssl3_HandleHandshakeMessage(0x80dcaa8, 0xde080dca, 0x6ade080d), at 0xd0fa1dfd
  [3] ssl3_HandleHandshake(0x80dcaa8, 0xe8080dca, 0xcce8080d), at 0xd0fa1fa2
  [4] ssl3_HandleRecord(0x80dcaa8, 0x74080dca, 0xda74080d), at 0xd0fa2764
  [5] ssl3_GatherCompleteHandshake(0x80dcaa8, 0x80dca, 0x80d), at 0xd0fa3c60
  [6] ssl_GatherRecord1stHandshake(0x80dcaa8, 0x94080dca, 0x5294080d), at 0xd0fa512b
  [7] ssl_SecureWrite(0x80dcaa8, 0x94080dca, 0x5294080d), at 0xd0facc1d
  [8] ssl_Write(0x8080158, 0x94080801, 0x52940808), at 0xd0fb13fc
  [9] PR_Write(0x8080158, 0x94080801, 0x52940808), at 0xd0e0a313
  [10] 0x8056333(0x8046dd0, 0x9808046d, 0xff980804), at 0x8056332
  [11] 0x8054f3c(0x807df28, 0x807df, 0x807), at 0x8054f3b
  [12] _pt_root(0x80bf030, 0x80bf0, 0x80b), at 0xd0e281cc
  [13] _thr_setup(0xd07c0c00, 0xd07c0c, 0xd07c), at 0xd0cccf3f
(Assignee)

Comment 4

12 years ago
I've reproduced the problem a couple of times. It centers on the NSSCertificate mutex locks. Since the client was run with the -s option,
I suspect recent changes related to locks have something to do with this.

Here are traces from three independent crashes:

(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __lwp_wait()
      t@2  a  l@2   _pt_root()   LWP suspended in  __lwp_park()
      t@3  a  l@3   _pt_root()   LWP suspended in  mutex_unlock_queue()
      t@4  a  l@4   _pt_root()   sleep on 0x80abfb0  in  __lwp_park()
      t@5  a  l@5   _pt_root()   LWP suspended in  __lwp_unpark()
      t@6  a  l@6   _pt_root()   LWP suspended in  ssl3_HandleCertificate()
o     t@7  a  l@7   _pt_root()   signal SIGSEGV in  mutex_lock_impl()
      t@8  a  l@8   _pt_root()   LWP suspended in  __lwp_park()
      t@9  a  l@9   _pt_root()   sleep on 0x80abfb0  in  __lwp_park()
(dbx) thread t@7
t@7 (l@7) stopped in mutex_lock_impl at 0xd0cc6c08
0xd0cc6c08: mutex_lock_impl+0x0020:     movzbl   0x00000004(%edx),%esi
(dbx) where
current thread: t@7
=>[1] mutex_lock_impl(0x9, 0x0), at 0xd0cc6c08
  [2] __mutex_lock(0x9), at 0xd0cc6de0
  [3] PR_Lock(0x9), at 0xd0e21509
  [4] nssArena_Destroy(0x81c35e8, 0x80bf2b0, 0x0, 0xd0f320f0, 0x81cfad0, 0x81cfaa0), at 0xd0f1c45c
  [5] NSSCertificate_Destroy(0x81c37f0), at 0xd0f0b48c
  [6] CERT_DestroyCertificate(0x81d4f78), at 0xd0efa2b5
  [7] ssl3_HandleCertificate(0x80bf2b0, 0x80c7103, 0x0), at 0xd0fa081b
  [8] ssl3_HandleHandshakeMessage(0x80bf2b0, 0x80c66f6, 0xa0d), at 0xd0fa1dfd
  [9] ssl3_HandleHandshake(0x80bf2b0, 0x80bf4f0), at 0xd0fa1fa2
  [10] ssl3_HandleRecord(0x80bf2b0, 0xd04bda74, 0x80bf4f0), at 0xd0fa2764
  [11] ssl3_GatherCompleteHandshake(0x80bf2b0, 0x0), at 0xd0fa3c60
  [12] ssl_GatherRecord1stHandshake(0x80bf2b0), at 0xd0fa512b
  [13] ssl_SecureWrite(0x80bf2b0, 0x8065294, 0x15), at 0xd0facc1d
  [14] ssl_Write(0x80800d8, 0x8065294, 0x15, 0xd04bdf70, 0x8056333, 0x80800d8), at 0xd0fb13fc
  [15] PR_Write(0x80800d8, 0x8065294, 0x15), at 0xd0e0a313
  [16] do_connects(0x8046dd0, 0x807ff98, 0x5), at 0x8056333
  [17] thread_wrapper(0x807df44), at 0x8054f3c
  [18] _pt_root(0x80bf0d0), at 0xd0e281cc
  [19] _thr_setup(0xd07c1000), at 0xd0cccf3f
  [20] _lwp_start(), at 0xd0ccd230


(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __lwp_wait() 
      t@2  a  l@2   _pt_root()   LWP suspended in  __lwp_park() 
o>    t@3  a  l@3   _pt_root()   signal SIGSEGV in  __mutex_destroy() 
      t@4  a  l@4   _pt_root()   LWP suspended in  s_mpv_mul_d_add() 
      t@5  a  l@5   _pt_root()   LWP suspended in  __lwp_park() 
      t@6  a  l@6   _pt_root()   LWP suspended in  s_mp_almost_inverse() 
      t@7  a  l@7   _pt_root()   LWP suspended in  s_mp_div_2d() 
      t@8  a  l@8   _pt_root()   LWP suspended in  s_mp_mul_2d() 
      t@9  a  l@9   _pt_root()   LWP suspended in  __lwp_park() 
(dbx) where 
current thread: t@3
  [1] __mutex_destroy(0x0, 0x0, 0xd0f320f0, 0xd0ea4524, 0xd08d98b0, 0xd08d98b0), at 0xd0cb775e 
  [2] PR_DestroyLock(0x0), at 0xd0e214d9 
=>[3] NSSCertificate_Destroy(0x81e76b8), at 0xd0f0b482 
  [4] CERT_DestroyCertificate(0x81e7ea8), at 0xd0efa2b5 
  [5] mySSLAuthCertificate(0x80a3e20, 0x8080218, 0x1, 0x0), at 0x8054b51 
  [6] ssl3_HandleCertificate(0x8103ab8, 0x8119233, 0x0), at 0xd0f90581 
  [7] ssl3_HandleHandshakeMessage(0x8103ab8, 0x8118826, 0xa0d), at 0xd0f91dfd 
  [8] ssl3_HandleHandshake(0x8103ab8, 0x8103cf8), at 0xd0f91fa2 
  [9] ssl3_HandleRecord(0x8103ab8, 0xd08d9a74, 0x8103cf8), at 0xd0f92764 
  [10] ssl3_GatherCompleteHandshake(0x8103ab8, 0x0), at 0xd0f93c60 
  [11] ssl_GatherRecord1stHandshake(0x8103ab8), at 0xd0f9512b 
  [12] ssl_SecureWrite(0x8103ab8, 0x8065294, 0x15), at 0xd0f9cc1d 
  [13] ssl_Write(0x8080218, 0x8065294, 0x15, 0xd08d9f70, 0x8056333, 0x8080218), at 0xd0fa13fc 
  [14] PR_Write(0x8080218, 0x8065294, 0x15), at 0xd0e0a313 
  [15] do_connects(0x8046a20, 0x807ff98, 0x1), at 0x8056333 
  [16] thread_wrapper(0x807ded4), at 0x8054f3c 
  [17] _pt_root(0x80bee80), at 0xd0e281cc 
  [18] _thr_setup(0xd07c0000), at 0xd0cbcf3f 
  [19] _lwp_start(), at 0xd0cbd230 


(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __lwp_wait()
o     t@2  a  l@2   _pt_root()   signal SIGSEGV in  mutex_lock_impl()
      t@3  a  l@3   _pt_root()   LWP suspended in  __lwp_park()
      t@4  a  l@4   _pt_root()   LWP suspended in  PORT_Free()
      t@5  a  l@5   _pt_root()   LWP suspended in  mutex_trylock_adaptive()
      t@6  a  l@6   _pt_root()   LWP suspended in  ___lwp_mutex_timedlock()
      t@7  a  l@7   _pt_root()   LWP suspended in  __lwp_park()
      t@8  a  l@8   _pt_root()   LWP suspended in  mutex_trylock_adaptive()
      t@9  a  l@9   _pt_root()   LWP suspended in  __lwp_park()
(dbx) thread t@2
t@2 (l@2) stopped in mutex_lock_impl at 0xd0cb6c08
0xd0cb6c08: mutex_lock_impl+0x0020:     movzbl   0x00000004(%edx),%esi
(dbx) where
current thread: t@2
=>[1] mutex_lock_impl(0x12301431, 0x0), at 0xd0cb6c08
  [2] __mutex_lock(0x12301431), at 0xd0cb6de0
  [3] PR_Lock(0x12301431), at 0xd0e21509
  [4] nssArena_Destroy(0x81b47c8, 0x0, 0x81df668, 0xd0f320f0, 0xd09fb868, 0xd0f1c4d7), at 0xd0f1c45c
  [5] nssCertificate_Destroy(0x81df668), at 0xd0f0b38a
  [6] STAN_GetCERTCertificateOrRelease(0x81df668), at 0xd0f15053
  [7] __CERT_NewTempCertificate(0x80a3e20, 0xd09fb914, 0x0, 0x0, 0x1), at 0xd0ef973c
  [8] ssl3_HandleCertificate(0x8143aa0, 0x81c35a4, 0x7b7), at 0xd0fa0455
  [9] ssl3_HandleHandshakeMessage(0x8143aa0, 0x81c334e, 0xa0d), at 0xd0fa1dfd
  [10] ssl3_HandleHandshake(0x8143aa0, 0x8143ce0), at 0xd0fa1fa2
  [11] ssl3_HandleRecord(0x8143aa0, 0xd09fba74, 0x8143ce0), at 0xd0fa2764
  [12] ssl3_GatherCompleteHandshake(0x8143aa0, 0x0), at 0xd0fa3c60
  [13] ssl_GatherRecord1stHandshake(0x8143aa0), at 0xd0fa512b
  [14] ssl_SecureWrite(0x8143aa0, 0x8065294, 0x15), at 0xd0facc1d
  [15] ssl_Write(0x8080258, 0x8065294, 0x15, 0xd09fbf70, 0x8056333, 0x8080258), at 0xd0fb13fc
  [16] PR_Write(0x8080258, 0x8065294, 0x15), at 0xd0e0a313
  [17] do_connects(0x8047170, 0x807ff98, 0x0), at 0x8056333
  [18] thread_wrapper(0x807deb8), at 0x8054f3c
  [19] _pt_root(0x80bee10), at 0xd0e281cc
  [20] _thr_setup(0xd0d02400), at 0xd0cbcf3f
  [21] _lwp_start(), at 0xd0cbd230

Alexei, here are some questions intended to help us narrow this down.
Do we know any of these things to be true (or to be false)?
a) only happens on optimized builds, not debug?
b) only happens on NSS_3_11_BRANCH, not trunk?
c) only happens on Solaris X86, not on any other platform?
d) only happens on 32-bit, not 64-bit?
e) only happens when built with Studio 11?  or Studio 10?
Severity: normal → critical
Priority: -- → P1
Target Milestone: --- → 3.11.1
(Assignee)

Comment 6

12 years ago
OK, the problem is intermittent, so I cannot answer some of these questions with 100% confidence.

a) only happens on optimized builds, not debug?
Looks true. I could not reproduce it on a debug build within 4 hours.

b) only happens on NSS_3_11_BRANCH, not trunk?
False. I was able to reproduce the core dump on the trunk. See details below.

c) only happens on Solaris X86, not on any other platform?
Looks true. I've run the test only on Linux and SPARC. Both of them
ran for 1-2 hours without a crash.

d) only happens on 32-bit, not 64-bit?
Looks like true.

e) only happens when built with Studio 11?  or Studio 10?
False. The branch built with Studio 10 also has the bug.


Here are the stacks:

Trunk bits built with Studio 11:

(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __lwp_wait()
      t@2  a  l@2   _pt_root()   LWP suspended in  hash_access()
      t@3  a  l@3   _pt_root()   LWP suspended in  ___lwp_mutex_timedlock()
      t@4  a  l@4   _pt_root()   LWP suspended in  ___lwp_mutex_timedlock()
      t@5  a  l@5   _pt_root()   LWP suspended in  ___lwp_mutex_timedlock()
      t@6  a  l@6   _pt_root()   LWP suspended in  mutex_unlock_queue()
o     t@7  a  l@7   _pt_root()   signal SIGSEGV in  mutex_lock_impl()
      t@8  a  l@8   _pt_root()   LWP suspended in  mutex_unlock_queue()
      t@9  a  l@9   _pt_root()   LWP suspended in  ___lwp_mutex_timedlock()
(dbx) thread t@7
t@7 (l@7) stopped in mutex_lock_impl at 0xd0cb6c08
0xd0cb6c08: mutex_lock_impl+0x0020:     movzbl   0x00000004(%edx),%esi
(dbx) where
current thread: t@7
=>[1] mutex_lock_impl(0x0, 0x0), at 0xd0cb6c08
  [2] __mutex_lock(0x0), at 0xd0cb6de0
  [3] PR_Lock(0x0), at 0xd0e231d1
  [4] nssArena_Destroy(0x81e48e8, 0x0, 0x81ea8d8, 0xd0f35168, 0xd04bd848, 0xd0f1f4ef), at 0xd0f1f474
  [5] nssCertificate_Destroy(0x81ea8d8), at 0xd0f0dabb
  [6] STAN_GetCERTCertificateOrRelease(0x81ea8d8), at 0xd0f17adf
  [7] __CERT_NewTempCertificate(0x80a6310, 0xd04bd904, 0x0, 0x0, 0x1), at 0xd0efb591
  [8] ssl3_HandleCertificate(0x80ca100, 0x81280d4, 0x7b7), at 0xd0f9ff55
  [9] ssl3_HandleHandshakeMessage(0x80ca100, 0x8127e7e, 0xa0d), at 0xd0fa190e
  [10] ssl3_HandleHandshake(0x80ca100, 0x80ca340), at 0xd0fa1abb
  [11] ssl3_HandleRecord(0x80ca100, 0xd04bda74, 0x80ca340), at 0xd0fa2131
  [12] ssl3_GatherCompleteHandshake(0x80ca100, 0x0), at 0xd0fa35b8
  [13] ssl_GatherRecord1stHandshake(0x80ca100), at 0xd0fa4bd7
  [14] ssl_SecureWrite(0x80ca100, 0x8067788, 0x15), at 0xd0fac8c2
  [15] ssl_Write(0x8082648, 0x8067788, 0x15), at 0xd0fb1fd0
  [16] PR_Write(0x8082648, 0x8067788, 0x15), at 0xd0e0a254
  [17] do_connects(0x8046fb0, 0x8082408, 0x5), at 0x8056337
  [18] thread_wrapper(0x8080444), at 0x8054ed9
  [19] _pt_root(0x80c1b50), at 0xd0e2a8bd
  [20] _thr_setup(0xd07c1000), at 0xd0cbcf3f
  [21] _lwp_start(), at 0xd0cbd230

---------------------------------------------------------------------------------------

Branch bits built with Studio 10:

(dbx) threads
      t@1  a  l@1   ?()   LWP suspended in  __lwp_wait() 
      t@2  a  l@2   _pt_root()   LWP suspended in  __lwp_park() 
      t@3  a  l@3   _pt_root()   LWP suspended in  __lwp_park() 
      t@4  a  l@4   _pt_root()   LWP suspended in  mutex_trylock_adaptive() 
      t@5  a  l@5   _pt_root()   LWP suspended in  __lwp_park() 
      t@6  a  l@6   _pt_root()   LWP suspended in  _free_unlocked() 
o     t@7  a  l@7   _pt_root()   signal SIGSEGV in  mutex_lock_impl() 
      t@8  a  l@8   _pt_root()   LWP suspended in  __pollsys() 
      t@9  a  l@9   _pt_root()   LWP suspended in  mutex_trylock_adaptive() 
(dbx) thread t@7
t@7 (l@7) stopped in mutex_lock_impl at 0xd0cb6c08
0xd0cb6c08: mutex_lock_impl+0x0020:     movzbl   0x00000004(%edx),%esi
(dbx) where     
current thread: t@7
=>[1] mutex_lock_impl(0x0, 0x0), at 0xd0cb6c08 
  [2] __mutex_lock(0x0), at 0xd0cb6de0 
  [3] PR_Lock(0x0), at 0xd0e21445 
  [4] nssArena_Destroy(0x81d10c8), at 0xd0f1b5f4 
  [5] NSSTrustDomain_FindCertificateByEncodedCertificate(0x80a3c80, 0xd04bd8b8), at 0xd0f0e35c 
  [6] __CERT_NewTempCertificate(0x80a3c80, 0xd04bd914, 0x0, 0x0, 0x1), at 0xd0ef86ea 
  [7] ssl3_HandleCertificate(0x812c6a8, 0x81387e6, 0x51d), at 0xd0fa0c72 
  [8] ssl3_HandleHandshakeMessage(0x812c6a8, 0x81382f6, 0xa0d), at 0xd0fa255d 
  [9] ssl3_HandleHandshake(0x812c6a8, 0x812c8e8), at 0xd0fa2702 
  [10] ssl3_HandleRecord(0x812c6a8, 0xd04bda74, 0x812c8e8), at 0xd0fa2ec4 
  [11] ssl3_GatherCompleteHandshake(0x812c6a8, 0x0), at 0xd0fa43b8 
  [12] ssl_GatherRecord1stHandshake(0x812c6a8), at 0xd0fa5883 
  [13] ssl_SecureWrite(0x812c6a8, 0x80651a0, 0x15), at 0xd0fad375 
  [14] ssl_Write(0x807fef8, 0x80651a0, 0x15, 0xd04bdf70, 0x805626b, 0x807fef8), at 0xd0fb1b54 
  [15] PR_Write(0x807fef8, 0x80651a0, 0x15), at 0xd0e0a257 
  [16] do_connects(0x80470c0, 0x807fd78, 0x5), at 0x805626b 
  [17] thread_wrapper(0x807ddb4), at 0x8054e74 
  [18] _pt_root(0x80bf4c0), at 0xd0e28110 
  [19] _thr_setup(0xd07c1000), at 0xd0cbcf3f 
  [20] _lwp_start(), at 0xd0cbd230 

---------------------------------------------------------------------------------------



Alexei, thanks for this info.
Several things jump out at me from those latest stack traces.

1. nssArena_Destroy is passing NULL to PR_Lock in each case.  
   Need to find out why.  The reason for this (whatever it is) is 
   undoubtedly the "root cause" of this crash.

2. PR_Lock is calling __mutex_lock.  
   This makes me wonder if PR_Lock has been built correctly for Solaris.  

   I would expect the Solaris implementation of PR_Lock to be this one:
   http://lxr.mozilla.org/nspr/source/nsprpub/pr/src/pthreads/ptsynch.c#202
   which calls pthread_mutex_lock().  

   But I wonder if we are seeing this one instead:
   http://lxr.mozilla.org/nspr/source/nsprpub/pr/src/threads/combined/prulock.c#225
   which calls _MD_Lock, which in turn calls mutex_lock().  This is seen at
   http://lxr.mozilla.org/nspr/source/nsprpub/pr/src/md/unix/solaris.c#344

Alternatively, I wonder if we're failing to link with libpthread.

More investigation is needed.  I'd suggest starting by adding code in 
nssArena_Destroy that calls abort() if it's about to pass NULL to PR_Lock().
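
The guard suggested here can be sketched in isolation. This is only an illustrative, self-contained example and not NSS code: `sketch_arena`, `sketch_arena_lock_ok`, and `sketch_arena_destroy` are hypothetical names, and a plain pthread mutex stands in for the NSPR lock. The idea is to fail fast at the first point where the bad pointer is visible, so the core file shows the caller rather than a fault deep inside the mutex code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an arena; the real NSS arena layout differs. */
typedef struct {
    pthread_mutex_t *lock; /* NULL models the bad pointer seen in the traces */
} sketch_arena;

/* Returns 1 if the arena has a lock that can be taken, 0 otherwise. */
static int sketch_arena_lock_ok(const sketch_arena *arena)
{
    return arena != NULL && arena->lock != NULL;
}

/* Destroy path with a fail-fast guard placed before the lock call. */
static void sketch_arena_destroy(sketch_arena *arena)
{
    if (!sketch_arena_lock_ok(arena)) {
        fprintf(stderr, "sketch_arena_destroy: NULL lock, aborting\n");
        abort(); /* leaves a core file that points at this caller */
    }
    pthread_mutex_lock(arena->lock);
    /* ... arena memory would be released here ... */
    pthread_mutex_unlock(arena->lock);
    pthread_mutex_destroy(arena->lock);
    free(arena->lock);
    arena->lock = NULL;
}
```

A guard like this does not fix anything by itself. It only moves the failure to a place where the offending object can still be inspected in the debugger.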

More debugging suggestions: in lib/pki/certificate.c, add a couple of sanity checks.

137         if (PR_AtomicDecrement(&c->object.refCount) == 0) {
138             /* --- remove cert and UNLOCK storage --- */
   +            if (!c->object.arena) abort();
139             if (cc) {


150             PZ_DestroyLock(c->object.lock);
   +            if (!c->object.arena) abort();
151             nssArena_Destroy(c->object.arena);
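
The instrumented lines above sit in a reference-counted destroy path: the object is torn down only by the caller whose decrement brings the count to zero. As a rough illustration of that pattern outside NSS, here is a minimal sketch using C11 atomics and a pthread mutex. All names (`sketch_object`, `sketch_object_create`, `sketch_object_addref`, `sketch_object_release`) are hypothetical, and the real NSSCertificate code is more involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object that owns a lock. */
typedef struct {
    atomic_int refcount;
    pthread_mutex_t *lock;
} sketch_object;

/* Fully initialize the lock before the object is handed to any caller. */
static sketch_object *sketch_object_create(void)
{
    sketch_object *obj = calloc(1, sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->lock = malloc(sizeof *obj->lock);
    if (obj->lock == NULL || pthread_mutex_init(obj->lock, NULL) != 0) {
        free(obj->lock);
        free(obj);
        return NULL;
    }
    atomic_init(&obj->refcount, 1);
    return obj;
}

static void sketch_object_addref(sketch_object *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
}

/* Returns 1 if this call destroyed the object, 0 if references remain. */
static int sketch_object_release(sketch_object *obj)
{
    /* atomic_fetch_sub returns the previous value; 1 means we hit zero. */
    if (atomic_fetch_sub(&obj->refcount, 1) != 1)
        return 0;
    if (obj->lock == NULL)
        abort(); /* sanity check in the spirit of the suggestion above */
    pthread_mutex_destroy(obj->lock);
    free(obj->lock);
    free(obj);
    return 1;
}
```

In this sketch only one thread can observe the transition to zero, so the lock is destroyed exactly once. If any caller still holds a pointer without a counted reference, it can touch the freed lock afterwards, and that would look much like the NULL and garbage values passed to PR_Lock in the traces.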

There is a race documented in bug 225525.  Could it be the cause of this crash?
Keywords: crash
QA Contact: jason.m.reid → libraries

Comment 10

12 years ago
I think this is related to bug 331164. Before I applied the fix, I got many random crashes on Niagara related to bogus CERTCertificate content. See comments 29 through 32 in that bug. I think this one will go away too with the fix for 331164. Does everyone else agree we should close this as a dupe?

(Assignee)

Comment 11

12 years ago
It looks like all the stacks reported in this bug are related to the uninitialized lock pointer fixed in bug 331164.  I think it should be closed as a dup.
Done.

*** This bug has been marked as a duplicate of 331164 ***
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → DUPLICATE