It first showed up in 25.0a1/20130711 and is currently #2 crasher in this build. The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=04d8c309fe72&tochange=dde4dcd6fa46 It might be a regression from bug 889831. Signature NS_CycleCollectorSuspect3 More Reports Search UUID 0ce6f3c7-7340-4b71-bd82-5fc042130711 Date Processed 2013-07-11 15:41:49.010506 Uptime 457 Last Crash 460 seconds before submission Install Age 5836 since version was first installed. Install Time 2013-07-11 14:06:24 Product Firefox Version 25.0a1 Build ID 20130711030204 Release Channel nightly OS Windows NT OS Version 6.1.7601 Service Pack 1 Build Architecture x86 Build Architecture Info AuthenticAMD family 15 model 107 stepping 2 | 2 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x0 App Notes AdapterVendorID: 0x10de, AdapterDeviceID: 0x053e, AdapterSubsysID: 1c41147b, AdapterDriverVersion: 18.104.22.16838 D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- Adapter Vendor ID 0x10de Adapter Device ID 0x053e Total Virtual Memory 4294836224 Available Virtual Memory 3694338048 System Memory Use Percentage 73 Available Page File 4719177728 Available Physical Memory 1114165248 Frame Module Signature Source 0 xul.dll NS_CycleCollectorSuspect3 xpcom/base/nsCycleCollector.cpp 1 xul.dll nsNavHistoryFolderResultNode::Release() content/base/src/FragmentOrElement.cpp 2 xul.dll nsNSSASN1Sequence::~nsNSSASN1Sequence() security/manager/ssl/src/nsNSSASN1Object.cpp 3 xul.dll nsNSSASN1Sequence::`scalar deleting destructor'(unsigned int) 4 xul.dll nsNSSASN1Sequence::Release() security/manager/ssl/src/nsNSSASN1Object.cpp 5 xul.dll nsNSSCertificate::~nsNSSCertificate() security/manager/ssl/src/nsNSSCertificate.cpp 6 xul.dll nsNSSCertificate::`scalar deleting destructor'(unsigned int) 7 xul.dll nsNSSCertificate::Release() security/manager/ssl/src/nsNSSCertificate.cpp 8 xul.dll nsSSLStatus::~nsSSLStatus() security/manager/ssl/src/nsSSLStatus.cpp 9 xul.dll nsSSLStatus::`scalar deleting destructor'(unsigned int) 10 xul.dll nsSSLStatus::Release() security/manager/ssl/src/nsSSLStatus.cpp 11 xul.dll mozilla::psm::TransportSecurityInfo::~TransportSecurityInfo() security/manager/ssl/src/TransportSecurityInfo.cpp 12 xul.dll nsNSSSocketInfo::`vector deleting destructor'(unsigned int) 13 xul.dll mozilla::psm::TransportSecurityInfo::Release() security/manager/ssl/src/TransportSecurityInfo.cpp 14 xul.dll nsRefPtr<mozilla::dom::Element>::~nsRefPtr<mozilla::dom::Element>() obj-firefox/dist/include/nsAutoPtr.h 15 xul.dll nsSocketTransport::~nsSocketTransport() netwerk/base/src/nsSocketTransport2.cpp 16 xul.dll nsSocketTransport::`vector deleting destructor'(unsigned int) 17 xul.dll nsSocketTransport::Release() netwerk/base/src/nsSocketTransport2.cpp 18 xul.dll nsSocketTransportService::DetachSocket(nsSocketTransportService::SocketContext *,nsSocketTransportService::SocketContext *) netwerk/base/src/nsSocketTransportService2.cpp 19 xul.dll nsSocketTransportService::DoPollIteration(bool) netwerk/base/src/nsSocketTransportService2.cpp 20 xul.dll nsSocketTransportService::Run() netwerk/base/src/nsSocketTransportService2.cpp 21 xul.dll nsThread::ProcessNextEvent(bool,bool *) xpcom/threads/nsThread.cpp 22 xul.dll nsThread::ThreadFunc(void *) xpcom/threads/nsThread.cpp 23 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c 24 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c 25 msvcr100.dll _callthreadstartex f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c 26 msvcr100.dll _threadstartex f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c 27 kernel32.dll BaseThreadInitThunk 28 ntdll.dll __RtlUserThreadStart 29 ntdll.dll _RtlUserThreadStart More reports at: https://crash-stats.mozilla.com/report/list?product=Firefox&signature=NS_CycleCollectorSuspect3
(In reply to Scoobidiver from comment #0) > It might be a regression from bug 889831. Unless there was a typo in the patch, bug 889831 didn't change any functionality - OCSP stapling was pref'd on by default, so everyone was already running that code path.
Note, the code which used to use NS_CycleCollectorSuspect2 is now using NS_CycleCollectorSuspect3. However the stack starts to look odd after ~nsSocketTransport2.cpp
Doug, do you think this is ready for any more stack analysis or do we have recommendations for how QA can investigate a regression window or STR here?
This is not for me. The stack looks ok until something odd happens in the network land. It just happens to jump totally randomly to NS_CycleCollectorSuspect3 (which is a new method added 7/11). Or that is at least what the stack suggests.
Checking manually (bug 888219 not fixed) shows it's 100% correlated to Certificate Patrol (CertPatrol@PSYC.EU) 2.0.14.
(In reply to Scoobidiver from comment #5) > Checking manually (bug 888219 not fixed) shows it's 100% correlated to > Certificate Patrol (CertPatrol@PSYC.EU) 2.0.14. Is that an add-on? If so, does it trigger a problem on our side or do we need to block it?
(In reply to Robert Kaiser (:email@example.com) from comment #6) > (In reply to Scoobidiver from comment #5) > > Checking manually (bug 888219 not fixed) shows it's 100% correlated to > > Certificate Patrol (CertPatrol@PSYC.EU) 2.0.14. > Is that an add-on? If so, does it trigger a problem on our side or do we > need to block it? It's an extension as the ID suggests it. There's no binary in the stack trace so it's a Firefox issue but we can also contact the developers.
OK, then it looks like this add-on is triggering the crash, which already is quite helpful. I wonder if devs can reproduce with that?
So my guess is that the addon somehow causes us to create mASN1Objects of nsNSSASN1Sequence in the main thread (does it call GetASN1Objects?) so that the mutable array is cycle collectable and we still delete nsNSSASN1Sequence in non-main-thread. Deleting cycle collectable object in some other thread it was created is just not supported. nsIMutableArray is super error prone, since depending on the thread it is CCable or not.
Created attachment 778886 [details] [diff] [review] possible patch https://tbpl.mozilla.org/?tree=Try&rev=c756d7b36c52 There are perhaps more objects which shouldn't be thread safe, but changed refcounting of nsNSSASN1Sequence only for now.
Comment on attachment 778886 [details] [diff] [review] possible patch Hmm, no, somehow this breaks bug 480509.
Created attachment 778888 [details] [diff] [review] v2 https://tbpl.mozilla.org/?tree=Try&rev=9ce0ab0dc7de Not beautiful, but seems to pass the tests locally. Trying to keep the old behavior. I guess someone just doesn't care about the rv value, but expects there to be a real result.
Oh, silly me. Didn't notice 'sequence' is reused later in the method. So this v2 should be ok.
Created attachment 778954 [details] [diff] [review] alternative approach https://tbpl.mozilla.org/?tree=Try&rev=d5a45d6f3e3a I guess either one should be ok.
(In reply to Scoobidiver from comment #5) > Checking manually (bug 888219 not fixed) Now that bug is fixed, here are the latest correlations: 96% (45/47) vs. 2% (47/1931) CertPatrol@PSYC.EU (Certificate Patrol, https://addons.mozilla.org/addon/6415) (2.0.14)
Comment on attachment 778954 [details] [diff] [review] alternative approach Review of attachment 778954 [details] [diff] [review]: ----------------------------------------------------------------- You should have an XPCOM peer review the XPCOM changes that add the new parameter to nsArray. I prefer not to use this as the solution because PSM has never had to explicitly deal with cycle collection yet (AFAICT) and I would like to keep it this way if at all possible.
Comment on attachment 778888 [details] [diff] [review] v2 Review of attachment 778888 [details] [diff] [review]: ----------------------------------------------------------------- ::: security/manager/ssl/src/nsNSSASN1Object.cpp @@ +13,2 @@ > NS_IMPL_THREADSAFE_ISUPPORTS2(nsNSSASN1PrintableItem, nsIASN1PrintableItem, > nsIASN1Object) Please also change nsNSSASN1PrintableItem to non-thread-safe nsISupports. This patch is bitrotted by the patch that removed NS_IMPL_THREADSAFE_ISUPPORTS* so you'll need to redo this part by changing the declarations of the classes. ::: security/manager/ssl/src/nsNSSCertHelper.cpp @@ +2116,5 @@ > if (isAlreadyShutDown()) > return NS_ERROR_NOT_AVAILABLE; > > nsCOMPtr<nsIASN1Sequence> sequence = new nsNSSASN1Sequence(); > + NS_ADDREF(*aRetVal = sequence.get()); Nit: No need for two addrefs. Just set *aRetValue = new nsNSSASN1Sequence() directly. (Move the declaration of sequence down to be right above the call to CreateTBSCertificateASN1Struct.)
Created attachment 782627 [details] [diff] [review] using .forget().get() This way no extra addref/release. retval is nsIASN1Object, but we need nsIASN1Sequence within the method, so couldn't just assign to retval.
Calling this verified fixed since I only see four reports in the last week with this signature on Firefox 25.0a2 and 26.0a1. Should there be a follow-up bug to address the new instances of this signature?