Proposing threading safety patch for stable branch(es)

RESOLVED FIXED

Status

()

Core
XPCOM
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: kaie, Assigned: kaie)

Tracking

({crash, fixed1.8.0.13, fixed1.8.1.5})

1.8 Branch
crash, fixed1.8.0.13, fixed1.8.1.5
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.8.1.5 +
blocking1.8.0.13 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

1.30 KB, patch
Wan-Teh Chang
: review+
Darin Fisher
: superreview+
Details | Diff | Splinter Review
(Assignee)

Description

10 years ago
From mozilla/xpcom/threads/nsThread.cpp
function nsThread::Shutdown()

  // XXX nspr doesn't seem to be calling the main thread's destructor
  // callback, so let's help it out:

In bug 362768 we added code to the most recent versions of NSPR that makes the above workaround unnecessary.

In fact, with the code from bug 362768 in place, the above workaround will cause a crash, because a cleanup function gets called twice.

As NSPR is considered a system library on some Linux distributions, which may incorporate the above bugfix, it is proposed to add this patch to the stable Mozilla branch(es).

This is also a good preparation, should Mozilla on the stable branch ever upgrade to an NSPR version that contains the patch from bug 362768.

The patch we are proposing is safe, see comment in attached patch.
(Assignee)

Comment 1

10 years ago
Created attachment 263543 [details] [diff] [review]
Patch v1
Attachment #263543 - Flags: review?(wtc)
(Assignee)

Updated

10 years ago
Assignee: nobody → kengert

Comment 2

10 years ago
Comment on attachment 263543 [details] [diff] [review]
Patch v1

Kai, you forgot to remove the nsThread::Exit call.
Attachment #263543 - Flags: review?(wtc) → review-
(Assignee)

Comment 3

10 years ago
Wan-Teh, I'm confused.

I'm proposing to land a patch that will work with both old and newer NSPR.

Comment 4

10 years ago
Unlike pthreads' pthread_setspecific, NSPR's PR_SetThreadPrivate
calls the destructor on the old thread private data before setting
the new thread private data.  In this case, the old thread private
data is gMainThread and the destructor is nsThread::Exit, so

+        PR_SetThreadPrivate(kIThreadSelfIndex, NULL);

is equivalent to

+        nsThread::Exit(NS_STATIC_CAST(nsThread*, gMainThread));
+        set thread private data at kIThreadSelfIndex to NULL;

Therefore you must remove the original nsThread::Exit call, otherwise
you'd be calling it twice, which would be a double-free error.

So the net effect of my proposed change is that we set the thread
private data at kIThreadSelfIndex to NULL after we destroy the old
thread private data (gMainThread).
- The old NSPR doesn't destroy thread private data of the main thread,
  so this change won't affect it.
- With the new NSPR, the NULL thread private data will prevent NSPR
  from calling the destructor.
So my proposed change will work with both the old and new NSPR.

Comment 5

10 years ago
Comment on attachment 263543 [details] [diff] [review]
Patch v1

>+        // but on older versions of NSPR it will not get called.

Add "unless we call PR_Cleanup" at the end of this sentence.

>+        // - set the callback pointer to NULL to ensure it will not get
>+        //   called by NSPR
>+        // - call the function ourselves, in case we're using an older
>+        //   NSPR library


Change this to:

          // - call the function ourselves
          // - set the data pointer to NULL to ensure the function will
          //   not get called again by NSPR
          // The PR_SetThreadPrivate call does both of these.
(Assignee)

Comment 6

10 years ago
Created attachment 264031 [details] [diff] [review]
Patch v2

Wan-Teh, thanks a lot for clarifying.
Here is a new patch.
Attachment #263543 - Attachment is obsolete: true
Attachment #264031 - Flags: review?(wtc)
Nominating as blocking.  This is causing a ton of crashes in 64bit geckos due to a change in NSS.  http://bugzilla.gnome.org/show_bug.cgi?id=413350 has something like 20-30 dupes and it's been reported a few times in RH bugzilla as well.
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.12?

Comment 8

10 years ago
Comment on attachment 264031 [details] [diff] [review]
Patch v2

r=wtc.  The comments below document my understanding of the code
in question.

This is a safe change.  It does assume that nsThread::Shutdown
is called by the main thread.

What this patch does is to remove a dangling pointer.  If we
call the destructor (nsThread::Exit) on gMainThread, we should
set the thread private data to NULL, otherwise the thread private
data will become a dangling pointer.

The nsThread object pointed to by gMainThread has two references:
- one reference is gMainThread itself,
- the other reference is the thread private data.

The thread private data destructor releases one reference.
nsThread::Shutdown releases the second reference (gMainThread).
Attachment #264031 - Flags: review?(wtc) → review+
(Assignee)

Comment 9

10 years ago
Comment on attachment 264031 [details] [diff] [review]
Patch v2

Asking for approval to land this on stable branches.
Attachment #264031 - Flags: approval1.8.1.4?
Attachment #264031 - Flags: approval1.8.0.12?
If I understand you correctly, Mozilla-released binaries don't need this fix, but distros who build with system NSPR might?

Changing threading code well into the release-candidate process (no more time for full testing, just quick breadth and spot checks) is a scary proposition to us. Can the distros just take this patch themselves for this release and we'll get it early next time?
Comment on attachment 264031 [details] [diff] [review]
Patch v2

Threading changes scare us, want a second look from someone who knows the xpcom code side of things.
Attachment #264031 - Flags: superreview?(darin.moz)
Attachment #264031 - Flags: approval1.8.1.5?
Attachment #264031 - Flags: approval1.8.0.13?

Comment 12

10 years ago
Dan, I agree this patch should wait until the next release.
The super reviewer should verify the patch's assumption that
nsThread::Shutdown is always called by the main thread.

Comment 13

10 years ago
Comment on attachment 264031 [details] [diff] [review]
Patch v2

nsThread::Shutdown is called from NS_ShutdownXPCOM, which is only called on the main thread.  So, looks good to me!
Attachment #264031 - Flags: superreview?(darin.moz) → superreview+
Not respinning for this, distros using a newer NSPR will have to manually include this patch until next time.
Flags: blocking1.8.1.5?
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.13?
Flags: blocking1.8.0.12?
Attachment #264031 - Flags: approval1.8.1.4?
Attachment #264031 - Flags: approval1.8.0.12?
(Assignee)

Updated

10 years ago
Blocks: 383977
Flags: blocking1.8.1.5?
Flags: blocking1.8.1.5+
Flags: blocking1.8.0.13?
Flags: blocking1.8.0.13+
Comment on attachment 264031 [details] [diff] [review]
Patch v2

approved for 1.8.1.5 and 1.8.0.13, a=dveditz for release-drivers
Attachment #264031 - Flags: approval1.8.1.5?
Attachment #264031 - Flags: approval1.8.1.5+
Attachment #264031 - Flags: approval1.8.0.13?
Attachment #264031 - Flags: approval1.8.0.13+
(Assignee)

Comment 16

10 years ago
checked in to both branches
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Keywords: fixed1.8.0.13, fixed1.8.1.5
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.