Closed Bug 936402 Opened 11 years ago Closed 10 years ago

Race condition in UnixSocket; resulting in segmentation fault

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:koi+, firefox27 wontfix, firefox28 fixed, firefox29 fixed, b2g-v1.2 fixed, b2g-v1.3 fixed)

RESOLVED FIXED
blocking-b2g koi+
Tracking Status
firefox27 --- wontfix
firefox28 --- fixed
firefox29 --- fixed
b2g-v1.2 --- fixed
b2g-v1.3 --- fixed

People

(Reporter: tzimmermann, Assigned: tzimmermann)

References

Details

Attachments

(2 files, 1 obsolete file)

There is a race condition between UnixSocketConsumer::CloseSocket and UnixSocketImpl::OnFileCanReadWithoutBlocking. If the former clears mImpl->mConsumer on the main thread (by calling UnixSocketImpl::ShutDownOnMainThread), in the I/O thread might dereference a null pointer when reading the connection status at the beginning of OnFileCanReadWithoutBlocking. This results in a segmentation fault. A similar race condition exists with OnFileCanWriteWithoutBlocking. The problem has been exposed by the changes in bug 932728.
Hi Ben,

I think you reviewed the patch that introduced the current shutdown logic in UnixSocket.cpp. Please have a look at this change.
Attachment #829166 - Flags: review?(bent.mozilla)
I'm looking at this tomorrow, sorry for the delay!
Comment on attachment 829166 [details] [diff] [review]
[01] Bug 936402: Store connection status in UnixSocketImpl

Review of attachment 829166 [details] [diff] [review]:
-----------------------------------------------------------------

Hm, this seems much more complicated than it needs to be. The underlying problem is that mConsumer is being used off the main thread. That shouldn't happen, but apparently we're using it to get the connection status. This is silly because the IO thread knows the connection status better than the main thread!

What if we tried this instead:

1. Make UnixSocketConsumer::GetConnectionStatus() assert that it's only called on the main thread. That would have helped identify this issue sooner.
2. Make UnixSocketImpl track its own IO-thread-only status.

That should avoid the races and prevent us from sharing this stuff across threads.
Attachment #829166 - Flags: review?(bent.mozilla) → review-
Hi

(In reply to ben turner [:bent] (use the needinfo? flag!) (gone until dec. 6!) from comment #3)

Sorry for the late reply; been on PTO.

> Hm, this seems much more complicated than it needs to be. The underlying
> problem is that mConsumer is being used off the main thread. That shouldn't
> happen, but apparently we're using it to get the connection status. This is
> silly because the IO thread knows the connection status better than the main
> thread!

In my opinion, the core problem is that the connection status is a property of the I/O thread's data structure, but we're storing it on the main thread. So we have to use mConsumer off the main thread.

> 
> What if we tried this instead:
> 
> 1. Make UnixSocketConsumer::GetConnectionStatus() assert that it's only
> called on the main thread. That would have helped identify this issue sooner.
> 2. Make UnixSocketImpl track its own IO-thread-only status.

I'm kind of worried about this, because tracking the connection status in two places violates the Single-Point-Of-Truth rule.

> That should avoid the races and prevent us from sharing this stuff across
> threads.

I'll prepare a patch with the suggested changes, so we can compare side-by-side.

-Thomas
This is a patch with the changes you proposed.
Attachment #8341098 - Flags: review?(bent.mozilla)
Comment on attachment 8341098 [details] [diff] [review]
[01] Bug 936402: Duplicate connection status in UnixSocketImpl

Review of attachment 8341098 [details] [diff] [review]:
-----------------------------------------------------------------

This looks really good to me!
Attachment #8341098 - Flags: review?(bent.mozilla) → review+
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #7)
> Thanks for your review.
> 
> https://hg.mozilla.org/integration/b2g-inbound/rev/66acce483c48
> 
> https://tbpl.mozilla.org/?tree=B2g-Inbound&rev=66acce483c48

What's the reason for the koi nomination?
(In reply to Jason Smith [:jsmith] from comment #8)
> What's the reason for the koi nomination?

The patch fixes a race condition that results in a segmentation fault. The bug should be in 1.2 afaict.
https://hg.mozilla.org/mozilla-central/rev/66acce483c48
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #9)
> (In reply to Jason Smith [:jsmith] from comment #8)
> > What's the reason for the koi nomination?
> 
> The patch fixes a race condition that results in a segmentation fault. The
> bug should be in 1.2 afaict.

This will also help make automation more stable since we depend on the socket code very heavily.
Fixes race condition issue.
blocking-b2g: koi? → koi+
Attachment #829166 - Attachment is obsolete: true
https://hg.mozilla.org/releases/mozilla-aurora/rev/14ee95c5c523

This has some conflicts on b2g26 that I'm not comfortable trying to resolve myself. Please post a branch-specific patch.
Flags: needinfo?(tzimmermann)
Patch for b2g26_v1_2
Attachment #8345949 - Flags: review+
Flags: needinfo?(tzimmermann)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: