Closed Bug 836523 Opened 12 years ago Closed 12 years ago

crash in mozilla::ipc::SocketReceiveTask::Run shutting off Bluetooth file transfer in process

Tracking

(firefox20 wontfix, firefox21 fixed, b2g18+ affected, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix)

Status:

RESOLVED FIXED

Milestone:

B2G C4 (2jan on)

Tracking Flags:

Tracking

Status

firefox20

---

wontfix

firefox21

---

fixed

b2g18

affected

b2g18-v1.0.0

---

wontfix

b2g18-v1.0.1

---

wontfix

People

(Reporter: marcia, Assigned: tzimmermann)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash])

Crash Data

Attachments

(4 files, 12 obsolete files)

patch 1: v1: Stop watching sockets before deleting UnixSocketImpl instance 12 years ago Eric Chou [:ericchou] [:echou] 1.94 KB, patch	qdot : feedback-	Details \| Diff \| Splinter Review
WIP: Cleanup UnixSocketImpl from IO thread 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 6.53 KB, patch		Details \| Diff \| Splinter Review
Protect mCurrentTaskIsCanceled by lock 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 2.57 KB, patch	qdot : review+ echou : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.21 KB, patch	qdot : review+ echou : review+	Details \| Diff \| Splinter Review
Cleanup UnixSocketImpl from within I/O thread 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 5.79 KB, patch	qdot : review+ echou : review+	Details \| Diff \| Splinter Review
Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 2.52 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl [r=qdot,echou] 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.34 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl [r=qdot,echou] 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.34 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Cleanup UnixSocketImpl from within I/O thread [r=qdot,echou] 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 5.04 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Backout fix 845148 (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 29.94 KB, patch	mrbkap : review-	Details \| Diff \| Splinter Review
Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 4.15 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.22 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Cleanup UnixSocketImpl from within I/O thread [r=qdot,echou] (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 5.80 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.56 KB, patch		Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.22 KB, patch	qdot : review+ echou : review+	Details \| Diff \| Splinter Review
Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (b2g18) 12 years ago Thomas Zimmermann [:tzimmermann] [:tdz] 8.23 KB, patch	tzimmermann : review+	Details \| Diff \| Splinter Review

Marcia Knous [:marcia]

Reporter

Description

•

12 years ago

This bug was filed from the Socorro interface and is report bp-b0854aeb-0d9f-4030-9b29-2836f2130130 . ============================================================= unagi, seen while running: Gecko revision="066b9d7cf30884a001db22bde3ae939c02718062" Gaia revision=f7f5a0cd17e3d04308cc5850b254947e127122b9 STR: 1. Start a bluetooth file transfer - I was transferring a 25.3 MB music file 2. Stop the transfer by turning off bluetooth on the unagi device. 3. Receive a crash.

Marcia Knous [:marcia]

Reporter

Comment 1

•

12 years ago

Frame Module Signature Source 0 libxul.so mozilla::ipc::SocketReceiveTask::Run UnixSocket.cpp:333 1 libxul.so nsThread::ProcessNextEvent nsThread.cpp:620 2 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:237 3 libxul.so mozilla::ipc::MessagePump::Run MessagePump.cpp:82 4 libxul.so MessageLoop::RunInternal message_loop.cc:215 5 libxul.so MessageLoop::Run message_loop.cc:208 6 libxul.so nsBaseAppShell::Run nsBaseAppShell.cpp:163 7 libxul.so nsAppStartup::Run nsAppStartup.cpp:290 8 libxul.so XREMain::XRE_mainRun nsAppRunner.cpp:3794 9 libxul.so XREMain::XRE_main nsAppRunner.cpp:3860 10 libxul.so XRE_main nsAppRunner.cpp:3935 11 b2g main nsBrowserApp.cpp:164 12 libc.so __libc_init libc_init_dynamic.c:114 13 libc.so __cxa_atexit atexit.c:99 14 @0xbea62d45

Scoobidiver (away)

Updated

•

12 years ago

blocking-b2g: --- → tef?

OS: Android → Gonk (Firefox OS)

Hardware: All → ARM

Whiteboard: [b2g-crash]

Marcia Knous [:marcia]

Reporter

Comment 2

•

12 years ago

Using the QC RIL when this crash occurred.

Tony Chung [:tchung]

Comment 3

•

12 years ago

can we get someone to confirm this reproduces against moz RIL? if not, then we can send this onward to QC.

Michael Vines [:m1] [:evilmachines]

Comment 4

•

12 years ago

hmm..i don't see us in the .so list

Stephen Donner [:stephend] Not actively reading bugmail

Comment 5

•

12 years ago

I'm using today's Unagi engineering build (w/Marionette enabled). Sorry, but I understand time is of the essence, here, so I don't have a crash ID right now. My git commit info is: 2013-01-30 09:18:04 f7fa0cd17e3d04308cc5850b254947e... Build identifier is: 20130130063101

Marcia Knous [:marcia]

Reporter

Comment 6

•

12 years ago

I tried reproducing it on a few other devices but haven't had any luck hitting it again.

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 7

•

12 years ago

This looks a lot like bug 830704. Adding tzimmerman to cc just in case this does turn out to be repro'able.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 8

•

12 years ago

I cannot reproduce the problem, but from looking at the code, I have an idea what might happen. The fix for bug 830704 ensures that the instance of UnixSocketImpl gets deleted after the last pending SocketReceiveTask. I assumed that in ipc/unixsocket/UnixSocket.cpp, line 626, the IO thread would have been stopped. That seems not to be the case. If the IO thread creates a SocketReceiveTask (at line 680+) after the main thread executed line 635, this SocketReceiveTask will access an invalid instance of UnixSocketImpl. Is there a thread-safe way to stop the IO thread or make it stop watching the socket file descriptor?

Andrew Overholt [:overholt]

Comment 9

•

12 years ago

Let's track this. Please nom if we can get a reliable STR.

blocking-b2g: tef? → ---

tracking-b2g18: --- → ?

Eric Chou [:ericchou] [:echou]

Comment 10

•

12 years ago

I can reproduce this issue by repeatedly connecting and disconnecting with my Nokia headset for no more than 40 times. The symptom is exactly the same as this issue. I took a look at this problem, it's just like Thomas mentioned in comment 8, SocketReceiveTask is still possible to be created even after DeleteInstanceRunnable::run() is executed, so SocketReceiveTask will access an invalid instance of UnixSocketImpl.

Eric Chou [:ericchou] [:echou]

Comment 11

•

12 years ago

Attached patch patch 1: v1: Stop watching sockets before deleting UnixSocketImpl instance (obsolete) — Details — Splinter Review

Hi Kyle, could you please take a minute to see if this would work? Thanks.

Assignee: nobody → echou

Attachment #710624 - Flags: feedback?(kyle)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 12

•

12 years ago

Hi Eric, Is it save to stop the watch the file descriptor from within the main thread while the I/O thread is still listening? I was looking at the problem a few days ago when Kyle cc'd me, and I couldn't find a thread-safe way to do libevent calls. Also, I can still see an opportunity for this bug to happen if the I/O thread is executing OnFileCanReadWithoutBlocking while the main thread dispatches DeleteUnixSocketImpl.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 13

•

12 years ago

Attached patch WIP: Cleanup UnixSocketImpl from IO thread (obsolete) — Details — Splinter Review

I wrote this patch a few days ago to fix this bug. The idea is to signal the I/O thread that we're about to quit and let it shut down itself and dispatch the DeleteUnixSocketImpl instance The patch has a bug somewhere and doesn't really work, but idea is sound, if think.

Attachment #710629 - Flags: feedback?

Eric Chou [:ericchou] [:echou]

Comment 14

•

12 years ago

(In reply to Thomas Zimmermann [:tzimmermann] from comment #12) > Hi Eric, > > Is it save to stop the watch the file descriptor from within the main thread > while the I/O thread is still listening? I was looking at the problem a few > days ago when Kyle cc'd me, and I couldn't find a thread-safe way to do > libevent calls. Actually, I'm not pretty sure. But since we will always stop watching FDs only in CloseSocket() on main thread and in dtor of UnixSocketImpl, is it still not safe? > Also, I can still see an opportunity for this bug to happen > if the I/O thread is executing OnFileCanReadWithoutBlocking while the main > thread dispatches DeleteUnixSocketImpl. You're right, it could happen.

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 15

•

12 years ago

Ok, well yet another idea I'm not exactly loving, but... So we /need/ to know that we're no longer watching the fds before deleting anything. So why not: 1. Make a function to run on the IO thread that will stop the watchers, that also takes the pointer to delete. 2. Make a function to run on the main thread that'll do the deletion. 3. When we run CloseSocket, queue the function in (1) on the IOThread. 4. When the IOThread gets to the task, it'll close out the watchers, then queue the deletion function to run on the main thread. 5. Main thread deletes pointer without worrying about racing. This way, the order is implicit, doesn't require the complexity of the lock solution, and everything runs on the thread its expected to. Make sense? Am I missing anything?

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 16

•

12 years ago

Comment on attachment 710624 [details] [diff] [review] patch 1: v1: Stop watching sockets before deleting UnixSocketImpl instance Review of attachment 710624 [details] [diff] [review]: ----------------------------------------------------------------- ::: ipc/unixsocket/UnixSocket.cpp @@ +629,5 @@ > mImpl = nullptr; > // Line it up to be destructed on the IO Thread > impl->mConsumer.forget(); > impl->StopTask(); > + impl->StopIO(); StopWatchingFileDescriptor ain't threadsafe. :(

Attachment #710624 - Flags: feedback?(kyle) → feedback-

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 17

•

12 years ago

Talked to cjones. StopWatchingFileDescriptor most likely isn't threadsafe, so I don't think that solution will work. I think my linearizing the functions across the threads should do it, but like I said, if I'm missing something, let me know.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 18

•

12 years ago

(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #15) > Ok, well yet another idea I'm not exactly loving, but... > > So we /need/ to know that we're no longer watching the fds before deleting > anything. So why not: > > 1. Make a function to run on the IO thread that will stop the watchers, that > also takes the pointer to delete. > 2. Make a function to run on the main thread that'll do the deletion. > 3. When we run CloseSocket, queue the function in (1) on the IOThread. > 4. When the IOThread gets to the task, it'll close out the watchers, then > queue the deletion function to run on the main thread. > 5. Main thread deletes pointer without worrying about racing. > > This way, the order is implicit, doesn't require the complexity of the lock > solution, and everything runs on the thread its expected to. Make sense? Am > I missing anything? My patch already does something like that. It sets a flag that will make the I/O thread dispatch a DeleteUnixSocketImpl task to the main tread, which then does the real work. The locking is only for protecting the flag variable. The I/O thread needs to first wake up, though. Another, much simpler idea is to simply close the socket file descriptor from within the main thread. I'm not sure what the I/O thread does in that case, but my guess is that we can detect that the socket has been closed and stop watching. We'd need to signal this back to the main thread, which would then dispatch the DeleteUnixSignalImpl task.

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 19

•

12 years ago

(In reply to Thomas Zimmermann [:tzimmermann] from comment #18) > My patch already does something like that. It sets a flag that will make the > I/O thread dispatch a DeleteUnixSocketImpl task to the main tread, which > then does the real work. The locking is only for protecting the flag > variable. The I/O thread needs to first wake up, though. Yeah, that's the nice part about having it as a task, we'll know it'll run versus waiting on something incoming to trigger it. > Another, much simpler idea is to simply close the socket file descriptor > from within the main thread. I'm not sure what the I/O thread does in that > case, but my guess is that we can detect that the socket has been closed > and stop watching. We'd need to signal this back to the main thread, which > would then dispatch the DeleteUnixSignalImpl task. Hmm. Where would we watch for the descriptor getting closed? We don't really maintain control at all times since libevent is doing most of the work.

Lukas Blakk [:lsblakk] use ?needinfo

Updated

•

12 years ago

tracking-b2g18: ? → +

Eric Chou [:ericchou] [:echou]

Comment 20

•

12 years ago

Hey Thomas, are you going to take this bug? :)

Flags: needinfo?(tzimmermann)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 21

•

12 years ago

Hi Eric, I can't reproduce it, but I'd give it another try and see how far I get.

Assignee: echou → tzimmermann

Flags: needinfo?(tzimmermann)

Eric Chou [:ericchou] [:echou]

Comment 22

•

12 years ago

(In reply to Thomas Zimmermann [:tzimmermann] from comment #21) > Hi Eric > > I can't reproduce it, but I'd give it another try and see how far I get. No problem. Since we've already known what the root cause is, I think it's fine to just upload your patch and we can discuss if it would work. Thanks in advance.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 23

•

12 years ago

Attached patch Protect mCurrentTaskIsCanceled by lock (obsolete) — Details — Splinter Review

Attachment #712464 - Flags: review?(kyle)

Attachment #712464 - Flags: review?(echou)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 24

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl (obsolete) — Details — Splinter Review

Attachment #712466 - Flags: review?(kyle)

Attachment #712466 - Flags: review?(echou)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 25

•

12 years ago

Attached patch Cleanup UnixSocketImpl from within I/O thread (obsolete) — Details — Splinter Review

Attachment #710624 - Attachment is obsolete: true

Attachment #710629 - Attachment is obsolete: true

Attachment #710629 - Flags: feedback?

Attachment #712467 - Flags: review?(kyle)

Attachment #712467 - Flags: review?(echou)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 26

•

12 years ago

Hey! It took me several attempts to fix this bug, but I think these patches finally work. - The first patch simply adds some locking; nothing special. - The second patch converts SocketAcceptTask to use the I/O thread's watch mechanisms, instead of actively polling the listening socket. This is necessary because a SocketAcceptTask might execute after the UnixSocketImpl has already been deleted. - Finally the third patch fixes the race condition by removing the socket's watchers from within the I/O thread and dispatching a delete operation.

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Updated

•

12 years ago

Attachment #712464 - Flags: review?(kyle) → review+

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Updated

•

12 years ago

Attachment #712466 - Flags: review?(kyle) → review+

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Updated

•

12 years ago

Attachment #712467 - Flags: review?(kyle) → review+

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 27

•

12 years ago

Ok, everything looks good, seems to make sense.

Eric Chou [:ericchou] [:echou]

Updated

•

12 years ago

Attachment #712464 - Flags: review?(echou) → review+

Eric Chou [:ericchou] [:echou]

Updated

•

12 years ago

Attachment #712466 - Flags: review?(echou) → review+

Eric Chou [:ericchou] [:echou]

Updated

•

12 years ago

Attachment #712467 - Flags: review?(echou) → review+

Eric Chou [:ericchou] [:echou]

Comment 28

•

12 years ago

Looks good to me. Thanks, Thomas.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 29

•

12 years ago

Attached patch Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] — Details — Splinter Review

Rebased and added r=

Attachment #712464 - Attachment is obsolete: true

Attachment #712859 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 30

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (obsolete) — Details — Splinter Review

Rebased and added r=

Attachment #712860 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 31

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl [r=qdot,echou] — Details — Splinter Review

Rebased and added r=

Attachment #712466 - Attachment is obsolete: true

Attachment #712860 - Attachment is obsolete: true

Attachment #712861 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 32

•

12 years ago

Attached patch Cleanup UnixSocketImpl from within I/O thread [r=qdot,echou] — Details — Splinter Review

Rebased and added r=

Attachment #712467 - Attachment is obsolete: true

Attachment #712862 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Updated

•

12 years ago

Keywords: checkin-needed

Ryan VanderMeulen [:RyanVM]

Comment 33

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/89edfdd1a350 https://hg.mozilla.org/integration/mozilla-inbound/rev/4251e6dd0218 https://hg.mozilla.org/integration/mozilla-inbound/rev/164c9a8f3711 If this ends up getting approved for b2g18 uplift, please set checkin-needed again. My b2g uplift queries ignore the Gaia component and I won't see this otherwise.

Keywords: checkin-needed

Ryan VanderMeulen [:RyanVM]

Comment 34

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/89edfdd1a350 https://hg.mozilla.org/mozilla-central/rev/4251e6dd0218 https://hg.mozilla.org/mozilla-central/rev/164c9a8f3711

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Gregor Wagner [:gwagner]

Updated

•

12 years ago

Depends on: 841648

Blake Kaplan (:mrbkap) (inactive)

Comment 35

•

12 years ago

Comment on attachment 712859 [details] [diff] [review] Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] Review of attachment 712859 [details] [diff] [review]: ----------------------------------------------------------------- ::: ipc/unixsocket/UnixSocket.cpp @@ +80,5 @@ > return; > } > mTask->Cancel(); > mTask = nullptr; > + MutexAutoLock lock(mLock); This is not the correct way of using a lock. If we needed a lock (though the patch in bug 845148 shows that it's not needed here) then the lock would need to cover the entirety of the code that relies on having a consistent value for the protected variables. The lock also misses mTask possibly being assigned and read from multiple threads at the same time. I'm going to end up nuking it in my backport for bug 845148.

Ryan VanderMeulen [:RyanVM]

Comment 36

•

12 years ago

This is going to need to be rebased for b2g18 uplift on top of bug 845148.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 37

•

12 years ago

Reopened for uplift.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 38

•

12 years ago

(In reply to Blake Kaplan (:mrbkap) from comment #35) > Comment on attachment 712859 [details] [diff] [review] > Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] > > Review of attachment 712859 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: ipc/unixsocket/UnixSocket.cpp > @@ +80,5 @@ > > return; > > } > > mTask->Cancel(); > > mTask = nullptr; > > + MutexAutoLock lock(mLock); > > This is not the correct way of using a lock. If we needed a lock (though the > patch in bug 845148 shows that it's not needed here) then the lock would > need to cover the entirety of the code that relies on having a consistent > value for the protected variables. The lock also misses mTask possibly being > assigned and read from multiple threads at the same time. I'm going to end > up nuking it in my backport for bug 845148. mlock only protects mCurrentTaskIsCanceled (and only mCurrentTaskisCanceled), which is signals to the IO thread (and only the IO thread) that the current task has been canceled from the main thread (and only the main thread). You need proper concurrency control for doing this.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 39

•

12 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #36) > This is going to need to be rebased for b2g18 uplift on top of bug 845148. Both patch sets overlap considerably. I cannot rebase the changes in these patches without risking serious regressions. Instead I'd propose to remove the changes for bug 845148 from b2g18, apply the fixes for this bug, and afterwards re-apply bug-845148 fixes.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 40

•

12 years ago

Attached patch Backout fix 845148 (b2g18) (obsolete) — Details — Splinter Review

Blake, could you re-port your patch set for bug 845148 to b2g18? That would bring both patch sets into the correct order, as bug 845148 is a fix for the patches here. Porting the socket changes is probably much easier now. You should be able to reuse the previous patch to BluetoothOppManager.

Attachment #722285 - Flags: review?(mrbkap)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 41

•

12 years ago

Attached patch Protect mCurrentTaskIsCanceled by lock [r=qdot,echou] (b2g18) (obsolete) — Details — Splinter Review

Attachment #722287 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 42

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (b2g18) (obsolete) — Details — Splinter Review

Attachment #722288 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 43

•

12 years ago

Attached patch Cleanup UnixSocketImpl from within I/O thread [r=qdot,echou] (b2g18) (obsolete) — Details — Splinter Review

Attachment #722291 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Updated

•

12 years ago

Attachment #722288 - Attachment description: Wait for incoming connections in UnixSocketImpl [r=qdot,echou] → Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (b2g18)

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 44

•

12 years ago

UNDO. UNDO. The "Wait for incoming connections in UnixSocketImpl" takes out the O_NONBLOCK set in SetNonblockingFlags, which not only causes us to lock on exit and completely negates the whole reason the function exists, god only knows what it's doing to the IOThread. Why is that gone? If you don't set the socket non-block, libevent isn't going to work at ALL.

Blake Kaplan (:mrbkap) (inactive)

Comment 45

•

12 years ago

(In reply to Thomas Zimmermann [:tzimmermann] from comment #38) > mlock only protects mCurrentTaskIsCanceled (and only > mCurrentTaskisCanceled), which is signals to the IO thread (and only the IO > thread) that the current task has been canceled from the main thread (and > only the main thread). You need proper concurrency control for doing this. As the patch in bug 845148 shows, the lock is actually unnecessary. It's much better to avoid shared mutable state when possible. The patch in that bug does so by using the already thread-safe event loops in order to serialize access to the various data structures. Here is why this lock is not useful: The reason that threads racing to set shared state is dangerous is to avoid inconsistent state when two states set the given value at the same time. Getting or setting a boolean is an atomic operation: it is either true or false, therefore only locking around the get or set operation does not guarantee anything that the underlying memory model doesn't already give you. The reason that the lock is not sufficient is this code: if (mTask) { return; } if (IsCanceled()) { return; } mTask = aTask; needs to assume that IsCanceled() cannot change after the check. Otherwise, we can end up in a state where mIsCanceled is true and mTask is not in a canceled state. To make the lock work, it would be more correct to hold it above the mTask check (mTask also needs to be read under the same lock) and release it before posting the task to the event loop. Then StopTask would need to hold it for the entirety of its body to avoid racing against EnqueueTask. Anyway, the approach in bug 845148 avoids all of this entirely by not having any shared mutable state. There cannot be any inconsistency because all accesses are properly serialized on all threads.

Blake Kaplan (:mrbkap) (inactive)

Comment 46

•

12 years ago

Comment on attachment 722285 [details] [diff] [review] Backout fix 845148 (b2g18) I'm not going to review this. The model on b2g18 is cleaner and better understood than what we'd be replacing it with. Bug 846615 has patches that sync b2g18 and m-c, which will make getting the rest of the bluetooth patches in easier everywhere.

Attachment #722285 - Flags: review?(mrbkap) → review-

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Comment 47

•

12 years ago

Ok, nevermind on comment 44. accept() got merged back in accidentally.

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 48

•

12 years ago

(In reply to Blake Kaplan (:mrbkap) from comment #45) The C++ memory model does not give you any guarantees about concurrent memory access. The compiler and processor are free to reorder operations, or split them up into smaller chunks, or not execute them at all. So even though a boolean is either true or false, you'd still need memory barriers to make sure that operations are processed in the correct order, or processed at all. Anyway, thanks for fixing the undefined memory access and cleaning up this code. From comment 46, I assume that I don't need to back-port the patches from this issue, right?

Blake Kaplan (:mrbkap) (inactive)

Comment 49

•

12 years ago

(In reply to Thomas Zimmermann [:tzimmermann] from comment #48) > Anyway, thanks for fixing the undefined memory access and cleaning up this > code. From comment 46, I assume that I don't need to back-port the patches > from this issue, right? I think that's true for most of the patches. Once bug 846615 lands on mozilla-central, the models on the two branches will mostly match each other except for patch 2 here ("Wait for incoming connections..."), bug 844705 and some naming differences. What should probably happen is that we should probably merge the names and the second patch here (assuming that comment 44 means that qDot is okay with it) and probably overwrite my changes in the Opp manager with the patch for bug 844705. At that point, the two branches should be merged. I'm going to re-mark this as FIXED since it's fixed on m-c and we can track the landing of the second patch with branch flags.

Status: REOPENED → RESOLVED

Closed: 12 years ago → 12 years ago

Resolution: --- → FIXED

Ryan VanderMeulen [:RyanVM]

Comment 50

•

12 years ago

If everyone's in agreement that b2g18 is where it needs to be, can you please set status-b2g18 to fixed on this, bug 842434, and bug 840544? Thanks. And please post branch-specific patches of whatever still needs to land otherwise (and bonus points for setting checkin-needed on the relevant bugs as this has become a tangled mess to follow).

Blake Kaplan (:mrbkap) (inactive)

Comment 51

•

12 years ago

Talking to Kyle on IRC, it sounds like he agrees with my plan in comment 49. Tim, would you mind wrangling the relevant patches and marking the bugs appropriately?

Ryan VanderMeulen [:RyanVM]

Comment 52

•

12 years ago

Any updates here?

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 53

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl (b2g18) (obsolete) — Details — Splinter Review

This is a port of the second patch to b2g18. The other fixes have already been ported as part of other patches.

Attachment #722285 - Attachment is obsolete: true

Attachment #722287 - Attachment is obsolete: true

Attachment #722288 - Attachment is obsolete: true

Attachment #722291 - Attachment is obsolete: true

Attachment #727773 - Flags: review?(kyle)

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 54

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl (b2g18) (obsolete) — Details — Splinter Review

Attachment #727773 - Attachment is obsolete: true

Attachment #727773 - Flags: review?(kyle)

Attachment #727881 - Flags: review?(kyle)

Attachment #727881 - Flags: review?(echou)

Kyle Machulis [:qdot] [:kmachulis] (INACTIVE)

Updated

•

12 years ago

Attachment #727881 - Flags: review?(kyle) → review+

Eric Chou [:ericchou] [:echou]

Comment 55

•

12 years ago

Comment on attachment 727881 [details] [diff] [review] Wait for incoming connections in UnixSocketImpl (b2g18) Review of attachment 727881 [details] [diff] [review]: ----------------------------------------------------------------- Looks good.

Attachment #727881 - Flags: review?(echou) → review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Comment 56

•

12 years ago

Attached patch Wait for incoming connections in UnixSocketImpl [r=qdot,echou] (b2g18) — Details — Splinter Review

Updated with review tags. Please uplift.

Attachment #727881 - Attachment is obsolete: true

Attachment #730611 - Flags: review+

Thomas Zimmermann [:tzimmermann] [:tdz]

Assignee

Updated

•

12 years ago

Keywords: checkin-needed

Ryan VanderMeulen [:RyanVM]

Comment 57

•

12 years ago

This doesn't have approval-b2g18+ or blocking status. Please get approval and re-request checkin.

status-b2g18: --- → affected

status-b2g18-v1.0.0: --- → wontfix

status-b2g18-v1.0.1: --- → wontfix

status-firefox20: --- → wontfix

status-firefox21: --- → fixed

Keywords: checkin-needed

Target Milestone: --- → B2G C4 (2jan on)

You need to log in before you can comment on or make changes to this bug.