Closed
Bug 560579
Opened 14 years ago
Closed 13 years ago
bitfield-induced race in class nsHttpTransaction
Categories
(Core :: Networking: HTTP, defect)
Tracking
()
RESOLVED
FIXED
mozilla5
People
(Reporter: jseward, Assigned: jseward)
Details
Attachments
(1 file, 1 obsolete file)
2.53 KB,
patch
|
Biesinger
:
review+
|
Details | Diff | Splinter Review |
NOTE, this bug is currently unconfirmed. See https://bugzilla.mozilla.org/show_bug.cgi?id=551155#c12 for background and rationale. HOWTO REPRO: difficult, am looking for ways to reproduce reliably. Start up Fx on Helgrind and go to news.bbc.co.uk; but only happens occasionally. netwerk/protocol/http/src/nsHttpTransaction.h defines class nsHttpTransaction. This contains a number of bitfield state flags (mClosed .. mSSLConnectFailed) which are accessed from different threads, leading to a race between apparently-unrelated bitfields. netwerk/protocol/http/src/nsHttpTransaction.h (R) nsHttpTransaction::SSLConnectFailed 134 PRBool SSLConnectFailed() { return mSSLConnectFailed; } netwerk/protocol/http/src/nsHttpTransaction.cpp (W) nsHttpTransaction::WritePipeSegment 509 trans->mReceivedData = PR_TRUE; So a read of ::mSSLConnectFailed races against a later write of ::mReceivedData. Inspection of netwerk/protocol/http/src/nsHttpTransaction.h shows these two fields live in the same byte: 192 // state flags 193 PRUint32 mClosed : 1; 194 PRUint32 mConnected : 1; 195 PRUint32 mHaveStatusLine : 1; 196 PRUint32 mHaveAllHeaders : 1; 197 PRUint32 mTransactionDone : 1; 198 PRUint32 mResponseIsComplete : 1; 199 PRUint32 mDidContentStart : 1; 200 PRUint32 mNoContent : 1; // expecting an empty entity body 201 PRUint32 mSentData : 1; 202 PRUint32 mReceivedData : 1; 203 PRUint32 mStatusEventPending : 1; 204 PRUint32 mHasRequestBody : 1; 205 PRUint32 mSSLConnectFailed : 1; In this case I guess this isn't harmful, because the read-modify-write of mSSLConnectFailed caused by the assignment to mReceivedData doesn't change the value of mSSLConnectFailed. But if both fields were assigned then there would be two uncoordinated R-M-W cycles for the containing byte and a small window in which the effects of one of the assignments could be lost. ATM the main difficulty in diagnosing/fixing this is the problem of reliably reproducing it.
Assignee | ||
Comment 1•14 years ago
|
||
Possible data race during write of size 1 at 0x209cb232 by thread #2 at 0x12D7B26A: nsHttpTransaction::WritePipeSegment(nsIOutputStream*, void*, char*, unsigned int, unsigned int, unsigned int*) (nsHttpTransaction.cpp:509) by 0x5861046: nsPipeOutputStream::WriteSegments(unsigned int (*)(nsIOutputStream*, void*, char*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:1137) by 0x12D7B5A2: nsHttpTransaction::WriteSegments(nsAHttpSegmentWriter*, unsigned int, unsigned int*) (nsHttpTransaction.cpp:531) by 0x12D70B71: nsHttpConnection::OnSocketReadable() (nsHttpConnection.cpp:664) by 0x12D70BF7: nsHttpConnection::OnInputStreamReady(nsIAsyncInputStream*) (nsHttpConnection.cpp:763) by 0x12D2616F: nsSocketInputStream::OnSocketReady(unsigned int) (nsSocketTransport2.cpp:256) by 0x12D262D8: nsSocketTransport::OnSocketReady(PRFileDesc*, short) (nsSocketTransport2.cpp:1519) by 0x12D28F4B: nsSocketTransportService::DoPollIteration(int) (nsSocketTransportService2.cpp:674) by 0x12D2911B: nsSocketTransportService::OnProcessNextEvent( nsIThreadInternal*, int, unsigned int) (nsSocketTransportService2.cpp:539) by 0x5879BD6: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:508) by 0x5841B16: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) by 0x12D2954B: nsSocketTransportService::Run() (nsSocketTransportService2.cpp:581) This conflicts with a previous read of size 1 by thread #1 at 0x12D8FD80: nsHttpTransaction::SSLConnectFailed() (nsHttpTransaction.h:134) by 0x12D8F72E: nsHttpChannel::ProcessResponse() (nsHttpChannel.cpp:957) by 0x12D8FB73: nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*) (nsHttpChannel.cpp:5182) by 0x12D11541: nsInputStreamPump::OnStateStart() (nsInputStreamPump.cpp:439) by 0x12D11874: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (nsInputStreamPump.cpp:395) by 0x58622E3: nsInputStreamReadyEvent::Run() (nsStreamUtils.cpp:112) by 0x5879C2C: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527) by 0x5841B16: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) by 0x155C2606: nsBaseAppShell::Run() (nsBaseAppShell.cpp:177) Address 0x209cb232 is 226 bytes inside a block of size 232 alloc'd at 0x4C25B08: malloc (vg_replace_malloc.c:236) by 0x5AD1F75: moz_xmalloc (mozalloc.cpp:75) by 0x12D8A8E8: nsHttpChannel::SetupTransaction() (mozalloc.h:222) by 0x12D8AD6E: nsHttpChannel::Connect(int) (nsHttpChannel.cpp:358) by 0x12D8EC63: nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*) (nsHttpChannel.cpp:4491) by 0x12D8F294: nsHttpChannel::ProcessRedirection(unsigned int) (nsHttpChannel.cpp:3011) by 0x12D8F8F2: nsHttpChannel::ProcessResponse() (nsHttpChannel.cpp:1019) by 0x12D8FB73: nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*) (nsHttpChannel.cpp:5182) by 0x12D11541: nsInputStreamPump::OnStateStart() (nsInputStreamPump.cpp:439) by 0x12D11874: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (nsInputStreamPump.cpp:395) by 0x58622E3: nsInputStreamReadyEvent::Run() (nsStreamUtils.cpp:112) by 0x5879C2C: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527)
Assignee | ||
Comment 2•14 years ago
|
||
Aggregated results over about 40 runs loading a couple of hundred web pages produces the following set of conflicts: R mSSLConnectFailed (134h) vs W mReceivedData (509) R mHaveAllHeaders (319) vs W mConnected (587) R mHaveAllHeaders (319) vs W mClosed (638) R mHaveAllHeaders (319) vs W mTransactionDone (978) R mHaveAllHeaders (319) vs W mTransactionDone (637) R mHaveAllHeaders (319) vs W mResponseIsComplete (979) where (X) means line X in nsHttpTransaction.cpp. In each case examination of the conflicting field pairs shows them to be within the same byte, which is because gcc generates code to modify the fields using 8-bit read-modify-writes. If it had instead used 16- or 32-bit loads and stores then perhaps more conflicts would be reported. This is quite subtle. A R-vs-W conflict isn't harmful (here) since the write merely writes back unaffected bits unchanged. Whereas a W-vs-W conflict (of different fields) would be dangerous. All the observed conflicts are of the R-vs-W form. So I haven't actually managed to demonstrate anything dangerous happening. OTOH to show that it is always safe would require proving that only one thread writes the fields, and all the other threads only read them. Which sounds pretty unlikely to me.
Assignee | ||
Comment 3•14 years ago
|
||
With this in place I can no longer observe any of the races listed in comment 2.
Attachment #441200 -
Flags: review?(cbiesinger)
Comment 4•14 years ago
|
||
Comment on attachment 441200 [details] [diff] [review] de-bitfield-ise state flags in class nsHttpTransaction PRUint8 -> PRPackedBool
Attachment #441200 -
Flags: review?(cbiesinger) → review-
Assignee | ||
Comment 5•13 years ago
|
||
Attachment #441200 -
Attachment is obsolete: true
Assignee | ||
Updated•13 years ago
|
Attachment #521673 -
Flags: review?(cbiesinger)
Comment 6•13 years ago
|
||
this is a good find. It's going to cause me pain with patch conflicts, but a good find none the less ;)
Assignee | ||
Comment 7•13 years ago
|
||
> It's going to cause me pain with patch conflicts,
It's an ultra-trivial patch; did you look at it?
Comment 8•13 years ago
|
||
(In reply to comment #7) > > It's going to cause me pain with patch conflicts, > > It's an ultra-trivial patch; did you look at it? I just meant auto rejection - I have outstanding patches that add new flags. I can handle it ok :)
Comment 9•13 years ago
|
||
Comment on attachment 521673 [details] [diff] [review] using PRPackedBool instead sorry for the delay!
Attachment #521673 -
Flags: review?(cbiesinger) → review+
Updated•13 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Updated•13 years ago
|
Assignee: nobody → jseward
Comment 10•13 years ago
|
||
http://hg.mozilla.org/projects/cedar/rev/798776deb69d
Keywords: checkin-needed
Whiteboard: fixed-in-cedar
Comment 11•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/8b2ff9fc5320
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-cedar
Target Milestone: --- → mozilla2.2
You need to log in
before you can comment on or make changes to this bug.
Description
•