Closed Bug 560579 Opened 14 years ago Closed 13 years ago

bitfield-induced race in class nsHttpTransaction

Categories

(Core :: Networking: HTTP, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla5

People

(Reporter: jseward, Assigned: jseward)

Details

Attachments

(1 file, 1 obsolete file)

NOTE, this bug is currently unconfirmed.  See
https://bugzilla.mozilla.org/show_bug.cgi?id=551155#c12
for background and rationale.


HOWTO REPRO: difficult, am looking for ways to reproduce reliably.
Start up Fx on Helgrind and go to news.bbc.co.uk; but only happens
occasionally.


netwerk/protocol/http/src/nsHttpTransaction.h defines class
nsHttpTransaction.  This contains a number of bitfield state flags
(mClosed .. mSSLConnectFailed) which are accessed from different
threads, leading to a race between apparently-unrelated bitfields.

netwerk/protocol/http/src/nsHttpTransaction.h
(R)  nsHttpTransaction::SSLConnectFailed
134    PRBool    SSLConnectFailed() { return mSSLConnectFailed; }

netwerk/protocol/http/src/nsHttpTransaction.cpp
(W)  nsHttpTransaction::WritePipeSegment
509    trans->mReceivedData = PR_TRUE;

So a read of ::mSSLConnectFailed races against a later write of
::mReceivedData.


Inspection of netwerk/protocol/http/src/nsHttpTransaction.h shows
these two fields live in the same byte:

192    // state flags
193    PRUint32                        mClosed             : 1;
194    PRUint32                        mConnected          : 1;
195    PRUint32                        mHaveStatusLine     : 1;
196    PRUint32                        mHaveAllHeaders     : 1;
197    PRUint32                        mTransactionDone    : 1;
198    PRUint32                        mResponseIsComplete : 1;
199    PRUint32                        mDidContentStart    : 1;
200    PRUint32                        mNoContent          : 1; // expecting an empty entity body

201    PRUint32                        mSentData           : 1;
202    PRUint32                        mReceivedData       : 1;
203    PRUint32                        mStatusEventPending : 1;
204    PRUint32                        mHasRequestBody     : 1;
205    PRUint32                        mSSLConnectFailed   : 1;


In this case I guess this isn't harmful, because the read-modify-write
of mSSLConnectFailed caused by the assignment to mReceivedData doesn't
change the value of mSSLConnectFailed.  But if both fields were
assigned then there would be two uncoordinated R-M-W cycles for the
containing byte and a small window in which the effects of one of
the assignments could be lost.

ATM the main difficulty in diagnosing/fixing this is the problem
of reliably reproducing it.
Possible data race during write of size 1 at 0x209cb232 by thread #2
   at 0x12D7B26A: nsHttpTransaction::WritePipeSegment(nsIOutputStream*,
                  void*, char*, unsigned int, unsigned int, unsigned int*)
                  (nsHttpTransaction.cpp:509)
   by 0x5861046: nsPipeOutputStream::WriteSegments(unsigned int
                 (*)(nsIOutputStream*, void*, char*, unsigned int, unsigned
                 int, unsigned int*), void*, unsigned int, unsigned int*)
                 (nsPipe3.cpp:1137)
   by 0x12D7B5A2: nsHttpTransaction::WriteSegments(nsAHttpSegmentWriter*,
                  unsigned int, unsigned int*) (nsHttpTransaction.cpp:531)
   by 0x12D70B71: nsHttpConnection::OnSocketReadable()
                  (nsHttpConnection.cpp:664)
   by 0x12D70BF7: nsHttpConnection::OnInputStreamReady(nsIAsyncInputStream*)
                  (nsHttpConnection.cpp:763)
   by 0x12D2616F: nsSocketInputStream::OnSocketReady(unsigned int)
                  (nsSocketTransport2.cpp:256)
   by 0x12D262D8: nsSocketTransport::OnSocketReady(PRFileDesc*, short)
                  (nsSocketTransport2.cpp:1519)
   by 0x12D28F4B: nsSocketTransportService::DoPollIteration(int)
                  (nsSocketTransportService2.cpp:674)
   by 0x12D2911B: nsSocketTransportService::OnProcessNextEvent(
                  nsIThreadInternal*, int, unsigned int)
                  (nsSocketTransportService2.cpp:539)
   by 0x5879BD6: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:508)
   by 0x5841B16: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
   by 0x12D2954B: nsSocketTransportService::Run()
                  (nsSocketTransportService2.cpp:581)

 This conflicts with a previous read of size 1 by thread #1
   at 0x12D8FD80: nsHttpTransaction::SSLConnectFailed()
                  (nsHttpTransaction.h:134)
   by 0x12D8F72E: nsHttpChannel::ProcessResponse() (nsHttpChannel.cpp:957)
   by 0x12D8FB73: nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*)
                  (nsHttpChannel.cpp:5182)
   by 0x12D11541: nsInputStreamPump::OnStateStart() (nsInputStreamPump.cpp:439)
   by 0x12D11874: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*)
                  (nsInputStreamPump.cpp:395)
   by 0x58622E3: nsInputStreamReadyEvent::Run() (nsStreamUtils.cpp:112)
   by 0x5879C2C: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527)
   by 0x5841B16: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
   by 0x155C2606: nsBaseAppShell::Run() (nsBaseAppShell.cpp:177)

 Address 0x209cb232 is 226 bytes inside a block of size 232 alloc'd
   at 0x4C25B08: malloc (vg_replace_malloc.c:236)
   by 0x5AD1F75: moz_xmalloc (mozalloc.cpp:75)
   by 0x12D8A8E8: nsHttpChannel::SetupTransaction() (mozalloc.h:222)
   by 0x12D8AD6E: nsHttpChannel::Connect(int) (nsHttpChannel.cpp:358)
   by 0x12D8EC63: nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*)
                  (nsHttpChannel.cpp:4491)
   by 0x12D8F294: nsHttpChannel::ProcessRedirection(unsigned int)
                  (nsHttpChannel.cpp:3011)
   by 0x12D8F8F2: nsHttpChannel::ProcessResponse() (nsHttpChannel.cpp:1019)
   by 0x12D8FB73: nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*)
                  (nsHttpChannel.cpp:5182)
   by 0x12D11541: nsInputStreamPump::OnStateStart() (nsInputStreamPump.cpp:439)
   by 0x12D11874: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*)
                 (nsInputStreamPump.cpp:395)
   by 0x58622E3: nsInputStreamReadyEvent::Run() (nsStreamUtils.cpp:112)
   by 0x5879C2C: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527)
Aggregated results over about 40 runs loading a couple of hundred web
pages produces the following set of conflicts:

R mSSLConnectFailed (134h)  vs  W mReceivedData (509)
R mHaveAllHeaders   (319)   vs  W mConnected (587)
R mHaveAllHeaders   (319)   vs  W mClosed (638)
R mHaveAllHeaders   (319)   vs  W mTransactionDone (978)
R mHaveAllHeaders   (319)   vs  W mTransactionDone (637)
R mHaveAllHeaders   (319)   vs  W mResponseIsComplete (979)

where (X) means line X in nsHttpTransaction.cpp.  In each case
examination of the conflicting field pairs shows them to be within the
same byte, which is because gcc generates code to modify the fields
using 8-bit read-modify-writes.  If it had instead used 16- or 32-bit
loads and stores then perhaps more conflicts would be reported.

This is quite subtle.  A R-vs-W conflict isn't harmful (here) since
the write merely writes back unaffected bits unchanged.  Whereas a
W-vs-W conflict (of different fields) would be dangerous.

All the observed conflicts are of the R-vs-W form.  So I haven't
actually managed to demonstrate anything dangerous happening.  OTOH to
show that it is always safe would require proving that only one thread
writes the fields, and all the other threads only read them.  Which
sounds pretty unlikely to me.
With this in place I can no longer observe any of the races
listed in comment 2.
Attachment #441200 - Flags: review?(cbiesinger)
Comment on attachment 441200 [details] [diff] [review]
de-bitfield-ise state flags in class nsHttpTransaction

PRUint8 -> PRPackedBool
Attachment #441200 - Flags: review?(cbiesinger) → review-
Attachment #441200 - Attachment is obsolete: true
Attachment #521673 - Flags: review?(cbiesinger)
this is a good find.

It's going to cause me pain with patch conflicts, but a good find none the less ;)
> It's going to cause me pain with patch conflicts,

It's an ultra-trivial patch; did you look at it?
(In reply to comment #7)
> > It's going to cause me pain with patch conflicts,
> 
> It's an ultra-trivial patch; did you look at it?

I just meant auto rejection - I have outstanding patches that add new flags. I can handle it ok :)
Comment on attachment 521673 [details] [diff] [review]
using PRPackedBool instead

sorry for the delay!
Attachment #521673 - Flags: review?(cbiesinger) → review+
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: checkin-needed
Assignee: nobody → jseward
http://hg.mozilla.org/mozilla-central/rev/8b2ff9fc5320
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-cedar
Target Milestone: --- → mozilla2.2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: