Closed
Bug 924967
Opened 11 years ago
Closed 11 years ago
Use after free of *this at nsHttpChannel::OnStopRequest, nsLoadGroup::Cancel not thread-safe
Categories
(Core :: Networking: HTTP, defect)
Core
Networking: HTTP
Tracking
()
RESOLVED
FIXED
mozilla27
Tracking | Status | |
---|---|---|
firefox26 | --- | unaffected |
firefox27 | --- | fixed |
firefox-esr17 | --- | unaffected |
firefox-esr24 | --- | unaffected |
b2g18 | --- | unaffected |
People
(Reporter: mayhemer, Assigned: sworkman)
Details
(Keywords: csectype-uaf, sec-critical)
Attachments
(3 files, 2 obsolete files)
15.02 KB,
application/zip
|
Details | |
3.01 KB,
patch
|
sworkman
:
review+
|
Details | Diff | Splinter Review |
3.54 KB,
patch
|
mcmanus
:
review+
|
Details | Diff | Splinter Review |
In nsHttpChannel::OnStopRequest last references to the channel may be coming only from: - the transaction pump, that may be destroyed in OnStopRequest (mTransaction{*} = null) - the consumer, that may release the channel in mListener->OnStopRequest() After mListener->OnStopRequest then the channel can already be gone. In opt builds we get up to CloseCacheEntry() { ... mCacheEntry->AsyncDoom() } where it crashes. Snippet of the log: 2013-10-09 16:06:43.604000 UTC - 0[12c310]: nsHttpChannel::OnStopRequest [this=de482e0 request=de18248 status=804b0002] 2013-10-09 16:06:43.604000 UTC - 0[12c310]: nsHttpTransaction::DeleteSelfOnConsumerThread [this=de5eef0] >> transaction goes away (it's an already closed transaction, released from connman pools) 2013-10-09 16:06:43.604000 UTC - 0[12c310]: Destroying nsHttpTransaction @de5eef0 2013-10-09 16:06:43.604000 UTC - 0[12c310]: _OldCacheEntryWrapper::HasWriteAccess [this=de802b8, write-access=1] 2013-10-09 16:06:43.604000 UTC - 0[12c310]: nsHttpChannel::FinalizeCacheEntry [this=de482e0] 2013-10-09 16:06:43.604000 UTC - 0[12c310]: calling OnStopRequest >> channel is destroyed here! 2013-10-09 16:06:43.604000 UTC - 0[12c310]: Destroying nsHttpChannel [this=de482e0] 2013-10-09 16:06:43.604000 UTC - 7972[321e0e0]: CACHE: disk BindEntry [d7ce6e8 5abef39c] >> cache entry after the last ref released too 2013-10-09 16:06:43.604000 UTC - 0[12c310]: Destroying _OldCacheEntryWrapper de802b8 for descriptor de80a30 2013-10-09 16:06:43.604000 UTC - 7972[321e0e0]: CACHE: AddRecord [5abef39c] 2013-10-09 16:06:43.604000 UTC - 0[12c310]: Destroying HttpBaseChannel @de482e0 2013-10-09 16:06:43.604000 UTC - 7972[321e0e0]: CACHE: InvalidateCache 2013-10-09 16:06:43.604000 UTC - 0[12c310]: Destroying nsHttpConnectionInfo @dcff530 2013-10-09 16:06:43.604000 UTC - 7972[321e0e0]: CACHE: disk OpenOutputStreamForEntry [d7ce6e8 2 0] >> here we operate on freed |*this| 2013-10-09 16:06:43.604000 UTC - 0[12c310]: nsHttpChannel::CloseCacheEntry [this=de482e0] mStatus=804b0002 mCacheEntryIsWriteOnly=1 2013-10-09 16:06:43.604000 UTC - 7972[321e0e0]: CACHE: disk OnDataSizeChange [d7ce6e8 0] 2013-10-09 16:06:43.604000 UTC - 0[12c310]: dooming cache entry!! => crash... Reproduced by cycling |while ./mach mochitest-plain content/events/test/test_bug409604.html; do true; done| on an opt build (bug 922296) Solution: kungFuDeathGrip(this) over nsHttpChannel::OnStopRequest
Reporter | ||
Comment 1•11 years ago
|
||
Attachment #814949 -
Flags: review?(mcmanus)
Reporter | ||
Comment 2•11 years ago
|
||
Fixing this will fix the bug tree of https://bugzilla.mozilla.org/show_bug.cgi?id=924112 (Stupid bugzilla syncs the |See also| field... grrr!!!!)
Updated•11 years ago
|
Attachment #814949 -
Flags: review?(mcmanus) → review+
Reporter | ||
Comment 3•11 years ago
|
||
I was running the same test loop with this patch and it still crashes. I also realized that the transaction pump is held by the callback event utility class, so it cannot go away. The chain should live until the bottom of the stack, so there shouldn't be any crash. We apparently somewhere call release twice on something. I have to find out where. It still could be something with the new cache.
Reporter | ||
Updated•11 years ago
|
Attachment #814949 -
Attachment is obsolete: true
Attachment #814949 -
Flags: review+
Reporter | ||
Comment 4•11 years ago
|
||
New clue: we call nsHttpChannel::Cancel() on two threads [main/NS_BINDING_ABORTED and html5-parser/NS_ERROR_FAILURE] at almost the same time - with high probability not a safe operation. Adding Steve, this could be an ODA on non-mainthread issue.
Reporter | ||
Updated•11 years ago
|
Status: ASSIGNED → NEW
Reporter | ||
Comment 5•11 years ago
|
||
Err: not the html-5 parser, but the image parser (the url is a .png file). Related to bug 867755 ?
Reporter | ||
Comment 6•11 years ago
|
||
Few data: - NS_ERROR_FAILURE cancel is coming from imgRequest::ODA since ImageFactory::CreateImage fails - NS_BINDING_ABORTED cancel is coming as the window is being closed (comes from docshell->loadgroup) // this is the last place I haven't look at, it is probably where the double release happens since in non-crashing logs I don't see this cancel - doesn't matter which of the two cancels is called sooner, both cases crash // supports the theory - the concurrent cancel on the channel is caught by |if (mCanceled) return| check - the listener is imgRequest that releases the channel in imgRequest::OnStopRequest (that is in the bad case the last reference) - the underlying input stream pump is alive at the moment of the crash (dtor not logged - I've added that log locally)
Assignee | ||
Comment 7•11 years ago
|
||
Caught up reading about this, and had to do a clobber build to investigate more. The Cancel from imgRequest::ODA is what I was going to look for, so that's good. I think it's similar behavior to test_draggableprop.html, in which an image (happy.png) is referenced but doesn't exist. So, Cancel gets called. I don't yet have any suggestions about why there are two releases, but it's good to have a codepath to start with.
Reporter | ||
Comment 8•11 years ago
|
||
Got it: I think this will be cleaner from the two callstacks (that happen simultaneously on two threads:) Image decoder xul.dll!mozilla::net::nsHttpChannel::Cancel(tag_nsresult status=NS_ERROR_FAILURE) Line 4309 C++ > xul.dll!nsLoadGroup::Cancel(tag_nsresult status=NS_ERROR_FAILURE) Line 290 + 0x18 bytes C++ xul.dll!imgRequest::Cancel(tag_nsresult aStatus=NS_ERROR_FAILURE) Line 269 C++ xul.dll!imgRequest::OnDataAvailable(nsIRequest * aRequest=0x0eb419b4, nsISupports * ctxt=0x00000000, nsIInputStream * inStr=0x15796a00, unsigned __int64 sourceOffset=0, unsigned int count=391) Line 800 C++ Main thread > xul.dll!mozilla::net::nsHttpChannel::Cancel(tag_nsresult status=NS_BINDING_ABORTED) Line 4309 C++ xul.dll!nsLoadGroup::Cancel(tag_nsresult status=NS_BINDING_ABORTED) Line 290 + 0x18 bytes C++ xul.dll!imgRequest::Cancel(tag_nsresult aStatus=NS_BINDING_ABORTED) Line 269 C++ xul.dll!imgRequest::RemoveProxy(imgRequestProxy * proxy=0x156f60c0, tag_nsresult aStatus=NS_BINDING_ABORTED) Line 222 C++ xul.dll!imgRequestProxy::DoCancel(tag_nsresult status=NS_BINDING_ABORTED) Line 312 C++ xul.dll!imgRequestProxy::imgCancelRunnable::Run() Line 130 C++ The important part here is that imgRequest->mRequest *is a load group* that does nasty NS_RELEASE() calls on the channel. nsLoadGroup::Cancel is simply not thread safe.
Reporter | ||
Updated•11 years ago
|
Summary: Use after free of *this at nsHttpChannel::OnStopRequest → Use after free of *this at nsHttpChannel::OnStopRequest, nsLoadGroup::Cancel not thread-safe
Comment 9•11 years ago
|
||
all cancel methods should probably assert main threadedness. (loadgroup and channel too)
Assignee | ||
Comment 10•11 years ago
|
||
Honza, I know you'd prefer making nsLoadGroup::Cancel threadsafe, but after a quick chat with Pat it looks like we'd need to hunt down every implementation of nsIRequest to make this truly threadsafe. So, dispatching to the main thread may be the fastest, simplest option here. I've set f? on this patch not so much for the code, but more to see if you can reproduce the crash with it (I hope not!). I tried reproducing without the patch locally on my own machine (Mac OSX 10.8), but nothing yet. I'll keep trying, but since you were able to reproduce it already, it seems like you're in a better place to test the patch. If it does seem to solve the issue, I'll add in Pat's suggestion to assert main thread only on the Cancel methods. Hopefully that won't crash anything else.
Assignee: honzab.moz → sworkman
Status: NEW → ASSIGNED
Attachment #815653 -
Flags: feedback?(honzab.moz)
Assignee | ||
Comment 11•11 years ago
|
||
Also, I'm taking PTO on Friday/tomorrow, so feel free to take the bug back if you want it done super quick. Otherwise, I'll get back to it on Monday.
Reporter | ||
Comment 12•11 years ago
|
||
Comment on attachment 815653 [details] [diff] [review] v1.0 Dispatch nsIRequest::Cancel to main thread in imgRequest::Cancel No crashes. I'm also off today. An image peer must carefully review this.
Attachment #815653 -
Flags: feedback?(honzab.moz) → feedback+
Assignee | ||
Comment 13•11 years ago
|
||
No crashes == awesome.
Assignee | ||
Comment 14•11 years ago
|
||
Comment on attachment 815653 [details] [diff] [review] v1.0 Dispatch nsIRequest::Cancel to main thread in imgRequest::Cancel Seth: Honza discovered that nsIRequest::Cancel was being called in imgRequest::Cancel OFF the main thread, and concurrently ON the main thread, resulting in two ptr releases and a crash. Converting all the implementations of nsIRequest to be threadsafe is not really feasible, so this patch dispatches the call to the main thread. Can you take a look and see if you can think of any problems with this? FYI: Here are the results from try: https://tbpl.mozilla.org/?tree=Try&rev=71b2892eac3d
Attachment #815653 -
Flags: review?(seth)
Reporter | ||
Comment 15•11 years ago
|
||
The problem is really not in how each nsIRequest::Cancel is implemented, but in particular only in how nsLoadGroup::RemoveRequest is implemented. However, safer and more complete solution can be here to post to the main thread (however, if that would delay calling OnStopRequest for consumers that has redirected to a background thread, then I'd be against - it would be just a performance regression and ruin the ODA off main thread work).
Comment 16•11 years ago
|
||
Comment on attachment 815653 [details] [diff] [review] v1.0 Dispatch nsIRequest::Cancel to main thread in imgRequest::Cancel Review of attachment 815653 [details] [diff] [review]: ----------------------------------------------------------------- Looks good to me! r+ with the PR_Sleep gone. ::: image/src/imgRequest.cpp @@ +260,5 @@ > + { > + MOZ_ASSERT(NS_IsMainThread(), "I should be running on the main thread!"); > + mImgRequest->ContinueCancel(mStatus); > + return NS_OK; > + } Whitespace. @@ +290,5 @@ > +{ > + MOZ_ASSERT(NS_IsMainThread()); > + > + RemoveFromCache(); > + Whitespace. @@ +825,5 @@ > if (mImage->HasError() && !mIsMultiPartChannel) { // Probably bad mimetype > // We allow multipart images to fail to initialize without cancelling the > // load because subsequent images might be fine; thus only single part > // images end up here. > + PR_Sleep(PR_MillisecondsToInterval(1000)); Is this just for debugging? ::: image/src/imgRequest.h @@ +73,5 @@ > void CancelAndAbort(nsresult aStatus); > > + // Called or dispatched by cancel for main thread only execution. > + void ContinueCancel(nsresult aStatus); > + Whitespace.
Attachment #815653 -
Flags: review?(seth) → review+
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to Seth Fowler [:seth] from comment #16) > Comment on attachment 815653 [details] [diff] [review] > v1.0 Dispatch nsIRequest::Cancel to main thread in imgRequest::Cancel > > Review of attachment 815653 [details] [diff] [review]: > ----------------------------------------------------------------- > > Looks good to me! r+ with the PR_Sleep gone. Oops! > @@ +825,5 @@ > > if (mImage->HasError() && !mIsMultiPartChannel) { // Probably bad mimetype > > // We allow multipart images to fail to initialize without cancelling the > > // load because subsequent images might be fine; thus only single part > > // images end up here. > > + PR_Sleep(PR_MillisecondsToInterval(1000)); > > Is this just for debugging? It was; I didn't even use it because Honza figured it out before I got the change to run it :)
Assignee | ||
Comment 18•11 years ago
|
||
Attachment #815653 -
Attachment is obsolete: true
Attachment #817512 -
Flags: review+
Assignee | ||
Comment 19•11 years ago
|
||
Pat, you mentioned having main thread assertions in the loadgroup and the channel. I added some to nsLoadGroup, nsHttpChannel, HttpChannelChild and nsInputStreamPump. I know there are more implemented nsIRequest sub-classes, but this seems like a decent core set to start with. Let me know what you think. try: https://tbpl.mozilla.org/?tree=Try&rev=87dc59a0bd33
Attachment #817513 -
Flags: review?(mcmanus)
Assignee | ||
Comment 20•11 years ago
|
||
Comment on attachment 817512 [details] [diff] [review] v1.1 Dispatch nsIRequest::Cancel to main thread in imgRequest::Cancel https://hg.mozilla.org/integration/mozilla-inbound/rev/dd2957ba1805
Assignee | ||
Comment 21•11 years ago
|
||
Setting [leave open] so I can add the main thread assertions once Pat gets a chance to look at it.
Whiteboard: [leave open]
Comment 22•11 years ago
|
||
fixed in mozilla central now https://hg.mozilla.org/mozilla-central/rev/dd2957ba1805https://hg.mozilla.org/mozilla-central/rev/dd2957ba1805
Updated•11 years ago
|
Keywords: csec-uaf,
sec-critical
Comment 23•11 years ago
|
||
Comment on attachment 817513 [details] [diff] [review] v1.0 Add main thread assertions for Cancel in nsLoadGroup, nsHttpChannel and nsInputStreamPump Review of attachment 817513 [details] [diff] [review]: ----------------------------------------------------------------- these are a good idea - here's an implementation of nsirequest in javascript - https://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/app/AppProtocolHandler.js#101 .. it has an empty cancel method, but clearly that's not going to work too well if called off the main thread either :) it would be nice to make this free threaded, but that's a different (bigger) patch. ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +917,5 @@ > NS_IMETHODIMP > HttpChannelChild::Cancel(nsresult status) > { > + MOZ_ASSERT(NS_IsMainThread()); > + whitespace issue
Attachment #817513 -
Flags: review?(mcmanus) → review+
Assignee | ||
Comment 24•11 years ago
|
||
Thanks Pat! https://hg.mozilla.org/integration/mozilla-inbound/rev/4fcc79a6234b
Whiteboard: [leave open]
Comment 25•11 years ago
|
||
Does this affect 25 and 26? How about ESR24? If so, we probably want an Aurora patch even though it is too late for the Beta branch.
Assignee | ||
Comment 26•11 years ago
|
||
Thanks for checking Al, but this should only affect 27. It was caused by bug 867755, which is targeted for 27.
Comment 27•11 years ago
|
||
Thanks, Steve. Flagging appropriately.
status-b2g18:
--- → unaffected
status-firefox26:
--- → unaffected
status-firefox-esr17:
--- → unaffected
status-firefox-esr24:
--- → unaffected
Comment 28•11 years ago
|
||
fixed on central https://hg.mozilla.org/mozilla-central/rev/4fcc79a6234b
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Group: network-core-security → core-security
Updated•9 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•