Closed Bug 1011354 Opened 11 years ago Closed 11 years ago

crash in mozilla::net::nsHttpChannel::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult)

Categories

(Core :: Networking, defect)

x86
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox31 --- wontfix
firefox32 --- wontfix
firefox33 + fixed
firefox34 + fixed
firefox35 + fixed
firefox-esr31 33+ fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.0M --- fixed
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: davidb, Assigned: valentin)

References

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main33+][adv-esr31.2+][b2g-adv-main2.2+])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-57a3cfc1-a538-4c0b-baf8-d75182140515. ============================================================= Note: automated exploitability analysis rated this high.
The crash address 0x5a5a5a62 makes this look like a use-after-free
Also found this report [1] (searched for CacheEntry in the signature), has Crash Address 0x5a5a5a5e. Seems like CacheEntry here is a bogus thanks code sharing. When looking at the line in nsHttpChannel, seems like the connection on the transaction is bad. [1] https://crash-stats.mozilla.com/report/index/3af1765d-9551-4939-a6f1-0cb4c2140515
(In reply to Honza Bambas (:mayhemer) from comment #2) > Also found this report [1] (searched for CacheEntry in the signature), has > Crash Address 0x5a5a5a5e. Seems like CacheEntry here is a bogus thanks code > sharing. When looking at the line in nsHttpChannel, seems like the > connection on the transaction is bad. > > [1] > https://crash-stats.mozilla.com/report/index/3af1765d-9551-4939-a6f1- > 0cb4c2140515 It's a crash on beta and on only one build ever...
Honza, what is the next step here? It is marked sec-high.
Flags: needinfo?(honzab.moz)
Forwarding to Patrick, it's more his area now.
Flags: needinfo?(honzab.moz) → needinfo?(mcmanus)
I'm running out of ideas here. cc: valentin for a fresh set of eyes.
On PTO for a couple more days. Will try to figure it out then.
From what Patrick and I could figure out, it is possible that there is a race between gHttpHandler->CancelTransaction(mTransaction, status) which goes to the socket thread, and mTransaction->Connection() which is called on the main thread. I have guarded NS_RELEASE(mConnection), and mConnection's getters and setters with a mutex, per Patrick's suggestion. This should take care of the problem, unless there's some other issue we're missing.
Attachment #8491614 - Flags: review?(mcmanus)
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
Comment on attachment 8491614 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection Review of attachment 8491614 [details] [diff] [review]: ----------------------------------------------------------------- This is solving a real problem - thanks! its a little speculative as to whether or not its the cause of this crash (or perhaps one cause of several) - but the good news is there are several nightly crashes a day with this signature, so we ought to be able to get a fairly quick read on whether it helps or not. r=mcmanus ::: netwerk/protocol/http/nsHttpTransaction.cpp @@ +152,5 @@ > // Force the callbacks to be released right now > mCallbacks = nullptr; > > + { > + MutexAutoLock lock(mConnectionLock); we don't need the lock in the dtor.. you can't be calling a method on an object that is simultaneously being destroyed (and that's why the channel holds a hard ref on the transaction) @@ +453,5 @@ > > void > nsHttpTransaction::SetConnection(nsAHttpConnection *conn) > { > + MutexAutoLock lock(mConnectionLock); let's scope this to just the two RELEASE/ADDREF operations.. I don't like locks held when stuff is pushed onto the stack if we can avoid it
Attachment #8491614 - Flags: review?(mcmanus) → review+
Attachment #8491614 - Attachment is obsolete: true
Comment on attachment 8491614 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection Review of attachment 8491614 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/nsHttpTransaction.cpp @@ +391,1 @@ > return mConnection; imagine this: T1: nsRefPtr<nsAHttpConnection> t = trans->Connection(); while t->mConnection refcnt == 1 T1 switches between t.mRawPtr = result and t->AddRef() T2: t->SetConnection(something-else) t->Release() => delete t T1: switches back to execution t->AddRef() => CRASH This method must return already_AddRefed<>
Attachment #8491614 - Attachment is obsolete: false
Attachment #8491614 - Flags: review-
BTW, could we reuse (and rename) the mCallbacksLock? This class is instantiated in bursts and having two locks when not needed is IMO wasting.
(In reply to Honza Bambas (:mayhemer) from comment #13) > Comment on attachment 8491614 [details] [diff] [review] > > This method must return already_AddRefed<> Connection is defined in nsAHttpTransaction to return a raw pointer. Should I change the abstract class and all of the ones extending it? > BTW, could we reuse (and rename) the mCallbacksLock? This class is > instantiated in bursts and having two locks when not needed is IMO wasting. Sounds good.
(In reply to Honza Bambas (:mayhemer) from comment #13) > > This method must return already_AddRefed<> good point
(In reply to Honza Bambas (:mayhemer) from comment #14) > BTW, could we reuse (and rename) the mCallbacksLock? This class is > instantiated in bursts and having two locks when not needed is IMO wasting. I actually prefer the finer locking granularity of 2 locks.. doubt it matters much in this case. either way I'm fine.
(In reply to Valentin Gosu [:valentin] from comment #15) > Connection is defined in nsAHttpTransaction to return a raw pointer. Should > I change the abstract class and all of the ones extending it? Sure.
Attachment #8491614 - Attachment is obsolete: true
Attachment #8491731 - Attachment is obsolete: true
Comment on attachment 8491864 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection Review of attachment 8491864 [details] [diff] [review]: ----------------------------------------------------------------- I think you need to cover SetConnection() with the lock too - that's still called by the channel. also let's see a try run - this should be fine to put through try without a bug number.
Attachment #8491864 - Flags: review?(mcmanus)
https://tbpl.mozilla.org/?tree=Try&rev=1ddd4a743712 Also, unless I'm missing the point, we do lock the mutex in SetConnection as well.
(In reply to Valentin Gosu [:valentin] from comment #21) > > Also, unless I'm missing the point, we do lock the mutex in SetConnection as > well. ack
Attachment #8491864 - Flags: review+
Comment on attachment 8491864 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection [Security approval request comment] How easily could an exploit be constructed based on the patch? Not easily, since it the bug seems to be a rare race condition. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No, but someone looking at this patch and at the soccoro reports could figure out that they are related. Which older supported branches are affected by this flaw? All branches are affected by this flaw. The oldest crash with this stack trace that I've found was Fx19. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The patch can be applied on all trees between release-inbound with no conflicts. How likely is this patch to cause regressions; how much testing does it need? Unlikely to cause regressions, as the change only touches the way we access the mConnection pointer, now thread safely. The try run is all green. This should probably sit in mozilla-central for a week or so, to make sure no more crashes occur.
Attachment #8491864 - Flags: sec-approval?
Comment on attachment 8491864 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection sec-approval+ for trunk. After a week, please create and nominate Aurora, Beta, and ESR31 patches.
Attachment #8491864 - Flags: sec-approval? → sec-approval+
Comment on attachment 8491864 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection Review of attachment 8491864 [details] [diff] [review]: ----------------------------------------------------------------- r=mcmanus,honzab.. btw, I didn't give this formally an r+ ;) but also not r-. ::: netwerk/protocol/http/nsHttpTransaction.cpp @@ +378,5 @@ > NS_ADDREF(*responseBody = mPipeIn); > return NS_OK; > } > > +// This method should only be used on the socket thread you can easily assert this (and probably should)
(In reply to Honza Bambas (:mayhemer) from comment #26) > Comment on attachment 8491864 [details] [diff] [review] > Use a mutex to guard access to nsHttpTransaction::mConnection > > Review of attachment 8491864 [details] [diff] [review]: > ----------------------------------------------------------------- > > r=mcmanus,honzab.. btw, I didn't give this formally an r+ ;) but also not > r-. Ooops, I think the r? got lost in my last bzexport, and I missed it. Sorry about that. > ::: netwerk/protocol/http/nsHttpTransaction.cpp > @@ +378,5 @@ > > NS_ADDREF(*responseBody = mPipeIn); > > return NS_OK; > > } > > > > +// This method should only be used on the socket thread > > you can easily assert this (and probably should) Patrick suggested the same thing on IRC, but said we should do it in a followup bug.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
https://hg.mozilla.org/mozilla-central/rev/d28403874a12 Sorry, I apparently forgot to mark this bug manually when I merged inbound to m-c on the 22nd. Valentin, can you please nominate this for aurora/beta/esr31 approval when you get a chance?
Flags: needinfo?(valentin.gosu)
Target Milestone: --- → mozilla35
Comment on attachment 8491864 [details] [diff] [review] Use a mutex to guard access to nsHttpTransaction::mConnection Approval Request Comment [Feature/regressing bug #]: Issue present since (at least) Fx19 [User impact if declined]: Firefox may crash in some cases because of a race condition (pointer getting nulled on one thread and deref'd on another) [Describe test coverage new/current, TBPL]: The patch has been on m-c for a few days, and seems to have fixed the crash for Nightly users. [Risks and why]: The patch is low risk. The access to mConnection is now done under a mutex to prevent the race. [String/UUID change made/needed]: None.
Attachment #8491864 - Flags: approval-mozilla-esr31?
Attachment #8491864 - Flags: approval-mozilla-beta?
Attachment #8491864 - Flags: approval-mozilla-aurora?
Flags: needinfo?(valentin.gosu)
Attachment #8491864 - Flags: approval-mozilla-esr31?
Attachment #8491864 - Flags: approval-mozilla-esr31+
Attachment #8491864 - Flags: approval-mozilla-beta?
Attachment #8491864 - Flags: approval-mozilla-beta+
Attachment #8491864 - Flags: approval-mozilla-aurora?
Attachment #8491864 - Flags: approval-mozilla-aurora+
Group: network-core-security
I can confirm update successfully fixed bug for Aurora.
Whiteboard: [adv-main33+][adv-esr31.2+]
Marking [qe-verify-] due to lack of test case and details. Comment 33 indicates some success. If a test and/or STR materialize, let us know and we'll happily test this.
Flags: qe-verify-
Even though I didn't have any problems for almost 2 weeks in Aurora, same error occurred while browsing with Beta v33. I thought fix is released for v33 as well? Here is the crash report, if interested: 70f06d3d-e308-4b78-9e84-2cabd2141007 . However, I couldn't regenerate the bug again so it may not be v33 vs Aurora either.
(In reply to comexx from comment #36) > Even though I didn't have any problems for almost 2 weeks in Aurora, same > error occurred while browsing with Beta v33. I thought fix is released for > v33 as well? Here is the crash report, if interested: > 70f06d3d-e308-4b78-9e84-2cabd2141007 . However, I couldn't regenerate the > bug again so it may not be v33 vs Aurora either. https://crash-stats.mozilla.com/report/index/70f06d3d-e308-4b78-9e84-2cabd2141007 Thanks for the followup, but the build id for that report with beta is 20140923222114 (Sep-23-2014).. the patch wasn't ported to beta until Sep-26 (Comment 32). So I don't think the data point is relevant to verifying the fix. interestingly the signature is similar, but a little different to the ones that drove this bug fix.. so I'm not 100% sure its the same issue.
You're right on a closer look it's not the identical call stack. It occurred while browsing stackoverflow.com, I had the same OnStateStop error on that website before the fix, so I thought that would be the case. About the build date, I don't know FF fix mechanisms to be honest. When I update, doesn't it get that fix? Should it be complete build (uninstall - reinstall)? Thanks for the response.
(In reply to comexx from comment #38) When I update, doesn't it > get that fix? Should it be complete build (uninstall - reinstall)? Thanks > for the response. you should just need to update, quit, and restart.. "about" should tell you what build you are running. troubleshooting that is something I'm not very familiar with.
I remember updating my FF but that is weird. I'll keep in touch in such crash. Currently my about:buildconfig gives changeset - 216918:899d1a66aeb6 which covers this bug. Thanks!
Group: core-security
Whiteboard: [adv-main33+][adv-esr31.2+] → [adv-main33+][adv-esr31.2+][b2g-adv-main2.2+]
Crash Signature: [@ mozilla::net::nsHttpChannel::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult)] → [@ mozilla::net::nsHttpChannel::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult)] [@ nsRefPtr<T>::assign_with_AddRef | mozilla::net::nsHttpChannel::OnStopRequest ]
See Also: → 1451293
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: