Last Comment Bug 804237 - (CVE-2013-0764) Crash [@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc*) ] clicking "Download the rest of the message"
(CVE-2013-0764)
: Crash [@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc*) ] clicking "Download ...
Status: RESOLVED FIXED
[adv-main18+][adv-esr17+]
: crash, regression, sec-high
Product: Core
Classification: Components
Component: Networking (show other bugs)
: 16 Branch
: x86 Windows XP
: -- critical (vote)
: mozilla19
Assigned To: Honza Bambas (:mayhemer)
:
Mentors:
Depends on:
Blocks: 767159
  Show dependency treegraph
 
Reported: 2012-10-22 11:09 PDT by Jerry Baker
Modified: 2013-04-30 18:41 PDT (History)
15 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
wontfix
+
fixed
fixed
fixed
unaffected
18+
fixed


Attachments
v1 (1.98 KB, patch)
2012-10-29 13:20 PDT, Honza Bambas (:mayhemer)
cbiesinger: review+
akeybl: approval‑mozilla‑beta+
lukasblakk+bugs: approval‑mozilla‑esr17+
honzab.moz: checkin+
Details | Diff | Splinter Review

Description Jerry Baker 2012-10-22 11:09:56 PDT
Tbird 16.0.1.

Reliable crash upon clicking "Download the rest of this message."

Crash IDs:
bp-db07a65e-15a7-4b58-bada-b01502121022
bp-a06c34e6-c260-449a-8721-4723c2121022
bp-822a5bd5-ce6d-4c57-8bf0-944732121022
Comment 1 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-10-23 09:54:08 PDT
Bug 804237 looks a lot like what we ran into when we made SSL server cert verification asynchronous. Basically, I assumed that SSL connections only happen on the socket transport thread, but in Thunderbird connections happen on the main thread and other background threads too. The bug for the Thunderbird regressions caused by the SSL thread removal / async SSL server cert verification was bug 712363.

All bug reports are from Thunderbird 16. mcmanus just rewrote the proxy stuff to be async, similar to how I made SSL server cert verification async. Intuititively, because Thunderbird does BLOCKING I/O on the main thread, we must have a synchronous version of all async socket-level networking code. But, see bug 767159 ("Remove synchronous DNS resolution in nsSOCKSIOLayer.cpp"), target milestone: mozilla16.

That's enough to convince me that this is likely a regression from bug 767159.
Comment 2 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-10-23 10:17:31 PDT
Mark, after Thunderbird 17, Thunderbird will be on ESR, right?

So, if this bug is really caused by bug 767159, could we just back out the changes from bug 767159 on Thunderbird 16 and 17, and then do the proper fix (make Thunderbird networking code async) in the year between ESRs?

<bsmith> mayhemer: I am not very eager to add back the sync proxy resolution just for Thunderbird
<bsmith> really, tbird's networking should be made async
<mayhemer> bsmith: we should fix TB
<bsmith> the same problem as one year ago
<mayhemer> bsmith: check reports here: https://crash-stats.mozilla.com/report/list?product=Firefox&product=Thunderbird&query_search=signature&query_type=contains&query=ConnectToProxy&reason_type=contains&date=10%2F23%2F2012%2017%3A06%3A17&range_value=3&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=nsSOCKSSocketInfo%3A%3AConnectToProxy%28PRFileDesc*%29
<bsmith> mayhemer: I suggest we just ask tbird to back out mcmanus's async proxy work in comm-central.
<jduell_bbl> bsmith: what's the workflow for that?  If they back out a patch that's in m-c, I assume that's fine until any patches that touch the same code land in m-c, after which they've got to merge things by hand?
* jduell_bbl is now known as jduell
<bsmith> I guess so.
<bsmith> But, they might want an immediate fix for tbird 16
<bsmith> and remember that Thuderbird is on Gecko 17 for one year
<bsmith> so, if we do the backout for 16 and 17, then they have a year for a better fix
Comment 3 Honza Bambas (:mayhemer) 2012-10-23 10:22:10 PDT
According the reports in the previous comment, this is present in Firefox as well but at a random address.  The crash could potentially be exploitable.

We may have two issues here: a null deref crash that can be (easily) fixed on the TB side and a random-address crash that should be fixed in the platform.

Closing the bug, just for safety.
Comment 4 Mark Banner (:standard8) 2012-10-24 14:54:37 PDT
As this crash is seen in Firefox, is it really the case that Thunderbird is at fault here?

Is there a socks server around that we could use to debug this?
Comment 5 Patrick McManus [:mcmanus] 2012-10-24 15:02:08 PDT
(In reply to Mark Banner (:standard8) from comment #4)

> Is there a socks server around that we could use to debug this?

"ssh -D 9999 otherhost"

Then there will be a SOCKS proxy on localhost:9999 with all network traffic proxied to otherhost.
Comment 6 Jerry Baker 2012-10-24 15:03:54 PDT
It's free and relatively easy to set up a free EC2 instance on Amazon AWS, which includes SSH. If not, I can see about creating a temporary key for my instance later on.
Comment 7 Honza Bambas (:mayhemer) 2012-10-27 12:54:31 PDT
Debug build of comm-central, windows 7, ccproxy used as SOCK5 proxy running on localhost, account setup to download only headers, checked with ssl/995 and plain/110.  Could not reproduce.

Jerry, did you try with early bird or daily build of Thunderbird? [1]

If you are willing to do such test, don't forget to *backup your profile first!!!*


I'm only getting assertion failure at nsMsgProtocol::Cancel (NS_ASSERTION(m_request,"no channel");) but no crash.


[1] http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/
Comment 8 Honza Bambas (:mayhemer) 2012-10-27 13:03:05 PDT
The same with 16.0.1 release installation.
Comment 9 Jerry Baker 2012-10-28 05:30:47 PDT
I am using the 16.0.1 release, no nightlies. Are you using network.proxy.socks_remote_dns=true? The only other thing I can think of is that I'm having mail from several accounts placed into one unified folder structure.
Comment 10 Honza Bambas (:mayhemer) 2012-10-29 13:20:54 PDT
Created attachment 676285 [details] [diff] [review]
v1

Cause: before we finish DNS resolution the socket may actually close.  Ths mFD member in nsSOCKSSocketInfo simply becomes invalid.  

It has to be nullified when PR_Close on the socket is called.  That is what the patch is doing.  PSM socket layer also releases back-reference to FD from its info object in close, btw.
Comment 11 Honza Bambas (:mayhemer) 2012-10-29 13:21:31 PDT
Jerry, thanks a lot for help with this!
Comment 12 :Irving Reid (No longer working on Firefox) 2012-10-30 06:51:54 PDT
Comment on attachment 676285 [details] [diff] [review]
v1

Review of attachment 676285 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/socket/nsSOCKSIOLayer.cpp
@@ +275,5 @@
>      mDnsRec = aRecord;
>      mState = SOCKS_DNS_COMPLETE;
> +    if (mFD) {
> +      ConnectToProxy(mFD);
> +      ForgetFD();

Are OnLookupComplete() and PRIOMethods.close() always called on the main thread? If not, there's still a race condition here unless something above, in both call stacks, is doing the locking.
Comment 13 Honza Bambas (:mayhemer) 2012-10-30 09:30:41 PDT
(In reply to Irving Reid (:irving) from comment #12)
> Are OnLookupComplete() and PRIOMethods.close() always called on the main
> thread? If not, there's still a race condition here unless something above,
> in both call stacks, is doing the locking.

Very good point (I feel shamed a bit :).  Yes, this needs a lock.
Comment 14 Honza Bambas (:mayhemer) 2012-10-30 10:06:36 PDT
(In reply to Honza Bambas (:mayhemer) from comment #13)
> (In reply to Irving Reid (:irving) from comment #12)
> > Are OnLookupComplete() and PRIOMethods.close() always called on the main
> > thread? If not, there's still a race condition here unless something above,
> > in both call stacks, is doing the locking.
> 
> Very good point (I feel shamed a bit :).  Yes, this needs a lock.

Taking back.  This all happens on just a single thread.  Locking is not necessary here.

The patch is OK as is.
Comment 16 Ed Morley [:emorley] 2012-11-01 07:05:59 PDT
https://hg.mozilla.org/mozilla-central/rev/845ac67d79e6
Comment 17 Robert Kaiser 2012-12-03 08:54:34 PST
This sec-high bug has been marked tracking+ for 18 - Honza, care to request approval for beta and land there?
Comment 18 Honza Bambas (:mayhemer) 2012-12-04 09:34:15 PST
Comment on attachment 676285 [details] [diff] [review]
v1

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 767159
User impact if declined: potentially bad memory access in a realistic scenario (as per comment 0), not sure whether exploitable
Testing completed (on m-c, etc.): more then a month ago landed on m-c, currently lives on aurora
Risk to taking this patch (and alternatives if risky): very low, this can be treated as an added null check
String or UUID changes made by this patch: none/none
Comment 19 Ryan VanderMeulen [:RyanVM] 2012-12-04 18:36:33 PST
https://hg.mozilla.org/releases/mozilla-beta/rev/3ac89158810f
Comment 20 Honza Bambas (:mayhemer) 2012-12-05 05:27:09 PST
(In reply to Ryan VanderMeulen from comment #19)
> https://hg.mozilla.org/releases/mozilla-beta/rev/3ac89158810f

Thanks!
Comment 21 Lukas Blakk [:lsblakk] use ?needinfo 2012-12-11 11:37:53 PST
Still need an esr17 nomination & landing here - please prepare asap as we are approaching the end of the 18 beta cycle.
Comment 22 Honza Bambas (:mayhemer) 2012-12-12 10:17:49 PST
Comment on attachment 676285 [details] [diff] [review]
v1

[Approval Request Comment]
See comment 18.
Comment 23 Ryan VanderMeulen [:RyanVM] 2012-12-15 07:10:41 PST
https://hg.mozilla.org/releases/mozilla-esr17/rev/658aee3358ed
Comment 24 Honza Bambas (:mayhemer) 2012-12-17 09:30:44 PST
(In reply to Ryan VanderMeulen from comment #23)
> https://hg.mozilla.org/releases/mozilla-esr17/rev/658aee3358ed

Thanks :)
Comment 25 Al Billings [:abillings] 2013-01-04 11:46:29 PST
Is Seamonkey affected by this issue? I know TB and Firefox are.
Comment 26 Robert Kaiser 2013-01-04 15:11:20 PST
(In reply to Al Billings [:abillings] from comment #25)
> Is Seamonkey affected by this issue? I know TB and Firefox are.

Given that the patch is in shared core code, I'd expect that. Neil might know more  details, though.
Comment 27 neil@parkwaycc.co.uk 2013-01-04 16:05:42 PST
(In reply to Robert Kaiser from comment #26)
> (In reply to Al Billings from comment #25)
> > Is Seamonkey affected by this issue? I know TB and Firefox are.
> Given that the patch is in shared core code
Sorry, I can't improve on this assessment.
Comment 28 Al Billings [:abillings] 2013-01-04 16:16:54 PST
Ok. This is making it into SM 2.15?
Comment 29 Justin Wood (:Callek) 2013-01-04 16:26:34 PST
(In reply to Al Billings [:abillings] from comment #28)
> Ok. This is making it into SM 2.15?

Yes, has been part of our 2.15 train since Beta 3.

Note You need to log in before you can comment on or make changes to this bug.