Closed Bug 804237 (CVE-2013-0764) Opened 12 years ago Closed 12 years ago

Crash [@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc*) ] clicking "Download the rest of the message"

Categories

(Core :: Networking, defect)

16 Branch
x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla19
Tracking Status
firefox17 --- wontfix
firefox18 + fixed
firefox19 --- fixed
firefox20 --- fixed
firefox-esr10 --- unaffected
firefox-esr17 18+ fixed

People

(Reporter: mozilla, Assigned: mayhemer)

References

Details

(Keywords: crash, regression, sec-high, Whiteboard: [adv-main18+][adv-esr17+])

Crash Data

Attachments

(1 file)

Tbird 16.0.1.

Reliable crash upon clicking "Download the rest of this message."

Crash IDs:
bp-db07a65e-15a7-4b58-bada-b01502121022
bp-a06c34e6-c260-449a-8721-4723c2121022
bp-822a5bd5-ce6d-4c57-8bf0-944732121022
Severity: normal → critical
Keywords: crash
Summary: Crash clicking "Download the rest of the message" → Crash [@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc*) ] clicking "Download the rest of the message"
Crash Signature: [@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc*) ]
Component: Folder and Message Lists → Networking
Product: Thunderbird → Core
Version: 16 → 16 Branch
Bug 804237 looks a lot like what we ran into when we made SSL server cert verification asynchronous. Basically, I assumed that SSL connections only happen on the socket transport thread, but in Thunderbird connections happen on the main thread and other background threads too. The bug for the Thunderbird regressions caused by the SSL thread removal / async SSL server cert verification was bug 712363.

All bug reports are from Thunderbird 16. mcmanus just rewrote the proxy stuff to be async, similar to how I made SSL server cert verification async. Intuititively, because Thunderbird does BLOCKING I/O on the main thread, we must have a synchronous version of all async socket-level networking code. But, see bug 767159 ("Remove synchronous DNS resolution in nsSOCKSIOLayer.cpp"), target milestone: mozilla16.

That's enough to convince me that this is likely a regression from bug 767159.
Blocks: 767159
Keywords: regression
Product: Core → MailNews Core
Version: 16 Branch → 16
Mark, after Thunderbird 17, Thunderbird will be on ESR, right?

So, if this bug is really caused by bug 767159, could we just back out the changes from bug 767159 on Thunderbird 16 and 17, and then do the proper fix (make Thunderbird networking code async) in the year between ESRs?

<bsmith> mayhemer: I am not very eager to add back the sync proxy resolution just for Thunderbird
<bsmith> really, tbird's networking should be made async
<mayhemer> bsmith: we should fix TB
<bsmith> the same problem as one year ago
<mayhemer> bsmith: check reports here: https://crash-stats.mozilla.com/report/list?product=Firefox&product=Thunderbird&query_search=signature&query_type=contains&query=ConnectToProxy&reason_type=contains&date=10%2F23%2F2012%2017%3A06%3A17&range_value=3&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=nsSOCKSSocketInfo%3A%3AConnectToProxy%28PRFileDesc*%29
<bsmith> mayhemer: I suggest we just ask tbird to back out mcmanus's async proxy work in comm-central.
<jduell_bbl> bsmith: what's the workflow for that?  If they back out a patch that's in m-c, I assume that's fine until any patches that touch the same code land in m-c, after which they've got to merge things by hand?
* jduell_bbl is now known as jduell
<bsmith> I guess so.
<bsmith> But, they might want an immediate fix for tbird 16
<bsmith> and remember that Thuderbird is on Gecko 17 for one year
<bsmith> so, if we do the backout for 16 and 17, then they have a year for a better fix
According the reports in the previous comment, this is present in Firefox as well but at a random address.  The crash could potentially be exploitable.

We may have two issues here: a null deref crash that can be (easily) fixed on the TB side and a random-address crash that should be fixed in the platform.

Closing the bug, just for safety.
Group: core-security
Product: MailNews Core → Core
Version: 16 → 16 Branch
As this crash is seen in Firefox, is it really the case that Thunderbird is at fault here?

Is there a socks server around that we could use to debug this?
(In reply to Mark Banner (:standard8) from comment #4)

> Is there a socks server around that we could use to debug this?

"ssh -D 9999 otherhost"

Then there will be a SOCKS proxy on localhost:9999 with all network traffic proxied to otherhost.
It's free and relatively easy to set up a free EC2 instance on Amazon AWS, which includes SSH. If not, I can see about creating a temporary key for my instance later on.
Debug build of comm-central, windows 7, ccproxy used as SOCK5 proxy running on localhost, account setup to download only headers, checked with ssl/995 and plain/110.  Could not reproduce.

Jerry, did you try with early bird or daily build of Thunderbird? [1]

If you are willing to do such test, don't forget to *backup your profile first!!!*


I'm only getting assertion failure at nsMsgProtocol::Cancel (NS_ASSERTION(m_request,"no channel");) but no crash.


[1] http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/
The same with 16.0.1 release installation.
I am using the 16.0.1 release, no nightlies. Are you using network.proxy.socks_remote_dns=true? The only other thing I can think of is that I'm having mail from several accounts placed into one unified folder structure.
Attached patch v1Splinter Review
Cause: before we finish DNS resolution the socket may actually close.  Ths mFD member in nsSOCKSSocketInfo simply becomes invalid.  

It has to be nullified when PR_Close on the socket is called.  That is what the patch is doing.  PSM socket layer also releases back-reference to FD from its info object in close, btw.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Attachment #676285 - Flags: review?(cbiesinger)
Jerry, thanks a lot for help with this!
Attachment #676285 - Flags: review?(cbiesinger) → review+
Comment on attachment 676285 [details] [diff] [review]
v1

Review of attachment 676285 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/socket/nsSOCKSIOLayer.cpp
@@ +275,5 @@
>      mDnsRec = aRecord;
>      mState = SOCKS_DNS_COMPLETE;
> +    if (mFD) {
> +      ConnectToProxy(mFD);
> +      ForgetFD();

Are OnLookupComplete() and PRIOMethods.close() always called on the main thread? If not, there's still a race condition here unless something above, in both call stacks, is doing the locking.
(In reply to Irving Reid (:irving) from comment #12)
> Are OnLookupComplete() and PRIOMethods.close() always called on the main
> thread? If not, there's still a race condition here unless something above,
> in both call stacks, is doing the locking.

Very good point (I feel shamed a bit :).  Yes, this needs a lock.
(In reply to Honza Bambas (:mayhemer) from comment #13)
> (In reply to Irving Reid (:irving) from comment #12)
> > Are OnLookupComplete() and PRIOMethods.close() always called on the main
> > thread? If not, there's still a race condition here unless something above,
> > in both call stacks, is doing the locking.
> 
> Very good point (I feel shamed a bit :).  Yes, this needs a lock.

Taking back.  This all happens on just a single thread.  Locking is not necessary here.

The patch is OK as is.
Keywords: sec-high
https://hg.mozilla.org/mozilla-central/rev/845ac67d79e6
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
This sec-high bug has been marked tracking+ for 18 - Honza, care to request approval for beta and land there?
Comment on attachment 676285 [details] [diff] [review]
v1

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 767159
User impact if declined: potentially bad memory access in a realistic scenario (as per comment 0), not sure whether exploitable
Testing completed (on m-c, etc.): more then a month ago landed on m-c, currently lives on aurora
Risk to taking this patch (and alternatives if risky): very low, this can be treated as an added null check
String or UUID changes made by this patch: none/none
Attachment #676285 - Flags: approval-mozilla-beta?
Attachment #676285 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Ryan VanderMeulen from comment #19)
> https://hg.mozilla.org/releases/mozilla-beta/rev/3ac89158810f

Thanks!
Still need an esr17 nomination & landing here - please prepare asap as we are approaching the end of the 18 beta cycle.
Flags: needinfo?(honzab.moz)
Comment on attachment 676285 [details] [diff] [review]
v1

[Approval Request Comment]
See comment 18.
Attachment #676285 - Flags: approval-mozilla-esr17?
Flags: needinfo?(honzab.moz)
Attachment #676285 - Flags: approval-mozilla-esr17? → approval-mozilla-esr17+
(In reply to Ryan VanderMeulen from comment #23)
> https://hg.mozilla.org/releases/mozilla-esr17/rev/658aee3358ed

Thanks :)
Alias: CVE-2013-0764
Whiteboard: [adv-main18+][adv-esr17+]
Is Seamonkey affected by this issue? I know TB and Firefox are.
(In reply to Al Billings [:abillings] from comment #25)
> Is Seamonkey affected by this issue? I know TB and Firefox are.

Given that the patch is in shared core code, I'd expect that. Neil might know more  details, though.
(In reply to Robert Kaiser from comment #26)
> (In reply to Al Billings from comment #25)
> > Is Seamonkey affected by this issue? I know TB and Firefox are.
> Given that the patch is in shared core code
Sorry, I can't improve on this assessment.
Ok. This is making it into SM 2.15?
(In reply to Al Billings [:abillings] from comment #28)
> Ok. This is making it into SM 2.15?

Yes, has been part of our 2.15 train since Beta 3.
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: