Closed
Bug 808402
Opened 12 years ago
Closed 12 years ago
FTP use-after-free crash [@nsInputStreamPump::Cancel]
Categories
(Core Graveyard :: Networking: FTP, defect)
Tracking
(firefox17 wontfix, firefox18 wontfix, firefox19- wontfix, firefox20+ wontfix, firefox21+ verified, firefox22+ verified, firefox23+ verified, firefox-esr10 wontfix, firefox-esr1721+ fixed, b2g18+ fixed, b2g18-v1.0.1 wontfix)
People
(Reporter: posidron, Assigned: sworkman)
References
()
Details
(Keywords: csectype-uaf, sec-critical, Whiteboard: crash [leave open][adv-main21+][adv-esr1706+])
Attachments
(5 files)
18.77 KB,
text/plain
|
Details | |
24.87 KB,
text/plain
|
Details | |
521 bytes,
application/octet-stream
|
Details | |
70.53 KB,
text/plain
|
Details | |
2.18 KB,
patch
|
jduell.mcbugs
:
review+
posidron
:
feedback+
lsblakk
:
approval-mozilla-aurora+
lsblakk
:
approval-mozilla-beta+
lsblakk
:
approval-mozilla-esr17+
lsblakk
:
approval-mozilla-b2g18+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
This is a reproducible use-after-free crash which occurred while testing FTP server responses. Haven't managed it to get an external testcase for this bug but it crashes reliably in the fuzzer after some time. (The flow of the requests/responses is included in the file of the callstack.)
The fake server accepts FTP connections from Firefox, some of the responses have no CRLF token in the response (FTP banner) when Firefox tries to handle upcoming connections it shows an alert() message of the previous response where we didn't provide a CRLF token, clicking that alert() box results in this crash. I am guessing that this seems to be a problem in handling async connections and their state.
In the callstack of the free'ed thread we see:
#26 0x1051b0541 in nsFtpState::StopProcessing nsFtpConnectionThread.cpp:1866
#27 0x1051b90e4 in nsFtpState::CloseWithStatus nsFtpConnectionThread.cpp:2153
therefore I have put this into the FTP category.
The following assertions are shown before the crash occurs:
###!!! ASSERTION: ftp read state mixup: 'Error', file /Users/cdiehl/Code/Mozilla/mc-asan-opt/netwerk/protocol/ftp/nsFtpConnectionThread.cpp, line 195
###!!! ASSERTION: Read buffer doesn't include response code: 'line.Length() > 4 && startNum', file /Users/cdiehl/Code/Mozilla/mc-asan-opt/netwerk/protocol/ftp/nsFtpConnectionThread.cpp, line 184
alloc: netwerk/base/src/nsInputStreamPump.cpp:64
nsRefPtr<nsInputStreamPump> pump = new nsInputStreamPump();
free: netwerk/base/src/nsInputStreamPump.cpp:141
// although this class can only be accessed from one thread at a time, we do
// allow its ownership to move from thread to thread, assuming the consumer
// understands the limitations of this.
NS_IMPL_THREADSAFE_ISUPPORTS3(nsInputStreamPump,
nsIRequest,
nsIInputStreamCallback,
nsIInputStreamPump)
re-use: netwerk/base/src/nsInputStreamPump.cpp:188
185: // close input stream
186: if (mAsyncStream) {
187: mAsyncStream->CloseWithStatus(status);
188: if (mSuspendCount == 0)
189: EnsureWaiting();
Tested on m-c changeset: 112115:56d3db021482
Comment 1•12 years ago
|
||
cdiehl: is it likely that you'll be able to provide a reproducible testcase for this (can we figure out what fuzzer case causes it to crash)? That would be very helpful here.
Reporter | ||
Comment 2•12 years ago
|
||
Jason: It's a custom fork of Peach. If you have a GitHub account and some time to configure it on your system then I could give you access to it. Feel free to ping me on IRC.
Regarding the testcase - I could look at it again on the weekend till then I am bit busy with my Q4 goals.
Comment 3•12 years ago
|
||
Would the built-in NSPR logging help narrow down the traffic to the part near where it crashes?\
I'm going to guess that if it's a bug in our FTP code it's old and affects all branches.
status-firefox-esr10:
--- → affected
status-firefox17:
--- → affected
status-firefox18:
--- → affected
status-firefox19:
--- → affected
tracking-firefox18:
--- → +
tracking-firefox19:
--- → +
Keywords: testcase-wanted
Updated•12 years ago
|
Comment 4•12 years ago
|
||
Christoph: since this isn't going anywhere without more information I'm assigning to you to see if the logging trick in comment 3 might help.
Comment 5•12 years ago
|
||
Christoph,
To get an NSPR log for FTP, follow instructions here
https://developer.mozilla.org/en-US/docs/HTTP_Logging
except use for NSPR_LOG_MODULES use
NSPR_LOG_MODULES=timestamp,nsFtp:5
A log could be very helpful here.
Reporter | ||
Comment 6•12 years ago
|
||
Attached is the NSPR_LOG of a session with such a crash.
Comment 7•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #5)
> Christoph,
>
> To get an NSPR log for FTP, follow instructions here
>
> https://developer.mozilla.org/en-US/docs/HTTP_Logging
>
> except use for NSPR_LOG_MODULES use
>
> NSPR_LOG_MODULES=timestamp,nsFtp:5
>
> A log could be very helpful here.
Any updates here? We have the requested log info as per comment 6. Is there something low risk we can do to resolve this issue in FF18 timeframe ?
Comment 8•12 years ago
|
||
I'm booked with B2G work right now. cc-ing Josh in case he knows of necko folks who've got cycles for this.
Comment 9•12 years ago
|
||
meanwhile a reproducible test case (steps to reliably generate the crash) is definitely the most useful thing we could have here, if someone can get that. If it's too hard to figure out which fuzzer case is blowing things up then a wireshark log of the crash would probably be a good start. Or even just a recipe for a developer on how to get the fuzzer working.
Reporter | ||
Comment 10•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #9)
> Or even just a recipe for a developer on how to get the fuzzer working.
Please take a look the callstack - at the end of the file. That is the whole session which produced the crash.
Updated•12 years ago
|
Comment 11•12 years ago
|
||
Between the assertion output and the sent and received bytes the attachments seem to have what jduell was asking for.
Assignee: cdiehl → jduell.mcbugs
Comment 12•12 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #11)
> Between the assertion output and the sent and received bytes the attachments
> seem to have what jduell was asking for.
Jason noted he didn't have time for this bug due to B2G. Sending over to Josh to re-assign.
Assignee: jduell.mcbugs → joshmoz
Updated•12 years ago
|
status-b2g18:
--- → affected
status-firefox21:
--- → affected
status-firefox-esr17:
--- → affected
Flags: needinfo?(joshmoz)
Comment 13•12 years ago
|
||
I don't recall that conversation :)
I'm still planning to work on this very soon. Josh, if michal or someone can take sooner, you can reassign.
Assignee: joshmoz → jduell.mcbugs
Flags: needinfo?(joshmoz)
Comment 14•12 years ago
|
||
Great, thanks Jason.
Comment 15•12 years ago
|
||
Given Firefox 19 ships in two weeks it's unrealistic to expect we can jump a fix (should we get one in time) into a late beta; guess we'll stop tracking 19 as a release target. Then again if it turns out to be something simple (null check, kungFuDeathGrip) we might be able to.
tracking-b2g18:
--- → ?
tracking-firefox18:
+ → ---
tracking-firefox21:
--- → +
tracking-firefox-esr17:
--- → ?
Comment 16•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #13)
> I don't recall that conversation :)
comment 8? :D
Updated•12 years ago
|
Updated•12 years ago
|
Comment 17•12 years ago
|
||
cdiehl: can you have a peek at this file to see if you see anything obviously wrong with my reconstruction of the server session here? I reconstructed it from the the info at the end of the callstack file, per comment 10.
Alas, running 'nc -l 21 <file' doesn't cause my nightly build to crash (I do see a dialog pop up with lots of "AAAA" characters, but clicking "OK" doesn't crash as in comment 0). Could be a timing thing, or maybe I don't have the exact same responses from the server.
The stack trace from the crash seems to indicate we're hitting refcount =0 for the nsInputStreamPump in nsInputStreamReadyEvent::Run(), but then the LoadGroup Cancel() -> nsBaseChannel::Cancel() is thinking it still has a valid pointer to the nsInputStreamPump. So refcounting has gone awry somewhere.
Reporter | ||
Comment 18•12 years ago
|
||
This is kind of the right way. We need to trigger somehow two message boxes. The other message box needs to pop up immediately after the first one was closed by clicking "Ok", than the crash should occur.
Reporter | ||
Updated•12 years ago
|
Reporter | ||
Comment 19•12 years ago
|
||
As you see in the video, it's not necessarily needed to have two pop up boxes. I am a bit clueless to make a testcase here since it happens randomly or at least I can't determine a reproducible flow of the FTP commands. I am not even sure if it's a problem because of the FTP commands sent to the browser. Some how I think it's about an unfinished state (no QUIT command?) and initiating a new FTP connection to host.
Comment 20•12 years ago
|
||
Batch edit: Bugs marked status-b2g18: affected after 2/13 branching of v1.0.1 are now also status-b2g18-v1.0.1: affected
status-b2g18-v1.0.1:
--- → affected
Comment 21•12 years ago
|
||
Christoph,
Thanks for the video and info. I expect the best way forward is for me to install a copy of your peach framework and see if I can catch this in a debugger. I know this doesn't happen reproducibly, but does it happen on a particular test? About how often (10%, 1%, etc)?
Reporter | ||
Comment 22•12 years ago
|
||
It happens 100% of the time but not on a specific testcase. At least I wasn't able to determine a repeatable pattern.
I have added now a detailed callstack with all threads. If that doesn't help you and you still want to test it with the fuzzer, please let know your GitHub account.
Note, if you are going to debug it with an ASan build make sure to set a breakpoint at __asan_report_error.
Updated•12 years ago
|
Updated•12 years ago
|
status-firefox22:
--- → affected
tracking-firefox22:
--- → +
Assignee | ||
Comment 23•12 years ago
|
||
I think I have located the problem in the code.
The ASan built I've been using locally identifies the freed memory as nsInputStreamPump:
-- it is allocated in nsBaseChannel::BeginPumpingData, which is called in nsFtpChannel::AsyncOpen (really, nsBaseChannel::AsyncOpen).
-- it is freed in nsBaseChannel::OnStopRequest.
The problem is with the alert box that pops up. It is called in the following code path:
nsBaseChannel::Cancel
nsInputStreamPump::Cancel
nsFtpState::CloseWithStatus
nsFtpState::StopProcessing
nsIPrompt::Alert.
At this point, it looks like the Alert causes an async wait, so these functions don't return until the alert window is dismissed. In the meantime, nsBaseChannel::OnStopRequest (among other functions) has been called, and the nsInputStreamPump is freed. Then, when the alert window is dismissed, we eventually get back to nsInputStreamPump::Cancel, where it tries to do a compare on one of its member vars, a use-after-free error which causes the ASan build to crash.
I notice in the code, in nsFtpConnectionThread.cpp, the home of nsFtpState, that nsIPrompt::Alert is called in two places, both of which have the following comment:
"XXX(darin): this code should not be dictating UI like this!"
but then just before it:
"check to see if the control status is bad.
web shell wont throw an alert. we better:"
There has to be a way to dispatch the error message to the shell without losing serialization here. Or some kind of async alert that means we don't block here. That is what I'll look into now.
Btw, thanks to Christoph for helping set up peach fuzzer and the Asan builds.
Comment 24•12 years ago
|
||
We'll mark this for a specific ESR release once this is resolved/fixed.
tracking-b2g18:
20+ → ---
tracking-firefox-esr17:
20+ → ---
Assignee | ||
Comment 25•12 years ago
|
||
Moved the call to Alert to an async nsRunnable class. Dispatched on the main thread. This will allow nsFtpState::StopProcessing to run to completion, and thus allow nsInputStreamPump::Cancel to complete well before nsInputStreamPump is freed.
Christoph, can you check verify that the crash doesn't happen, please?
Attachment #725565 -
Flags: review?(jduell.mcbugs)
Attachment #725565 -
Flags: feedback?(cdiehl)
Reporter | ||
Comment 26•12 years ago
|
||
Comment on attachment 725565 [details] [diff] [review]
v1.0 Make call to nsIPromtp::Alert from nsFtpState::StopProcessing async
feedback+
I can no longer reproduce it.
Attachment #725565 -
Flags: feedback?(cdiehl) → feedback+
Assignee | ||
Comment 27•12 years ago
|
||
Awesome, thanks Christoph!
Comment 28•12 years ago
|
||
Comment on attachment 725565 [details] [diff] [review]
v1.0 Make call to nsIPromtp::Alert from nsFtpState::StopProcessing async
Review of attachment 725565 [details] [diff] [review]:
-----------------------------------------------------------------
Nice work Steve!
We need to hold off landing this until sec team gives us green light. I'll set the sec-approval ? flag for that.
Attachment #725565 -
Flags: review?(jduell.mcbugs) → review+
Comment 29•12 years ago
|
||
Comment on attachment 725565 [details] [diff] [review]
v1.0 Make call to nsIPromtp::Alert from nsFtpState::StopProcessing async
[Security approval request comment]
How easily could an exploit be constructed based on the patch? Not obvious w/o being able to see comments in bug.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No
Which older supported branches are affected by this flaw? ancient.
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Easy and low risk.
How likely is this patch to cause regressions; how much testing does it need? No automated FTP tests :( but is now passing fuzzer!
Attachment #725565 -
Flags: sec-approval?
Assignee | ||
Comment 30•12 years ago
|
||
Thanks, Jason!
Comment 31•12 years ago
|
||
I need comments from the Release Management team before I can give sec-approval+ for trunk. We're two weeks and one day from shipping. If we can't take this in beta and, potentially, ESR, we'll need to wait until 4/23 to take this (the middle of the next cycle).
Comment 32•12 years ago
|
||
Why don't we get some automated FTP tests then and take this in the first FF21 beta.
Comment 33•12 years ago
|
||
We would need to write (or find) a JS/Python FTP server and integrate it into the build system to get automated FTP tests. That's a worthy but non-trivial goal. Do we really want to block this secfix for that? I wouldn't.
FTP is on life-support at this point: I'd rather see us remove support for the entire protocol than spend significant cycles developing test infrastructure. Neither is realistic yet, probably, so I think just landing security fixes like this is the way to go.
That said, this isn't an obvious exploit, so it doesn't need to block anything IMO. But the patch is ready, simple, and unlikely to be improved. Let's just land it?
Flags: needinfo?(lsblakk)
Comment 34•12 years ago
|
||
To be clear, I don't have a strong opinion on whether this should land now or in ff21. I just don't want to block on automated testing.
Comment 35•12 years ago
|
||
So, given that we're unlikely to have automated tests here, what are we going to do to get this checked in and when?
Comment 36•12 years ago
|
||
It's passing the fuzzer that exposed the problem, and I assume Steve has hand-tested that regular FTP browsing/downloading works. (needinfo-ing him to confirm). Other than that, I think we just need a green flag to land (and on which branches) from security/release.
Flags: needinfo?(lsblakk) → needinfo?(sworkman)
Comment 37•12 years ago
|
||
Comment on attachment 725565 [details] [diff] [review]
v1.0 Make call to nsIPromtp::Alert from nsFtpState::StopProcessing async
Sec-approval+ to land after 4/2 (when we branch and ship).
Attachment #725565 -
Flags: sec-approval? → sec-approval+
Updated•12 years ago
|
Flags: needinfo?(sworkman)
Assignee | ||
Comment 38•12 years ago
|
||
For the record, here is a list of the tests I've run manually:
- get list of files available remotely
- download a file
- download and open a txt file, an HTML file, and a gzip file
- cancel a download
- resume a download
- sftp connection and login
These are basic, but all is working fine. I'll land the patch after 4/2, as requested.
Assignee | ||
Comment 39•12 years ago
|
||
Today is 4/3, so landing on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/56c37ce3f358
Status: NEW → ASSIGNED
Whiteboard: crash → crash [leave open]
Assignee | ||
Comment 40•12 years ago
|
||
Comment on attachment 725565 [details] [diff] [review]
v1.0 Make call to nsIPromtp::Alert from nsFtpState::StopProcessing async
NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.
[Approval Request Comment] B2G
Bug caused by (feature/regressing bug #):
FTP error handling code.
User impact if declined:
Use-after-free exploits.
Testing completed:
Yes - manual testing, comment 36 and comment 38.
Risk to taking this patch (and alternatives if risky):
Low risk.
String or UUID changes made by this patch:
None.
[Approval Request Comment] ESR
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
--sec-crit, so ignoring these questions.
User impact if declined:
Fix Landed on Version:
Risk to taking this patch (and alternatives if risky):
String or UUID changes made by this patch:
See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
[Approval Request Comment] Aurora, Beta.
Bug caused by (feature/regressing bug #):
FTP error handling code.
User impact if declined:
Use-after-free exploits
Testing completed (on m-c, etc.):
Landed on inbound today.
Risk to taking this patch (and alternatives if risky):
Low risk.
String or IDL/UUID changes made by this patch:
None.
Attachment #725565 -
Flags: approval-mozilla-esr17?
Attachment #725565 -
Flags: approval-mozilla-beta?
Attachment #725565 -
Flags: approval-mozilla-b2g18?
Attachment #725565 -
Flags: approval-mozilla-aurora?
Comment 41•12 years ago
|
||
(In reply to Steve Workman [:sworkman] from comment #39)
> Today is 4/3, so landing on inbound:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/56c37ce3f358
https://hg.mozilla.org/mozilla-central/rev/56c37ce3f358
Updated•12 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
status-firefox23:
--- → fixed
tracking-firefox23:
--- → +
Resolution: --- → FIXED
Updated•12 years ago
|
Updated•12 years ago
|
Attachment #725565 -
Flags: approval-mozilla-esr17?
Attachment #725565 -
Flags: approval-mozilla-esr17+
Attachment #725565 -
Flags: approval-mozilla-beta?
Attachment #725565 -
Flags: approval-mozilla-beta+
Attachment #725565 -
Flags: approval-mozilla-b2g18?
Attachment #725565 -
Flags: approval-mozilla-b2g18+
Attachment #725565 -
Flags: approval-mozilla-aurora?
Attachment #725565 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 42•12 years ago
|
||
Assignee | ||
Comment 43•12 years ago
|
||
Lukas, status-b2g18-v1.0.1 is affected - I'm not sure if there's an uplift process here - should I land on that branch as well?
Flags: needinfo?(lsblakk)
Updated•12 years ago
|
Keywords: testcase-wanted
Updated•12 years ago
|
Whiteboard: crash [leave open] → crash [leave open][adv-main21+][adv-main1706+]
Updated•12 years ago
|
Whiteboard: crash [leave open][adv-main21+][adv-main1706+] → crash [leave open][adv-main21+][adv-esr1706+]
Comment 44•12 years ago
|
||
Based on comment 26 and comment 38, QA is going to call this verified.
Comment 45•11 years ago
|
||
(In reply to Matt Wobensmith from comment #44)
> Based on comment 26 and comment 38, QA is going to call this verified.
Marking this verified across the board based on this comment.
Status: RESOLVED → VERIFIED
Comment 46•11 years ago
|
||
We're not taking fixes to v1.0.1 anymore, just 1.1 so clearing this.
Flags: needinfo?(lsblakk)
Updated•11 years ago
|
Group: core-security
Updated•8 years ago
|
Keywords: csectype-uaf
Updated•10 months ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•