Closed Bug 1618158 Opened 4 years ago Closed 4 years ago

PHC Crash in [@ neqo_crypto::agentio::agent_close] with use-after-free

Categories

(Core :: Networking, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla77
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- disabled
firefox75 --- disabled
firefox76 --- disabled
firefox77 --- fixed

People

(Reporter: decoder, Assigned: dragana)

References

(Blocks 2 open bugs, Regression)

Details

(4 keywords, Whiteboard: [necko-triaged] [post-critsmash-triage] disabled on beta/release)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-423e6e05-704b-4863-97df-a179e0200225.

Top 10 frames of crashing thread:

0 libxul.so neqo_crypto::agentio::agent_close third_party/rust/neqo-crypto/src/agentio.rs:265
1 libssl3.so ssl_DefClose security/nss/lib/ssl/ssldef.c:219
2 libxul.so neqo_crypto::agent::SecretAgent::close third_party/rust/neqo-crypto/src/agent.rs:684
3 libxul.so core::ptr::real_drop_in_place src/libcore/ptr/mod.rs:182
4 libxul.so core::ptr::real_drop_in_place src/libcore/ptr/mod.rs:182
5 libxul.so core::ptr::real_drop_in_place src/libcore/ptr/mod.rs:182
6 libxul.so core::ptr::real_drop_in_place src/libcore/ptr/mod.rs:182
7 libxul.so neqo_http3conn_release netwerk/socket/neqo_glue/src/lib.rs:98
8 libxul.so mozilla::net::Http3Session::~Http3Session netwerk/protocol/http/Http3Session.cpp:182
9 libxul.so <name omitted> netwerk/protocol/http/Http3Session.cpp:49

Additional stacks from PHC:

Free stack:

#0    replace_free(void*) (firefox-bin)
#1    neqo_crypto::agentio::agent_close (libxul.so)
#2    ssl_DefClose (libssl3.so)
#3    neqo_crypto::agent::SecretAgent::close (libxul.so)
#4    core::ptr::real_drop_in_place (libxul.so)
#5    core::ptr::real_drop_in_place (libxul.so)
#6    core::ptr::real_drop_in_place (libxul.so)
#7    core::ptr::real_drop_in_place (libxul.so)
#8    neqo_http3conn_release (libxul.so)
#9    mozilla::net::Http3Session::~Http3Session() (libxul.so)
#10    <name omitted> (libxul.so)
#11    mozilla::net::nsHttpConnection::CloseTransaction(mozilla::net::nsAHttpTransaction*, nsresult, bool) (libxul.so)
#12    mozilla::net::nsHttpConnection::OnInputStreamReady(nsIAsyncInputStream*) (libxul.so)
#13    non-virtual thunk to mozilla::net::nsHttpConnection::OnInputStreamReady(nsIAsyncInputStream*) (libxul.so)
#14    mozilla::net::nsSocketInputStream::OnSocketReady(nsresult) (libxul.so)
#15    mozilla::net::nsSocketTransport::OnSocketDetached(PRFileDesc*) (libxul.so)

Alloc stack:

#0    MaybePageAlloc(mozilla::Maybe<unsigned long> const&, unsigned long, bool) (firefox-bin)
#1    calloc (firefox-bin)
#2    PR_CreateIOLayerStub (libnspr4.so)
#3    neqo_crypto::agent::SecretAgent::new (libxul.so)
#4    neqo_crypto::agent::Client::new (libxul.so)
#5    neqo_http3conn_new (libxul.so)
#6    mozilla::net::Http3Session::Init(nsTSubstring<char> const&, nsISocketTransport*, mozilla::net::nsHttpConnection*) (libxul.so)
#7    mozilla::net::nsHttpConnection::Init(mozilla::net::nsHttpConnectionInfo*, unsigned short, nsISocketTransport*, nsIAsyncInputStream*, nsIAsyncOutputStream*, bool, nsIInterfaceRequestor*, unsigned int) (libxul.so)
#8    mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::SetupConn(nsIAsyncOutputStream*, bool) (libxul.so)
#9    mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::OnOutputStreamReady(nsIAsyncOutputStream*) (libxul.so)
#10    mozilla::net::nsSocketOutputStream::OnSocketReady(nsresult) (libxul.so)
#11    mozilla::net::nsSocketTransport::OnSocketReady(PRFileDesc*, short) (libxul.so)
#12    mozilla::net::nsSocketTransportService::DoPollIteration(mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>*) (libxul.so)
#13    mozilla::net::nsSocketTransportService::Run() (libxul.so)
#14    non-virtual thunk to mozilla::net::nsSocketTransportService::Run() (libxul.so)
#15    nsThread::ProcessNextEvent(bool, bool*) (libxul.so)

I can try to get filename/line numbers for the PHC stacks if necessary.

Group: core-security → network-core-security

Andy, can you look at this please?

Tentatively a P1 as this a UAF in a new code.

Assignee: nobody → agrover
Priority: -- → P1
Whiteboard: [necko-triaged]

Hey, I just took a look at the neqo-crypto piece of this. The operative function, SecretAgent::close, is shaped like this:

    pub fn close(&mut self) {
        // It should be safe to close multiple times.
        if self.fd.is_null() {
            return;
        }
        if let Some(true) = self.raw {
            // Need to hold the record list in scope until the close is done.
            let _records = self.setup_raw().expect("Can only close");
            unsafe { prio::PR_Close(self.fd as *mut prio::PRFileDesc) };
        } else {
            // Need to hold the IO wrapper in scope until the close is done.
            let _io = self.io.wrap(&[]);
            unsafe { prio::PR_Close(self.fd as *mut prio::PRFileDesc) };
        };
        let _output = self.io.take_output();
        self.fd = null_mut();
    }

Note the first and last statements. So unless this is threaded and racing, this is a double-free problem at the Http3Session layer. Let's see what Dragana thinks.

Flags: needinfo?(dd.mozilla)
Blocks: QUIC

Http3Session and NeqoHttp3Conn are both ref-counted.
NeqoHttp3Conn is created and destroyed by Http3Session nothing else ever references it.

Http3Session is not thread safe and it should not be use on any other thread except the socketThread. There was an issue with thread safety during shutdown. The Stream needed to be in certain state when the shutdown is called for that to happen. Looking at that code I may miss to fix one code path to that state that my trigger the issue, but that code path is not triggered here because it involves shutdown and especially the Timer thread should be shutting down that is not the case in this trace.
I also did a bit of refactoring of nsHttpConnection code in the last 2 weeks and the issue with the timer thread has been fixed.

If this crash has happened in a debug build this would crash on use of Http3Session on multiple threads before crashing here. The ref counting of Http3Session will trigger crash because it is the non-thread-safe ref counting (NS_DECL_ISUPPORTS).

Can we have a bug in necko code that is not Http3Session specific? we may, I hope not and we would have other crashes in other parts of the code.
I could add diagnostic asserts into Http3Session code.

decoder, this is, I assume, not reproducible? Was this a debug build?

Flags: needinfo?(dd.mozilla) → needinfo?(choller)

(In reply to Dragana Damjanovic [:dragana] from comment #3)

decoder, this is, I assume, not reproducible? Was this a debug build?

This was a regular crash report (see comment 0 for the URL) that had PHC information attached to it so we would get the alloc/free stacks in addition to the crash stack which is the "use" stack. According to the crash report this was a regular Nightly build.

The crash report contains a URL and an email address for the person who submitted the crash report. Maybe they can help you to figure out what they did when this happened and maybe it is reproducible (maybe an ASan or TSan build could also reproduce it if we had steps to reproduce).

Flags: needinfo?(choller)

Martin, are you defining dtor function somewhere in neqo code or are you using the nspr function (pl_FDDestructor)?

If you are using pl_FDDestructor, you should not access secret after that.

Flags: needinfo?(mt)

I am using PR_Close() to clean up the socket state. That is called from SecretAgent::close, which in turn is called from the Drop implementation on SecretAgent (that's why we have the guard on the close function, because an explicit close will be eventually followed by a drop).

The crash is specifically about the code being hit twice though, which I would assume is impossible unless it is from different threads.

Flags: needinfo?(mt)
Assignee: agrover → dd.mozilla
Status: NEW → ASSIGNED

Comment on attachment 9134828 [details]
Bug 1618158 - Make Http3Session ref counting thread-safe. r=kershaw

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: not very easy. Freeing of an object must happen at the same time on 2 threads. Also this code is disabled on all channels therefore there is only a small number of people actually affected.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: disabled on beta and release
  • If not all supported branches, which bug introduced the flaw?: Bug 1581637
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?: not risky. This code is actually disabled
  • How likely is this patch to cause regressions; how much testing does it need?: not very likely, it only makes ref counting thread safe.
Attachment #9134828 - Flags: sec-approval?
Regressed by: 1581637
Whiteboard: [necko-triaged] → [necko-triaged] disabled on beta/release
Has Regression Range: --- → yes
Keywords: regression

Comment on attachment 9134828 [details]
Bug 1618158 - Make Http3Session ref counting thread-safe. r=kershaw

sec-approval+

Attachment #9134828 - Flags: sec-approval? → sec-approval+
Group: network-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

This should be fixed by https://github.com/mozilla/neqo/pull/521 and uplift in bug 1628459.

Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Depends on: 1628459
Target Milestone: mozilla76 → mozilla77
Flags: qe-verify-
Whiteboard: [necko-triaged] disabled on beta/release → [necko-triaged] [post-critsmash-triage] disabled on beta/release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: