Closed Bug 1540759 Opened 5 years ago Closed 5 years ago

AddressSanitizer: heap-use-after-free [@ MOZ_Z_inflate_fast] with READ of size 1 through [@ mozilla::net::nsHTTPCompressConv::OnDataAvailable]

Categories

(Core :: Networking: HTTP, defect, P1)

x86_64
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox-esr60 68+ fixed
firefox67 --- wontfix
firefox68 + fixed

People

(Reporter: decoder, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [necko-triaged][post-critsmash-triage][adv-main68+][adv-esr60.8+])

Attachments

(3 files, 1 obsolete file)

The attached crash information was submitted via the ASan Nightly Reporter on mozilla-central-asan-nightly revision 68.0a1-20190328131410-https://hg.mozilla.org/mozilla-central/rev/4e2ea1a75e878ae392e4775f2eddd9f83d1b008e.

For detailed crash information, see attachment.

Group: core-security → network-core-security
Keywords: csectype-uaf

The "use" and "allocation" stacks look the same.

The "use" stack is thread T35 is in the middle of a call to HttpChannelChild::DoOnDataAvailable():
nsresult rv = mListener->OnDataAvailable(aRequest, aStream, offset, count);

The "free" stack is thread T0 calling ReleaseListeners() which I assume nulls out mListener. I don't know what sort of interleaving there is allowed for HttpChannelChild.

One possible fix would be to locally root mListener in HttpChannelChild::DoOnDataAvailable(), if somehow that is getting called simultaneously with ReleaseListeners(). But I don't know how this code is expected to work.

That's a good question...

It looks like we're doing background data delivery in an HTTP channel, with a gzip decoder in the data pipe. That's the HttpChannelChild::OnTransportAndData bits. That's coming off an event we queued in HttpBackgroundChannelChild::RecvOnTransportAndData.

While that's all going on, on the main thread we have what looks like an IPC error of some sort (MessageChannel::OnNotifyMaybeChannelError) and start tearing down actors: PContentChild::OnChannelClose -> PContentChild::DestroySubtree -> PNeckoChild::DestroySubtree -> HttpChannelChild::ActorDestroy.

That ends up calling into HttpBaseChannel::DoNotifyListener.

At this point, afaict, we have already lost. We are not done with the off-main-thread OnDataAvailable notification, and now we're sendin an OnStopRequest on the main thread to the same listener. That's likely to cause races in the listener and its following listeners, even apart from the race we end up with ourselves on our mListener.

It seems to me that HttpChannelChild::DoOnDataAvailable should have a mutex protecting mCanceled(!) and once it takes that mutex it should do that check, then set a boolean indicating we are calling OnDataAvailable, release the mutex, do the call, take the mutex, unset the boolean, and return.

In HttpChannelChild::ActorDestroy we should then take the same mutex, check whether the "we are delivering data" boolean is set", and if so either queue a runnable to call HandleAsyncAbort (possibly after checking the same boolean) or signal DoOnDataAvailable to queue the runnable when it unsets its boolean or something.

See Also: → 1542153

I meant to needinfo people...

Flags: needinfo?(dd.mozilla)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #3)

That's a good question...

It looks like we're doing background data delivery in an HTTP channel, with a gzip decoder in the data pipe. That's the HttpChannelChild::OnTransportAndData bits. That's coming off an event we queued in HttpBackgroundChannelChild::RecvOnTransportAndData.

While that's all going on, on the main thread we have what looks like an IPC error of some sort (MessageChannel::OnNotifyMaybeChannelError) and start tearing down actors: PContentChild::OnChannelClose -> PContentChild::DestroySubtree -> PNeckoChild::DestroySubtree -> HttpChannelChild::ActorDestroy.

That ends up calling into HttpBaseChannel::DoNotifyListener.

At this point, afaict, we have already lost. We are not done with the off-main-thread OnDataAvailable notification, and now we're sendin an OnStopRequest on the main thread to the same listener. That's likely to cause races in the listener and its following listeners, even apart from the race we end up with ourselves on our mListener.

It seems to me that HttpChannelChild::DoOnDataAvailable should have a mutex protecting mCanceled(!) and once it takes that mutex it should do that check, then set a boolean indicating we are calling OnDataAvailable, release the mutex, do the call, take the mutex, unset the boolean, and return.

In HttpChannelChild::ActorDestroy we should then take the same mutex, check whether the "we are delivering data" boolean is set", and if so either queue a runnable to call HandleAsyncAbort (possibly after checking the same boolean) or signal DoOnDataAvailable to queue the runnable when it unsets its boolean or something.

this code is not really build to handle IPC channel errors. We should make this behave nicer.

Can we find out what is the IPC error?

Flags: needinfo?(dd.mozilla)
Priority: -- → P1
Whiteboard: [necko-triaged]

Bug 1542153 has a similar situation which is not obviously an IPC error. Looks like it might just be teardown of the PContenChild actor, which tears down the actors it manages...

First, we must kungfudeathgrip mListener at https://searchfox.org/mozilla-central/rev/7944190ad1668a94223b950a19f1fffe8662d6b8/netwerk/protocol/http/HttpChannelChild.cpp#979

This is easy to do and fixes the UAF.

Second, we must enqueue OnStopRequest call on the listener from the aborter same/similar way as we do at https://searchfox.org/mozilla-central/rev/7944190ad1668a94223b950a19f1fffe8662d6b8/netwerk/protocol/http/HttpChannelChild.cpp#1013

The contract is that onStart/onData/onStop are delivered in sequence when onData is retargetted; we don't call onData until onStart exits, and similarly, we don't call onStop until all onData are handled.

This second part is harder, we have to write DoNotifyListener for child (doesn't need to be virtualized, because the HttpAsyncAborter is templated with the implementor class type!) that will enqueue onStop call.

Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED

Comment on attachment 9063545 [details]
Bug 1540759, r=valentin

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: One would need good knowledge of the code (even I can't easily see how this works ;)) and force the parent process to be killed at the right time to invoke actorDestroyed. this is all highly racy and the crash would be on the content process only anyway
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: all fot hem
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: we have changed the code recently in bug 1530691, currently on beta; so backporting this to release/esr may be a bit tricky, but not impossible
  • How likely is this patch to cause regressions; how much testing does it need?: hard to say, but I gave my self a hard a time to understand all the possible flows here and this is the right fix that should not cause any trouble under a normal run

I hope that we cover this bit with our automated tests well enough, but hard to say.

Attachment #9063545 - Flags: sec-approval?

Comment on attachment 9063545 [details]
Bug 1540759, r=valentin

sec-approval+ for trunk.

Attachment #9063545 - Flags: sec-approval? → sec-approval+
Keywords: checkin-needed
Group: network-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

(In reply to Honza Bambas (:mayhemer) from comment #10)

  • If not, how different, hard to create, and risky will they be?: we have changed the code recently in bug 1530691, currently on beta; so backporting this to release/esr may be a bit tricky, but not impossible

Is this something we need on ESR60? Given the rating, I'm inclined to say yes unless this is prohibitively difficult to backport.

Flags: needinfo?(honzab.moz)

I can try to have a patch. This has baked w/o problems long enough. Let's do it. Deadline for a patch?

Flags: needinfo?(honzab.moz) → needinfo?(ryanvm)

60.8esr is due to ship on July 9, so there's time :)

Flags: needinfo?(ryanvm)
Flags: qe-verify-
Whiteboard: [necko-triaged] → [necko-triaged][post-critsmash-triage]

I think you can do that ESR60 approval request now :)

Flags: needinfo?(honzab.moz)
Attached patch esr60 rebased (obsolete) — Splinter Review

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: UAF
  • Fix Landed on Version: 68
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): sticks well and affects only older code.
  • String or UUID changes made by this patch: none
Flags: needinfo?(honzab.moz)
Attachment #9075165 - Flags: approval-mozilla-esr60?
Attached patch esr60Splinter Review

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Comment 21
  • User impact if declined:
  • Fix Landed on Version:
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky):
  • String or UUID changes made by this patch:
Attachment #9075165 - Attachment is obsolete: true
Attachment #9075165 - Flags: approval-mozilla-esr60?
Attachment #9075207 - Flags: approval-mozilla-esr60?
Comment on attachment 9075207 [details] [diff] [review]
esr60

Fixes a Necko sec bug. Approved for 60.8esr.
Attachment #9075207 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Whiteboard: [necko-triaged][post-critsmash-triage] → [necko-triaged][post-critsmash-triage][adv-main68+][adv-esr60.8+]
Group: core-security-release
Duplicate of this bug: 1507178
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: