Closed Bug 1873597 Opened 1 year ago Closed 1 year ago

Crash in [@ memchr] in nsContentUtils::IsURIInList()

Categories

(Core :: Privacy: Anti-Tracking, defect, P1)

Firefox 121
x86_64
Windows 10
defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 123+ fixed
firefox121 --- wontfix
firefox122 --- wontfix
firefox123 + fixed
firefox124 + fixed

People

(Reporter: pbone, Assigned: bvandersloot)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main123+r][adv-esr115.8+r])

Crash Data

Attachments

(4 files, 1 obsolete file)

Crash report: https://crash-stats.mozilla.org/report/index/3aff5085-ecc8-4f5a-ada2-8ce7e0240104

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0  VCRUNTIME140.dll  memchr  D:\a\_work\1\s\src\vctools\crt\vcruntime\src\string\amd64\memchr.asm:92
1  xul.dll  std::_Narrow_char_traits<char, int>::find  /builds/worker/fetches/vs/VC/Tools/MSVC/14.29.30133/include/xstring:422
1  xul.dll  std::_Traits_find  /builds/worker/fetches/vs/VC/Tools/MSVC/14.29.30133/include/xstring:618
1  xul.dll  std::basic_string_view<char, std::char_traits<char> >::find const  /builds/worker/fetches/vs/VC/Tools/MSVC/14.29.30133/include/xstring:1490
1  xul.dll  mozilla::detail::nsTStringRepr<char>::Find const  xpcom/string/nsTStringRepr.cpp:116
1  xul.dll  mozilla::detail::nsTStringRepr<char>::Find const  xpcom/string/nsTStringRepr.h:356
1  xul.dll  nsContentUtils::IsURIInList  dom/base/nsContentUtils.cpp:11025
2  xul.dll  mozilla::net::  netwerk/url-classifier/AsyncUrlChannelClassifier.cpp:470
2  xul.dll  mozilla::net::  netwerk/url-classifier/AsyncUrlChannelClassifier.cpp:742
2  xul.dll  mozilla::net::AsyncUrlChannelClassifier::CheckChannel::<lambda_1>::operator const  netwerk/url-classifier/AsyncUrlChannelClassifier.cpp:910

This is a buffer overflow detected by PHC. The allocation stack is:

Alloc stack:

#0    nsTSubstring<char>::AssignNonDependent(nsTSubstringTuple<char> const&, unsigned long long, std::nothrow_t const&) (xul.dll)
#1    nsTSubstring<char>::Assign(nsTSubstringTuple<char> const&) (xul.dll)
#2    nsContentUtils::IsURIInList(nsIURI*, nsTString<char> const&) (xul.dll)
#3    mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/netwerk/url-classifier/AsyncUrlChannelClassifier.cpp:910:13'>::Run() (xul.dll)
#4    mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex &> const&) (xul.dll)
#5    mozilla::TaskController::ProcessPendingMTTask(bool) (xul.dll)
#6    NS_ProcessNextEvent(nsIThread*, bool) (xul.dll)
#7    mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (xul.dll)
#8    MessageLoop::RunHandler() (xul.dll)
#9    nsBaseAppShell::Run() (xul.dll)
#10    nsAppShell::Run() (xul.dll)
#11    nsAppStartup::Run() (xul.dll)
#12    XREMain::XRE_mainRun() (xul.dll)
#13    XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) (xul.dll)
#14    XRE_main(int, char**, mozilla::BootstrapConfig const&) (xul.dll)
#15    wmain(int, wchar_t**) (firefox.exe)

For a 16 byte object allocated at 0x207a297bff0. The faulting address is: 0x00000207a297c000

Group: core-security → network-core-security
Summary: Crash in [@ memchr] → Crash in [@ memchr] in nsContentUtils::IsURIInList()

This looks more like anti-tracking than networking, but I could be wrong.

Moving to Anti-Tracking based on the last bugs that made changes to that code - Bug 1590922 and bug 1594540.

The overflow seems to be here:
https://searchfox.org/mozilla-central/rev/52f45d12530f9f8da61c6c7237003b5d1499b6d1/dom/base/nsContentUtils.cpp#11102-11108

int32_t startIndexOfCurrentLevel = host[0] == '*' ? 1 : 0;
int32_t startIndexOfNextLevel =
    host.Find(".", startIndexOfCurrentLevel + 1);
if (startIndexOfNextLevel <= 0) {
  break;
}
host = "*"_ns + nsDependentCSubstring(host, startIndexOfNextLevel);

I've also looked at the crash dump in VS code and this is the stack at the crash site.

Name Value Type
host <Error reading characters of string.> nsTAutoStringN<char,64>
▶ Length {mLength=0x00000005 } mozilla::detail::nsTStringLengthStorage<char>
Flags 0x0005 mozilla::detail::StringDataFlags
◢ [Raw View] {mInlineCapacity={mLength=0x0000003f } mStorage=0x0000009ad6ffd9b4 "www.youtube.com" } nsTAutoStringN<char,64>
▶ nsTString<char> {...} nsTString<char>
▶ mInlineCapacity {mLength=0x0000003f } const mozilla::detail::nsTStringLengthStorage<char>
▶ mStorage 0x0000009ad6ffd9b4 "www.youtube.com" char[0x00000040]
index Variable is optimized away and not available.
indexAfterHost Variable is optimized away and not available.
pathInList Variable is optimized away and not available.
scheme "https" nsTAutoStringN<char,64>
▶ Length {mLength=0x00000005 } mozilla::detail::nsTStringLengthStorage<char>
Flags 0x0011 mozilla::detail::StringDataFlags
▶ [Raw View] {mInlineCapacity={mLength=0x0000003f } mStorage=0x0000009ad6ffda0c "https" } nsTAutoStringN<char,64>
startIndexOfCurrentLevel Variable is optimized away and not available.
startIndexOfNextLevel Variable is optimized away and not available.
token <Error reading characters of string.> const nsTString<char>
▶ Length {mLength=0x00000017 } mozilla::detail::nsTStringLengthStorage<char>
Flags 0x0005 mozilla::detail::StringDataFlags
◢ [Raw View] {...} const nsTString<char>
◢ nsTSubstring<char> {...} nsTSubstring<char>
◢ mozilla::detail::nsTStringRepr<char> {mData=0x00000207af8b2e88 <Error reading characters of string.> mLength={mLength=0x00000017 } mDataFlags=...} mozilla::detail::nsTStringRepr<char>
▶ mData 0x00000207af8b2e88 <Error reading characters of string.> char *
▶ mLength {mLength=0x00000017 } mozilla::detail::nsTStringLengthStorage<char>
mDataFlags 0x0005 mozilla::detail::StringDataFlags
mClassFlags NULL_TERMINATED (0x0002) const mozilla::detail::StringClassFlags
Component: Networking → Privacy: Anti-Tracking
Group: network-core-security → dom-core-security
Severity: -- → S2
Priority: -- → P1
Flags: needinfo?(bvandersloot)

The algorithm gives me some smells at first glance (naked consts, lots of off by one opportunity), but a little inspection had me feeling better.

The stack from the crash site is useful, thank you Valentin! Interesting things from that:

  • the token is length 23 and should be something from one of our various lists consumed from Disconnect.
  • the host is a view into "www.youtube.com", which I think means it was the host of the URI argument.
  • the host's length is 5.

So in theory, while running this, we would do the following trace of execution, tildes indicating the quoted section from above:

host = "www.youtube.com"
enter for (;;)
try to find host in token - if we find it return true 
fail to find (we know this because we don't return true)
~c = 0
~n = "www.youtube.com".find(".", c+1) (=3)
~n >=? 0 (we know this is always true, because we would break out of this loop)
host = "*" + dependentsubstring(".youtube.com")
try to find host in token - if we find it return true 
fail to find (we know this because we don't return true)
~c = 1 (we start with *)
~n = "*.youtube.com".find(".", c+1) (=9)
~n >=? 0 (we know this is always true, because we would break out of this loop)
host = "*" + dependentsubstring(".com")  *** This is the value that gives us a host length of 5.
try to find host in token - if we find it return true 
fail to find (we know this because we don't return true)
~c = 1 (we start with *)
~n = "*.com".find(".", c+1) (=str::npos) IT SEEMS TO CRASH HERE! 
~n >=? 0 : this is now false, so we break, immediately returning false.

The dependent strings are giving me pause. I'm not sure how those work under the hood. I know they refer to an external data buffer, but these are

However, the weird part here is that this crashing bit of code is only dependent on the host from URI and whether or not it matches any of the lists. Given that the host's mStorage is "www.youtube.com", which is really common and not an edge case in any way, and we should see users with and without that matching, I am suspicious.

I don't look at many crash reports, is "Possible bit flips max confidence: 12%" high?

My other lead at the moment is host = "*"_ns + nsDependentCSubstring(host, startIndexOfNextLevel); where host is an in-scope nsAutoCString. I'm not sure what happens as far as data goes when doing that assignment or if it allows a reallocation or anything weird.

I'll keep looking into this, but will probably need to get approval and look at that crash dump to make more progress if that string thing doesn't turn up any ideas.

Keeping my NI, but posting as an update.

My other lead at the moment is host = "*"_ns + nsDependentCSubstring(host, startIndexOfNextLevel); where host is an in-scope nsAutoCString. I'm not sure what happens as far as data goes when doing that assignment or if it allows a reallocation or anything weird.

This assignment (probably) produces the relevant allocation. At least it's the only tuple assign I see in IsURIInList.

I now have a working theory: the variable host is created and stores the buffer of "www.youtube.com" on the stack. Then, grabbing a dependent string from it and assigning it into itself is not okay because the dependent string expects the lifetime of the owner to be longer than that of the dependent. Really there should be another variable that holds onto the current substring of host for each iteration.

I've written a change to do that locally, and run local gtests that exist for this function. I'm going to start review to get feedback from peers.

Flags: needinfo?(bvandersloot)

@paul: what is the typical testing/reproduction requirement for bugs found by PHC? I'm not sure how to validate that my patch would have prevented the reported crash, or reliably replicate heap buffer overruns.

Flags: needinfo?(pbone)
Assignee: nobody → bvandersloot

Taking the liberty to answer for Paul here: PHC is just like regular crash reports. So unfortunately, there is no real way to validate your fix.

What you describe sounds like a stack-use-after-return, is that correct? In that case, not even ASan would catch it in our configuration.

Flags: needinfo?(pbone)

Hmm, it isn't actually stack-use-after-return. The Auto string is actually referencing the heap after the first assignment because of the dependent assignment. I'm not sure why the Find would crash unless the underlying view of the string was malformed.

I'm not familiar with the string implementations, so I'm learning as I go here. To get a more useful set of eyes I'll join the CC's here against a git log xpcom/string....

@nika: does the use of the host variable break any expected invariants of the string classes here: https://searchfox.org/mozilla-central/rev/52f45d12530f9f8da61c6c7237003b5d1499b6d1/dom/base/nsContentUtils.cpp#11067-11109? Particularly in the end: https://searchfox.org/mozilla-central/rev/52f45d12530f9f8da61c6c7237003b5d1499b6d1/dom/base/nsContentUtils.cpp#11102-11108? The short version is that the Find in line 11104 is crashing, presumably because of a host constructed in the assignment on line 11108 in a previous iteration of the loop where it takes continually smaller suffixes of host prepended by "*".

Flags: needinfo?(nika)

Yeah, I think the call in host = "*"_ns + nsDependentCSubstring(host, startIndexOfNextLevel); is incorrect.

The value in host is a string buffer, and it is being assigned to a nsCSubstringTuple which was created by the + operator. The first part of this (the "*"_ns) is fine, and the buffer in host will be destroyed & replaced with "*"_ns, then the second part will attempt to assign the suffix. That dependent substring is invalid, however, as it's a reference into the buffer from host which was destroyed to perform the assignment.

The code should probably use a call intended for this purpose like ReplaceLiteral instead, which won't have this issue because it directly modifies the buffer. Something like this:

host.ReplaceLiteral(0, startIndexOfNextLevel, "*");

FWIW don't fully understand how the Find call is what is crashing here, rather than that assignment, but optimizers can do fun things.

Flags: needinfo?(nika)

(In reply to Benjamin VanderSloot [:bvandersloot] from comment #5)

@paul: what is the typical testing/reproduction requirement for bugs found by PHC? I'm not sure how to validate that my patch would have prevented the reported crash, or reliably replicate heap buffer overruns.

As decoder said. There's no different method from other crashes that occur in the wild (and not on treeherder). And it depends on the bug.

Attachment #9373482 - Attachment description: Bug 1873597 - Improve URI-list matching algorithm for ETP/TP - r=nika!,#anti-tracking! → Bug 1873597 - Improve URI-list matching algorithm for performance - r=nika!,#anti-tracking!
Attached file sec-approval request (obsolete) —

sec approval request was in the wrong text box. Whoops!

Attachment #9375988 - Flags: sec-approval?
Comment on attachment 9375988 [details] sec-approval request --
Attachment #9375988 - Attachment is obsolete: true
Attachment #9375988 - Flags: sec-approval?
Attached file sec-approval request
Attachment #9375989 - Flags: sec-approval?

Comment on attachment 9373482 [details]
Bug 1876275 - Improve URI-list matching algorithm for performance - r=nika!,#anti-tracking!

Approved to land and uplift

Attachment #9373482 - Flags: sec-approval+
Attachment #9373482 - Attachment description: Bug 1873597 - Improve URI-list matching algorithm for performance - r=nika!,#anti-tracking! → Bug 1876275 - Improve URI-list matching algorithm for performance - r=nika!,#anti-tracking!
Attachment #9376205 - Flags: approval-mozilla-esr115?
Attachment #9376206 - Flags: approval-mozilla-beta?

Benjamin, Is this going to land on mozilla-central this cycle? Also, I see you asked for a beta uplift but didn't fill in the uplift request form, could you do it please? Thanks.

Flags: needinfo?(bvandersloot)

Uplift Approval Request

  • String changes made/needed: no
  • Is Android affected?: yes
  • Explanation of risk level: minimal, the change is very small and 1-to-1 functional
  • Fix verified in Nightly: no
  • User impact if declined: continued security issue
  • Code covered by automated testing: yes
  • Risk associated with taking this patch: This is commonly used code so any unexpected corner cases also not covered by automatic testing would be bugs.
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: n/a

Uplift Approval Request

  • Explanation of risk level: minimal, the change is very small and 1-to-1 functional
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • User impact if declined: continued security issue
  • Risk associated with taking this patch: This is commonly used code so any unexpected corner cases also not covered by automatic testing would be bugs.
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: n/a
  • String changes made/needed: no
  • Is Android affected?: yes

Requested! Sorry- didn't realize I missed a step while trying out the new Lando uplift request process.

Flags: needinfo?(bvandersloot)

Thanks for the uplift requests, this needs to land on mozilla-central first before we fix it on older branches though.

Flags: needinfo?(bvandersloot)

I'll be landing via https://phabricator.services.mozilla.com/D200633 to keep this subtle and NI :pascalc once that is landed.

See Also: → 1876275

This change is now in mozilla-central: https://hg.mozilla.org/mozilla-central/rev/d8e90bb252ae

Flags: needinfo?(bvandersloot) → needinfo?(pascalc)
Attachment #9376206 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9376205 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+

Closed via Bug 1876275, and all uplifts complete.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Group: dom-core-security → core-security-release
Depends on: 1876275
See Also: 1876275
Target Milestone: --- → 124 Branch
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Whiteboard: [adv-main123+r]
Whiteboard: [adv-main123+r] → [adv-main123+r][adv-esr115.8+r]
Group: core-security-release

See comment 8. This ended up being a string view UAF kind of thing despite being reported as a bounds issue.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: