Closed Bug 1614967 Opened 4 years ago Closed 1 year ago

Crash in [@ sqlite3_uri_parameter] with use-after-free

Categories

(Toolkit :: Storage, defect, P2)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
firefox-esr68 --- unaffected
firefox73 --- unaffected
firefox74 + wontfix
firefox75 + wontfix
firefox76 --- wontfix
firefox77 --- wontfix

People

(Reporter: decoder, Unassigned)

References

(Blocks 1 open bug)

Details

(4 keywords)

Crash Data

This bug is for crash report bp-5b571b5e-d9e3-439f-a1d9-12d5d0200211.

Top 10 frames of crashing thread:

0 nss3.dll sqlite3_uri_parameter third_party/sqlite3/src/sqlite3.c:163331
1 nss3.dll winOpen third_party/sqlite3/src/sqlite3.c:46667
2 xul.dll `anonymous namespace'::xOpen storage/TelemetryVFS.cpp:491
3 nss3.dll sqlite3VdbeHalt third_party/sqlite3/src/sqlite3.c:80407
4 nss3.dll sqlite3VdbeExec third_party/sqlite3/src/sqlite3.c:88291
5 nss3.dll sqlite3_step third_party/sqlite3/src/sqlite3.c:83275
6 nss3.dll sqlite3_exec third_party/sqlite3/src/sqlite3.c:122068
7 xul.dll mozilla::storage::Connection::executeSql storage/mozStorageConnection.cpp:1216
8 xul.dll mozilla::storage::Connection::ExecuteSimpleSQL storage/mozStorageConnection.cpp:1852
9 xul.dll mozStorageTransaction::Commit storage/mozStorageHelper.h:124

Additional PHC data:

Free stack:

#0    sqlite3VdbeExec(Vdbe*) (nss3.pdb)
#1    sqlite3_step(sqlite3_stmt*) (nss3.pdb)
#2    mozilla::storage::Connection::stepStatement(sqlite3*, sqlite3_stmt*) (xul.pdb)
#3    mozilla::storage::Statement::ExecuteStep(bool*) (xul.pdb)
#4    mozilla::places::`anonymous namespace'::FetchPageInfo(RefPtr<mozilla::places::Database> const&, mozilla::places::PageData&) (xul.pdb)
#5    mozilla::places::AsyncFetchAndSetIconForPage::Run() (xul.pdb)
#6    nsThread::ProcessNextEvent(bool, bool*) (xul.pdb)
#7    NS_ProcessNextEvent(nsIThread*, bool) (xul.pdb)
#8    mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) (xul.pdb)
#9    MessageLoop::RunHandler() (xul.pdb)
#10    MessageLoop::Run() (xul.pdb)
#11    static nsThread::ThreadFunc(void*) (xul.pdb)
#12    _PR_NativeRunThread(void*) (nss3.pdb)
#13    pr_root(void*) (nss3.pdb)
#14    thread_start<unsigned int (__cdecl*)(void *),1> (ucrtbase.pdb)
#15    BaseThreadInitThunk (kernel32.pdb)

Alloc stack:

#0    sqlite3Malloc(unsigned long long) (nss3.pdb)
#1    dbMallocRawFinish(sqlite3*, unsigned long long) (nss3.pdb)
#2    sqlite3VdbeExec(Vdbe*) (nss3.pdb)
#3    sqlite3_step(sqlite3_stmt*) (nss3.pdb)
#4    mozilla::storage::Connection::stepStatement(sqlite3*, sqlite3_stmt*) (xul.pdb)
#5    mozilla::storage::Statement::ExecuteStep(bool*) (xul.pdb)
#6    mozilla::places::`anonymous namespace'::FetchPageInfo(RefPtr<mozilla::places::Database> const&, mozilla::places::PageData&) (xul.pdb)
#7    mozilla::places::AsyncFetchAndSetIconForPage::Run() (xul.pdb)
#8    nsThread::ProcessNextEvent(bool, bool*) (xul.pdb)
#9    NS_ProcessNextEvent(nsIThread*, bool) (xul.pdb)
#10    mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) (xul.pdb)
#11    MessageLoop::RunHandler() (xul.pdb)
#12    MessageLoop::Run() (xul.pdb)
#13    static nsThread::ThreadFunc(void*) (xul.pdb)
#14    _PR_NativeRunThread(void*) (nss3.pdb)
#15    pr_root(void*) (nss3.pdb)

I'm going to try and get filename/line numbers for the PHC traces, the symbol server doesn't support it right now.

Locally parsed stacks (looks mostly correct, but in general, my symfile parser does not cover all corner cases):

Free stack:

#0    sqlite3VdbeExec(Vdbe*)
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 85089
#1    sqlite3_result_text16le(sqlite3_context*, void const*, int, void (*)(void*))
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 82122
#2    mozilla::storage::Connection::stepStatement(sqlite3*, sqlite3_stmt*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:storage/mozStorageConnection.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 1124
#3    mozilla::storage::Statement::ExecuteStep(bool*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:storage/mozStorageStatement.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 519
#4    mozilla::places::`anonymous namespace'::FetchPageInfo(RefPtr<mozilla::places::Database> const&, mozilla::places::PageData&)
    in file hg:hg.mozilla.org/releases/mozilla-beta:toolkit/components/places/FaviconHelpers.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 97
#5    mozilla::places::AsyncFetchAndSetIconForPage::Run()
    in file hg:hg.mozilla.org/releases/mozilla-beta:toolkit/components/places/FaviconHelpers.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 527
#6    nsThread::ProcessNextEvent(bool, bool*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThread.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 1222
#7    NS_ProcessNextEvent(nsIThread*, bool)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThreadUtils.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 486
#8    mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/glue/MessagePump.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 302
#9    MessageLoop::RunHandler()
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/chromium/src/base/message_loop.cc:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 309
#10    MessageLoop::Run()
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/chromium/src/base/message_loop.cc:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 291
#11    static nsThread::ThreadFunc(void*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThread.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 466
#12    pr_StringToNetAddrFB(char const*, PRNetAddr*)
    in file hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/misc/prnetdb.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 2336
#13    PR_LoadLibraryWithFlags(PRLibSpec, int)
    in file hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/linking/prlink.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 416
#14    __crt_stdio_output::output_processor<char,__crt_stdio_output::stream_output_adapter<char>,__crt_stdio_output::standard_base<char,__crt_stdio_output::stream_output_adapter<char> > >::type_case_integer(unsigned int,bool)
#15    SbpCreateSwitchContext

Alloc stack:

#0    sqlite3_realloc(void*, int)
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 27105
#1    sqlite3FindTable(sqlite3*, char const*, char const*)
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 107975
#2    sqlite3VdbeMemGrow(sqlite3_value*, int, int)
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 74900
#3    sqlite3_result_text16le(sqlite3_context*, void const*, int, void (*)(void*))
    in file hg:hg.mozilla.org/mozilla-central:third_party/sqlite3/src/sqlite3.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 82122
#4    mozilla::storage::Connection::stepStatement(sqlite3*, sqlite3_stmt*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:storage/mozStorageConnection.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 1124
#5    mozilla::storage::Statement::ExecuteStep(bool*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:storage/mozStorageStatement.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 519
#6    mozilla::places::`anonymous namespace'::FetchPageInfo(RefPtr<mozilla::places::Database> const&, mozilla::places::PageData&)
    in file hg:hg.mozilla.org/releases/mozilla-beta:toolkit/components/places/FaviconHelpers.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 97
#7    mozilla::places::AsyncFetchAndSetIconForPage::Run()
    in file hg:hg.mozilla.org/releases/mozilla-beta:toolkit/components/places/FaviconHelpers.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 527
#8    nsThread::ProcessNextEvent(bool, bool*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThread.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 1222
#9    NS_ProcessNextEvent(nsIThread*, bool)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThreadUtils.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 486
#10    mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/glue/MessagePump.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 302
#11    MessageLoop::RunHandler()
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/chromium/src/base/message_loop.cc:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 309
#12    MessageLoop::Run()
    in file hg:hg.mozilla.org/releases/mozilla-beta:ipc/chromium/src/base/message_loop.cc:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 291
#13    static nsThread::ThreadFunc(void*)
    in file hg:hg.mozilla.org/releases/mozilla-beta:xpcom/threads/nsThread.cpp:2a94604e87b36c6edaffd365fe5bb0aa1da3ef5b line 466
#14    pr_StringToNetAddrFB(char const*, PRNetAddr*)
    in file hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/misc/prnetdb.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 2336
#15    PR_LoadLibraryWithFlags(PRLibSpec, int)
    in file hg:hg.mozilla.org/mozilla-central:nsprpub/pr/src/linking/prlink.c:119cd111d7d4568c3040437648bb67f0f83f9eb7 line 416
Component: General → Storage
Product: Core → Toolkit
Group: core-security → firefox-core-security

Marco, could you take a look? I'm not sure if the issue here might be inside places favicon code or something deeper in storage code. Thanks.

Flags: needinfo?(mak)

According to the crash dates, it seems related to the changes in bug 1611925 or the new Sqlite version from bug 1611253. The latter seems more plausible off-hand.

Regressed by: 1611925
No longer blocks: PHC
Flags: needinfo?(mak)
Blocks: PHC
No longer regressed by: 1611925

the signature is regressing in volume in firefox 74

:janv, can you please have a look here?

Flags: needinfo?(jvarga)

I honestly can't explain that spike in 74.0b3. This runs off the main-thread, there is a realloc and somehow xOpen receives a bogus pointer to zName? It looks tricky.

I presume the problem here is from Places attaching to a database using its AttachDatabase helper.

:drh, Are there limits on sqlite3_uri_parameter when xOpen is being called because of an ATTACH call that we might be obviously violating? Thanks!

Flags: needinfo?(drh)

The argument to sqlite3_uri_parameter() must be a filename that was passed by the SQLite "pager" module down into the VFS. If TelemetryVFS is fabricating its own filenames and passing them down into other VFSes, that might be the source of the problem.

Flags: needinfo?(drh)

I couldn't find a point in TelemetryVFS where we'd fabricate our own filename, we always pass-through zName.
We used to pass a custom built string to sqlite_uri_parameter BEFORE we fixed bug 1611925, and that is more or less when this crash started. Apparently passing our own string was preventing the crash.

Looking deeper... The stack trace does not seem to be making sense. sqlite3.c:80407 is shown in context here: https://www.sqlite.org/src/artifact/ff690e6c9314ef28?ln=3074 and the sqlite3CommitInternalChanges() function is a one-liner (which your compiler has apparently in-lined) shown here: https://www.sqlite.org/src/artifact/2394d2c853088106?ln=579-581. There are no deeper function calls, and in particular no calls to VFS routines, anywhere in the vicinity. Could this be some kind of stack corruption?

You are right, all the crash stacks look broken. That doesn't help and the crash spike on b4 is even more puzzling. We may be following a red herring.

Let alone sqlite_uri_parameter for a bit, supposing it's just stack corruption. The PHC stacks in comment 1 shows a realloc in sqlite3_result_text16le / sqlite3FindTable, I don't know what sqlite3FindTable may realloc, and what may keep a dangling ref to that.

:mak, if the priority is not proper, please set it accordingly.

Priority: -- → P1

Discussed with Marco offline. The root cause is still unknown and it's unlikely we can fix it in time - fix-optional for 74.
Also, lower the priority since the number of crashes has been decreasing.

Priority: P1 → P2

The volume seems to have dropped on Beta74 after EARLY_BETA_OR_EARLIER was unset. Given that PHC is tied to that, is that an expected outcome?

That definitely makes sense -- PHC will cause some UAFs to crash instead of doing invalid but silent accesses. If this bug isn't fixed by the start of the next dev cycle we will likely see another spike when PHC is re-enabled for early beta.

How can we make this bug actionable?

(In reply to Ryan VanderMeulen [:RyanVM] from comment #18)

How can we make this bug actionable?

Reproduce under rr/pernosco? Can PHC run on linux and work with rr?

Reproduce under rr/pernosco? Can PHC run on linux and work with rr?

PHC runs on all platforms and poses no obvious obstacles to rr/Pernosco.

PHC is again disabled in 75.0b7 so the crash volume should go down.

Christian, do we have someone in Mozilla with experience with rr/Pernosco and PHC to try to capture this?

Richard, leaving alone the main stack that may be broken, do the last calls in the PHC additional stacks from comment 1 tell you anything about what may be happening here?

Flags: needinfo?(jvarga)
Flags: needinfo?(drh)
Flags: needinfo?(choller)

In order to support APIs like sqlite3_filename_database(), and to support API calls like sqlite3_uri_boolean() against the names of journal files passed down into an SQLite VFS, the filenames must be laid out in memory in a very particular way. Older versions of SQLite were more tolerant of non-conforming memory layouts. Newer versions are increasingly strict. My guess (and this is only a guess) is that the TelemetryVFS does not conform to the latest (evolving) requirements on the memory layout of filenames passed into xOpen.

There is a new API for SQLite 3.32.0 (due for release on 2020-05-29) that will construct filenames suitable to be passed into the xOpen method of a VFS. See draft documentation at https://www.sqlite.org/draft/c3ref/create_filename.html

Long-term, I think the correct fix is to update TelemetryVFS to make use of sqlite3_create_filename() when it constructs filenames to be passed down into lower-level VFSes. That API is not yet available in an official SQLite release, but the Prerelease Snapshot is certainly stable enough for development work, and if you ask, we can run tests to ensure that it is stable enough for an official FF release.

If you don't want to use the SQLite Prerelease of 3.32.0, then another interim fix would be to simply comment-out the calls to sqlite3_uri_boolean() found in the default SQLite VFSes. I don't imagine that FF is using any of the query parameters that those calls implement, so no harm will come of simply commenting them out. A similar short-term work-around is that we could do a 3.31.2 release of SQLite that has a new compile-time option that causes the sqlite3_uri_boolean() calls normally found in the default VFSes to be omitted.

Flags: needinfo?(drh)

It would be worth noting that we have no crashes in nightly 77 while both nightly 76 and 75 had a significant amount of crashes. There's three possibilities: the crash is still there but the signature changed, the crash is still there but it's a lot harder to hit or somebody else accidentally fixed the crash.

Based on comment 24, we can wait for the new Sqlite version.

Though, in xOpen we just take zName and pass it down to sqlite3_uri_parameter, we also pass the same zName to the underlying xOpen. There's no manipulation of the database name on our side, nor we build a custom database name.
The db in this stack doesn't have parameters, it's just a normal connection to places.sqlite with attached favicons.sqlite.

Keywords: stalled

We had bp-4922e06a-ae1d-48fa-9906-780560200503 yesterday (but no PHC stack there).

What is the best way for us to get a patched sqlite3.c file to you to try out?

At the moment we don't have an easy way to reproduce this crash, apart from pushing something to Nightly and seeing how crashes evolve in crash-stats.
We can push some #ifdef NIGHTLY_BUILD code, if it's not likely to cause disruption to users. We could do this you have a simple patch that we can #ifdef/#else.
If we must temporarily replace all of sqlite3.c, we could do it too, but we must backout that patch before the next soft freeze on 28th of May, so we'd have a couple weeks.

If we'd have steps to reproduce or a locally reproduced crash, it would be much simpler.

I must also note that indeed the crash became quite rare, so it's not given that 2 weeks of Nightly will be enough to tell if a change is solving the problem or we just didn't get a report in that timeframe. We may have to use #ifdef EARLY_BETA_OR_EARLIER. It all depends on how much the patch is risky for final users.

It looks like this might have been fixed by something in Firefox 79, given the release date and sudden stop in crash reports?

Flags: needinfo?(choller)

I see at least one crash for Firefox 80, so I would not be too optimistic given the low frequency in the recent past. The "positive" part of the story might be, that this crash's EXCEPTION_ACCESS_VIOLATION_WRITE is pointing now to the null page, not to some random address.

I assume, that the lock on mHistory->mBlockShutdownMutex does not prevent the mDBConn from being closed (for shutdown?) while the transaction is running (btw: it seems to be a possibly long transaction with dispatches to the main thread in between and such).

(In reply to Jens Stutte [:jstutte] (REO for FF 81) from comment #31)

I see at least one crash for Firefox 80, so I would not be too optimistic given the low frequency in the recent past. The "positive" part of the story might be, that this crash's EXCEPTION_ACCESS_VIOLATION_WRITE is pointing now to the null page, not to some random address.

I would assume that is a different bug indeed. The drop after July 26 looks very noticeable in stats and it remained that way until today.

The severity field for this bug is set to normal. However, the bug is flagged with the sec-high keyword.
:mak, could you consider increasing the severity of this security bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mak)
Flags: needinfo?(mak)
Severity: normal → S3

Considered the initial investigation has shown broken stacks, and there has been no report in 1 years, I'm resolving as incomplete for now.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.

Keywords: stalled
Group: firefox-core-security
You need to log in before you can comment on or make changes to this bug.