Closed Bug 1719319 Opened 4 years ago Closed 4 years ago

Startup Crash in [@ mozilla::URLPreloader::ReadCache]

Categories

(Core :: XPConnect, defect)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
92 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox-esr91 --- fixed
firefox89 --- wontfix
firefox90 --- wontfix
firefox91 blocking fixed
firefox92 blocking fixed

People

(Reporter: aryx, Assigned: kmag)

References

(Blocks 1 open bug, Regression)

Details

(5 keywords, Whiteboard: [adv-main91+r])

Crash Data

Attachments

(3 files)

18 crashes for nightly 91.0a1 starting with build 20210626094654, all on Windows 10. The 89.0 release branch has 5 crashes with that signature.

Crash report: https://crash-stats.mozilla.org/report/index/8928ea8d-f55f-40bf-b5a3-f249b0210627

MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(!listElem->isInList())

Top 10 frames of crashing thread:

0 xul.dll mozilla::URLPreloader::ReadCache js/xpconnect/loader/URLPreloader.cpp:312
1 xul.dll mozilla::URLPreloader::BackgroundReadFiles js/xpconnect/loader/URLPreloader.cpp:340
2 xul.dll mozilla::detail::RunnableMethodImpl< xpcom/threads/nsThreadUtils.h:1201
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1146
4 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300
5 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:324
6 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:306
7 xul.dll static nsThread::ThreadFunc xpcom/threads/nsThread.cpp:392
8 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:399
9 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:139
Severity: -- → S2
Group: dom-core-security
Regressed by: 1717778
Has Regression Range: --- → yes

This release assert was added by bug 1717778, but the underlying issue is presumably older, and any fix here will need to be backported.

The crash is in the insertBack here:

      auto entry = mCachedURLs.GetOrInsertNew(key, key);
      entry->mResultCode = NS_ERROR_NOT_INITIALIZED;
      pendingURLs.insertBack(entry);

I guess things can go badly if GetOrInsertNew finds an existing key because it can already be in the pending URLs list?

Kris, do you have time to look into this?

Flags: needinfo?(kmaglione+bmo)

Giving this a relatively lower security rating because it's our code running at startup. Other problems found by this assertion could easily be exploitable UAF sec-high bugs.

Keywords: sec-moderate

The volume of crashes, startup ones, makes it a blocker for 91 and 92. Unfortunately, this being a sec bug means that fewer people that may be able to investigate and fix it may know that we have this bug on file. Andrew can you help us find an owner rapidly for this crash please? Thanks!

Flags: needinfo?(continuation)
Keywords: topcrash

Anything accessed before that time won't benefit from caching, and having
entries inserted into the hashtable before the cache file is read may lead to
undefined behavior.

This bug shouldn't be exploitable, since it only affects file reads that
happen long before any untrusted code has a chance to run.

Assignee: nobody → kmaglione+bmo
Status: NEW → ASSIGNED
Flags: needinfo?(continuation)

We need to make sure we stop adding entries before we start the cache write
task (which is triggered by the ScriptPreloader's cache write task after
startup is marked complete) so that the hashtable we're writing can't be
modified while we're iterating it from that thread.

Flags: needinfo?(kmaglione+bmo)

Kris, it seems your patches are accepted and this is a sec-moderate, could you land and request uplift so as that we have it in the next beta? Thanks

Flags: needinfo?(kmaglione+bmo)
Group: dom-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch

Comment on attachment 9232480 [details]
Bug 1719319: Part 2 - Make sure URLPreloader startup is marked complete at the same time as ScriptPreloader. r=mccr8

Beta/Release Uplift Approval Request

  • User impact if declined: This patch fixes a data race, the main side-effect which is an assertion and crash at startup.

Note that since a startup crash automatically purges the caches that we crash when reading, even without that patch, the crash in question should only happen once.

  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): These are fairly simple data consistency fixes.
  • String changes made/needed:
Flags: needinfo?(kmaglione+bmo)
Attachment #9232480 - Flags: approval-mozilla-beta?
Attachment #9232230 - Flags: approval-mozilla-beta?

Comment on attachment 9232480 [details]
Bug 1719319: Part 2 - Make sure URLPreloader startup is marked complete at the same time as ScriptPreloader. r=mccr8

91 release blocker, approved for our last beta today, thanks.

Attachment #9232480 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9232230 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-

Kris, we are still crashing at startup in beta 9 which has the patches.

Flags: needinfo?(kmaglione+bmo)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 92 Branch → ---

I looked over the code a bit so it wouldn't just have to be Kris staring at this. The only thing I noticed was that if the buffer was somehow cut off or corrupted maybe you could end up in a situation where you read CacheKey::mType, but then hit an error while reading mPath. mPath would remain empty. Then if there was already an existing entry with the same mType and an empty mPath, you'd get a collision. I can't imagine how you'd end up with an empty path except in the case of an error.

I still haven't figured out why this is happening, so since we can make it
non-fatal on release builds, we should, given the crash volume.

I don't want to completely silence the issue in non-release builds, since
something is clearly still wrong.

Attachment #9234589 - Attachment description: Bug 1719319: Make consistency error non-fatal on relase. r=mccr8 → Bug 1719319: Make consistency error non-fatal on release. r=mccr8
Flags: needinfo?(kmaglione+bmo)

Make consistency error non-fatal on relase. r=mccr8
https://hg.mozilla.org/integration/autoland/rev/8c6dba83

Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch

Comment on attachment 9234589 [details]
Bug 1719319: Make consistency error non-fatal on release. r=mccr8

Wallpapers over the crash for release users. Approved for 91.0/91.0esr RC2.

Attachment #9234589 - Flags: approval-mozilla-release+
Attachment #9234589 - Flags: approval-mozilla-esr91+

We're seeing some hits on the new diagnostic assert. I filed bug 1724336 to continue the investigation.

Whiteboard: [adv-main90+r]
Whiteboard: [adv-main90+r] → [adv-main91+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: