Closed Bug 1665836 Opened 5 years ago Closed 4 years ago

Intermittent PROCESS-CRASH | damp | application crashed [@ nsThread::ProcessNextEvent(bool, bool*)] on CacheIOThread

Categories

(Core :: Networking: Cache, defect, P1)

Tracking

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 90+ fixed
firefox89 --- wontfix
firefox90 + fixed
firefox91 + fixed

People

(Reporter: intermittent-bug-filer, Assigned: valentin)

Details

(4 keywords, Whiteboard: [sec-survey][post-critsmash-triage][adv-main90+r])

Crash Data

Attachments

(4 files)

Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=316034546&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Ki7sVWKbQDidqap_HI5qnQ/runs/0/artifacts/public/logs/live_backing.log


[task 2020-09-18T04:23:58.804Z] 04:23:58     INFO -  PROCESS-CRASH | damp | application crashed [@ nsThread::ProcessNextEvent(bool, bool*)]
[task 2020-09-18T04:23:58.805Z] 04:23:58     INFO -  Crash dump filename: c:\users\task_1600400516\appdata\local\temp\tmpnb_rm9\profile\minidumps\f638cee3-de80-4772-8ab4-2a5989d42e72.dmp
[task 2020-09-18T04:23:58.805Z] 04:23:58     INFO -  Operating system: Windows NT
[task 2020-09-18T04:23:58.806Z] 04:23:58     INFO -                    10.0.17134
[task 2020-09-18T04:23:58.807Z] 04:23:58     INFO -  CPU: amd64
[task 2020-09-18T04:23:58.807Z] 04:23:58     INFO -       family 6 model 94 stepping 3
[task 2020-09-18T04:23:58.807Z] 04:23:58     INFO -       8 CPUs
[task 2020-09-18T04:23:58.807Z] 04:23:58     INFO -  GPU: UNKNOWN
[task 2020-09-18T04:23:58.808Z] 04:23:58     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
[task 2020-09-18T04:23:58.808Z] 04:23:58     INFO -  Crash address: 0xffffffff
[task 2020-09-18T04:23:58.808Z] 04:23:58     INFO -  Process uptime: 565 seconds
[task 2020-09-18T04:23:58.808Z] 04:23:58     INFO -  Thread 24 (crashed)
[task 2020-09-18T04:23:58.809Z] 04:23:58     INFO -   0  xul.dll!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:d5f57ef7c1a2bf4051b4320e98ede73dee781dd0 : 1227 + 0x26]
[task 2020-09-18T04:23:58.809Z] 04:23:58     INFO -      rax = 0xe5e5e5e5e5e5e5e5   rdx = 0x000000b70285f428
[task 2020-09-18T04:23:58.809Z] 04:23:58     INFO -      rcx = 0x00007ff8a75cf970   rbx = 0x000001deb9a52840
[task 2020-09-18T04:23:58.809Z] 04:23:58     INFO -      rsi = 0x000000b70285f650   rdi = 0x000001dec6ab6700
[task 2020-09-18T04:23:58.809Z] 04:23:58     INFO -      rbp = 0x0000000000000000   rsp = 0x000000b70285f480
[task 2020-09-18T04:23:58.810Z] 04:23:58     INFO -       r8 = 0x000000b70285f581    r9 = 0xffe61a02d3a448fb
[task 2020-09-18T04:23:58.810Z] 04:23:58     INFO -      r10 = 0x00000fff14e6fe34   r11 = 0x0010000000000000
[task 2020-09-18T04:23:58.810Z] 04:23:58     INFO -      r12 = 0x0000000000000000   r13 = 0x0000000000000003
[task 2020-09-18T04:23:58.810Z] 04:23:58     INFO -      r14 = 0x000001deadf74158   r15 = 0x0000000000000001
[task 2020-09-18T04:23:58.810Z] 04:23:58     INFO -      rip = 0x00007ff8a6b90271
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -      Found by: given as instruction pointer in context
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -   1  xul.dll!mozilla::net::CacheIOThread::ThreadFunc() [CacheIOThread.cpp:d5f57ef7c1a2bf4051b4320e98ede73dee781dd0 : 460 + 0x14]
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -      rbx = 0x000001deb9a52840   rbp = 0x0000000000000000
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -      rsp = 0x000000b70285faa0   r12 = 0x0000000000000000
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -      r13 = 0x0000000000000003   r14 = 0x000001deadf74158
[task 2020-09-18T04:23:58.811Z] 04:23:58     INFO -      r15 = 0x0000000000000001   rip = 0x00007ff8a78b1ee1
[task 2020-09-18T04:23:58.812Z] 04:23:58     INFO -      Found by: call frame info
[task 2020-09-18T04:23:58.812Z] 04:23:58     INFO -   2  xul.dll!static mozilla::net::CacheIOThread::ThreadFunc(void*) [CacheIOThread.cpp:d5f57ef7c1a2bf4051b4320e98ede73dee781dd0 : 416 + 0x8]
[task 2020-09-18T04:23:58.812Z] 04:23:58     INFO -      rbx = 0x000001deb9a52840   rbp = 0x0000000000000000
[task 2020-09-18T04:23:58.812Z] 04:23:58     INFO -      rsp = 0x000000b70285fb50   r12 = 0x0000000000000000
[task 2020-09-18T04:23:58.812Z] 04:23:58     INFO -      r13 = 0x0000000000000003   r14 = 0x000001deadf74158
[task 2020-09-18T04:23:58.813Z] 04:23:58     INFO -      r15 = 0x0000000000000001   rip = 0x00007ff8a78b1af3
[task 2020-09-18T04:23:58.813Z] 04:23:58     INFO -      Found by: call frame info
[task 2020-09-18T04:23:58.813Z] 04:23:58     INFO -   3  nss3.dll!_PR_NativeRunThread(void*) [pruthr.c:d5f57ef7c1a2bf4051b4320e98ede73dee781dd0 : 399 + 0xe]
[task 2020-09-18T04:23:58.813Z] 04:23:58     INFO -      rbx = 0x000001deb9a52840   rbp = 0x0000000000000000
[task 2020-09-18T04:23:58.813Z] 04:23:58     INFO -      rsp = 0x000000b70285fb80   r12 = 0x0000000000000000
[task 2020-09-18T04:23:58.814Z] 04:23:58     INFO -      r13 = 0x0000000000000003   r14 = 0x000001deadf74158
[task 2020-09-18T04:23:58.814Z] 04:23:58     INFO -      r15 = 0x0000000000000001   rip = 0x00007ff8c698ac4a
[task 2020-09-18T04:23:58.814Z] 04:23:58     INFO -      Found by: call frame info
[task 2020-09-18T04:23:58.814Z] 04:23:58     INFO -   4  nss3.dll!pr_root(void*) [w95thred.c:d5f57ef7c1a2bf4051b4320e98ede73dee781dd0 : 139 + 0xd]
[task 2020-09-18T04:23:58.814Z] 04:23:58     INFO -      rbx = 0x000001deb9a52840   rbp = 0x0000000000000000
[task 2020-09-18T04:23:58.815Z] 04:23:58     INFO -      rsp = 0x000000b70285fc00   r12 = 0x0000000000000000
[task 2020-09-18T04:23:58.815Z] 04:23:58     INFO -      r13 = 0x0000000000000003   r14 = 0x000001deadf74158
[task 2020-09-18T04:23:58.815Z] 04:23:58     INFO -      r15 = 0x0000000000000001   rip = 0x00007ff8c697bd51
[task 2020-09-18T04:23:58.815Z] 04:23:58     INFO -      Found by: call frame info
[task 2020-09-18T04:23:58.816Z] 04:23:58     INFO -   5  ucrtbase.dll!LoadImageMappings + 0x46
[task 2020-09-18T04:23:58.816Z] 04:23:58     INFO -      rbx = 0x000001deb9a52840   rbp = 0x0000000000000000
[task 2020-09-18T04:23:58.816Z] 04:23:58     INFO -      rsp = 0x000000b70285fc30   r12 = 0x0000000000000000
[task 2020-09-18T04:23:58.816Z] 04:23:58     INFO -      r13 = 0x0000000000000003   r14 = 0x000001deadf74158
[task 2020-09-18T04:23:58.816Z] 04:23:58     INFO -      r15 = 0x0000000000000001   rip = 0x00007ff8e4bdc4be
[task 2020-09-18T04:23:58.817Z] 04:23:58     INFO -      Found by: call frame info
Component: Talos → XPCOM
Product: Testing → Core
Group: core-security → dom-core-security

In case it matters, the main thread's stack looks like

moz_xmalloc
mozilla::net::nsAsyncResolveRequest::DoCallback()
mozilla::net::ExecuteCallback::Run()

The UAF is happening on the CacheIOThread, so I'm going to move this to networking, as the issue is likely a runnable with a raw pointer reference to some other object, and I'd imagine that network-y things are usually being run on that thread.
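
To make the suspected pattern concrete, here is a minimal sketch of how a queued task's capture decides whether its target outlives the caller. It is only an illustration: `EventQueue` and `Entry` are hypothetical names, `std::shared_ptr` and `std::weak_ptr` stand in for Gecko's refcounting, and nothing here is taken from the actual CacheIOThread code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical object that a queued task wants to use later.
struct Entry {
  int value = 42;
};

// Hypothetical single-threaded event queue, standing in for a thread's
// event loop.
class EventQueue {
 public:
  void Dispatch(std::function<int()> task) { mTasks.push(std::move(task)); }

  // Runs the oldest pending task and returns its result.
  int RunNext() {
    std::function<int()> task = std::move(mTasks.front());
    mTasks.pop();
    return task();
  }

 private:
  std::queue<std::function<int()>> mTasks;
};

// The task owns a strong reference, so the Entry stays alive until the
// task has run, even though the caller drops its own reference first.
int RunWithStrongCapture() {
  EventQueue queue;
  auto entry = std::make_shared<Entry>();
  queue.Dispatch([entry] { return entry->value; });
  entry.reset();
  return queue.RunNext();
}

// The task holds only a weak reference. The Entry is gone by the time
// the task runs, but the task can detect that instead of reading freed
// memory. A raw pointer capture would have no way to tell.
int RunWithWeakCapture() {
  EventQueue queue;
  auto entry = std::make_shared<Entry>();
  std::weak_ptr<Entry> weak = entry;
  queue.Dispatch([weak] {
    std::shared_ptr<Entry> strong = weak.lock();
    return strong ? strong->value : -1;
  });
  entry.reset();
  return queue.RunNext();
}
```

In this sketch `RunWithStrongCapture()` returns 42 and `RunWithWeakCapture()` returns -1. A task that captured a plain `Entry*` would dereference freed memory in the second scenario, which is the kind of use-after-free suspected here.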

Group: dom-core-security → network-core-security
Component: XPCOM → Networking: Cache
Summary: Intermittent PROCESS-CRASH | damp | application crashed [@ nsThread::ProcessNextEvent(bool, bool*)] → Intermittent PROCESS-CRASH | damp | application crashed [@ nsThread::ProcessNextEvent(bool, bool*)] on CacheIOThread
Severity: normal → S2
Priority: -- → P1
Assignee: nobody → valentin.gosu
Attached file live_backing.log

Attaching logs so we don't lose them.

Attachment #9185406 - Attachment mime type: application/vnd.tcpdump.pcap → application/octet-stream

The symbols seem to be missing for that revision, so it's hard to figure out exactly what task is being dispatched, or whether it's the thread code that is malfunctioning or the code being run.

The things that we dispatch to the cacheIOThread are:
https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla3net13CacheIOThread25DispatchAfterPendingOpensEP11nsIRunnable&redirect=false
https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla3net13CacheIOThread8DispatchEP11nsIRunnablej&redirect=false

Unfortunately, it only happened once in automation.
There are some similar signatures on crash-stats that I'm looking over.

Crash Signature: [@ nsThread::ProcessNextEvent(bool, bool*)] → [@ nsThread::ProcessNextEvent(bool, bool*)] [@ mozilla::net::CacheIOThread::ThreadFunc]
Keywords: stalled
Keywords: stalled

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: This is a shutdown thread race. Seems difficult to exploit.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?: Should graft cleanly on all supported branches.
  • How likely is this patch to cause regressions; how much testing does it need?: Low chance of regressions. We just hold an extra reference for the lifetime of the CacheIO thread.
Attachment #9227658 - Flags: sec-approval?
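
For readers without access to the patch, the idea of holding an extra reference for the lifetime of the thread can be sketched roughly as follows. This is not the landed change: `IoThread`, `Flags`, and `RunShutdownRaceDemo` are hypothetical names, and `std::shared_ptr` is used as a stand-in for Gecko's refcounted pointers.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Flags shared between the test harness and the worker (hypothetical).
struct Flags {
  std::atomic<bool> shutdown{false};
  std::atomic<bool> destroyed{false};
  std::atomic<bool> aliveAtLoopExit{false};
};

// Hypothetical stand-in for an object that owns an I/O thread loop.
class IoThread : public std::enable_shared_from_this<IoThread> {
 public:
  explicit IoThread(Flags& flags) : mFlags(flags) {}
  ~IoThread() { mFlags.destroyed = true; }

  // The thread function owns a strong reference (`self`) for as long as
  // it runs, so the object cannot be destroyed underneath it.
  std::thread Start() {
    std::shared_ptr<IoThread> self = shared_from_this();
    return std::thread([self = std::move(self)]() mutable {
      self->ThreadFunc();
      self.reset();  // Release only after the loop has finished.
    });
  }

 private:
  void ThreadFunc() {
    while (!mFlags.shutdown) {
      std::this_thread::yield();
    }
    // Still inside a member function: this is safe only because the
    // thread holds its own reference to the object.
    mFlags.aliveAtLoopExit = !mFlags.destroyed;
  }

  Flags& mFlags;
};

// Simulates the shutdown race: the owner releases its reference while
// the thread is still running, then asks the thread to stop.
bool RunShutdownRaceDemo() {
  Flags flags;
  std::shared_ptr<IoThread> owner = std::make_shared<IoThread>(flags);
  std::thread worker = owner->Start();
  owner.reset();
  bool survivedRelease = !flags.destroyed;
  flags.shutdown = true;
  worker.join();
  return survivedRelease && flags.aliveAtLoopExit && flags.destroyed;
}
```

`RunShutdownRaceDemo()` returns true in this sketch: the object survives the owner's release, is still alive when the loop exits, and is destroyed only once the thread drops its own reference.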

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

sec-approval = dveditz

Attachment #9227658 - Flags: sec-approval? → sec-approval+
Group: network-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch

The patch landed in nightly and beta is affected.
:valentin, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(valentin.gosu)

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

Beta/Release Uplift Approval Request

  • User impact if declined: Potential crash at shutdown.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch is very simple. The CacheIO thread just holds an extra reference to the object to make sure it's not released before the thread exits.
  • String changes made/needed:
Flags: needinfo?(valentin.gosu)
Attachment #9227658 - Flags: approval-mozilla-beta?

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(valentin.gosu)
Whiteboard: [sec-survey]
Flags: qe-verify-
Whiteboard: [sec-survey] → [sec-survey][post-critsmash-triage]

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

approved for 90.0 rc1

Attachment #9227658 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: Potential crash at shutdown.
  • Fix Landed on Version: 91
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low chance of regressions. We just hold an extra reference for the lifetime of the CacheIO thread.
  • String or UUID changes made by this patch: N/A
Attachment #9227658 - Flags: approval-mozilla-esr78?
Whiteboard: [sec-survey][post-critsmash-triage] → [sec-survey][post-critsmash-triage][adv-main90+r]

Comment on attachment 9227658 [details]
Bug 1665836 - Make CacheIOThread::ThreadFunc hold reference to thread r=#necko

Approved for 78.12esr.

Attachment #9227658 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
Flags: needinfo?(valentin.gosu)
Group: core-security-release