Closed Bug 1755186 Opened 4 years ago Closed 2 years ago

Intermittent awsy/test_memory_usage.py TestMemoryUsage.test_open_tabs | application crashed [@ mozilla::detail::nsDequeBase::ObjectAt(unsigned long) const]

Categories

(Core :: Networking: HTTP, defect, P2)

defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 124+ fixed
firefox122 --- wontfix
firefox123 --- wontfix
firefox124 + fixed

People

(Reporter: intermittent-bug-filer, Assigned: valentin)

Details

(4 keywords, Whiteboard: [necko-triaged][necko-priority-queue][adv-main124-][adv-esr115.9-])

Crash Data

Attachments

(2 files, 1 obsolete file)

Filed by: smolnar [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=367715731&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/AyPf4b7DR5KyYfptx7aSOQ/runs/0/artifacts/public/logs/live_backing.log


INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/5f731954-1203-03b5-8f48-ca5de9c7824b.extra
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - PROCESS-CRASH | awsy/test_memory_usage.py TestMemoryUsage.test_open_tabs | application crashed [@ mozilla::detail::nsDequeBase::ObjectAt(unsigned long) const]
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Crash dump filename: /tmp/tmppd1gk7a1.profile/minidumps/5f731954-1203-03b5-8f48-ca5de9c7824b.dmp
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Operating system: Linux
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -                   0.0.0 Linux 4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018 x86_64
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - CPU: amd64
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -      family 6 model 85 stepping 7
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -      4 CPUs
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Linux Ubuntu 18.04 - bionic (Ubuntu 18.04.6 LTS)
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - 
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Crash reason:  SIGSEGV / SI_KERNEL
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Crash address: 0x0
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Process uptime: not available
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - 
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO - Thread 0  (crashed)
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -  0  libxul.so!mozilla::detail::nsDequeBase::ObjectAt(unsigned long) const [nsDeque.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 260 + 0xe]
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -     rax = 0x0000000000000001   rdx = 0x00000000000000e6
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -     rcx = 0xe5e5e5e5e5e5e5e5   rbx = 0x0000000000004590
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -     rsi = 0x00000000000000e6   rdi = 0x00007f2fcd1df7a0
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -     rbp = 0x00007ffc423cb3e0   rsp = 0x00007ffc423cb3e0
[task 2022-02-13T11:35:31.400Z] 11:35:31     INFO -      r8 = 0xe5e5e5e5e5e5e5e5    r9 = 0x0000000000000003
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     r10 = 0x0018886c9d7744ab   r11 = 0x00000000ffffffff
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     r12 = 0x0000000000000000   r13 = 0x0000000000000123
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     r14 = 0x0000000000004590   r15 = 0x00007f2f7a6b9cc0
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     rip = 0x00007f30272a3efa
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     Found by: given as instruction pointer in context
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -  1  libxul.so!mozilla::net::HpackDynamicTableReporter::CollectReports(nsIHandleReportCallback*, nsISupports*, bool) [Http2Compression.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 65 + 0x6a]
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     rbx = 0x0000000000004590   rbp = 0x00007ffc423cb470
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     rsp = 0x00007ffc423cb3f0   r12 = 0x0000000000000000
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     r13 = 0x0000000000000123   r14 = 0x0000000000004590
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     r15 = 0x00007f2f7a6b9cc0   rip = 0x00007f302300df0e
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.401Z] 11:35:31     INFO -  2  libxul.so!mozilla::detail::RunnableFunction<nsMemoryReporterManager::DispatchReporter(nsIMemoryReporter*, bool, nsIHandleReportCallback*, nsISupports*, bool)::$_1>::Run() [nsThreadUtils.h:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 531 + 0x15]
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rbx = 0x00007f2f7a270940   rbp = 0x00007ffc423cb490
 INFO -     rsp = 0x00007ffc423cb480   r12 = 0x00007f2fc5c83880
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r13 = 0x00000000ffffffff   r14 = 0x0000000000000000
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r15 = 0x00007f30350e1be8   rip = 0x00007f3022d71d72
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -  3  libxul.so!mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) [TaskController.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 770 + 0x22]
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rbx = 0x0000000000000000   rbp = 0x00007ffc423cb6b0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rsp = 0x00007ffc423cb4a0   r12 = 0x00007f2fc5c83880
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r13 = 0x00000000ffffffff   r14 = 0x0000000000000000
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r15 = 0x00007f30350e1be8   rip = 0x00007f3026863784
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -  4  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 1171 + 0x3a]
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rbx = 0x00007ffc423cb7d0   rbp = 0x00007ffc423cb910
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rsp = 0x00007ffc423cb6c0   r12 = 0x00007f3020ec22c0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r13 = 0x00007f301fc62478   r14 = 0x00007f301fc622f0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r15 = 0x00007f301fc1c301   rip = 0x00007f302686ae22
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -  5  libxul.so!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [MessagePump.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 85 + 0x2a]
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rbx = 0x00007f301fc1d240   rbp = 0x00007ffc423cbaa0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rsp = 0x00007ffc423cb920   r12 = 0x00007f301fc1d268
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r13 = 0x00007f30179f4880   r14 = 0x00007f3020e7ebc0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r15 = 0x00007f30350e2bc0   rip = 0x00007f30268b2006
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -  6  libxul.so!MessageLoop::Run() [message_loop.cc:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 306 + 0xb]
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rbx = 0x00007ffc423cbab8   rbp = 0x00007ffc423cbae0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     rsp = 0x00007ffc423cbab0   r12 = 0x00007ffc423cbba0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r13 = 0x00007ffc423cbb78   r14 = 0x00007f30350e1ae0
[task 2022-02-13T11:35:31.402Z] 11:35:31     INFO -     r15 = 0x00007ffc423cbd40   rip = 0x00007f3027499bbf
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -  7  libxul.so!nsBaseAppShell::Run() [nsBaseAppShell.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 137 + 0xc]
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     rbx = 0x00007f3020e79f90   rbp = 0x00007ffc423cbb00
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     rsp = 0x00007ffc423cbaf0   r12 = 0x00007ffc423cbba0
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     r13 = 0x00007ffc423cbb78   r14 = 0x00007f30350e1ae0
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     r15 = 0x00007ffc423cbd40   rip = 0x00007f3027b99d09
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -     Found by: call frame info
[task 2022-02-13T11:35:31.403Z] 11:35:31     INFO -  8  libxul.so!nsAppStartup::Run() [nsAppStartup.cpp:d19a3142131dc80c732ff5ad4e7ac69f7479e7f4 : 295 + 0x9]

UAF while running HpackDynamicTableReporter.

Group: core-security-release → network-core-security
Component: AWSY → Networking: HTTP
Product: Testing → Core

Well, there are poison values in the register. The actual crash is on null, so maybe it isn't really a UAF.

I did find 15 crashes in the wild with HpackDynamicTableReporter in the proto signature.

Here are some of the recent ones:

bp-a310073c-e4e6-4175-9231-9e0f90220124
bp-f1b5bd36-d373-4b5c-bd4a-d47180211230
bp-64871290-3d60-48e1-80f8-d17bb0211222
bp-236f0893-6a77-4489-8ccb-31c930211219
bp-0a7b34d4-d815-4010-bb15-5ab9d0220128

All but the last of these is crashing on a poison value.

We do periodically get a memory report automatically, so this crash can happen without a user manually getting a memory report, but it is probably very difficult to exploit. I'd guess there's also some kind of race involved.

Crash Signature: [@ mozilla::detail::nsDequeBase::ObjectAt(unsigned long) const] → [@ mozilla::detail::nsDequeBase::ObjectAt ] [@ mozilla::net::HpackDynamicTableReporter::CollectReports ]

The nsDequeBase::ObjectAt signature is very generic, and the most recent 3 crashes I see are not related to the memory reporter.

Looking at the line the crash is happening on, it looks like the dead object is HpackDynamicTableReporter::mCompressor, but ~Http2BaseCompressor clears the weak reference from the reporter to the compressor. However, I don't see any locking there, so CollectReports() could start, copying mCompressor into a local variable, then ~Http2BaseCompressor could run on the socket thread, which wouldn't clear the stack reference.

I looked at a bunch of the crash reports, and in most of them, the socket thread is idle, but I did find one where ~nsSocketTransport() was running, so maybe this could be evidence of some activity happening on the socket thread which could tear down a Http2BaseCompressor?

bp-d79c6b68-ec97-493d-b455-8d5de0211127

I guess this is more of a race leading to a UAF, so I'll change the sec type.

Priority: -- → P2
Whiteboard: [necko-triaged]
Whiteboard: [necko-triaged] → [necko-triaged][necko-priority-next]
Whiteboard: [necko-triaged][necko-priority-next] → [necko-triaged][necko-priority-queue]
Assignee: nobody → valentin.gosu
Attachment #9378542 - Attachment description: Bug 1755186 - Hold mutex when reporting Http2BaseCompressor::SizeOfExcludingThis r=#necko → Bug 1755186 - HpackDynamicTableReporter::CollectReports should hold a mutex r=#necko
Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/3c97098ecdef HpackDynamicTableReporter::CollectReports should hold a mutex r=necko-reviewers,jesup

Backed out for causing build bustages in Http2Compression.h
https://hg.mozilla.org/integration/autoland/rev/3fe6bd949054cc4a09d7e282c149f0d97e4b7003

Push with failures
Failure log

/builds/worker/checkouts/gecko/netwerk/protocol/http/Http2Compression.h:62:33: error: expected ';' at end of declaration list
gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:688: Unified_cpp_dom_fetch0.o] Error 1
/builds/worker/checkouts/gecko/netwerk/protocol/http/Http2Compression.h:62:33: error: expected ';' at end of declaration list
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(valentin.gosu)
Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/334e2ef40a67 HpackDynamicTableReporter::CollectReports should hold a mutex r=necko-reviewers,jesup
Group: network-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-

:valentin can you nominate this for esr115 when you get a chance?

Flags: needinfo?(valentin.gosu)

HpackDynamicTableReporter::mMutex protects access to HpackDynamicTableReporter::mCompressor
nvFIFO::mMutex protects access to mvFIFO::mTable - is aquired when adding or removing elements in the table, and when reporting SizeOfDynamicTable

Original Revision: https://phabricator.services.mozilla.com/D200814

Attachment #9387981 - Flags: approval-mozilla-esr115?

HpackDynamicTableReporter::mMutex protects access to HpackDynamicTableReporter::mCompressor
nvFIFO::mMutex protects access to mvFIFO::mTable - is aquired when adding or removing elements in the table, and when reporting SizeOfDynamicTable

Original Revision: https://phabricator.services.mozilla.com/D200814

Attachment #9387982 - Flags: approval-mozilla-esr115?
Attachment #9387981 - Attachment is obsolete: true
Flags: needinfo?(valentin.gosu)
Attachment #9387981 - Flags: approval-mozilla-esr115?

Comment on attachment 9387982 [details]
Bug 1755186 - HpackDynamicTableReporter::CollectReports should hold a mutex r=#necko

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Possible UAF when measuring memory usage.
  • User impact if declined: To avoid crashes when measuring memory usage.
  • Fix Landed on Version: 124
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The fix makes it so that we hold a mutex while changing the dynamic header table or when reading it from the main thread.
    This avoids accessing elements that may have been freed.
    The fix is low in complexity and should only affect behaviour when measuring memory use.
Attachment #9387982 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+

Uplift Approval Request

  • User impact if declined: See ticket
  • Is Android affected?: no
  • Code covered by automated testing: no
  • Steps to reproduce for manual QE testing: See ticket
  • Explanation of risk level: Low
  • Fix verified in Nightly: yes
  • Risk associated with taking this patch: See ticket
  • Needs manual QE test: no
  • String changes made/needed: None
Whiteboard: [necko-triaged][necko-priority-queue] → [necko-triaged][necko-priority-queue][adv-esr124-][adv-main124-]
Whiteboard: [necko-triaged][necko-priority-queue][adv-esr124-][adv-main124-] → [necko-triaged][necko-priority-queue][adv-main124-][adv-esr115.9-]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: