Closed Bug 1970753 Opened 10 months ago Closed 9 months ago

AddressSanitizer: heap-use-after-free [@ operator-] with READ of size 4

Categories

(Core :: Graphics: WebRender, defect)

defect

Tracking

()

RESOLVED FIXED
141 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- unaffected
firefox-esr140 141+ fixed
firefox139 --- wontfix
firefox140 --- wontfix
firefox141 + fixed

People

(Reporter: tsmith, Assigned: lsalzman)

References

(Blocks 1 open bug, Regression)

Details

(4 keywords, Whiteboard: [sec-high part was fixed by bug 1963341][adv-main141-][adv-ESR140.1-])

Attachments

(2 files)

Attached file asan-log.txt

Found while fuzzing 20250502-5b5bd7e73009 (--enable-address-sanitizer --enable-fuzzing)

I do have a test case but it is not fully reduced and it seems to trigger multiple crashes. Here is the UAF the full log is attached.
The most easily reproducible report generated by the test case is an assertion failure:

firefox: src/gl.cc:1957: void TexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei, GLenum, GLenum, void *): Assertion `xoffset + width <= t.width' failed.

The assertion failure can be reproduce with all patches from bug 1963341 applied. I will work on getting Pernosco sessions.

==462==ERROR: AddressSanitizer: heap-use-after-free on address 0x5070005aeca4 at pc 0x748e864fcacf bp 0x748e39eb9800 sp 0x748e39eb97f8
READ of size 4 at 0x5070005aeca4 thread T39
    #0 0x748e864fcace in operator- /gecko/gfx/wr/swgl/src/glsl.h:855:57
    #1 0x748e864fcace in operator- /gecko/gfx/wr/swgl/src/gl.cc:171:34
    #2 0x748e864fcace in Composite /gecko/gfx/wr/swgl/src/composite.h:591:62
    #3 0x748e85af3646 in webrender::compositor::sw_compositor::SwCompositeJob::composite_rect::hdf3f597965e94ea7 /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs
    #4 0x748e85af57bf in webrender::compositor::sw_compositor::SwCompositeJob::process::hb3f0d4fa14b33510 /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs:709:13
    #5 0x748e85af57bf in webrender::compositor::sw_compositor::SwCompositeGraphNode::process_job::hcbc8fdd31ab65c9f /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs:848:13
    #6 0x748e85af57bf in webrender::compositor::sw_compositor::SwCompositeThread::process_job::h67180a7bea78fa37 /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs:964:9
    #7 0x748e85af78ac in webrender::compositor::sw_compositor::SwCompositeThread::wait_for_composites::hce87d4a805a541dc /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs:1124:17
    #8 0x748e85b115c9 in _$LT$webrender..compositor..sw_compositor..SwCompositor$u20$as$u20$webrender..composite..Compositor$GT$::end_frame::h73850d5d9c14fe53 /gecko/gfx/wr/webrender/src/compositor/sw_compositor.rs:1993:13
    #9 0x748e85df6583 in webrender::renderer::Renderer::render_impl::hd185126aacfbbe5a /gecko/gfx/wr/webrender/src/renderer/mod.rs:1789:21
    #10 0x748e85dded3c in webrender::renderer::Renderer::render::h6465cafaf50cef67 /gecko/gfx/wr/webrender/src/renderer/mod.rs:1283:30
    ...

0x5070005aeca4 is located 52 bytes inside of 80-byte region [0x5070005aec70,0x5070005aecc0)
freed by thread T39 here:
    #0 0x5b55b7b729fd in operator delete(void*) /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:143:3
    #1 0x748e864f28e6 in ObjectStore<Texture>::erase(unsigned long) /gecko/gfx/wr/swgl/src/gl.cc:742:7
    #2 0x748e864f25e9 in DeleteTexture /gecko/gfx/wr/swgl/src/gl.cc:2021:26
    #3 0x748e864df991 in _$LT$swgl..swgl_fns..Context$u20$as$u20$gleam..gl..Gl$GT$::delete_textures::h649ec7a38a50c86f /gecko/gfx/wr/swgl/src/swgl_fns.rs:822:17
    #4 0x748e85b2b72e in webrender::device::gl::Device::delete_texture::h0f50c9c2f8596acd /gecko/gfx/wr/webrender/src/device/gl.rs:3002:9
    #5 0x748e85e95a54 in webrender::renderer::Renderer::draw_frame::h02801c24c87a7e7d /gecko/gfx/wr/webrender/src/renderer/mod.rs:5171:13
    #6 0x748e85de4d36 in webrender::renderer::Renderer::render_impl::hd185126aacfbbe5a /gecko/gfx/wr/webrender/src/renderer/mod.rs:1594:17
    #7 0x748e85dded3c in webrender::renderer::Renderer::render::h6465cafaf50cef67 /gecko/gfx/wr/webrender/src/renderer/mod.rs:1283:30
    #8 0x748e851410b6 in wr_renderer_render /gecko/gfx/webrender_bindings/src/bindings.rs:650:11
    #9 0x748e7734d092 in mozilla::wr::RendererOGL::UpdateAndRender(mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>> const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char>> const&, bool*, mozilla::wr::FrameReadyParams const&, mozilla::wr::RendererStats*) /gecko/gfx/webrender_bindings/RendererOGL.cpp:226:19
    #10 0x748e7734afeb in mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> const&, mozilla::TimeStamp const&, mozilla::wr::FrameReadyParams const&, mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>> const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char>> const&, mozilla::wr::RendererStats*, bool*) /gecko/gfx/webrender_bindings/RenderThread.cpp:853:31
    ...

previously allocated by thread T39 here:
    #0 0x5b55b7b7219d in operator new(unsigned long) /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0x748e869886a5 in ObjectStore<Texture>::insert(unsigned long, Texture const&) /gecko/gfx/wr/swgl/src/gl.cc:709:35
    #2 0x748e864f24e6 in insert /gecko/gfx/wr/swgl/src/gl.cc:721:5
    #3 0x748e864f24e6 in GenTextures /gecko/gfx/wr/swgl/src/gl.cc:2016:31
    #4 0x748e864dec90 in _$LT$swgl..swgl_fns..Context$u20$as$u20$gleam..gl..Gl$GT$::gen_textures::hd4c4c97224eee4c5 /gecko/gfx/wr/swgl/src/swgl_fns.rs:710:13
    #5 0x748e85b26642 in webrender::device::gl::Device::create_texture::h35341bbb4dcd4ec1 /gecko/gfx/wr/webrender/src/device/gl.rs:2600:17
    #6 0x748e85e859ec in webrender::renderer::Renderer::create_gpu_buffer_texture::hcdd65b8c704938ed /gecko/gfx/wr/webrender/src/renderer/mod.rs:4896:38
    #7 0x748e85e859ec in webrender::renderer::Renderer::draw_frame::h02801c24c87a7e7d /gecko/gfx/wr/webrender/src/renderer/mod.rs:4946:36
    #8 0x748e85de4d36 in webrender::renderer::Renderer::render_impl::hd185126aacfbbe5a /gecko/gfx/wr/webrender/src/renderer/mod.rs:1594:17
    #9 0x748e85dded3c in webrender::renderer::Renderer::render::h6465cafaf50cef67 /gecko/gfx/wr/webrender/src/renderer/mod.rs:1283:30
    #10 0x748e851410b6 in wr_renderer_render /gecko/gfx/webrender_bindings/src/bindings.rs:650:11
    ...

Thread T39 created by T0 here:
    #0 0x5b55b7b17a61 in pthread_create /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:250:3
    #1 0x748e975b72b9 in _PR_CreateThread /gecko/nsprpub/pr/src/pthreads/ptthread.c:429:10
    #2 0x748e975a54fe in PR_CreateThread /gecko/nsprpub/pr/src/pthreads/ptthread.c:496:10
    #3 0x748e741c3701 in nsThread::Init(nsTSubstring<char> const&) /gecko/xpcom/threads/nsThread.cpp:615:20
    #4 0x748e741d0d66 in nsThreadManager::NewNamedThread(nsTSubstring<char> const&, nsIThreadManager::ThreadCreationOptions, nsIThread**) /gecko/xpcom/threads/nsThreadManager.cpp:619:22
    #5 0x748e741db689 in NS_NewNamedThread(nsTSubstring<char> const&, nsIThread**, already_AddRefed<nsIRunnable>, nsIThreadManager::ThreadCreationOptions) /gecko/xpcom/threads/nsThreadUtils.cpp:176:57
    #6 0x748e773435bf in NS_NewNamedThread<9UL> /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:76:10
    #7 0x748e773435bf in mozilla::wr::RenderThread::Start(unsigned int) /gecko/gfx/webrender_bindings/RenderThread.cpp:141:17
    #8 0x748e76fe4cb9 in gfxPlatform::InitLayersIPC() /gecko/gfx/thebes/gfxPlatform.cpp:1346:7
    #9 0x748e76fdf092 in gfxPlatform::Init() /gecko/gfx/thebes/gfxPlatform.cpp:975:3
    #10 0x748e7e7bc9b4 in GetPlatform /builds/worker/workspace/obj-build/dist/include/gfxPlatform.h:184:7
    ...

A Pernosco session is available here: https://pernos.co/debug/Mu5umOwID5XZIUnjAaeymQ/index.html

This was created with m-c 236f8bd740a3 and the "Add more checks" patch applied.

Keywords: pernosco

I was able to trigger bug 1970772 and bug 1970773 with the same test case that I use to get the above Pernosco session with. I am trying to reduce the test case but it is large and unreliable.

Keywords: sec-high

I initially thought this was related to my recent SWGL changes, but now I'm not so sure. I've discussed it with Lee a bit.

The changes I made in theory should not be in use yet (the preference isn't flipped), but it's possible it's triggering some other way.

A few questions:

(a) Do we think this and the related bugs are only triggered when a view transition occurs? The only test case I could find in the related bugs creates a view transition, but maybe there are other test cases that don't have this?

(b) The crash seems to occur in a deleted gpu_buffer_texture which shouldn't be touched during the sw composite calls, so something very strange is happening. Could it possibly be related to a texture ID being reused between a compositing and gpu_buffer_texture? Could it possibly be related to the changes that can render a frame to an off-screen buffer (e.g. what if a begin/end frame was being missed somehow in the compositing code, which may not do the block waiting on sw composite thread to complete?). Just complete speculation here.

(c) There is a related bug where SWGL was receiving some RED data for a RGBA texture. Lee added support for this conversion, but I think it might be indicative of a deeper underlying bug (which might be the same cause as this), as we should never be supplying RED data for an RGBA texture from WR, as far as I know?

Flags: needinfo?(twsmith)
Flags: needinfo?(nical.bugzilla)
Assignee: nobody → lsalzman
Status: NEW → ASSIGNED

We're not really sure what's causing this, but we have a guess that at heart this may be due to some recent change in WebRender or Gecko that is causing a composite job to keep referencing a SWGL texture even after the texture has been deleted by WebRender.

I have made a tentative patch to keep the locked texture alive even if WR requests a DeleteTexture, which should get rid of any use-after-free behavior. At worst, it might leave over a non-security leak behavior if the composite jobs are not being properly cleaned up, leading to this bug in the first place, but that seems like a much lesser evil.

In the mean time, this hypothetically will at least downgrade the security issue and prevent similar issues from happening in the future.

Comment on attachment 9494505 [details]
Bug 1970753 - Keep locked resources alive. r?gw

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Potential use-after-free in Software WebRender that may be getting aggravated by the recent introduction of view transitions.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Unknown
  • Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: 139+
  • If not all supported branches, which bug introduced the flaw?: Bug 1950759
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?:
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely
  • Is the patch ready to land after security approval is given?: Yes
  • Is Android affected?: Yes
Attachment #9494505 - Flags: sec-approval?

Tyson, can you see if this patch resolves the use-after-free here?

[Tracking Requested - why for this release]:

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?:
Unknown

The commit comment says "Keep locked resources alive" which flags the commit as fixing a UAF without having to look at the patch.

Code comments say things like "The texture was deleted while still locked and must stay alive until..." and "Lock the texture so that it can't be deleted by another thread yet" which likewise says "Here be UAFs". There are similar comments in a big chunk of code that was moved to another file (with some changes) so readers might find the comments less interesting than they otherwise would? Assuming the commit comment wasn't there, of course.

There's no test attached and the fuzzing testcase resisted reduction so I wouldn't call it a "bulls-eye", but it does show generally where the target is. It's too late to land in 140 (normally) and given comment 7 we don't have the confidence in the fix needed to uplift into an extra RC (RC 1 is already underway or done by now). But since this doesn't affect ESRs it might be a good candidate for the mid-cycle point-release.

:dveditz/:lsalzman the regressing bug enabled this in Nightly only?
https://searchfox.org/mozilla-central/source/modules/libpref/init/StaticPrefList.yaml#5062
Are Fx139 and Fx140 affected?

Flags: needinfo?(lsalzman)
Flags: needinfo?(dveditz)

I have not been able to reproduce this issue.

FWIW the fuzzers have not reported it since the patch for bug 1963341 landed, OTOH this has been very difficult to reproduce.

Flags: needinfo?(twsmith)

(In reply to Donal Meehan [:dmeehan] from comment #9)

:dveditz/:lsalzman the regressing bug enabled this in Nightly only?

sorry I missed that -- you're right, and I've updated the appropriate statuses to "disabled".

(In reply to Donal Meehan [:dmeehan] from comment #9)

:dveditz/:lsalzman the regressing bug enabled this in Nightly only?
https://searchfox.org/mozilla-central/source/modules/libpref/init/StaticPrefList.yaml#5062
Are Fx139 and Fx140 affected?

While the issues started when view transitions were introduced, it is still unclear if view transitions must be enabled to repro, or if something in the supporting code that was added is allowing it to happen even when view transitions are disabled.

So it would be safest to assume 139 and up are effected even though the order isn't on for them.

Flags: needinfo?(lsalzman)

(In reply to Daniel Veditz [:dveditz] from comment #8)

[Tracking Requested - why for this release]:

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?:
Unknown

The commit comment says "Keep locked resources alive" which flags the commit as fixing a UAF without having to look at the patch.

Code comments say things like "The texture was deleted while still locked and must stay alive until..." and "Lock the texture so that it can't be deleted by another thread yet" which likewise says "Here be UAFs". There are similar comments in a big chunk of code that was moved to another file (with some changes) so readers might find the comments less interesting than they otherwise would? Assuming the commit comment wasn't there, of course.

There's no test attached and the fuzzing testcase resisted reduction so I wouldn't call it a "bulls-eye", but it does show generally where the target is. It's too late to land in 140 (normally) and given comment 7 we don't have the confidence in the fix needed to uplift into an extra RC (RC 1 is already underway or done by now). But since this doesn't affect ESRs it might be a good candidate for the mid-cycle point-release.

If it seems I resolved the issue in bug 1963341 as Tyson suggests, can we instead consider the patch here a non-sec enhancement so I can land it otherwise? I can file a different enhancement bug for it if need be. The aim is just to resolve this nebulous class of bugs without requiring spot fixes like in bug 1963341, even though my keep-alive patch is potentially unnecessary in that instance.

Flags: needinfo?(dveditz)

Even as a non-security enhancement I'd want to keep this bug hidden because it does relate to security bugs we haven't shipped yet. But sure, that sounds fine.

I've changed this bug to sec-other and noted in the whiteboard that the original sec-high vuln was fixed in bug 1963341. Other than leaving the bug itself hidden you can land as a non-security bug.

Depends on: 1963341
Flags: needinfo?(dveditz)
Keywords: sec-highsec-other
Whiteboard: [sec-high part was fixed by bug 1963341]
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 141 Branch
Group: gfx-core-security → core-security-release
Flags: needinfo?(nical.bugzilla)
QA Whiteboard: [sec] [qa-triage-done-c142/b141]
Flags: qe-verify-
Attachment #9494505 - Flags: sec-approval?

Please nominate this for ESR140 when you get a chance.

Flags: needinfo?(lsalzman)

Comment on attachment 9494505 [details]
Bug 1970753 - Keep locked resources alive. r?gw

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Guard against potential use-after-free bugs in WebRender due to view transitions-related work.
  • User impact if declined: ^
  • Fix Landed on Version: 141
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Seems stable enough in 141.
Flags: needinfo?(lsalzman)
Attachment #9494505 - Flags: approval-mozilla-esr140?
Attachment #9494505 - Flags: approval-mozilla-esr140? → approval-mozilla-esr140+
Whiteboard: [sec-high part was fixed by bug 1963341] → [sec-high part was fixed by bug 1963341][adv-main141-]
Whiteboard: [sec-high part was fixed by bug 1963341][adv-main141-] → [sec-high part was fixed by bug 1963341][adv-main141-][adv-ESR140.1-]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: