Closed Bug 1771025 Opened 2 years ago Closed 2 years ago

Assertion failure: !mStream, at /dom/cache/ReadStream.cpp:475

Categories

(Core :: Storage: Cache API, defect, P2)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
103 Branch
Tracking Status
firefox103 --- fixed

People

(Reporter: jkratzer, Assigned: saschanaz)

References

(Blocks 1 open bug)

Details

(Keywords: testcase)

Attachments

(3 files, 2 obsolete files)

Testcase found while fuzzing mozilla-central rev e97c9ea8b858 (built with: --enable-address-sanitizer --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build e97c9ea8b858 --asan --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html
Assertion failure: !mStream, at /dom/cache/ReadStream.cpp:475

    =================================================================
    ==697664==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x7f4c1de5bfc7 bp 0x7ffc882dab10 sp 0x7ffc882daa60 T0)
    ==697664==The signal is caused by a WRITE memory access.
    ==697664==Hint: address points to the zero page.
        #0 0x7f4c1de5bfc7 in mozilla::dom::cache::ReadStream::Inner::OpenStreamFailed() /dom/cache/ReadStream.cpp:475:3
        #1 0x7f4c1de5b530 in mozilla::dom::cache::ReadStream::Inner::MaybeAbortAsyncOpenStream() /dom/cache/ReadStream.cpp:470:3
        #2 0x7f4c1de5b0a9 in mozilla::dom::cache::ReadStream::Inner::NoteClosedOnOwningThread() /dom/cache/ReadStream.cpp:380:3
        #3 0x7f4c1de99d0f in mozilla::dom::cache::ReadStream::Inner::NoteClosedRunnable::Run() /dom/cache/ReadStream.cpp:122:14
        #4 0x7f4c18dd2742 in mozilla::RunnableTask::Run() /xpcom/threads/TaskController.cpp:475:16
        #5 0x7f4c18d987a5 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:788:26
        #6 0x7f4c18d95958 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:620:15
        #7 0x7f4c18d96080 in mozilla::TaskController::ProcessPendingMTTask(bool) /xpcom/threads/TaskController.cpp:398:36
        #8 0x7f4c18ddb291 in operator() /xpcom/threads/TaskController.cpp:124:37
        #9 0x7f4c18ddb291 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
        #10 0x7f4c18db9107 in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1180:16
        #11 0x7f4c18dc326c in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:465:10
        #12 0x7f4c1a4d90af in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:85:21
        #13 0x7f4c1a35f3d1 in RunInternal /ipc/chromium/src/base/message_loop.cc:380:10
        #14 0x7f4c1a35f3d1 in RunHandler /ipc/chromium/src/base/message_loop.cc:373:3
        #15 0x7f4c1a35f3d1 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:355:3
        #16 0x7f4c21294ed7 in nsBaseAppShell::Run() /widget/nsBaseAppShell.cpp:137:27
        #17 0x7f4c2618054f in XRE_RunAppShell() /toolkit/xre/nsEmbedFunctions.cpp:874:20
        #18 0x7f4c1a35f3d1 in RunInternal /ipc/chromium/src/base/message_loop.cc:380:10
        #19 0x7f4c1a35f3d1 in RunHandler /ipc/chromium/src/base/message_loop.cc:373:3
        #20 0x7f4c1a35f3d1 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:355:3
        #21 0x7f4c2617f6fb in XRE_InitChildProcess(int, char**, XREChildData const*) /toolkit/xre/nsEmbedFunctions.cpp:733:34
        #22 0x5648f59d3c1d in content_process_main(mozilla::Bootstrap*, int, char**) /browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
        #23 0x5648f59d4050 in main /browser/app/nsBrowserApp.cpp:338:18
        #24 0x7f4c3ea9c082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
        #25 0x5648f5914069 in _start (/home/jkratzer/builds/mc-asan/firefox+0x5f069) (BuildId: edba2da312b95260515d181573d31b9eefc8f190)
    
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV /dom/cache/ReadStream.cpp:475:3 in mozilla::dom::cache::ReadStream::Inner::OpenStreamFailed()
    ==697664==ABORTING
Attached file Testcase (obsolete) —

Hmm, the stack itself is I believe has nothing with DOM Streams, but the test uses it anyway... 🤔

Bugmon Analysis
Verified bug as reproducible on mozilla-central 20220524214448-3d57939c6779.
The bug appears to have been introduced in the following build range:

Start: 0d1d9fa72512cf8da0bc8c0fdd63f3fb3ff0a468 (20220322200148)
End: 39b9b2ea812745fc89d98d71c5601758d83342e1 (20220322213301)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=0d1d9fa72512cf8da0bc8c0fdd63f3fb3ff0a468&tochange=39b9b2ea812745fc89d98d71c5601758d83342e1

Keywords: regression
Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]
Attached file minimal.html (obsolete) —

This is my more-minimal testcase. I'm off for the rest of this week, feel free to take this if this is urgent to anyone.

The clone() and match() after that are interesting. I think those are supposed to be effectively no-op, but both are needed for this.

resp1.body getter must happen before .clone() for some unknown reason.

So far, no ability to get a recording of this failure, unfortunately.

I think this might be a logic error in the shutdown of ReadStream::Inner.

  • We're asserting inside the NoteClosed runnable, which is generated when NoteClosed is called on a different thread than the owning event target.
  • It seems like NoteClosed is called from OpenStreamFailed. This is particularly notable, because OpenStreamFailed explicitly sets mStream to a new NSCStringInputStream of ""_ns. Why this happens isn't clear to me at all...
  • During the execution of the NoteClosedRunnable, we end up invoking MaybeAbortAsyncOpenStream, which in turn calls OpenStreamFailed, where we assert; it's worth noting that in our code-coverage we never hit the part of that method that invokes OpenStreamFailed -- so this bug could be pre-existing.

So to summarize my understanding: The issue here is OpenStreamFailed, when invoked the first time, may create a runnable to happen on another thread, and that runnable in turn assumes there's no stream... but we set one up explicitly. Could it be this is the first time OpenStreamFailed is getting called on a thread that doesn't match the owning event target?

This is ultimately only incidentally a DOM Streams issue, as it's really exposing an unfortunate interleaving from the storage APIs, so I'm going to move this bug.

Component: DOM: Streams → Storage: Cache API

Thanks Matthew! It could be that resp1.clone() somehow breaks some assumption.

so this bug could be pre-existing.

Fortunately I got a repro without pipeto which should ease bisection. Rebisecting...

Attachment #9278028 - Attachment mime type: text/plain → text/html
Attached file minimal.html

Hmm, somehow debug builds from mozregression never crashes with it while my local debug builds do, not sure why.

Jason, could you retrigger the bot bisection with this new repro? Thanks!

Attachment #9278257 - Attachment is obsolete: true
Flags: needinfo?(jkratzer)

Looks quite close to bug 1771033, but that one is a shutdown crash while this one is not.

See Also: → 1771033
Attachment #9278028 - Attachment is obsolete: true
Flags: needinfo?(jkratzer)
Whiteboard: [bugmon:bisected,confirmed]
Attachment #9278804 - Attachment filename: file_1771025.txt → file_1771025.html

Err, my new testcase was for WPT rather than grizzly since it has no window.close call, I'll upload a new one...

Kagami, no need. The problem was related to the file extension. I'm re-running the bisection now.

Ah thanks, I also just confirmed that grizzly is still able to reproduce.

The bug appears to have been introduced in the following build range:

Start: 010ddecfd41c4d4caf2ef0e1954230fbc59a5761 (20210830162701)
End: 6e747e40e885da63da331f010611e9aea677fa7b (20210830160116)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=010ddecfd41c4d4caf2ef0e1954230fbc59a5761&tochange=6e747e40e885da63da331f010611e9aea677fa7b

Thanks! But something is off, I can reproduce this on e7459f43c367 which is before that range: https://treeherder.mozilla.org/jobs?repo=mozilla-central&revision=e7459f43c36760e79f5435219d55458dab3b477b

2d291a99004d also has this, which means this is way too old to get anything good from bisection.

Assignee: nobody → krosylight
Status: NEW → ASSIGNED

In the testcase, clone() will trigger the first AsyncOpenStreamOnOwningThread() call and shortly after that read() will trigger the second one. For both calls mSnappyStream isn't set yet, and the first one finishes first and get the stream.

When the second run of AsyncOpenStreamOnOwningThread starts, the first run already got the stream and thus OpenStream fails. Then it thinks there should be no stream since it failed, but there is, so assertion fails at line 481.

Depends on D148073

:saschanaz, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(krosylight)

bugmon:bisected is removed, not sure why it thinks this still has a bisection range?

Flags: needinfo?(krosylight)
Severity: -- → S3
Priority: -- → P2
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/34338 for changes under testing/web-platform/tests
Pushed by bszekely@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/d6b49d0dc514
Part 1: Cleaning up redundant header includes r=smaug
https://hg.mozilla.org/mozilla-central/rev/557d6b61a063
Part 2: Return early when it already has the stream r=asuth
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch
Upstream PR merged by moz-wptsync-bot
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: