Open Bug 1795059 Opened 2 years ago Updated 3 months ago

Assertion failure: aState != mReadyState, at /dom/media/mediasource/MediaSource.cpp:545

Categories

(Core :: Audio/Video, defect, P3)

x86_64
Linux
defect

Tracking

()

ASSIGNED

People

(Reporter: jkratzer, Assigned: az)

References

(Blocks 1 open bug)

Details

(Keywords: regression, testcase, Whiteboard: [bugmon:bisected,confirmed])

Attachments

(4 files)

Testcase found while fuzzing mozilla-central rev cbbf6a7e34a3 (built with: --enable-debug --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build cbbf6a7e34a3 --debug --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html
Assertion failure: aState != mReadyState, at /dom/media/mediasource/MediaSource.cpp:545

    ==300863==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fbe5e729b16 bp 0x7ffe4d01da50 sp 0x7ffe4d01da20 T300863)
    ==300863==The signal is caused by a WRITE memory access.
    ==300863==Hint: address points to the zero page.
        #0 0x7fbe5e729b16 in mozilla::dom::MediaSource::SetReadyState(mozilla::dom::MediaSourceReadyState) /dom/media/mediasource/MediaSource.cpp:545:3
        #1 0x7fbe5e729f15 in mozilla::dom::MediaSource::EndOfStream(mozilla::MediaResult const&) /dom/media/mediasource/MediaSource.cpp:418:3
        #2 0x7fbe5e73fde6 in mozilla::dom::SourceBuffer::AppendError(mozilla::MediaResult const&) /dom/media/mediasource/SourceBuffer.cpp:658:17
        #3 0x7fbe5e768211 in mozilla::MozPromise<std::pair<bool, mozilla::SourceBufferAttributes>, mozilla::MediaResult, true>::ThenValue<mozilla::dom::SourceBuffer*, void (mozilla::dom::SourceBuffer::*)(std::pair<bool, mozilla::SourceBufferAttributes> const&), void (mozilla::dom::SourceBuffer::*)(mozilla::MediaResult const&)>::DoResolveOrRejectInternal(mozilla::MozPromise<std::pair<bool, mozilla::SourceBufferAttributes>, mozilla::MediaResult, true>::ResolveOrRejectValue&) /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h
        #4 0x7fbe5e768e86 in mozilla::MozPromise<std::pair<bool, mozilla::SourceBufferAttributes>, mozilla::MediaResult, true>::ThenValueBase::ResolveOrRejectRunnable::Run() /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:487:21
        #5 0x7fbe5a7a8414 in mozilla::XPCOMThreadWrapper::Runner::Run() /xpcom/threads/AbstractThread.cpp:208:25
        #6 0x7fbe5a7d5e6e in mozilla::RunnableTask::Run() /xpcom/threads/TaskController.cpp:538:16
        #7 0x7fbe5a7ae389 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:851:26
        #8 0x7fbe5a7acf13 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:683:15
        #9 0x7fbe5a7ad183 in mozilla::TaskController::ProcessPendingMTTask(bool) /xpcom/threads/TaskController.cpp:461:36
        #10 0x7fbe5a7d9716 in operator() /xpcom/threads/TaskController.cpp:187:37
        #11 0x7fbe5a7d9716 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
        #12 0x7fbe5a7c2fdf in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1205:16
        #13 0x7fbe5a7c95ed in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:465:10
        #14 0x7fbe5b3b5046 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:85:21
        #15 0x7fbe5b2d9187 in MessageLoop::RunInternal() /ipc/chromium/src/base/message_loop.cc:381:10
        #16 0x7fbe5b2d9092 in RunHandler /ipc/chromium/src/base/message_loop.cc:374:3
        #17 0x7fbe5b2d9092 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:356:3
        #18 0x7fbe5f7fcbc8 in nsBaseAppShell::Run() /widget/nsBaseAppShell.cpp:150:27
        #19 0x7fbe61a0a6db in XRE_RunAppShell() /toolkit/xre/nsEmbedFunctions.cpp:880:20
        #20 0x7fbe5b3b5f3a in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:235:9
        #21 0x7fbe5b2d9187 in MessageLoop::RunInternal() /ipc/chromium/src/base/message_loop.cc:381:10
        #22 0x7fbe5b2d9092 in RunHandler /ipc/chromium/src/base/message_loop.cc:374:3
        #23 0x7fbe5b2d9092 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:356:3
        #24 0x7fbe61a09cbe in XRE_InitChildProcess(int, char**, XREChildData const*) /toolkit/xre/nsEmbedFunctions.cpp:739:34
        #25 0x563f6ccbec19 in content_process_main /browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
        #26 0x563f6ccbec19 in main /browser/app/nsBrowserApp.cpp:357:18
        #27 0x7fbe7144ed8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
        #28 0x7fbe7144ee3f in __libc_start_main csu/../csu/libc-start.c:392:3
        #29 0x563f6cc948dc in _start (/home/jkratzer/builds/m-c-20221012213343-fuzzing-debug/firefox-bin+0x168dc) (BuildId: 79de1d6fe4f74fe64c1836d51ad7afe42e2e9e06)
    
    UndefinedBehaviorSanitizer can not provide additional info.
    SUMMARY: UndefinedBehaviorSanitizer: SEGV /dom/media/mediasource/MediaSource.cpp:545:3 in mozilla::dom::MediaSource::SetReadyState(mozilla::dom::MediaSourceReadyState)
    ==300863==ABORTING
Attached file Testcase

Bugmon Analysis
Verified bug as reproducible on mozilla-central 20221013154647-4563dd583110.
The bug appears to have been introduced in the following build range:

Start: b7a5205832b471367bad42b69cdf69a4da1eb15c (20220916033628)
End: ca0480fcb71c17e7f422d960f073cabeb4864827 (20220916014558)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b7a5205832b471367bad42b69cdf69a4da1eb15c&tochange=ca0480fcb71c17e7f422d960f073cabeb4864827

Keywords: regression
Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]
Severity: -- → S3
Priority: -- → P3

I'm unable to reproduce on my end running Debian bullseye, with and without headless mode enabled. I'm not familiar with grizzly, so I imagine I might be missing something... would you have any ideas?

$ python -m grizzly.replay ./firefox/firefox testcase.html --repeat 100 --headless
[2022-10-14 12:30:10] Starting Grizzly Replay
[2022-10-14 12:30:10] Running browser headless (default)
[2022-10-14 12:30:10] Ignoring: log-limit, timeout
[2022-10-14 12:51:38] Using time limit: 30s, timeout: 45s
[2022-10-14 12:51:38] Repeat: 100, Minimum crashes: 1, Relaunch 100
[2022-10-14 12:51:41] Running test (1/100)... 
[2022-10-14 12:52:14] Running test (2/100)... 
...
[2022-10-14 13:43:29] Running test (99/100)...
[2022-10-14 13:44:01] Running test (100/100)...
[2022-10-14 13:44:33] Failed to reproduce results
[2022-10-14 13:44:33] Shutting down...
[2022-10-14 13:44:33] Done.
Flags: needinfo?(jkratzer)

:az, are you using a debug build as specified in comment 0? Also, you need to point grizzly at either the testcase.zip or the directory you unpacked it to (the one containing test_info.json).

Flags: needinfo?(jkratzer)

Hi Jason, I ran the commands as in comment 0... here's an example log. (Sorry about the duplicate posts here, BTW -- not sure why that happened, and it doesn't seem to let me delete the duplicate attachments)

echo '
mkdir test
cd test
wget "https://bugzilla.mozilla.org/attachment.cgi?id=9298443"
unzip *
pip install fuzzfetch grizzly-framework
python -m fuzzfetch --build cbbf6a7e34a3 --debug --fuzzing -n firefox
python -m grizzly.replay ./firefox/firefox testcase.html
' > test.sh && sh ./test.sh 2>&1 | tee log

Log file: https://pastebin.mozilla.org/EiK8QCWc

Yes, my apologies. Comment zero is wrong. You either need to do:

python -m grizzly.replay ./firefox/firefox testcase.zip

Or, if you're in the test directory:

python -m grizzly.replay ./firefox/firefox ./

Thanks :jkratzer, that did the trick -- I'm able to repro now.

Thanks Tyson! Also, I just wanted to add a quick note here from our earlier discussion in case it's useful for others with this or similar bugs. If Grizzly reports "Failed to reproduce results" when --rr is enabled, increasing the setTimeout value in testcase.html to 5000 might help, since it accounts for the additional overhead of running under rr.

NI'ing pehrsons, as this might be related to 1788557

Flags: needinfo?(apehrson)

What happens is basically

1. (from js) SourceBuffer::AppendBuffer -> async to 3
2. (from js) SourceBuffer::AppendBuffer -> async to 4
3. SourceBuffer::AppendDataErrored -> MediaSource::SetReadyState(Ended)
4. SourceBuffer::AppendDataErrored -> MediaSource::SetReadyState(Ended)

The errors in question are "Timecode appeared before SegmentInfo" in the first segment and "Invalid element id of length 8" in the second segment. It's correct that bug 1788557 added the second error and modified the error reporting a bit, but the first error is older than that, and more importantly the possibility for (async) errors here is older than that. If the second segment hits the same error as the first, we should be able to reproduce without the bug 1788557 patches.

Assuming errors are the only async event that can cause readyState ended, it seems to me that SourceBuffer::AppendError should guard against subsequent errors, similarly to what HTMLMediaElement does.
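For readers who want to see the ordering concretely, here is a minimal sketch of the four steps above. It is a toy model, not Gecko code: ToyMediaSource, SimulateTwoFailedAppends and the task queue are hypothetical names invented for illustration, and the model counts precondition violations instead of asserting, so the sequence can be inspected.

```cpp
#include <cassert>
#include <functional>
#include <queue>

// Simplified stand-in for MediaSourceReadyState (hypothetical, not the Gecko enum).
enum class ReadyState { Closed, Open, Ended };

// Toy model of a MediaSource. It records violations of the
// "aState != mReadyState" precondition instead of aborting.
struct ToyMediaSource {
  ReadyState state = ReadyState::Open;
  int violations = 0;

  void SetReadyState(ReadyState next) {
    if (next == state) {
      ++violations;  // a debug build would hit the assertion here
      return;
    }
    state = next;
  }
};

// Models steps 1-4: two appends each queue an async error handler,
// and both handlers later run against the same MediaSource.
int SimulateTwoFailedAppends() {
  ToyMediaSource source;
  std::queue<std::function<void()>> pendingTasks;

  // Steps 1 and 2: each append schedules its error handler for later.
  for (int i = 0; i < 2; ++i) {
    pendingTasks.push([&source] { source.SetReadyState(ReadyState::Ended); });
  }

  // Steps 3 and 4: the handlers run only after both appends were issued.
  while (!pendingTasks.empty()) {
    pendingTasks.front()();
    pendingTasks.pop();
  }

  assert(source.state == ReadyState::Ended);
  return source.violations;
}
```

In this model the first handler moves the state from Open to Ended, and the second handler requests the state that is already set, which is the condition the assertion rejects.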

Flags: needinfo?(apehrson)

I took a closer look at the problem and it seems like adding a guard in SourceBuffer::AppendError(const MediaResult& aDecodeError) isn't sufficient to fix the issue. It looks like MediaSource::EndOfStream(const MediaResult& aError) is only called from within SourceBuffer::AppendError. I added a guard to ensure that this function is only called once from a given SourceBuffer object, and the assert still occurred. I checked the memory addresses of the SourceBuffer objects to verify that it wasn't the same SourceBuffer object attempting to end the stream twice. I think the root problem is that if multiple SourceBuffer objects can refer to the same MediaSource, they can each attempt to call EndOfStream(const MediaResult& aError) and we'll hit the assert.

One possible fix would be to add a check to the MediaSource::EndOfStream functions and only execute if the ready state isn't already MediaSourceReadyState::Ended. This fixes the assert in my testing. I'll attach a patch to illustrate. What do you think?
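As a rough illustration of the idea (not the attached patch), the sketch below models two buffers sharing one source. All names (ToyMediaSource, ToySourceBuffer, EndOfStreamWithError, EndedTransitionsForTwoBuffers) are hypothetical and the types are heavily simplified. The per-buffer flag alone would not stop the second buffer, so the check lives in the shared object.

```cpp
#include <cassert>

// Simplified stand-in for MediaSourceReadyState (hypothetical, not the Gecko enum).
enum class ReadyState { Closed, Open, Ended };

struct ToyMediaSource {
  ReadyState state = ReadyState::Open;
  int endedTransitions = 0;

  void SetReadyState(ReadyState next) {
    assert(next != state);  // models the failing assertion
    state = next;
    if (next == ReadyState::Ended) {
      ++endedTransitions;
    }
  }

  // Guarded variant: a repeated end-of-stream error is ignored
  // once the source has already ended.
  void EndOfStreamWithError() {
    if (state == ReadyState::Ended) {
      return;
    }
    SetReadyState(ReadyState::Ended);
  }
};

struct ToySourceBuffer {
  explicit ToySourceBuffer(ToyMediaSource* aSource) : source(aSource) {}

  ToyMediaSource* source;
  bool reportedError = false;  // per-buffer guard; not sufficient by itself

  void AppendError() {
    if (reportedError) {
      return;
    }
    reportedError = true;
    source->EndOfStreamWithError();
  }
};

// Two distinct buffers report errors against the same source.
int EndedTransitionsForTwoBuffers() {
  ToyMediaSource source;
  ToySourceBuffer first(&source);
  ToySourceBuffer second(&source);
  first.AppendError();
  second.AppendError();  // different object, so its own flag is still false
  return source.endedTransitions;
}
```

With the guard in the shared object, the second call returns early and the assertion inside SetReadyState is never reached. Whether the real fix should also report or log the second error is a separate question this sketch does not address.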

Flags: needinfo?(apehrson)
Assignee: nobody → azebrowski
Status: NEW → ASSIGNED

Makes sense. And I left some comments on the patch. Thanks!

Flags: needinfo?(apehrson)

Based on comment #2, this bug contains a bisection range found by bugmon. However, the Regressed by field is still not filled.

:az, if possible, could you fill the Regressed by field and investigate this regression?

For more information, please visit auto_nag documentation.

Flags: needinfo?(azebrowski)

Bugmon was unable to reproduce this issue.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon

A change to the Taskcluster build definitions over the weekend caused Bugmon to fail when reproducing issues. This issue has been corrected. Re-enabling bugmon.

Keywords: bugmon
Flags: needinfo?(azebrowski)

Unable to reproduce bug 1795059 using build mozilla-central 20231208094241-3152110c63b5. Without a baseline, bugmon is unable to analyze this bug.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon