Closed Bug 1697641 Opened 5 years ago Closed 4 years ago

Assertion failure: mStart <= mEnd (Invalid Interval), at src/dom/media/Intervals.h:48

Tracking

()

Status: RESOLVED FIXED

Milestone: 89 Branch

Tracking Flags:

                Tracking  Status
firefox-esr78   ---       wontfix
firefox87       ---       wontfix
firefox88       ---       wontfix
firefox89       ---       fixed

People

(Reporter: bryce, Assigned: bryce)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, crash, testcase)

Attachments

(2 files)

testcase.webm (78.80 KB, video/webm), attached 5 years ago by Tyson Smith [:tsmith]
Bug 1697641 - Gracefully handle webms with bogus timecodes that conflict other metadata. r?kinetik (48 bytes, text/x-phabricator-request), attached 4 years ago by Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Description

•

5 years ago

+++ This bug was initially created as a clone of Bug #1530897 +++

Bug 1530897 fixed a fuzz case for this signature, but we have more crashes with the same stack. Crashes are being logged on this line, but I'm fairly confident that the assertion in question cannot fire there, because that line constructs an empty interval. I think the issue actually lies with the buffered variable, and the line reported for the assertion is a quirk of optimisation.

Tyson Smith [:tsmith]

Comment 1

•

5 years ago

Fuzzers are hitting this a few times a day but none of the test cases seem to repro.

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 2

•

5 years ago

(In reply to Tyson Smith [:tsmith] from comment #1)

Fuzzers are hitting this a few times a day but none of the test cases seem to repro.

Do the fuzzers ever hit this in debug builds, so that we can get a stack/dump with further symbols? Seeing whether the assertion ever fires on a different line in the WebM demuxer would help test my comment 0 thesis.

Tyson Smith [:tsmith]

Comment 3

•

5 years ago

Seeing this on both ASan (non-debug) and debug builds. They all seem to be on the same line, WebMDemuxer.cpp:999.

Assertion failure: mStart <= mEnd (Invalid Interval), at /builds/worker/workspace/obj-build/dist/include/Intervals.h:49

#0 0x7f94e8935928 in Interval<mozilla::media::TimeUnit &, mozilla::media::TimeUnit &> /builds/worker/workspace/obj-build/dist/include/Intervals.h:49:5
#1 0x7f94e8935928 in mozilla::WebMDemuxer::GetBuffered() /builds/worker/checkouts/gecko/dom/media/webm/WebMDemuxer.cpp:999:19
#2 0x7f94e89395ab in mozilla::WebMTrackDemuxer::GetBuffered() /builds/worker/checkouts/gecko/dom/media/webm/WebMDemuxer.cpp:1257:19
#3 0x7f94e893897a in mozilla::WebMTrackDemuxer::Reset() /builds/worker/checkouts/gecko/dom/media/webm/WebMDemuxer.cpp:1190:35
#4 0x7f94e851410c in operator() /builds/worker/checkouts/gecko/dom/media/MediaFormatReader.cpp:671:41
#5 0x7f94e851410c in mozilla::detail::RunnableFunction<mozilla::MediaFormatReader::DemuxerProxy::Wrapper::Reset()::'lambda'()>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:534:5
#6 0x7f94e4d30462 in mozilla::TaskQueue::Runner::Run() /builds/worker/checkouts/gecko/xpcom/threads/TaskQueue.cpp:158:20
#7 0x7f94e4d48dc7 in nsThreadPool::Run() /builds/worker/checkouts/gecko/xpcom/threads/nsThreadPool.cpp:303:14
#8 0x7f94e4d3fc73 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1152:16
#9 0x7f94e4d465aa in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:548:10
#10 0x7f94e566dc6d in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:302:20
#11 0x7f94e55d7ee3 in MessageLoop::RunInternal() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:335:10
#12 0x7f94e55d7dfd in RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:328:3
#13 0x7f94e55d7dfd in MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:310:3
#14 0x7f94e4d3c396 in nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:391:10
#15 0x7f94f9cebcdb in _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
#16 0x7f94fa36a608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477:8
#17 0x7f94f9f33292 in clone /build/glibc-eX1tMB/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 4

•

5 years ago

Perfect, thanks! That helps confirm my comment 0 thoughts.

Tyson Smith [:tsmith]

Comment 5

•

5 years ago

Attached video testcase.webm

Tyson Smith [:tsmith]

Updated

•

5 years ago

status-firefox88: --- → affected

status-firefox89: --- → affected

Flags: in-testsuite?

Keywords: testcase

Tyson Smith [:tsmith]

Comment 6

•

5 years ago

A Pernosco session is available here: https://pernos.co/debug/sqhQ-8jgn3bpi2hHGVIRpQ/index.html

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 7

•

4 years ago

•

Edited

Think I've figured this one out. It is caused by our clamping of the end time to the duration[0] while leaving the start time unclamped. Oddly formed files can cause the algorithm to end up with a start time and an end time that are both greater than our duration. We then clamp only the end time, which leaves end < start, and we go on to create an invalid interval.

This looks to be caused by the final cluster in the block having a timecode that is inconsistent with the other timecodes and the number of frames (it jumps ahead by an hour from what you'd expect given previous clusters).

I think we should handle this the same way as the existing malformed cases, by skipping the malformed interval. We could try to clamp the interval back into a valid range, but I don't know if it's worth the time for busted files.

Fix incoming.

[0] https://searchfox.org/mozilla-central/rev/be413c29deeb86be6cdac22445e0d0b035cb9e04/dom/media/webm/WebMDemuxer.cpp#978
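
The failure mode described in this comment can be illustrated with a small standalone sketch. This is not the Gecko code: `Seconds`, `IsValidInterval`, `ClampEndOnly` and `DemonstrateInvertedInterval` are hypothetical names, times are plain doubles rather than `media::TimeUnit`, and the example values (a 10 second file whose last cluster claims to start roughly an hour in) are made up. It only shows that clamping the end of a range to the duration, while leaving the start alone, yields end < start whenever the start already lies past the duration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical stand-in for a time value; Gecko uses media::TimeUnit.
using Seconds = double;

// Mirrors the invariant behind the "mStart <= mEnd" assertion.
bool IsValidInterval(Seconds start, Seconds end) { return start <= end; }

// Clamps only the end of [start, end] to the duration and leaves start
// untouched. This is the pattern comment 7 identifies as the problem.
std::pair<Seconds, Seconds> ClampEndOnly(Seconds start, Seconds end,
                                         Seconds duration) {
  return std::make_pair(start, std::min(end, duration));
}

// Made-up values: duration is 10 s, but the range starts at 3610 s.
void DemonstrateInvertedInterval() {
  std::pair<Seconds, Seconds> r = ClampEndOnly(3610.0, 3612.0, 10.0);
  assert(r.second == 10.0);                     // end was pulled back
  assert(!IsValidInterval(r.first, r.second));  // now end < start
}
```

A range that starts inside the duration, such as 2 s to 12 s, is unaffected by the problem: its end is clamped to 10 s and the interval stays valid.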

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 8

•

4 years ago

I can't get a reliable crash test out of this. I've been able to crash locally with the test file, and thought I had a good case where I opened the file and forced a reset of the decoder by triggering a load on another file. However, it doesn't reliably repro and I can't get my crashtest to fail. Going to push my fix, and if I can't get my crasher crashing I'll rely on the fuzzers.

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 9

•

4 years ago

Attached file Bug 1697641 - Gracefully handle webms with bogus timecodes that conflict other metadata. r?kinetik

This expands on the existing checks made when getting a WebM's buffered
intervals. The additional check ensures we don't end up with end < start due to
our code that clamps end at duration.

This patch moves those checks into their own helper function so as to reduce
clutter in GetBuffered.
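
As a rough illustration of the kind of check described above, the following is a minimal sketch and not the patch itself. The names `Range`, `ComputeBufferedRange` and `CollectBuffered` are hypothetical, times are plain doubles instead of `media::TimeUnit`, and the pre-clamp check is only an assumed example of an "existing check". The point is the post-clamp test: a range whose clamped end falls before its start is skipped instead of being turned into an interval.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a time value; Gecko uses media::TimeUnit.
using Seconds = double;

struct Range {
  Seconds start;
  Seconds end;
};

// Hypothetical helper: fills *out and returns true only if the range is
// still well-formed after its end has been clamped to the duration.
bool ComputeBufferedRange(Seconds start, Seconds end, Seconds duration,
                          Range* out) {
  assert(out != nullptr);
  // Assumed example of a pre-existing check: reject inverted input.
  if (start > end) return false;
  Seconds clampedEnd = std::min(end, duration);
  // Additional check: clamping must not leave end before start.
  if (clampedEnd < start) return false;
  out->start = start;
  out->end = clampedEnd;
  return true;
}

// Collects the usable ranges and silently skips malformed ones.
std::vector<Range> CollectBuffered(const std::vector<Range>& candidates,
                                   Seconds duration) {
  std::vector<Range> buffered;
  for (const Range& c : candidates) {
    Range r;
    if (ComputeBufferedRange(c.start, c.end, duration, &r)) {
      buffered.push_back(r);
    }
  }
  return buffered;
}
```

Skipping matches the approach proposed in comment 7. Clamping the start as well would be an alternative, but that comment questions whether it is worth the effort for broken files.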

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 10

•

4 years ago

If I load the test file from bmo then run

$('video').load()

in the console, I get a reliable repro. My guess is that it has to do with different caching of the resource. I still can't get a reliable local repro or test, so I'm going to give up in the interest of timeboxing the rabbit hole.

Tyson Smith [:tsmith]

Comment 11

•

4 years ago

I can reproduce it reliably locally with Grizzly.

If you don't have Grizzly installed:

pip install grizzly-framework

To repro:

python3 -m grizzly.replay <path_to_browser>/firefox testcase.webm --relaunch 1 --repeat 10 --time-limit 5

Use -l if you want to save logs and --rr if you also want an rr trace.

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Comment 12

•

4 years ago

(In reply to Tyson Smith [:tsmith] from comment #11)

I can reproduce it reliably locally with Grizzly.

If you don't have Grizzly installed:
pip install grizzly-framework
To repro:
python3 -m grizzly.replay <path_to_browser>/firefox testcase.webm --relaunch 1 --repeat 10 --time-limit 5
Use -l if you want to save logs and --rr if you also want an rr trace.

I tried several times with a local build and a shippable build and couldn't repro (10 repeats, no fault). After a rebuild, it now faults :\

My crashtest will now fault if I serve it locally, but won't fault if I run it with ./mach crashtest. I've been trying different combinations of config to try to get it to fall over, but no luck.

Bryce Seager van Dyk [:bryce] (he/him) - Not reading bugmail

Assignee

Updated

•

4 years ago

Blocks: 1704556

Pulsebot

Comment 13

•

4 years ago

Pushed by bvandyk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/23ff124946a2 Gracefully handle webms with bogus timecodes that conflict other metadata. r=kinetik

Andreea Pavel [:apavel]

Comment 14

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/23ff124946a2

Status: NEW → RESOLVED

Closed: 4 years ago

status-firefox89: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 89 Branch

Ryan VanderMeulen (PTO, back 6-April)

Updated

•

4 years ago

status-firefox87: --- → wontfix

status-firefox88: affected → wontfix

status-firefox-esr78: --- → wontfix

Flags: in-testsuite? → in-testsuite-
