Intermittent 860143.html | application crashed [@ mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*)] | Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing

RESOLVED FIXED in Firefox 34, Firefox OS v2.1

Status

()

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: cbook, Assigned: jesup)

Tracking

({assertion, crash, intermittent-failure})

unspecified
mozilla35
assertion, crash, intermittent-failure
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox33 unaffected, firefox34 fixed, firefox35 fixed, firefox-esr31 unaffected, b2g-v2.0 unaffected, b2g-v2.0M unaffected, b2g-v2.1 fixed, b2g-v2.2 fixed)

Details

(Whiteboard: [webrtc-uplift], URL)

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

4 years ago
Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test crashtest on 2014-08-26 21:58:31 PDT for push a99f3b29c4a8

slave: talos-mtnlion-r5-082

https://tbpl.mozilla.org/php/getParsedLog.php?id=46829628&tree=Mozilla-Inbound



Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing), at /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/TrackUnionStream.h:253

22:04:08  WARNING -  PROCESS-CRASH | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/dom/media/tests/crashtests/860143.html | application crashed [@ mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*)]
22:04:08     INFO -  Crash dump filename: /var/folders/m9/3c3mb3ws0y18t531sbfpqzmc00000w/T/tmpQWcW0r.mozrunner/minidumps/3FEC79A8-396C-4D04-A56D-529861434139.dmp
22:04:08     INFO -  Operating system: Mac OS X
22:04:08     INFO -                    10.8.0 12A269
22:04:08     INFO -  CPU: amd64
22:04:08     INFO -       family 6 model 42 stepping 7
22:04:08     INFO -       8 CPUs
22:04:08     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
22:04:08     INFO -  Crash address: 0x0
22:04:08     INFO -  Thread 55 (crashed)
22:04:08     INFO -   0  XUL!mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*) [nsTArray.h:a99f3b29c4a8 : 866 + 0x0]
22:04:08     INFO -      rbx = 0x00007fff7e795c68   r12 = 0x0000000000000e80
22:04:08     INFO -      r13 = 0x0000000000062d80   r14 = 0x00000001114c3cd0
22:04:08     INFO -      r15 = 0x00000000000f4240   rip = 0x000000010301d608
22:04:08     INFO -      rsp = 0x0000000127e809a0   rbp = 0x0000000127e80a80
22:04:08     INFO -      Found by: given as instruction pointer in context
22:04:08     INFO -   1  XUL!mozilla::TrackUnionStream::ProcessInput(long long, long long, unsigned int) [TrackUnionStream.h:a99f3b29c4a8 : 77 + 0x1f]
22:04:08     INFO -      rbx = 0x000000011282dd60   r12 = 0x0000000000000000
22:04:08     INFO -      r13 = 0x0000000000000000   r14 = 0x000000010c977c80
22:04:08     INFO -      r15 = 0x0000000000000000   rip = 0x000000010301cb87
22:04:08     INFO -      rsp = 0x0000000127e80a90   rbp = 0x0000000127e80b60
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   2  XUL!mozilla::MediaStreamGraphImpl::Process(long long, long long) [MediaStreamGraph.cpp:a99f3b29c4a8 : 1302 + 0x1b]
22:04:08     INFO -      rbx = 0x0000000000062f80   r12 = 0x0000000000000000
22:04:08     INFO -      r13 = 0x000000010f23aae0   r14 = 0x00000001114c3cd0
22:04:08     INFO -      r15 = 0x00000001114c3cd0   rip = 0x0000000103007480
22:04:08     INFO -      rsp = 0x0000000127e80b70   rbp = 0x0000000127e80bd0
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   3  XUL!mozilla::MediaStreamGraphImpl::OneIteration(long long, long long, long long, long long) [MediaStreamGraph.cpp:a99f3b29c4a8 : 1381 + 0xe]
22:04:08     INFO -      rbx = 0x000000010f23aae0   r12 = 0x000000010f23aae0
22:04:08     INFO -      r13 = 0x00000001066fe508   r14 = 0x0000000000062f80
22:04:08     INFO -      r15 = 0x000000000000ac44   rip = 0x000000010300789b
22:04:08     INFO -      rsp = 0x0000000127e80be0   rbp = 0x0000000127e80c60
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   4  XUL!mozilla::ThreadedDriver::RunThread() [GraphDriver.cpp:a99f3b29c4a8 : 255 + 0x4]
22:04:08     INFO -      rbx = 0x000000010c9977b0   r12 = 0x000000010f23aae0
22:04:08     INFO -      r13 = 0x00000001066fe508   r14 = 0x0000000000062a1c
22:04:08     INFO -      r15 = 0x000000000000ac44   rip = 0x0000000102fd1c3f
22:04:08     INFO -      rsp = 0x0000000127e80c70   rbp = 0x0000000127e80cb0
22:04:08     INFO -      Found by: call frame info
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Duplicate of this bug: 1059786
Duplicate of this bug: 1061177
Duplicate of this bug: 1061677
Duplicate of this bug: 1062238
Duplicate of this bug: 1063292
Duplicate of this bug: 1064291
Duplicate of this bug: 1061217
This bug is attracting dupes to it like fruit flies to a rotting banana. Can we please find a way to make some time for this?
Flags: needinfo?(paul)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Duplicate of this bug: 1067979
Duplicate of this bug: 1067990
Duplicate of this bug: 1068858
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Duplicate of this bug: 1069298
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Updated

4 years ago
Assignee: nobody → rjesup
(Assignee)

Comment 122

4 years ago
 MOZ_ASSERT(startTicks == TimeToTicksRoundUp(rate, outputStart), "Samples missing");

So, every one of these shows an assertion failure at TimeToTicksRoundUp(), preceded by one or more (usually 3-6) WARNING: Slice out of range: 'aStart >= 0 && aEnd <= aSource.mDuration', file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/MediaSegment.h, line 282
Comment hidden (Treeherder Robot)
(Assignee)

Comment 124

4 years ago
Created attachment 8496142 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.
(Assignee)

Comment 125

4 years ago
Created attachment 8496145 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.

now without an additional assert for debugging
(Assignee)

Updated

4 years ago
Attachment #8496142 - Attachment is obsolete: true
(Assignee)

Comment 126

4 years ago
Comment on attachment 8496145 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.

I'll take a review from whomeever can get to it first.  It's a really trivial patch.
Attachment #8496145 - Flags: review?(roc)
Attachment #8496145 - Flags: review?(karlt)
(Assignee)

Updated

4 years ago
status-firefox34: --- → affected
status-firefox35: --- → affected
Whiteboard: [webrtc-uplift]
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment on attachment 8496145 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.

Review of attachment 8496145 [details] [diff] [review]:
-----------------------------------------------------------------

I'm skeptical of your explanation. Underflows should not cause the track to get out of sync here. Can you explain what you think is happening in more detail?
Attachment #8496145 - Flags: review?(roc) → review-
https://tbpl.mozilla.org/php/getParsedLog.php?id=49050179&tree=Mozilla-Inbound

mozilla-inbound_snowleopard-debug_test-mochitest-browser-chrome-1
revision: aaca52b09d8f

13:33:49     INFO -  Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing), at /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/TrackUnionStream.h:267
13:33:52     INFO -  601 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/base/content/test/general/browser_devices_get_user_media.js | application terminated with exit code 1
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Created attachment 8497888 [details] [diff] [review]
Fix assert

This bug is blocking bug 994912 from landing; you can reproduce 100% of the time with my changes in a debug build in case reproducibility is holding up a resolution.

I was looking at it independent from this and wrote the attached patch to resolve it, thinking I was the only one suffering. But I don't fully understand the module, so I'm not sure if I did it right :).

My understanding is as follows:

content/media/TrackUnionStream.h:283
>        // These are the properties we need to ensure:
>        // 1) Exactly 'ticks' ticks of output are produced, i.e.
>        // inputEndTicks - inputStartTicks = ticks.
>        // 2) inputEndTicks <= aInputTrack->GetSegment()->GetDuration().
>        // 3) In any sequence of intervals where neither stream is blocked,
>        // the content of the input track we use is a contiguous sequence of
>        // ticks with no gaps or overlaps.
>        if (map->mEndOfLastInputIntervalInInputStream != inputStart ||
>            map->mEndOfLastInputIntervalInOutputStream != outputStart) {
>          // Start of a new series of intervals where neither stream is blocked.
>          map->mEndOfConsumedInputTicks = TimeToTicksRoundDown(rate, inputStart) - 1;
>        }
>        TrackTicks inputStartTicks = map->mEndOfConsumedInputTicks;
>        TrackTicks inputEndTicks = inputStartTicks + ticks;

There is a long proof in the comments for property #3, but it hinges on the statement below:

content/media/TrackUnionStream.h:331
>        // (now using the fact that inputEnd <= track->GetEndTimeRoundDown() for a non-ended track)

But in the if statement I quoted above, we actually change the inputEndTicks, such that in the non-trivial case, this assumption is invalidated and it may go past the track ending.. In my testing usually start time is way ahead of the actual track end time, so there are no acceptable samples only stale samples (underrun?). The vast majority of the time that the if is tripped, all of the values of 0, so the math works out (trivial case), I'm guessing that happens at the beginning.
Attachment #8497888 - Flags: feedback?(roc)
Blocks: 994912
(Assignee)

Updated

4 years ago
Attachment #8496145 - Flags: review?(karlt)
(Assignee)

Comment 145

4 years ago
So I'll note that ALL of these crashes are due to the only line of code added to this section since Dec 2013 - the one MOZ_ASSERT() line that's crashing us, which padenot added.  We may have been failing this all along occasionally.
Comment on attachment 8497888 [details] [diff] [review]
Fix assert

Review of attachment 8497888 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable.

Bug 1061046 vastly simplifies all this logic but I won't be able to land that for a couple of weeks.
Attachment #8497888 - Flags: feedback?(roc) → feedback+
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Created attachment 8498224 [details] [diff] [review]
Fix assert, v2

Updated with a brief comment explaining underrun can cause the code documented assumptions to be violated.

try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=8224896bea5e
rerun try on busted cases I don't think this impacts: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=20dfba0dceba

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (on vacation Sep 27 to Oct 12) from comment #146)
> Comment on attachment 8497888 [details] [diff] [review]
> Fix assert
> 
> Review of attachment 8497888 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks reasonable.
> 
> Bug 1061046 vastly simplifies all this logic but I won't be able to land
> that for a couple of weeks.

I reviewed this afterwards, specifically: https://bugzilla.mozilla.org/attachment.cgi?id=8492532&action=diff

I think that is complementary to my change, since it still has some (original) questionable logic:
1) inputTrackEndPoint is only updated if the track is ended, but even if it hasn't we still can't go past the duration boundary
2) if we select fewer samples than "TrackTicks ticks", it will complain at the end indicating the stream is out of sync; this could easily happen if it prefers inputTrackEndPoint (the correct value from 1 above) in the std::min calculations, so we still need to do the null calculations
Attachment #8497888 - Attachment is obsolete: true
Attachment #8498224 - Flags: review?(roc)
Comment hidden (Treeherder Robot)
Flags: needinfo?(padenot)
Keywords: checkin-needed
Attachment #8496145 - Attachment is obsolete: true
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://hg.mozilla.org/mozilla-central/rev/0a8b7e2db5fd

So far, so good on inbound, FWIW :)
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Looking good! Please nominate this for Aurora approval when you get a chance :)
status-b2g-v2.0: --- → unaffected
status-b2g-v2.0M: --- → unaffected
status-b2g-v2.1: --- → affected
status-b2g-v2.2: --- → fixed
status-firefox33: --- → unaffected
status-firefox35: affected → fixed
status-firefox-esr31: --- → unaffected
Flags: needinfo?(rjesup)
(Assignee)

Comment 163

4 years ago
Comment on attachment 8498224 [details] [diff] [review]
Fix assert, v2

Approval Request Comment
[Feature/regressing bug #]: bug 848954

[User impact if declined]: lots of starring randomly across the tree (top webrtc/media orange) - or forcing us to disable the MOZ_ASSERT Paul added, which caught this extant bug.

[Describe test coverage new/current, TBPL]: all of tbpl that touches MSG tests it.

[Risks and why]: low risk - tweaks how tick ranges are decided on.  The code survived being off by a tick before we added the assert.

[String/UUID change made/needed]: none.
Attachment #8498224 - Flags: approval-mozilla-aurora?
Flags: needinfo?(rjesup)
Comment hidden (Treeherder Robot)
Comment on attachment 8498224 [details] [diff] [review]
Fix assert, v2

Aurora+
Attachment #8498224 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/855936995e70
status-b2g-v2.1: affected → fixed
status-firefox34: affected → fixed
(Assignee)

Updated

4 years ago
Blocks: 1096078
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
You need to log in before you can comment on or make changes to this bug.