Closed Bug 1059101 Opened 5 years ago Closed 5 years ago

Intermittent 860143.html | application crashed [@ mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*)] | Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing

Categories

(Core :: WebRTC, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox33 --- unaffected
firefox34 --- fixed
firefox35 --- fixed
firefox-esr31 --- unaffected
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: cbook, Assigned: jesup)

References

()

Details

(Keywords: assertion, crash, intermittent-failure, Whiteboard: [webrtc-uplift])

Attachments

(1 file, 3 obsolete files)

Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test crashtest on 2014-08-26 21:58:31 PDT for push a99f3b29c4a8

slave: talos-mtnlion-r5-082

https://tbpl.mozilla.org/php/getParsedLog.php?id=46829628&tree=Mozilla-Inbound



Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing), at /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/TrackUnionStream.h:253

22:04:08  WARNING -  PROCESS-CRASH | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/dom/media/tests/crashtests/860143.html | application crashed [@ mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*)]
22:04:08     INFO -  Crash dump filename: /var/folders/m9/3c3mb3ws0y18t531sbfpqzmc00000w/T/tmpQWcW0r.mozrunner/minidumps/3FEC79A8-396C-4D04-A56D-529861434139.dmp
22:04:08     INFO -  Operating system: Mac OS X
22:04:08     INFO -                    10.8.0 12A269
22:04:08     INFO -  CPU: amd64
22:04:08     INFO -       family 6 model 42 stepping 7
22:04:08     INFO -       8 CPUs
22:04:08     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
22:04:08     INFO -  Crash address: 0x0
22:04:08     INFO -  Thread 55 (crashed)
22:04:08     INFO -   0  XUL!mozilla::TrackUnionStream::CopyTrackData(mozilla::StreamBuffer::Track*, unsigned int, long long, long long, bool*) [nsTArray.h:a99f3b29c4a8 : 866 + 0x0]
22:04:08     INFO -      rbx = 0x00007fff7e795c68   r12 = 0x0000000000000e80
22:04:08     INFO -      r13 = 0x0000000000062d80   r14 = 0x00000001114c3cd0
22:04:08     INFO -      r15 = 0x00000000000f4240   rip = 0x000000010301d608
22:04:08     INFO -      rsp = 0x0000000127e809a0   rbp = 0x0000000127e80a80
22:04:08     INFO -      Found by: given as instruction pointer in context
22:04:08     INFO -   1  XUL!mozilla::TrackUnionStream::ProcessInput(long long, long long, unsigned int) [TrackUnionStream.h:a99f3b29c4a8 : 77 + 0x1f]
22:04:08     INFO -      rbx = 0x000000011282dd60   r12 = 0x0000000000000000
22:04:08     INFO -      r13 = 0x0000000000000000   r14 = 0x000000010c977c80
22:04:08     INFO -      r15 = 0x0000000000000000   rip = 0x000000010301cb87
22:04:08     INFO -      rsp = 0x0000000127e80a90   rbp = 0x0000000127e80b60
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   2  XUL!mozilla::MediaStreamGraphImpl::Process(long long, long long) [MediaStreamGraph.cpp:a99f3b29c4a8 : 1302 + 0x1b]
22:04:08     INFO -      rbx = 0x0000000000062f80   r12 = 0x0000000000000000
22:04:08     INFO -      r13 = 0x000000010f23aae0   r14 = 0x00000001114c3cd0
22:04:08     INFO -      r15 = 0x00000001114c3cd0   rip = 0x0000000103007480
22:04:08     INFO -      rsp = 0x0000000127e80b70   rbp = 0x0000000127e80bd0
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   3  XUL!mozilla::MediaStreamGraphImpl::OneIteration(long long, long long, long long, long long) [MediaStreamGraph.cpp:a99f3b29c4a8 : 1381 + 0xe]
22:04:08     INFO -      rbx = 0x000000010f23aae0   r12 = 0x000000010f23aae0
22:04:08     INFO -      r13 = 0x00000001066fe508   r14 = 0x0000000000062f80
22:04:08     INFO -      r15 = 0x000000000000ac44   rip = 0x000000010300789b
22:04:08     INFO -      rsp = 0x0000000127e80be0   rbp = 0x0000000127e80c60
22:04:08     INFO -      Found by: call frame info
22:04:08     INFO -   4  XUL!mozilla::ThreadedDriver::RunThread() [GraphDriver.cpp:a99f3b29c4a8 : 255 + 0x4]
22:04:08     INFO -      rbx = 0x000000010c9977b0   r12 = 0x000000010f23aae0
22:04:08     INFO -      r13 = 0x00000001066fe508   r14 = 0x0000000000062a1c
22:04:08     INFO -      r15 = 0x000000000000ac44   rip = 0x0000000102fd1c3f
22:04:08     INFO -      rsp = 0x0000000127e80c70   rbp = 0x0000000127e80cb0
22:04:08     INFO -      Found by: call frame info
Duplicate of this bug: 1059786
Duplicate of this bug: 1061177
Duplicate of this bug: 1061677
Duplicate of this bug: 1062238
Duplicate of this bug: 1063292
Duplicate of this bug: 1064291
Duplicate of this bug: 1061217
This bug is attracting dupes to it like fruit flies to a rotting banana. Can we please find a way to make some time for this?
Flags: needinfo?(paul)
Duplicate of this bug: 1067979
Duplicate of this bug: 1067990
Duplicate of this bug: 1068858
Duplicate of this bug: 1069298
Assignee: nobody → rjesup
 MOZ_ASSERT(startTicks == TimeToTicksRoundUp(rate, outputStart), "Samples missing");

So, every one of these shows an assertion failure at TimeToTicksRoundUp(), preceded by one or more (usually 3-6) WARNING: Slice out of range: 'aStart >= 0 && aEnd <= aSource.mDuration', file /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/MediaSegment.h, line 282
Attachment #8496142 - Attachment is obsolete: true
Comment on attachment 8496145 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.

I'll take a review from whomeever can get to it first.  It's a really trivial patch.
Attachment #8496145 - Flags: review?(roc)
Attachment #8496145 - Flags: review?(karlt)
Whiteboard: [webrtc-uplift]
Comment on attachment 8496145 [details] [diff] [review]
assert only on time going backwards.  If time jumps forwards (an underflow due to hardware/VM/load stalling us too long), just warn about it - underflows on hardware that stalls you too long are unavoidable.

Review of attachment 8496145 [details] [diff] [review]:
-----------------------------------------------------------------

I'm skeptical of your explanation. Underflows should not cause the track to get out of sync here. Can you explain what you think is happening in more detail?
Attachment #8496145 - Flags: review?(roc) → review-
https://tbpl.mozilla.org/php/getParsedLog.php?id=49050179&tree=Mozilla-Inbound

mozilla-inbound_snowleopard-debug_test-mochitest-browser-chrome-1
revision: aaca52b09d8f

13:33:49     INFO -  Assertion failure: startTicks == TimeToTicksRoundUp(rate, outputStart) (Samples missing), at /builds/slave/m-in-osx64-d-00000000000000000/build/content/media/TrackUnionStream.h:267
13:33:52     INFO -  601 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/base/content/test/general/browser_devices_get_user_media.js | application terminated with exit code 1
Attached patch Fix assert (obsolete) — Splinter Review
This bug is blocking bug 994912 from landing; you can reproduce 100% of the time with my changes in a debug build in case reproducibility is holding up a resolution.

I was looking at it independent from this and wrote the attached patch to resolve it, thinking I was the only one suffering. But I don't fully understand the module, so I'm not sure if I did it right :).

My understanding is as follows:

content/media/TrackUnionStream.h:283
>        // These are the properties we need to ensure:
>        // 1) Exactly 'ticks' ticks of output are produced, i.e.
>        // inputEndTicks - inputStartTicks = ticks.
>        // 2) inputEndTicks <= aInputTrack->GetSegment()->GetDuration().
>        // 3) In any sequence of intervals where neither stream is blocked,
>        // the content of the input track we use is a contiguous sequence of
>        // ticks with no gaps or overlaps.
>        if (map->mEndOfLastInputIntervalInInputStream != inputStart ||
>            map->mEndOfLastInputIntervalInOutputStream != outputStart) {
>          // Start of a new series of intervals where neither stream is blocked.
>          map->mEndOfConsumedInputTicks = TimeToTicksRoundDown(rate, inputStart) - 1;
>        }
>        TrackTicks inputStartTicks = map->mEndOfConsumedInputTicks;
>        TrackTicks inputEndTicks = inputStartTicks + ticks;

There is a long proof in the comments for property #3, but it hinges on the statement below:

content/media/TrackUnionStream.h:331
>        // (now using the fact that inputEnd <= track->GetEndTimeRoundDown() for a non-ended track)

But in the if statement I quoted above, we actually change the inputEndTicks, such that in the non-trivial case, this assumption is invalidated and it may go past the track ending.. In my testing usually start time is way ahead of the actual track end time, so there are no acceptable samples only stale samples (underrun?). The vast majority of the time that the if is tripped, all of the values of 0, so the math works out (trivial case), I'm guessing that happens at the beginning.
Attachment #8497888 - Flags: feedback?(roc)
Attachment #8496145 - Flags: review?(karlt)
So I'll note that ALL of these crashes are due to the only line of code added to this section since Dec 2013 - the one MOZ_ASSERT() line that's crashing us, which padenot added.  We may have been failing this all along occasionally.
Comment on attachment 8497888 [details] [diff] [review]
Fix assert

Review of attachment 8497888 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable.

Bug 1061046 vastly simplifies all this logic but I won't be able to land that for a couple of weeks.
Attachment #8497888 - Flags: feedback?(roc) → feedback+
Attached patch Fix assert, v2Splinter Review
Updated with a brief comment explaining underrun can cause the code documented assumptions to be violated.

try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=8224896bea5e
rerun try on busted cases I don't think this impacts: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=20dfba0dceba

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (on vacation Sep 27 to Oct 12) from comment #146)
> Comment on attachment 8497888 [details] [diff] [review]
> Fix assert
> 
> Review of attachment 8497888 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks reasonable.
> 
> Bug 1061046 vastly simplifies all this logic but I won't be able to land
> that for a couple of weeks.

I reviewed this afterwards, specifically: https://bugzilla.mozilla.org/attachment.cgi?id=8492532&action=diff

I think that is complementary to my change, since it still has some (original) questionable logic:
1) inputTrackEndPoint is only updated if the track is ended, but even if it hasn't we still can't go past the duration boundary
2) if we select fewer samples than "TrackTicks ticks", it will complain at the end indicating the stream is out of sync; this could easily happen if it prefers inputTrackEndPoint (the correct value from 1 above) in the std::min calculations, so we still need to do the null calculations
Attachment #8497888 - Attachment is obsolete: true
Attachment #8498224 - Flags: review?(roc)
Flags: needinfo?(padenot)
Keywords: checkin-needed
Attachment #8496145 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/0a8b7e2db5fd

So far, so good on inbound, FWIW :)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Looking good! Please nominate this for Aurora approval when you get a chance :)
Comment on attachment 8498224 [details] [diff] [review]
Fix assert, v2

Approval Request Comment
[Feature/regressing bug #]: bug 848954

[User impact if declined]: lots of starring randomly across the tree (top webrtc/media orange) - or forcing us to disable the MOZ_ASSERT Paul added, which caught this extant bug.

[Describe test coverage new/current, TBPL]: all of tbpl that touches MSG tests it.

[Risks and why]: low risk - tweaks how tick ranges are decided on.  The code survived being off by a tick before we added the assert.

[String/UUID change made/needed]: none.
Attachment #8498224 - Flags: approval-mozilla-aurora?
Flags: needinfo?(rjesup)
Comment on attachment 8498224 [details] [diff] [review]
Fix assert, v2

Aurora+
Attachment #8498224 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 1096078
You need to log in before you can comment on or make changes to this bug.