Closed Bug 1325707 Opened 3 years ago Closed 3 years ago

Crash in opus_packet_get_samples_per_frame

Categories

(Core :: Audio/Video: Playback, defect, critical)

x86
Windows 8
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox51 --- unaffected
firefox52 + fixed
firefox-esr52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: marcia, Assigned: jya)

References

Details

(Keywords: crash, csectype-nullptr, regression)

Crash Data

Attachments

(7 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-8ee530aa-6ffc-4b2b-b63f-88f9c2161223.
=============================================================

Seen while looking at nightly crash data: http://bit.ly/2hh7vm9. Crashes started using the 20161223030226 build.

Possible regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f179934df0c1bab590c558485d419c7910e41325&tochange=2785aaf276ba29fb2e1f5607d90d441fee42efb4

ni on :jya for some possible insight.
Flags: needinfo?(jyavenard)
7 crashes in Nightly 20161223030226, making it the #6 top crash.

All the crashes have address 0x0, so it looks like |data| might be null.

jya is in the process of moving countries, so I'll ask kinetik as well for any suggestions.
Flags: needinfo?(kinetik)
As I don't see anything WebRTC related in the stack traces I'm assuming this is not WebRTC specific.
Component: WebRTC: Audio/Video → Audio/Video: Playback
At a guess, it's bug 1321076 that affected the webm demuxer (where opus is most used)
Blocks: 1321076
Flags: needinfo?(jyavenard)
Keywords: regression
actually, maybe it's bug 1323390 if that impacts opus in mp4.

there's no URL provided in those crash reports.

Change in bug 1323390 appears wrong to me anyway.
No longer blocks: 1321076
(In reply to Jean-Yves Avenard [:jya] from comment #4)
> actually, maybe it's bug 1323390 if that impacts opus in mp4.

It checks profile in ESDS, and I believe ESDS is not exist in opus.

> 
> there's no URL provided in those crash reports.
> 
> Change in bug 1323390 appears wrong to me anyway.

As we discussed on irc, please let me know if anything is wrong to you.
(In reply to Marcia Knous [:marcia - use ni] from comment #0)
> Seen while looking at nightly crash data: http://bit.ly/2hh7vm9. Crashes
> started using the 20161223030226 build.

I see the same crash going back as far as the 20161218030213 build in 53.0a1, so the regression range may be wider than initially suggested.

There are also three similar crashes (in the last 6 months) in opus_packet_get_samples_per_frame in older versions which may or may not be related... one less likely alternate possibility to a recent regression is that a certain type of file triggers this and it just happened to spike recently.
Flags: needinfo?(kinetik)
All the 53-and-newer crashes appear to be null; older ones look like nullptr (we have two 54 crashes this week already).

This appears to be a problem outside of Opus proper (just to note); it's being passed a null ptr (and apparently in DeDecode() it believes the number of frames in the null data block is not 0; else it would have exited earlier.)

jya?  gerald?
Flags: needinfo?(jyavenard)
Flags: needinfo?(gsquelart)
the signature now also appeared in 52.0b2 - probably due to the uplift of bug 1320705...
jya, gerald - since this is in 52, we'll want to get a solution to it soon.  perhaps bug 1320705 is a hint
Will work on it on Monday
Assignee: nobody → jyavenard
Flags: needinfo?(gsquelart)
Flags: needinfo?(jyavenard)
Comment on attachment 8834023 [details]
Bug 1325707: P2. Handle OOM conditions when creating MediaRawData object.

https://reviewboard.mozilla.org/r/110134/#review111244

::: dom/media/ogg/OggCodecState.cpp:265
(Diff revision 2)
> +    !IsHeader(packet),
> +    "PacketOutAsMediaRawData can only be called on non-header packets");
>    RefPtr<MediaRawData> sample = new MediaRawData(packet->packet, packet->bytes);
> +  if (!sample->Data()) {
> +    // OOM.
> +    return nullptr;

Need to release packet
Comment on attachment 8834022 [details]
Bug 1325707: P1. Check returned value.

https://reviewboard.mozilla.org/r/110132/#review111232
Attachment #8834022 - Flags: review?(gsquelart) → review+
Comment on attachment 8834023 [details]
Bug 1325707: P2. Handle OOM conditions when creating MediaRawData object.

https://reviewboard.mozilla.org/r/110134/#review111252

::: dom/media/ogg/OggCodecState.cpp:265
(Diff revision 2)
> +    !IsHeader(packet),
> +    "PacketOutAsMediaRawData can only be called on non-header packets");
>    RefPtr<MediaRawData> sample = new MediaRawData(packet->packet, packet->bytes);
> +  if (!sample->Data()) {
> +    // OOM.
> +    return nullptr;

Or really you should have an "AutoReleasePacket" RAII thing. ;-)
But I guess that could be a separate bug, in the meantime just releasing it is fine.
Attachment #8834023 - Flags: review?(gsquelart) → review+
Comment on attachment 8834024 [details]
Bug 1325707: P3. Fix coding style.

https://reviewboard.mozilla.org/r/110136/#review111254

r+ with nits:

::: dom/media/ogg/OggCodecState.h:68
(Diff revision 2)
> -  ogg_packet* PopFront() { return static_cast<ogg_packet*>(nsDeque::PopFront()); }
> -  ogg_packet* PeekFront() { return static_cast<ogg_packet*>(nsDeque::PeekFront()); }
> -  ogg_packet* Pop() { return static_cast<ogg_packet*>(nsDeque::Pop()); }
> -  ogg_packet* operator[](size_t aIndex) const
> +  ogg_packet *PopFront()
> +  {
> +    return static_cast<ogg_packet *>(nsDeque::PopFront());
> +  }
> +  ogg_packet *PeekFront()
> +  {
> +    return static_cast<ogg_packet *>(nsDeque::PeekFront());
> +  }
> +  ogg_packet *Pop() { return static_cast<ogg_packet *>(nsDeque::Pop()); }
> +  ogg_packet *operator[](size_t aIndex) const

Please put the *'s back with the types they belong to.

And fix your code formatter! Other files were correctly formatted, so maybe it thinks this particular file is C code? You may need to force it to consider all C-looking files to be C++, if that's possible.

::: dom/media/ogg/OggCodecState.h:282
(Diff revision 2)
> -  bool SetCodecSpecificConfig(MediaByteBuffer* aBuffer, OggPacketQueue& aHeaders);
> +  bool SetCodecSpecificConfig(MediaByteBuffer *aBuffer,
> +                              OggPacketQueue &aHeaders);

'type* param'
'type& param'

::: dom/media/webm/WebMDemuxer.cpp:690
(Diff revision 2)
> -            mSharedVideoTrackInfo = new SharedTrackInfo(mInfo.mVideo, ++sStreamSourceID);
> +            mSharedVideoTrackInfo =
> +                new SharedTrackInfo(mInfo.mVideo, ++sStreamSourceID);

Too far, there should only be a 2-space indent for the 2nd line.
Attachment #8834024 - Flags: review?(gsquelart) → review+
Comment on attachment 8834025 [details]
Bug 1325707: P4. Fix coding style of MediaDataDemuxers.

https://reviewboard.mozilla.org/r/110138/#review111262

The usual, y'know:

::: dom/media/ADTSDemuxer.cpp:569
(Diff revision 2)
>    FastSeek(media::TimeUnit());
>  }
>  
>  RefPtr<ADTSTrackDemuxer::SkipAccessPointPromise>
> -ADTSTrackDemuxer::SkipToNextRandomAccessPoint(const media::TimeUnit& aTimeThreshold)
> +ADTSTrackDemuxer::SkipToNextRandomAccessPoint(
> +  const media::TimeUnit &aTimeThreshold)

'type& param'

::: dom/media/MP3Demuxer.cpp:473
(Diff revision 2)
> -VerifyFrameConsistency(
> -    const FrameParser::Frame& aFrame1, const FrameParser::Frame& aFrame2) {
> +VerifyFrameConsistency(const FrameParser::Frame &aFrame1,
> +                       const FrameParser::Frame &aFrame2)

'type& param'

::: dom/media/flac/FlacDemuxer.cpp:647
(Diff revision 2)
>  {
>    static const int BUFFER_SIZE = 4096;
>  
>    // First check if we have a valid Flac start.
>    char buffer[BUFFER_SIZE];
> -  const uint8_t* ubuffer = reinterpret_cast<uint8_t*>(buffer); // only needed due to type constraints of ReadAt.
> +  const uint8_t *ubuffer = // only needed due to type constraints of ReadAt.

'type* var'

::: dom/media/fmp4/MP4Demuxer.cpp:463
(Diff revision 2)
>    return NS_OK;
>  }
>  
>  RefPtr<MP4TrackDemuxer::SkipAccessPointPromise>
> -MP4TrackDemuxer::SkipToNextRandomAccessPoint(const media::TimeUnit& aTimeThreshold)
> +MP4TrackDemuxer::SkipToNextRandomAccessPoint(
> +  const media::TimeUnit &aTimeThreshold)

'type& param'

::: dom/media/mediasource/MediaSourceDemuxer.h:109
(Diff revision 2)
>    void Reset() override;
>  
>    nsresult GetNextRandomAccessPoint(media::TimeUnit* aTime) override;
>  
> -  RefPtr<SkipAccessPointPromise> SkipToNextRandomAccessPoint(const media::TimeUnit& aTimeThreshold) override;
> +  RefPtr<SkipAccessPointPromise>
> +  SkipToNextRandomAccessPoint(const media::TimeUnit &aTimeThreshold) override;

'type& param'

::: dom/media/mediasource/MediaSourceDemuxer.h:124
(Diff revision 2)
> -  already_AddRefed<MediaRawData> GetSample(MediaResult& aError);
> +  DoSkipToNextRandomAccessPoint(const media::TimeUnit &aTimeThreadshold);
> +  already_AddRefed<MediaRawData> GetSample(MediaResult &aError);

'type& param'
Attachment #8834025 - Flags: review?(gsquelart) → review+
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0afa9c6cd411
P1. Check returned value. r=gerald
https://hg.mozilla.org/integration/autoland/rev/c15456c1b320
P2. Handle OOM conditions when creating MediaRawData object. r=gerald
https://hg.mozilla.org/integration/autoland/rev/7266481cd443
P3. Fix coding style. r=gerald
https://hg.mozilla.org/integration/autoland/rev/cb7b14a913fd
P4. Fix coding style of MediaDataDemuxers. r=gerald
Tracking for 52 as new regression.  Is this something we can consider uplifting?
Hi Jean-Yves, is this something we should uplift to beta52 and aurora53? If yes, please nominate for uplift.
Flags: needinfo?(jyavenard)
Comment on attachment 8834022 [details]
Bug 1325707: P1. Check returned value.

Approval Request Comment
[Feature/Bug causing the regression]: 1320705
[User impact if declined]: crashes
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: no. The crash rate was rather low to start with, I'm not 100% sure the fix is *the* fix. But it can't hurt
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: Need P1 and P2
[Is the change risky?]: Somewhat low
[Why is the change risky/not risky?]: We now refuse anything that contain a null data allocated, that includes what would otherwise be valid packet (like those whose packet size is 0). We'll get to know soon enough
[String changes made/needed]: none
Flags: needinfo?(jyavenard)
Attachment #8834022 - Flags: approval-mozilla-beta?
Attachment #8834022 - Flags: approval-mozilla-aurora?
forgot to add, unfortunately, no one care about ogg...
Comment on attachment 8834022 [details]
Bug 1325707: P1. Check returned value.

A potential crash fix. Aurora53+.
Attachment #8834022 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8834022 [details]
Bug 1325707: P1. Check returned value.

avoid a crash on OOM, beta52+
Attachment #8834022 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8834023 [details]
Bug 1325707: P2. Handle OOM conditions when creating MediaRawData object.

[Triage Comment]
Attachment #8834023 - Flags: approval-mozilla-beta+
needs rebasing for beta because part 2 fails like:

warning: conflicts while merging dom/media/webm/WebMDemuxer.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(jyavenard)
This is really just for cleanliness perspective, as it can never happen, we always check that demuxing a packet succeeded first.

MozReview-Commit-ID: FQ1yz16m9Ix
Flags: needinfo?(jyavenard)
Depends on: 1340129
Setting qe-verify- based on jya's assessment on testing needs (Comment 32) and the fact that this patch has automated coverage.
Flags: qe-verify-
Duplicate of this bug: 1325920
It is still happening :(
https://crash-stats.mozilla.com/report/index/b456f391-f739-4712-baea-06e9e2170220

I don't understand what is happening here.
The patches earlier, always check that if Data() returned null, then we must have Size() return 0; everything else is an error (OOM) and will be aborted. 

So it crashes here:
https://hg.mozilla.org/mozilla-central/annotate/c3cbadc5d2fa/dom/media/platforms/agnostic/OpusDecoder.cpp#l189

because aData() returned nullptr.

However, 3 lines above we have:
https://hg.mozilla.org/mozilla-central/annotate/18b5d19d0abc/dom/media/platforms/agnostic/OpusDecoder.cpp#l188

  uint32_t frames_number = opus_packet_get_nb_frames(aSample->Data(),
                                                     aSample->Size());
  if (frames_number <= 0) {

opus_packet_get_nb_frames does:
https://dxr.mozilla.org/mozilla-central/source/media/libopus/src/opus_decoder.c?q=opus_packet_get_nb_frames&redirect_type=direct#944

Which check that Size() isn't null.

So if Size() isn't 0, then Data() can't be nullptr either....

I can of course add an early return ...


Oh noooooooo..
As I'm re-read the code, I see what's going on...

DOH!!!!
unother signed/unsigned issue !
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
MozReview-Commit-ID: 26bxAyhxbsx
Attachment #8839481 - Flags: review?(gsquelart)
MozReview-Commit-ID: 26bxAyhxbsx
Comment on attachment 8839481 [details] [diff] [review]
Follow-up, fix invalid unsigned comparison. r?gerald

fixed indent
Attachment #8839481 - Attachment is obsolete: true
Attachment #8839481 - Flags: review?(gsquelart)
Attachment #8839482 - Flags: review?(gsquelart)
Attachment #8839482 - Flags: review?(gsquelart) → review+
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7d0ea0c1c04
Follow-up, fix invalid unsigned comparison. r=gerald
Comment on attachment 8839482 [details] [diff] [review]
Follow-up, fix invalid unsigned comparison. r?gerald

Approval Request Comment
[Feature/Bug causing the regression]: 1320705
[User impact if declined]: crashes
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: So now, that is the proper fix. Though the previous fixes did fix a problem, so nothing has been lost.
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: just that one
[Is the change risky?]: No
[Why is the change risky/not risky?]: We no longer do signed comparison on a unsigned integer !
[String changes made/needed]: none
Attachment #8839482 - Flags: approval-mozilla-beta?
Attachment #8839482 - Flags: approval-mozilla-aurora?
Comment on attachment 8839482 [details] [diff] [review]
Follow-up, fix invalid unsigned comparison. r?gerald

fix crashes due to wrong signed/unsigned comparison, aurora53+, beta52+
Attachment #8839482 - Flags: approval-mozilla-beta?
Attachment #8839482 - Flags: approval-mozilla-beta+
Attachment #8839482 - Flags: approval-mozilla-aurora?
Attachment #8839482 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/c7d0ea0c1c04
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
This last patch needs rebasing for the Aurora uplift.
Flags: needinfo?(jyavenard)
You need to log in before you can comment on or make changes to this bug.