Closed Bug 1282214 Opened 3 years ago Closed 3 years ago
Crash in OOM | large | mozalloc
_abort | mozalloc _handle _oom | moz _xmalloc | mp4parse _new
1.57 KB, patch
|Details | Diff | Splinter Review|
2.54 KB, patch
|Details | Diff | Splinter Review|
[Tracking Requested - why for this release]: new crash signature first showing up in firefox 48 beta builds. looking related to the new rust parsing of mp4. This bug was filed from the Socorro interface and is report bp-978e1134-f3a5-4a0d-b9af-6ff002160625. ============================================================= Crashing Thread (98) Frame Module Signature Source 0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp:33 1 mozglue.dll mozalloc_handle_oom(unsigned int) memory/mozalloc/mozalloc_oom.cpp:46 2 mozglue.dll moz_xmalloc memory/mozalloc/mozalloc.cpp:85 3 xul.dll mp4parse_new 4 @0xb46771f 5 mozglue.dll mozglue.dll@0x510f 6 xul.dll mp4_demuxer::MP4Metadata::MP4Metadata(mp4_demuxer::Stream*) media/libstagefright/binding/MP4Metadata.cpp:128 7 xul.dll mozilla::MP4Demuxer::Init() dom/media/fmp4/MP4Demuxer.cpp:134 8 xul.dll mozilla::MediaFormatReader::AsyncReadMetadata() dom/media/MediaFormatReader.cpp:264 9 xul.dll mozilla::detail::ProxyRunnable<mozilla::MozPromise<RefPtr<mozilla::MetadataHolder>, mozilla::ReadMetadataFailureReason, 1>, mozilla::MediaDecoderReader>::Run() obj-firefox/dist/include/mozilla/MozPromise.h:960 10 xul.dll mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run() obj-firefox/dist/include/mozilla/TaskDispatcher.h:192 11 xul.dll mozilla::TaskQueue::Runner::Run() xpcom/threads/TaskQueue.cpp:172 12 xul.dll nsThreadPool::Run() xpcom/threads/nsThreadPool.cpp:228 13 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:994 14 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/glue/nsThreadUtils.cpp:290 15 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:369 16 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:223 17 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:203 18 xul.dll nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:396 19 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397 20 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:95 21 msvcr120.dll _callthreadstartex f:\dd\vctools\crt\crtw32\startup\threadex.c:376 22 msvcr120.dll msvcr120.dll@0x2c000 23 kernel32.dll BaseThreadInitThunk 24 ntdll.dll __RtlUserThreadStart 25 ntdll.dll _RtlUserThreadStart
hi, maybe you could help here.
Well, slight improvement in stack traces. Nathan, what does mp4parse_new (rust code) calling moz_xalloc on Windows 7 mean for our allocator story? I also don't understand what it means that this is an OOM large; the MediaContext object it allocates is tens of bytes. https://hg.mozilla.org/releases/mozilla-beta/annotate/13b02b96281e/media/libstagefright/binding/mp4parse/lib.rs#l282
Assignee: nobody → giles
Flags: needinfo?(giles) → needinfo?(nfroyd)
I don't think this is a Rust crash; looking at the minidump, I think the stackwalker got confused and we're actually crashing here: http://dxr.mozilla.org/mozilla-central/source/media/libstagefright/binding/MP4Metadata.cpp#470 Assuming I'm reading the stack right, I think we're trying to resize the buffer to be 1MB and having problems finding a block that large. That would also make more sense for an OOM large being reported...and indeed, the allocation size in the crash report is exactly 1MB. I guess the right thing to do here is convert that code to use nsTArray and a fallible allocation?
(In reply to Nathan Froyd [:froydnj] from comment #3) > Assuming I'm reading the stack right, I think we're trying to resize the > buffer to be 1MB and having problems finding a block that large. That would > also make more sense for an OOM large being reported...and indeed, the > allocation size in the crash report is exactly 1MB. Ah, that makes much more sense. > I guess the right thing to do here is convert that code to use nsTArray and > a fallible allocation? Probably. We could also reduce the maximum allocation, but I'd be nervous doing that on beta. In bug 1267887 we replace the copy with a read callback, which should greatly reduce the allocation. We can also just disable MOZ_RUST_MP4PARSE on win32 on beta. Philipp, how serious is this? It's not in the top 100 crashes, but ~100 crashes a week is higher than I'd like. If we're failing to allocate 1 MB likely the video wouldn't have played anyway, but of course getting a 'video could not be decoded' message is better than an OOM.
Proposed patch to just disable the feature on beta. There should be no impact on users other than fewer crashes, but we'll lose the broader testing for when the allocation succeeds. I'll also prepare a fallible allocator patch, but wanted to have this ready to go as a fallback; if we're turning this off, sooner is better than later.
3 years ago
Attachment #8766113 - Flags: review?(ajones) → review+
(In reply to Ralph Giles (:rillian) needinfo me from comment #4) > Philipp, how serious is this? It's not in the top 100 crashes, > but ~100 crashes a week is higher than I'd like. i agree that it isn't a high impact issue - the signature is just 0.07% of 48.0b crashes.
Patch to return an error instead of crashing on the allocation failure. Nathan, is this sufficient to check the resize(), or do I need to declare nsFallibleTArray?
Comment on attachment 8766514 [details] [diff] [review] Make read buffer fallible for aurora 49 Review of attachment 8766514 [details] [diff] [review]: ----------------------------------------------------------------- Looks good to me; we're trying to get rid of FallibleTArray by having fallible versions of SetLength, so one can just use nsTArray everywhere and fallible operations are clearly noted in the code.
Attachment #8766514 - Flags: feedback?(nfroyd) → feedback+
Attachment #8766514 - Flags: review?(kinetik) → review+
Comment on attachment 8766514 [details] [diff] [review] Make read buffer fallible for aurora 49 Approval Request Comment [Feature/regressing bug #]: Bug 1161350 (rust mp4 parser) [User impact if declined]: Crashes watching videos in low memory conditions. [Describe test coverage new/current, TreeHerder]: local testing only; the issue was addressed differently in nightly. [Risks and why]: The main risk is that the change would trigger some other issue with a higher frequency than this bug by pushing the OOM somewhere else. I think the risk is worthwhile if we can reduce crashes, and get some indication of safety should we decide we want this on 48 beta as well. [String/UUID change made/needed]: None.
Attachment #8766514 - Flags: approval-mozilla-aurora?
Comment on attachment 8766514 [details] [diff] [review] Make read buffer fallible for aurora 49 May reduce OOM crashes in aurora. OK to land directly in aurora.
Attachment #8766514 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
rillian, beta 6 (of 10) will go to build next thursday morning, so maybe you and philipp can help assess the results of this in aurora before then. I'm willing to try it but the earlier in beta we land, the better, so we have time to judge it there and back out if needed.
i only see this signature in 48 beta builds so far (it's not on aurora at all), so we probably can't draw any conclusions from the patch landing on aurora.
Track this as this issue happened in 48 beta for a while.
Comment on attachment 8766514 [details] [diff] [review] Make read buffer fallible for aurora 49 phillipp is correct that we don't have any reports from 49. In light of that I'd like to try this on 48 beta as well. Approval Request Comment [Feature/regressing bug #]: Bug 1161350 (rust mp4 parser) [User impact if declined]: Crashes playing video in low memory conditions (~100/week on beta) [Describe test coverage new/current, TreeHerder]: Landed on m-a. [Risks and why]: Since we're hitting an OOM, it's likely we won't be able to play the video anyway; this patch just converts the crash to a more-friendly playback failure. The risk would be that by postponing the OOM we cause more crashes elsewhere. If we do judge it worse, backout or disabling the feature entirely are straightforward. [String/UUID change made/needed]: None
Attachment #8766514 - Flags: approval-mozilla-beta?
3 years ago
Priority: -- → P1
Comment on attachment 8766514 [details] [diff] [review] Make read buffer fallible for aurora 49 Review of attachment 8766514 [details] [diff] [review]: ----------------------------------------------------------------- This patch fixes a crash. Take it in 48 beta 6. And, we need to keep an eye on this fix and see if any side effect happens.
Attachment #8766514 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I don't see any crash reports for 48 beta 5, so either we've driven off everyone who was hitting this issue, or something else shadowed it before we could verify the fix. Will keep watching as beta 6 comes out.
No reports since beta4. Treating as fixed.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.