Closed Bug 929029 Opened 11 years ago Closed 11 years ago

[B2G][Settings][Sound] - crash in nsSimpleURI::Release()

Categories

(Core :: Audio/Video, defect)

26 Branch
ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla28
blocking-b2g 1.3+
Tracking Status
firefox27 --- wontfix
firefox28 --- fixed
b2g-v1.2 --- unaffected

People

(Reporter: KTucker, Assigned: sotaro)

References

()

Details

(Keywords: crash, regression, reproducible, Whiteboard: [b2g-crash])

Crash Data

Attachments

(2 files, 11 obsolete files)

15.28 KB, text/plain
Details
10.09 KB, patch
sotaro
: review+
Details | Diff | Splinter Review
Attached file RingtoneCrashLog.txt
Description:
A crash will occur when setting a song as a ringtone through settings.

Repro Steps:
1)  Updated Buri to Build ID: 20131014080339
2)  Tap on the "Settings" icon.
3)  Tap on "Sound". 
4)  Tap on the box below "Ringer".
5)  Tap on "Music". 
6)  Tap on a "Song".
7)  Tap on the "Play" icon to play the song.
8)  Tap on the "Checked box" in the upper right corner of the screen to save the song as the ringtone.

Actual:
Settings will crash. Signature: nsSimpleURI::Release()

Expected:
The song is saved as the ringtone without a crash occurring.

Environmental Variables
Device: Buri v 1.3.0 Mozilla RIL
Build ID: 20131014080339
Gecko: http://hg.mozilla.org/mozilla-central/rev/ab8e70fb76a8
Gaia: 74e7e69d98206bda5c517c0ada57d43de662913e
Platform Version: 27.0a1

Notes:
Repro frequency: 100%
See attached: logcat
This issue reproduces on 10/14 on Buri v 1.3.0 Mozilla RIL

Environmental Variables
Device: Buri v 1.3.0 Mozilla RIL
Build ID: 20131014080339
Gecko: http://hg.mozilla.org/mozilla-central/rev/ab8e70fb76a8
Gaia: 74e7e69d98206bda5c517c0ada57d43de662913e
Platform Version: 27.0a1

Settings will crash after setting a song as a ringtone.


This issue does not reproduce on the 10/15 Buri v 1.3.0 Mozilla RIL

Environmental Variables
Device: Buri v 1.3.0 Mozilla RIL
Build ID: 20131015040202
Gecko: http://hg.mozilla.org/mozilla-central/rev/febfe3c7732b
Gaia: 17e871ae1f82699793e3cd28acda805ba724a8b6
Platform Version: 27.0a1

The user can set a song as a ringtone through settings without a crash occurring.
See Also: → 926125
Crash Signature: https://crash-stats.mozilla.com/report/index/61d71bce-4be1-47ef-b080-90d362131021 → [@ nsSimpleURI::Release() ]
This bug was filed from the Socorro interface and is 
report bp-61d71bce-4be1-47ef-b080-90d362131021.
Severity: normal → critical
Summary: [B2G][Settings][Sound] - nsSimpleURI::Release() → [B2G][Settings][Sound] - crash in nsSimpleURI::Release()
Component: Gaia::Settings → Video/Audio
Product: Boot2Gecko → Core
Version: unspecified → 26 Branch
blocking-b2g: --- → 1.3?
Keywords: reproducible
This isn't in my area (WebRTC or Web Audio), but this should block 1.3 IMO
(In reply to Maire Reavy [:mreavy] from comment #3)
> This isn't in my area (WebRTC or Web Audio), but this should block 1.3 IMO

Agreed.
Hi Hema,

Could you check if this one is on your team for the triage?
Flags: needinfo?(hkoka)
(In reply to Ivan Tsay (:ITsay) from comment #5)
> Hi Hema,
> 
> Could you check if this one is on your team for the triage?

FWIW - Hema's team to my understanding handles triaging Gaia bugs for Gallery, Video, Music, FMRadio, and Camera. However, that doesn't include gecko bugs.

This bug is a gecko content media crash. Very likely something that Chris Double or Sotaro would look into.
Sotaro or media recording team is most likely the best person to look into this.
Flags: needinfo?(hkoka)
(In reply to Hema Koka [:hema] from comment #7)
> Sotaro or media recording team is most likely the best person to look into
> this.

Sotaro, would you mind take a look this one and see if you know who can handle this. It's not under the area of media recording team. I also checked with CJ and he had no idea about it.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Ivan Tsay (:ITsay) from comment #8)
> Sotaro, would you mind take a look this one and see if you know who can
> handle this. It's not under the area of media recording team. I also checked
> with CJ and he had no idea about it.

Okey, I am going take a look.
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
I can not re-generate the crash from a ROM built from today's master on hamachi.
QA Wanted for a retest
Keywords: qawanted
(In reply to ktucker from comment #2)
> This bug was filed from the Socorro interface and is 
> report bp-61d71bce-4be1-47ef-b080-90d362131021.

From the crash log, it seems mp3 regression like Bug 927884. In the crash log, the following function is present. OmxDecoderProcessCachedDataTask does mp3 parse on io thread. But I am not sure why it is a problem.

> mozilla::OmxDecoderProcessCachedDataTask::~OmxDecoderProcessCachedDataTask
Blocks: 927884
QA Contact: sarsenyev
The crash still reproduces 100% 6/6 on the latest master Buri build 
https://crash-stats.mozilla.com/report/index/0736ee5e-5100-418f-9dac-ce6fd2131107

But I wasn't able to reproduce it when I had just a few songs on my memory card, but when I filled the memory card with 1 GB songs, the bug reproduces for 100%, as far as I open the ringtone category and choosing the "Music library", the app crashes even without choosing any songs

Device: Buri 1.3 Master build
BuildID: 20131106040203
Gaia: 3b5fe429f2414f2a9d7241b311b399033bb10612
Gecko: 9ba3faa35c96
Version: 28.0a1
Firmware Version: US_20131015
Keywords: qawanted
I think the problem here is that we are destroying the MediaResource on a non-main thread, and that's release the nsSimpleURI on a non-main thread, and nsSimpleURI's AddRef/Release aren't threadsafe.

MediaResource must be destroyed on the main thread. If necessary you need to dispatch an event to the main thread to release the MediaResource on the main thread, like I did in the Windows Media Foundation backend here:

http://mxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFByteStream.cpp#129
Hi,

This implements the proposed fix. I haven't been able to see this bug, so could someone how is able to reproduce it try the patch?
Attachment #829273 - Flags: review?(cpearce)
Attachment #829273 - Flags: feedback?(sarsenyev)
Thanks! I could reproduce it. I am going to comfirm the patch.
I can not reproduce the crash by Steps in Comment 0. But could reproduce it by the following Steps.

Prepare:
   Store more than 1 hr mp3 files in sd.

Repro Steps:
1)  Tap on the "Settings" icon.
2)  Tap on "Sound". 
3)  Tap on the box below "Ringer".
4)  Tap on "Music". 
5)  Tap on the long mp3 file.
6)  Tap on the "Play" icon to play the song.
7)  Soon after 6) Tap on the "<" in the upper left corner of the screen.

continue 6) 7) until crash.
Thomas can you take a bug? You are actually working for the bug.
Flags: needinfo?(tzimmermann)
Ok, I take the bug, but it's still not reproducible here.
Assignee: sotaro.ikeda.g → tzimmermann
Flags: needinfo?(tzimmermann)
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #20)
> Ok, I take the bug, but it's still not reproducible here.

If mp3 file is very big and sd card is slow, the crash seems easier to reproduce. I am using more than 64 MB mp3 files.
The crash is still happens after applying attachment 829273 [details] [diff] [review].
Attached file crash of stacktrace (obsolete) —
After MediaOmxReader was destroyed, OmxDecoderProcessCachedDataTask becomes only object referencing OmxCodec. Then OmxDecoderProcessCachedDataTask's destructor triggers OmxCodec's destructor. And then FileMediaResource's destructor is called in non-main thread.
(In reply to Sotaro Ikeda [:sotaro] from comment #24)
> After MediaOmxReader was destroyed, OmxDecoderProcessCachedDataTask becomes
> only object referencing OmxCodec. Then OmxDecoderProcessCachedDataTask's
> destructor triggers OmxCodec's destructor. And then FileMediaResource's
> destructor is called in non-main thread.

I probably just introduced another race condition.

With the patch applied this actually shouldn't happen, because there should always be a OmxDecoderNotifyDataArrivedRunnable that still contains a reference to the OmxDecoder.

I suspect that this runnable finished before OmxDecoderProcessCachedDataTask.
This patch should not contain the race condition I described in my previous posting. I know the code is quite ugly, but does it solve the problem for you?
Attachment #829273 - Attachment is obsolete: true
Attachment #829273 - Flags: review?(cpearce)
Attachment #829273 - Flags: feedback?(sarsenyev)
Attachment #829375 - Flags: feedback?(sotaro.ikeda.g)
By applying attachment 829375 [details] [diff] [review], a frequency of the crash seems to be decreased. But still happens.
Attached file stacktrace of the crash 2 (obsolete) —
The stacktrace is slightly different from previous one.
Perhaps a better solution is to implement a custom MediaResourse::Release that dispatches an event to the main thread to destroy the media resource when the refcount reaches 0? Then we don't need bespoke fixes like what we're doing here and in the WMF backend.
(In reply to Chris Pearce (:cpearce) from comment #29)
> Perhaps a better solution is to implement a custom MediaResourse::Release
> that dispatches an event to the main thread to destroy the media resource
> when the refcount reaches 0? Then we don't need bespoke fixes like what
> we're doing here and in the WMF backend.

The log in attachment 829412 [details] shows that the crash also happens on the main thread. Something else seems going wrong here.
This patch implements what cpearce describes in comment 14. I can only try different fixes, since I still cannot reproduce the problem, even with large files (~150 MiB).
Attachment #829375 - Attachment is obsolete: true
Attachment #829375 - Flags: feedback?(sotaro.ikeda.g)
Attachment #830148 - Flags: feedback?(sotaro.ikeda.g)
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #30)
> 
> The log in attachment 829412 [details] shows that the crash also happens on
> the main thread. Something else seems going wrong here.

It was not on main thread.
Attached file stacktrace of the crash 3 (obsolete) —
A stack trace of the crash after applying attachment 830148 [details] [diff] [review].
Attachment #830148 - Flags: feedback?(sotaro.ikeda.g)
(In reply to Sotaro Ikeda [:sotaro] from comment #33)
> Created attachment 830219 [details]
> stacktrace of the crash 3
> 
> A stack trace of the crash after applying attachment 830148 [details] [diff] [review]
> [review].

Crash still happens by applying attachment 830148 [details] [diff] [review].
Plus this one for v1.3 since the crash issue is critical.
blocking-b2g: 1.3? → 1.3+
Flags: needinfo?(jsmith)
What's the needinfo request for specifically?
Flags: needinfo?(jsmith)
(In reply to Sotaro Ikeda [:sotaro] from comment #32)
> (In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #30)
> > 
> > The log in attachment 829412 [details] shows that the crash also happens on
> > the main thread. Something else seems going wrong here.
> 
> It was not on main thread.

Oh, you're right. I just saw the runnable in the stack trace, but there is also the task structure involved.
From stack 3, it looks like there is more involved here than just MediaResource.

The second patch still didn't work as expected. This patch should (hopefully) release OmxDecoder and anything it references on the main thread after the reference on the I/O thread is gone.
Attachment #830148 - Attachment is obsolete: true
Attachment #830782 - Flags: feedback?(sotaro.ikeda.g)
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #38)
> Created attachment 830782 [details] [diff] [review]
> [01] Bug 929029: Release OmxDecoder on main thread (v4)
> 

I checked the  attachment 830782 [details] [diff] [review], the crash still happens(not often).
I feel that it might be better to use nsMainThreadPtrHandle and nsMainThreadPtrHolder to fix this problem. like camera does like following.

http://mxr.mozilla.org/mozilla-central/source/dom/camera/CameraControlImpl.h#159
Sotaro,

Can I give this bug to you? I won't be able to work on it for some time, and I'm still not able to reproduce it. I tried to move the release of OmxDecoder to the main thread, but OmxDecoder is freed on the I/O thread in all stack traces you provided. Maybe there is something wrong with the reference counting of OmxDecoder.
Assignee: tzimmermann → sotaro.ikeda.g
NO problem. I take this bug.
Attachment #830782 - Flags: feedback?(sotaro.ikeda.g)
Attachment #829412 - Attachment is obsolete: true
Attachment #830219 - Attachment is obsolete: true
Attachment #829307 - Attachment is obsolete: true
Attachment #830988 - Attachment is obsolete: true
By applying the patch, I confirmed the crash seems to be fixed on master hamachi.
Attachment #830782 - Attachment is obsolete: true
Fix a nit.
Attachment #8338520 - Attachment is obsolete: true
Attachment #8338523 - Flags: review?(chris.double)
Attachment #8338523 - Flags: review?(chris.double) → review+
https://tbpl.mozilla.org/?tree=Try&rev=a629087121ba
A lot of tests failed at ChannelMediaResource::OpenChannel() :-(
You should probably get roc to check this patch too. He wrote most of MediaResource and MediaCache, so he understands the assumptions the code makes best.
After fixed the crash, I will ask to roc.
Fix as to use mChannel.get() for checking nullptr.
Attachment #8338523 - Attachment is obsolete: true
(In reply to Sotaro Ikeda [:sotaro] from comment #50)
> https://tbpl.mozilla.org/?tree=Try&rev=1356dc623fdd
> almost green.

In the orange, the following test are failed related to video/audio. It is already handled by Bug 943556. 
- /tests/toolkit/content/tests/widgets/test_videocontrols_standalone.html
Attachment #8340372 - Flags: review?(roc)
https://tbpl.mozilla.org/?tree=Try&rev=6ba0ed9b44ab
re-tryserver. NO media related test failure.
Committable patch. Carry "roc, doublec review+".
Attachment #8340372 - Attachment is obsolete: true
Attachment #8340771 - Flags: review+
Keywords: checkin-needed
Hmm, I did not saw this crash on tryserver tests.
(In reply to Ryan VanderMeulen [:RyanVM UTC-5][Back 2-Dec] from comment #55)
> https://tbpl.mozilla.org/php/getParsedLog.php?id=31322007&tree=B2g-Inbound
> https://tbpl.mozilla.org/php/getParsedLog.php?id=31321907&tree=B2g-Inbound
> https://tbpl.mozilla.org/php/getParsedLog.php?id=31322194&tree=B2g-Inbound

These failure seems not related to this bug. I checked the dom/media source code. These code does not use nsMainThreadPtrHandle and nsMainThreadPtrHolder for JS objects they are referenced from non main thread.
Blocks: 945334
No longer blocks: 945334
Depends on: 945334
(In reply to Sotaro Ikeda [:sotaro] from comment #57)
> These failure seems not related to this bug. I checked the dom/media source
> code. These code does not use nsMainThreadPtrHandle and
> nsMainThreadPtrHolder for JS objects they are referenced from non main
> thread.

Created Bug 945334.
You're right, I was being dumb. It was bug 926746 that caused it. Re-landed.
https://hg.mozilla.org/integration/b2g-inbound/rev/f9d8f53e8739
https://hg.mozilla.org/mozilla-central/rev/f9d8f53e8739
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: