[RTSP][V2.0] Closing a audio RTSP streaming via tab page causes system crash

VERIFIED FIXED in Firefox 30

Status

defect
P1
critical
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: whsu, Assigned: ethan)

Tracking

({crash})

unspecified
2.0 S1 (9may)
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(firefox29 wontfix, firefox30 fixed, firefox31 fixed, firefox32 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

Details

(Whiteboard: [p=2][b2g-crash], crash signature)

Attachments

(2 attachments, 2 obsolete attachments)

Reporter

Description

5 years ago
Posted video WP_20140506_013.mp4
* Description:
  Closing a RTSP streaming via tag page of browser. It will cause a system crash.
  Here are the CrashID
  - https://crash-stats.mozilla.com/report/index/0149e0eb-4003-455a-849c-972ca2140506
  Attach the demo video (WP_20140506_013.mp4)
  

* Reproduction steps:
  1. Launch the following page via browser
     - http://goo.gl/FyHFNs
  2. Tap the "AMX test file"
  3. Switch to tag page after RTSP streaming is ready to play.
  4. Tap "X" to close the web page

* Expected result:
  Web page can be closed

* Actual result:
  A system crash happened

* Reproduction build: V2.0 (Buri)
  - Gaia      e8a08a3f7a608993f0b302371e016e73faceea70
  - Gecko     https://hg.mozilla.org/mozilla-central/rev/2897fb554990
  - BuildID   20140505160203
  - Version   32.0a1
Assignee

Comment 1

5 years ago
William,
What is the reproduction rate of this bug?
I cannot reproduce it on my Unagi.
Assignee: nobody → ettseng
Severity: normal → critical
Priority: -- → P1
The crash report shows that the object is destroyed at an android thread, not at main thread.

The class SendOnDisconnectedTask might break the rule "release on main thread" because it holds a reference to the object.

http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/rtsp/controller/RtspController.cpp?from=RtspController.cpp#338

We should try debug build and modify line338 to sync dispatch. In this situation, the crash should be easily reproduced.
Duplicate of this bug: 1005867
Reporter

Comment 4

5 years ago
OOPS~ I always can reproduce it on my buri device.
I can show you tomorrow. Thanks!
Assignee

Comment 5

5 years ago
(In reply to Benjamin Chen [:bechen] from comment #2)
> We should try debug build and modify line338 to sync dispatch. In this
> situation, the crash should be easily reproduced.
I tried to run a debug build.
It did crash! But after the crash, my Unagi cannot boot anymore and I re-flashed it to a normal build to boot it.
However, I think this result reveals the root cause could quite possibly be this line.
Assignee

Comment 6

5 years ago
(In reply to William Hsu [:whsu] from comment #4)
> OOPS~ I always can reproduce it on my buri device.
> I can show you tomorrow. Thanks!
Never mind. I think we are pretty close to the root cause. :)
BTW, I will also try to reproduce this bug on a Buri phone.
Assignee

Comment 7

5 years ago
Hi Steve,

We think the root cause of this crash is a threading/timing issue.
Could you help to clarify our thoughts?

The crash report reveals the crash occurs at this line:
http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/rtsp/controller/RtspController.cpp?from=RtspController.cpp#338

327 RtspController::OnDisconnected(uint8_t index,
328                                nsresult reason)
329 {
330   LOG(("RtspController::OnDisconnected() for track %d reason = 0x%x", index, reason));
331   mState = DISCONNECTED;
332   if (mListener) {
333     nsRefPtr<SendOnDisconnectedTask> task =
334       new SendOnDisconnectedTask(mListener, index, reason);
335     // Break the cycle reference between the Listener (RtspControllerParent) and
336     // us.
337     mListener = nullptr;
338     return NS_DispatchToMainThread(task);
339   }
340   return NS_ERROR_NOT_AVAILABLE;
341 }

The member "mURI" of RtspControllerParent is declared as "nsCOMPtr<nsIURI>" and cannot be released on an off-main-thread.
I imagine line 338 will do the following things:
1) NS_DispatchToMainThread(task) executes SendOnDisconnectedTask::Run() on the main thread, which calls RtspControllerParent's OnDisconnected() that triggers a sequence of upstream release of RTSP objects.
2) After NS_DispatchToMainThread() returns, the local variable "task" leaves its scope and is being destructed, which in turn releases its reference count to mListener (RtspControllerParent).

So I guess if 1) is completed before 2), 2) decreases the ref-count of RtspControllerParent to zero and releases it here, which is running on an off-main-thread!
And the reason why we don't encounter this crash previously is because 2) usually completes before 1).

If my guess is true, I can think of two possible solutions.
a. Declare SendOnDisconnectedTask::mListener as nsMainThreadPtrHandle instead of nsCOMPtr.
b. Use forget() mechanism, such as NS_DispatchToMainThread(task.forget());

How do you think?
Is our guess and solution sane?
Flags: needinfo?(sworkman)
Wow - this is a fun one!

I think the race condition you describe sounds plausible. But I think the solution should include nsStandardURL.

task.forget() might help, or else just declaring "new SendOnDisconnectedTask" within the function call:
NS_DispatchToMainThread(new SendOn....);

But this only fixes this instance of RtspControllerParent being released off main thread.

I think it is better to do a proxy release of RtspControllerParent::mURI. Try NS_ProxyRelease in ~RtspControllerParent. This way, you avoid mURI causing crashes in the future, because it can't be released off main thread. I think it's worth trying this anyway.
Flags: needinfo?(sworkman)
Reporter

Updated

5 years ago
blocking-b2g: --- → 2.0?
Assignee

Updated

5 years ago
Status: NEW → ASSIGNED
Assignee

Comment 9

5 years ago
Steve,
We found a good example in MediaResource::Destroy(), which is added by bug 975858.
I just applied the same solution as there.
This patch would ensure RtspControllerParent is released on the main thread.

Could you help to review it?
Attachment #8419242 - Flags: review?(sworkman)
Attachment #8419242 - Flags: feedback-
Assignee

Comment 10

5 years ago
Comment on attachment 8419242 [details] [diff] [review]
Using proxy release to ensure RtspControllerParent is released on the main thread

Hi William,

Please help to apply this patch and verify whether it fixes the crash or not.
Thanks!
Attachment #8419242 - Flags: feedback- → feedback?(whsu)
Assignee

Updated

5 years ago
Whiteboard: [p=2]
Target Milestone: --- → 2.0 S2 (23may)
Reporter

Comment 11

5 years ago
Hi, Ethan,

Roger that! Thanks for the prompt help! :)
Assignee

Comment 12

5 years ago
(In reply to Ethan Tseng [:ethan] from comment #9)
> Steve,
> This patch would ensure RtspControllerParent is released on the main thread.
> Could you help to review it?
I just filed bug 1007577 in order to remove nsIInterfaceRequestor from RtspControllerParent.
I will rebase either the patch in bug 1007577 or in this bug depending on which one is landed first.
(In reply to Ethan Tseng [:ethan] from comment #9)
> Created attachment 8419242 [details] [diff] [review]
> Using proxy release to ensure RtspControllerParent is released on the main
> thread
> 
> Steve,
> We found a good example in MediaResource::Destroy(), which is added by bug
> 975858.
> I just applied the same solution as there.
> This patch would ensure RtspControllerParent is released on the main thread.
If the patch ensures the RtspControllerParent is released on the main thread,
you might need to remove the nsMainThreadPtrHandle reference to RtspControllerParent?
http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/rtsp/controller/RtspController.h?from=rtspcontroller.h#40
Assignee

Comment 14

5 years ago
(In reply to Benjamin Chen [:bechen] from comment #13)
> If the patch ensures the RtspControllerParent is released on the main thread,
> you might need to remove the nsMainThreadPtrHandle reference to
> RtspControllerParent?
> http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/rtsp/
> controller/RtspController.h?from=rtspcontroller.h#40

Yes! Thanks for your reminder!
Comment on attachment 8419242 [details] [diff] [review]
Using proxy release to ensure RtspControllerParent is released on the main thread

Review of attachment 8419242 [details] [diff] [review]:
-----------------------------------------------------------------

Ooo. That's a nice fix. Good job. r=me.

Please remove the nsMainThreadPtrs as Benjamin suggests.
Attachment #8419242 - Flags: review?(sworkman) → review+
Assignee

Comment 16

5 years ago
1. Address comment 14.
2. Refresh commit message "r=sworkman"
Attachment #8419242 - Attachment is obsolete: true
Attachment #8419242 - Flags: feedback?(whsu)
Attachment #8419867 - Flags: review+
Assignee

Comment 17

5 years ago
Fix a careless compile error.
Attachment #8419867 - Attachment is obsolete: true
Attachment #8419887 - Flags: review+
Assignee

Comment 18

5 years ago
The result of TBPL:
https://tbpl.mozilla.org/?tree=Try&rev=5b72159fe3b7

Trychooser syntax:
try: -b do -p all -u none -t none
This patch should not affect any test cases. Just run builds for all platform.
Keywords: checkin-needed
Assignee

Comment 21

5 years ago
(In reply to Steve Workman [:sworkman] from comment #15)
> Review of attachment 8419242 [details] [diff] [review]:
> -----------------------------------------------------------------
> Ooo. That's a nice fix. Good job. r=me. 
> Please remove the nsMainThreadPtrs as Benjamin suggests.

Thanks! Steve. :)
https://hg.mozilla.org/mozilla-central/rev/fcfc9fedb6b4
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: 2.0 S2 (23may) → 2.0 S1 (9may)
Duplicate of this bug: 1009032
Ethan, a duplicate of this was seen on v28 - does it need to be uplifted?
Flags: needinfo?(ettseng)
Assignee

Comment 25

5 years ago
(In reply to Steve Workman [:sworkman] from comment #24)
> Ethan, a duplicate of this was seen on v28 - does it need to be uplifted?
Yes! I think so.
I didn't uplift a bug before. Could you do it for me? I will learn it.
Flags: needinfo?(ettseng)
Ah, it's very simple actually, and you need to answer some questions, so better if you do it.

Click on 'Details' beside the attached patch, and then select approval-aurora and approval-beta - choose '?'. Some questions will be auto-filled in the comment box - just answer those and the release engineering will decide if it can be approved or not. Then, RyanVM or another code sheriff will do the landing.

At least, that's how I did it before :)
Assignee

Comment 27

5 years ago
Comment on attachment 8419887 [details] [diff] [review]
Using proxy release to ensure RtspControllerParent is released on the main thread

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1009032
User impact if declined: occasional system crash on B2G
Testing completed (on m-c, etc.): on m-c
Risk to taking this patch (and alternatives if risky): none
String or IDL/UUID changes made by this patch: none
Attachment #8419887 - Flags: approval-mozilla-beta?
Attachment #8419887 - Flags: approval-mozilla-aurora?
Assignee

Comment 28

5 years ago
(In reply to Steve Workman [:sworkman] from comment #26)
> Ah, it's very simple actually, and you need to answer some questions, so
> better if you do it.

Just did it.
Thanks for taking the time to explain this. :)
Reporter

Comment 29

5 years ago
Thanks for the help!
I verified this patch.
No crash happened during verification.
Good Job!

* Build Information:
 - Gaia      2f89c43e798ccba631025bedc47a1fb24e830cf2
 - Gecko     https://hg.mozilla.org/mozilla-central/rev/4b6d63b05a0a
 - BuildID   20140512160204
 - Version   32.0a1
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsStandardURL::Release() ]
Keywords: crash
Whiteboard: [p=2] → [p=2][b2g-crash]
Assignee

Comment 30

5 years ago
(In reply to William Hsu [:whsu] from comment #29)
> Thanks for the help!
> I verified this patch.
> No crash happened during verification.
> Good Job!

So glad to hear this!
Thanks for your verification. :)
Comment on attachment 8419887 [details] [diff] [review]
Using proxy release to ensure RtspControllerParent is released on the main thread

low risk, crash fix approving for uplift. This has existed since fx28.
Attachment #8419887 - Flags: approval-mozilla-beta?
Attachment #8419887 - Flags: approval-mozilla-beta+
Attachment #8419887 - Flags: approval-mozilla-aurora?
Attachment #8419887 - Flags: approval-mozilla-aurora+
blocking-b2g: 2.0? → ---

Updated

5 years ago
Blocks: b2g-RTSP-2.0
You need to log in before you can comment on or make changes to this bug.