Closed
Bug 877708
Opened 12 years ago
Closed 12 years ago
[A/V] ###!!! ABORT: ChannelEventQueue::Suspend called recursively: '!mSuspended', file ../../../dist/include/mozilla/net/ChannelEventQueue.h, line 124
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 870564
People
(Reporter: mikeh, Assigned: jduell.mcbugs)
Details
Attachments
(3 files)
3.27 MB,
application/x-zip-compressed
|
Details | |
1.65 KB,
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
30.68 KB,
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
Observed while trying to reproduce bug 870564 in a DEBUG build.
STR:
1. open browser to link in bug 870564 comment 0
2. play the video
Expected results: no aborts/crashing!
Observed results: video plays for some amount of time, then assertion failure triggers a SIGSEGV:
05-30 12:05:04.903 963 963 I PRLog : 1074922744[425c92b0]: [Child 963] ###!!! ABORT: ChannelEventQueue::Suspend called recursively: '!mSuspended', file ../../../dist/include/mozilla/net/ChannelEventQueue.h, line 124
05-30 12:05:04.903 963 963 I Gecko : [Child 963] ###!!! ABORT: ChannelEventQueue::Suspend called recursively: '!mSuspended', file ../../../dist/include/mozilla/net/ChannelEventQueue.h, line 124
...
05-30 12:05:04.973 963 963 E Gecko : mozalloc_abort: [Child 963] ###!!! ABORT: ChannelEventQueue::Suspend called recursively: '!mSuspended', file ../../../dist/include/mozilla/net/ChannelEventQueue.h, line 124
...
05-30 12:05:04.983 963 963 F libc : Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
Observed on Inari with:
- gecko: b2g18:ceee454a90b5
- gaia: c51082d92f18c4c7cdbc4f96c977be735500562a
Reporter | ||
Comment 1•12 years ago
|
||
The abort in question is NS_ABORT_IF_FALSE()[1] which is removed in non-DEBUG builds[2].
1. http://mxr.mozilla.org/mozilla-central/source/netwerk/ipc/ChannelEventQueue.h#123
2. http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsDebug.h#142
Looks like this assertion/code was added in http://hg.mozilla.org/mozilla-central/rev/4a33bc1f772d
jduell, any idea on how to resolve this? (For now I'm going to comment out the macro.)
Flags: needinfo?(jduell.mcbugs)
Comment 2•12 years ago
|
||
A backtrace from the failing Suspend call would be a useful place to start.
Comment 3•12 years ago
|
||
Following code could make different decision between video read and audio read. audio track read more ahead(1 sec) than video track. This causes a lot of consecutive Suspend() and Resume() calls in some situations. This might affect to this bug.
> enableReading = predictedNewDataUse < latestNextUse;
http://mxr.mozilla.org/mozilla-b2g18/source/content/media/nsMediaCache.cpp#1191
Reporter | ||
Comment 4•12 years ago
|
||
jdm, as requested:
Program received signal SIGSEGV, Segmentation fault.
0x417f00b4 in mozalloc_abort (msg=<value optimized out>)
at /home/mikeh/dev/mozilla/m-c/b2g18/memory/mozalloc/mozalloc_abort.cpp:30
30 MOZ_CRASH();
(gdb) bt
#0 0x417f00b4 in mozalloc_abort (msg=<value optimized out>)
at /home/mikeh/dev/mozilla/m-c/b2g18/memory/mozalloc/mozalloc_abort.cpp:30
#1 0x411c36b6 in Abort (aSeverity=<value optimized out>, aStr=<value optimized out>,
aExpr=<value optimized out>, aFile=<value optimized out>, aLine=124)
at /home/mikeh/dev/mozilla/m-c/b2g18/xpcom/base/nsDebugImpl.cpp:423
#2 NS_DebugBreak_P (aSeverity=<value optimized out>, aStr=<value optimized out>,
aExpr=<value optimized out>, aFile=<value optimized out>, aLine=124)
at /home/mikeh/dev/mozilla/m-c/b2g18/xpcom/base/nsDebugImpl.cpp:380
#3 0x404e849e in mozilla::net::ChannelEventQueue::Suspend (this=0x44811c00)
at ../../../dist/include/mozilla/net/ChannelEventQueue.h:123
#4 mozilla::net::HttpChannelChild::Suspend (this=0x44811c00)
at /home/mikeh/dev/mozilla/m-c/b2g18/netwerk/protocol/http/HttpChannelChild.cpp:915
#5 0x40bdbb6a in mozilla::ChannelMediaResource::PossiblySuspend (this=<value optimized out>)
at /home/mikeh/dev/mozilla/m-c/b2g18/content/media/MediaResource.cpp:1126
#6 0x40bdd2b8 in mozilla::ChannelMediaResource::Suspend (this=0x44bb0000,
aCloseImmediately=<value optimized out>)
at /home/mikeh/dev/mozilla/m-c/b2g18/content/media/MediaResource.cpp:813
#7 0x40bdba56 in mozilla::ChannelMediaResource::CacheClientSuspend (this=0xb5)
at /home/mikeh/dev/mozilla/m-c/b2g18/content/media/MediaResource.cpp:1031
#8 0x40bf7b68 in nsMediaCache::Update (this=0x44c3cc90)
at /home/mikeh/dev/mozilla/m-c/b2g18/content/media/nsMediaCache.cpp:1278
#9 0x40bf7cd4 in UpdateEvent::Run (this=<value optimized out>)
at /home/mikeh/dev/mozilla/m-c/b2g18/content/media/nsMediaCache.cpp:1301
#10 0x411ba6aa in nsThread::ProcessNextEvent (this=0x424c9940, mayWait=<value optimized out>,
result=0xbeccedc7) at /home/mikeh/dev/mozilla/m-c/b2g18/xpcom/threads/nsThread.cpp:620
#11 0x41181df0 in NS_ProcessNextEvent_P (thread=0xb5, mayWait=false)
at /home/mikeh/dev/mozilla/btg024/objdir-gecko-b2g18-debug/xpcom/build/nsThreadUtils.cpp:237
#12 0x40ff100e in mozilla::ipc::MessagePump::Run (this=0x424c5340, aDelegate=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/glue/MessagePump.cpp:82
#13 0x40ff1196 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x424c5340, aDelegate=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/glue/MessagePump.cpp:231
#14 0x411f0c86 in MessageLoop::RunInternal (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:219
#15 0x411f0ce6 in MessageLoop::RunHandler (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:212
#16 MessageLoop::Run (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:186
#17 0x40f27c1e in nsBaseAppShell::Run (this=0x4428d040)
at /home/mikeh/dev/mozilla/m-c/b2g18/widget/xpwidgets/nsBaseAppShell.cpp:163
#18 0x404278e6 in XRE_RunAppShell ()
at /home/mikeh/dev/mozilla/m-c/b2g18/toolkit/xre/nsEmbedFunctions.cpp:646
#19 0x40ff1100 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x424c5340, aDelegate=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/glue/MessagePump.cpp:198
#20 0x411f0c86 in MessageLoop::RunInternal (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:219
#21 0x411f0ce6 in MessageLoop::RunHandler (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:212
#22 MessageLoop::Run (this=0xbeccf8b8)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/chromium/src/base/message_loop.cc:186
#23 0x40428104 in XRE_InitChildProcess (aArgc=3, aArgv=0x424e8800, aProcess=GeckoProcessType_Content)
at /home/mikeh/dev/mozilla/m-c/b2g18/toolkit/xre/nsEmbedFunctions.cpp:485
#24 0x00008662 in main (argc=5, argv=0xbeccfa34)
at /home/mikeh/dev/mozilla/m-c/b2g18/ipc/app/MozillaRuntimeMain.cpp:60
Comment 5•12 years ago
|
||
Jason, do you remember why we made suspending the channel queue a binary operation while the regular HTTP channel supports multiple suspensions?
Assignee | ||
Comment 6•12 years ago
|
||
We make the channel queue non-recursive because we can "easily enough" keep track of recursive calls in the child channel, and then only issue a Suspend IPDL message (and also suspend the eventQueue) when we actually transition to/from suspension. It's worth not sending the extra IPDL traffic around, but there's no inherent reason ChannelEventQueue couldn't handle recursive suspend/resume itself.
Alas there's a race in the implementation: at Resume() we launch a new event which flushed the EventQueue (we want to do that async, so new events aren't dispatched during the caller's Resume() stack). But if something happens to call Suspend() before that event runs, than we hit this assertion. And more seriously, we will currently flush the queue when the event is finally dispatched, even though we're supposed to be suspended.
Flags: needinfo?(jduell.mcbugs)
Assignee | ||
Comment 7•12 years ago
|
||
I also had to get rid of the assert in ChannelEventQueue::Resume, because if we get a series of suspend/resumes in a row, we can stack up a series of HttpChannelChild::CompleteResume events, the 2nd and following of which will see !mSuspended (as the 1st will set it to false).
This means ChannelEventQueue can't handle recursive suspend/resume, but we also can't have any asserts to check it :( Sad, but fine for now. Fancier patch for m-c coming that will fix this.
Assignee | ||
Comment 8•12 years ago
|
||
OK, here's a version of ChannelEventQueue that handles recursive suspend/resume. This makes eventQ more like the cache/network pumps used in nsHttpChannel.
Most of the changes are because in order to have it handle its own resume asynchronously it has to be AddRef/Release-able, so mEventQ is now a nsRefPtr.
Try server run:
https://tbpl.mozilla.org/?tree=Try&rev=dfece2c8069a
Attachment #757787 -
Flags: review?(josh)
Comment 9•12 years ago
|
||
Comment on attachment 757787 [details] [diff] [review]
v1: better fix for m-c
Review of attachment 757787 [details] [diff] [review]:
-----------------------------------------------------------------
Nice.
Attachment #757787 -
Flags: review?(josh) → review+
Comment 10•12 years ago
|
||
Comment on attachment 757784 [details] [diff] [review]
Fast and simple patch for b2g18 branch
Review of attachment 757784 [details] [diff] [review]:
-----------------------------------------------------------------
You should make an equivalent change to FTPChannelChild as well.
Attachment #757784 -
Flags: review?(josh) → review+
Comment on attachment 757784 [details] [diff] [review]
Fast and simple patch for b2g18 branch
NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): none
User impact if declined: Video playback can freeze after a while
Testing completed: not much
Risk to taking this patch (and alternatives if risky): Jason says it's fast and simple :-)
String or UUID changes made by this patch: none
Attachment #757784 -
Flags: approval-mozilla-b2g18?
Reporter | ||
Comment 12•12 years ago
|
||
Comment on attachment 757787 [details] [diff] [review]
v1: better fix for m-c
Review of attachment 757787 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/ipc/ChannelEventQueue.h
@@ +142,5 @@
>
> inline void
> ChannelEventQueue::Resume()
> {
> + if (!--mSuspendCount) {
Would it be useful to have an assertion here that would catch mSuspendCount < 0 ?
Reporter | ||
Comment 13•12 years ago
|
||
Comment on attachment 757784 [details] [diff] [review]
Fast and simple patch for b2g18 branch
See bug 870564 comment 103.
Assignee | ||
Comment 14•12 years ago
|
||
> Would it be useful to have an assertion here that would catch mSuspendCount < 0 ?
>
> You should make an equivalent change to FTPChannelChild as well.
Thanks--fixed both of these.
Dupe-ing this to bug 870564 and moving patches there (so don't have to ask for leo+ here).
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Updated•12 years ago
|
Attachment #757784 -
Flags: approval-mozilla-b2g18?
You need to log in
before you can comment on or make changes to this bug.
Description
•