Closed
Bug 605457
Opened 13 years ago
Closed 13 years ago
Ogg code uses a singlethreaded hashtable on multiple threads, causing assertions during random tests (e.g. 432561-1.html)
Categories
(Core :: Audio/Video, defect)
Core
Audio/Video
Tracking
()
VERIFIED
FIXED
People
(Reporter: mounir, Assigned: cajbir)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
10.66 KB,
patch
|
cpearce
:
review+
roc
:
approval2.0+
|
Details | Diff | Splinter Review |
###!!! ASSERTION: RECURSION_LEVEL_SAFE_TO_FINISH(table): 'RECURSION_LEVEL_SAFE_TO_FINISH(table)', file pldhash.c, line 406 PL_DHashTableFinish [pldhash.c:409] nsTHashtable<nsBaseHashtableET<nsUint32HashKey, nsAutoPtr<nsOggCodecState> > >::~nsTHashtable [nsTHashtable.h:318] nsBaseHashtable<nsUint32HashKey,nsAutoPtr<nsOggCodecState>,nsOggCodecState*>::~nsBaseHashtable [nsBaseHashtable.h:84] nsClassHashtable<nsUint32HashKey,nsOggCodecState>::~nsClassHashtable [nsClassHashtable.h:56] nsOggReader::~nsOggReader [content/media/ogg/nsOggReader.cpp:116] nsAutoPtr<nsBuiltinDecoderReader>::~nsAutoPtr [nsAutoPtr.h:104] nsBuiltinDecoderStateMachine::~nsBuiltinDecoderStateMachine [content/media/nsBuiltinDecoderStateMachine.cpp:164] nsOggDecoderStateMachine::~nsOggDecoderStateMachine [content/media/ogg/nsOggDecoderStateMachine.h:45] nsRunnable::Release [nsThreadUtils.cpp:55] nsCOMPtr<nsDecoderStateMachine>::assign_assuming_AddRef [nsCOMPtr.h:518] nsCOMPtr<nsDecoderStateMachine>::assign_with_AddRef [nsCOMPtr.h:1204] nsCOMPtr<nsDecoderStateMachine>::operator= [nsCOMPtr.h:664] nsBuiltinDecoder::Stop [content/media/nsBuiltinDecoder.cpp:137] nsRunnableMethodImpl<void (nsBuiltinDecoder::*)(),true>::Run [nsThreadUtils.h:348] nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:547] NS_ProcessNextEvent_P [nsThreadUtils.cpp:250] nsThread::Shutdown [xpcom/threads/nsThread.cpp:476] nsBuiltinDecoder::Stop [content/media/nsBuiltinDecoder.cpp:135] nsRunnableMethodImpl<void (nsBuiltinDecoder::*)(),true>::Run [nsThreadUtils.h:348] nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:547] NS_ProcessNextEvent_P [nsThreadUtils.cpp:250] nsThread::Shutdown [xpcom/threads/nsThread.cpp:476] nsBuiltinDecoder::Stop [content/media/nsBuiltinDecoder.cpp:135] nsRunnableMethodImpl<void (nsBuiltinDecoder::*)(),true>::Run [nsThreadUtils.h:348] nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:547] NS_ProcessPendingEvents_P [nsThreadUtils.cpp:200] nsBaseAppShell::NativeEventCallback [widget/src/xpwidgets/nsBaseAppShell.cpp:132] nsAppShell::ProcessGeckoEvents [widget/src/cocoa/nsAppShell.mm:400] CoreFoundation + 0x4d271 CoreFoundation + 0x4b469 CoreFoundation + 0x4ac2f HIToolbox + 0x2ea4e HIToolbox + 0x2e7b1 HIToolbox + 0x2e70c AppKit + 0x441f2 -AppKit + 0x43b41 nsAppShell::ProcessNextNativeEvent [widget/src/cocoa/nsAppShell.mm:675] nsBaseAppShell::DoProcessNextNativeEvent [widget/src/xpwidgets/nsBaseAppShell.cpp:161] nsBaseAppShell::OnProcessNextEvent [widget/src/xpwidgets/nsBaseAppShell.cpp:299] nsAppShell::OnProcessNextEvent [widget/src/cocoa/nsAppShell.mm:833] nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:519] NS_ProcessPendingEvents_P [nsThreadUtils.cpp:200] nsBaseAppShell::NativeEventCallback [widget/src/xpwidgets/nsBaseAppShell.cpp:132] nsAppShell::ProcessGeckoEvents [widget/src/cocoa/nsAppShell.mm:400] CoreFoundation + 0x4d271 CoreFoundation + 0x4b469 CoreFoundation + 0x4ac2f HIToolbox + 0x2ea4e HIToolbox + 0x2e7b1 HIToolbox + 0x2e70c AppKit + 0x441f2 -AppKit + 0x43b41 -AppKit + 0x9747 nsAppShell::Run [widget/src/cocoa/nsAppShell.mm:746] nsAppStartup::Run [toolkit/components/startup/src/nsAppStartup.cpp:191] XRE_main [toolkit/xre/nsAppRunner.cpp:3670] main [browser/app/nsBrowserApp.cpp:158] REFTEST TEST-PASS | file:///Users/cltbld/talos-slave/mozilla-central_snowleopard-debug_test-crashtest/build/reftest/tests/layout/style/crashtests/432561-1.html | (LOAD ONLY) REFTEST INFO | Loading a blank page ++DOMWINDOW == 16 (0x11c83c8d8) [serial = 2767] [outer = 0x11825b9c0] REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/mozilla-central_snowleopard-debug_test-crashtest/build/reftest/tests/layout/style/crashtests/432561-1.html | assertion count 1 is more than expected 0 assertions http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1287495658.1287496043.17882.gz
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 2•13 years ago
|
||
See also bug 613564 for this assertion cropping up in a different test.
Updated•13 years ago
|
Component: XPCOM → Video/Audio
QA Contact: xpcom → video.audio
Summary: 432561-1.html | assertion count 1 is more than expected 0 assertions → Ogg code uses a singlethreaded hashtable on multiple threads, causing assertions during random tests (e.g. 432561-1.html)
Assignee: nobody → chris.double
blocking2.0: --- → final+
Comment 3•13 years ago
|
||
We could just use nsClassHashTableMT instead of nsClassHashTable, but: [11:57] <bent> cpearce: there's a gotcha though [11:57] <bent> which i think makes the MT variants useless [11:57] <bent> you can't do something like "get this entry, or make a new one" [11:58] <bent> which i almost always want to do [11:58] <cpearce> bent: so it's read only? [11:58] <bent> no, it's that [11:58] <jlebar> cpearce, in that case, you don't need MT! [11:58] <bent> cpearce: the MT versions only lock during "Get" and "Put" [11:59] <bent> there's no way to say "lock during Get *and* Put" [11:59] <cpearce> ah [11:59] <cpearce> this is ok for my use case, but in general, yeah, annoying. [11:59] <bent> so other threads can hop in [11:59] <bent> it's a mess, i wish we would remove them [11:59] <bent> people think "oh, this one is threadsafe, lah dee dah" We only ever use the hash table by one thread at a time, so we'd not be bitten by this. Or we could change the nsBuiltinDecoderStateMachine to read the metadata and shutdown on the decoder thread as well, but that's much more work. We're going to rewrite the decoder anyway to reduce its number of threads, maybe we should do that change then?
(In reply to comment #3) > [11:59] <bent> people think "oh, this one is threadsafe, lah dee dah" Yeah, that's par for the course for threadsafe data structures in general. > We only ever use the hash table by one thread at a time, so we'd not be bitten > by this. Are you sure? Because this assertion would only fire if we had concurrent access from multiple threads, AFAIK. If we can locate the concurrent access and prevent it, then this bug will be fixed.
Comment 5•13 years ago
|
||
(In reply to comment #4) > > We only ever use the hash table by one thread at a time, so we'd not be bitten > > by this. > > Are you sure? Because this assertion would only fire if we had concurrent > access from multiple threads, AFAIK. > > If we can locate the concurrent access and prevent it, then this bug will be > fixed. Wait no. We use the hash table on the decoder thread as well as on the main thread in the Ogg GetBuffered() implementation. GetBuffered() needs it to identify playing streams, and convert granulepos to timecodes. The only way we could remove the concurrent access is to duplicate the list of playing streams for Ogg, and factor out the granulepos to timecode logic. We'd need to reimplement th_granule_frame() (which is easy) and maintain it to do that.
Comment 6•13 years ago
|
||
We talked about this over lunch, it's safe enough to just use an nsTArray to store the serialno to nsOggCodecState mapping. There's usually only 3 streams, so a linear scan in an nsTArray won't be slower than a hash table lookup.
Assignee | ||
Comment 7•13 years ago
|
||
Instead of using the hash table in the GetBuffered that runs on the main thread can we just compare the serial number against the streams we are interested in (mTheoraState, mVorbisState, mSkeletonState)? This way we avoid using the hashtable completely. I notice we call methods/access data in the state objects themselves in GetBuffered. How threadsafe is that?
Comment 8•13 years ago
|
||
(In reply to comment #7) > Instead of using the hash table in the GetBuffered that runs on the main thread > can we just compare the serial number against the streams we are interested in > (mTheoraState, mVorbisState, mSkeletonState)? This way we avoid using the > hashtable completely. Sounds reasonable. > I notice we call methods/access data in the state objects themselves in > GetBuffered. How threadsafe is that? GetBuffered() and the functions it calls access things that shouldn't change after metadata is loaded (like the Theora granuleshift and the nsOggCodecState::mActive field, so I think we're safe.
blocking2.0: final+ → -
Assignee | ||
Comment 9•13 years ago
|
||
I took an approach similar to that suggested by Chris Pearce in comment 5 about reimplementing th_granule_frame.
Attachment #498017 -
Flags: review?(chris)
Comment 10•13 years ago
|
||
Comment on attachment 498017 [details] [diff] [review] Remove concurrent access to hash table Looks good, just a small typo "Vorbis/Tehora data used to compute timestamps" in the comment starting above nsOggReader::mVorbisSerial.
Attachment #498017 -
Flags: review?(chris) → review+
Reporter | ||
Updated•13 years ago
|
Attachment #498017 -
Flags: approval2.0+
Reporter | ||
Updated•13 years ago
|
Keywords: checkin-needed
Reporter | ||
Updated•13 years ago
|
Status: NEW → ASSIGNED
Comment 11•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/c6c33df76468
Comment 12•13 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1296500234.1296500612.9600.gz&fulltext=1 OS X 10.6 comm-central-trunk debug test crashtest on 2011/01/31 10:57:14
Updated•11 years ago
|
Keywords: intermittent-failure
Updated•11 years ago
|
Whiteboard: [orange]
You need to log in
before you can comment on or make changes to this bug.
Description
•