Last Comment Bug 716417 - Crash in nsINodeInfo::NodeInfoManager() called from LookupMediaElementURITable
: Crash in nsINodeInfo::NodeInfoManager() called from LookupMediaElementURITable
Status: VERIFIED FIXED
[qa!]
: crash, testcase
Product: Core
Classification: Components
Component: Audio/Video (show other bugs)
: Trunk
: x86_64 Windows 7
: -- critical (vote)
: mozilla12
Assigned To: Chris Pearce (:cpearce)
:
Mentors:
Depends on:
Blocks: 703379
  Show dependency treegraph
 
Reported: 2012-01-08 15:45 PST by Chris Pearce (:cpearce)
Modified: 2012-05-20 19:39 PDT (History)
7 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
affected
verified


Attachments
Testcase (321 bytes, text/html)
2012-01-09 01:00 PST, Chris Pearce (:cpearce)
no flags Details
Patch (6.37 KB, patch)
2012-01-10 13:22 PST, Chris Pearce (:cpearce)
roc: review+
akeybl: approval‑mozilla‑aurora-
Details | Diff | Review

Description Chris Pearce (:cpearce) 2012-01-08 15:45:40 PST
To crash browser: load attached testcase, reload page (press F5).

Stack trace:

xul.dll!nsINodeInfo::NodeInfoManager()  Line 218 + 0x3 bytes	C++
xul.dll!nsINode::NodePrincipal()  Line 693 + 0x12 bytes	C++
xul.dll!nsHTMLMediaElement::LookupMediaElementURITable(nsIURI * aURI)  Line 1367 + 0x8 bytes	C++
xul.dll!nsHTMLMediaElement::LoadResource()  Line 943 + 0x1d bytes	C++
xul.dll!nsHTMLMediaElement::SelectResource()  Line 698 + 0xb bytes	C++
xul.dll!nsSyncSection::Run()  Line 601	C++
xul.dll!nsBaseAppShell::RunSyncSections()  Line 358	C++
xul.dll!nsBaseAppShell::AfterProcessNextEvent(nsIThreadInternal * thr, unsigned int recursionDepth)  Line 369	C++
xul.dll!nsThread::ProcessNextEvent(bool mayWait, bool * result)  Line 675	C++
xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, bool mayWait)  Line 245 + 0x17 bytes	C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate)  Line 110 + 0xe bytes	C++
xul.dll!MessageLoop::RunInternal()  Line 209	C++
xul.dll!MessageLoop::RunHandler()  Line 202	C++
xul.dll!MessageLoop::Run()  Line 176	C++
xul.dll!nsBaseAppShell::Run()  Line 191	C++
xul.dll!nsAppShell::Run()  Line 258 + 0x9 bytes	C++
xul.dll!nsAppStartup::Run()  Line 220 + 0x1c bytes	C++
xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData)  Line 3523 + 0x25 bytes	C++
firefox.exe!do_main(const char * exePath, int argc, char * * argv)  Line 201 + 0x13 bytes	C++
firefox.exe!NS_internal_main(int argc, char * * argv)  Line 287 + 0x14 bytes	C++
firefox.exe!wmain(int argc, wchar_t * * argv)  Line 107 + 0xd bytes	C++
firefox.exe!__tmainCRTStartup()  Line 552 + 0x19 bytes	C
firefox.exe!wmainCRTStartup()  Line 371	C
kernel32.dll!@BaseThreadInitThunk@12()  + 0x12 bytes	
ntdll.dll!___RtlUserThreadStart@8()  + 0x27 bytes	
ntdll.dll!__RtlUserThreadStart@8()  + 0x1b bytes	

Looks like elem->NodePrincipal() [1] is returning null.

[1] http://mxr.mozilla.org/mozilla-central/source/content/html/content/src/nsHTMLMediaElement.cpp#1367
Comment 1 Chris Pearce (:cpearce) 2012-01-08 15:52:42 PST
(In reply to Chris Pearce, Mozilla Corporation (:cpearce) from comment #0)
> Looks like elem->NodePrincipal() [1] is returning null.

... because elem's memory has been released... Maybe when the previous document was GC'd?
Comment 2 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-01-08 16:29:40 PST
elem->NodePrincipal() should never return null on trunk, if |elem| is not garbage memory yet.

At what point do we remove eleents from gElementTable?  Sounds like we failed to do that here for an element that was going away?
Comment 3 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-08 17:01:41 PST
nsHTMLMediaElement::~nsHTMLMediaElement removes it, but only if mDecoder is not null (an element should only be in the table while mDecoder is not null). We should probably just remove that condition. Although I thought everywhere that sets mDecoder to null also calls RemoveMediaElementFromURITable.

Another possibility is that mLoadingSrc changes while the element is in the hashtable so that we can't find it to remove it. But mLoadingSrc should not change while mDecoder is not null.
Comment 4 Scoobidiver (away) 2012-01-09 00:58:50 PST
(In reply to Chris Pearce, Mozilla Corporation (:cpearce) from comment #0)
> To crash browser: load attached testcase, reload page (press F5).
I see no attachment.
Comment 5 Chris Pearce (:cpearce) 2012-01-09 01:00:50 PST
Created attachment 586917 [details]
Testcase
Comment 6 Scoobidiver (away) 2012-01-09 01:21:03 PST
Here are my crash reports:
bp-273843f0-30b7-43dd-a4d8-545f82120109 32-bit build
bp-bfb013e4-4b01-4308-9e44-af3772120109 64-bit build
Comment 7 Chris Pearce (:cpearce) 2012-01-10 13:22:20 PST
Created attachment 587451 [details] [diff] [review]
Patch

If a media was played before nsHTMLMediaElement::FinishDecoderSetup() run, FinishDecoderSetup() can fail if its call to nsBuiltinDecoder::Play() fails. nsBuiltinDecoder::Play() can fail if the thread limit is exceeded. FinishDecoderSetup() always adds the element to the media element table, but if FinishDecoderSetup() fails, it doesn't remove the element from the table. If a failing call to FinishDecoderSetup() came while trying to initialize a cloned decoder, we'll fall back to creating a non-cloned decoder, and so we'll end up calling FinishDecoderSetup() again then, and add the element to the table twice. The destructor only removes the element from the table once, leaving a pointer to the now freed memory in the table, which causes this crash when we try to use that memory.

This patch changes FinishDecoderSetup() to remove the media element from the table upon failure, and adds assertions to guard against this case.

I don't think we can easily make a crash test for this, my testcase requires reloading the page...
Comment 8 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-01-10 13:39:38 PST
Comment on attachment 587451 [details] [diff] [review]
Patch

Review of attachment 587451 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/html/content/src/nsHTMLMediaElement.cpp
@@ +1336,5 @@
> +      count++;
> +    }
> +  }
> +  return count;
> +}

make this #ifdef DEBUG
Comment 9 Chris Pearce (:cpearce) 2012-01-10 15:01:55 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/79eb51bafb9a

We should try to get this on Aurora too, the crash will be occurring there too.
Comment 10 Ed Morley [:emorley] 2012-01-11 18:20:25 PST
https://hg.mozilla.org/mozilla-central/rev/79eb51bafb9a
Comment 11 Chris Pearce (:cpearce) 2012-01-12 15:32:07 PST
Comment on attachment 587451 [details] [diff] [review]
Patch

[Approval Request Comment]
Regression caused by (bug #): 703379
User impact if declined: Potential crashes, seems low volume though, I'm not aware of it occurring in the wild, only in my testcase - but that doesn't mean it won't occur.
Testing completed (on m-c, etc.): Patch has been on m-c since 2012-01-11
Risk to taking this patch (and alternatives if risky): Low risk, it's a simple fix.
Comment 12 Alex Keybl [:akeybl] 2012-01-17 14:51:48 PST
Comment on attachment 587451 [details] [diff] [review]
Patch

[Triage Comment]
Crash volume isn't high enough to meet the criteria for aurora. Let's reconsider if this becomes a significant crasher in Beta 11.
Comment 13 Alex Keybl [:akeybl] 2012-01-17 17:25:22 PST
Comment on attachment 587451 [details] [diff] [review]
Patch

Whoops cpearce caught my mistake. That was supposed to be an a-.
Comment 14 Paul Silaghi, QA [:pauly] 2012-04-03 06:02:46 PDT
No crashes occur. Verified fixed on Firefox 12b3:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0
Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20100101 Firefox/12.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:12.0) Gecko/20100101 Firefox/12.0

Note You need to log in before you can comment on or make changes to this bug.