Closed Bug 1522547 Opened 10 months ago Closed 9 months ago

Intermittent PROCESS-CRASH | /encrypted-media/clearkey-generate-request-disallowed-input.https.html | application crashed [@ mozilla::ipc::MessageChannel::Clear()]

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
thunderbird_esr60 --- unaffected
geckoview64 --- unaffected
geckoview65 --- unaffected
geckoview66 --- unaffected
firefox-esr60 --- unaffected
firefox64 --- unaffected
firefox65 --- unaffected
firefox66 + wontfix
firefox67 --- fixed

People

(Reporter: apavel, Assigned: bryce)

References

Details

(Keywords: intermittent-failure, regression)

Attachments

(6 files)

[Tracking Requested - why for this release]:

Central as Beta simulation

Treeherder link: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=pending%2Crunning%2Ctestfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=9d34905450a5cbdceaf728b6359fe24e6df1c30a&group_state=expanded&selectedJob=223758791

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=223758791&repo=try&lineNumber=19711

14:06:54 INFO - TEST-START | /encrypted-media/clearkey-generate-request-disallowed-input.https.html
14:06:54 INFO - Closing window 12884901889
14:06:54 INFO - mozcrash .......
14:06:54 INFO - mozcrash Copy/paste: Z:\task_1548336912\build\win32-minidump_stackwalk.exe c:\users\task_1548336912\appdata\local\temp\tmpprvgbv.mozrunner\minidumps\44d4c34c-2c06-445e-997d-5da127bf0011.dmp Z:\task_1548336912\build\symbols
14:07:04 INFO - mozcrash Saved minidump as Z:\task_1548336912\build\blobber_upload_dir\44d4c34c-2c06-445e-997d-5da127bf0011.dmp
14:07:04 INFO - mozcrash Saved app info as Z:\task_1548336912\build\blobber_upload_dir\44d4c34c-2c06-445e-997d-5da127bf0011.extra
14:07:04 INFO - PROCESS-CRASH | /encrypted-media/clearkey-generate-request-disallowed-input.https.html | application crashed [@ mozilla::ipc::MessageChannel::Clear()]
14:07:04 INFO - Crash dump filename: c:\users\task_1548336912\appdata\local\temp\tmpprvgbv.mozrunner\minidumps\44d4c34c-2c06-445e-997d-5da127bf0011.dmp
14:07:04 INFO - Operating system: Windows NT
14:07:04 INFO - 10.0.15063
14:07:04 INFO - CPU: amd64
14:07:04 INFO - family 6 model 85 stepping 4
14:07:04 INFO - 8 CPUs
14:07:04 INFO -
14:07:04 INFO - GPU: UNKNOWN
14:07:04 INFO -
14:07:04 INFO - Crash reason: EXCEPTION_BREAKPOINT
14:07:04 INFO - Crash address: 0x7ff8036783fc
14:07:04 INFO - Assertion: Unknown assertion type 0x00000000
14:07:04 INFO - Process uptime: 2 seconds
14:07:04 INFO -
14:07:04 INFO - Thread 0 (crashed)
14:07:04 INFO - 0 xul.dll!mozilla::ipc::MessageChannel::Clear() [MessageChannel.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 744 + 0x0]
14:07:04 INFO - rax = 0x00007ff80a39ca7b rdx = 0x00000055731ff2e8
14:07:04 INFO - rcx = 0x00007ff82cd5fac8 rbx = 0x0000000000000004
14:07:04 INFO - rsi = 0x000001a5eec83d48 rdi = 0x00000055731ff2f8
14:07:04 INFO - rbp = 0x0000000000000564 rsp = 0x00000055731ff2d0
14:07:04 INFO - r8 = 0x00000055731ff2e0 r9 = 0x00000055731ff2d8
14:07:04 INFO - r10 = 0x0000000000000000 r11 = 0x00000055731fa910
14:07:04 INFO - r12 = 0x0000000000000010 r13 = 0x00007ff838c533f0
14:07:04 INFO - r14 = 0x000001a5eec83c00 r15 = 0x000001a5ef415920
14:07:04 INFO - rip = 0x00007ff8036783fc
14:07:04 INFO - Found by: given as instruction pointer in context
14:07:04 INFO - 1 xul.dll!mozilla::ipc::MessageChannel::~MessageChannel() [MessageChannel.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 628 + 0x8]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff340 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff8036779e3
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 2 xul.dll!mozilla::ipc::IToplevelProtocol::ToplevelState::~ToplevelState() [ProtocolUtils.h:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 401 + 0xc]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff4c0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff8036980d4
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 3 xul.dll!void mozilla::ipc::IToplevelProtocol::ToplevelState::~ToplevelState() [ProtocolUtils.h:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 401 + 0x10]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff510 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff80368f6a0
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 4 xul.dll!mozilla::ipc::IToplevelProtocol::~IToplevelProtocol() [ProtocolUtils.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 567 + 0x1b]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff550 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff80368adae
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 5 xul.dll!void mozilla::gmp::GMPContentParent::~GMPContentParent() [GMPContentParent.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 43 + 0x5]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff5a0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff806200bc0
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 6 xul.dll!PLDHashTable::~PLDHashTable() [PLDHashTable.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 293 + 0x27]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff5e0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff802ea48c3
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 7 xul.dll!void mozilla::gmp::GMPServiceChild::~GMPServiceChild() [GMPServiceChild.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 430 + 0x13]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff640 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff80623309e
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 8 xul.dll!mozilla::gmp::GeckoMediaPluginServiceChild::Observe(nsISupports *,char const *,char16_t const *) [GMPServiceChild.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 361 + 0x25]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff680 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff80621d0ae
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 9 xul.dll!nsObserverList::NotifyObservers(nsISupports *,char const *,char16_t const *) [nsObserverList.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 66 + 0xf]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff6e0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff802eb012f
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 10 xul.dll!nsObserverService::NotifyObservers(nsISupports *,char const *,char16_t const *) [nsObserverService.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 286 + 0x11]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff750 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff802eb1b8f
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 11 xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [XPCOMInit.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 775 + 0x11]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff800 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff802f78fc5
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 12 xul.dll!XRE_TermEmbedding() [nsEmbedFunctions.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 213 + 0x7]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff8b0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff808793cfa
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 13 xul.dll!mozilla::ipc::ScopedXREEmbed::Stop() [ScopedXREEmbed.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 90 + 0x8]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff8f0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff80368dc42
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 14 xul.dll!XRE_InitChildProcess(int,char * * const,XREChildData const *) [nsEmbedFunctions.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 750 + 0x9]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ff920 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff8087948c3
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 15 firefox.exe!NS_internal_main(int,char * *,char * *) [nsBrowserApp.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 265 + 0x60]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ffbb0 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff7c217152f
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 16 firefox.exe!wmain [nsWindowsWMain.cpp:9d34905450a5cbdceaf728b6359fe24e6df1c30a : 129 + 0x15]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ffd50 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff7c217122e
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 17 firefox.exe!static int __scrt_common_main_seh() [exe_common.inl : 288 + 0x22]
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ffe00 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff7c21e2010
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 18 kernel32.dll!BaseThreadInitThunk + 0x14
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ffe40 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff838c52774
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 19 ntdll.dll!SdbpCheckMatchingRegistryEntry + 0x20d
14:07:04 INFO - rbx = 0x0000000000000004 rbp = 0x0000000000000564
14:07:04 INFO - rsp = 0x00000055731ffe70 r12 = 0x0000000000000010
14:07:04 INFO - r13 = 0x00007ff838c533f0 r14 = 0x000001a5eec83c00
14:07:04 INFO - r15 = 0x000001a5ef415920 rip = 0x00007ff838e80d51
14:07:04 INFO - Found by: call frame info
14:07:04 INFO - 20 KERNELBASE.dll + 0x67c0
14:07:04 INFO - rsp = 0x00000055731ffea0 rip = 0x00007ff8358567c0
14:07:04 INFO - Found by: stack scanning

Nils, can you please take a look?

Flags: needinfo?(drno)

Bryce can you please have a look at this?

Flags: needinfo?(drno) → needinfo?(bvandyk)
Priority: -- → P2
Assignee: nobody → bvandyk
Status: NEW → ASSIGNED
Flags: needinfo?(bvandyk)

That failure is on Windows 32-bit and 64-bit debug here: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=pending%2Crunning%2Ctestfailed%2Cbusted%2Cexception&revision=9d34905450a5cbdceaf728b6359fe24e6df1c30a&group_state=expanded&selectedJob=223758791

Since a web-platform-tests update yesterday, encrypted-media tests were also permafailing on Linux x64 asan (tier 2) until they got disabled: bug 1522213

Looks like we're dying in MessageChannel::Clear because we're trying to clear a connected channel as part of the MessageChannel dtor. Further up the stack GMPContentParent dtor is happening, and that appears to be due to us shutting down XPCOM.

Not my area of expertise, but I'll continue to try and figure out what's going on.

I first see these crashes start happening on central here. At that time they're not marked as failures, but we see them in the logs. For the run before that, here, I don't see us crashing.

The range that introduces the crashes above has bug 1519862's patches in it, which modify various bits of our wpt running and is mentioned in bug 1522213. If I back out those changes locally, I get less logging, but am still seeing crashes. If I update to 2e27f3f1ebc6 (the revision for the no crash treeherder run), I don't see any sign of crashes locally.

So it looks like something in that range, likely more than just bug 1519862, is either causing the crash, or making an existing crash apparent (or some combination).

Walking through that range and running the tests I find we start crashing once bug 1505370 lands. Looks like our culprit, though the exact mechanism resulting in our new failures is unclear to me.

Looks like bug 1505370 syncs us with wpt#13966. As noted in that PR, there are some regressions. wpt#14496 does some reverting to fix issues -- looks like we don't have that yet, going to try and figure out if we can grab that and see if it helps here.

The sync got stuck due to harness changes and in the meantime there was some divergence which I'm sorting out. Hopefully we get caught up early next week and get that PR into central.

:bryce, any progress on this one for 66?

Flags: needinfo?(bvandyk)

Bug 1513880 looks like it's landed us the reverting PR from the wpts that I mentioned in comment 5. Looks like it was in when I last commented, and I overlooked it.

We appear to still be crashing in recent runs of these tests but now the crashes are swallowed. I'm not sure if this should be happening.

I'll continue looking -- this appears to be an interaction between changes in the wpt-harness and our IPC channels, rather than a specific EME issue (though we may need to revise how we're using IPC).

More looking, seems to be a GMP shutdown issue that is highlighted by how the test harness is now running tests. The shutdown happening in an unexpected state means we fail to close the IPC channel and hit the assert that's cause the crashes.

We fail when shutting down XPCOM:

  • XPCOM shutdown kicks off.
  • The GeckoMediaPluginServiceChild observes xpcom-will-shutdown then xpcom-shutdown-threads. When it see the xpcom-shutdown-threads it closes its own channel then nulls the mServiceChild unique ptr, triggering the destruction of a GMPServiceChild.
  • The GMPServiceChild dtor destroys mCotentParents.
  • The GMPContentParent dtor runs, destroys its channel, if we haven't closed the channel yet, we assert.

It looks like we don't anticipate that the GMPContentParent may not yet have closed its channel when we run ~GMPServiceChild. I've added some logging and the problem appears to be that the ChromiumCDMParent that is held by the GMPContentParent has not been destroyed yet, and we rely on the destruction to signal that the channel should be closed.

Clearing the NI, as I've working on this and don't need the reminder. Comment 9 appears to be correct. I've got some local changes that fix the crash, but given that GMP shutdown crashes are new to me, and the code is complicated, I'm going to make sure I understand the parts involved.

Flags: needinfo?(bvandyk)

Still looking into this, my current solution is insufficient as it can result in shutdown deadlocks.

Further background here is that we have a multiple ownership scenario:

ChromiumCDMParents are owned by both a ChromiumCDMProxy and a GMPContentParent.

Comment 9 outlines the issue as to what is happening on the side of the GMPContentParent. A simple diagram of ownership here, where -> is used to point to the owner of an object:

ChromiumCDMParent -> GMPContentParent -> GMPServiceChild

If we start destroying from GMPServiceChild before the ChromiumCDMParent has been destroyed, we get this bug. To avoid this, ChromiumCDMParent needs to call ChromiumCDMDestroyed, which it does from Recv__delete__ or ActorDestroy.

However, we also have another set of ownership of ChromiumCDMParent:

ChromiumCDMParent -> ChromiumCDMProxy -> MediaKeys

Here, we expect the ChormiumCDMProxy to shutdown the ChromiumCDMParent. The shutdown should (eventually) result in the removal form the GMPServiceChild side. ChromiumCDMProxy shutdown looks like it should typically result from MediaKeys shutdown, which happens in the MeidaKeys dtor, (edit) or by the HTMLMediaElement (which would make sense here, but we seem to not be hitting).

With some extra logging in ~GMPContentParent to show if we still have elements in mChromiumCDMs at destruction, I'm able to get logs showing if we're going to hit our problematic assert or not. Running one of the problematic tests:

./mach wpt testing/web-platform/tests/encrypted-media/clearkey-generate-request-disallowed-input.https.html

gives me a wpt results window. If I hold this Window open for ~15 seconds after the test has run, then my logs show the MediaKeys being destroyed and shutdown of other components happening as expected. If I close the Window immediately after the test run, the MediaKeys is not destroyed on time and the assertion/this bug is hit.

So something is keeping the MediaKeys alive longer than we'd expect, though if we wait awhile, whatever it is dies. Digging further, I suspect it could be related to the pending MediaKeySessions stored by MediaKeys. If I log in the MediaKeySession dtor, the pending sessions are destroyed after waiting ~15 seconds from test finish, and their destruction precedes MediaKeys shutting down. Looks like the MediaKeys and the pending MediaKeySessions form cycle. I'm not particularly familiar with how cycle collection takes place, but wonder if the time it takes for the cycle collector to move though and clobber the cycle is why this works if I wait awhile?

Summary: Perma PROCESS-CRASH | /encrypted-media/clearkey-generate-request-disallowed-input.https.html | application crashed [@ mozilla::ipc::MessageChannel::Clear()] when Geko 66 merges to Beta on 2019-01-21 → Intermittent PROCESS-CRASH | /encrypted-media/clearkey-generate-request-disallowed-input.https.html | application crashed [@ mozilla::ipc::MessageChannel::Clear()]

Use GMPLog.h in GMPContentPareant. Add logging to most functions. This logging
was added to aid in diagnosing a shutdown crash, but should be generally useful
to have.

Driveby touchup of arg name to ChromiumCDMDestroyed to match header.

The macros in these classes are used to output class names in logs.
Differentiating them helps make logs clearer.

Depends on D21978

Because multiple ChromiumCDMProxy object may exist during a browser lifetime,
logging the value of this in their logs is useful for disambiguating log
statements, as well as matching ChromiumCDMProxy objects to pointers held by
other objects.

Depends on D21979

  • Add more logging to MediaKeys.
  • Replace the instances of the %d formatter with PRIu32 when printing the
    PromiseId type (which has the underlying uint32_t type).

Depends on D21980

MediaKeys objects are typically created and associated with an HTMLMediaElement,
however it is possible to create a MediaKeys object and not associate it with an
HTMLMediaElement.

This resulted in an issue where these MediaKeys would keep alive other
components that would assert during bowrser shutdown (see bug 1522547). We
anticipated that MediaKeys associated with an HTMLMediaElement would need to be
shutdown if their owning document became inactive, but were not handling the
case where the keys never became associated with an element.

This patch has the MediaKeys listen directly to their owning document for
activity change. The MediaKeys will shutdown if their document becomes inactive.
This avoids MediaKeys not associated with HTMLMediaElements keeping other
objects alive during browser shutdown.

Depends on D21982

Expanding on comment 11, our culprit appears to be that HTMLMediaElement stores a reference to MediaKeys and will shutdown the keys if it observes its owning document going inactive. However, if the MediaKeys is never associated with an HTMLMediaElement, then it is not shutdown when the document it is in goes inactivate, and is only shutdown after being cycle collected (which is too late to prevent the crash seen in this bug).

My proposed fix is that the MediaKeys instead directly listens to the owning document, so it can shutdown even if not associated with an HTMLMediaElement.

Attachment #9048255 - Attachment description: Bug 1522547 - Log `this` in ChromiumCDMProxy logs. r?cpearce → Bug 1522547 - Log `this` in ChromiumCDMProxy logs, use PRIu32 where appropriate. r?cpearce
Pushed by bvandyk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6f4b04b69f80
Add logging to GMPContentParent. r=cpearce
https://hg.mozilla.org/integration/autoland/rev/7e199486530b
Differentiate GMPServiceParent and GMPServiceChild __CLASS__ macro. r=cpearce
https://hg.mozilla.org/integration/autoland/rev/c092112f933e
Log `this` in ChromiumCDMProxy logs, use PRIu32 where appropriate. r=cpearce
https://hg.mozilla.org/integration/autoland/rev/c273b07bc5f0
Add more logs to MediaKeys, use PRIu32 for printing where appropriate. r=cpearce
https://hg.mozilla.org/integration/autoland/rev/153245ec86c2
Log in MediaKeySession dtor. r=cpearce
https://hg.mozilla.org/integration/autoland/rev/062f54735111
Have MediaKeys observe if the owning document becomes inactive so it can shutdown. r=cpearce

Do we need to uplift this? Do you think it may be risky now that we're at the end of the beta 66 cycle?

Flags: needinfo?(bvandyk)

I think the underlying issue causing that bug has existed for a long time, but changes in the wpt led us to discover the shutdown crash recently.

My feeling is that this crash would be rare in real world usage due to it requiring usage of the EME API which I wouldn't expect outside of tests (not associating a MediaKeys object with an HTMLMediaElement).

Based on the above, I'd be happy for the fix to ride the trains from Nightly, unless there are concerns with doing so.

Flags: needinfo?(bvandyk)

Thanks, that sounds good to me.

Following on from this bug:

  • I've created bug 1533761 to follow up on the crashes not being reported as failures (outside of ASAN).
  • I'm looking at if we can re-enable these tests on ASAN now in bug 1522213.
Blocks: 1522213
Regressed by: 1571081
No longer regressed by: 1571081
Regressions: 1571081
You need to log in before you can comment on or make changes to this bug.