Closed Bug 1101308 Opened 10 years ago Closed 9 years ago

GMPs crash sporadically

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox36 --- fixed
firefox37 --- fixed
firefox38 --- fixed
firefox39 --- fixed

People

(Reporter: cpearce, Assigned: cpearce)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

I've noticed that sometimes GMPs crash during gtests, or sometimes if I refresh page using EME. It's intermittent. This is on Windows, but I think JW may have seen this on Linux too.
For example:


$ ./mach gtest *GMP*
 0:00.25 c:\mozilla-build\mozmake\mozmake.EXE -C testing/gtest -j8 -s -w gtest
 0:00.27 mozmake.EXE: Entering directory 'c:/Users/cpearce/src/mozilla/purple/objdir/testing/gtest'
 0:00.29 mozmake.EXE[1]: Entering directory 'c:/Users/cpearce/src/mozilla/purple/objdir/toolkit/library'
 0:00.31 mozmake.EXE[1]: Leaving directory 'c:/Users/cpearce/src/mozilla/purple/objdir/toolkit/library'
 0:00.31 mozmake.EXE: Leaving directory 'c:/Users/cpearce/src/mozilla/purple/objdir/testing/gtest'
 0:00.32 c:\Users\cpearce\src\mozilla\purple\objdir\dist\bin\firefox.exe -unittest
Running GTest tests...
[13284] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file c:\Users\cpearce\src\mozilla\purple\xpcom\base\nsTraceRefcnt.cpp, line 143
[13284] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file c:\Users\cpearce\src\mozilla\purple\xpcom\base\nsTraceRefcnt.cpp, line 143
Note: Google Test filter = *GMP*
[==========] Running 12 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 12 tests from GeckoMediaPlugins
[ RUN      ] GeckoMediaPlugins.GMPTestCodec
[       OK ] GeckoMediaPlugins.GMPTestCodec (190 ms)
[ RUN      ] GeckoMediaPlugins.GMPCrossOrigin
[       OK ] GeckoMediaPlugins.GMPCrossOrigin (170 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageGetNodeId
[13284] WARNING: This method is lossy. Use GetCanonicalPath !: file c:/Users/cpearce/src/mozilla/purple/xpcom/io/nsLocalFileWin.cpp, line 3483
[       OK ] GeckoMediaPlugins.GMPStorageGetNodeId (31 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageBasic
[       OK ] GeckoMediaPlugins.GMPStorageBasic (73 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageCrossOrigin
[       OK ] GeckoMediaPlugins.GMPStorageCrossOrigin (77 ms)
[ RUN      ] GeckoMediaPlugins.GMPStoragePrivateBrowsing
[       OK ] GeckoMediaPlugins.GMPStoragePrivateBrowsing (190 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageAsyncShutdownTimeout
[       OK ] GeckoMediaPlugins.GMPStorageAsyncShutdownTimeout (93 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageAsyncShutdownStorage
[       OK ] GeckoMediaPlugins.GMPStorageAsyncShutdownStorage (53 ms)
[ RUN      ] GeckoMediaPlugins.GMPPluginVoucher
[       OK ] GeckoMediaPlugins.GMPPluginVoucher (26 ms)
[ RUN      ] GeckoMediaPlugins.GMPOutputProtection
[       OK ] GeckoMediaPlugins.GMPOutputProtection (51 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageGetRecordNamesInMemoryStorage
[       OK ] GeckoMediaPlugins.GMPStorageGetRecordNamesInMemoryStorage (39 ms)
[ RUN      ] GeckoMediaPlugins.GMPStorageGetRecordNamesPersistentStorage
[13284] WARNING: GMP crash without crash report: file c:\Users\cpearce\src\mozilla\purple\dom\media\gmp\GMPParent.cpp, line 633
[13284] WARNING: Unable to terminate process: 5: file c:/Users/cpearce/src/mozilla/purple/ipc/chromium/src/base/process_util_win.cc, line 375
[13284] WARNING: Timed out waiting for GMP async shutdown!: file c:\Users\cpearce\src\mozilla\purple\dom\media\gmp\GMPParent.cpp, line 180
[13284] WARNING: Timed out waiting for GMP async shutdown!: file c:\Users\cpearce\src\mozilla\purple\dom\media\gmp\GMPParent.cpp, line 180
[13284] WARNING: Timed out waiting for GMP async shutdown!: file c:\Users\cpearce\src\mozilla\purple\dom\media\gmp\GMPParent.cpp, line 180

(timeouts at the end are expected, the "Unable to terminate process" is not.
Mine is a little different.

[5730] WARNING: FAILED TO exec() CHILD PROCESS, path: /media/jwwang/DATA/codebase/mozilla-central2/plugin-container: file /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/process_util_linux.cc, line 314

It looks like it is related to Linux permission.
(In reply to JW Wang [:jwwang] from comment #2)
> Mine is a little different.
> 
> [5730] WARNING: FAILED TO exec() CHILD PROCESS, path:
> /media/jwwang/DATA/codebase/mozilla-central2/plugin-container: file
> /media/jwwang/DATA/codebase/mozilla-central2/ipc/chromium/src/base/
> process_util_linux.cc, line 314
> 
> It looks like it is related to Linux permission.

I had this before. The correct bin path isn't being used here; I've been working around it for now by doing |mach run| from obj/dist/bin.
OK, I looked into this today. The cause is the code in the #ifdef HASH_NODE_ID_WITH_DEVICE_ID block in GMPLoader.cpp to zero the thread stack's memory down to the guard page to clear user identifiable information. If I comment out that code, all the GMP gtests pass consistently.

We only run that code on Windows only, so crashes on other platforms are something else.
This code is in Aurora, and will affect OpenH264. We'd better disable it and uplift the patch to disable it.
We'll need this uplifted, and we'll need to reinstate a fixed version of the  code in bug 1114867.
Attachment #8540515 - Flags: review?(edwin)
Comment on attachment 8540515 [details] [diff] [review]
Patch: remove code causing crash

Review of attachment 8540515 [details] [diff] [review]:
-----------------------------------------------------------------

I'll take an r+ from ajones too...
Attachment #8540515 - Flags: review?(ajones)
Comment on attachment 8540515 [details] [diff] [review]
Patch: remove code causing crash

Review of attachment 8540515 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/gmp/GMPLoader.cpp
@@ +110,5 @@
>        return false;
>      }
> +    // TODO: (Bug 1114867) Clear any memory on the stack that may have been
> +    // used by functions we've called that may have left behind data that
> +    // can be used to uniquely identify the user.

This will probably crash if we hit an interrupt while we're in SecureZeroMemory(). We need to increment the stack pointer before we clear the stack space.

We want something like:

void* buf=alloca(top - bottom);
// The buffer should be at least as low as bottom. We can have local variables
// above it in memory.
MOZ_ASSERT(buf <= bottom);
SecureZeroMemory(buf, top - bottom);

This will allocate a buffer that sits below the current position in the stack and should cover the block of memory we want to erase. If we have local variables between the end of this buffer and top (e.g. buf) then we're writing to them anyway so they do get cleared.
Attachment #8540515 - Flags: review?(ajones) → review+
On second thoughts.. ignore that comment. It is completely wrong.
I don't quite get it. Is the implementation of GetStackAfterCurrentFrame() flawed and causing crash?
https://hg.mozilla.org/mozilla-central/rev/c29ebd2b4a10
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
(In reply to JW Wang [:jwwang] from comment #11)
> I don't quite get it. Is the implementation of GetStackAfterCurrentFrame()
> flawed and causing crash?

I'm not sure, but that seems likely.
Comment on attachment 8540515 [details] [diff] [review]
Patch: remove code causing crash

Approval Request Comment
[Feature/regressing bug #]: bug 1060179, device binding support for EME Gecko Media Plugins
[User impact if declined]: Open264 will crash randomly on startup.
[Describe test coverage new/current, TBPL]: Tested locally, this code is also exercised by our unit test.
[Risks and why]: Low; we're just removing code.
[String/UUID change made/needed]: None.
Attachment #8540515 - Flags: approval-mozilla-aurora?
Attachment #8540515 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Landing on aurora needed, so marking checkin-needed.
Keywords: checkin-needed
Looks to me like it landed on aurora seven hours before that.
Keywords: checkin-needed
No longer blocks: 1114826
Mass update firefox-status to track EME uplift.
You need to log in before you can comment on or make changes to this bug.