Attach to and report on crashes in FlashPlayerPlugin_*.exe processes which are children of our plugin container

RESOLVED FIXED in Firefox 15

Status

()

Core
Plug-ins
P1
normal
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: bsmedberg, Assigned: bsmedberg)

Tracking

unspecified
mozilla16
x86_64
Windows 7
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox14 wontfix, firefox15+ fixed, firefox16 fixed)

Details

(URL)

Attachments

(6 attachments, 7 obsolete attachments)

14.27 KB, patch
Details | Diff | Splinter Review
10.17 KB, patch
Ehsan
: review+
Details | Diff | Splinter Review
19.34 KB, patch
Ehsan
: review+
Details | Diff | Splinter Review
11.75 KB, patch
jimm
: review+
Details | Diff | Splinter Review
985 bytes, patch
Ehsan
: review+
Details | Diff | Splinter Review
16.37 KB, patch
Details | Diff | Splinter Review
(Assignee)

Description

5 years ago
We currently only see crashes in the plugin-container process. With the Flash Player 11.3 sandbox, the interesting crashes all occur in the FlashPlayerPlugin_*.exe processes. We can inject our crash reporter into those processes and report crashes directly to crash-stats from there, which will significantly aid bucketing and diagnosing the increases in crash rates with Flash 11.3.
(Assignee)

Comment 1

5 years ago
http://hg.mozilla.org/users/bsmedberg_mozilla.com/breakpadinject/ contains PoC code which hooks breakpad up via DLL injection in a way that doesn't require any changes to the Flash player.
(Assignee)

Comment 2

5 years ago
The Mozilla patches here depend on the upstream breakpad change http://breakpad.appspot.com/406002/ , although in a pinch we could take a one-off change if a full breakpad merge was too complicated or risky.
(Assignee)

Updated

5 years ago
Assignee: nobody → benjamin
Priority: -- → P1
Target Milestone: --- → mozilla14
(Assignee)

Comment 3

5 years ago
Created attachment 637340 [details] [diff] [review]
The injector library, rev. 1

Not requesting review yet until I finish and test the other half.
(Assignee)

Comment 4

5 years ago
Created attachment 638240 [details] [diff] [review]
Part A - Breakpad update cherrypicked from upstream
(Assignee)

Comment 5

5 years ago
Created attachment 638241 [details] [diff] [review]
Part B - Ambiguous call resolution from breakpad update
Attachment #638241 - Flags: review?(ehsan)
(Assignee)

Comment 6

5 years ago
Created attachment 638242 [details] [diff] [review]
Part C - the injected library
Attachment #637340 - Attachment is obsolete: true
Attachment #638242 - Flags: review?(ehsan)
(Assignee)

Comment 7

5 years ago
Created attachment 638244 [details] [diff] [review]
Part D - MemoryModule import

In the checkin comment I included the exact revision of https://github.com/fancycode/MemoryModule from which I copied and modified this code, if diffing against that would be useful.
Attachment #638244 - Flags: review?(ehsan)
(Assignee)

Comment 8

5 years ago
Created attachment 638245 [details] [diff] [review]
Part E - Crashreport API for injection
Attachment #638245 - Flags: review?(ehsan)
(Assignee)

Comment 9

5 years ago
Created attachment 638246 [details] [diff] [review]
Part F - actually do injection from PluginModuleParent specifically for Flash and FlashPlayerPlugin processes only
Attachment #638246 - Flags: review?(jmathies)
(Assignee)

Updated

5 years ago
Attachment #638242 - Flags: review?(khuey)
(Assignee)

Comment 10

5 years ago
Just started https://tbpl.mozilla.org/?tree=Try&rev=4a1b0f213788 for a full tryserver run.

https://wiki.mozilla.org/User:Benjamin_Smedberg/Flash_Crash_Reporting contains a design document.

I've locally verified at least one crash in the sandboxed process which was caught successfully and produced a minidump.
(Assignee)

Updated

5 years ago
Blocks: 770078
Comment on attachment 638242 [details] [diff] [review]
Part C - the injected library

Review of attachment 638242 [details] [diff] [review]:
-----------------------------------------------------------------

Don't you need to add something to the package-manifest?
Attachment #638242 - Flags: review?(khuey) → review+

Comment 12

5 years ago
e:/builds/moz2_slave/try-w32/build/toolkit/crashreporter/nsExceptionHandler.cpp(66) : fatal error C1083: Cannot open include file: 'InjectCrashReporter.h': No such file or directory
(Assignee)

Comment 13

5 years ago
Created attachment 638316 [details] [diff] [review]
Part E, files added

Indeed bc, I forgot to `hg add` files. Part E now fixed.
Attachment #638245 - Attachment is obsolete: true
Attachment #638245 - Flags: review?(ehsan)
Attachment #638316 - Flags: review?(ehsan)
(Assignee)

Comment 14

5 years ago
New try run: https://tbpl.mozilla.org/?tree=Try&rev=2160b10822bc

builds will be at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bsmedberg@mozilla.com-2160b10822bc

Comment 15

5 years ago
after crashing flash, should I see entries in about:crashes from this?

Comment 16

5 years ago
(In reply to Jim Mathies [:jimm] from comment #15)
> after crashing flash, should I see entries in about:crashes from this?

Managed to trigger this I think, here's the stack - 

https://crash-stats.mozilla.com/report/index/bp-aef2e9de-e6d8-4283-967a-98f832120702

Comment 17

5 years ago
Comment on attachment 638246 [details] [diff] [review]
Part F - actually do injection from PluginModuleParent specifically for Flash and FlashPlayerPlugin processes only

Review of attachment 638246 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/plugins/ipc/PluginModuleParent.cpp
@@ +1193,5 @@
> +         ok = Process32Next(snapshot, &entry)) {
> +        if (entry.th32ParentProcessID == pid) {
> +            nsString name(entry.szExeFile);
> +            ToUpperCase(name);
> +            if (StringBeginsWith(name, NS_LITERAL_STRING("FLASHPLAYERPLUGIN"))) {

nit - I would suggest putting this in a const up top with a comment explaining how it is used.

::: dom/plugins/ipc/PluginModuleParent.h
@@ +321,5 @@
> +    
> +    NS_OVERRIDE void OnCrash(DWORD processID, const nsAString& aDumpID);
> +
> +    DWORD mFlashProcess1;
> +    DWORD mFlashProcess2;

since InitializeInjector() can fail before these are assigned, we need to initialize them to 0.
Attachment #638246 - Flags: review?(jmathies) → review+
(Assignee)

Comment 18

5 years ago
Yeah, you should see the plugin frowny-face and clicking the button in the plugin frowny face should submit the report, and they should show up in about:crashes.
(Assignee)

Comment 19

5 years ago
Created attachment 638381 [details] [diff] [review]
Part E - fix crash when crash reporter is disabled mid-run as in mochitests

This is a trivial change to part E: there is a race condition when crash handling is disabled while running, which basically only ever happens in mochitests. Added a null-check.
Attachment #638316 - Attachment is obsolete: true
Attachment #638316 - Flags: review?(ehsan)
Attachment #638381 - Flags: review?(ehsan)
(Assignee)

Comment 20

5 years ago
Created attachment 638383 [details] [diff] [review]
Part C - the injected library, with build fix

Contains the fix from khuey (thanks!) as well as an additional build fix for the native breakpad client which only showed up in clobber builds.
Attachment #638242 - Attachment is obsolete: true
Attachment #638242 - Flags: review?(ehsan)
Attachment #638383 - Flags: review?(ehsan)
(Assignee)

Comment 21

5 years ago
Created attachment 638384 [details] [diff] [review]
Part F - actually do injection from PluginModuleParent specifically for Flash and FlashPlayerPlugin processes only

Jim, this is an updated version of part F, which adds the pref that we talked about on IRC. It also changes an incorrect negative in PMP::ShouldContinueFromReplyTimeout and conditional ordering in PMP::ActorDestroy which meant that we weren't submitting hang reports correctly.
Attachment #638246 - Attachment is obsolete: true
Attachment #638384 - Flags: review?(jmathies)

Updated

5 years ago
Attachment #638384 - Flags: review?(jmathies) → review+
Comment on attachment 638241 [details] [diff] [review]
Part B - Ambiguous call resolution from breakpad update

Review of attachment 638241 [details] [diff] [review]:
-----------------------------------------------------------------

This makes us not use MiniDumpWithFullMemoryInfo, since we're no longer passing |minidump_type| to the constructor, right?  Is this expected?
(Assignee)

Comment 23

5 years ago
Created attachment 638399 [details] [diff] [review]
Part B - Ambiguous call resolution from breakpad update

You're right.
Attachment #638241 - Attachment is obsolete: true
Attachment #638241 - Flags: review?(ehsan)
Attachment #638399 - Flags: review?(ehsan)
Comment on attachment 638244 [details] [diff] [review]
Part D - MemoryModule import

Review of attachment 638244 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comments below addressed.

::: toolkit/crashreporter/LoadLibraryRemote.cpp
@@ +32,5 @@
> +  unsigned char *localCodeBase;
> +  unsigned char *remoteCodeBase;
> +  HMODULE *modules;
> +  int numModules;
> +  int initialized;

You never set the initialized value to anything non-zero, and never seem to use it, so it can just be removed.

@@ +73,5 @@
> +  }
> +}
> +
> +// Protection flags for memory pages (Executable, Readable, Writeable)
> +static int ProtectionFlags[2][2][2] = {

Nit: static const DWORD.

@@ +211,5 @@
> +
> +    for (; importDesc < importEnd && importDesc->Name; importDesc++) {
> +      POINTER_TYPE *thunkRef;
> +      FARPROC *funcRef;
> +      HMODULE handle = LoadLibraryA((LPCSTR) (codeBase + importDesc->Name));

Nit: please call FreeLibrary when you're done with the handle.

@@ +264,5 @@
> +                                     const WCHAR* library,
> +                                     const char* symbol)
> +{
> +  // Map the DLL into memory
> +  HANDLE hLibrary = CreateFile(library, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING,

Nit: use nsAutoHandle here, and below for hMapping too.

@@ +296,5 @@
> +  if (dos_header->e_magic != IMAGE_DOS_SIGNATURE) {
> +#if DEBUG_OUTPUT
> +    OutputDebugStringA("Not a valid executable file.\n");
> +#endif
> +    return NULL;

Nit: UnmapViewOfFile in this and the following error detection branches.  Or use RVAMap in nsWindowsDllBlocklist.cpp.

@@ +358,5 @@
> +  }
> +
> +  // mark memory pages depending on section headers and release
> +  // sections that are marked as "discardable"
> +  FinalizeSections(&result, hRemoteProcess);

FinalizeSections can fail in various ways, and it returns void, which effectively causes us to ignore those error conditions.  That seems wrong to me.

::: toolkit/crashreporter/LoadLibraryRemote.h
@@ +8,5 @@
> +#include <windows.h>
> +
> +void* LoadRemoteLibraryAndGetAddress(HANDLE hRemoteProcess,
> +                                     const WCHAR* library,
> +                                     const char* symbol);

The implementation of this function cannot get the proc address for ordinal exports.  This is fine for now, but it may confuse future callers because of the similarity to GetProcAddress, so please add a comment here saying that |symbol| must be a string name and not an ordinal number.
Attachment #638244 - Flags: review?(ehsan) → review+
Attachment #638399 - Flags: review?(ehsan) → review+
Comment on attachment 638383 [details] [diff] [review]
Part C - the injected library, with build fix

Review of attachment 638383 [details] [diff] [review]:
-----------------------------------------------------------------

I think we should add an entry point function to the injector DLL which just returns FALSE on all reason codes, so that if someone uses LoadLibrary on this DLL, it would fail normally.

::: toolkit/crashreporter/injector/injector.cpp
@@ +22,5 @@
> +  HANDLE hCrashPipe = reinterpret_cast<HANDLE>(context);
> +
> +  ExceptionHandler* e = new ExceptionHandler(wstring(),
> +                                             NULL, NULL, NULL, ExceptionHandler::HANDLER_ALL,
> +                                             MiniDumpNormal, hCrashPipe, NULL);

This is an edge case, but it's best to handle OOM here.  If you don't link to libcp*.lib, you can just check for null.  Otherwise you should use the nothrow new variant.
Attachment #638383 - Flags: review?(ehsan) → review+
Comment on attachment 638381 [details] [diff] [review]
Part E - fix crash when crash reporter is disabled mid-run as in mochitests

Review of attachment 638381 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/InjectCrashReporter.cpp
@@ +31,5 @@
> +{
> +  if (mInjectorPath.IsEmpty())
> +    return NS_OK;
> +
> +  HANDLE hProcess = OpenProcess(PROCESS_CREATE_THREAD |

Nit: Using nsAutoHandle could get rid of multiple CloseHandle calls below.
Attachment #638381 - Flags: review?(ehsan) → review+
(Assignee)

Comment 27

5 years ago
Created attachment 638423 [details] [diff] [review]
Part D - MemoryModule import, with fixes
Attachment #638244 - Attachment is obsolete: true
Here's my assessment on the risk of this patch.  The CreateRemoteThread based hack is quite well known and heavily used on Windows.  The part about basically doing our own loader stuff to load a DLL directly in memory is a bit riskier, but the fact that we only use it to load a single known DLL (the injector DLL) and also the fact that we don't do things which rely on the OS specific differences (such as handling module search paths etc) means that it is possible to verify those pieces by doing in-house QA on all versions of Windows we support, which greatly raises my confidence if we're going to take this on beta.

The most significant risk with this patch, as far as I can tell, is for us to do something wrong and crash the sandboxed process as we're injecting breakpad into it (before we finish setting up the reporter in that process.)  So we should keep an eye on people complaining about not being able to view flash content after this lands.

Based on the above, I think that we should take this on beta, as the benefits outweigh the risks, IMO.
(Assignee)

Comment 29

5 years ago
https://tbpl.mozilla.org/?tree=Try&rev=01508fb96949 passed, working on a rebuild of the final patches and getting this pushed to -central now.
(Assignee)

Comment 30

5 years ago
Part A - https://hg.mozilla.org/mozilla-central/rev/796f649b8140
Part B - https://hg.mozilla.org/mozilla-central/rev/c154cee1928a
Part C - https://hg.mozilla.org/mozilla-central/rev/141f0a09f4b6
Part D - https://hg.mozilla.org/mozilla-central/rev/6990667ebd08
Part E - https://hg.mozilla.org/mozilla-central/rev/52b93da29023
Part F - https://hg.mozilla.org/mozilla-central/rev/1d370ca5bb8d

When doing final testing of this I noticed a preexisting race condition that might occasionally show up as "no crash report available" for these crashes and even regular reports. I'll file and fix that separately.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 31

5 years ago
Benjamin, Were you able to determine the problem with the inability of minidump_stackwalk to process the dumps?
Can someone please comment here when we get Nightly builds to test and provide a link? I'd like to get the QA-org and Nightly Testers involved with this bug.
(In reply to comment #32)
> Can someone please comment here when we get Nightly builds to test and provide
> a link? I'd like to get the QA-org and Nightly Testers involved with this bug.

Nightly builds are currently in progress (about an hour away).
(Assignee)

Comment 34

5 years ago
bc, I haven't seen any dumps personally that were incomplete, although I've primarily tried to use them by attaching in MSVC and not with minidump-stackwalk.

Comment 35

5 years ago
I opened one using MSVC. It wasn't particularly interesting but it did open and show some information.

Laura, what does Socorro use to process crash reports? I'm worried this will send all the crashes to '(no signature)' if the dump can't be processed.
(In reply to Ehsan Akhgari [:ehsan] from comment #33)
> (In reply to comment #32)
> > Can someone please comment here when we get Nightly builds to test and provide
> > a link? I'd like to get the QA-org and Nightly Testers involved with this bug.
> 
> Nightly builds are currently in progress (about an hour away).

The nightly should now be available.
(In reply to Ehsan Akhgari [:ehsan] from comment #36)
> The nightly should now be available.

Just to confirm, the builds are here, right?
ftp://ftp.mozilla.org/pub/firefox/nightly/2012-07-03-11-08-46-mozilla-central/
(In reply to comment #37)
> (In reply to Ehsan Akhgari [:ehsan] from comment #36)
> > The nightly should now be available.
> 
> Just to confirm, the builds are here, right?
> ftp://ftp.mozilla.org/pub/firefox/nightly/2012-07-03-11-08-46-mozilla-central/

Yes.
A request has been communicated the following groups of people to dogfood and report crash IDs to this bug:
 * QA Staff
 * QA Contractors
 * Nightly Testers
 * QA Community
 * QMO

Comment 40

5 years ago
browser+plugin hang reports
https://www.youtube.com/watch?v=ZmYRllome8Q&feature=relmfu
bp-8c62eb77-a49a-461c-81c8-1e71d2120704

Updated

5 years ago
Depends on: 770805

Updated

5 years ago
Target Milestone: mozilla14 → mozilla16
ftp://ftp.mozilla.org/pub/firefox/nightly/2012-07-03-11-08-46-mozilla-central/
I got a flash plugin crash on http://www.miniclip.com/games/flash-tennis/en/ on Win 7 64-bit. I see no issues on other flash sites.
https://crash-stats.mozilla.com/report/index/e747fbe8-3bb4-45d7-9913-bd7292120704
https://crash-stats.mozilla.com/report/index/bp-ec35d2e0-3d92-48ca-87c7-d95ed2120704
I get this crash with the testcase from bug 763237, using bsmedberg's special build from this bug.
Mozilla/5.0 (Windows NT 6.0; rv:16.0) Gecko/16.0 Firefox/16.0
buildId: 20120703110846

I got the crash on Win Vista, using the latest Nightly, while watching a video on YouTube.
https://crash-stats.mozilla.com/report/index/bp-a97a4e39-7682-4895-b43d-f32312120704

Updated

5 years ago
Depends on: 771082
(Assignee)

Updated

5 years ago
Depends on: 771134
(Assignee)

Updated

5 years ago
Depends on: 771251
(Assignee)

Comment 44

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd58bfdf6c71 Followup to part F to actually submit crash annotations
Comment on attachment 638240 [details] [diff] [review]
Part A - Breakpad update cherrypicked from upstream

[Triage Comment]
This pre-approval is for patches A through F and any followups made on m-c today. Given the absence of a risk evaluation, my approval is under the assumption that it will be no more risky on 15 than 16.

The motivation for uplift is to get Adobe as much external and actionable crash data as possible when they're back in the office on Monday.
Attachment #638240 - Flags: approval-mozilla-aurora+
(Assignee)

Comment 46

5 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/1fc5db379a48
https://hg.mozilla.org/releases/mozilla-aurora/rev/1d1f1f0c4511
https://hg.mozilla.org/releases/mozilla-aurora/rev/f4cc7502eed5
https://hg.mozilla.org/releases/mozilla-aurora/rev/ad10a8b2d622
https://hg.mozilla.org/releases/mozilla-aurora/rev/43d418b408e1 (minor fixup required for nsIFile/nsILocalFile differences on branch)
https://hg.mozilla.org/releases/mozilla-aurora/rev/3b6f13419b26
https://hg.mozilla.org/releases/mozilla-aurora/rev/6926dd115351

Noting from earlier meetings, we do not intend to land this on beta/ff14 unless plans change with Adobe significantly.
status-firefox14: --- → wontfix
status-firefox15: --- → fixed
tracking-firefox15: --- → +
Target Milestone: mozilla16 → mozilla15

Updated

5 years ago
status-firefox16: --- → fixed
https://hg.mozilla.org/mozilla-central/rev/cd58bfdf6c71

Updated

5 years ago
Depends on: 772432
(Assignee)

Updated

5 years ago
Blocks: 773332

Comment 48

5 years ago
Target Milestone is for m-c.
Target Milestone: mozilla15 → mozilla16
Depends on: 773632
You need to log in before you can comment on or make changes to this bug.