Closed Bug 788512 Opened 7 years ago Closed 7 years ago

Add dumps for Flash processes to modified plugin hang report

Categories

(Core :: Plug-ins, defect)

x86_64
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla19
Tracking Status
firefox18 --- fixed

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

Details

Attachments

(3 files, 7 obsolete files)

Bug 784145 changes the way the dumps for plugin hangs are submitted. 
The dump submission for the Flash processes still has to be adopted to that.
Attached patch Add dumps for Flash processes (obsolete) — Splinter Review
Attachment #660086 - Flags: review?(benjamin)
Attachment #660086 - Flags: review?(benjamin) → review+
Keywords: checkin-needed
(In reply to Georg Fritzsche [:gfritzsche] from comment #2)
> Try: https://tbpl.mozilla.org/?tree=Try&rev=dd3f0b831afe

Green on Try.

https://hg.mozilla.org/integration/mozilla-inbound/rev/b14f17141bb1
Flags: in-testsuite-
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/b14f17141bb1
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Did you test this? I have not received any flash dumps on the new hang server... do you have crash IDs I can look at to see why it's not working?
Checked into this: I didn't properly test it, as it turns out there are no dumps generated yet for the Flash child (as i assumed) and hence TakeMinidumpForChild() doesn't return anything useful. 
Investigating how to trigger dump generation for those childs.
Ah indeed, "TakeMinidumpForChild" doesn't actually create a minidump, it just gets one that already exists. You basically need to create a call like which will create the minidump and name it what you want http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/nsExceptionHandler.cpp?mark=2704-2718#2692
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Breakpad seems to require a target thread whenever a minidump is to be taken for a child. For the plugin-container we already get one via its IPC setup, but not for the Flash processes.
The best i came up with for now is the hackish approach of finding the "main" threads for them and using that (still have to clean it up though). Does the report bp-hr-20120926-3ab9dd12-34bc-489b-80dd-81ea086a0b51 look roughly sensible on the server side?
No:

2012-09-26 11:43:14: minidump_processor.cc:264: INFO: Processing minidump in file /home/bsmedberg/hangprocessor/data/2012/09-26/hr-20120926-3ab9dd12-34bc-489b-80dd-81ea086a0b51/minidump_flash2.dmp
2012-09-26 11:43:14: minidump.cc:3775: INFO: Minidump opened minidump /home/bsmedberg/hangprocessor/data/2012/09-26/hr-20120926-3ab9dd12-34bc-489b-80dd-81ea086a0b51/minidump_flash2.dmp
2012-09-26 11:43:14: minidump.cc:3820: INFO: Minidump not byte-swapping minidump
2012-09-26 11:43:14: minidump.cc:4186: INFO: GetStream: type 1197932545 not present
2012-09-26 11:43:14: minidump.cc:4186: INFO: GetStream: type 1197932546 not present
2012-09-26 11:43:14: minidump.cc:1958: INFO: MinidumpModule could not determine version for C:\Program Files (x86)\NVIDIA Corporation\CoProcManager\detoured.dll
2012-09-26 11:43:14: minidump.cc:1857: INFO: MinidumpModule could not determine debug_file for C:\Program Files (x86)\WinSplit Revolution\WinSplitHook32.DLL
2012-09-26 11:43:14: minidump.cc:1926: INFO: MinidumpModule could not determine debug_identifier for C:\Program Files (x86)\WinSplit Revolution\WinSplitHook32.DLL
2012-09-26 11:43:14: minidump.cc:1322: ERROR: MinidumpThread has a memory region problem, 0x0+0x0
2012-09-26 11:43:14: minidump.cc:1516: ERROR: MinidumpThreadList cannot read thread 0/12
2012-09-26 11:43:14: minidump.cc:4209: ERROR: GetStream could not read stream type 3
2012-09-26 11:43:14: minidump_processor.cc:112: ERROR: Minidump /home/bsmedberg/hangprocessor/data/2012/09-26/hr-20120926-3ab9dd12-34bc-489b-80dd-81ea086a0b51/minidump_flash2.dmp has no thread list
2012-09-26 11:43:14: minidump.cc:3747: INFO: Minidump closing minidump
2012-09-26 11:43:14: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed

[minidump-stackwalk failed with code 1]
This sounds like the issue that bc was hitting. I had a sort-of-useful patch to work around this here:
https://breakpad.appspot.com/413002/
Current approach on generating minidumps: give breakpad the thread id of the oldest thread for the respective Flash process.
The dumps open fine locally in Visual Studio by the way, but only show the active function in the call stack and no locals etc.
Comment on attachment 665112 [details] [diff] [review]
Generate Flash process dumps on hang

Review of attachment 665112 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/nsExceptionHandler.cpp
@@ +2686,5 @@
> +  if (task_threads(childPid, &threads_for_task, &thread_count)
> +      == KERN_SUCCESS && childBlamedThread < thread_count) {
> +    childThread = threads_for_task[childBlamedThread];
> +  }
> +#else

We should probably pull this out into a separate function since it's used elsewhere.
Attached file Output of patched dump processor (obsolete) —
Tested around with the thread ids (check for threads being alive, set them to 0, ...) and haven't seen any difference.

Patching the hang processor prevents aborting the dump processing completely, but all threads stacks are broken. 
Attaching output of patched hang processor with the flash thread ids set to 0.
Turns out that the process handle used for the dump generation was missing PROCESS_VM_READ.

This might still intermittently trigger the bug Ted referenced, i'll need to check that tomorrow to be safe.
Attachment #665112 - Attachment is obsolete: true
The write-out of the .extra annotations is triggered via:
-> AsyncChannel::Close()
-> AsyncChannel::NotifyMaybeChannelError()
[...]
-> PluginModuleParent::ActorDestroy(AbnormalShutdown)
-> PluginModuleParent::ProcessFirstMinidump()
-> CrashReporterParent::GenerateChildData()

When the plugin .extra data is missing, those are never called.
When those aren't called, we shouldn't be getting a plugin hang notification at all in the UI. Are you submitting these via about:crashes?

The codepath is supposed to be:

* at the end of PluginModuleParent::ShouldContinueFromReplyTimeout(): post PluginModuleParent::CleanupFromTimeout
* in CleanupFromTimeout, Close(), but only if OkToCleanup! This may be the problem, in which case we should repost CleanupFromTimeout the same way we repost NotifyPluginCrashed
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> When those aren't called, we shouldn't be getting a plugin hang notification
> at all in the UI. Are you submitting these via about:crashes?

Yes and yes.

> * at the end of PluginModuleParent::ShouldContinueFromReplyTimeout(): post
> PluginModuleParent::CleanupFromTimeout
> * in CleanupFromTimeout, Close(), but only if OkToCleanup! This may be the
> problem, in which case we should repost CleanupFromTimeout the same way we
> repost NotifyPluginCrashed

Yes, that fixes it. So this is a separate issue and actually goes back to bug 747055.
Attached patch Repost cleanup notification (obsolete) — Splinter Review
Repost PluginModuleParent::CleanupFromTimeout() to avoid skipping it.
Attachment #670134 - Attachment is obsolete: true
Attachment #670887 - Flags: review?(benjamin)
Comment on attachment 670884 [details] [diff] [review]
Repost cleanup notification

Note bug 747055, comments 34-35.
Attachment #670884 - Flags: review?(benjamin)
(In reply to Georg Fritzsche [:gfritzsche] from comment #22)
> Comment on attachment 670884 [details] [diff] [review]
> Repost cleanup notification

Also worth noting that plugin hang notifications and reports are covered by tests, but they never triggered on such a scenario as far as i know.
Attachment #670884 - Flags: review?(benjamin) → review+
Comment on attachment 670887 [details] [diff] [review]
Generate Flash process dumps on hang, v3

lgtm, let's see if ted wants to review the crashreporter bits of this
Attachment #670887 - Flags: review?(ted.mielczarek)
Attachment #670887 - Flags: review?(benjamin)
Attachment #670887 - Flags: review+
Comment on attachment 670887 [details] [diff] [review]
Generate Flash process dumps on hang, v3

Review of attachment 670887 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine, just one nit.

::: toolkit/crashreporter/nsExceptionHandler.h
@@ +122,5 @@
> +// Create an additional minidump for a child of a process which already has
> +// a minidump (|parentMinidump|).
> +// The resulting dump will get the id of the parent and use the |name| as
> +// an extension.
> +bool CreateChildMinidump(ProcessHandle childPid,

Looking at this again, it might be better to call this "CreateAdditionalChildMinidump", just to be very explicit. It's not really a deal-breaker though.
Attachment #670887 - Flags: review?(ted.mielczarek) → review+
Adressed review comments.
Attachment #670887 - Attachment is obsolete: true
Attachment #670884 - Attachment is obsolete: true
Attachment #669972 - Attachment is obsolete: true
Fix Linux build bustage for newer breakpad revision.
Try: https://tbpl.mozilla.org/?tree=Try&rev=0a6ae5301535
Attachment #671367 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/2b96e7442087
https://hg.mozilla.org/mozilla-central/rev/c8926a55f661
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Comment on attachment 671370 [details] [diff] [review]
Repost cleanup notification

[Approval Request Comment]
This change is a feature uplift which allows plugin hang reports to be more useful. The code in question only runs when we've already detected that a plugin has stopped responding, and this appears relatively safe. The only risk I can see is that we'd end up causing some kind of crash in the Firefox process while running this code, but that would show up on crash-stats pretty quickly/obviously.
Attachment #671370 - Flags: approval-mozilla-aurora?
Attachment #671391 - Flags: approval-mozilla-aurora?
Attachment #671370 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #671391 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.