Save minidump files if MINIDUMP_SAVE_PATH is set to preserve crash dumps

NEW
Unassigned

Status

P2
normal
a year ago
2 months ago

People

(Reporter: whimboo, Unassigned)

Tracking

(Depends on: 1 bug, Blocks: 3 bugs)

Version 3
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

In case of mozcrash for Python there is MINIDUMP_SAVE_PATH support, which basically means that the minidump files as created by Firefox will be saved off to a given folder before the profile gets removed.

As long as we cannot (want to) handle crash analysis, we should at least offer a way to users to safe their minidump files. This is not possible for now, and causes a bit of pain for me to get investigated.

Should be a trivial addition, and would be good to have for the 0.20 release.
Summary: Safe minidump files if MINIDUMP_SAVE_PATH is set → Save minidump files if MINIDUMP_SAVE_PATH is set
Priority: -- → P3
Blocks: 1441204
No longer blocks: 1401129
Comment hidden (mozreview-request)

Comment 2

a year ago
mozreview-review
Comment on attachment 8954393 [details]
Bug 1433495 - Save minidump files if MINIDUMP_SAVE_PATH is set.

https://reviewboard.mozilla.org/r/223478/#review229584

::: testing/mozbase/rust/mozrunner/Cargo.toml:12
(Diff revision 1)
>  license = "MPL-2.0"
>  
>  [dependencies]
>  log = "0.4"
>  mozprofile = { path = "../mozprofile" }
> +walkdir = "2"

Remember to run "./mach vendor rust" so it gets put in third_party/rust.

::: testing/mozbase/rust/mozrunner/src/minidump.rs:6
(Diff revision 1)
> +use std::env;
> +use std::fs;
> +use std::path::PathBuf;
> +use walkdir::WalkDir;
> +
> +pub fn save_minidump_files(path: &PathBuf) -> () {

This code looks ok-ish, but I’d prefer a slightly different API.

First of all I think it would be better if you split this function in
two: one part providing an iterator over minidump files given a Profile,
the other consuming the minidumps and copying them to a given location.

It is also possible the iterator part should live in mozprofile?

::: testing/mozbase/rust/mozrunner/src/minidump.rs:10
(Diff revision 1)
> +
> +pub fn save_minidump_files(path: &PathBuf) -> () {
> +    // Until we can analyze crash reports similar to mozcrash
> +    // lets save available minidump files to a safe location
> +    // if the MINIDUMP_SAVE_PATH env variable has been set.
> +    if let Ok(val) = env::var("MINIDUMP_SAVE_PATH") {

I think probably this should live in the Drop implementation,
and that the function to save minidumps should take a profile
(with an iterator) and a destination path instead.

::: testing/mozbase/rust/mozrunner/src/minidump.rs:15
(Diff revision 1)
> +            println!("MINIDUMP_SAVE_PATH set, but {:?} is not a valid path", save_path);
> +            return;

Return Err here.

::: testing/mozbase/rust/mozrunner/src/minidump.rs:26
(Diff revision 1)
> +                match fs::copy(entry.path(), &save_path) {
> +                    Ok(_) => {
> +                        println!("Saved {:?} as {:?}", entry.file_name(), save_path);
> +                    }
> +                    _ => {}
> +                }

Can we have this function return Result, and do away with the match stataement?

::: testing/mozbase/rust/mozrunner/src/minidump.rs:28
(Diff revision 1)
> +        for entry in walker.filter_map(|e| e.ok()) {
> +            if entry.path().is_file() {
> +                save_path.set_file_name(entry.file_name());
> +                match fs::copy(entry.path(), &save_path) {
> +                    Ok(_) => {
> +                        println!("Saved {:?} as {:?}", entry.file_name(), save_path);

Let’s use the info!() macro for this.
Attachment #8954393 - Flags: review?(ato) → review-
(Reporter)

Comment 3

a year ago
mozreview-review-reply
Comment on attachment 8954393 [details]
Bug 1433495 - Save minidump files if MINIDUMP_SAVE_PATH is set.

https://reviewboard.mozilla.org/r/223478/#review229584

> Remember to run "./mach vendor rust" so it gets put in third_party/rust.

We don't have to because it is already in the tree. But I see that this crate is at version 1.0.7 only. We may have to check which other crates depend on it, or if we can just use that version.

> This code looks ok-ish, but I’d prefer a slightly different API.
> 
> First of all I think it would be better if you split this function in
> two: one part providing an iterator over minidump files given a Profile,
> the other consuming the minidumps and copying them to a given location.
> 
> It is also possible the iterator part should live in mozprofile?

The changes here are actually more as what I can do today. Andreas, if you want to pick this up please do it. I would appreciate. I expected that the API is not that great, because I just wanted to get something working.

> Return Err here.

Note that the caller might want to silently ignore the error. We should make this a hard failure.

> Can we have this function return Result, and do away with the match stataement?

If you do a refactoring and this method copies only a single file then this will be fine. With the current implementation we would abort the copy process in the middle and leaving other minidump files behind.
So, post our discussion, I think that the right approach here is to make sure `delete_session` in the handler is called before shutdown and make the marionette implemenation of `WebDriverHandler` do the cleanup in `delete_session`. For the case that the session is explicitly closed that will work. For the case that there's a browser crash and some command fails that will work. The case for which that won't work is if geckodriver is terminated with a signal with an active session (in this case it would olny matter if the browser had crashed after the last command but before termination).

In theory the way we want to handle that case is to install a signal handler that responds to SIGTERM by running the cleanup steps and then shutting down the server. The chan-signal crate seems like the best current option for doing this, and with a small refactor I was able to dispatch a Quit message on ctrl-c and have that run the `delete_session` code. Unfortunately there are two problems:

* It doesn't work on Windows
* It isn't actually possible to shut down hyper 0.10 [1]

So I think to make this work in all cases it would be necessary to upgrade to a newer hyper, which would be a significant refactor since the new version is designed for event-based concurrency rather than the thread-per-connection approach of 0.10, which we are using in a single-threaded fashion.

In the interests of Science, I tried implementing Drop that called delete_session, and afaict that isn't actually called on a signal, so I don't think a purely Drop-based solution will work.

The conclusion here is that I don't think we can make this work in all cases at the moment without significant effort. In the meantime, I think the right thing to do for minidumps is to add a method somewhere in mozprofile that clears out the minidumps (I would even say that the method should take a destination path rather than hardcoding the environment variable), and that should be called from `delete_session`. If we then want to plug the hole in the above approach we will need to do some more fundamental structural work to enable us to shut down the web server cleanly.

[1] https://github.com/hyperium/hyper/issues/338
Summary: Save minidump files if MINIDUMP_SAVE_PATH is set → Save minidump files if MINIDUMP_SAVE_PATH is set to preserve crash dumps
(Reporter)

Updated

11 months ago
See Also: → bug 1460307
Blocks: 1466818
No longer blocks: 1441204
Blocks: 1485259
No longer blocks: 1485259
Blocks: 1490906
No longer blocks: 1466818
I would appreciate to have this feature in geckodriver as soon as possible. Reasoning is that we have a couple of crashes of Firefox in the wdspec test suite which cannot be analyzed. As such there is nothing I can do to get the tests passing again.

Andreas, you mentioned that you have a half-baked patch locally. In the case you cannot complete it due to time constraints would you mind to at least upload this patch to this bug? In such a case I would like to pick it up. But if you could finish it, I would appreciate it. Thanks.
Flags: needinfo?(ato)
As discussed in the meeting today, here is the WIP patch Andreas sent me early in April this year. 

Andreas, maybe you can skim over and check if that is the one you were referring to. If not lets get it obsoleted. Thanks.
Assignee: nobody → hskupin
Thanks for attaching the lost patch.  I had managed to almost
recreate it locally, and the result was surprisingly similar.

I will look at this once I’m done with my P1’s.
Assignee: hskupin → ato
Flags: needinfo?(ato)
Priority: P3 → P2
Assignee: ato → hskupin
Blocks: 1495062
No longer blocks: 1491288
As discussed during the work week Andreas wants to take care of this bug after bug 1430064 has been fixed.
Assignee: hskupin → ato
Depends on: 1430064

Moving to the 0.25 release of geckodriver for which we might be able to get it fixed.

Assignee: ato → nobody
Blocks: 1520585
No longer blocks: 1495062
You need to log in before you can comment on or make changes to this bug.