Handle and recover from crashes on device



5 years ago
2 years ago


(Reporter: zcampbell, Unassigned)


Firefox Tracking Flags

(Not tracked)


(Whiteboard: [emulator][device])

When the device crashes, it will restart.
The test runner will attempt to move on to the next step, but if Gecko has not restarted in time, the Marionette client will not be able to communicate with the Marionette server and will fail immediately.

The Marionette server itself needs to wait until Gecko has started and the server is ready (bug 1001322).
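
The "wait until the server is ready" step could look roughly like the sketch below, written here in Python for illustration. It only polls a TCP port until a connection succeeds. The helper name `wait_for_port`, the host, the timeout values, and the use of port 2828 are assumptions for this example, not taken from the harness. An open port alone may not prove the server is fully ready (for instance when the port is forwarded from a device), so a real check would likely also need to read the server's initial response.

```python
import socket
import time


def wait_for_port(host="localhost", port=2828, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or time runs out.

    Hypothetical helper: host, port, and timings are illustrative defaults.
    Returns True if a connection was accepted, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; success means
            # something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: nothing listening yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A runner could call `wait_for_port()` after a detected crash and only move on to the next test once it returns True.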

If the above fails, the JS client should also catch any exceptions at this point, since b2g may have frozen completely rather than restarting. The JS client should therefore ignore the exception and go on to stop and start the b2g process, returning it to a clean state before continuing testing.
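
A minimal sketch of that fallback, in Python rather than the JS client's own code: check whether the server came back by itself, swallow any exception from that check, and otherwise stop and start b2g before checking again. The function names `restart_b2g` and `recover`, the `is_ready` callback, and the settle delay are hypothetical. The sketch also assumes the device is reachable through `adb` and that `stop b2g` / `start b2g` are available in the device shell.

```python
import subprocess
import time


def restart_b2g(run=subprocess.check_call, settle=2.0):
    """Stop and start the b2g process over adb (assumed to be on PATH)."""
    run(["adb", "shell", "stop", "b2g"])
    time.sleep(settle)  # illustrative pause between stop and start
    run(["adb", "shell", "start", "b2g"])


def recover(is_ready, restart=restart_b2g):
    """Bring the device back to a testable state after a crash.

    is_ready: callable returning True once the Marionette server responds;
              it may raise if the server is unreachable.
    restart:  callable that restarts b2g.
    """
    try:
        if is_ready():
            return "restarted-by-itself"
    except Exception:
        # Deliberately ignored: a failure here may mean b2g is frozen
        # rather than restarting, so fall through to a forced restart.
        pass
    restart()
    if not is_ready():
        raise RuntimeError("b2g did not come back after stop/start")
    return "restarted-by-harness"
```

Passing the readiness check and the restart action in as callables keeps the decision logic testable without a device attached. A completely frozen device (one that needs a battery-out reboot) is not covered by this sketch.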
Depends on: 994764
Zac, I'm not really sure if I understand what you are saying in the description. It seems to be something different from "the test harness should detect crashes and run minidump_stackwalk", is that true?
Flags: needinfo?(zcampbell)
Yes, correct. I'm talking about everything that goes on after it has run minidump_stackwalk, in order to make sure we restore the device so it is ready to start the next test. For example, we need to either wait for the Marionette port to re-open or, if b2g has crashed and not restarted, start it again (and wait). If it has frozen completely (say the device needs a battery-out reboot), then we need to trigger that or handle it gracefully somehow.
There may be some other scenarios I've not thought of. The device can crash in a variety of ways.
Ok, in that case I think this is already fixed with the system I've set up. I did some testing by forcing crashes in bug 1045142 and the marionette-js-runner was able to run the next tests just fine. It might be good for you to try it out yourself though just in case I'm still not understanding you.
Can confirm that we can handle a crash, and we get the stack if SYMBOLS and MINIDUMP_STACKWALK are defined. We recover from crashes by default because we restart b2g for each test that gets run.
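
As an illustration of the "only if both variables are defined" condition, here is a small Python sketch that builds a minidump_stackwalk command line from the environment. The function name `stackwalk_command` and the dump path are hypothetical. It assumes MINIDUMP_STACKWALK holds the path to the binary and SYMBOLS holds a symbols directory, which may not match exactly how the harness interprets them.

```python
import os


def stackwalk_command(dump_path, env=None):
    """Return the minidump_stackwalk argv, or None if config is missing.

    Assumes MINIDUMP_STACKWALK is the binary path and SYMBOLS is a
    symbols directory; both must be set for a stack to be produced.
    """
    env = os.environ if env is None else env
    binary = env.get("MINIDUMP_STACKWALK")
    symbols = env.get("SYMBOLS")
    if not binary or not symbols:
        return None  # no stack without both variables
    # minidump_stackwalk takes the dump file followed by symbol path(s).
    return [binary, dump_path, symbols]
```

Returning None rather than raising lets a runner still report the crash when no symbols are configured, just without a symbolicated stack.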
Closed: 5 years ago
Resolution: --- → WORKSFORME
Product: Testing → Testing Graveyard