Closed Bug 1038868 Opened 10 years ago Closed 8 years ago

Use B2GDeviceRunner in the Marionette client

Categories

(Remote Protocol :: Marionette, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: davehunt, Unassigned)

References

Details

(Whiteboard: [affects=b2gdevice])

Attachments

(4 files, 8 obsolete files)

46 bytes, text/x-github-pull-request
Details | Review
175.62 KB, text/plain
Details
181.10 KB, text/plain
Details
19.53 KB, patch
ahal
: review+
Details | Diff | Splinter Review
With the latest mozrunner, we can now use B2GDeviceRunner when we're running against a physical device. This should allow us to remove some of the B2G utility methods from gaiatest and the marionette test runner mixins, and also provide a device manager instance to the tests as appropriate.
Blocks: 1038870
Blocks: 1039626
Attached patch Work in progress patch (obsolete) — Splinter Review
This is a work-in-progress patch, which is working for me locally. It does currently require the use of --app=b2g in order to determine that we're running against a B2G device. Could you please provide some feedback, Andrew? :)
Assignee: nobody → dave.hunt
Status: NEW → ASSIGNED
Attachment #8457024 - Flags: feedback?(ahalberstadt)
Attached patch gaiatest patch for wip (obsolete) — Splinter Review
This patch is for gaiatest, which works in association with the WIP patch.
Comment on attachment 8457024 [details] [diff] [review]
Work in progress patch

Review of attachment 8457024 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, this looks great! Thanks for doing this. DeviceRunner is pretty new, so don't be afraid to change things around if it makes sense. For example, maybe it isn't necessary to reboot the device in start (I'm doubtful, but it might be worth a try). That code has been there for a long time, dating back to when this used to live in b2gautomation.py.

::: testing/mozbase/mozrunner/mozrunner/base/device.py
@@ +21,5 @@
>      remote devices (or emulators), such as B2G.
>      """
>      def __init__(self, device_class, device_args=None, **kwargs):
>          process_args = kwargs.get('process_args', {})
> +        # TODO: Make streaming to stdout optional

Yeah, the mochitest structured logging patch actually has a fix that will make it so these values don't overwrite what gets passed in (which is very bad!). If that doesn't land soon, I'm going to cherry-pick it myself within the next couple of days as it will also be blocking me.

@@ +121,5 @@
>          self.check_for_crashes()
>  
>      def check_for_crashes(self):
>          dump_dir = self.device.pull_minidumps()
> +        return BaseRunner.check_for_crashes(self, dump_directory=dump_dir, test_name=self.last_test)

Good catch!
Attachment #8457024 - Flags: feedback?(ahalberstadt) → feedback+
(In reply to Andrew Halberstadt [:ahal] from comment #4)
> maybe it isn't necessary to reboot the device in start (I'm doubtful, but it
> might be worth a try).. that code has been there for a long time, dating
> back to when this used to live in b2gautomation.py.

I think if we're pushing a profile then it's necessary in order to pick up the new profile. We do have consumers of gaiatest that need to not restart between tests, so perhaps we could avoid the restart by indicating that we want to use the profile already on the device and skip the cleanup. This might be best as a subclass of B2GDeviceRunner.
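
Roughly what I have in mind, purely as a sketch (the class name and the exact hooks are hypothetical; this assumes start() is where the profile push and reboot happen):

    class NoRestartB2GDeviceRunner(B2GDeviceRunner):
        """Hypothetical: reuse the profile already on the device."""

        def start(self, *args, **kwargs):
            # Skip the profile push and reboot performed by the base start();
            # just attach to the already-running b2g process.
            self.device.connect()

        def cleanup(self):
            # Leave the device exactly as we found it.
            pass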

> ::: testing/mozbase/mozrunner/mozrunner/base/device.py
> @@ +21,5 @@
> >      remote devices (or emulators), such as B2G.
> >      """
> >      def __init__(self, device_class, device_args=None, **kwargs):
> >          process_args = kwargs.get('process_args', {})
> > +        # TODO: Make streaming to stdout optional
> 
> Yeah, the mochitest structured logging patch actually has a fix that will
> make it so these values don't overwrite what gets passed in (which is very
> bad!). If that doesn't land soon, I'm going to cherry-pick it myself within
> the next couple of days as it will also be blocking me.

Do you have the bug number so I can set it as a blocker?
Flags: needinfo?(ahalberstadt)
CCing Paul and Walter to provide info on the non-restart use case.
(In reply to Dave Hunt (:davehunt) from comment #2)
> It does currently require use of --app=b2g in order to determine that we're running
> against a B2G device.

As Zac pointed out on IRC, we could assume the app is b2g in gaiatest if no command line value has been set.
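
Something like this in the gaiatest runner would do it (option name assumed):

    # default to b2g when no --app value is given on the command line
    options.app = options.app or 'b2g'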
Depends on: 1040078
Depends on: 886570
Flags: needinfo?(ahalberstadt)
Depends on: 1051986
(In reply to Zac C (:zac) from comment #6)
> CCing Paul and Walter to provide info on the non-restart use case.

In our use case, we randomly pick test cases and trigger the gaiatest runner. Since we can't queue an unlimited number of test cases, we end up re-initializing the runner again and again.

Hopefully we can create the DeviceRunner in our wrapper and pass it to the runner, so that we can keep the instance and avoid rebooting the device.
Whiteboard: [affects=b2gdevice]
Depends on: 1057044
Depends on: 1059248
Attachment #8457024 - Attachment is obsolete: true
Attachment #8495960 - Flags: review?(ahalberstadt)
Attachment #8457028 - Attachment is obsolete: true
Attachment #8495963 - Flags: review?(zcampbell)
Attachment #8495963 - Flags: review?(rwood)
Attachment #8495963 - Flags: review?(ahalberstadt)
The mozilla-central patch should have no impact on our current testing as it only affects on-device tests. The Gaia patch depends on the mozilla-central patch, so we'll most likely want to release a new version of the marionette client ahead of the patch landing.
The patches also take care of crash reporting as a side-effect. I've included a unit test for crash detection and recovery. The MINIDUMP_STACKWALK environment variable must be set to the location of the stackwalk binary [1] in order to process the crash dump, and the --symbols command line option should be provided with a path or URL to the appropriate symbols file.

If the MINIDUMP_SAVE_PATH environment variable is set, any crash dump files will be saved to the location specified by its value.
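
For reference, this is roughly how those values end up being handed to mozcrash (a sketch; mozcrash also reads both environment variables itself when the arguments are omitted, and dump_dir/symbols_path/test_name are assumed to come from the harness):

    import os
    import mozcrash

    crashed = mozcrash.check_for_crashes(
        dump_directory=dump_dir,            # minidumps pulled from the device
        symbols_path=symbols_path,          # path or URL from --symbols
        stackwalk_binary=os.environ.get('MINIDUMP_STACKWALK'),
        dump_save_path=os.environ.get('MINIDUMP_SAVE_PATH'),
        test_name=test_name)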

[1] http://hg.mozilla.org/build/tools/file/tip/breakpad/
Attachment #8495963 - Flags: feedback?(pyang)
Comment on attachment 8495960 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v1.0

Review of attachment 8495960 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm! r+ with a couple questions and nits.

::: testing/marionette/client/marionette/marionette.py
@@ +18,5 @@
>  from decorators import do_crash_check
>  from keys import Keys
>  from marionette_transport import MarionetteTransport
>  
> +from mozrunner import B2GDeviceRunner

nit: just add this, comma separated, on the line below

@@ +474,5 @@
>                   gecko_log=None, homedir=None, baseurl=None, no_window=False, logdir=None,
>                   busybox=None, symbols_path=None, timeout=None, socket_timeout=360,
>                   device_serial=None, adb_path=None, process_args=None):
>          self.host = host
> +        self.port = self.remote_port = port

Should self.port be None to begin with and then get set for the first time when setup_port_forwarding is called?

@@ +548,5 @@
>  
> +        if app and app.lower() == 'b2g':
> +            # There's no way to know if the target is a device, so we try to
> +            # start a device runner and if that fails we assume the target is
> +            # not a device

If we get past all the if statements above, doesn't that pretty much only leave devices left?

::: testing/marionette/client/marionette/runner/base.py
@@ +574,4 @@
>                  'app_args': self.app_args,
>                  'bin': self.bin,
>                  'profile': self.profile,
> +                'gecko_log': self.gecko_log

nit: generally it's good form to leave trailing commas at the end of Python objects; this makes the diff smaller if someone adds a line later.
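
For example (illustrative):

    options = {
        'app_args': self.app_args,
        'bin': self.bin,
        'profile': self.profile,
        'gecko_log': self.gecko_log,  # trailing comma keeps future diffs small
    }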
Attachment #8495960 - Flags: review?(ahalberstadt) → review+
Comment on attachment 8495963 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/24475

I'm not super familiar with gaiatest, so mostly just looked at general python/structure stuff. But looks alright from what I can tell.
Attachment #8495963 - Flags: review?(ahalberstadt) → review+
Comment on attachment 8495963 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/24475

lgtm. I see that gaiatest gets the Marionette instance from its init parameters; we can create one, pass it to the runner, and avoid re-initialization. Thanks!
Attachment #8495963 - Flags: feedback?(pyang) → feedback+
Blocks: 1064798
Comment on attachment 8495963 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/24475

LGTM. Just a question in the PR regarding turning off the device screen on GaiaTestCase.teardown.
Attachment #8495963 - Flags: review?(rwood) → review+
Do these have to be landed simultaneously or can we land the marionette client side before we test/land the gaiatest changes?
Flags: needinfo?(dave.hunt)
(In reply to Zac C (:zac) from comment #17)
> Do these have to be landed simultaneously or can we land the marionette
> client side before we test/land the gaiatest changes?

See comment 11.
Flags: needinfo?(dave.hunt)
Blocks: 1040742
Comment on attachment 8495963 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/24475

r- as per two comments in the Github PR.
Attachment #8495963 - Flags: review?(zcampbell) → review-
I also found some issues when running against a local desktopb2g instance.

(am I running it correctly?)



In this case I started it with the device connected but not intending to use it.

1. Connect a running, marionette enabled flame device by usb, but make sure adb server is killed
2. Execute gaiatest using --binary and --profile and --restart to have it start desktopb2g
3. The runner will start the desktopb2g instance but then also reboot the flame. Then it all falls apart and all the tests will fail.

I guess it needs to be a bit more strict when binary and profile are passed in, so that it doesn't attempt any device commands.

Console trace:
(gecko-dev)zac@zac-X1-Carbon:~/Mozilla/gaia/tests/python/gaia-ui-tests$ gaiatest --testvars gaiatest/testvars_desktop.json --html-output=results/index.html  --binary /home/zac/desktopb2g/master/b2g/b2g-bin --profile /home/zac/Mozilla/gaia/profile/ gaiatest/tests/unit/test_kill.py --restart
Results will not be posted to Treeherder. Please set the following environment variables to enable Treeherder reports: TREEHERDER_KEY, TREEHERDER_SECRET
mozdevice {'os': 'flame-eng 4.4.2 KOT49H eng.cltbld.20141006.114623 test-keys'}
starting httpd
running webserver on http://10.246.27.94:46158/
SUITE-START | Running 1 tests
TEST-START | test_kill.py TestKill.test_kill
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
TEST-UNEXPECTED-ERROR | test_kill.py TestKill.test_kill | AttributeError: 'GeckoRuntimeRunner' object has no attribute 'device'

Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/marionette/client/marionette/marionette_test.py", line 249, in run
    self.setUp()
  File "/home/zac/Mozilla/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 747, in setUp
    self.device.start_b2g()
  File "/home/zac/Mozilla/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 567, in start_b2g
    self.marionette.port = runner.device.setup_port_forwarding(

TEST-INFO took 12785ms
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
TEST-START | test_kill.py TestKill.test_kill_multiple
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
TEST-UNEXPECTED-ERROR | test_kill.py TestKill.test_kill_multiple | AttributeError: 'GeckoRuntimeRunner' object has no attribute 'device'

Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/marionette/client/marionette/marionette_test.py", line 249, in run
    self.setUp()
  File "/home/zac/Mozilla/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 747, in setUp
    self.device.start_b2g()
  File "/home/zac/Mozilla/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 567, in start_b2g
    self.marionette.port = runner.device.setup_port_forwarding(

TEST-INFO took 2287ms
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'

SUMMARY
-------
passed: 0
failed: 2
todo: 0

FAILED TESTS
-------
test_kill.py test_kill.TestKill.test_kill
test_kill.py test_kill.TestKill.test_kill_multiple
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
Traceback (most recent call last):
  File "/home/zac/Mozilla/src/testing/mozbase/mozrunner/mozrunner/base/runner.py", line 194, in check_for_crashes
    test=test_name)
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
mozversion application_buildid: 20141001040205
mozversion application_changeset: 14665b1de5ee
mozversion application_display_name: B2G
mozversion application_name: B2G
mozversion application_repository: https://hg.mozilla.org/mozilla-central
mozversion application_version: 35.0a1
mozversion gaia_date: 1412163943
mozversion platform_buildid: 20141001040205
mozversion platform_changeset: 14665b1de5ee
mozversion platform_repository: https://hg.mozilla.org/mozilla-central
SUITE-END | took 19s
I also have some concern that when running the tests against a device it adds around 10-15 seconds per test. That's more than 10% in most cases, and it would add quite a lot of test run time on Jenkins CI.
(In reply to Zac C (:zac) from comment #20)
> I also found some issues around running against a local desktopb2g instances.
> 
> In this case I started it with the device connected but not intending to use
> it.

That sounds like a bug. I'll look into it.
(In reply to Zac C (:zac) from comment #21)
> I also have some concern that when running the tests against a device it
> adds around 10-15 seconds per test. That's more than 10% in most cases, it
> would add quite a lot of test run time on Jenkins CI.

I would expect an initial delay while the harness does a restart of b2g ahead of the first test, but it shouldn't have an impact on subsequent tests. I can do some timing comparisons and try to see where we're adding time.
(In reply to Dave Hunt (:davehunt) from comment #22)
> (In reply to Zac C (:zac) from comment #20)
> > In this case I started it with the device connected but not intending to use it.
> 
> That sounds like a bug. I'll look into it.

This was because I had an if where I needed an elif. After speaking on IRC we were also able to resolve the TypeError issues, which were likely due to the environment and perhaps an outdated mozcrash being used.
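
The shape of the bug, reconstructed (hypothetical, simplified):

    # With two independent ifs, passing --binary and --profile did not stop
    # the device branch from also running; elif makes them exclusive.
    if self.bin:
        runner_class = GeckoRuntimeRunner
    if app == 'b2g':  # bug: should have been elif
        runner_class = B2GDeviceRunner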
(In reply to Dave Hunt (:davehunt) from comment #23)
> (In reply to Zac C (:zac) from comment #21)
> > I also have some concern that when running the tests against a device it
> > adds around 10-15 seconds per test. That's more than 10% in most cases, it
> > would add quite a lot of test run time on Jenkins CI.
> 
> I would expect an initial delay while the harness does a restart of b2g
> ahead of the first test, but it shouldn't have an impact on subsequent
> tests. I can do some timing comparisons and try to see where we're adding
> time.

I ran the unit test suite (36 tests) before and after applying the patches. With the patches applied the suite took an additional 10 minutes (~16 seconds per test). Thinking about it, this is likely a combination of the change to the restart (should be negligible), the additional (one-off) restart at the beginning of the suite, and the additional overhead of checking for crashes regularly. As a check for crashes involves pulling any files from the device, and is performed frequently, I suspect it contributes the majority of the increase. I will continue to investigate.
I've profiled running just tests/unit/test_kill.py (two tests) with and without the patches applied. With the patches applied we gained ~49 seconds.

An additional 28 seconds was spent in gaia_test.py(setUp); however, our gaia_test.py(start_b2g) only increased by around 4 seconds. We have the additional restart that happens during the initialisation of the Marionette object, which has increased by 10 seconds.

We now also spend 13 additional seconds checking for crashes. Of this, we spent 5 seconds getting the minidumps directory and 8 seconds listing the remote files. Looking at mozrunner, we could certainly optimise this so we spend less time when checking for crashes on devices. I'll file a followup bug for this.
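
For example, something along these lines could avoid the expensive pull when there is nothing to process (a sketch; the attribute names are assumed):

    def check_for_crashes(self):
        # only pull minidumps when a remote listing shows any .dmp files
        remote_dir = self.device.app_ctx.remote_minidumps  # name assumed
        files = self.device.dm.listFiles(remote_dir)
        if not any(f.endswith('.dmp') for f in files):
            return False
        dump_dir = self.device.pull_minidumps()
        return BaseRunner.check_for_crashes(
            self, dump_directory=dump_dir, test_name=self.last_test)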

During these two tests we check for crashes 155 times, which does seem a little excessive to me. Malini: Do you know if we need to check so often? Also, do you have any other suggestions for reducing the tests run time?
Flags: needinfo?(mdas)
I've raised bug 1079774 to improve the crash detection on devices.
(In reply to Dave Hunt (:davehunt) from comment #26)
> I've profiled running just tests/unit/test_kill.py (two tests) with and
> without the patches applied. With the patches applied we gained ~49 seconds.
> 
> An additional 28 seconds was spent in gaia_test.py(setUp), however our
> gaia_test.py(start_b2g) only increased by around 4 seconds. We have the
> additional restart that happens during the initialisation of the Marionette
> object, which has increased by 10 seconds.
> 
> We now also spend 13 additional seconds checking for crashes. Of this, we
> spent 5 seconds getting the minidumps directory and 8 seconds listing the
> remote files. Looking at mozrunner, we could certainly optimise this so we
> spend less time when checking for crashes on devices. I'll file a followup
> bug for this.
> 
> During these two tests we check for crashes 155 times, which does seem a
> little excessive to me. Malini: Do you know if we need to check so often?
> Also, do you have any other suggestions for reducing the tests run time?

Definitely don't need to check that often. I think we should only check for crashes when we receive a connection-related exception (IOError, etc.).
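
The client already has a hook for this in the do_crash_check decorator; a minimal version of the idea (not the actual implementation) would be:

    import socket
    from functools import wraps

    def do_crash_check(func):
        """Only check for crashes when a connection-related error occurs."""
        @wraps(func)
        def _(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except (socket.error, IOError):
                # the process may have died; look for a minidump before
                # re-raising so the crash is reported close to its cause
                self.check_for_crash()
                raise
        return _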
Flags: needinfo?(mdas)
Depends on: 1081043
There are a couple of things still concerning me before I'm comfortable landing these patches. Each may need to be raised separately and potentially marked as blockers:

1. In at least two places if we detect a crash we abort the test run. As we have crash recovery (at least for gaiatest) we do not want to prevent all further tests from running in the event of a crash. See http://hg.mozilla.org/mozilla-central/file/29fbfc1b31aa/testing/marionette/client/marionette/runner/base.py#l189 and http://hg.mozilla.org/mozilla-central/file/29fbfc1b31aa/testing/marionette/client/marionette/runner/base.py#l808

2. The call to check_for_crash only returns true if a crash is detected by that call, so it's unlikely to return true if a crash was detected earlier on in the test. This means any logic that should execute if a crash has been detected at any point during the test (see above examples) may not work as expected.

3. I strongly suspect that our check for crashes can hit a race condition when the crash occurs immediately prior to the check. I have seen instances where the dmp file pulled from the device cannot be read. This might be unrelated, or it might be that the file was still being written to at the time. I have also seen a test immediately following a test that causes a crash to also detect a crash, suggesting that the crash files were still being written and are picked up during the setUp of the next test.

4. A crash occurring during a test does not appear to impact the test result. I expect that we would instead want to mark a test as failed if we witness a crash.

Malini and Andrew: Could you share your thoughts on the above concerns? I suspect we'll need changes to both the Marionette test runner and to mozrunner.
Flags: needinfo?(mdas)
Flags: needinfo?(ahalberstadt)
1) Not super familiar with this, but seems easy to do.

2) I don't really understand what you mean by this. Check for crash is very simple in that it returns true if there's a dmp file in the profile. So as long as we are calling check_for_crash in all the right places (i.e. on timeout, on test end, on process exit), and clean up the minidumps folder between tests, I don't see a problem here.

3) This is likely true, but isn't specific to gaiatest. Our desktop harnesses have been running like this for years and I don't think I've ever seen it happen. I think it might depend on how the harness is set up to use mozcrash. Typically we have an on_finish handler in mozprocess where we place the check_for_crashes call. When a crash happens, the gecko process won't actually exit until the minidump has finished being written. So we avoid any race conditions that way. I suspect if you are seeing this, then we aren't calling check_for_crashes in the right places or we aren't waiting for the process to exit before starting the next one.

4) Seems reasonable. This wasn't a concern for other suites since crash aborts the test run. There's already a CRASH status in mozlog's test_end method.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #33)
> 2) I don't really understand what you mean by this.. Check for crash is very
> simple in that it returns true if there's a dmp file in the profile. So as
> long as we are calling check_for_crash in all the right places (i.e on
> timeout, on test end, on process exit), and clean up the minidumps folder
> between tests, I don't see a problem here.

I mean we should distinguish when we want to check if there is a minidump, and when we want to check if we have detected a crash at any point during the test. For example the current code appears to prevent further tests from running if there is a minidump but not if a minidump has already been detected/reported/deleted. The Marionette client itself checks for crashes if certain exceptions are thrown, so it's not safe to assume a minidump will be waiting until the harness picks things up after the test.

> 3) This is likely true, but isn't specific to gaiatest. Our desktop
> harnesses have been running like this for years and I don't think I've ever
> seen it happen.. I think it might depend on how the harness is set up to use
> mozcrash. Typically we have an on_finish handler in mozprocess where we
> place the check_for_crashes call. When a crash happens, the gecko process
> won't actually exit until the minidump has finished being written. So we
> avoid any race conditions that way. I suspect if you are seeing this, then
> we aren't calling check_for_crashes in the right places or we aren't waiting
> for the process to exit before starting the next one.

We have the on_finish handler, but I'm causing a crash using stop with a SIGABRT. I suspect the forced crash is too close to the on_finish handler's execution.
(In reply to Dave Hunt (:davehunt) from comment #34)
> (In reply to Andrew Halberstadt [:ahal] from comment #33)
> > 2) I don't really understand what you mean by this.. Check for crash is very
> > simple in that it returns true if there's a dmp file in the profile. So as
> > long as we are calling check_for_crash in all the right places (i.e on
> > timeout, on test end, on process exit), and clean up the minidumps folder
> > between tests, I don't see a problem here.
> 
> I mean we should distinguish when we want to check if there is a minidump,
> and when we want to check if we have detected a crash at any point during
> the test. For example the current code appears to prevent further tests from
> running if there is a minidump but not if a minidump has already been
> detected/reported/deleted. The Marionette client itself checks for crashes
> if certain exceptions are thrown, so it's not safe to assume a minidump will
> be waiting until the harness picks things up after the test.
> 
> > 3) This is likely true, but isn't specific to gaiatest. Our desktop
> > harnesses have been running like this for years and I don't think I've ever
> > seen it happen.. I think it might depend on how the harness is set up to use
> > mozcrash. Typically we have an on_finish handler in mozprocess where we
> > place the check_for_crashes call. When a crash happens, the gecko process
> > won't actually exit until the minidump has finished being written. So we
> > avoid any race conditions that way. I suspect if you are seeing this, then
> > we aren't calling check_for_crashes in the right places or we aren't waiting
> > for the process to exit before starting the next one.
> 

If I understand correctly, if we implement 1) then shouldn't we have consistent behaviour when checking for a crash, that is, if we see a crash, mark the test as a failure and continue?
Flags: needinfo?(mdas)
Blocks: 1087687
(In reply to Malini Das [:mdas] from comment #35)
> If I understand correctly, if we implement 1) then shouldn't we have
> consistent behaviour when checking for a crash, that is, if we see a crash,
> mark the test as a failure and continue?

That sounds like a combination of 1) and 4). In order to implement this we could remove the two places where we currently try to stop the remaining tests on crash, and indicate that a crash occurred when the test ends. I would also suggest that this solves 2), because we'll be removing that logic and instead using the runner.crashed value to determine if any crashes were detected in the life of a test. It doesn't help with 3); however, perhaps a small sleep in the unit test that demonstrates this is a reasonable trade-off for the coverage.

I'll work on an updated patch with the suggestions implemented.
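
For 2), the essence of what I have in mind is to track crashes cumulatively on the runner (a sketch; the or-0 guard is there because older mozcrash versions may return None):

    import mozcrash

    class BaseRunner(object):
        def __init__(self, symbols_path=None):
            self.symbols_path = symbols_path
            self.crashed = 0  # cumulative over the runner's lifetime

        def check_for_crashes(self, dump_directory, test_name=None):
            crash_count = mozcrash.check_for_crashes(
                dump_directory, self.symbols_path, test_name=test_name) or 0
            self.crashed += crash_count
            return crash_count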
I have a new working patch with the changes applied. It still requires some work and testing, but I've compared running the 36 unit tests with and without the patch to see what the impact is on the running time. With the patch applied the unit tests took ~3 minutes longer, which works out at ~5 seconds extra per test. Given that ~20 seconds of this is actually before we start running the first test, I think this is an acceptable increase that shouldn't impact our current suites considerably, especially given the advantage that the improved crash handling will provide us. What do you think, Zac?
Flags: needinfo?(zcampbell)
Yes I agree, that's an acceptable compromise. There are other places we can explore to make that time back.
Flags: needinfo?(zcampbell)
Comment on attachment 8512543 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v2.0

Try is looking good so far, so let's mark this for another review.
Flags: needinfo?(dave.hunt)
Attachment #8512543 - Flags: review?(ahalberstadt)
Comment on attachment 8512543 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v2.0

Review of attachment 8512543 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments (especially second one) addressed.

::: testing/marionette/client/marionette/marionette.py
@@ +474,5 @@
>                   gecko_log=None, homedir=None, baseurl=None, no_window=False, logdir=None,
>                   busybox=None, symbols_path=None, timeout=None, socket_timeout=360,
>                   device_serial=None, adb_path=None, process_args=None):
>          self.host = host
> +        self.port = self.remote_port = port

Hard to say for sure without full context, but looks like there's no reason to set self.port at this point anymore.

@@ +755,3 @@
>          if returncode is not None:
> +            print ('PROCESS-CRASH | %s | abnormal termination with exit code %d' % (name, returncode))
> +        assert(crashed == 0)

I think I like the idea of using an exception here, but I don't think this should be an assertion. Assertions can be disabled by passing -O to Python, which means we wouldn't detect crashes properly. Probably not a very common case, but I'd rather just raise some sort of exception if crashed > 0.
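
Something like this (untested sketch) would do:

    if crashed > 0:
        raise Exception('%d crash(es) detected during the test run' % crashed)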

::: testing/marionette/client/marionette/runner/base.py
@@ -579,4 @@
>                  'app_args': self.app_args,
>                  'bin': self.bin,
>                  'profile': self.profile,
> -                'gecko_log': self.gecko_log,

optional: Personally I like trailing commas because it makes it easier to add an entry.
Attachment #8512543 - Flags: review?(ahalberstadt) → review+
Thanks for the review. I've addressed two of your comments; in particular, check_for_crash has been reverted to return a boolean, but it will now return true if a crash has occurred at any point in the lifetime of the runner, rather than just at that moment. This patch also allows us to mark crashing tests as expected to fail, either by decorator or manifest.

I didn't change the self.port because otherwise this would not be set if the if/elif block was not executed.

Try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=963580128383
Attachment #8512543 - Attachment is obsolete: true
Attachment #8514205 - Flags: review?(ahalberstadt)
Comment on attachment 8514205 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v3.0

Review of attachment 8514205 [details] [diff] [review]:
-----------------------------------------------------------------

This looks a lot less frightening :), thanks!

::: testing/marionette/client/marionette/runner/base.py
@@ -743,5 @@
>              for failed_test in self.failures:
>                  self.logger.info('%s' % failed_test[0])
>  
> -        try:
> -            self.marionette.check_for_crash()

I assume the check_for_crash calls in this file are no longer needed because they are in MarionetteTestCase instead?
Attachment #8514205 - Flags: review?(ahalberstadt) → review+
(In reply to Andrew Halberstadt [:ahal] from comment #43)
> Comment on attachment 8514205 [details] [diff] [review]
> Use B2GDeviceRunner in the Marionette client. v3.0
> 
> Review of attachment 8514205 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks a lot less frightening :), thanks!

Yeah, this is a much nicer approach. Thanks for helping me arrive here. :)

> ::: testing/marionette/client/marionette/runner/base.py
> @@ -743,5 @@
> >              for failed_test in self.failures:
> >                  self.logger.info('%s' % failed_test[0])
> >  
> > -        try:
> > -            self.marionette.check_for_crash()
> 
> I assume the check_for_crash's in this file are no longer needed because
> they are in MarionetteTestCase instead?

That's correct.

So my try run appears to have replicated a crash, and all subsequent tests failed because we don't restart between tests so runner.crashed will remain true. I don't think this is really a problem with this patch though. I tested a crash in Firefox desktop without the patch and we stop the entire test run on crash, so the difference now is that we'll have more failures as a result of a crash. When I replicated with my patch I got errors about starting a Marionette session, so the try run is a little different for some reason (are there types of crashes where we'll have a minidump but not lose the browser/session?)

I think a subsequent patch could detect that a previous test has crashed and, in that scenario, restart the runner during setUp. We kind of already do this with gaiatest for B2G, but I think it makes sense to have this in the Marionette test runner, where we can then override the behaviour for B2G. I think this is pretty much bug 1062676.
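
Something like the following in the test case, purely as a sketch (the attribute path and the recovery details are hypothetical):

    def setUp(self):
        # if a previous test crashed the process, restart the runner so the
        # remaining tests can recover (bug 1062676)
        runner = self.marionette.instance.runner  # attribute path assumed
        if runner.crashed:
            runner.stop()
            runner.start()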

Andrew: Does this sound good/make sense? If so, I'll go ahead and land this patch early next week and will closely follow it up with the gaiatest patch and release of the marionette-client to PyPI.
Flags: needinfo?(ahalberstadt)
(In reply to Dave Hunt (:davehunt) from comment #44)
> (In reply to Andrew Halberstadt [:ahal] from comment #43)
> > Comment on attachment 8514205 [details] [diff] [review]
> > Use B2GDeviceRunner in the Marionette client. v3.0
> > 
> > Review of attachment 8514205 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > This looks a lot less frightening :), thanks!
> 
> Yeah, this is a much nicer approach. Thanks for helping me arrive here. :)
> 
> > ::: testing/marionette/client/marionette/runner/base.py
> > @@ -743,5 @@
> > >              for failed_test in self.failures:
> > >                  self.logger.info('%s' % failed_test[0])
> > >  
> > > -        try:
> > > -            self.marionette.check_for_crash()
> > 
> > I assume the check_for_crash's in this file are no longer needed because
> > they are in MarionetteTestCase instead?
> 
> That's correct.
> 
> So my try run appears to have replicated a crash, and all subsequent tests
> failed because we don't restart between tests so runner.crashed will remain
> true. I don't think this is really a problem with this patch though. I
> tested a crash in Firefox desktop without the patch and we stop the entire
> test run on crash, so the difference now is that we'll have more failures as
> a result of a crash.

Can you post a link to the log? You might want to run this by the sheriffs to make sure it isn't being unnecessarily noisy.

> When I replicated with my patch I got errors about
> starting a Marionette session, so the try run is a little different for some
> reason (are there types of crashes where we'll have a minidump but not lose
> the browser/session?)

Yes, if e10s is enabled and MOZ_CRASHREPORTER_SHUTDOWN is unset, a child process crash should trigger this behaviour.

> Andrew: Does this sound good/make sense? If so, I'll go ahead and land this
> patch early next week and will closely follow it up with the gaiatest patch
> and release of the marionette-client to PyPI.

Yep makes sense, I'd just run the new log by the sheriffs.
Flags: needinfo?(ahalberstadt)
The log is here:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2827460&repo=try

So without this patch I'd expect that log would stop after the PROCESS-CRASH and subsequent TEST-UNEXPECTED-FAIL. Now we may have situations where a crash is followed by the remaining tests all failing either due to an assertion failure checking that no crashes have occurred (seen in the above log) or perhaps an exception stating that there's no Marionette session. We could fix this to recover from the crash in bug 1062676. Does this sound okay?
Flags: needinfo?(ryanvm)
Spamtastic failures tend to be a pain. I'd much prefer that the suite abort on a crash.
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #47)
> Spamtastic failures tend to be a pain. I'd much prefer that the suite abort
> on a crash.

We already have crash recovery in the Gaia tests, so aborting the suite on crash would prevent us from valuable test coverage.
This restores the behaviour of aborting the testrun on crash. We can override this in GaiaTest, where we have crash recovery.

Try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=6bb5465e0af9
Attachment #8514205 - Attachment is obsolete: true
Attachment #8516752 - Flags: review?(ahalberstadt)
Comment on attachment 8495963 [details] [review]
Link to Github pull-request: https://github.com/mozilla-b2g/gaia/pull/24475

Clearing previous feedback/reviews and requesting review for updated patch.
Attachment #8495963 - Flags: review?
Attachment #8495963 - Flags: review-
Attachment #8495963 - Flags: review+
Attachment #8495963 - Flags: feedback+
I've been doing some testing and found a clash between desktopb2g and a device when both were running/attached to the host computer. adb commands would run against the device and tests against desktopb2g (which is bound to the port).

Dave and I debugged it and found that mozdevice's forward was failing silently (bug 1094164). With that resolved, the test run fails before it starts, forcing the user to disconnect one of the running instances.
Depends on: 1094164
Generally I'm happy with how this goes, but I would like to change the default, or have the option to override it, so the b2g process isn't stopped at the end of the test.
Our workflow is quite heavily rooted in leaving the device running after the test has finished, but this is mostly a requirement for debugging test failures or while writing tests, rather than when just executing the test suite.
Comment on attachment 8516752 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v3.1

Review of attachment 8516752 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, lgtm!
Attachment #8516752 - Flags: review?(ahalberstadt) → review+
(In reply to Zac C (:zac) from comment #52)
> Generally I'm happy with how this goes, but I would like to change the
> default or have the option to override it stopping the b2g process at the
> end of the test.

I agree that when we're running tests against a device it doesn't make sense to stop the process when the Marionette object is destroyed. I initially thought we could just skip the cleanup call if the target is a device (but not an emulator), but this also includes some code for restoring the original profile. I'm not convinced that's working as expected, though, as I've experienced an unbootable device after running tests... :/

Andrew: Do you have any thoughts on how we could leave devices in a usable state after tests have completed? Should the backup/restore be optional? Are you aware of the backup/restore being used anywhere?
Flags: needinfo?(ahalberstadt)
I worked out the issue with the device being unusable. We were calling runner.stop() to restart between tests, which doesn't call cleanup() and therefore does not restore the profile. On the next start() this means we back up the remote profiles.ini over the existing backup, thereby losing the ability to restore the original profile. I've solved this by calling cleanup() instead of stop(), but a similar issue could happen if a test aborts or is interrupted by the user. In that case we'll likely remain in the pushed profile, which isn't as bad as a device that won't boot.
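
In other words (simplified):

    # before the fix, between tests:
    runner.stop()     # kills b2g but does NOT restore the device profile
    runner.start()    # backs up the already-modified profile over the original

    # after the fix:
    runner.cleanup()  # restores the original profile first
    runner.start()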
It seems like stop() should probably call cleanup() instead of the other way around. I'm not sure if it's worth fixing though, as there are likely places that depend on this behaviour.

Also this parameter should control whether the profile gets restored or not afterwards:
http://dxr.mozilla.org/mozilla-central/source/testing/mozbase/mozrunner/mozrunner/devices/base.py#25

I'm not guaranteeing that it works exactly as advertised though :/
Flags: needinfo?(ahalberstadt)
This patch restarts the remote b2g service after cleanup, which means a completed test run will restart a device into a usable state with the original profile and settings. Testing against running emulators will also restart once the cleanup is completed.

There are scenarios where cleanup may fail, such as a keyboard interrupt or device crash. In these cases the profile will not be restored. We may want to add an option to skip the cleanup and leave the target in the state at the end of the last run test, however I feel this should be raised in a separate bug.

Try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=7799fd930f1a
Attachment #8519926 - Flags: review?(ahalberstadt)
The try run suffered a timeout, probably related to bug 1096337. Here's a new try run:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=e06a7e99d170
(In reply to Dave Hunt (:davehunt) from comment #58)
> There are scenarios where cleanup may fail, such as a keyboard interrupt or
> device crash. In these cases the profile will not be restored. We may want
> to add an option to skip the cleanup and leave the target in the state at
> the end of the last run test, however I feel this should be raised in a
> separate bug.

We could piggy back off of the mozprofile 'restore' parameter: http://dxr.mozilla.org/mozilla-central/source/testing/mozbase/mozprofile/mozprofile/profile.py#41

Another bug sounds fine to me though.
Comment on attachment 8519926 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v3.2

Review of attachment 8519926 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, r+ with the minor review comment fixed.

::: testing/mozbase/mozrunner/mozrunner/devices/base.py
@@ +257,5 @@
>          # Remove the test profile
>          self.dm.removeDir(self.app_ctx.remote_profile)
>  
> +        # Restart the application
> +        self.app_ctx.start_application()

Please wrap this in an "if not emulator".
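
i.e. something like (however the emulator check is spelled in this class):

    # restart only on physical devices; emulator profiles aren't persistent
    # and the emulator is killed after the run anyway
    if not self.is_emulator:  # attribute assumed
        self.app_ctx.start_application()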
Attachment #8519926 - Flags: review?(ahalberstadt) → review+
Comment on attachment 8519911 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v3.2

Afaict, this is the same patch.
Attachment #8519911 - Attachment is obsolete: true
Attachment #8519911 - Flags: review?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #61)
> Comment on attachment 8519926 [details] [diff] [review]
> Use B2GDeviceRunner in the Marionette client. v3.2
> 
> Review of attachment 8519926 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, r+ with the minor review comment fixed.
> 
> ::: testing/mozbase/mozrunner/mozrunner/devices/base.py
> @@ +257,5 @@
> >          # Remove the test profile
> >          self.dm.removeDir(self.app_ctx.remote_profile)
> >  
> > +        # Restart the application
> > +        self.app_ctx.start_application()
> 
> Please wrap this in an "if not emulator".

I figured that if we're running tests against an emulator that's already running, we'd want to restart the application. This does mean that when we let Marionette start the emulator, we restart the application and then immediately kill the emulator process, but I didn't see that as an issue. Could you confirm that you don't want the application restarted on the emulator after restoring the profile, etc.?
Flags: needinfo?(ahalberstadt)
I guess it doesn't hurt, but changes to the profile on the emulator aren't persistent anyway, and we kill the emulator after our test runs, so it seems like a waste of time to do.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #64)
> I guess it doesn't hurt, but changes to the profile on the emulator aren't
> persistent anyway, and we kill the emulator after our test runs, so it seems
> like a waste of time to do.

As discussed on IRC we don't kill the emulator if we didn't start it, and the restart of the application isn't expected to have much impact when we do kill the emulator.

Landed on inbound with a version bump for the Marionette client:
https://hg.mozilla.org/integration/mozilla-inbound/rev/92fc5580994a

We'll need to release to PyPI once this reaches mozilla-central, and then we'll need to revisit the Gaia patch to update it to use the new client.
Keywords: leave-open
Retriggers on inbound are strongly pointing to this as the cause of a recent spike in OSX Marionette crashes like the log below:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3821145&repo=mozilla-inbound

Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/053dba7ee315
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #66)
> Retriggers on inbound are strongly pointing to this as the cause of a recent
> spike in OSX Marionette crashes like the log below:
> https://treeherder.mozilla.org/ui/logviewer.
> html#?job_id=3821145&repo=mozilla-inbound
> 
> Backed out.
> https://hg.mozilla.org/integration/mozilla-inbound/rev/053dba7ee315

This would suggest that we're detecting crashes but not stopping the test suite. It also suggests that the crash we're detecting does not kill the browser instance. I wouldn't expect that we've increased the number of crashes; rather, we've improved our ability to detect them, with the side effect of not being able to recover from them. I also notice that we're not passing the symbols or setting the minidump stackwalk binary, so the crash details are severely limited.
(In reply to Dave Hunt (:davehunt) from comment #67)
> (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #66)
> > Retriggers on inbound are strongly pointing to this as the cause of a recent
> > spike in OSX Marionette crashes like the log below:
> > https://treeherder.mozilla.org/ui/logviewer.
> > html#?job_id=3821145&repo=mozilla-inbound
> > 
> > Backed out.
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/053dba7ee315
> 
> This would suggest that we're detecting crashes but not stopping the test
> suite. It also suggests that the crash we're detecting does not kill the
> browser instance. I wouldn't expect that we've increased the number of
> crashes, but instead our ability to detect them with a side-effect of not
> being able to recover from them. I also notice that we're not passing the
> symbols or setting the minidump stacktrace, so the crash details are
> severely limited.

Here are a couple of examples of similar crashes without this patch that go completely unnoticed: https://treeherder.mozilla.org/logviewer.html#?job_id=3837945&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=3812049&repo=mozilla-inbound

In these cases the crash is reported, the suite stops, but the tests all pass. I think the worst thing introduced by this bug's patch is that we continue running the tests after the crash is detected and therefore get a lot of extra noise. In my opinion failing on the crash is a good thing.
I've raised bug 1098258 to enable processing of these crash dumps.
(In reply to Dave Hunt (:davehunt) from comment #68)
> (In reply to Dave Hunt (:davehunt) from comment #67)
> > (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #66)
> > > Retriggers on inbound are strongly pointing to this as the cause of a recent
> > > spike in OSX Marionette crashes like the log below:
> > > https://treeherder.mozilla.org/ui/logviewer.
> > > html#?job_id=3821145&repo=mozilla-inbound
> > > 
> > > Backed out.
> > > https://hg.mozilla.org/integration/mozilla-inbound/rev/053dba7ee315
> > 
> > This would suggest that we're detecting crashes but not stopping the test
> > suite. It also suggests that the crash we're detecting does not kill the
> > browser instance. I wouldn't expect that we've increased the number of
> > crashes, but instead our ability to detect them with a side-effect of not
> > being able to recover from them. I also notice that we're not passing the
> > symbols or setting the minidump stacktrace, so the crash details are
> > severely limited.
> 
> Here's a couple of examples of a similar crashes without this patch that go
> completely unnoticed:
> https://treeherder.mozilla.org/logviewer.html#?job_id=3837945&repo=mozilla-
> inbound
> https://treeherder.mozilla.org/logviewer.html#?job_id=3812049&repo=mozilla-
> inbound
> 
> In these cases the crash is reported, the suite stops, but the tests all
> pass. I think the worst thing introduced by this bug's patch is that we
> continue running the tests after the crash is detected and therefore get a
> lot of extra noise. In my opinion failing on the crash is a good thing.

Josh, Steven: could you help find someone who can fix the Mac crashes?
Flags: needinfo?(smichaud)
Flags: needinfo?(joshmoz)
(In reply to Carsten Book [:Tomcat] from comment #70)
> Josh, Steven: could you help here finding someone who can fix the Mac
> Crashes ?

I think we need to open new bugs for any crashes this bug exposes. I'd also rather not block landing this for a crash fix, so would like to know what other options we might have.
This version of the patch restores the behaviour of terminating the test suite when a crash is detected outside of stopTest. This reduces the noise after a crash is detected. See the try run here, including an occurrence of the intermittent crash:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c6b7257d8bed
Attachment #8516752 - Attachment is obsolete: true
Attachment #8519926 - Attachment is obsolete: true
Attachment #8522190 - Flags: review?(ahalberstadt)
(In reply to Dave Hunt (:davehunt) from comment #71)
> (In reply to Carsten Book [:Tomcat] from comment #70)
> > Josh, Steven: could you help here finding someone who can fix the Mac
> > Crashes ?
> 
> I think we need to open new bugs for any crashes this bug exposes. I'd also
> rather not block landing this for a crash fix, so would like to know what
> other options we might have.

If we have commitment from a dev to debug and potentially resolve the crash then we could disable the test for treeherder/mac.
But disabling the test without that commitment in advance would make me worry the test would just get ignored and the crash never attended to.
(In reply to Dave Hunt (:davehunt) from comment #71)
> I think we need to open new bugs for any crashes this bug exposes. I'd also
> rather not block landing this for a crash fix, so would like to know what
> other options we might have.

It's probably up to the sheriffs and how frequent the crash is. I've run into similar situations before (where my new change exposed pre-existing issues) and had to add in hacks to keep tbpl happy. It's a sad state of affairs.
Comment on attachment 8522190 [details] [diff] [review]
Use B2GDeviceRunner in the Marionette client. v3.3

Review of attachment 8522190 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but make sure to talk to the sheriffs before re-landing. Even if this patch doesn't cause the crash, the sheriffs are well within their right to back it out anyway. So let's figure out what the best way forward is first.
Attachment #8522190 - Flags: review?(ahalberstadt) → review+
It's a sad state of affairs (no disagreement there), but unfortunately the visibility policy is clear on this:
https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy

If this patch causes Mn on OSX to no longer meet visibility standards, we'll either need to back it out (like we did yesterday) or hide the suite. Leaving it failing a large percentage of the time in production because it wasn't directly your patch's fault is unfortunately not an option.

If it were me, I'd work to get symbolification fixed first before this re-lands so you know which team to ping about it (no reason that couldn't be sorted out on Try?). Then if you want to re-land and let the suite get hidden by default, we can file the appropriate bugs, mark the deps, and let the various stakeholders prioritize getting it fixed.

I'll also point out that as resource-constrained as we are on OSX, we're probably not going to want to stomach leaving them running but hidden for very long. We don't have excess capacity to waste like that.
(In reply to comment #70)

I'm not going to have time to investigate these crashes anytime soon.  And I suspect Josh is in the same boat.

Furthermore, I suspect these crashes aren't really Mac-specific, even if they're disproportionately triggered on the Mac.  In any case you don't appear to have any information about them.

So here's what you guys need to do:

1) Open a bug, and provide as much information as possible -- hopefully including Socorro and Apple crash reports and reliable STR.

2) Assign one of your own people to the bug, to investigate it to the point that it requires Mac-specific expertise to resolve -- if in fact that point is ever reached.
Flags: needinfo?(smichaud)
Flags: needinfo?(joshmoz)
Depends on: 1098258
No longer depends on: 1040078
Thanks to ahal's work on bug 1098258 we have crash details. A log showing the crash can be seen here: https://treeherder.mozilla.org/ui/logviewer.html#?job_id=66196&repo=ash

It appears to be already raised in bug 1090921 so setting that as a blocker.
Depends on: 1090921
(In reply to Dave Hunt (:davehunt) from comment #78)
> Thanks to ahal's work on bug 1098258 we have crash details. A log showing
> the crash can be seen here:
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=66196&repo=ash
> 
> It appears to be already raised in bug 1090921 so setting that as a blocker.

a) This is failing in an entirely different test:
test_visibility.py TestVisibility.testShouldAllowTheUserToTellIfAnElementIsDisplayedOrNot
Bug 1090921 is always in test_speech_queue.html. It's possible both are due to some underlying bug in the IPC code or in how it's used.
b) No assertion is flagged in the run you gave above, as they are in the speech test, though the line number matches.

Likely this is a locking/timing issue with IPC, as these aren't really in code that's directly part of speech or the test above. Possibly a timing/sequence-of-shutdown issue. Adding some CCs there.
Blocks: 1056001
Blocks: 1138569
This probably still makes sense for any Python-based testing on Firefox OS devices, but I'm no longer working on it.
Assignee: dave.hunt → nobody
Status: ASSIGNED → NEW
Attachment #8495963 - Flags: review?
B2G-related, so closing
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Product: Testing → Remote Protocol