Improve mozdevice's test runner

RESOLVED WONTFIX

Status

Testing
Mozbase
RESOLVED WONTFIX
5 years ago
6 months ago

People

(Reporter: mcote, Assigned: gbrown)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

5 years ago
mozdevice's test runner is interesting and works well, except for two main problems that occur when MockDevice receives unexpected or insufficient data:

1. It's often not immediately obvious what test has failed.
2. The process hangs instead of terminating nicely.

For example, bug 880001 exhibits these behaviours.

I think, if done right, we could solve both these problems at the same time.
(Reporter)

Updated

5 years ago
Whiteboard: [mentor=mcote][lang=py]

Comment 1

5 years ago
Hello I want  to start contributing to mozilla automation .I would like this bug to be assigned this bug as my first bug .Can you please guide me a bit on where to start with this?
(In reply to shailrishabh from comment #1)
> Hello I want  to start contributing to mozilla automation .I would like this
> bug to be assigned this bug as my first bug .Can you please guide me a bit
> on where to start with this?

I'm not actually sure how to get us to print out more verbose information about which test is running. The actual test runner is here, maybe that's a good place to start:

https://github.com/mozilla/mozbase/blob/master/test.py

The code that is causing the hangs is here:

https://github.com/mozilla/mozbase/blob/master/mozdevice/tests/sut.py

You could probably tackle the two issues separately. Be warned that this bug is almost certainly not trivial to fix: there's quite a few moving parts involved. But if you want to jump in at the deep end... :)
(Reporter)

Comment 3

5 years ago
Yes, this bug is not trivial; it will require either knowledge about sockets and threads or the willingness to learn about sockets and threads. :) That said, it is fairly self contained in that it only really involves a few files.  If you're up for a challenge, the first steps would be to check out mozbase from github into a virtualenv, run "python setup_development.py" inside the virtualenv to install everything, and then run the tests with "python test.py mozdevice/tests/manifest.ini". Once you get that far, we can talk about how to fix this. :)
(Reporter)

Comment 4

5 years ago
Talking about this in IRC, I just wanted to note that I *think* the main problem here is that when the wrong (or no) command is received, the test exception is raised in the server thread (the one started in mozdevice/tests/sut.py).  The main thread never sees this exception.  I think the fix would be to somehow, maybe by some sort of queue or something, send this exception back to the main thread, and have the main thread raise it.  This would cause a regular test failure, so the test should abort and the name printed out, as usual with the unittests module.

Comment 5

5 years ago
Hello I was successful in running all the tests and installation part .Where should I proceed from here?
(Reporter)

Comment 6

5 years ago
Great!  The next step would be to make sure you understand how these tests are being run.  Take a look at mozdevice/tests/sut.py and mozdevice/tests/sut_basic.py, and try to modify the test so that you get an exception from the MockAgent--the easiest way to do this would be to modify one of the expected commands, for example changing "testroot" to "unexpected" in BasicTest.test_init_err().  MockAgent expects particular commands and fails if they don't match.  It is there that we run into problems, since the server thread raises an exception, but it isn't reported on the main thread.

Also feel free to go to irc.mozilla.org #ateam and ask questions. :)

Comment 7

5 years ago
hi there.

what if we will wrap self.assertEqual with try/except and put AssertionError into some list and then when in call of wait method of mock object will check if whether this list is empty or not and if it's not empty will raise stored exception? will this handle the problem?
(Reporter)

Comment 8

5 years ago
Yes, I think that would work, although you probably don't need a list--you're just keeping one exception around.
Whiteboard: [mentor=mcote][lang=py]

Comment 9

5 years ago
I've tested that approach, it worked well, but not in a right way because all of the commands were executed before exception was propagated in the main thread. 

I found that the main difficulty in that task is to handle exception and somehow tell  devicemanager that it should not try to reconnect to SUT, but I came only to idea to send AGENT-WARNING back to devicemanager which causes DMError. 

Well, as this bug is not considered as mentored one or good first bug anymore, I'll probably step aside :)
(Reporter)

Comment 10

5 years ago
Heh feel free to keep at it if you want--I just realized that this is not the easiest of bugs.  I can certainly mentor you in this.  Did not mean to dissuade you. :)

I think setting the retryLimit to 1 will fix the second problem you mentioned.  I think some good synchronization would fix the first.

Feel free to attach a patch in progress and request feedback for me if you want some more specific advice. :)

Comment 11

5 years ago
Hi Mark!

Where do I start? It looks like it is not hard to propagate exception from child thread to main thread. I thought about catching exceptions in child thread, storing them in some variable (perhaps, MockAgent attribute) and re-throw them in wait method of MockAgent class (after self.thread.join() call). 

Here we have two options: try to fast fail test or continue test execution. For me first option is what we want to, but sadly it's where my troubles start. I don't know how correctly handle exception with devicemanager. It expects some response from SUT and won't take it easy when it doesn't get it. So it hangs for 5 seconds, then for 300 seconds timeout. 

You proposed to change retryLimit, but: a) it's not quite possible from _serve_thread, b) it leads to DMError and thus test is marked as error rather than failure.

Any suggestions?
(Reporter)

Comment 12

5 years ago
Hm yeah I see what you mean.  At the risk of this devolving into a series of hacks, I think we will still need a retryLimit of 1, but you are right that that shouldn't be set in _serve_thread() (or actually in the MockAgent at all).  I think it's probably best to move the creation of DeviceManagerSUT out of the individual tests and into some common function, where we can set necessary testing flags.  If needed, tests could pass in extra, test-specific args to this factory function.  That should take care of the retryLimit problem.

The DMError is indeed more difficult.  The only thing I can think of is to catch DMErrors but raise them only if there is no exception from the MockAgent.  Thus unittest errors would take precedence, since they should be more informative, but we would still catch DMErrors.

Thanks for sticking with this and sorry it's even more complicated than I originally thought!  You're doing a great job at characterizing the problems, even if we don't have a full solution yet. :)
(Assignee)

Updated

6 months ago
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.