Refactor mozrunner for a better class structure

RESOLVED FIXED

Status

Testing
Mozbase
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: whimboo, Assigned: whimboo)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 7 obsolete attachments)

(Assignee)

Description

4 years ago
The current mozrunner is not that well implemented given the class structure and documentation. There is a bit of overlap and unnecessary code. I stepped over it while working on bug 959551. A patch should be up soon.
(Assignee)

Comment 1

4 years ago
Created attachment 8360625 [details] [diff] [review]
patch v1

Patch for refactoring. It has been tested on OSX which is kinda broken for manifestparser tests. I will run tests on Linux before I ask for review.
(Assignee)

Comment 2

4 years ago
Comment on attachment 8360625 [details] [diff] [review]
patch v1

Review of attachment 8360625 [details] [diff] [review]:
-----------------------------------------------------------------

Andrew, would you mind reviewing this code? Not sure if you think it will be necessary to run this patch through try. Given that I had a lot of trouble lately picking changes for mozilla-central, could you push it in case it is important for you? I would really appreciate that!
Attachment #8360625 - Flags: review?(ahalberstadt)
(Assignee)

Comment 3

4 years ago
Created attachment 8360644 [details] [diff] [review]
patch v1.1

Missed to fully revert a temporary change I did in runner.py. This is fixed now.
Attachment #8360625 - Attachment is obsolete: true
Attachment #8360625 - Flags: review?(ahalberstadt)
Attachment #8360644 - Flags: review?(ahalberstadt)
Comment on attachment 8360644 [details] [diff] [review]
patch v1.1

Review of attachment 8360644 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the cleanup! I think these changes are all good, though it would be more helpful if there were 8 lines of context in the patch. I think this patch has a fairly high risk of breaking something though, so I'd like to see a full try run before giving it the go ahead.

::: mozrunner/mozrunner/remote.py
@@ +196,4 @@
>              pass
>  
>      def on_output(self, line):
> +        self.log.info(line)

This is going to change mochitest logs, it may ok, but I'd want to see try first.
(Assignee)

Comment 5

4 years ago
Now that the mozfile issues have been solved, I pushed this patch to try:
https://tbpl.mozilla.org/?tree=Try&rev=ff89071c9fc1
(Assignee)

Updated

4 years ago
Blocks: 960495
(Assignee)

Comment 6

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #4)
> >      def on_output(self, line):
> > +        self.log.info(line)
> 
> This is going to change mochitest logs, it may ok, but I'd want to see try
> first.

Looks like it broke the logs. There is a difference compared to latest inbound:

06:12:26     INFO -  12839 INFO TEST-PASS | /tests/content/base/test/test_bug416317-1.html | [object HTMLButtonElement]matches selector '#root3 #select3 option[selected]'

vs.

05:33:58     INFO -  01-17 13:12:50.817   724   724 I GeckoDump: 26445 INFO TEST-PASS | /tests/content/base/test/test_bug416317-2.html | [object HTMLInputElement]matches selector '#root3 input[name='foo[bar]']'

So the GeckoDump seems to be in the way. I will revert this logging change.
(Assignee)

Comment 7

4 years ago
Actually what I wanted to refer to is:

04:28:43     INFO -  B2GRunner INFO | [674] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcntImpl.cpp, line 142

I'm not sure if that change is causing all the trouble with the B2G mochitests. Andrew, what do you think?
Flags: needinfo?(ahalberstadt)
Yeah, I think the b2g mochitests was caused by a mozharness change + a patch not being landed on try yet (i.e bad timing). I re-triggered a couple to be sure.

Though in comment 7 I see that we are getting rid of the line number in the original log and adding a B2GRunner tag. This could potentially break our regexes that are used to detect errors, so I think we should still revert the log change even if it isn't the culprit.
Flags: needinfo?(ahalberstadt)
(Assignee)

Comment 9

4 years ago
Created attachment 8362705 [details] [diff] [review]
Patch v1.2

After talking with Andrew on IRC we agreed on to remove the mozlog changes. I reverted them all, and pushed a new changeset to try. It covers tests on platform we failed previously.

https://tbpl.mozilla.org/?tree=Try&rev=da591d95e25d
Attachment #8360644 - Attachment is obsolete: true
Attachment #8360644 - Flags: review?(ahalberstadt)
Attachment #8362705 - Flags: review?(ahalberstadt)
Comment on attachment 8362705 [details] [diff] [review]
Patch v1.2

Review of attachment 8362705 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good except for that one comment about wait(). A contributor and I recently modified it to cover a whole bunch of edge cases which I think your changes would break.

::: mozrunner/mozrunner/remote.py
@@ +7,5 @@
>  import posixpath
>  import re
>  import shutil
>  import signal
> +from StringIO import StringIO

I usually like to have "from" imports at the top.. but this is just bikeshedding.

::: mozrunner/mozrunner/runner.py
@@ +110,2 @@
>          elif self.returncode is None:
> +            raise RunnerNotStartedError("Wait() called before process started")

I'm not sure this is ok. This whole function was very carefully crafted to handle cases like multiple threads where one thread kills the process before the other calls stop. Unless you really know what you're doing, please leave this whole function the way it was.
Attachment #8362705 - Flags: review?(ahalberstadt) → review-
(Assignee)

Comment 11

4 years ago
Comment on attachment 8362705 [details] [diff] [review]
Patch v1.2

Review of attachment 8362705 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozrunner/mozrunner/runner.py
@@ +110,2 @@
>          elif self.returncode is None:
> +            raise RunnerNotStartedError("Wait() called before process started")

Do you have a bug reference for that? I assume you meant bug 921199? Personally I don't see anything which could break us here. I wrote a little script to check that, and it shows the same behavior. Looks like we should start to have tests for mozrunner! So let me explain my changes...

Calling self.is_running() doesn't make a change here but is more stable in terms of code changes and performs exactly the same underlying code. I'm not sure why we had the inner if/else condition for Popen. Even in interactive mode another thread/process could have been killed our process. So it needs the same handling. The only difference here should really be the timeout which is None in such a case, so that we wait forever.

Also there is no need to call poll() on the process handler. The exit value of the process you should really take from wait(). So the special case for other threads doesn't play in here anymore. If another thread kills the process while we are waiting, we return normally. If the kill happened before wait, we don't even enter the whole if condition and return the code immediately.

I will try to create some basic tests for mozrunner, which we don't have at all yet. That should proof that all works as expected.
(Assignee)

Comment 12

4 years ago
(In reply to Henrik Skupin (:whimboo) from comment #9)
> https://tbpl.mozilla.org/?tree=Try&rev=da591d95e25d

Andrew, given the new requirement for pefile from mozinstall, all the tests are failing on try. Do I have to explicitely add this dependency for the own internal pypi site?
Flags: needinfo?(ahalberstadt)
(In reply to Henrik Skupin (:whimboo) from comment #11)
> Do you have a bug reference for that? I assume you meant bug 921199?

Yeah bug 921199 and also bug 947974.

> Calling self.is_running() doesn't make a change here but is more stable in
> terms of code changes and performs exactly the same underlying code. I'm not
> sure why we had the inner if/else condition for Popen. Even in interactive
> mode another thread/process could have been killed our process. So it needs
> the same handling. The only difference here should really be the timeout
> which is None in such a case, so that we wait forever.
> 
> Also there is no need to call poll() on the process handler. The exit value
> of the process you should really take from wait(). So the special case for
> other threads doesn't play in here anymore. If another thread kills the
> process while we are waiting, we return normally. If the kill happened
> before wait, we don't even enter the whole if condition and return the code
> immediately.

I guess that makes sense.. I just know from experience that this logic is quite finicky, so any changes here make me worried :)

> I will try to create some basic tests for mozrunner, which we don't have at
> all yet. That should proof that all works as expected.

That would be awesome!

(In reply to Henrik Skupin (:whimboo) from comment #12)
> (In reply to Henrik Skupin (:whimboo) from comment #9)
> > https://tbpl.mozilla.org/?tree=Try&rev=da591d95e25d
> 
> Andrew, given the new requirement for pefile from mozinstall, all the tests
> are failing on try. Do I have to explicitely add this dependency for the own
> internal pypi site?

No it's even worse. The tests use whatever is in the tests.zip bundle. This means you'd have to actually check pefile into the tree (under /python) and get the Makefiles to copy it to the test bundle. What's the bug that added pefile? I think in general we should only depend on 3rd party modules if absolutely necessary. Could we wrap the import in a try block and gracefully fall back to the old way if it doesn't exist?
Flags: needinfo?(ahalberstadt)
(Assignee)

Updated

4 years ago
Blocks: 962495
(Assignee)

Comment 14

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #13)
> > Do you have a bug reference for that? I assume you meant bug 921199?
> 
> Yeah bug 921199 and also bug 947974.

I will make sure to create tests for both.

> I guess that makes sense.. I just know from experience that this logic is
> quite finicky, so any changes here make me worried :)

There is one thing which should be clear. If multiple threads are handling mozrunner, we have to use critical sections. Otherwise your code is not thread-safe and any checks which have been put in here only workaround the problem, but don't solve them. So if mozrunner should really be thread-safe a lot more work has to be done.

> > I will try to create some basic tests for mozrunner, which we don't have at
> > all yet. That should proof that all works as expected.
> 
> That would be awesome!

Initial work looks fine. First tests have been implemented with some modifications to test.py to run those tests only when a browser path has been specified. Also I already found a broken behavior. So I'm going to write a couple more tests.

> > Andrew, given the new requirement for pefile from mozinstall, all the tests
> > are failing on try. Do I have to explicitely add this dependency for the own
> > internal pypi site?
> 
> No it's even worse. The tests use whatever is in the tests.zip bundle. This
> means you'd have to actually check pefile into the tree (under /python) and
> get the Makefiles to copy it to the test bundle. What's the bug that added
> pefile? I think in general we should only depend on 3rd party modules if
> absolutely necessary. Could we wrap the import in a try block and gracefully
> fall back to the old way if it doesn't exist?

The pefile module came in via bug 795288, which helps us to read out the PEHeader of Windows EXE files. What would you suggest to do here?
(Assignee)

Comment 15

4 years ago
Created attachment 8363603 [details] [diff] [review]
patch v2

Updated patch with tests for mozrunner and some other small adjustments to mozrunner files.
Attachment #8362705 - Attachment is obsolete: true
(Assignee)

Comment 16

4 years ago
Created attachment 8363675 [details] [diff] [review]
patch v2.1

Threading wasn't working as expected in the last version of the patch. So I reworked the tests. Also I had to fix a couple of things to get the tests working cross-platform, and especially on Windows.

This patch has been tested across platforms and all 11 tests are passing.
Attachment #8363603 - Attachment is obsolete: true
Attachment #8363675 - Flags: review?(ahalberstadt)
(Assignee)

Comment 17

4 years ago
I will try to get those tests also working with Travis. A small update will be pushed later if it works.
(Assignee)

Comment 18

4 years ago
Created attachment 8363750 [details] [diff] [review]
patch v2.2

Travis actually likes it pretty much. So here the updated patch with the .travis.yml changes.

Travis run:
https://travis-ci.org/whimboo/mozbase/builds/17422734
Attachment #8363675 - Attachment is obsolete: true
Attachment #8363675 - Flags: review?(ahalberstadt)
Attachment #8363750 - Flags: review?(ahalberstadt)
(In reply to Henrik Skupin (:whimboo) from comment #14)
> There is one thing which should be clear. If multiple threads are handling
> mozrunner, we have to use critical sections. Otherwise your code is not
> thread-safe and any checks which have been put in here only workaround the
> problem, but don't solve them. So if mozrunner should really be thread-safe
> a lot more work has to be done.

Agreed. Though there was a bug in automation we were hitting that was caused by multiple threads, so I fixed at least that one case.
 
> The pefile module came in via bug 795288, which helps us to read out the
> PEHeader of Windows EXE files. What would you suggest to do here?

We'll either need to:
A) check pefile into the tree under /python if the license permits
B) not include it in setup.py and only use it if it's detected on the system
C) re-implement the parts of pefile we care about within mozbase itself.
Comment on attachment 8363675 [details] [diff] [review]
patch v2.1

Review of attachment 8363675 [details] [diff] [review]:
-----------------------------------------------------------------

Woohoo, mozrunner tests \o/. We'll have to update make check to pass in the binary to enable them in tbpl.

This is a conditional r+ with the caveat that you remove the from runner import * statement as explained below.

::: mozrunner/mozrunner/__init__.py
@@ +5,4 @@
>  from local import *
>  from local import LocalRunner as Runner
>  from remote import *
> +from runner import *

Why did you add this? The intent was that nothing in the base Runner should be used directly.. I guess it should be called base.py instead of runner.py (feel free to change it if you want). This import will cause a name collision with the from local import LocalRunner as Runner statement above.

::: mozrunner/mozrunner/runner.py
@@ +12,5 @@
>  import mozlog
>  
> +
> +__all__ = ['Runner',
> +           'RunnerNotStartedError']

Don't think we need this as it should only be imported internally.

::: test-manifest.ini
@@ +20,3 @@
>  [include:mozprocess/tests/manifest.ini]
>  [include:mozprofile/tests/manifest.ini]
> +[include:mozrunner/tests/manifest.ini]

Is this going to cause orange on tbpl?
Attachment #8363675 - Flags: review+
Comment on attachment 8363750 [details] [diff] [review]
patch v2.2

Review of attachment 8363750 [details] [diff] [review]:
-----------------------------------------------------------------

Oops accidentally reviewed the old patch. This patch is the same minus the travis.yml and some test changes so same comments apply.

This is still a conditional r+ as described in the previous comment.
Attachment #8363750 - Flags: review?(ahalberstadt) → review+
(Assignee)

Comment 22

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #19)
> > There is one thing which should be clear. If multiple threads are handling
> > mozrunner, we have to use critical sections. Otherwise your code is not
> > thread-safe and any checks which have been put in here only workaround the
> > problem, but don't solve them. So if mozrunner should really be thread-safe
> > a lot more work has to be done.
> 
> Agreed. Though there was a bug in automation we were hitting that was caused
> by multiple threads, so I fixed at least that one case.

Given that the try run is green and also travis results are fine, I assume that nothing has been broken here.

> > The pefile module came in via bug 795288, which helps us to read out the
> > PEHeader of Windows EXE files. What would you suggest to do here?
> 
> We'll either need to:
> A) check pefile into the tree under /python if the license permits
> B) not include it in setup.py and only use it if it's detected on the system
> C) re-implement the parts of pefile we care about within mozbase itself.

I filed bug 963176 as a follow-up. It's nothing we further have to care on this bug.
(Assignee)

Comment 23

4 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #20)
> Woohoo, mozrunner tests \o/. We'll have to update make check to pass in the
> binary to enable them in tbpl.

Noted down on bug 949600, so that we don't miss that with the move.

> ::: mozrunner/mozrunner/__init__.py
> @@ +5,4 @@
> >  from local import *
> >  from local import LocalRunner as Runner
> >  from remote import *
> > +from runner import *
> 
> Why did you add this? The intent was that nothing in the base Runner should
> be used directly.. I guess it should be called base.py instead of runner.py
> (feel free to change it if you want). This import will cause a name
> collision with the from local import LocalRunner as Runner statement above.

Well, we need the exception class. But I now see your point. So my suggestion would be that we create a new module called error.py, which can hold all available error classes. What do you think?

> ::: test-manifest.ini
> @@ +20,3 @@
> >  [include:mozprocess/tests/manifest.ini]
> >  [include:mozprofile/tests/manifest.ini]
> > +[include:mozrunner/tests/manifest.ini]
> 
> Is this going to cause orange on tbpl?

It should not, given that we do not pass in the binary. And the tests will be marked as skipped by the unittest.skipIf() decorator. This is new in Python 2.7 but is safe here given that we already use it in mozprofile tests for a while.

I can push to try again once I have updated the patch.
Flags: needinfo?(ahalberstadt)
(In reply to Henrik Skupin (:whimboo) from comment #23)
> Well, we need the exception class. But I now see your point. So my
> suggestion would be that we create a new module called error.py, which can
> hold all available error classes. What do you think?

Yep, that makes sense to me.
Flags: needinfo?(ahalberstadt)
(Assignee)

Comment 25

4 years ago
Created attachment 8364492 [details] [diff] [review]
patch v3

Andrew, this is the updated patch. All review comments should have been addressed now. The new module is named exceptions.py because I don't think that error(s).py could cause conflicts.

If you are happy with this patch, I will push the changes to try.
Attachment #8363750 - Attachment is obsolete: true
Attachment #8364492 - Flags: review?(ahalberstadt)
Comment on attachment 8364492 [details] [diff] [review]
patch v3

Review of attachment 8364492 [details] [diff] [review]:
-----------------------------------------------------------------

I think errors.py is kind of an unofficial standard here at mozilla, but meh, bikeshedding :)
Attachment #8364492 - Flags: review?(ahalberstadt) → review+
(Assignee)

Comment 27

4 years ago
Created attachment 8364620 [details] [diff] [review]
Patch v4

Well, I like unofficial stuff! :) So here the update. I will push this shortly to try again.
Attachment #8364492 - Attachment is obsolete: true
Attachment #8364620 - Flags: review+
(Assignee)

Comment 28

4 years ago
Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=d840522327be

If it looks fine, I will push this change to master tomorrow morning.
(Assignee)

Comment 29

4 years ago
Sadly the last push contained the changes for mozinstall so everything busted. I repushed without the pefile changes:
https://tbpl.mozilla.org/?tree=Try&rev=f1e63e1c40d2
(Assignee)

Comment 30

4 years ago
The last try run is green! So there is nothing which got regressed on mozilla-central. So I will land this patch on master now.
(Assignee)

Comment 31

4 years ago
https://github.com/mozilla/mozbase/commit/4d7a366e55c91e659a919c87dac6673705d4da2b
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
(Assignee)

Updated

4 years ago
Depends on: 965183
(Assignee)

Updated

4 years ago
Blocks: 781694
(Assignee)

Updated

3 years ago
Blocks: 1065583
You need to log in before you can comment on or make changes to this bug.