Closed Bug 754073 Opened 8 years ago Closed 7 years ago

merge remotePerfConfigurator with PerfConfigurator

Categories

(Testing :: Talos, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: k0scist, Unassigned)

References

Details

(Whiteboard: [good first bug][mentor=BYK][lang=py])

Attachments

(9 files, 6 obsolete files)

35.77 KB, text/plain
Details
15.41 KB, patch
Details | Diff | Splinter Review
28.09 KB, patch
k0scist
: review-
Details | Diff | Splinter Review
27.91 KB, patch
k0scist
: review+
Details | Diff | Splinter Review
679 bytes, patch
Details | Diff | Splinter Review
29.30 KB, patch
Details | Diff | Splinter Review
29.87 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
30.57 KB, patch
Details | Diff | Splinter Review
30.61 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
Following but 704654, you can run e.g.

talos -n -d -a ts -e `which firefox` --develop --results_url file://${PWD}/ts.txt -o output.yml -v

To run talos ts tests and generate a output.yml file which is
additionally dumped to standard out -- there is no need to run a
separate PerfConfigurator step.

However, while the options to remotePerfConfigurator are passed
through the `talos` console_script, you cannot run
remotePerfConfigurator in one step. If the scripts could be combined,
and switched on (e.g.) self.remote
(http://hg.mozilla.org/build/talos/file/488bc187a3ef/talos/remotePerfConfigurator.py#l119),
then one could use the talos CLI to do both and avoid a separate
step for the remote system. 

For most of the methods in
http://hg.mozilla.org/build/talos/file/488bc187a3ef/talos/remotePerfConfigurator.py,
it would be easy to key their behaviour on self.remote.  The default
values, OTOH, could be harder, but not impossible.

PerfCOnfigurator and remotePerfConfigurator must continue to have a
command line interface for buildbot and other automation.  See bug
754063. But if that is fixed they can go away and we could possibly
avoid the .yml file entirely if desired.
Whiteboard: [good first bug][mentor=BYK][lang=py]
See Also: → 704654
In order to really gain much mileage out of this, we'll need to move some of the copy steps to run_tests.py: bug 735502, bug 735500 . I won't put this as blocking, as strictly speaking its not, but they're needed for the larger effort
Attached file PerfConfigurator.py
Thanks for posting it. I didn't realise you can directly attach the PerfConfigurator.py. I thought it "has" to be a patchfile.

I'm looking forward to hearing if it works on the remote stuff, too, so that we can remove the remotePerfConfigurator.py from the directory.
Comment on attachment 656227 [details] [diff] [review]
hg clone http://hg.mozilla.org/build/talos; cd talos/talos; curl https://bug754073.bugzilla.mozilla.org/attachment.cgi?id=656222 > PerfConfigurator.py; hg diff > ~/bug-754073.diff

+        ### Taken from previously existing remotePerfConfigurator

Won't need this in the final patch

+    def __init__(self, remote=False, **kwargs):

Not sure why we'd have this in the constructor.  Shouldn't this come from the configuration?

+        if self.remote:
Likewise

+            # pc.PerfConfigurator.__init__(self, **kwargs) #<- Is this position procedurally necessary?

This won't work with the code as is

The basic approach is right.  However, we'll want to move remote checking to after the configuration has been processed.  As it stands, it should go remote if --remote or --deviceIP is passed in.  Probably ideally, it would go remote if --deviceRoot or --deviceIP is passed in
Attachment #656227 - Flags: review?(jhammel)
Attached patch First revision of patch (obsolete) — Splinter Review
This is a preliminary patch and will need a final revision still. I left a lot of white space where I made changes so that I can easily find it again in the code.

Main question that I still have are:

1. I left the remote_options separate from the other options, as they were from remotePerfConfigurator. Would it make more sense to include them now into the options from the start?

2. in method validate I called

if self.config.get('deviceip') or self.config.get('deviceport'):

to check for remote access. Is that enough?

3. In the review of the first attempt, jhammel wrote:

However, we'll want to move remote checking to after the configuration has been processed.  As it stands, it should go remote if --remote or --deviceIP is passed in.  Probably ideally, it would go remote if --deviceRoot or --deviceIP is passed in

However, the flag --remote, does not exist. Should I create it? If so, where?

Thanks for your help
Attachment #656486 - Flags: review?(madbyk)
Attachment #656486 - Flags: review?(jhammel)
I was hoping :BYK would have a chance to look at this, but I'll take a stab
Comment on attachment 656486 [details] [diff] [review]
First revision of patch

This is getting very close.  As suggested by your comments, I think you will need to incorporate the remote-only options into PerfConfigurator proper.  remotePerfConfigurator.py should be replaced will a shell that just calls PerfConfigurator (to keep things working in production, though we'll phase that out).

+                   'flags': []}, # Does it make sense to add a flag --remote here?

Nope.  As explained before, we want 'remote' to be a config item.  That's why its here.  But actually figuring out if a run is remote or not hinges on deviceip/--remoteDevice.  Some of this logic duplication can (and should) eventually go away, but for the time being its probably easier to do a straight refactor and then fix this (IMHO).

There may be other minor nits, but this is looking towards its on the right track.
Attachment #656486 - Flags: review?(jhammel) → review+
Comment on attachment 656486 [details] [diff] [review]
First revision of patch

Review of attachment 656486 [details] [diff] [review]:
-----------------------------------------------------------------

Looking very good, especially a newcomer. Kudos. My comments are usually nitpicks, other than that you seem to have grasped the whole logic. One little thing would be using PEP8 to ensure code style btw.

R- is to force you to fix those non-trivial nitpicks ;)

::: talos/PerfConfigurator.py
@@ +183,5 @@
> +                   'flags': []}, # Does it make sense to add a flag --remote here?
> +        }
> +    
> +    # remote-specific defaults
> +    remote_default_values = {'basetest': {'timeout': 3600,

Move 'timeout' to the next line please ;)

@@ +298,5 @@
>          # http://hg.mozilla.org/build/talos/file/c702ff8892be/talos/PerfConfigurator.py#l44
>  
> +
> +
> +        # Would it not make more sense to just incorporate remote_options

Seems like it would but jhammel is the boss on that I guess.

@@ +302,5 @@
> +        # Would it not make more sense to just incorporate remote_options
> +        # into the standard options?
> +    
> +        # add remote-specific options
> +        names = [i[0] for i in self.options]

I don't like this. Why don't we try using dict(self.options) and moving forward in that direction?

Even if that does not work I *feel* that there's a much better way to do this :)

@@ +303,5 @@
> +        # into the standard options?
> +    
> +        # add remote-specific options
> +        names = [i[0] for i in self.options]
> +        for key, value in self.remote_options.items():

Would advocate iteritems() over items(). Just a nitpick.

@@ +341,5 @@
> +            
> +            
> +            # setup remote
> +            deviceip = self.config.get('deviceip')
> +            deviceport = self.config['deviceport']

Using .get above and using item accessor on this line? Why the inconsistency?

@@ +769,5 @@
> +
> +        # remote-specific preferences
> +        self.preferences.update({'talos.logfile': 'browser_output.txt',
> +                                 'network.manage-offline-status': False})
> +        for pref in ['network.proxy.http',

Using a tuple here would be more efficient. (yeah, nitpick again)

@@ +779,5 @@
> +            self.preferences.pop(pref)
> +
> +
> +
> +        # add remote-specific options

Do not leave commented-out code in prod. ;)

@@ +819,5 @@
> +
> +        if self.config['remote'] == False:
> +            return
> +
> +        files = ['page_load_test/quit.js',

tuples! tuples everywhere! (At least they should be)

@@ +826,5 @@
> +                 'startup_test/twinopen/winopen.js',
> +                 'startup_test/twinopen/child-window.html']
> +
> +        talosRoot = self.deviceroot + '/talos/'
> +        for file in files:

Oops, do not override the native `file` function, use something else as the variable name. ;)
Attachment #656486 - Flags: review?(madbyk) → review-
Attached patch Second revision of patch (obsolete) — Splinter Review
Ok, here is my second attempt. I have attached inline answers to the specific revisions below if you are interested. I wrote in capitals to make it distinct from the rest, not because I think you guys need some shoutin' at ;)

Additionally, I added a is_remote method, which tests for the presence of deviceroot or deviceip and I removed the remote item from the options at the beginning.

To test the file I used the commands

python PerfConfigurator.py -e `which firefox` -a a11y

and

python PerfConfigurator.py -e `which firefox` -a a11y --remoteDevice 127.0.0.1

and they ran through without throwing any unexpected errors. (The second one should run through until the socket is not found, to test if no syntax error is present)

I hope it is better this time round. Thanks for all the help.

Johannes

------------------------

Here are the specific comments:

Comment on attachment 656486 [details] [diff] [review] [diff] [review]
First revision of patch

This is getting very close.  As suggested by your comments, I think you will need to incorporate the remote-only options into PerfConfigurator proper.  remotePerfConfigurator.py should be replaced will a shell that just calls PerfConfigurator (to keep things working in production, though we'll phase that out).

CHANGED!

------------------

+                   'flags': []}, # Does it make sense to add a flag --remote here?

Nope.  As explained before, we want 'remote' to be a config item.  That's why its here.  But actually figuring out if a run is remote or not hinges on deviceip/--remoteDevice.  Some of this logic duplication can (and should) eventually go away, but for the time being its probably easier to do a straight refactor and then fix this (IMHO).

There may be other minor nits, but this is looking towards its on the right track.

SEE ABOVE

AFTER TALKING TO BYK, WE TRIED TO CHANGE IT, I HOPE IT ISN'T WORSE NOW.

------------------

Comment on attachment 656486 [details] [diff] [review] [diff] [review]
First revision of patch

Review of attachment 656486 [details] [diff] [review] [diff] [review]:
-----------------------------------------------------------------

Looking very good, especially a newcomer. Kudos. My comments are usually nitpicks, other than that you seem to have grasped the whole logic. One little thing would be using PEP8 to ensure code style btw.

R- is to force you to fix those non-trivial nitpicks ;)

::: talos/PerfConfigurator.py
@@ +183,5 @@
> +                   'flags': []}, # Does it make sense to add a flag --remote here?
> +        }
> +    
> +    # remote-specific defaults
> +    remote_default_values = {'basetest': {'timeout': 3600,

Move 'timeout' to the next line please ;)

CHANGED!
------------------
@@ +298,5 @@
>          # http://hg.mozilla.org/build/talos/file/c702ff8892be/talos/PerfConfigurator.py#l44
>  
> +
> +
> +        # Would it not make more sense to just incorporate remote_options

Seems like it would but jhammel is the boss on that I guess.

CHANGED!
------------------
@@ +302,5 @@
> +        # Would it not make more sense to just incorporate remote_options
> +        # into the standard options?
> +    
> +        # add remote-specific options
> +        names = [i[0] for i in self.options]

I don't like this. Why don't we try using dict(self.options) and moving forward in that direction?

Even if that does not work I *feel* that there's a much better way to do this :)

WAS UNNECESSARY AND REMOVED AS REMOTE_OPTIONS WAS MERGED WITH OPTIONS
------------------
@@ +303,5 @@
> +        # into the standard options?
> +    
> +        # add remote-specific options
> +        names = [i[0] for i in self.options]
> +        for key, value in self.remote_options.items():

Would advocate iteritems() over items(). Just a nitpick.

CHANGED!
------------------
@@ +341,5 @@
> +            
> +            
> +            # setup remote
> +            deviceip = self.config.get('deviceip')
> +            deviceport = self.config['deviceport']

Using .get above and using item accessor on this line? Why the inconsistency?

CHANGED TO .GET
------------------
@@ +769,5 @@
> +
> +        # remote-specific preferences
> +        self.preferences.update({'talos.logfile': 'browser_output.txt',
> +                                 'network.manage-offline-status': False})
> +        for pref in ['network.proxy.http',

Using a tuple here would be more efficient. (yeah, nitpick again)

CHANGED!
------------------
@@ +779,5 @@
> +            self.preferences.pop(pref)
> +
> +
> +
> +        # add remote-specific options

Do not leave commented-out code in prod. ;)

REMOVED!
------------------
@@ +819,5 @@
> +
> +        if self.config['remote'] == False:
> +            return
> +
> +        files = ['page_load_test/quit.js',

tuples! tuples everywhere! (At least they should be)

TUPLES!
------------------
@@ +826,5 @@
> +                 'startup_test/twinopen/winopen.js',
> +                 'startup_test/twinopen/child-window.html']
> +
> +        talosRoot = self.deviceroot + '/talos/'
> +        for file in files:

Oops, do not override the native `file` function, use something else as the variable name. ;)

THAT WASN'T MY FAULT ;)
Attachment #660514 - Flags: review?(madbyk)
Attachment #660514 - Flags: review?(jhammel)
Comment on attachment 660514 [details] [diff] [review]
Second revision of patch

+        ### Taken from previously existing remotePerfConfigurator
+        ('nativeUI', {'help': 'Run tests on Fennec with a Native Java UI instead of the XUL UI',
+                      'default': False}),
+        ### Taken from previously existing remotePerfConfigurator
+        ### from previously exisitng remote_options

These comments should be deleted for final commit.  We probably do want to mark them as remote-specific though.

the remote_default_values worries me, as we may be overwriting values from elsewhere.  This is probably a hard problem to solve (one of the reasons I don't necessarily think of this as a good first bug).  If this works in testing, I am fine with taking this now, but we should file a bug for testing this and making it more rigorous.

I'm also curious what :jmaher and :BYK have to say
Attachment #660514 - Flags: review?(jhammel) → review+
Comment on attachment 660514 [details] [diff] [review]
Second revision of patch

Review of attachment 660514 [details] [diff] [review]:
-----------------------------------------------------------------

I did not notice anything that would affect the functionality so this is an r+ but I would be very happy to see those code quality issues resolved before landing. In the mean time can somebody, *ahem* jhammel *ahem*, push this to try and see what happens?

::: talos/PerfConfigurator.py
@@ +159,5 @@
> +                      'default': False}),
> +        ### Taken from previously existing remotePerfConfigurator
> +        ### from previously exisitng remote_options
> +        ('deviceip', {'help': 'Device IP (when using SUTAgent)',
> +                     'flags': ['-r', '--remoteDevice']}),

Nitpick: I would expect single quotes to be aligned ;)

@@ +171,5 @@
>          ]
> +    
> +    # remote-specific defaults
> +    remote_default_values = {'basetest': {
> +                                    'timeout': 3600,

Very inconsistent whitespace usage here and below :(

@@ +325,5 @@
> +                    self.config['webserver'] = utils.getLanIp()
> +
> +            # webServer can be used without remoteDevice, but is required when using remoteDevice
> +            if self.config.get('deviceip') or self.config.get('deviceroot'):
> +                if self.config.get('webserver', 'localhost') == 'localhost' or not self.config.get('deviceip'):

Why two nested ifs?

@@ +614,5 @@
> +        if 'winopen.xul' in url:
> +            self.buildRemoteTwinopen()
> +            url = 'file://' + self.deviceroot + '/talos/' + url
> +        
> +        # Checks if a remote call to PerfConfigurator is done

I don't think we need this comment each and everytime self.is_remote is used.

@@ +656,4 @@
>  
> +            remoteName += '/' + os.path.basename(manifestName)
> +            if self.testAgent.pushFile(newManifestName, remoteName) == False:
> +                msg = "Unable to copy remote manifest file %s to %s" % (newManifestName, remoteName)

Do we really need `msg`?

@@ +712,5 @@
>                    'dirs': {},
>                    'host': self.config.get('deviceip', ''), # XXX names should match!
>                    'port': self.config.get('deviceport', ''), # XXX names should match!
>                    'process': '',
> +                  'remote': False, # can this be taken out or is it linked to something else?

jhammel, jmaher, anyone?

@@ +784,5 @@
> +            if self.testAgent.pushFile(f, talosRoot + f) == False:
> +                raise ConfigurationError("Unable to copy twinopen file "
> +                                            + f + " to " + talosRoot + f) 
> +    
> +    def is_remote(self):

We should have that repeated comment here ;)
Attachment #660514 - Flags: review?(madbyk) → review+
> > +                  'remote': False, # can this be taken out or is it linked to something else?
> 
> jhammel, jmaher, anyone?

I'm fairly sure this will result in breakage if removed.  I'd encourage looking over the code and seeing where this is used, but I believe the short answer is: a lot :)  FWIW, I think browser_config is far far too big for its britches and should mostly go away, but that is far beyond the scope of this bug
of the top of my head, I believe we need the remote variable.
Yes, remote should be reintroduced if it was dropped.  In addition, running in production we call `python remotePerfConfigurator.py ...` so remotePerfConfigurator.py should be replaced with a stub that imports PerfConfigurator and runs its command line main entry point.  Also the reference to remotePerfConfigurator in setup.py should be replaced.  That's all I can think of off the top of my head.

(Also, if try reported to bugzilla correctly this could have been caught a week ago.  Boo!)
Attached patch Third revision of patch (obsolete) — Splinter Review
Ok, I hope I got all the instances of self.remote again... The file ran fine (as far as I could test it...)

Also created the stub and removed remotePerfconfig mention from setup.py

Let's hope it runs through fine this time.
Attachment #656486 - Attachment is obsolete: true
Attachment #660514 - Attachment is obsolete: true
Attachment #664203 - Flags: review?(jhammel)
(In reply to nebelhom from comment #17)
> Created attachment 664203 [details] [diff] [review]
> Third revision of patch
> 
> Ok, I hope I got all the instances of self.remote again... The file ran fine
> (as far as I could test it...)
> 
> Also created the stub and removed remotePerfconfig mention from setup.py
> 
> Let's hope it runs through fine this time.

Suggestion: Since this is a fairly major change, you might want to run the work through pyflakes (http://pypi.python.org/pypi/pyflakes) as well just to make sure nothing else got accidentally broken. It's handy for catching misnamed variables and functions, and other things like that.
Attached patch Fourth revision of patch (obsolete) — Splinter Review
@wlach: Good call. pyflakes spotted a typo in the script in line 541 that I changed now, too (for a change that was an artifact and not my fault :D)

@all: It spotted an unused variable, as well. But I wanted you guys to have a look at it first before I take it out. I don't want another self.remote case.

Here's the output:
PerfConfigurator.py:705: local variable 'testdate' is assigned to but never used

Thanks for the suggestion.
Attachment #664203 - Attachment is obsolete: true
Attachment #664203 - Flags: review?(jhammel)
Attachment #664453 - Flags: review?(jhammel)
ABICT, you do not set self.config['remote'] to self.remote anywhere.  Did I miss it?
Attached patch fifth revision of patch (obsolete) — Splinter Review
Sorry about that. I didn't think of it (forget would imply that I thought of it previously, but that would be a lie)
Attachment #664453 - Attachment is obsolete: true
Attachment #664453 - Flags: review?(jhammel)
Attachment #664640 - Flags: review?(jhammel)
removed testdate variable
Attachment #664640 - Attachment is obsolete: true
Attachment #664640 - Flags: review?(jhammel)
Attachment #665089 - Flags: review?(jhammel)
Comment on attachment 665089 [details] [diff] [review]
Sixth revision of the patch

I don't think remote defaults are handled correctly.  By the time .validate is called, self.config is already populated from the defaults.  You can look for these in self.parsed...however, iirc, this is only for the CLI options, not for options from configuration files.  I think this should be changed in configuration.py, which is actually an upstream project.

Other than that, it looks great!

Sorry for all of this churn.  I would not call this a good first bug, certainly in retrospect
Attachment #665089 - Flags: review?(jhammel) → review-
filed bug 796196 for blocking for this issue.  There may be another way around, but I can't think of it
Depends on: 796196
No worries, in retrospect, I call this a good first bug. Getting thrown in at the deep end was interesting :)

However, you'll be the judge of whether I annoyed you too much with it ;) (Is that also a criteria for good first bug? :P)
Attached patch seventh revision of the patch (obsolete) — Splinter Review
@jhammel: I added your suggestion and it seemed to go through fine.

Thanks a bundle :)
Attachment #667204 - Flags: review?(jhammel)
Comment on attachment 667204 [details] [diff] [review]
seventh revision of the patch

+            # setup remote
+            deviceip = self.config.get('deviceip', -1)
+            deviceport = self.config.get('deviceport', -1)
+            if deviceip or deviceport == -1:
+                self._setupRemote(deviceip, deviceport)

So this isn't going to work.  Why are we using -1 as a default for deviceport?  the current defaults for both are None:

        deviceip = self.config.get('deviceip')
        deviceport = self.config['deviceport']
        if deviceip or deviceport == -1:
            self._setupRemote(deviceip, deviceport)
Attachment #667204 - Flags: review?(jhammel) → review-
Attachment #667204 - Attachment is obsolete: true
Comment on attachment 667589 [details] [diff] [review]
8th revision of the patch

I *think* this covers everything!  Will push to try to check
Attachment #667589 - Flags: review+
  File "run_tests.py", line 298, in <module>
    main()
  File "run_tests.py", line 285, in main
    options, args = parser.parse_args(args)
  File "/home/cltbld/talos-slave/talos-data/talos/PerfConfigurator.py", line 484, in parse_args
    options, args = Configuration.parse_args(self, *args, **kwargs)
  File "/home/cltbld/talos-slave/talos-data/talos/configuration.py", line 476, in parse_args
    self(*config)
  File "/home/cltbld/talos-slave/talos-data/talos/configuration.py", line 377, in __call__
    self.add(config)
  File "/home/cltbld/talos-slave/talos-data/talos/configuration.py", line 389, in add
    self.check(config)
  File "/home/cltbld/talos-slave/talos-data/talos/configuration.py", line 319, in check
    raise UnknownOptionException("Unknown options: %s" % ', '.join(unknown_options))
configuration.UnknownOptionException: Unknown options: remote
program finished with exit code 1

https://tbpl.mozilla.org/php/getParsedLog.php?id=15784281&tree=Try&full=1

:(

Killing the try run.  We need the remote option back
Try run for f024f63f0e7a is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=f024f63f0e7a
Results (out of 33 total builds):
    exception: 8
    success: 5
    failure: 20
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jhammel@mozilla.com-f024f63f0e7a
So this is because of this line removal:

-        ('remote', {'flags': []})

adding this back at least gets me past this point, though now i get a very strange segfault:

(talos)│talos -n -d 20121003_1547_config.yml --develop
setting debug
DEBUG: using testdate: 1349304616
DEBUG: actual date: 1349304616
RETURN:<a href = "http://hg.mozilla.org/mozilla-central/rev/635fcc11d2b1">rev:635fcc11d2b1</a>
qm-pxp01: 
		Started Wed, 03 Oct 2012 15:50:16
Running test ts: 
		Started Wed, 03 Oct 2012 15:50:16
DEBUG: operating with platform_type : linux_
DEBUG: created profile
	Screen width/height:1366/768
	colorDepth:24
	Browser inner width/height: 1024/683


browser_name:Firefox
browser_version:18.0a1
buildID:20121003030545

DEBUG: initialized firefox
DEBUG: command line: /home/jhammel/firefox/firefox  -profile /tmp/tmpK0h6Kf/profile startup_test/startup_test.html?begin=
Aborted (core dumped)
Failed ts: 
		Stopped Wed, 03 Oct 2012 15:53:08
FAIL: Busted: ts
FAIL: timeout exceeded
Traceback (most recent call last):
  File "/home/jhammel/mozilla/src/talos/src/talos/talos/run_tests.py", line 250, in run_tests
    talos_results.add(mytest.runTest(browser_config, test))
  File "/home/jhammel/mozilla/src/talos/src/talos/talos/ttest.py", line 366, in runTest
    raise talosError("timeout exceeded")
talosError: 'timeout exceeded'
Traceback (most recent call last):
  File "/home/jhammel/mozilla/src/talos/bin/talos", line 8, in <module>
    load_entry_point('talos==0.0', 'console_scripts', 'talos')()
  File "/home/jhammel/mozilla/src/talos/src/talos/talos/run_tests.py", line 295, in main
    run_tests(parser)
  File "/home/jhammel/mozilla/src/talos/src/talos/talos/run_tests.py", line 259, in run_tests
    raise e
talos.utils.talosError: 'timeout exceeded'
The url is also wrong: startup_test/startup_test.html?begin=
(In reply to Jeff Hammel [:jhammel] from comment #34)
> The url is also wrong: startup_test/startup_test.html?begin=

So the good news is, this is a preexisting bug.  I've filed at bug 797624.  I'll test a few more things locally and then push to try (again)
.
Ok, cool. thanks. I'm looking forward to seeing what's faulty next ;)
So this almost works o_O  Running desktop talos, things basically work fine. Running remote talos, I get:

(talos)│PerfConfigurator -a tsvg -e org.mozilla.fennec --datazilla-url /home/jhammel/foo.json -o foo.yaml --develop --cycles 1 --tpcycles 1 --tppagecycles 1 --noChrome --remoteDevice 10.251.28.51
reconnecting socket
 - outputName = foo.yaml
(talos)│talos -n -d foo.yaml 
reconnecting socket
setting debug
reconnecting socket
DEBUG: using testdate: 1349308561
DEBUG: actual date: 1349308561
DeviceManager: error pulling file '/data/data/org.mozilla.fennec/application.ini': Unable to access file (internal error)
mobile: 
		Started Wed, 03 Oct 2012 16:56:01
Running test tsvg: 
		Started Wed, 03 Oct 2012 16:56:01
reconnecting socket
DEBUG: operating with platform_type : remote_
pushing directory: /tmp/tmpMrY4T7/profile to /mnt/sdcard/tests/profile
DEBUG: created profile
reconnecting socket
FIRE PROC: 'org.mozilla.fennec  -profile /mnt/sdcard/tests/profile http://10.251.25.242:15707/getInfo.html'

Fennec, however, gets an error about "The proxy server is refusing connections".  I can access the page from localhost and the IP does seem correct. :sigh: I have no clue :/
Probably the next thing to do is to generate a .yaml file before and after the change and see what the variation is.  I'm guessing we're double marking something...
So I ran the old

remotePerfConfigurator -a tsvg -e org.mozilla.fennec --datazilla-url /home/jhammel/foo.json -o old.yaml --develop --cycles 1 --tpcycles 1 --tppagecycles 1 --noChrome --remoteDevice 10.251.28.51

vs the new (patched)

PerfConfigurator -a tsvg -e org.mozilla.fennec --datazilla-url /home/jhammel/foo.json -o new.yaml --develop --cycles 1 --tpcycles 1 --tppagecycles 1 --noChrome --remoteDevice 10.251.28.51

as a sanity check. These files should be exactly the same.  However, there are substantive differences :/  We need to get rid of these
So one thing we're not doing is updating remote-specific preferences:

http://hg.mozilla.org/build/talos/file/e351f3aa6ecc/talos/remotePerfConfigurator.py#l86
https://bugzilla.mozilla.org/attachment.cgi?id=667589&action=diff#a/talos/PerfConfigurator.py_sec11

This should operate on self.config['preferences'] now, not preferences
Following this I get

(talos)│diff old.yaml /home/jhammel/mozilla/src/talos/src/talos/new.yaml
2,4d1
<   cycles: 1
<   linux_counters: []
<   mac_counters: []
7,10d3
<   resolution: 1
<   responsiveness: false
<   rss: false
<   shutdown: false
12d4
<   tpchrome: true
15,22d6
<   tpformat: tinderbox
<   tpmozafterpaint: false
<   tpnoisy: true
<   tppagecycles: 1
<   tprender: false
<   w7_counters: []
<   win_counters: []
<   xperf_counters: []
41d24
<   NO_EM_RESTART: '1'

So there is a bunch that remotePerfConfigurator is adding that we're not picking up
Casually looking the above seem to be errors in the *current* version of remotePerfConfigurator!  I'm going to push to try and see what happens; if nothing horrible, I will flag jmaher for review since a change this big couldn't hurt to have another pair of eyes.
ah, in fact i am wrong: http://hg.mozilla.org/build/talos/file/e351f3aa6ecc/talos/remotePerfConfigurator.py#l106 We should be checking for dict and updating based on that instead of what is done currently :/
So this turns out to be harder than one would think.  In order to get the correct values for extend, http://hg.mozilla.org/build/talos/file/e351f3aa6ecc/talos/PerfConfigurator.py#l221, we need to ensure that the defaults get set *before* the body of __call__ :

http://hg.mozilla.org/build/talos/file/e351f3aa6ecc/talos/configuration.py#l368

This means determining if the system is remote at the top of the __call__ method, setting self.remote there, setting the defaults there, then calling the main body of Configuration.__call__ .
going to leave the try run going in the meantime to see if there are any other errors i missed
Attached patch finished, i hopeSplinter Review
casual testing reveals that this seems to produce identical configuration files.  This also works now, yay:

talos -n -d -a tsvg -e org.mozilla.fennec --datazilla-url /home/jhammel/foo.json -o new.yaml --develop --cycles 1 --tpcycles 1 --tppagecycles 1 --noChrome --remoteDevice 10.251.28.51
Attachment #668654 - Flags: review?(jmaher)
Comment on attachment 668654 [details] [diff] [review]
finished, i hope

Review of attachment 668654 [details] [diff] [review]:
-----------------------------------------------------------------

Some nits in here.  Be careful and wait for :wlach to land his patch before this (which will bitrot this).  If you want, these nits can be resolved in followup bugs, but I am fine handling this all in this current bug.

when you run this on try server, please just do android and 1 other os, we don't need to use up all the try server resources for every little talos change (although this isn't little).

::: talos/PerfConfigurator.py
@@ +176,5 @@
> +            'timeout': 3600,
> +            'profile_path': '${talos}/mobile_profile',
> +            'remote_counters': [],
> +            'tpcycles': 3,
> +            'tpdelay': 1000

do we even use tpdelay?  ack.

@@ +199,5 @@
> +                                              {'cycles': 10,
> +                                               'timeout': 150},
> +                                          'ts_places_generated_med':
> +                                              {'cycles': 10,
> +                                               'timeout': 150},

we do not have support for ts_places*.  I verified this today as those use a custom profile which will not work as we already have a custom profile for remote testing.

@@ +205,5 @@
> +                                              {'tpcycles': 3},
> +                                          'tsvg':
> +                                              {'tpcycles': 3},
> +                                          'tsspider':
> +                                              {'tpcycles': 3}

we don't run tsspider anymore, we could remove this additional config override.

@@ +206,5 @@
> +                                          'tsvg':
> +                                              {'tpcycles': 3},
> +                                          'tsspider':
> +                                              {'tpcycles': 3}
> +                                          }

why is the indentation so large here?

@@ +340,5 @@
> +            # fix webserver for --develop mode
> +            if self.config.get('develop'):
> +                webserver = self.config.get('webserver')
> +                if (not webserver) or (webserver == 'localhost'):
> +                    self.config['webserver'] = utils.getLanIp()

is this duplicated anywhere?  we support --develop without remote.

@@ +645,5 @@
> +            url = url.replace('webServer=', 'webServer=%s' % self.config['webserver'])
> +
> +            # Take care of the robocop based tests
> +            url = url.replace('class org.mozilla.fennec.tests', 'class %s.tests' % self.config['browser_path'])
> +            return url

please remove this entire block as we do not support this anymore.

@@ +796,5 @@
> +        talosRoot = self.deviceroot + '/talos/'
> +        for f in files:
> +            if self.testAgent.pushFile(f, talosRoot + f) == False:
> +                raise ConfigurationError("Unable to copy twinopen file "
> +                                            + f + " to " + talosRoot + f)

can we remove buildRemoteTwinopen as we don't need it anymore?

@@ +828,5 @@
>                      help="Input config file")
>  
>      # parse the arguments and dump an output file
>      options, args = conf.parse_args(args)
> +

extra line in here.
Attachment #668654 - Flags: review?(jmaher) → review+
Try run for afdd1d736e7e is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=afdd1d736e7e
Results (out of 69 total builds):
    exception: 11
    success: 37
    failure: 21
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jhammel@mozilla.com-afdd1d736e7e
Traceback (most recent call last):
  File "remotePerfConfigurator.py", line 12, in <module>
    sys.exit(PerfConfigurator.main()) 
NameError: name 'sys' is not defined
(In reply to Joel Maher (:jmaher) from comment #51)
> Traceback (most recent call last):
>   File "remotePerfConfigurator.py", line 12, in <module>
>     sys.exit(PerfConfigurator.main()) 
> NameError: name 'sys' is not defined

:( that's what I get for not testing the stub
(In reply to Joel Maher (:jmaher) from comment #49)
> Comment on attachment 668654 [details] [diff] [review]
> finished, i hope
> 
> Review of attachment 668654 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Some nits in here.  Be careful and wait for :wlach to land his patch before
> this (which will bitrot this).  

Will do, bug 797996 for reference.

> If you want, these nits can be resolved in
> followup bugs, but I am fine handling this all in this current bug.

This is intended as a straight port of functionality from the current system. Outside of not testing the stub remotePerfConfigurator.py, see comment 52, the generated output yaml files are identical to those before.  So I kept everything the same to ensure that we're getting the same thing before vs after.

> when you run this on try server, please just do android and 1 other os, we
> don't need to use up all the try server resources for every little talos
> change (although this isn't little).

Will do.  I suppose I will use something other than linux since I can test linux locally (albeit not in the buildbot way).

> ::: talos/PerfConfigurator.py
> @@ +176,5 @@
> > +            'timeout': 3600,
> > +            'profile_path': '${talos}/mobile_profile',
> > +            'remote_counters': [],
> > +            'tpcycles': 3,
> > +            'tpdelay': 1000
> 
> do we even use tpdelay?  ack.

Not sure. It looks like we do, at least in theory:

(talos)│grep 'tpdelay' *.py
output.py:        test_options = ['rss', 'tpchrome', 'tpmozafterpaint', 'tpcycles', 'tppagecycles', 'tprender', 'tpdelay', 'responsiveness', 'shutdown']
PerfConfigurator.py:        ('tpdelay', {'help': 'length of the pageloader delay',
PerfConfigurator.py:            'tpdelay': 1000
PerfConfigurator.py:                        'tpdelay',
run_tests.py:  if test.get('tpdelay') and test['tpdelay'] not in range(1, 10000):
run_tests.py:  CLI_options = ['tpformat', 'tpcycles', 'tppagecycles', 'tpdelay']

I am certainly fine deprecating it but would prefer to do so in a separate bug.

> @@ +199,5 @@
> > +                                              {'cycles': 10,
> > +                                               'timeout': 150},
> > +                                          'ts_places_generated_med':
> > +                                              {'cycles': 10,
> > +                                               'timeout': 150},
> 
> we do not have support for ts_places*.  I verified this today as those use a
> custom profile which will not work as we already have a custom profile for
> remote testing.

Ack.

> @@ +205,5 @@
> > +                                              {'tpcycles': 3},
> > +                                          'tsvg':
> > +                                              {'tpcycles': 3},
> > +                                          'tsspider':
> > +                                              {'tpcycles': 3}
> 
> we don't run tsspider anymore, we could remove this additional config
> override.

Ack.

> @@ +206,5 @@
> > +                                          'tsvg':
> > +                                              {'tpcycles': 3},
> > +                                          'tsspider':
> > +                                              {'tpcycles': 3}
> > +                                          }
> 
> why is the indentation so large here?

I'm not sure if can rigorously answer that question.  AFAIK, there are a few options for indentiation in PEP-8, not that we're rigorously using that anyway.  One such option for this case is aligning subitems with the colons following keys in a dict.  That is the pattern I used here

> @@ +340,5 @@
> > +            # fix webserver for --develop mode
> > +            if self.config.get('develop'):
> > +                webserver = self.config.get('webserver')
> > +                if (not webserver) or (webserver == 'localhost'):
> > +                    self.config['webserver'] = utils.getLanIp()
> 
> is this duplicated anywhere?  we support --develop without remote.

I don't believe this is duplicated.  This code block is only for the remote case

> @@ +645,5 @@
> > +            url = url.replace('webServer=', 'webServer=%s' % self.config['webserver'])
> > +
> > +            # Take care of the robocop based tests
> > +            url = url.replace('class org.mozilla.fennec.tests', 'class %s.tests' % self.config['browser_path'])
> > +            return url
> 
> please remove this entire block as we do not support this anymore.

Ack.

> @@ +796,5 @@
> > +        talosRoot = self.deviceroot + '/talos/'
> > +        for f in files:
> > +            if self.testAgent.pushFile(f, talosRoot + f) == False:
> > +                raise ConfigurationError("Unable to copy twinopen file "
> > +                                            + f + " to " + talosRoot + f)
> 
> can we remove buildRemoteTwinopen as we don't need it anymore?

https://bugzilla.mozilla.org/show_bug.cgi?id=798532

> @@ +828,5 @@
> >                      help="Input config file")
> >  
> >      # parse the arguments and dump an output file
> >      options, args = conf.parse_args(args)
> > +
> 
> extra line in here.

Thanks!

So I can deprecate the old stuff with this patch or I can do follow-ups.  Either way
Attached patch unbitrotSplinter Review
Testing with the patch vs without the patch on desktop we are trying to terminate the 'fennec' process.   That isn't right :(
I've tested this locally with ts and tsvg a bit.  I have also tried testing on my tablet, but it hangs at the end of the run with or without this patch (not sure what's going on there).
Attachment #670186 - Flags: review?(jmaher)
Comment on attachment 670186 [details] [diff] [review]
let's try this again

Review of attachment 670186 [details] [diff] [review]:
-----------------------------------------------------------------

looks good.  A lot of changes, and I would really like to get a bug on file for removing the old deprecated test stuff that we are adding from remotePerfConfigurator here.
Attachment #670186 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #57)
> Comment on attachment 670186 [details] [diff] [review]
> let's try this again
> 
> Review of attachment 670186 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> looks good.  A lot of changes, and I would really like to get a bug on file
> for removing the old deprecated test stuff that we are adding from
> remotePerfConfigurator here.

bug 800421
pushed to try and praying: https://tbpl.mozilla.org/?tree=Try&rev=bb04a322f552
Try run for bb04a322f552 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=bb04a322f552
Results (out of 28 total builds):
    success: 26
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jhammel@mozilla.com-bb04a322f552
The single failure (don't believe RelEng bot's lies!) is ts:

https://tbpl.mozilla.org/php/getParsedLog.php?id=16020719&tree=Try&full=1

ABICT, this is https://bugzilla.mozilla.org/show_bug.cgi?id=799960

:jmaher, I'd appreciate it if you could look at this and ensure that there is nothing else weird going on.  Other than that, I think we can deploy the landing gear and put the seats in a full upright position
lets land this!
Thanks, Joel.

pushed: http://hg.mozilla.org/build/talos/rev/830ec0525b07 \o/

Let's hope it sticks.

I'll update directions about running talos on android and file some follow-up bugs about mozharness using this, etc.  We should probably give a little thought to what if anything downstream should change because of this, though those are the two things popping to mind.

Also, let's take a step back:  what does this change mean?/why this change?

So this enables *all* talos runs to be a single-stage process.  This means, while it is still supported, you never have to generate a configuration file (YAML, though JSON is also supported just no one uses it because no one knows about it).  Formerly this was true for desktop.  Now its true for mobile as well \o/

Why should I care? you may ask.  While this is a huge convenience for developers, the big win here is that once we get automation to not have to generate configuration we will no longer have to depend on YAML at all (or JSON).
I'm glad this has worked out. Kudos to nebelhom@hotmail.com and Jeff Hammel!
Yes, many thanks for the patch and your extraordinary patience, nebelhom :)

In the future, I think we should avoid marking any talos bug that requires android testing a good first bug.  For one, while many android-related bugs could be good first bugs, if a contributor doesn't have an android device then testing will be, at best, a PITA. Secondly, last I checked it was still annoying to setup an android device for at least Talos testing (rooting, installing sutagent, etc). I don't really want to turn bugzilla into a "setup your device for testing" forum.  For yet a third, I personally often have issues testing talos on android myself that I have not been able to figure out and that do not appear in production.  If someone has a device and wants to work on android and stops by #ateam, it shouldn't be too hard to find something them something to work on, but I don't think we should broadcast these as good first bugs.
updated https://wiki.mozilla.org/Mobile/Fennec/Android#talos

filed bug 800623

I think those are the important two.  Closing
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Awesome! Towards the end, I had no clue what was going on... but AWESOME!!! :D

Thanks for being patient "with me". Now on to new shores (other bugs) :)
I'm happy to explain any outstanding issues you didn't understand, nebelhom.  Mostly it was me unbitrotting your patch, and trouble-shooting on android, mostly by generating config files with the patched version and with an unpatched version (which sadly you couldn't do since it requires and android device).  But, seriously, this was a pretty big effort and I congratulate you in persevering.
Depends on: 801633
@jhammel: I will take you up on that offer. Will contact you in IRC again. It is mostly little insights about change of variables and the like that I wasn't able to follow.

I'm glad the patch landed :)
I currently suspect this of contributing to extraneous logcat output in desktop firefox talos runs.

[14:59]	<jmaher>	whoa, android log for fedora 12? https://tbpl.mozilla.org/php/getParsedLog.php?id=16313032&tree=Firefox&full=1
ahh, when I uploaded my last talos.zip there was a logcat.log inside there and it is getting printed out each and every time.
You need to log in before you can comment on or make changes to this bug.