Closed Bug 729392 Opened 12 years ago Closed 12 years ago

Install toolchain needed for SPDY testing onto test machines

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: armenzg)

Details

(Whiteboard: [opsi][puppet])

Attachments

(2 files)

per meeting with Nick, Patrick on 01feb2012. (sorry for the delayed writeup, chime in if I missed anything.)


To verify that the new SPDY protocol in Firefox13 works, Nick+Patrick want to enable a new test on m-c and all related branches. The node_speedy.js / SPDY code has landed or will land in m-c, as part of the Firefox build. However, for now, it is only being tested manually, with the test code logic pref'd off, until the node binary is deployed.


Decisions:
1) RelEng to deploy binary node/node.exe to all test machines.
2) Nick will build one binary that they endorse, and give it to RelEng to start with. 
3) this binary won't change often, but it will change from time to time
4) per nick, patrick, this is confirmed to be only one binary file that needs to be deployed per machine - nothing else to install/no installer to run
5) the test suite will start the node/node.exe, and wait for it to be fully up before running tests
** need to decide exact location of binary, so test can be written to match
** this binary is *not* to be auto started on boot, so it will not impact other test suites run on the same machines.
6) For now, RelEng should just focus on getting this to work for desktop; no need to worry about android for now.
7) These new tests are run within mochitest xpcshell testsuite, so no tbpl or buildbot config changes are needed.
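Decision 5 (start the node binary, then wait for it to be fully up before running tests) could be sketched roughly as below. This is only an illustration, not the harness code: the function name, the ready marker string, and the stand-in server command are all assumptions made for the example, and a real harness would launch node/node.exe with the SPDY server script instead.

```python
import subprocess
import sys


def start_server_and_wait(cmd, ready_marker):
    """Start `cmd` and block until a stdout line contains `ready_marker`.

    Returns the running process. Raises RuntimeError if the process
    closes stdout (i.e. exits) before printing the marker.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    # readline() returns "" only at EOF, so this loop ends if the server dies.
    for line in iter(proc.stdout.readline, ""):
        if ready_marker in line:
            return proc
    proc.wait()
    proc.stdout.close()
    raise RuntimeError("server exited before becoming ready")


# Hypothetical stand-in for the node server: prints a ready line, then idles.
FAKE_SERVER = [
    sys.executable,
    "-c",
    "import time; print('server running', flush=True); time.sleep(60)",
]

if __name__ == "__main__":
    server = start_server_and_wait(FAKE_SERVER, "server running")
    print("server is up, tests could start now")
    server.terminate()  # the server is not left running after the suite
    server.wait()
    server.stdout.close()
```

Because the suite itself starts and stops the process, nothing needs to be auto-started on boot, which matches the note under decision 5.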




TODO / Open Questions:
1) (RelEng) where should the binary get installed on the test machines?
2) Can Releng put the binary onto a shared central location, and have puppet/opsi deploy from that one location to all slaves? (similar to how we deploy talos.zip). This makes it easier to deploy updates to node/node.exe.
3) (nick) to compile binary node/node.exe from source
4) (nick) will file bug for loaner machines in order to test it works on specific OS
5) RelEng to figure out how to deploy, maybe same technique as talos.zip?
6) when can we deploy? will ship in FF13. the sooner the better. Given recent landings on m-c, maybe the answer is "deploy as soon as possible"?
7) (RelEng) what part of m-c is on each test slave? ... is m-c/testing/ ?  (this was in my notes, but I don't recall why; nick/patrick, can you remind me?)
This all looks right to me. Once number (1) in the TODO above is resolved (specifically for linux x86 & x64 to start), I'll file a bug for loaner machines for those platforms to test my patches. OS X will probably come after that, and Windows will come last.

For number (7) above, we need to know what will be available just so we can ensure we're putting the source necessary for the tests in a place that the test slaves will be able to get to it (this is specifically for node-spdy and moz-spdy, see the patch in bug 719609 if you're really curious).
Can we get an ETA for this deployment?
Can this binary be bundled up in the tests zip file?
e.g. firefox-13.0a1.en-US.win32.tests.zip

How often would node.exe file change and need to be re-deployed?

If I wanted to run such test locally what would I need to do?

On which OSes does this already work?

hurley, which OSes would you need to loan? by when?

I will try to drive this to completion.
Component: Release Engineering → Release Engineering: Platform Support
Priority: -- → P3
QA Contact: release → coop
Hardware: x86 → All
Whiteboard: [opsi][puppet]
Nick - can you reply to Armen?
Armen, sorry for taking so long to get back to you.

We discussed putting the platform-specific binaries in the source tree, but decided against that (which is why we want to deploy it using whatever tools you guys have).

I can't imagine the binaries would need updating all that often, and barring any major issues, we could probably limit the updates fairly strongly. My gut feeling is no more than once per quarter, and likely less often, but I have no real basis for that.

To run the test locally, you would need to download and build node from git (https://github.com/joyent/node), build mozilla-central with my patch from bug 719609 applied, change the xpcshell.ini in netwerk/tests/unit to NOT disable test_spdy.js, run the spdy server manually (there are directions in testing/moz-spdy), and then run the unit test in the usual way for running a single xpcshell test.

Right now, this only works on linux (32 and 64 bit) and probably os x (I haven't tried all that recently). I would need to borrow one each of 32 and 64 bit linux. I'm currently waiting on information on what path the binary would live in so I can finish up the changes to the xpcshell harness to make it run the spdy server. After I know that information is when I would want to borrow the machines so I can make sure my final set of changes work. Can you answer that question for me, or is that something John would have to tell us?
Armen, I ran into John in MV yesterday, and he mentioned that my reference to github might have been a bit confusing. That's only something you have to do if you want to run the test on your local machine. I'll be responsible for providing binaries for use on the test slaves, and for making sure the xpcshell harness knows where/how to find the node binary (both on the test slaves and on personal machines). Sorry for any confusion that might have caused.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #4)
> I will try to drive this to completion.

Assigning to Armen.
Assignee: nobody → armenzg
I will be taking care of this.

Questions:
* Is there any reason that node/node.exe could not be zipped inside of the test packages? [1] Putting it there will make this bug trivial.
* Are we still aiming for Fedora/Fedora64 first?
* Will we have any sort of new parameter for mochitest/runtests.py?

For local location I suggest this:
* /home/cltbld/talos-slave/test/build since that is the path where we run tests [2]

[1] http://stage.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1333966360/firefox-14.0a1.en-US.linux-i686.tests.zip

[2]
python mochitest/runtests.py --appname=firefox/firefox-bin --utility-path=bin --extra-profile-file=bin/plugins --certificate-path=certs --autorun --close-when-done --console-level=INFO --symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64/1333957777/firefox-14.0a1.en-US.linux-x86_64.crashreporter-symbols.zip --total-chunks 5 --this-chunk 1 --chunk-by-dir 4
Priority: P3 → P2
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #9)
> I will be taking care of this.
> 
> Questions:
> * Is there any reason that node/node.exe could not be zipped inside of the
> test packages? [1] Putting it there will make this bug trivial.

Does this involve committing any binaries to mozilla-central? We were hoping to avoid doing so.

> * Are we still aiming for Fedora/Fedora64 first?

Yes.

> * Will we have any sort of new parameter for mochitest/runtests.py?

This is not a mochitest, it's an xpcshell test. As long as the binary is in a known location (which you suggest below), we should have no need to add a new argument, though it might be nice to be explicit about where the binary is (I imagine it would also make the code changes for xpcshell/runtests.py easier to pass review).

> For local location I suggest this:
> * /home/cltbld/talos-slave/test/build since that is the path where we run
> tests [2]
> 
> [1]
> http://stage.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-
> central-linux/1333966360/firefox-14.0a1.en-US.linux-i686.tests.zip
> 
> [2]
> python mochitest/runtests.py --appname=firefox/firefox-bin
> --utility-path=bin --extra-profile-file=bin/plugins --certificate-path=certs
> --autorun --close-when-done --console-level=INFO
> --symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-
> builds/mozilla-central-linux64/1333957777/firefox-14.0a1.en-US.linux-x86_64.
> crashreporter-symbols.zip --total-chunks 5 --this-chunk 1 --chunk-by-dir 4
After a quick call we determined the way forward:
* Nick to provide the binaries
* Nick to write a patch that will check for an environment variable for path to node
* Nick to push that patch to the try server (it will fail for xpcshell)
* Armen to test the try build on staging where we will have node

If the staging run does not go as expected we will meet again and figure that out.
Nick might need to access a machine if the staging run does not go as expected.

node could not be put inside of packaged tests as the code lives in github. We could go and build it on the build machines but I think we would be making things even more complicated.
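For illustration, the environment-variable check agreed on above might look something like the following sketch. The variable name MOZ_NODE_PATH and the function name are assumptions for this example; the thread does not state which name the patch uses.

```python
import os


def find_node(env=None, var_name="MOZ_NODE_PATH"):
    """Return the node path from the environment, or None if unusable.

    `var_name` is a hypothetical variable name chosen for this sketch.
    The path is accepted only if it points at an existing file.
    """
    env = os.environ if env is None else env
    path = env.get(var_name)
    if path and os.path.isfile(path):
        return path
    return None


if __name__ == "__main__":
    node = find_node()
    if node:
        print("Found node at %s" % node)
    else:
        print("node not found; SPDY tests would be skipped")
```

Reading the location from the environment keeps platform-specific paths out of the harness, so each slave type can point at its own copy of the binary.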
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #11)
> After a quick call we determined the way forward:
> * Nick to provide the binaries
> * Nick to write a patch that will check for an environment variable for path
> to node
> * Nick to push that patch to the try server (it will fail for xpcshell)
This is done.
https://tbpl.mozilla.org/?tree=Try&rev=1e3218bc291d

Now it is my turn.
We decided to use node.exe to make it easier when dealing with Windows (as odd as it looks).

[root@staging-puppet staging]# pwd
/N/staging
[root@staging-puppet staging]# find . -name node.exe
./fedora12-x86_64/test/home/cltbld/bin/node.exe
./fedora12-i686/test/home/cltbld/bin/node.exe

hurley, you can remove the hard coded path since we won't be using it (Windows and Mac slaves would have their own paths).
   47.23 +    testSlavePath = os.path.join('/home/cltbld/talos-slave/test/build', 'node.exe')

I will get you logs today.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #14)
> hurley, you can remove the hard coded path since we won't be using it
> (Windows and Mac slaves would have their own paths).
>    47.23 +    testSlavePath =
> os.path.join('/home/cltbld/talos-slave/test/build', 'node.exe')

Will do. That will probably make it more palatable for review purposes, anyway :)

> I will get you logs today.

Sounds good.
Armen, for the sake of completeness, I've started a new try run (doing just xpcshell stuff) with the removal of the hard-coded path at https://tbpl.mozilla.org/?tree=Try&rev=f82b3238c587

If you want builds/test zips from this, they'll be at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/hurley@mozilla.com-f82b3238c587
hurley, can you verify that this is running well?

I see "Found node at /home/cltbld/bin/node.exe" in
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334328835.1334330859.30267.gz&fulltext=1

I will try the new try builds as well as testing on Linux 64-bit.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #17)
> hurley, can you verify that this is running well?
> 
> I see "Found node at /home/cltbld/bin/node.exe" in
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334328835.
> 1334330859.30267.gz&fulltext=1
> 
> I will try the new try builds as well as testing on Linux 64-bit.

Hrm, it looks like it should have run just fine, given that it obviously found node. However, if you search for "test_spdy.js" later on in the file, you'll see it says

TEST-INFO | skipping /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | run-if: hasNode

so the hasNode property didn't get properly set. It looks like, while it found the node executable, it was unable to find the node code I wrote, which I just realized is because the xpcshell tests only unpack bin/ certs/ and xpcshell/ from the tests zipfile. My original patch puts the other stuff required in the top-level of the tests zip, but I can easily put them under xpcshell. This should mean you won't have to make any more changes on your end, and it's really no more difficult from my end.

I'll get that change made, and get new try builds running. There's no point in you trying the ones I started earlier today, since we know they won't work.
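A simplified sketch of why the test was skipped even though node was found: the property has to depend on both the binary and the server code being present. The function names here are hypothetical, and the run-if handling is reduced to a single property name, whereas the real manifest conditions are richer.

```python
import os


def compute_has_node(node_path, moz_spdy_dir):
    """True only if both the node binary and moz-spdy.js are present."""
    if not node_path or not os.path.isfile(node_path):
        return False
    return os.path.isfile(os.path.join(moz_spdy_dir, "moz-spdy.js"))


def should_run(test, properties):
    """Evaluate a simplified run-if condition (a single property name)."""
    condition = test.get("run-if")
    return condition is None or bool(properties.get(condition, False))


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    props = {"hasNode": compute_has_node("/path/to/node.exe", "/path/to/moz-spdy")}
    test = {"name": "test_spdy.js", "run-if": "hasNode"}
    print("run" if should_run(test, props) else "skip")
```

With the node binary deployed but the server script missing from the unpacked directories, the first function returns False and the test is skipped, which is the behaviour seen in the log.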
I think it is skipping again.

> TEST-INFO | skipping /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | run-if: hasNode
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334335517.1334337567.16985.gz&fulltext=1
The revision is the correct one:
d4a1d50a9c4f
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #20)
> I think it is skipping again.
> 
> > TEST-INFO | skipping /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | run-if: hasNode
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334335517.
> 1334337567.16985.gz&fulltext=1
> The revision is the correct one:
> d4a1d50a9c4f

*sigh* that would be because it's not bundling the stuff into the zipfile. /me facepalms

I'll get that fixed, then fingers crossed it should all work (everything looks good so far)
I've got another try build running now that should package it up. I won't bug you to test it again until I verify the required files are actually in the test zipfile :)
Alright, it looks like everything's getting packaged up properly this time. Linux64 has its tests.zip ready, 32-bit still has its build pending. Results are at https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/hurley@mozilla.com-ab5eb07624b0/ tbpl link is https://tbpl.mozilla.org/?tree=Try&rev=ab5eb07624b0
Attachment #614770 - Attachment description: [wip] env variables for Linux testers → env variables for Fedora testers
Attachment #614770 - Flags: review?(coop)
Attachment #614773 - Attachment description: [wip] deploy node.exe to fedora12 (32/64-bit) test slaves → deploy node.exe to fedora12 (32/64-bit) test slaves
Attachment #614773 - Flags: review?(coop)
hurley, we should have the results showing up over here in the next hour:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaTest

I won't be around by the time they finish.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #25)
> It seems that it failed:
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334350211.
> 1334352275.23479.gz&fulltext=1
> 
> More for me next week.

Armen, what version of python is running on the test servers? There's obviously some difference between what's available in the version I have on my laptop and the version that's on the servers.

For the record, the test and everything succeeds, it's just the shutdown of the server that fails. Should be an easy fix once I find the differences.
Never mind my question, I figured out what the problem is. I'm working up a solution to it now.
OK, new try run going to build the new tests.zip with the changes to (hopefully) fix this problem. Output will be at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/hurley@mozilla.com-21d2992ff82c when it's ready.
Is it working now?
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334585565.1334587617.26554.gz&fulltext=1

  inflating: xpcshell/moz-spdy/spdy-ca.pem  
  inflating: xpcshell/moz-spdy/spdy-cert.pem  
  inflating: xpcshell/moz-spdy/moz-spdy.js  
  inflating: xpcshell/moz-spdy/spdy-key.pem  
  inflating: xpcshell/moz-spdy/README.txt  
...
Found node at /home/cltbld/bin/node.exe
Found moz-spdy at /home/cltbld/talos-slave/test/build/xpcshell/moz-spdy/moz-spdy.js
Node SPDY server running ...
...
TEST-INFO | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | running test ...
TEST-PASS | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | test passed (time: 1675.512ms)
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #29)
> Is it working now?
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1334585565.
> 1334587617.26554.gz&fulltext=1

Yep! Everything looks good in the log.
Cool!
I will look into enabling this in the morning.
Pending the reviews.

hurley, what comes next?
Attachment #614770 - Flags: review?(coop) → review+
Assuming my patches pass review (we should probably hold off on deploying these changes just in case the reviewer decides we should be doing something wildly different on my end), and your (and my) changes land, that should be it for this round. I'll have to request approval for aurora (and possibly beta, depending on how fast things move) for my stuff, but AFAIK, that won't affect you, right? (SPDY is already on by default in current aurora, so we want to have those tests running on those builds, as well, and if it takes more than a week for all this to land, we'll have to get it on beta, too).

Eventually we'll be adding similar support for OS X and Windows, but I can open other bugs for those platforms once I've got things ready for them (OS X should be pretty much the same, I just have to build node, Windows may be a bit trickier from my end, depending on how cooperative the node build feels like being).
Attachment #614773 - Flags: review?(coop) → review+
Comment on attachment 614770 [details] [diff] [review]
env variables for Fedora testers

I will check-in the puppet changes in the morning.
I will also reconfigure the masters in the morning.
http://hg.mozilla.org/build/buildbotcustom/rev/ca08506c9ab3
Attachment #614770 - Flags: checked-in+
(In reply to Nick Hurley [:hurley] from comment #32)
> Assuming my patches pass review (we should probably hold off on deploying
> these changes just in case the reviewer decides we should be doing something
> wildly different on my end), and your (and my) changes land, that should be
> it for this round. 
OK. I will wait.

If you need this deployed while I am away (Wed. 18th gone & Wed. 25th back), you should send an email to the thread we had a while ago and they will land this for you.
Attachment #614770 - Flags: checked-in+ → checked-in-
(In reply to Nick Hurley [:hurley] from comment #32)
> Assuming my patches pass review (we should probably hold off on deploying
> these changes just in case the reviewer decides we should be doing something
> wildly different on my end), and your (and my) changes land, that should be
> it for this round. 
>
Which bug should we block on? (wherever the reviews are happening)
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #35)
> Which bug should we block on? (wherever the reviews are happening)

It's bug 719609 (currently blocked on this bug, but at this point I think it makes sense to have it the other way around)
No longer blocks: 719609
Depends on: 719609
Can I get the try builds (still running) at https://tbpl.mozilla.org/?tree=Try&rev=0e57865ceab6 tested with Armen's changes, just to make sure my (slight) updates to the patches there didn't break anything? Build output link: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/hurley@mozilla.com-0e57865ceab6

Even if it does break, I don't think there should be any updates required from releng (it should be all on me), but it'll make me feel better to know things will work as expected :)
Looks good?
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1335371905.1335373948.18690.gz&fulltext=1

TEST-INFO | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | running test ...
TEST-PASS | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | test passed (time: 1642.526ms)

I would like to start deploying tomorrow morning if possible.
Looks good! In reality, we can deploy this and land bug 719609 in any order, but of course the sooner we get this all done, the better for testing of SPDY. Tomorrow morning should be just fine for deploying.
Attachment #614770 - Flags: checked-in- → checked-in+
Attachment #614773 - Flags: checked-in+
The fedora slaves should pick up the changes in the next couple of hours.
rail will be doing a scheduled reconfiguration at 10AM PDT which should turn everything live.
Comment on attachment 614773 [details] [diff] [review]
deploy node.exe to fedora12 (32/64-bit) test slaves

I had to back this out since I deployed it incorrectly.
I will try again either today or tomorrow.
Attachment #614773 - Flags: checked-in+ → checked-in-
Comment on attachment 614770 [details] [diff] [review]
env variables for Fedora testers

http://hg.mozilla.org/build/buildbotcustom/rev/0f6003269247

I backed out the patch because the env variable set in it is used by http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/runxpcshelltests.py#406
Attachment #614770 - Flags: checked-in+ → checked-in-
Attachment #614770 - Flags: checked-in- → checked-in+
Attachment #614773 - Flags: checked-in- → checked-in+
There is going to be a reconfig today.
In production
I assume this is fixed since I have seen results of it.

Please let me know if this is not correct and when you are ready for the next OS.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard