864940 - Test a registry script for automation registration

Steps for testing:

- Through Default Programs, set IE as the default browser.
- Purge the system of all Firefox-related registrations using the reset script. You'll have to search HKEY_CLASSES_ROOT for additional app ids to add to the list of entries that the script deletes.
- Download the latest opt build of Firefox (not the PGO build; there's a problem with those builds that I need to file a follow-up on): http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/1366736193/firefox-23.0a1.en-US.win32.zip
- Copy the firefox folder into a new directory on the C drive: c:\slave\test\build\application
- Download the latest tests zip: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/1366736193/firefox-23.0a1.en-US.win32.tests.zip
- From within the bin subfolder of the zip, copy metrotestharness.exe into the firefox subfolder from the previous step.
- Run the register script attached here.
- Open Default Programs and set 'MozillaTestBrowser' as the default browser.
- Open a cmd shell in the firefox folder above, and try running: metrotestharness -url http://www.mozilla.org/

Jim Mathies [:jimm]

Reporter

Comment 4

•

12 years ago

I know it's a pain but would you mind taking these steps for a spin? This registration script is going to go out to all of our test slaves, so we need to be sure it is working.

Flags: needinfo?(netzen)

Brian R. Bondy [:bbondy]

Comment 5

•

12 years ago

Yup I'll take it for a spin on a VM of mine but it will likely not be until tomorrow.

Flags: needinfo?(netzen)

Jim Mathies [:jimm]

Reporter

Comment 6

•

12 years ago

(In reply to Brian R. Bondy [:bbondy] from comment #5)
> Yup I'll take it for a spin on a VM of mine but it will likely not be until
> tomorrow.

np, thanks!

Jim Mathies [:jimm]

Reporter

Comment 7

•

12 years ago

Attached file register script — Details

Updated the path to CommandExecuteHandler.exe.

Attachment #740976 - Attachment is obsolete: true

Brian R. Bondy [:bbondy]

Comment 8

•

12 years ago

Hi Jim, I followed the steps with your new attachment and it successfully launches the browser in the metro environment at the specified URL.

Jim Mathies [:jimm]

Reporter

Comment 9

•

12 years ago

Attached patch patch — Details — Splinter Review

Thanks. Let's check these in so we have them in a repo.

Attachment #741361 - Flags: review?(netzen)

Brian R. Bondy [:bbondy]

Updated

•

12 years ago

Attachment #741361 - Flags: review?(netzen) → review+

Q

Comment 10

•

12 years ago

Attached file Reg file converted to GPO usable XML — Details

This can be directly imported into a registry preference GPO

Q

Comment 11

•

12 years ago

Re-imaged t-w864-ix-001.wintest.releng.scl3.mozilla.com and applied the registry script and auto association script. The machine should be tested for general roll out.

Q

Comment 12

•

12 years ago

Attached file Auto Assoication XML for GPO — Details

Armen [:armenzg]

Assignee

Comment 13

•

12 years ago

Taking. I will test it on staging.

Assignee: jmathies → armenzg

Armen [:armenzg]

Assignee

Comment 14

•

12 years ago

Attached file log — Details

I don't think this has worked. What do you think?

Attachment #746941 - Flags: feedback?(jmathies)

Jim Mathies [:jimm]

Reporter

Comment 15

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #14)
> Created attachment 746941 [details]
> log
>
> I don't think this has worked. What do you think?

No it didn't.

TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254

0x80270254 is E_APPLICATION_NOT_REGISTERED. So something isn't connected up right. Can I get access to this slave so I can poke around a bit?

Armen [:armenzg]

Assignee

Comment 16

•

12 years ago

Q, could you please do the magic for jimm to look at this machine? Thanks!

Assignee: armenzg → q

Jim Mathies [:jimm]

Reporter

Comment 17

•

12 years ago

(In reply to Q from comment #10)
> Created attachment 746788 [details]
> Reg file converted to GPO usable XML
>
> This can be directly imported into a registry preference GPO

I don't see the AppID getting created here. I see it getting deleted (from the cleanup code in the original reg script) but it doesn't look like it gets recreated.

<Collection clsid="{53B533F5-224C-47e3-B01B-CA3B3F3FF4BF}" name="AppID">
  <Collection clsid="{53B533F5-224C-47e3-B01B-CA3B3F3FF4BF}" name="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}">
    <Registry clsid="{9CD4B2F4-923D-47f5-A062-E897DD1DAD50}" name="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" status="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" image="3" changed="2013-05-08 01:43:33" uid="{7A28A1D3-AF19-E121-FEE0-ECCF3B2481FB}">
      <Properties action="D" displayDecimal="0" default="0" hive="HKEY_CLASSES_ROOT" key="AppID\{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" name="" type="REG_SZ" value=""/>
      <Filters/>
    </Registry>
  </Collection>
</Collection>

I think action="D" means delete. More generally, we could remove all the delete code from this if it's screwing up the GPO script generation and just update existing data.
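One way to audit the converted file for leftover delete entries is to scan it for Properties elements with action="D". The following is a minimal sketch, assuming the GPO XML has the shape of the fragment quoted in this comment; the function name `delete_actions` and the embedded sample are illustrative and not part of any attachment.

```python
import xml.etree.ElementTree as ET

# Sample copied from the fragment quoted above (raw string keeps the backslash).
SAMPLE = r"""<Collection clsid="{53B533F5-224C-47e3-B01B-CA3B3F3FF4BF}" name="AppID">
  <Collection clsid="{53B533F5-224C-47e3-B01B-CA3B3F3FF4BF}" name="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}">
    <Registry clsid="{9CD4B2F4-923D-47f5-A062-E897DD1DAD50}" name="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" status="{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" image="3" changed="2013-05-08 01:43:33" uid="{7A28A1D3-AF19-E121-FEE0-ECCF3B2481FB}">
      <Properties action="D" displayDecimal="0" default="0" hive="HKEY_CLASSES_ROOT" key="AppID\{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}" name="" type="REG_SZ" value=""/>
      <Filters/>
    </Registry>
  </Collection>
</Collection>"""


def delete_actions(xml_text):
    """Return (hive, key) pairs for every Properties element with action="D"."""
    root = ET.fromstring(xml_text)
    return [
        (props.get("hive"), props.get("key"))
        for props in root.iter("Properties")
        if props.get("action") == "D"
    ]


if __name__ == "__main__":
    for hive, key in delete_actions(SAMPLE):
        print(hive, key)
```

Any key this reports without a matching non-delete entry elsewhere in the file would be a candidate for the missing registration described above.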

Jim Mathies [:jimm]

Reporter

Comment 18

•

12 years ago

Looks like CLSID\{5100FEC1-212B-4BF5-9BF8-3E650FD794A3} may have the same problem.
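To check on the slave whether the two keys named so far actually exist, something like the sketch below could be run under the cltbld account. It is only an illustration: `hkcr_key_exists` and `missing_keys` are hypothetical helper names, and only the two key paths come from this bug. The lookup is injectable so the logic can be exercised off Windows.

```python
import sys

GUID = "{5100FEC1-212B-4BF5-9BF8-3E650FD794A3}"
EXPECTED_KEYS = ["AppID\\" + GUID, "CLSID\\" + GUID]


def hkcr_key_exists(subkey):
    """Return True if the subkey can be opened under HKEY_CLASSES_ROOT (Windows only)."""
    import winreg  # imported lazily so this module still loads on other platforms

    try:
        winreg.CloseKey(winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey))
        return True
    except OSError:
        return False


def missing_keys(keys, exists=hkcr_key_exists):
    """Return the keys for which the lookup function reports no registry entry."""
    return [key for key in keys if not exists(key)]


if __name__ == "__main__" and sys.platform == "win32":
    print("missing:", missing_keys(EXPECTED_KEYS))
```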

Jim Mathies [:jimm]

Reporter

Comment 19

•

12 years ago

These entries may be propagated up from HKEY_LOCAL_MACHINE data, so I'm not sure. I guess it depends on the order in which this gets inserted.

Q

Comment 20

•

12 years ago

Magic done just in case jimm wants to take a look. Removing the delete actions sounds like a reasonable workaround, since the order of operations can be different in a GPO than in a straight reg script. I can go and change the delete actions in the GPO and we can test if need be, then back-update the script for posterity's sake.

Jim Mathies [:jimm]

Reporter

Comment 21

•

12 years ago

So the machine in question is "t-w864-ix-001.wintest.releng.scl3.mozilla.com"? Usually to connect to a test slave I'd need RDP login info and an ip address I can plug into my vpn software.

Q

Comment 22

•

12 years ago

That is the correct machine name; the IP is 10.26.40.31. RDP is disabled in Windows 8; however, VNC works, and the "magic" Armen referred to resets the VNC and cltbld passwords. I will back channel those to you.

Jim Mathies [:jimm]

Reporter

Updated

•

12 years ago

Depends on: 870012

Armen [:armenzg]

Assignee

Comment 23

•

12 years ago

I have manually changed the Administrator's password to match that of cltbld, since jimm needed it. Doing so nevertheless required me to re-activate the UAC prompts. Not sure if this will invalidate any testing.

Jim Mathies [:jimm]

Reporter

Comment 24

•

12 years ago

Hrm, so I'm not having any problems running tests after we rebooted this slave. But armen is still unable to get the tests running and is seeing the same registration error. Seems like this might have something to do with accounts.

I'm running tests under cltbld, from a windows command prompt with cltbld permissions using:

C:\slave\test\build\venv\Scripts\python -u C:\slave\test\build\tests\mochitest\runtests.py --appname=C:\slave\test\build\application\firefox\firefox.exe --utility-path=tests/bin --extra-profile-file=tests/bin/plugins --certificate-path=tests/certs --close-when-done --autorun --console-level=INFO --browser-chrome --metro-immersive

What might be different between this and when automation tries to run the same tests?

One other interesting note - when I tried running tests from an admin prompt under cltbld the browser wouldn't start. From what I could tell the browser was trying to access a profile directory under the administrator's user directory which it didn't have access to. This resulted in a black screen after launch with no test runs. However I haven't reproduced the E_APPLICATION_NOT_REGISTERED error at all.

Jim Mathies [:jimm]

Reporter

Comment 25

•

12 years ago

Oh, also - c:\mozilla-build\python27\python -u scripts/scripts/desktop_unittest.py --cfg unittests/win_unittest.py --mochitest-suite metro-immersive --download-symbols ondemand works as well under cltbld.

Jim Mathies [:jimm]

Reporter

Comment 26

•

12 years ago

Hrm, looks like the reboot reset the admin pass. I was going to try running tests as admin under cltbld but the password isn't working.

Armen [:armenzg]

Assignee

Comment 27

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #26)
> Hrm, looks like the reboot reset the admin pass. I was going to try running
> tests as admin under cltbld but the password isn't working.

I tried fixing the Administrator password again but I'm unable to :S

Jim Mathies [:jimm]

Reporter

Comment 28

•

12 years ago

I'm really curious how test runs launch on these slaves. I think we are having problems with mixed account access. For example, does the process that launches a test run execute under the cltbld user or as an admin?

Armen [:armenzg]

Assignee

Comment 29

•

12 years ago

I don't have the last word but this is how I have seen it work before. A task is added on the Admin account, which specifies that when the cltbld logs in, it will start a process with the highest privileges.

Jim Mathies [:jimm]

Reporter

Comment 30

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #29)
> I don't have the last word but this is how I have seen it work before.
>
> A task is added on the Admin account, which specifies that when the cltbld
> logs in, it will start a process with the highest privileges.

By 'highest privileges' do you mean administrator level? I'm curious what account privs the tests actually run under. From the tests I've run under this scenario we apparently have some access issues with the profile directory, and potentially with getting the browser launched.

If we're running tests while logged in as cltbld the tests should probably run as cltbld. We should confirm this is the case, and if not, try to get it working that way.

Armen [:armenzg]

Assignee

Comment 31

•

12 years ago

It is similar to this:

wget -OC:\\slave\\talosslave.xml "http://people.mozilla.com/~armenzg/win7/talosslave.xml"
schtasks /create /tn talosslave /xml "C:\slave\talosslave.xml"

<RunLevel>HighestAvailable</RunLevel></Principal>

I don't think it necessarily runs as the Administrator user, but with Admin-like privileges. I don't know exactly what Q runs on the win8 machines but it should be similar.

As far as I know, there are some jobs that require high privileges. I don't remember which, but I could test without it.

Also to point out, we have been able to run the metro jobs on this same machine and with the same start up task. I don't know what is different this time.

I can try triggering the jobs while I'm VNCed into the machine. What would you like me to try?
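For anyone wanting to confirm which run level a given task definition requests, here is a small sketch that reads RunLevel out of a Task Scheduler XML export. The sample document is hypothetical (only the RunLevel/Principal fragment is quoted in this bug, and the cltbld UserId is an assumption); the namespace is the standard Task Scheduler schema namespace.

```python
import xml.etree.ElementTree as ET

TASK_NS = {"t": "http://schemas.microsoft.com/windows/2004/02/mit/task"}

# Hypothetical minimal task definition; not the real talosslave.xml.
SAMPLE_TASK = """<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <Principals>
    <Principal id="Author">
      <UserId>cltbld</UserId>
      <RunLevel>HighestAvailable</RunLevel>
    </Principal>
  </Principals>
</Task>"""


def run_levels(task_xml):
    """Return (UserId, RunLevel) for each Principal; empty string if an element is absent."""
    root = ET.fromstring(task_xml)
    return [
        (
            principal.findtext("t:UserId", default="", namespaces=TASK_NS),
            principal.findtext("t:RunLevel", default="", namespaces=TASK_NS),
        )
        for principal in root.iterfind(".//t:Principal", TASK_NS)
    ]


if __name__ == "__main__":
    print(run_levels(SAMPLE_TASK))
```

Note that a file exported by schtasks may be UTF-16 encoded, so it would likely need to be opened with that encoding before being passed in as text.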

Jim Mathies [:jimm]

Reporter

Comment 32

•

12 years ago

I'd suggest confirming this works for you under cltbld privs:

> c:\mozilla-build\python27\python -u scripts/scripts/desktop_unittest.py
> --cfg unittests/win_unittest.py --mochitest-suite metro-immersive
> --download-symbols ondemand

Then maybe try doing the same with automation to see if it acts differently. The only difference I can think of would be the account under which we launch the runs.

Q

Comment 33

•

12 years ago

If you need admin privs try using "root" instead of "administrator" with the loaner password

Q

Comment 34

•

12 years ago

We do run the tasks with registry changing and UAC elevated privileges. Originally this was done to try to combat the UAC prompts before we figured out that a certain executable was named with verboten windows words.

Armen [:armenzg]

Assignee

Comment 35

•

12 years ago

Attached patch log — Details — Splinter Review

I modified the task in the task manager and ran buildbot without "highest privileges" and I got the same issue. It actually runs as the "cltbld" user.

What else could we try?

Jim Mathies [:jimm]

Reporter

Comment 36

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #35)
> Created attachment 747508 [details] [diff] [review]
> log
>
> I modified the task in the task manager and I run buildbot without "highest
> privileges" and I got the same issue. It actually runs as "cltbld" user.
>
> What else could we try?

WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | CoAllowSetForegroundWindow result 80070005

This is different. I think this fails because the command window isn't in the foreground. We can try to address this in the test harness.
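As a reading aid for the two result codes seen so far (80270254 and 80070005), this sketch splits an HRESULT into its standard fields: the failure bit, the 11-bit facility, and the 16-bit code. The helper name `decode_hresult` is made up for illustration. For 80070005 the fields are facility 7 (Win32) and code 5 (access denied), which fits a focus call being refused.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into failure flag, facility and code fields."""
    return {
        "failure": bool((value >> 31) & 1),  # bit 31: severity
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility
        "code": value & 0xFFFF,              # bits 0-15: error code
    }


if __name__ == "__main__":
    for text in ("80270254", "80070005"):
        print(text, decode_hresult(int(text, 16)))
```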

Jim Mathies [:jimm]

Reporter

Comment 37

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #36)
> (In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment
> #35)
> > Created attachment 747508 [details] [diff] [review]
> > log
> >
> > I modified the task in the task manager and I run buildbot without "highest
> > privileges" and I got the same issue. It actually runs as "cltbld" user.
> >
> > What else could we try?
>
> WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe |
> CoAllowSetForegroundWindow result 80070005
>
> This is different. I think this fails because the command window isn't in
> the foreground. We can try to address this in the test harness.

With your testing if you can make sure the test output console is in the foreground it will allow you to get farther in. I'll land a patch on central that makes this a non-fatal error.

Q

Comment 38

•

12 years ago

Armen can you run again? I just double checked and made sure all of the registry changes were propagated with no deletes.

Jim Mathies [:jimm]

Reporter

Comment 39

•

12 years ago

Hrm, I can't make this non-fatal. If the console trying to launch metrotestharness isn't in the foreground, it'll fail to launch the browser. :/

Armen [:armenzg]

Assignee

Comment 40

•

12 years ago

By having cmd in the foreground I can see the tests running. I will take a screenshot once the machine reboots so we can see what is up and running. 10:48:28 INFO - TinderboxPrint: mochitest-metro-immersive 550/16/1

Jim Mathies [:jimm]

Reporter

Comment 41

•

12 years ago

Attached patch focus patch — Details — Splinter Review

Attachment #747520 - Flags: review?(netzen)

Jim Mathies [:jimm]

Reporter

Comment 42

•

12 years ago

So we can touch this up somewhat. What's the state of these machines before tests run? Do they have the desktop loaded and displayed or do they have the immersive interface displayed?

Jim Mathies [:jimm]

Reporter

Updated

•

12 years ago

Whiteboard: [leave-open]

Armen [:armenzg]

Assignee

Comment 43

•

12 years ago

The CMD prompt is always in the back, behind the Libraries window: http://cl.ly/OrZ6

Jim Mathies [:jimm]

Reporter

Comment 44

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #43)
> The CMD prompt is always on the back behind the Libraries window:
> http://cl.ly/OrZ6

Seems kind of random, I wonder why that library window is open? Regardless, if it always starts out this way then explorer will have the foreground focus and we don't need to make the call that's failing. My patch accomplishes this.

Q

Comment 45

•

12 years ago

The Libraries window opens because of a hack to "Show the desktop": http://www.chrisnackers.com/2013/02/06/windows-8-show-desktop-at-logon/

Jim Mathies [:jimm]

Reporter

Comment 46

•

12 years ago

(In reply to Q from comment #45)
> The library windows open because of a hack to "Show the desktop":
> http://www.chrisnackers.com/2013/02/06/windows-8-show-desktop-at-logon/

Ok, that's actually good - having explorer in the foreground solves the problem since explorer launches metrofx. The patch I posted will need to land and then we can retest. Sounds like we are almost there!

Brian R. Bondy [:bbondy]

Updated

•

12 years ago

Attachment #747520 - Flags: review?(netzen) → review+

Jim Mathies [:jimm]

Reporter

Comment 47

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/f8c1e234d939

Armen [:armenzg]

Assignee

Comment 48

•

12 years ago

I've triggered that changeset on staging. I will review the results in the morning.

Ed Morley [:emorley]

Comment 49

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/f8c1e234d939

Armen [:armenzg]

Assignee

Comment 50

•

12 years ago

I've got this:

05:08:22 INFO - INFO | automation.py | SSL tunnel pid: 1132
05:08:22 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\metrotestharness.exe', '-no-remote', '-profile', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmpqdg3wo/', 'about:blank', '-firefoxpath', 'C:\\slave\\test\\build\\application\\firefox\\firefox.exe']
05:08:22 INFO - INFO | automation.py | Application pid: 3852
05:08:22 INFO - INFO | metrotestharness.exe | firefoxpath: 'C:\slave\test\build\application\firefox\firefox.exe'
05:08:22 INFO - INFO | metrotestharness.exe | args: '-no-remote -profile c:\users\cltbld~1.t-w\appdata\local\temp\tmpqdg3wo/ about:blank'
05:08:22 INFO - INFO | metrotestharness.exe | Launching browser...
05:08:22 INFO - INFO | metrotestharness.exe | App model id='E4CFE2E6B75AA3A3'
05:08:22 INFO - INFO | metrotestharness.exe | Harness process id: 3852
05:08:22 INFO - INFO | metrotestharness.exe | Writing out tests.ini to: 'C:\slave\test\build\application\firefox\tests.ini'
05:08:22 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254
05:08:22 INFO - INFO | metrotestharness.exe | Deleting C:\slave\test\build\application\firefox\tests.ini
05:08:22 INFO - INFO | automation.py | Application ran for: 0:00:00.668000
05:08:22 INFO - INFO | zombiecheck | Reading PID log: c:\users\cltbld~1.t-w\appdata\local\temp\tmp7dcfsupidlog
05:08:23 INFO - SUCCESS: The process with PID 1132 has been terminated.
05:08:23 INFO - ERROR: The process with PID 4004 could not be terminated.
05:08:23 INFO - Reason: There is no running instance of the task.
05:08:23 INFO - SUCCESS: The process with PID 2096 has been terminated.
05:08:23 INFO - WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
05:08:23 INFO - INFO | runtests.py | Running tests: end.
05:08:24 INFO - Return code: 0
05:08:24 INFO - TinderboxPrint: mochitest-metro-immersive T-FAIL
05:08:24 WARNING - # TBPL WARNING #
05:08:24 WARNING - The mochitest suite: metro-immersive ran with return status: WARNING
05:08:24 INFO - Copying logs to upload dir...
05:08:24 INFO - mkdir: C:\slave\test\build\upload\logs
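When comparing many staging logs like the one above, it may help to pull out just the harness failures. This is a rough sketch that assumes the "TEST-UNEXPECTED-FAIL | source | Call result HEX" line shape seen in this log; `unexpected_failures` is a hypothetical helper and the embedded sample is a three-line excerpt.

```python
import re

# Matches e.g. "TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254"
FAIL_RE = re.compile(
    r"TEST-UNEXPECTED-FAIL \| (?P<source>[^|]+?) \| "
    r"(?P<call>\w+) result (?P<hresult>[0-9A-Fa-f]{8})"
)

LOG = """\
05:08:22 INFO - INFO | metrotestharness.exe | Launching browser...
05:08:22 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254
05:08:24 INFO - TinderboxPrint: mochitest-metro-immersive T-FAIL
"""


def unexpected_failures(log_text):
    """Return (source, call, hresult) for each harness failure line in the log."""
    return [
        (m["source"], m["call"], int(m["hresult"], 16))
        for m in FAIL_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    for source, call, hresult in unexpected_failures(LOG):
        print(f"{source}: {call} failed with 0x{hresult:08X}")
```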

Jim Mathies [:jimm]

Reporter

Comment 51

•

12 years ago

So now we are back to 0x80270254 is E_APPLICATION_NOT_REGISTERED. Something is different between our manual runs of desktop_unittest.py and automation runs. My guess is it's permissions/accounts related.

Jim Mathies [:jimm]

Reporter

Comment 52

•

12 years ago

One thing I can confirm from the log is that automation is running under the right account, since the temp profile is in cltbld's user data folder: c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmpqdg3wo/ But it might be running with elevated permissions. If I can get the admin user/pass and cltbld's pass I can go in and try to reproduce what automation gets via a manual run.

Armen [:armenzg]

Assignee

Comment 53

•

12 years ago

The admin user is root. The password is the same as cltbld. FTR, I moved the cmd window to the foreground and it passed. Is it possible to move the CMD windows always to the foreground? or minimize the library window?

Jim Mathies [:jimm]

Reporter

Comment 54

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #53)
> The admin user is root. The password is the same as cltbld.
>
> FTR, I moved the cmd window to the foreground and it passed.
>
> Is it possible to move the CMD windows always to the foreground? or minimize
> the library window?

If the cmd window isn't in the foreground, it can't steal focus. If the Libraries window is in the foreground, explorer has the focus and shouldn't have any issues launching the browser. The error was "not registered", so I'm confused as to why foreground status on the cmd window plays into this.

Jim Mathies [:jimm]

Reporter

Comment 55

•

12 years ago

Also, if Libraries is in the foreground and potentially on top of desktop firefox when we run tests, I wonder how that might affect test or talos runs. We might want to file a bug on that to confirm firefox is fully displayed.

Jim Mathies [:jimm]

Reporter

Comment 56

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #53)
> The admin user is root. The password is the same as cltbld.
>
> FTR, I moved the cmd window to the foreground and it passed.
>
> Is it possible to move the CMD windows always to the foreground? or minimize
> the library window?

'root' / cltbld's pass didn't authenticate.

I was able to run again via a cltbld command window with the Library window in the foreground and everything worked. The only part of this I can do manually is testslave.py invoking desktop_unittest.py. Running desktop_unittest.py manually from the desktop works fine.

Armen [:armenzg]

Assignee

Comment 57

•

12 years ago

At the end of the road I believe we reach Process inside of _dumbwin32proc.py:
https://etherpad.mozilla.org/UeEIkhYgKk
http://hg.mozilla.org/build/twisted/file/3bdb54e31023/twisted/internet/_dumbwin32proc.py#l105

For the record, the _dumbwin32proc.py on this slave is slightly different than all the other machines. This was pointed out on bug 853609#c6. It only helps buildbot to kill processes. Without it buildbot cannot kill processes. I fixed it manually as of now (not sure if a reboot will take it away).

I've triggered the task and put the library directory in the front. Let's see what happens.

Armen [:armenzg]

Assignee

Comment 58

•

12 years ago

It seems to be running. Let me get out of the machine and wait for another job after the reboot. I will check what version of _dumbwin32proc.py is on the machine.

Armen [:armenzg]

Assignee

Comment 59

•

12 years ago

After rebooting I had the same problem. _dumbwin32proc.py is patched. It seems that if I start the task manually it works. Unless starting the task manually and then bringing the Desktop window to the foreground is not exactly the same thing.

Armen [:armenzg]

Assignee

Comment 60

•

12 years ago

I did something different this time. After a reboot, a failed job, I decided to just drag the desktop window a little to the side (just click and drag). After doing that action the next job succeeded. FTR, here's where we are failing to activate: http://mxr.mozilla.org/mozilla-central/source/browser/metro/shell/testing/metrotestharness.cpp#292

Armen [:armenzg]

Assignee

Comment 61

•

12 years ago

Is the error code significant? Can we add debug information on the log? Like which user we're logged in as? or available privileges?

Armen [:armenzg]

Assignee

Comment 62

•

12 years ago

I have removed rebooting from the equation. I'm going to trigger several jobs in a row and see if the first one fails and the following ones succeed. After that I'm open to suggestions on what to try next, as I feel that I'm getting tunnel vision.

Armen [:armenzg]

Assignee

Comment 63

•

12 years ago

It has worked the last few times. I will add rebooting back. Maybe patching _dumbwin32proc.py worked and I got confused on comment 59?

Jim Mathies [:jimm]

Reporter

Comment 64

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #61)
> Is the error code significant?
> Can we add debug information on the log? Like which user we're logged in as?
> or available privileges?

Not much information on it. Google has a note in their code about it similar to our assumption that it gets returned when the browser is not set as the default.

The focus stuff is really weird. If there was a registration problem, you would expect it to happen every time, even when run manually.

Jim Mathies [:jimm]

Reporter

Comment 65

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #63)
> It has worked the last few times. I will add rebooting back.
> Maybe patching _dumbwin32proc.py worked and I got confused on comment 59?

Or maybe the initial startup on reboot is the problem. What did you have to do to get this machine to work? Go in and play with window focus?

Armen [:armenzg]

Assignee

Comment 66

•

12 years ago

The things that I have done differently are:
* patch _dumbwin32proc.py
* play with Window focus

Out of the last 8 runs 2 have failed.

Do you think we could take a screenshot at the beginning of the run to compare things?

I don't know what to do or what to try.

Jim Mathies [:jimm]

Reporter

Comment 67

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #66)
> The things that I have done differently are:
> * patch _dumbwin32proc.py
> * play with Window focus
>
> Out of the last 8 runs 2 have failed.
>
> Do you think we could take a screenshot at the beginning of the run to
> compare things?
>
> I don't know what to do or what to try.

During these runs, what are the steps the slave took? Did it reboot for each run for example? Also, do we have logs for the two failures?

Armen [:armenzg]

Assignee

Comment 68

•

12 years ago

I've put the two logs in here:
http://people.mozilla.com/~armenzg/metro

After each job we reboot.

The steps are the same every time. Check out mozharness and run:

scripts/scripts/desktop_unittest.py --cfg unittests/win_unittest.py --mochitest-suite metro-immersive --download-symbols ondemand

which eventually runs this:

C:\\slave\\test\\build\\venv\\Scripts\\python', '-u', 'C:\\slave\\test\\build\\tests\\mochitest/runtests.py', '--appname=C:\\slave\\test\\build\\application\\firefox\\firefox.exe', '--utility-path=tests/bin', '--extra-profile-file=tests/bin/plugins', '--symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1368124676/firefox-23.0a1.en-US.win32.crashreporter-symbols.zip', '--certificate-path=tests/certs', '--autorun', '--close-when-done', '--console-level=INFO', '--browser-chrome', '--metro-immersive'

I'm running against the same build.

Jim Mathies [:jimm]

Reporter

Comment 69

•

12 years ago

Q, curious, with your registration scripts, does anything run on startup/login that might be delayed long enough to cause a test run startup failure due to missing registration?

Also, FYI Armen, I'd like to look at the registration on this machine again. However, this week I'm in Vancouver for a work week, so I won't be able to dig into this more until I get back home next week.

Flags: needinfo?(q)

Q

Comment 70

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #69)

Jim,

Registration should happen before any user action can take place. I can do some debugging to be 100% sure, but it should not be an issue.

Flags: needinfo?(q)

Jim Mathies [:jimm]

Reporter

Comment 71

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #66)
> The things that I have done differently are:
> * patch _dumbwin32proc.py
> * play with Window focus
>
> Out of the last 8 runs 2 have failed.
>
> Do you think we could take a screenshot at the beginning of the run to
> compare things?
>
> I don't know what to do or what to try.

So were these ten runs stand alone or did you have to go in and play with the machine? Can we do some sort of automated test where we run the tests - reboot - repeat for ~20 runs without messing with the slave to see what the failure rate is?

Armen [:armenzg]

Assignee

Comment 72

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #71)
> (In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment
> #66)
> > The things that I have done differently are:
> > * patch _dumbwin32proc.py
> > * play with Window focus
> >
> > Out of the last 8 runs 2 have failed.
> >
> > Do you think we could take a screenshot at the beginning of the run to
> > compare things?
> >
> > I don't know what to do or what to try.
>
> So were these ten runs stand alone or did you have to go in and play with
> the machine? Can we do some sort of automated test where we run the tests -
> reboot - repeat for ~20 runs without messing with the slave to see what the
> failure rate is?

I did not mess with any of those.

I will go and queue a lot of them so that the machine goes straight into taking jobs (rather than me triggering a couple of jobs manually every few minutes; since there is job coalescing, I will disable it).

Jim Mathies [:jimm]

Reporter

Comment 73

•

12 years ago

To see if this is a timing issue, we could land a debug patch that retries ActivateApplication over a period of 30 seconds or so, and see if it improves the success rate. Do we think this would be useful?
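The shape of such a retry loop is sketched below in Python purely for illustration; the real harness is C++ (metrotestharness.cpp), and `activate_with_retry`, its parameters, and the convention that 0 means success (S_OK) are assumptions made for this example, not the harness's actual code.

```python
import time

E_APPLICATION_NOT_REGISTERED = 0x80270254


def activate_with_retry(activate, timeout=30.0, interval=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Call activate() until it returns 0 or `timeout` seconds have passed.

    Returns (last_result, attempts) so the log can show how long it took.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        result = activate()
        if result == 0 or clock() >= deadline:
            return result, attempts
        sleep(interval)


if __name__ == "__main__":
    # Simulated activation that fails twice before succeeding.
    outcomes = iter([E_APPLICATION_NOT_REGISTERED, E_APPLICATION_NOT_REGISTERED, 0])
    print(activate_with_retry(lambda: next(outcomes), interval=0.01))
```

Logging the attempt count would show whether a failure is a short startup race (success after a few tries) or a persistent registration problem (failure for the whole window).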

Jim Mathies [:jimm]

Reporter

Comment 74

•

12 years ago

Alternatively, is there any way we can delay test startup on this slave for testing purposes? I think there's already an existing 30 second timeout; maybe we could up that to a minute or two to see if it helps?

Armen [:armenzg]

Assignee

Comment 75

•

12 years ago

I can manually adjust the file. I hope PGO would not overwrite it. Let me try it.

Armen [:armenzg]

Assignee

Comment 76

•

12 years ago

I have bumped the sleep step from 30 to 60 and have re-triggered a lot of consecutive jobs.

We had one more failure:
07:29:52 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254

End time of the previous job was 06:52:23 2013
Start time of the failing job was 07:27:11 2013

This means that the machine was up and waiting without taking a job for almost 30 minutes.

Could we dump expected information from the registry at the beginning of the run?
Adding more time before taking a job has not helped.

I would like to see the retrying of ActivateApplication to see if it fixes the issue.
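For reference, the gap between the two timestamps quoted here can be worked out as below. The helper name `gap` is illustrative, and it assumes both times fall on the same day. The raw difference presumably includes the reboot and the sleep step, so the time spent truly idle would be somewhat shorter than the figure it prints.

```python
from datetime import datetime


def gap(end_previous, start_next, fmt="%H:%M:%S"):
    """Return the time between two same-day clock readings as a timedelta."""
    return datetime.strptime(start_next, fmt) - datetime.strptime(end_previous, fmt)


if __name__ == "__main__":
    print(gap("06:52:23", "07:27:11"))  # 0:34:48
```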

Jim Mathies [:jimm]

Reporter

Comment 77

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #76)
> I have bumped the sleep step from 30 to 60 and have re-triggered a lot of
> consecutive jobs.
>
> We had one more failure: 07:29:52 WARNING - TEST-UNEXPECTED-FAIL |
> metrotestharness.exe | ActivateApplication result 80270254

One failure out of how many runs? I'm curious how often this happens.

> End time of the previous job was 06:52:23 2013
> Start time of the failing job was 07:27:11 2013
>
> This means that the machine was up and waiting without taking a job for
> almost 30 minutes.

So the machine was rebooted, cltbld logged in, and it sat idle for thirty minutes before a test run was initiated?

> Could we dump expected information from the registry at the beginning of the
> run?
> Adding more time before taking a job has not helped.
>
> I would like to see the retrying of ActiveApplication to see if it fixes the
> issue.

Maybe the next step should be to push runs to the slave until it fails, so we know we have a failed setup, and then go in and take a look at the slave config to try and figure out what's wrong. Maybe we can try to get this set up on Monday once I'm back from my work week.

Armen [:armenzg]

Assignee

Comment 78

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #77)
> (In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment
> #76)
> > I have bumped the sleep step from 30 to 60 and have re-triggered a lot of
> > consecutive jobs.
> >
> > We had one more failure: 07:29:52 WARNING - TEST-UNEXPECTED-FAIL |
> > metrotestharness.exe | ActivateApplication result 80270254
>
> One failure out of how many runs? I'm curious how often this happens.

If looking since May 10th, 4 failures out of 15 runs. Not sure if it matters, but 2 runs failed in a row, then many successful passes and then again 2 failures in a row.
Start times and end times in between do not give us any indication of what could be the reason (sitting idle, maybe a day passes between the last job, and so on).

> > End time of the previous job was 06:52:23 2013
> > Start time of the failing job was 07:27:11 2013
> >
> > This means that the machine was up and waiting without taking a job for
> > almost 30 minutes.
>
> So the machine was rebooted, ctlbld logged in and it sitting idle for thirty
> minutes before a test run was initiated?

Correct.

> > Could we dump expected information from the registry at the beginning of the
> > run?
> > Adding more time before taking a job has not helped.
> >
> > I would like to see the retrying of ActiveApplication to see if it fixes the
> > issue.
>
> Maybe the next step should be to push runs to the slave until it fails, so
> we know we have a failed setup, and then go in and take a look at the slave
> config to try and figure out what's wrong. Maybe we can try to get this set
> up on Monday once I'm back from my work week.

Do you mean manually?

Jim Mathies [:jimm]

Reporter

Comment 79

•

12 years ago

> > Maybe the next step should be to push runs to the slave until it fails, so we know we have a failed setup, and then go in and take a look at the slave config to try and figure out what's wrong. Maybe we can try to get this set up on Monday once I'm back from my work week.
>
> Do you mean manually?

Well, I have to defer to you on that. We want to look over the config on the slave after an automated test run failure. Things we would want to look at / test:
- inspect browser registration
- manually run using a command console to see if the failure is reproducible
- maybe initiate another automated run while we're on the machine to look for something unique

Armen [:armenzg]

Assignee

Comment 80

•

12 years ago

jimm, I could try to hack the code so it prevents the machine from rebooting once it fails. I'm off tomorrow; I could take a stab at it on Friday.

Assignee: q → armenzg

Priority: -- → P1

Jim Mathies [:jimm]

Reporter

Comment 81

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #80)
> jimm, I could try to hack the code so it prevents the machine from rebooting once it fails. I'm off tomorrow; I could take a stab at it on Friday.

Did you have any luck with this?

I'm back from my work week so I have vpn access again.

Armen [:armenzg]

Assignee

Comment 82

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #81)
> (In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #80)
> > jimm, I could try to hack the code so it prevents the machine from rebooting once it fails. I'm off tomorrow; I could take a stab at it on Friday.
>
> Did you have any luck with this?
>
> I'm back from my work week so I have vpn access again.

I ended up getting derailed adding WinXP ix machines to the production pool. I'm at my work week this week. I am flying today. I will see what I can do this week about this.

Armen [:armenzg]

Assignee

Comment 83

•

12 years ago

I'm at a work week this week and I am spending my breaks on fixing the iX test infra refresh project. I don't know how much time I can really spend on this bug this week :S

Jim Mathies [:jimm]

Reporter

Updated

•

12 years ago

Attachment #746941 - Flags: feedback?(jmathies)

Armen [:armenzg]

Assignee

Comment 84

•

12 years ago

Hi jimm, Things are settling down for me. What would you like me to try? Prevent the machine from rebooting if it fails so we can look at it?

Armen [:armenzg]

Assignee

Updated

•

12 years ago

Blocks: 864801

Jim Mathies [:jimm]

Reporter

Comment 85

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #84)
> Hi jimm,
> Things are settling down for me.
> What would you like me to try? Prevent the machine from rebooting if it fails so we can look at it?

Yes, I think that would be a good next step. Let's see if we can reproduce the registration problem after a test run has failed w/out the reboot, so we are testing with the same config the failed test runs under. If we can, then it should be pretty easy to diagnose the configuration problem.

Jim Mathies [:jimm]

Reporter

Comment 86

•

12 years ago

How goes the testing armen?

Armen [:armenzg]

Assignee

Comment 87

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #86)
> How goes the testing armen?

I have not had any luck in figuring out a way of doing this with buildbot easily. I'm going to disable rebooting and just check every five minutes.

Armen [:armenzg]

Assignee

Comment 88

•

12 years ago

Here's where we are with logs so far: http://people.mozilla.com/~armenzg/logs/metro.log.txt

Jim Mathies [:jimm]

Reporter

Comment 89

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #87)
> (In reply to Jim Mathies [:jimm] from comment #86)
> > How goes the testing armen?
>
> I have not had any luck in figuring out a way of doing this with buildbot easily.
> I'm going to disable rebooting and just check every five minutes.

Ok, this sounds like a good test I think. Basically you're going to let this do multiple test runs for a period of time without a reboot. If we have a working setup and we don't reboot and run multiple test runs, it will be interesting to see if we get registration failures. Is this your plan? (From your posted log I see one successful test run so far.)

Question: on which tree is this running? Cedar? Should we be merging over to cedar to trigger these runs? If so I can do merges all weekend long to trigger lots of them.

If this is your plan, and if we don't get any registration failures on any test run startup, that means we are dealing with some sort of a sporadic reboot config problem, right?

Armen [:armenzg]

Assignee

Comment 90

•

12 years ago

It is not running on any tree but on one machine that I have on staging.

There's also another option. What about if we deployed Q's change as-is? I could then enable metro jobs on certain branches and we could re-trigger them if we need to. Worst comes to worst we have to undo Q's change or modify it.

This is what I'm currently doing:
* trigger job on buildbot
* wait 5 minutes
* if the job has *not* failed, I reboot and trigger another job
* if the job fails, I will see it in the log and let you know so you can jump on the machine and have a look
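For reference, the manual loop above could be sketched roughly like this in Python. This is only an illustration of the procedure, not anything that exists in buildbot: `trigger_job`, `job_failed` and `reboot` are hypothetical callables standing in for whatever the operator does by hand, and `max_runs` is an assumed safety limit.

```python
import time


def run_until_failure(trigger_job, job_failed, reboot,
                      wait_seconds=300, max_runs=50, sleep=time.sleep):
    """Repeat trigger -> wait -> check until a job fails.

    All three callables are hypothetical placeholders for manual steps.
    Returns the 1-based number of the failing run, or None if every
    run up to max_runs passed.
    """
    for run_number in range(1, max_runs + 1):
        trigger_job()
        sleep(wait_seconds)  # "wait 5 minutes"
        if job_failed():
            # Deliberately no reboot here, so the failed setup
            # stays intact and can be inspected on the machine.
            return run_number
        reboot()  # job passed: reboot and go again
    return None
```

The one thing the sketch tries to make explicit is that the reboot is skipped on failure, since the point of the exercise is to look at the slave in the same state the failed run saw.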

Jim Mathies [:jimm]

Reporter

Comment 91

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #90)
> It is not running on any tree but on one machine that I have on staging.
>
> There's also another option.
>
> What about if we deployed Q's change as-is?
> I could then enable metro jobs on certain branches and we could re-trigger them if we need to.
>
> Worst comes to worst we have to undo Q's change or modify it.
>
> This is what I'm currently doing:
> * trigger job on buildbot
> * wait 5 minutes
> * if the job has *not* failed, I reboot and trigger another job
> * if the job fails, I will see it in the log and let you know so you can jump on the machine and have a look

Ok, this sounds good, let's try this first. I'm hesitant about rolling out major machine config changes that we haven't validated yet even if reverting the change is known to be possible.

Armen [:armenzg]

Assignee

Comment 92

•

12 years ago

On another note, is there a way we can get some info dumped to know why it did not activate? Or retry a few times? I have run this 9 times and none of them have failed. Perhaps we could turn the job into an automatic RETRY, so that another machine would take the job if we match the activation failure.

Jim Mathies [:jimm]

Reporter

Comment 93

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #92)
> On another note, is there a way we can get some info dumped to know why it did not activate? or retry few times?

Some things I can do here for debugging purposes:

1) dumping a bunch of registry related info to logs to cross check what the automatic config is supposed to be doing before the user logs in.
2) I can retry the call after a short wait for debugging purposes. This wouldn't be a valid fix though since test runs are on a time out.
3) Validate that target browser files are in place, for example firefox.exe.

I can't really think of much else.

> I have run this 9 times and none of them have failed.

Hmm, so what do you think is different between what you are doing and what happens when this is automated?

> Perhaps we could turn the job to be an automatic RETRY and another machine would take the job if we match the activation failure.

I'm not sure what this means, but if the result is our ability to go in and look at a slave setup that has failed - sounds good to me.
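To illustrate item 3, a file check of this kind could look roughly like the Python sketch below. It is not the harness code: the helper name is made up, and the default file list (firefox.exe plus CommandExecuteHandler.exe, both mentioned earlier in this bug) is an assumption about which files matter.

```python
import os


def missing_browser_files(install_dir,
                          required=("firefox.exe", "CommandExecuteHandler.exe")):
    """Return the names from `required` that are not regular files in install_dir.

    Hypothetical helper: the default list is an assumption, not a
    confirmed set of files the registration depends on.
    """
    return [
        name for name in required
        if not os.path.isfile(os.path.join(install_dir, name))
    ]
```

Called with the application directory used on the slaves (C:\slave\test\build\application\firefox in the logs), an empty list would mean the files are in place; anything else could be dumped to the log before the activation call.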

Armen [:armenzg]

Assignee

Comment 94

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #93)
> (In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #92)
> > On another note, is there a way we can get some info dumped to know why it did not activate? or retry few times?
>
> Some things I can do here for debugging purposes:
>
> 1) dumping a bunch of registry related info to logs to cross check what the automatic config is supposed to be doing before the user logs in.
> 2) I can retry the call after a short wait for debugging purposes. This wouldn't be a valid fix though since test runs are on a time out.
> 3) Validate that target browser files are in place, for example firefox.exe.
>
> I can't really think of much else.

If you could do any of these it would be great.

> > I have run this 9 times and none of them have failed.
>
> Hmm, so what do you think is different between what you are doing and what happens when this is automated?

The only difference is that it does not reboot automatically at the end of the job. I do the rebooting after I inspect the results. TBH, the failure has never happened very often.

> > Perhaps we could turn the job to be an automatic RETRY and another machine would take the job if we match the activation failure.
>
> I'm not sure what this means, but if the result is our ability to go in and look at a slave setup that has failed - sounds good to me.

Have you ever seen a blue job on tbpl? (either blue or purple) The blue jobs are considered "infra known failures" which automatically re-trigger the job on another machine. For instance, if there is a network blip, hg returns a 404, and so on. What I'm suggesting is to make the job automatically re-trigger if we fail to activate the app.
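As a rough sketch of what "matching the activation failure" could mean, the Python below scans a job log for the ActivateApplication failure line. It does not show how buildbot or mozharness actually classify jobs; the function names are hypothetical, and the pattern is based only on the log lines quoted in this bug (with an optional "(retry N)" part).

```python
import re

# Pattern modelled on the failure line quoted in this bug; the
# "(retry N)" group is optional so both log variants match.
ACTIVATION_FAILURE = re.compile(
    r"TEST-UNEXPECTED-FAIL \| metrotestharness\.exe \| "
    r"ActivateApplication (?:\(retry \d+\) )?result (?P<hresult>[0-9A-Fa-f]{8})"
)


def activation_failures(log_text):
    """Return the result codes of every ActivateApplication failure line."""
    return [m.group("hresult") for m in ACTIVATION_FAILURE.finditer(log_text)]


def should_retry(log_text):
    """Hypothetical rule: flag the job for RETRY if activation failed at all."""
    return bool(activation_failures(log_text))
```

A job flagged this way would be treated like the other "infra known failures" and handed to another machine, at the cost of the few minutes already spent on the first slave.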

Jim Mathies [:jimm]

Reporter

Comment 95

•

12 years ago

> If you could do any of these it would be great.

I'll put something together and we'll get it landed beginning of the week.

> Have you ever seen a blue job on tbpl? (either blue or purple)
> The blue jobs are considered "infra known failures" which automatically re-trigger the job on another machine.
> For instance, if there is a network blip, hg returns a 404, and so on.
> What I'm suggesting is to make the job automatically re-trigger if we fail to activate the app.

Are you suggesting doing this for production and rolling this out? If so that's not a call I can make, since it'll suck releng resources until we get it fixed. We should also be sure we can backout / update Q's work.

Jim Mathies [:jimm]

Reporter

Comment 96

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #95)
> > If you could do any of these it would be great.
>
> I'll put something together and we'll get it landed beginning of the week.

Can you work off a try build with this or do you need it checked into the tree?

Armen [:armenzg]

Assignee

Comment 97

•

12 years ago

I can make it work from try.

Jim Mathies [:jimm]

Reporter

Comment 98

•

12 years ago

Hey Armen, is that slave currently free? I'd like to get on it and do some testing.

Jim Mathies [:jimm]

Reporter

Comment 99

•

12 years ago

Ok here's rev 1 with a bunch of registry config debug code in it, plus a five second retry that retries three times. https://tbpl.mozilla.org/?tree=Try&rev=ef8aa4c447f6 I just sent it over so should be available in a couple hours.
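The retry behaviour described here (up to three retries, five seconds apart) can be sketched roughly as follows. The actual change lives in the harness patch attached to this bug; this Python version only illustrates the control flow. `activate` is a hypothetical callable returning an HRESULT-style integer where 0 means success, and sleeping only between attempts is an assumption.

```python
import time


def activate_with_retry(activate, retries=3, delay_seconds=5.0, sleep=time.sleep):
    """Call activate() once, then retry up to `retries` times on failure.

    `activate` is a hypothetical stand-in for the activation call;
    a return value of 0 is treated as success.
    Returns the result code of the last attempt.
    """
    result = activate()
    for _ in range(retries):
        if result == 0:
            break
        sleep(delay_seconds)  # short wait before trying again
        result = activate()
    return result
```

With the defaults, a persistent failure adds about fifteen seconds to the run, which is why this is only reasonable as a debugging aid and not as a fix.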

Jim Mathies [:jimm]

Reporter

Comment 100

•

12 years ago

Attached patch reg debug patch v.1 — Details — Splinter Review

Jim Mathies [:jimm]

Reporter

Comment 101

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #99)
> Ok here's rev 1 with a bunch of registry config debug code in it, plus a five second retry that retries three times.
>
> https://tbpl.mozilla.org/?tree=Try&rev=ef8aa4c447f6
>
> I just sent it over so should be available in a couple hours.

builds ready -

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-ef8aa4c447f6/try-win32/

Armen [:armenzg]

Assignee

Comment 102

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #101)
> (In reply to Jim Mathies [:jimm] from comment #99)
> > Ok here's rev 1 with a bunch of registry config debug code in it, plus a five second retry that retries three times.
> >
> > https://tbpl.mozilla.org/?tree=Try&rev=ef8aa4c447f6
> >
> > I just sent it over so should be available in a couple hours.
>
> builds ready -
>
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-ef8aa4c447f6/try-win32/

I have started running this on staging.

Armen [:armenzg]

Assignee

Comment 103

•

12 years ago

The last job that we ran on this machine was on June 3rd. After that we rebooted a couple of times. After those reboots, the first job out of 13 runs failed. The remaining 12 jobs have succeeded.

08:45:38 INFO - #####
08:45:38 INFO - ##### Running run-tests step.
08:45:38 INFO - #####
08:45:38 INFO - Running pre test command run mouse & screen adjustment script with 'c:\mozilla-build\python27\python.exe ../scripts/external_tools/mouse_and_screen_resolution.py --configuration-url http://hg.mozilla.org/%(repo_path)s/raw-file/%(revision)s/testing/machine-configuration.json'
08:45:38 INFO - Running command: ['c:\\mozilla-build\\python27\\python.exe', '../scripts/external_tools/mouse_and_screen_resolution.py', '--configuration-url', u'http://hg.mozilla.org/integration/mozilla-inbound/raw-file/default/testing/machine-configuration.json'] in C:\slave\test\build
08:45:38 INFO - Copy/paste: c:\mozilla-build\python27\python.exe ../scripts/external_tools/mouse_and_screen_resolution.py --configuration-url http://hg.mozilla.org/integration/mozilla-inbound/raw-file/default/testing/machine-configuration.json
08:45:38 INFO - INFO: This script was written to be used with Windows 7 32-bit machines.
08:45:38 INFO - Return code: 0
08:45:38 INFO - #### Running mochitest suites
08:45:38 INFO - ENV: MINIDUMP_STACKWALK is now C:\slave\test\build/tools/breakpad/win32/minidump_stackwalk.exe
08:45:38 INFO - ENV: MINIDUMP_SAVE_PATH is now C:\slave\test\build/../minidumps
08:45:38 INFO - Running command: ['C:\\slave\\test\\build\\venv\\Scripts\\python', '-u', 'C:\\slave\\test\\build\\tests\\mochitest/runtests.py', '--appname=C:\\slave\\test\\build\\application\\firefox\\firefox.exe', '--utility-path=tests/bin', '--extra-profile-file=tests/bin/plugins', '--symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-ef8aa4c447f6/try-win32/firefox-24.0a1.en-US.win32.crashreporter-symbols.zip', '--certificate-path=tests/certs', '--autorun', '--close-when-done', '--console-level=INFO', '--browser-chrome', '--metro-immersive'] in C:\slave\test\build
08:45:38 INFO - Copy/paste: C:\slave\test\build\venv\Scripts\python -u C:\slave\test\build\tests\mochitest/runtests.py --appname=C:\slave\test\build\application\firefox\firefox.exe --utility-path=tests/bin --extra-profile-file=tests/bin/plugins --symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-ef8aa4c447f6/try-win32/firefox-24.0a1.en-US.win32.crashreporter-symbols.zip --certificate-path=tests/certs --autorun --close-when-done --console-level=INFO --browser-chrome --metro-immersive
08:45:39 INFO - INFO | runtests.py | Installing extension at C:\slave\test\build\tests\mochitest\extensions\specialpowers to c:\users\cltbld~1.t-w\appdata\local\temp\tmptun0sm.
08:45:40 INFO - INFO | runtests.py | Installing extension at C:\slave\test\build\tests\mochitest\extensions\worker to c:\users\cltbld~1.t-w\appdata\local\temp\tmptun0sm.
08:45:40 INFO - INFO | runtests.py | Installing extension at C:\slave\test\build\tests\mochitest\extensions\workerbootstrap to c:\users\cltbld~1.t-w\appdata\local\temp\tmptun0sm.
08:45:40 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\xpcshell.exe', '-g', 'C:\\slave\\test\\build\\application\\firefox', '-v', '170', '-f', './httpd.js', '-e', "const _PROFILE_PATH = 'c:\\\\users\\\\cltbld~1.t-w\\\\appdata\\\\local\\\\temp\\\\tmptun0sm';const _SERVER_PORT = '8888'; const _SERVER_ADDR = '127.0.0.1';\n const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false;", '-f', './server.js']
08:45:40 INFO - INFO | runtests.py | Server pid: 4032
08:45:42 INFO - args: ['C:\\slave\\test\\build\\venv\\Scripts\\python.exe', 'C:\\slave\\test\\build\\tests\\mochitest\\pywebsocket_wrapper.py', '-p', '9988', '-w', 'C:\\slave\\test\\build\\tests\\mochitest', '-l', 'C:\\slave\\test\\build\\tests\\mochitest\\websock.log', '--log-level=debug', '--allow-handlers-outside-root-dir']
08:45:42 INFO - INFO | runtests.py | Websocket server pid: 4012
08:45:42 INFO - INFO | runtests.py | Running tests: start.
08:45:42 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-N', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw']
08:45:42 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\bug483440-attack2b.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'bug483440-attack2b', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\bug483440-attack7.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'bug483440-attack7', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\bug483440-pk10oflo.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'bug483440-pk10oflo', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\evintermediate.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'evintermediate', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\evroot.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'evroot', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\jartests-object.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'jartests-object', '-t', 'CT,,CT']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\pk12util.exe', '-i', 'C:\\slave\\test\\build\\tests\\certs\\mochitest.client', '-w', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm']
08:45:43 INFO - C:\slave\test\build\tests\bin\pk12util.exe: PKCS12 IMPORT SUCCESSFUL
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\certutil.exe', '-A', '-i', 'C:\\slave\\test\\build\\tests\\certs\\pgoca.ca', '-d', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm', '-f', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\.crtdbpw', '-n', 'pgoca', '-t', 'CT,,']
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\ssltunnel.exe', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm\\ssltunnel.cfg']
08:45:43 INFO - INFO | automation.py | SSL tunnel pid: 2660
08:45:43 INFO - args: ['C:\\slave\\test\\build\\tests\\bin\\metrotestharness.exe', '-no-remote', '-profile', 'c:\\users\\cltbld~1.t-w\\appdata\\local\\temp\\tmptun0sm/', 'about:blank', '-firefoxpath', 'C:\\slave\\test\\build\\application\\firefox\\firefox.exe']
08:45:43 INFO - INFO | automation.py | Application pid: 3100
08:45:43 INFO - INFO | metrotestharness.exe | firefoxpath: 'C:\slave\test\build\application\firefox\firefox.exe'
08:45:43 INFO - INFO | metrotestharness.exe | args: '-no-remote -profile c:\users\cltbld~1.t-w\appdata\local\temp\tmptun0sm/ about:blank'
08:45:43 INFO - INFO | metrotestharness.exe | Launching browser...
08:45:43 INFO - INFO | metrotestharness.exe | App model id='E4CFE2E6B75AA3A3'
08:45:43 INFO - INFO | metrotestharness.exe | Harness process id: 3100
08:45:43 INFO - INFO | metrotestharness.exe | Writing out tests.ini to: 'C:\slave\test\build\application\firefox\tests.ini'
08:45:43 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication (retry 0) result 80270254
08:45:48 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication (retry 1) result 80270254
08:45:53 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication (retry 2) result 80270254
08:45:58 WARNING - TEST-UNEXPECTED-FAIL | metrotestharness.exe | ActivateApplication result 80270254
08:45:58 INFO - INFO | metrotestharness.exe | Deleting C:\slave\test\build\application\firefox\tests.ini
08:45:58 INFO - INFO | automation.py | Application ran for: 0:00:15.574000
08:45:58 INFO - INFO | zombiecheck | Reading PID log: c:\users\cltbld~1.t-w\appdata\local\temp\tmp9m_xtopidlog
08:45:59 INFO - SUCCESS: The process with PID 2660 has been terminated.
08:45:59 INFO - ERROR: The process with PID 4032 could not be terminated.
08:45:59 INFO - Reason: There is no running instance of the task.
08:45:59 INFO - SUCCESS: The process with PID 4012 has been terminated.
08:45:59 INFO - WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
08:45:59 INFO - INFO | runtests.py | Running tests: end.
08:46:00 INFO - Return code: 0
08:46:00 INFO - TinderboxPrint: mochitest-metro-immersive T-FAIL
08:46:00 WARNING - # TBPL WARNING #
08:46:00 WARNING - The mochitest suite: metro-immersive ran with return status: WARNING
08:46:00 INFO - Copying logs to upload dir...
08:46:00 INFO - mkdir: C:\slave\test\build\upload\logs
program finished with exit code 1

Armen [:armenzg]

Assignee

Comment 104

•

12 years ago

Hi Q,
Is the change that you deployed to this machine something that we can backout easily?

(In reply to Jim Mathies [:jimm] from comment #95)
> > Have you ever seen a blue job on tbpl? (either blue or purple)
> > The blue jobs are considered "infra known failures" which automatically re-trigger the job on another machine.
> > For instance, if there is a network blip, hg returns a 404, and so on.
> > What I'm suggesting is to make the job automatically re-trigger if we fail to activate the app.
>
> Are you suggesting doing this for production and rolling this out? If so that's not a call I can make, since it'll suck releng resources until we get it fixed. We should also be sure we can backout / update Q's work.

jimm, I'm OK to sometimes have some of the win8 64-bit machines take a job and have to retry on another machine. It's not a long job. We might lose at most 5-7 minutes every once in a while.

Flags: needinfo?(q)

Jim Mathies [:jimm]

Reporter

Comment 105

•

12 years ago

> jimm, I'm OK to sometimes have some of the win8 64-bit machines take a job and have to retry on another machine. It's not a long job. We might lose at most 5-7 minutes every once in a while.

Ok sounds good. From your one test run failure, every registry check succeeded.

Jim Mathies [:jimm]

Reporter

Comment 106

•

12 years ago

(In reply to Q from comment #10)
> Created attachment 746788 [details]
> Reg file converted to GPO usable XML
>
> This can be directly imported into a registry preference GPO

Q, if these machines get reset on every reboot, we should remove the delete orders in this xml script. There's no point in having them if the slaves are reset and this registration is re-imported every time the user logs in.

Jim Mathies [:jimm]

Reporter

Comment 107

•

12 years ago

One thing I thought of: once the win8 tests finish their run, we exit the browser, but we do not flip back to the desktop. Is this going to foul up desktop tests that run on these slaves, or do they logout or reboot on every run?

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 108

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #107)
> One thing I thought of: once the win8 tests finish their run, we exit the browser, but we do not flip back to the desktop. Is this going to foul up desktop tests that run on these slaves, or do they logout or reboot on every run?

I filed a request for this yesterday as bug 879043. For our Mozmill tests it would be kinda helpful to get back to the desktop. We haven't run those tests in our CI yet, so I don't know how other tests will cope with that.

Blocks: 845079

Armen [:armenzg]

Assignee

Comment 109

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #107)
> One thing I thought of: once the win8 tests finish their run, we exit the browser, but we do not flip back to the desktop. Is this going to foul up desktop tests that run on these slaves, or do they logout or reboot on every run?

We always reboot at the end of each run.

Jim Mathies [:jimm]

Reporter

Comment 110

•

12 years ago

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #109)
> (In reply to Jim Mathies [:jimm] from comment #107)
> > One thing I thought of: once the win8 tests finish their run, we exit the browser, but we do not flip back to the desktop. Is this going to foul up desktop tests that run on these slaves, or do they logout or reboot on every run?
>
> We always reboot at the end of each run.

Ok, great.

No longer blocks: 845079

Q

Comment 111

•

12 years ago

Yes, we can back out the change with ease.

(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #104)
> Hi Q,
> Is the change that you deployed to this machine something that we can backout easily?
>
> (In reply to Jim Mathies [:jimm] from comment #95)
> > > Have you ever seen a blue job on tbpl? (either blue or purple)
> > > The blue jobs are considered "infra known failures" which automatically re-trigger the job on another machine.
> > > For instance, if there is a network blip, hg returns a 404, and so on.
> > > What I'm suggesting is to make the job automatically re-trigger if we fail to activate the app.
> >
> > Are you suggesting doing this for production and rolling this out? If so that's not a call I can make, since it'll suck releng resources until we get it fixed. We should also be sure we can backout / update Q's work.
>
> jimm, I'm OK to sometimes have some of the win8 64-bit machines take a job and have to retry on another machine. It's not a long job. We might lose at most 5-7 minutes every once in a while.

Flags: needinfo?(q)

Henrik Skupin [:whimboo][⌚️UTC+2]

Updated

•

12 years ago

Comment 112

•

12 years ago

(In reply to Q from comment #111)
> Yes we can back out the change with ease.

Great. Armen, is releng ok to roll this out to slaves and get these running on inbound/mc? We should start them as hidden so we can deal with any random orange.

Armen [:armenzg]

Assignee

Comment 113

•

12 years ago

Let's get to action in bug 864418.

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Jim Mathies [:jimm]

Reporter

Updated

•

11 years ago

Attachment #740974 - Attachment mime type: text/x-ms-regedit → text/plain

Henrik Skupin [:whimboo][⌚️UTC+2]

Updated

•

11 years ago

Attachment #741288 - Attachment mime type: text/x-ms-regedit → text/plain

Nobody; OK to take it and work on it

Updated

•

10 years ago

OS: Windows 8 Metro → Windows 8.1

reset script 12 years ago Jim Mathies [:jimm] 1.37 KB, text/plain Details
register script 12 years ago Jim Mathies [:jimm] 5.51 KB, text/x-ms-regedit Details
register script 12 years ago Jim Mathies [:jimm] 5.69 KB, text/plain Details
patch 12 years ago Jim Mathies [:jimm] 9.56 KB, patch bbondy : review+ Details | Diff | Splinter Review
Reg file converted to GPO usable XML 12 years ago Q 28.50 KB, text/xml Details
Auto Assoication XML for GPO 12 years ago Q 558 bytes, text/xml Details
log 12 years ago Armen [:armenzg] 47.25 KB, text/x-log Details
log 12 years ago Armen [:armenzg] 47.16 KB, patch Details | Diff | Splinter Review
focus patch 12 years ago Jim Mathies [:jimm] 1.46 KB, patch bbondy : review+ Details | Diff | Splinter Review
reg debug patch v.1 12 years ago Jim Mathies [:jimm] 6.65 KB, patch Details | Diff | Splinter Review