Closed Bug 614956 Opened 9 years ago Closed 9 years ago

Enable debug unit tests for Win7 and enable XP opt unit tests so we can disable them on Win2003 builders (except 1.9.1 and 1.9.2)

Categories

(Release Engineering :: General, defect, P3)

x86
Windows 7
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

(Whiteboard: [unittest][win7])

Attachments

(2 files)

We deployed Mozilla Build to all XP slaves in bug 549458#c38.
We deployed the Debug CRT in bug 562459, so we can now enable debug unit tests for Win7 slaves.

We should enable the debug unit tests, tackle the three oranges I have found and disable unit tests on Win2003 builders (except on 1.9.1 and 1.9.2).
Assignee: nobody → armenzg
OS: Mac OS X → Windows 7
Priority: -- → P2
Whiteboard: [unittest][win7]
Attachment #493437 - Flags: review?(coop) → review+
I would like to enable it this week. I will prepare fabric to deploy it.

We will disable them on the builders once we know everything runs with no perma-oranges on the minis and we have announced it loudly and ahead of time.
Flags: needs-reconfig?
Summary: Enable debug unit tests for Win7 testing slaves and disable them on Win2003 builders → Enable debug unit tests for Win7 testing slaves and disable them on Win2003 builders (except 1.9.1 and 1.9.2)
Status: NEW → ASSIGNED
Flags: needs-reconfig? → needs-reconfig+
Clearing outdated needs-reconfig flag.
Flags: needs-reconfig+
We have a few perma-oranges:
- crashtest
REFTEST TEST-UNEXPECTED-PASS | file:///c:/talos-slave/mozilla-central_win7-debug_test-crashtest/build/reftest/tests/layout/generic/crashtests/499885-1.xhtml | assertion count 3 is less than expected 18 to 108 assertions

- reftest
REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/mozilla-central_win7-debug_test-reftest/build/reftest/tests/layout/reftests/bugs/421710-1.html | assertion count 4 is more than expected 0 assertions
PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)
Thread 0 (crashed)
PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)
Thread 0 (crashed)
PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)
Thread 0 (crashed)
PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)
Thread 0 (crashed)

- xpcshell
TEST-UNEXPECTED-FAIL | c:\talos-slave\mozilla-central_win7-debug_test-xpcshell\build\xpcshell\tests\netwerk\test\unit\test_bug596443.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | c:/talos-slave/mozilla-central_win7-debug_test-xpcshell/build/xpcshell/tests/netwerk/test/unit/test_bug596443.js | Response1 == Response0 - See following stack:
TEST-UNEXPECTED-FAIL | c:/talos-slave/mozilla-central_win7-debug_test-xpcshell/build/xpcshell/tests/netwerk/test/unit/test_bug596443.js | Response0 == Response1 - See following stack:
TEST-UNEXPECTED-FAIL | c:/talos-slave/mozilla-central_win7-debug_test-xpcshell/build/xpcshell/tests/netwerk/test/unit/test_bug596443.js | Response0 == Response1 - See following stack:
See Also: → 570436
lsblakk is testing a patch to fix the situation with try debug unit tests on Win7 machines as they are not currently being triggered.
http://hg.mozilla.org/build/buildbotcustom/file/tip/try_parser.py#l65
Depends on: 615953
Depends on: 615959
Depends on: 610654
(In reply to comment #6)
> lsblakk is testing a patch to fix the situation with try debug unit tests on
> Win7 machines as they are not currently being triggered.
> http://hg.mozilla.org/build/buildbotcustom/file/tip/try_parser.py#l65

This is being fixed in bug 561437#c18.
STATUS UPDATE:
* only one remaining perma-orange
* try server issue fixed
* only xpcshell is hidden on tbpl
Armen is there a bug for the slowness of xpcshell?  I have some debugging tools to help people profile it.  My profiling skills on windows are pretty weak as it turns out.  I have managed to generate a very pretty xperf graph that shows me that xpcshell was running the entire time of my test, but no more than that. :/  I need to get the windows gurus involved.
(In reply to comment #9)
> Armen is there a bug for the slowness of xpcshell?  I have some debugging tools
> to help people profile it.  My profiling skills on windows are pretty weak as
> it turns out.  I have managed to generate a very pretty xperf graph that shows
> me that xpcshell was running the entire time of my test, but no more than that.
> :/  I need to get the windows gurus involved.

Filed bug 617503
xpcshell is unhidden.
Thanks philor!

What do you suggest we do with bug 614474#c57?
What is your suggestion on how it should affect our decision to disable w2k3 debug unit tests?
It seems that it only affects the w2k3 VMs (win32-slaveXX & try-w32-slaveXX).
Why do you think it does not time out on the win7 machines?
Shall we bring that test up on the discussion groups?

Here are the specs of the machines:
http://armenzg.blogspot.com/2010/12/test-suites-on-windows.html#comments

Win7 Rev3 mini:
* Intel Core 2 DUO P7550@2.26 GHz
* 2GB RAM

The Win2k3 machines can either be an IX machine (hardware of 1/2 unit of a server shelf) or a VM.

The IX machines have this spec:
* Intel Xeon - X3430 @ 2.40GHz
* 4GB RAM

The VM machines have this spec:
* Intel Xeon - L5335 @ 2.0GHz
* 2GB RAM
I think it should make us push ahead as quickly as we possibly can to disable tests on the VMs. That one is just one of probably dozens, the one that I happened to notice while this was on my mind, where we frequently run so slowly on the VMs that we just can't finish a test in 30 seconds. For the last one that I carefully looked at, before I increased the test timeout to make it pass on the Win2K3 slaves, opt builds typically finished in 1 to 2 seconds, debug builds on hardware typically finished in 10 or 11 seconds, and the VMs either finished in 20-some seconds or timed out at 30 seconds. We aren't losing valuable testing data by killing the VMs, all we're losing is data on the fact that running debug Windows builds on VMs isn't a good idea, and we already knew that.

We also shouldn't be slowed down by how slow the debug xpcshell tests are at finishing - yeah, it's a shame, but for the last push that has actually finished, a push at 01:30, the Win debug xpcshell test finished at 04:15 and the opt xpcshell test finished at 05:19 since the PGO build took so much longer to finish. We'll be increasing the time before you get your first Windows xpcshell results, and that's a shame and will matter to maybe one in five hundred or a thousand pushes, but we won't be increasing the time until you're completely off the hook, while massively decreasing the amount of intermittent timeout orange on Windows, which will matter much more to the remaining 499 or 999 pushers.
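To make the arithmetic above concrete, here is a minimal sketch using only the three clock times quoted in this comment (push at 01:30, debug xpcshell done at 04:15, opt xpcshell done at 05:19). The variable and function names are made up for illustration, and it assumes all three times fall on the same day.

```python
from datetime import datetime

FMT = "%H:%M"
push = datetime.strptime("01:30", FMT)

# Finish times quoted in the comment above.
finish_times = {
    "win debug xpcshell": "04:15",
    "win opt xpcshell": "05:19",
}

def minutes_after_push(finished: str) -> int:
    """Minutes between the push and the given finish time (same day assumed)."""
    delta = datetime.strptime(finished, FMT) - push
    return int(delta.total_seconds() // 60)

waits = {name: minutes_after_push(t) for name, t in finish_times.items()}
first_result = min(waits.values())  # when the first xpcshell result arrives
all_done = max(waits.values())      # when the pusher is "off the hook"
slack = all_done - waits["win debug xpcshell"]

print(waits, first_result, all_done, slack)
```

For that push, the debug run could have finished up to `slack` minutes later without changing when the last Windows xpcshell result arrived, since the opt run behind the PGO build was the one that finished last.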
Blocks: 478241
Blocks: 612448
Also Jim has tracked down the root cause of the slowness in bug 617503 anyway, so we should be able to fix that pretty quickly, and all of our Win 7 tests should get faster!
philor, it was very good to hear that from you.
I was aware that the slowness is only really bad if the build preceding the tests were to finish at the same time for all platforms (which is not the case!).

I am so happy that bug 617503 could soon be solved, as the wait times on Win7 are pretty horrible. I will do a short post on that. We need to either cut down how much CPU we use on the win7 machines or add more win7 CPU capacity, and that ain't easy!
We won't be disabling unit tests on w2k3 until I enable it on WinXP (as per Tuesday meeting).
This is unfortunate and I don't agree that "feel uncomfortable" is a valid reason without further explanation on "why". I might be able to convince people to give up on that thought on the mailing lists.

Meanwhile, I would like to have XP running in production in the next 2-3 weeks in bug 614955 and things won't be as bad during that period (more IX machines available for jobs).
Priority: P2 → P3
The reason behind feeling uncomfortable is pretty simple. Windows 2003 and Windows XP are very similar in terms of the Windows kernel and userland they provide, so having the 2k3 boxes running tests gives us a good approximation of testing on XP. We still have a lot of users on XP, so losing test coverage of this platform is not a good idea, since it's possible that we could regress something badly and not know about it until after we shipped.
Then I should get cranking with WinXP opt, as we don't have W2k3 coverage for opt for a few months (except 1.9.1 and 1.9.2).
Depends on: 614955
Blocks: 607523
Blocks: 618926
We now have *optimized* unit tests for XP plus full coverage on Win7.

I believe I read a comment from dbaron saying that this should be sufficient to disable W2k3 debug unit tests.

The purpose of doing it soon rather than waiting for debug unit tests on XP is to shorten the wait times on the try server (we don't have too many IX machines in that pool) and to eliminate the random oranges due to VMs.

I would be able to get to debug unit tests for XP in the last week of January, and it should take 2 weeks.

Can we disable them as soon as we have no perma-oranges on XP? (maybe 1-3 weeks)

I will bring it up at Tuesday's dev meeting.
dbaron's okay was in bug 614955 comment 4, and I just marked the remainder of the failing reftests as failing, so the XP reftests are visible on mozilla-central, and the last of the Win2K3 tests is hidden - it would be nice if it works out that the Win2K3 tests can't be shut off until a downtime next week, giving project branches time enough to merge so they can make a green XP reftest visible, but you're unblocked and ready to shut them off.
Sounds fantastic.
We will wait until the project branches pull from mozilla-central.
There is no need for downtime.

I would like to land attachment 494753 once we are ready.
You're ready: the only active project branches right now are Places and TraceMonkey, and they've both pulled, and I've made all their XP tests visible and all their Win2K3 tests hidden.
This patch is the same as attachment 494753 but refreshed.
I am passing along the r+ and will land the patch from here to keep the work tracked in this bug.
Attachment #505175 - Flags: review+
Flags: needs-reconfig?
Comment on attachment 505175
disable w2k3 debug unittests (except for 1.9.1 & 1.9.2)

Landed on "default".
http://hg.mozilla.org/build/buildbot-configs/rev/b16c5ee7b58b

We still need to land on the "production" branch and reconfig for this change to take effect.
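For readers who want a picture of what "disable w2k3 debug unittests (except for 1.9.1 & 1.9.2)" amounts to, here is a rough sketch. It is not the landed patch and does not reproduce the real buildbot-configs layout: the dictionary shape, the `win32-debug` platform key, the `enable_unittests` flag and the function name are all assumptions made for illustration.

```python
# Hypothetical data layout; NOT the actual buildbot-configs structure.
KEEP_W2K3_DEBUG_TESTS = {"mozilla-1.9.1", "mozilla-1.9.2"}

def disable_w2k3_debug_unittests(branches):
    """Return a copy of `branches` with win32-debug unit tests turned off,
    except on the branches listed in KEEP_W2K3_DEBUG_TESTS."""
    updated = {}
    for name, platforms in branches.items():
        # Copy each platform's options so the input is left untouched.
        new_platforms = {p: dict(opts) for p, opts in platforms.items()}
        if name not in KEEP_W2K3_DEBUG_TESTS and "win32-debug" in new_platforms:
            new_platforms["win32-debug"]["enable_unittests"] = False
        updated[name] = new_platforms
    return updated

branches = {
    "mozilla-central": {"win32-debug": {"enable_unittests": True}},
    "mozilla-1.9.2": {"win32-debug": {"enable_unittests": True}},
}
result = disable_w2k3_debug_unittests(branches)
print(result)
```

Keeping the exempt branches in one explicit set makes the "except 1.9.1 and 1.9.2" part of the summary easy to audit in a sketch like this.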
Attachment #505175 - Flags: checked-in+
This got landed on "production" and the reconfigures succeeded this morning.
http://hg.mozilla.org/build/buildbot-configs/rev/2f6597acbead

I am changing the summary to note that we were also required to enable XP opt unit tests in order to disable the debug w2k3 tests.

Let's close bugs :)

In the next few weeks we will also deploy debug XP unit tests.
Summary: Enable debug unit tests for Win7 testing slaves and disable them on Win2003 builders (except 1.9.1 and 1.9.2) → Enable debug unit tests for Win7 and enable XP opt unit tests so we can disable them on Win2003 builders (except 1.9.1 and 1.9.2)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
No longer depends on: 614955
Depends on: 614955
Product: mozilla.org → Release Engineering