Closed
Bug 426997
Opened 16 years ago
Closed 16 years ago
bm-win2k3-pgo01 is burning
Categories
(Release Engineering :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 420073
People
(Reporter: ted, Assigned: mikeal)
References
Details
Attachments
(2 files)
941 bytes,
patch
|
rhelmer
:
review+
2.48 KB,
text/plain
Looks like win2k3-pgo got hung up on something. Filed here instead of tinderbox maintenance because this is a new box, so it's possible that it's still a setup issue:

Build Error Log
Skipping 26 Lines...
PsKill v1.12 - Terminates processes on local or remote systems
Copyright (C) 1999-2005 Mark Russinovich
Sysinternals - www.sysinternals.com
cvs checkout: Updating tinderbox-configs
cvs checkout: Updating buildbot-configs
No clobber required
# tools/buildbot-configs/testing/unittest/mozconfig-win2k3-pgo
mk_add_options MOZ_CO_PROJECT=browser
ac_add_options --enable-places
ac_add_options --disable-installer
ac_add_options --enable-application=browser
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir
ac_add_options --enable-tests
ac_add_options --enable-debugger-info-modules
#ac_add_options --enable-mochitest
# ac_add_options --enable-extensions=default,jssh
# ac_add_options --disable-javaxpcom
# ac_add_options --enable-debug
# ac_add_options --disable-optimize
ac_add_options --disable-composer
ac_add_options --disable-mailnews
mk_add_options MOZ_MAKE_FLAGS="-j3"
# ac_add_options --enable-optimize="-O2 -g"
ac_add_options --enable-logrefcnt
# mozilla/testing/tools needed for buildbot profile (re)creation
mk_add_options MOZ_CO_MODULE="mozilla/testing/tools"
mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 23: unexpected EOF while looking for matching `''
NEXT ERROR mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 24: syntax error: unexpected end of file
Adding client.mk options from /d/slave/trunk_2k3_pgo/mozilla/.mozconfig:
MOZ_CO_PROJECT=browser
MOZ_OBJDIR=$(TOPSRCDIR)/objdir
MOZ_MAKE_FLAGS=-j3
MOZ_CO_MODULE=mozilla/testing/tools
checkout start: Fri Apr 4 03:27:31 PDT 2008
cvs -d :ext:unittest@cvs.mozilla.org:/cvsroot -q -z 3 co mozilla/client.mk mozilla/browser/config/mozconfig mozilla/browser/config/version.txt mozilla/build/unix/uniq.pl mozilla/calendar/sunbird/config/version.txt mozilla/mail/config/version.txt mozilla/suite/config/version.txt
mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 23: unexpected EOF while looking for matching `''
NEXT ERROR mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 24: syntax error: unexpected end of file
make[1]: Entering directory `/d/slave/trunk_2k3_pgo'
cvs -d :ext:unittest@cvs.mozilla.org:/cvsroot -q -z 3 co -P -r NSPR_4_7_1_BETA2 mozilla/nsprpub
cvs -d :ext:unittest@cvs.mozilla.org:/cvsroot -q -z 3 co -P -r NSS_3_12_BETA3 mozilla/dbm mozilla/security/nss mozilla/security/coreconf mozilla/security/dbm
cvs -d :ext:unittest@cvs.mozilla.org:/cvsroot -q -z 3 co -P -A -l mozilla/ mozilla/db mozilla/js mozilla/js/jsd mozilla/js/src
? mozilla/objdir
cvs -d :ext:unittest@cvs.mozilla.org:/cvsroot -q -z 3 co -P -A mozilla/README mozilla/accessible mozilla/browser mozilla/build mozilla/caps mozilla/chrome mozilla/config mozilla/content mozilla/db/mdb mozilla/db/mork mozilla/db/morkreader mozilla/db/sqlite3 mozilla/docshell mozilla/dom mozilla/editor mozilla/embedding mozilla/extensions mozilla/gfx mozilla/intl mozilla/ipc/ipcd mozilla/jpeg mozilla/js/jsd/idl mozilla/js/src/fdlibm mozilla/js/src/liveconnect mozilla/js/src/xpconnect mozilla/layout mozilla/memory/jemalloc mozilla/modules/lcms mozilla/modules/libbz2 mozilla/modules/libimg mozilla/modules/libjar mozilla/modules/libmar mozilla/modules/libpr0n mozilla/modules/libpref mozilla/modules/libreg mozilla/modules/libutil mozilla/modules/oji mozilla/modules/plugin mozilla/modules/staticmod mozilla/modules/zlib mozilla/netwerk mozilla/other-licenses/7zstub/firefox mozilla/other-licenses/atk-1.0 mozilla/other-licenses/branding/firefox mozilla/other-licenses/ia2 mozilla/parser mozilla/plugin/oji mozilla/probes mozilla/profile mozilla/rdf mozilla/security/manager mozilla/storage mozilla/sun-java mozilla/testing/crashtest mozilla/testing/mochitest mozilla/testing/tools mozilla/toolkit mozilla/tools/elf-dynstr-gc mozilla/tools/test-harness mozilla/uriloader mozilla/view mozilla/webshell mozilla/widget mozilla/xpcom mozilla/xpfe mozilla/xpinstall
checkout finish: Fri Apr 4 03:30:56 PDT 2008
make[1]: Leaving directory `/d/slave/trunk_2k3_pgo'
mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 23: unexpected EOF while looking for matching `''
NEXT ERROR mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 24: syntax error: unexpected end of file
make -f /d/slave/trunk_2k3_pgo/mozilla/client.mk build MOZ_PROFILE_GENERATE=1
mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 23: unexpected EOF while looking for matching `''
mozilla/build/autoconf/mozconfig2client-mk: /d/slave/trunk_2k3_pgo/mozilla/.mozconfig: line 24: syntax error: unexpected end of file
make[1]: Entering directory `/d/slave/trunk_2k3_pgo/mozilla'
Adding client.mk options from /d/slave/trunk_2k3_pgo/mozilla/.mozconfig:
MOZ_CO_PROJECT=browser
MOZ_OBJDIR=$(TOPSRCDIR)/objdir
MOZ_MAKE_FLAGS=-j3
MOZ_CO_MODULE=mozilla/testing/tools
make -j3 -C /d/slave/trunk_2k3_pgo/mozilla/objdir
make[2]: Entering directory `/d/slave/trunk_2k3_pgo/mozilla/objdir'
rm -f -rf ./dist/sdk
rm -f -rf ./dist/include
rm -f -rf ./dist/private
rm -f -rf ./dist/public
rm -f -rf _tests
make[2]: Leaving directory `/d/slave/trunk_2k3_pgo/mozilla/objdir'
make[1]: Leaving directory `/d/slave/trunk_2k3_pgo/mozilla'
rm: cannot remove directory `_tests/testing/mochitest': Permission denied
rm: cannot remove directory `_tests/testing': Directory not empty
rm: cannot remove directory `_tests': Directory not empty
make[2]: *** [default] Error 1
make[1]: *** [build] Error 2
make: *** [profiledbuild] Error 2
No More Errors
Comment 1•16 years ago
|
||
clobbering...
Assignee: nobody → rcampbell
OS: Windows XP → Windows Server 2003
Updated•16 years ago
|
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 2•16 years ago
|
||
Went green for one cycle then red again:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1207318951.1207322570.27303.gz

Note:
make[7]: Leaving directory `/d/slave/trunk_2k3_pgo/mozilla/objdir/xpcom/tools/registry'
0 [main] make 472 open_stackdumpfile: Dumping stack trace to make.exe.stackdump

Looks like it still has some issues.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 3•16 years ago
|
||
This was working fine yesterday afternoon before we moved it from staging to production. Reassigning to Mikeal, as Robcee is traveling.
Priority: -- → P1
Updated•16 years ago
|
Assignee: rcampbell → mrogers
Status: REOPENED → NEW
Comment 4•16 years ago
|
||
Probably separate from the burnination, but the mozconfig2client-mk error is due to:
mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $
in http://mxr.mozilla.org/seamonkey/source/tools/buildbot-configs/testing/unittest/mozconfig-win2k3-pgo

mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $(MOZ_OBJDIR)/_profile/pgo/profileserver.py'
in http://mxr.mozilla.org/seamonkey/source/tools/tinderbox-configs/firefox/win32/mozconfig
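For anyone wanting to catch this kind of truncation before it reaches a slave, here is a minimal sketch of a line-by-line quote check. It is an illustration only and not part of the build system: the function name find_unterminated_quotes and the BROKEN/FIXED samples are made up for this example, and it assumes every quoted string in the mozconfig is meant to close on the line where it opens (the shell itself would allow a quote to span lines).

```python
import shlex


def find_unterminated_quotes(mozconfig_text):
    """Return 1-based numbers of lines whose quotes never close.

    Hypothetical helper: it approximates the shell's check by parsing
    each line on its own with shlex.
    """
    bad_lines = []
    for number, line in enumerate(mozconfig_text.splitlines(), start=1):
        try:
            # comments=True makes shlex drop '#' comment lines.
            shlex.split(line, comments=True)
        except ValueError:
            # shlex raises ValueError when a quote is left open.
            bad_lines.append(number)
    return bad_lines


# Tail of the mozconfig as it appears in the log above (truncated last line).
BROKEN = "\n".join([
    'mk_add_options MOZ_MAKE_FLAGS="-j3"',
    '# ac_add_options --enable-optimize="-O2 -g"',
    'mk_add_options MOZ_CO_MODULE="mozilla/testing/tools"',
    "mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $",
])
# Same text with the last line completed as in the tinderbox-configs copy.
FIXED = BROKEN + "(MOZ_OBJDIR)/_profile/pgo/profileserver.py'"

print(find_unterminated_quotes(BROKEN))  # flags the truncated last line
print(find_unterminated_quotes(FIXED))   # nothing flagged
```

A check like this would only flag the symptom; the shell error in the log remains the authoritative diagnosis.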
Assignee | ||
Comment 5•16 years ago
|
||
I brought up the staging box to see if we get the same issues. I'm also attaching a patch to fix the mozconfig-win2k3-pgo issue Nick noticed.
Assignee | ||
Comment 6•16 years ago
|
||
Attachment #313631 -
Flags: review?
Assignee | ||
Updated•16 years ago
|
Attachment #313631 -
Flags: review? → review?(nrthomas)
Comment 7•16 years ago
|
||
I'd be inclined to call that patch a typo fix, myself, but then again I've been around the project for a while. :-) Up to you what to do here for now.
Assignee | ||
Comment 8•16 years ago
|
||
I don't have commit rights to CVS yet, so _somebody_ has to review it and check it in. I can make the update to the production master. This change won't require a buildbot reconfig because that file is pulled down anew for each build.
Comment 9•16 years ago
|
||
Comment on attachment 313631 [details] [diff] [review]
[checked in] fixing mozbuild-win2k3-pgo

>? pgo_mozbuild.patch
>Index: unittest/mozconfig-win2k3-pgo
>===================================================================
>RCS file: /cvsroot/mozilla/tools/buildbot-configs/testing/unittest/mozconfig-win2k3-pgo,v
>retrieving revision 1.1
>diff -u -8 -p -r1.1 mozconfig-win2k3-pgo
>--- unittest/mozconfig-win2k3-pgo 28 Mar 2008 18:47:48 -0000 1.1
>+++ unittest/mozconfig-win2k3-pgo 4 Apr 2008 16:45:30 -0000
>@@ -15,9 +15,9 @@ ac_add_options --enable-debugger-info-mo
> ac_add_options --disable-composer
> ac_add_options --disable-mailnews
> mk_add_options MOZ_MAKE_FLAGS="-j3"
> # ac_add_options --enable-optimize="-O2 -g"
> ac_add_options --enable-logrefcnt
>
> # mozilla/testing/tools needed for buildbot profile (re)creation
> mk_add_options MOZ_CO_MODULE="mozilla/testing/tools"
>-mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $
>\ No newline at end of file

Probably don't want to check this in ^ :)

>+mk_add_options PROFILE_GEN_SCRIPT='$(PYTHON) $(MOZ_OBJDIR)/_profile/pgo/profileserver.py'
Comment 10•16 years ago
|
||
Comment on attachment 313631 [details] [diff] [review]
[checked in] fixing mozbuild-win2k3-pgo

Checking in unittest/mozconfig-win2k3-pgo;
/cvsroot/mozilla/tools/buildbot-configs/testing/unittest/mozconfig-win2k3-pgo,v <-- mozconfig-win2k3-pgo
new revision: 1.2; previous revision: 1.1
done
Attachment #313631 -
Flags: review?(nrthomas) → review+
Updated•16 years ago
|
Attachment #313631 -
Attachment description: fixing mozbuild-win2k3-pgo → [checked in] fixing mozbuild-win2k3-pgo
Assignee | ||
Comment 11•16 years ago
|
||
Since this change didn't require a buildbot reconfig, it didn't require any downtime. I updated the production master with this change after it was checked in. The existing cycle compiled green, but comparing its total time with the staging PGO build time, it seems as though not everything is getting compiled with the old mozconfig. The next build will use the new mozconfig and fix this issue.
Comment 12•16 years ago
|
||
While that existing cycle (09.07am - 10.54am) finished green, we still have problems:
2008/04/04 09.07 green
2008/04/04 11.00 orange
2008/04/04 13.23 green
2008/04/04 15.36 orange
2008/04/04 17.43 orange
2008/04/04 20.06 orange
2008/04/04 22.27 red
...and remains burning red continuously even now.

Random poking at the logs shows the following error:
rm -f -rf ./dist/public
rm -f -rf _tests
rm: cannot unlink `_tests/testing/mochitest/httpd.js': Permission denied
rm: cannot unlink `_tests/testing/mochitest/server.js': Permission denied
rm: cannot remove directory `_tests/testing/mochitest': Permission denied
rm: cannot remove directory `_tests/testing': Directory not empty
make[2]: Leaving directory `/d/slave/trunk_2k3_pgo/mozilla/objdir'
make[1]: Leaving directory `/d/slave/trunk_2k3_pgo/mozilla'
rm: cannot remove directory `_tests': Directory not empty
make[2]: *** [default] Error 1
make[1]: *** [build] Error 2
make: *** [profiledbuild] Error 2

It was running green in staging before we switched it to production - what changed?
Comment 13•16 years ago
|
||
I've removed bm-win2k3-pgo01 from the main Tinderbox page. With the freeze coming (and people presumably trying to work on the weekend), this new box shouldn't just sit burning on the main Tinderbox page. Please reenable when it's working, though!
Assignee | ||
Comment 14•16 years ago
|
||
This issue isn't specific to this box or to pgo, it's an issue we have on all the win2k3 unittest machines. The three that coop just set up all have the same problem. Adding coop to CC
Assignee | ||
Comment 15•16 years ago
|
||
I have the semantics for a fix down, I'm working on writing up some buildbot code to deal with this. Essentially, we need to kill any other python processes that aren't the main buildbot process as part of our first few cleanup steps. I do think that this intermittent mochitest failure on win2k3 is real and not a problem with the environment on these boxes, but our buildbot code should be robust enough to handle dangling processes from previously failed tests. We can track the mochitest failure much easier after the boxes don't completely fall down when encountering it.
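To make the idea concrete, here is a rough sketch of the selection logic only: pick every Python process except the ones that must survive, then build one kill command per leftover. This is not the ShellCommand code mentioned in this bug. The function names, the process snapshot, and the PIDs are all hypothetical, and calling the kill tool as "pskill <pid>" is an assumption based on PsKill showing up in the build log.

```python
def select_rogue_pids(processes, protected_pids, image_name="python.exe"):
    """Return sorted PIDs of processes named image_name that are not protected.

    processes is a list of (pid, image_name) pairs; how that snapshot is
    collected on the slave is deliberately left out of this sketch.
    """
    rogue = []
    for pid, name in processes:
        # Windows image names are case-insensitive, so compare in lower case.
        if name.lower() == image_name.lower() and pid not in protected_pids:
            rogue.append(pid)
    return sorted(rogue)


def build_kill_commands(pids, kill_tool="pskill"):
    """Build one argv list per PID (assumed 'pskill <pid>' invocation)."""
    return [[kill_tool, str(pid)] for pid in pids]


# Hypothetical snapshot: PID 412 stands in for the main buildbot process.
snapshot = [
    (412, "python.exe"),
    (1880, "python.exe"),
    (2044, "make.exe"),
    (3120, "Python.exe"),
]
rogue = select_rogue_pids(snapshot, protected_pids={412})
commands = build_kill_commands(rogue)
print(rogue)
print(commands)
```

Keeping the selection separate from the actual kill makes the risky part easy to test without touching a live slave; the protected set is where the buildbot process (and anything else that must stay up) would go.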
Assignee | ||
Comment 16•16 years ago
|
||
Ok, I have two ShellCommand subclasses that fix the rogue python process issue. I would categorize the risk of this fix as "high" for sure. There is a good amount of failover logic in the code, but if something is out of whack it could kill the buildbot slave. I'm running it on my test master overnight; if it all looks good I'll merge the code into the staging master and let it live there for a while. If we could run both PGO boxes on staging, get this code onto the other win2k3 unittest boxes, and see how they run over a 24-hour period, I'd say the patch is good to go, but the risk is high enough that I don't want to push it into production too hastily.
Comment 17•16 years ago
|
||
(In reply to comment #14)
> This issue isn't specific to this box or to pgo, it's an issue we have on all
> the win2k3 unittest machines. The three that coop just set up all have the same
> problem.

Any details on what the issue is? For example, is bug#427605 the same problem? Are you seeing memory access violations on this PGO machine?
Assignee | ||
Comment 18•16 years ago
|
||
It's possible that this is the issue that was causing them to fall over, there isn't really enough in that bug for me to tell. Both the PGO machines were red this morning, and only one of them was showing this memory access violation error. Regardless, as I said in an earlier comment, the patch I'm currently working on is to keep the box from going red on consecutive runs after issues like this one. It addresses a slightly larger problem of killing rogue python processes from previous runs. Once the boxes aren't going red after these test failures I'll dig deeper in to the intermittent test failures. If coop thinks this is the reason the mochitests are failing and then locking up a Python process then I'm inclined to agree with him and his fix for that issue will clear up the last of the problems on these PGO boxes. If not then we'll have an easier time tracking the issue once the boxes can run continuously without going red.
Reporter | ||
Comment 19•16 years ago
|
||
Waldo suspects that other bustage will be fixed by his patch in bug 418009.
Comment 20•16 years ago
|
||
I'm going to recommend switching to runtests.pl instead of the pythonic version until we can figure this out in staging.
Assignee | ||
Comment 21•16 years ago
|
||
Are any of the other unittest boxes using runtests.pl?
Assignee | ||
Comment 22•16 years ago
|
||
In order to increase transparency on this I'm going to attach the new ShellCommands I wrote so that people can comment on them before they are in the context of a patch to production.
Assignee | ||
Comment 23•16 years ago
|
||
Comment 24•16 years ago
|
||
On the main tinderbox, they all are. On MozillaTest, most are using py. qm-stage-centos5-01 is mean and green; qm-stage-osx-01 and qm-xserve02 dep were green for a while but turned orange sometime, I don't know when, and it's reporting failures on a set of mochitests that don't really indicate anything -- I'd be surprised if a kick didn't fix it. qm-stage-win2k3-01 was green until bug 418009 hit and bricked it until someone can give it a kick; it's using py for the non-mochitest browser test run. qm-win2k3-03 was orange on a specific browser test for no obvious reason, one that passed on the other box, and is now in need of a kick for the same reason. qm-win2k3-02 is red on something completely non-mochitest-related, some buildbot failure it looks like -- no idea what it is.
Comment 25•16 years ago
|
||
Yeah, windows is the problem area for these things. I'd rather try these steps on staging before putting them on production. Not that they look bad, I'd just prefer not using production as a test environment. Mikeal: please convert the step on the pgo unittest box to runtests.pl.
Assignee | ||
Comment 26•16 years ago
|
||
Both staging and production slaves are reporting to the unittest staging master until they are green. Both boxes have had their resolution set to 1280x1024 as that seems to cause intermittent failures in some of these tests. Both boxes are now using runtests.pl in place of runtests.py.
Assignee | ||
Comment 27•16 years ago
|
||
I commented out the clobber and build steps on both boxes so that we can see more consecutive test cycles to determine if there are any more intermittent issues.
Assignee | ||
Comment 28•16 years ago
|
||
The issue that was causing this to burn is now fixed by using runtests.pl. There is now a new set of issues keeping us from putting the PGO box back on production. I'm marking this bug as a dupe and referring everyone back to the original bug 420073, https://bugzilla.mozilla.org/show_bug.cgi?id=420073, to track further issues with the PGO unittest box. The other bug is older, has more history, and seems to be on people's radar more.
Status: NEW → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → DUPLICATE
Updated•11 years ago
|
Product: mozilla.org → Release Engineering