Closed
Bug 666019
Opened 14 years ago
Closed 11 years ago
monkey-patch a stock Twisted-11.0.0 to manage processes correctly
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Infrastructure & Operations Graveyard
CIDuty
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Unassigned)
References
Details
(Whiteboard: [windows][builldbot])
Attachments
(1 file, 2 obsolete files)
1.68 KB,
patch
|
Callek
:
review+
|
Details | Diff | Splinter Review |
From what I understand of bug 600736 and https://wiki.mozilla.org/ReferencePlatforms/Win64#Buildbot, the Twisted we want on slaves is Twisted-9.0.0 with both
http://hg.mozilla.org/build/twisted/raw-rev/0b441f0f68b4
https://bug543626.bugzilla.mozilla.org/attachment.cgi?id=451277
where the first is just a simple patch to use taskkill instead of win32process.TerminateProcess(self.hProcess, 1).
The second is a forward-port of a patch catlee submitted to Twisted:
Date: Mon Jan 25 16:34:20 2010 +0000
Merge faster-b1282int-2310
Author: catlee
Reviewer: TimAllen, exarkun
Fixes: #2310
Switch the banana function for decoding base-128 encoded integers from an expensive
exponentiation-based approach to a much cheaper bit-shifting approach. Also add a
benchmark which demonstrates how much faster decoding these integers now is.
And which has been accepted upstream in 10.0.0.
So I need to patch our local copy of Buildbot to apply the _dumbwin32proc.py patch; the other isn't required.
Reporter | ||
Comment 1•14 years ago
|
||
So this failed - and badly - with the attached error message. I'll try to transcribe the important stuff for future searching:
File "D:\mozilla-build\python24\lib\site-packages\twisted\internet\_dumbwin32proc.py", line 205, in __init__
raise OSError(pwte)
exceptions.OSError: (5, 'AssignProcessToJobObject', 'Access is Denied.')
Although now that I look at it, that's not the version of Twisted I installed (mine was in c:\tmp\bbve). I'll try again.
Reporter | ||
Comment 2•14 years ago
|
||
OK, that was a false alarm - not sure how.
This is running now on win32-slave60. I'm aware that the tests that need to be killed are not running on this system, but it's a start.
Reporter | ||
Comment 3•14 years ago
|
||
the patch in question
Attachment #540938 -
Flags: review?(armenzg)
Attachment #540938 -
Flags: feedback?(jmaher)
Reporter | ||
Comment 4•14 years ago
|
||
Comment on attachment 540938 [details] [diff] [review]
m666019-buildbot-p1-r1.patch
The feedback I'd like is: is this patch, *applied to a stock Twisted-11.0.0 install* (rather than to the Twisted that's on the refimage), likely to be sufficient? Set aside the Twisted version upgrade.
Attachment #540938 -
Flags: feedback?(anodelman)
Comment 5•14 years ago
|
||
Comment on attachment 540938 [details] [diff] [review]
m666019-buildbot-p1-r1.patch
Review of attachment 540938 [details] [diff] [review]:
-----------------------------------------------------------------
I am not a twisted expert, but this patch seems pretty straightforward.
::: slave/buildslave/dumbw32monkey.py
@@ +21,5 @@
> + return
> +
> + from twisted.internet import error
> +
> + print "applying _dumbwin32proc monkeypatch"
I am not familiar with the print statements normally used with the buildslave and twisted, but should this print statement be a bit more descriptive so it can be found easier?
@@ +26,5 @@
> + def new_signalProcess(self, signalID):
> + if self.pid is None:
> + raise error.ProcessExitedAlready()
> + if signalID in ("INT", "TERM", "KILL"):
> + os.popen('taskkill /T /F /PID %s' % self.pid)
what if signalID isn't in the list? Should we print a warning, do we even care?
Attachment #540938 -
Flags: feedback?(jmaher) → feedback+
Reporter | ||
Comment 6•14 years ago
|
||
(In reply to comment #5)
> I am not familiar with the print statements normally used with the
> buildslave and twisted, but should this print statement be a bit more
> descriptive so it can be found easier?
It will appear on the twistd.log. It's just useful as a way of verifying that the patch is in place.
> what if signalID isn't in the list? Should we print a warning, do we even
> care?
I don't think we care - the buildslave never sends any other signals. The original doesn't handle such signals either, so I think that's fine.
I'm not too concerned with the correctness of the patch, since I'm copying something that's already in production -- I'm more concerned to know that this patch is *enough*.
Comment 7•14 years ago
|
||
Comment on attachment 540938 [details] [diff] [review]
m666019-buildbot-p1-r1.patch
I like the approach.
Attachment #540938 -
Flags: review?(armenzg) → review+
Reporter | ||
Comment 8•14 years ago
|
||
OK, landed on the slaves branch, although not deployed anywhere yet.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•14 years ago
|
Attachment #540938 -
Flags: feedback?(anodelman)
Comment 9•14 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #8)
> OK, landed on the slaves branch, although not deployed anywhere yet.
Actually doesn't look like this was checked in at all, (c.f. Bug 690232)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 10•14 years ago
|
||
Whoops, this definitely isn't me anymore.
Assignee: dustin → nobody
Updated•14 years ago
|
Attachment #540931 -
Attachment is obsolete: true
Comment 11•14 years ago
|
||
So what needs to happen here?
Where should this patch be landing, and where does it need to be deployed? All Windows platforms I'm assuming?
Priority: -- → P3
Whiteboard: [windows][builldbot]
Comment 12•14 years ago
|
||
(In reply to Chris Cooper [:coop] from comment #11)
> So what needs to happen here?
We need to deploy this patch to windows machines somehow.
> Where should this patch be landing,
http://hg.mozilla.org/build/buildbot on the slaves branch.
> and where does it need to be deployed?
> All Windows platforms I'm assuming?
It needs to be deployed on all windows machines, if its better to hard-patch this on windows via opsi/etc. or if its better to cut a moz3 package of buildbot 0.8.4 I'm not sure, but either should be aproximately the same amount of work, since we have buildbot installed as a virtualenv now on windows.
Comment 14•14 years ago
|
||
FYI I just deployed this on my win slaves that were hitting this (due to the pain I'm experiencing on SeaMonkey due to unrelated test issues causing hangs because we can't kill stuff right) with the following:
* Graceful shutdown buildbot on the slave
* Login as Administrator
* Open a mozbuild prompt
* Execute the following commands
cd /d
wget -O patch https://bugzilla.mozilla.org/attachment.cgi?id=540938
cd mozilla-build/buildbotve/Lib/site-packages/buildslave
patch -p3 < /d/patch
rm bot.pyc /d/patch
* Reboot slave
If I encounter weird errors/problems with the patch I'll report here, but I don't expect any.
Comment 15•14 years ago
|
||
FYI after applying the patch I still get failures here:
TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_importScripts.html | application timed out after 330 seconds with no output
If you fixed bug 589668, you'd get a screenshot here
command timed out: 1200 seconds without output, attempting to kill
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1
I also checked twistd.log to make sure that its being run, it is. I also applied a slight change to print to the log when running the patched function, which is never run, which to me means the monkeypatch isn't actually getting applied internally correctly
Comment 16•14 years ago
|
||
Dustin suggested this change on IRC after I had issues with the patch in my environ. This is the correct patch. (very trivial change, but necessary).
That said, in the few instances that have run since I got this landed, I'm no longer getting SIGKILL Failed... messages and subsequent complete failures.
I am instead getting things like: |ERROR: The process "1744" not found.|
Though logging into the machine seems to indicate its not a real error. And its not triggering any odd stuff with buildbot. (my theory is that the process goes away before buildbot can then kill it...)
Either way, this brings us back to where we were before the buildbot upgrade.
Attachment #540938 -
Attachment is obsolete: true
Attachment #580725 -
Flags: review+
Comment 17•14 years ago
|
||
Assigning to bear as part of his "make-buildduty-suck-less" initiative.
Assignee: nobody → bear
Comment 18•14 years ago
|
||
Is there *anything* that we can quickly and reasonably do here? We are repeatedly missing genuine test failures, because we believe that purple means purple, rather than remembering that purple on Windows is absolutely devoid of any meaning whatsoever.
Updated•13 years ago
|
Attachment #580725 -
Attachment is patch: true
Comment 20•13 years ago
|
||
This bit us again last night.
There is a reviewed patch in-bug, can it land?
Comment 21•13 years ago
|
||
(In reply to Ed Morley (27th Aug is UK public holiday, back 28th) [:edmorley] from comment #20)
> This bit us again last night.
>
> There is a reviewed patch in-bug, can it land?
Unfortunately, simply checking it in isn't going to do anything. It needs to get deployed, which means we need a RelEng person assigned. Bear is no longer with the group, so I'm unassigning him and removing the priority to get this retriaged.
Assignee: bear → nobody
Priority: P3 → --
Comment 22•12 years ago
|
||
found in triage.
Component: Release Engineering → Release Engineering: Platform Support
QA Contact: coop
Comment 23•12 years ago
|
||
Is this still needed on Windows after the switch to Win64 builders and GPO?
Comment 24•12 years ago
|
||
I think it still is required as part of the imaging/GPO process. If it's already covered by that, then no need for this bug.
Assignee | ||
Updated•12 years ago
|
Product: mozilla.org → Release Engineering
Comment 25•12 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #24)
> I think it still is required as part of the imaging/GPO process. If it's
> already covered by that, then no need for this bug.
Any clue about this, Q?
Flags: needinfo?(q)
Comment 26•12 years ago
|
||
We patch the twisted files during the machine install. And we have GPO frameworks for updating files in mozillabuild if need be.
Flags: needinfo?(q)
Comment 27•11 years ago
|
||
Resolving based on comment #26.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 11 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•7 years ago
|
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•6 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•