Closed Bug 726700 Opened 10 years ago Closed 9 years ago

Remove Dependency on Pywin32

Categories

(Testing :: Talos, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cmtalbert, Assigned: jmaher)

References

Details

Attachments

(4 files, 3 obsolete files)

We currently depend on Pywin32 for talos. We don't use it for anything very special. And this just causes us tons of headaches. Let's remove it.
Duplicate of this bug: 496535
Working on this.
Assignee: nobody → wlachance
Try run for 9beafadc9442 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=9beafadc9442
Results (out of 16 total builds):
    exception: 5
    success: 3
    failure: 8
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.com-9beafadc9442
Try run for b1f7e1cb81a4 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b1f7e1cb81a4
Results (out of 12 total builds):
    exception: 7
    success: 1
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.com-b1f7e1cb81a4
Try run for 76fa70319688 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=76fa70319688
Results (out of 36 total builds):
    success: 28
    failure: 8
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.com-76fa70319688
(In reply to Mozilla RelEng Bot from comment #5)
> Try run for 76fa70319688 is complete.
> Detailed breakdown of the results available here:
>     https://tbpl.mozilla.org/?tree=Try&rev=76fa70319688
> Results (out of 36 total builds):
>     success: 28
>     failure: 8
> Builds (or logs if builds failed) available at:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.
> com-76fa70319688

Oh the irony. Python 2.4 doesn't have ctypes installed by default, so this doesn't work.

Possible solutions:

1. Install pywin32 for python 2.5, update talos to use python 2.5, then apply this patch.
2. Install ctypes for python 2.4, apply this patch, test.
3. Somehow allow specifying the version of the python interpreter to use in talos.json?

I prefer (3) but have no idea how difficult it would be to implement.
Attached patch Remove dep on pywin32 (obsolete) — Splinter Review
I think this about what we want. Note that I changed some interfaces while working on this. One thing I'm not 100% sure about is my use of underscores for internal class names in cmanager_win32.py (e.g. _PDH_COUNTER_PATH_ELEMENTS_A). Thoughts welcome.
Attachment #601365 - Flags: review?(jmaher)
Attachment #601365 - Flags: feedback?(jhammel)
(In reply to William Lachance (:wlach) from comment #6)
> (In reply to Mozilla RelEng Bot from comment #5)
> > Try run for 76fa70319688 is complete.
> > Detailed breakdown of the results available here:
> >     https://tbpl.mozilla.org/?tree=Try&rev=76fa70319688
> > Results (out of 36 total builds):
> >     success: 28
> >     failure: 8
> > Builds (or logs if builds failed) available at:
> > http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.
> > com-76fa70319688
> 
> Oh the irony. Python 2.4 doesn't have ctypes installed by default, so this
> doesn't work.
> 
> Possible solutions:
> 
> 1. Install pywin32 for python 2.5, update talos to use python 2.5, then
> apply this patch.
> 2. Install ctypes for python 2.4, apply this patch, test.
> 3. Somehow allow specifying the version of the python interpreter to use in
> talos.json?
> 
> I prefer (3) but have no idea how difficult it would be to implement.

After discussing on irc I think we want to update the test slaves to something like python 2.7 + pyyaml, then apply some version of my patch while modifying the build config at the same time. First step to doing that is to try doing this in the staging environment.
Comment on attachment 601365 [details] [diff] [review]
Remove dep on pywin32

Review of attachment 601365 [details] [diff] [review]:
-----------------------------------------------------------------

now that mozprocess is in here, will we be able to use it in other talos parts instead of subprocess.popen, etc...?

Lots of little details and I am concerned about the exceptions from cmanager_win32.py.

::: setup.py
@@ +9,5 @@
>      description = ''
>  
>  version = "0.0"
>  
> +dependencies = ['PyYAML', 'mozdevice', 'mozhttpd >= 0.2', 'mozinfo', 'mozdevice', 'mozprocess']

two mozdevices in the list.

::: talos/cmanager_win32.py
@@ +38,5 @@
> +from ctypes import windll
> +from ctypes.wintypes import DWORD, HANDLE, LPSTR, LPCSTR, LPCWSTR, Structure, pointer, LONG
> +from ctypes import byref, create_string_buffer, memmove, Union, c_double, c_longlong
> +import struct
> +pdh = windll.pdh

should we try/except around these imports?

@@ +160,5 @@
> +      # Add the counter path for the default process.
> +      path = _makeCounterPath(process, counter)
> +      if not path:
> +        raise Exception("Could not create counter path for counter %s for %s" %
> +                        (counter, process))

who catches these exceptions?

@@ +167,5 @@
> +      if pdh.PdhOpenQuery(None, None, byref(hq)) != 0:
> +          raise Exception("Could not open win32 counter query")
> +      hc = HANDLE()
> +      if pdh.PdhAddCounterA(hq, path, 0, byref(hc)) != 0:
> +          raise Exception("Could not add win32 counter %s" % path)

mixing 2 and 4 space indentation

@@ +226,5 @@
>      self.updateCounterPathsForChildProcesses(counter)
>      hq = self.registeredCounters[counter][0]
>  
> +    # we'll just ignore the return value here, in case no counters are valid anymore
> +    pdh.PdhCollectQueryData(hq)

not using try/except, is there a reason?

::: talos/ffprocess_linux.py
@@ +52,5 @@
> +        pid: integer process id of the process to terminate.
> +    """
> +    ret = ''
> +    try:
> +        if self.ProcessesWithNames(str(pid)):

referencing self. outside of a class

@@ +65,5 @@
> +                    os.kill(pid, signal.SIGKILL)
> +                    ret = 'terminated with SIGKILL'
> +    except OSError, (errno, strerror):
> +        print 'WARNING: failed os.kill: %s : %s' % (errno, strerror)
> +    return ret

why is this pulled out of the class?

::: talos/ffprocess_mac.py
@@ +59,5 @@
> +    """
> +    ret = ''
> +    try:
> +        for sig in ('SIGTERM', 'SIGKILL'):
> +            if self.ProcessesWithNames(str(pid)):

referencing self. outside of a class

@@ +65,5 @@
> +                time.sleep(timeout)
> +                ret = 'terminated with %s' % sig
> +    except OSError, (errno, strerror):
> +        print 'WARNING: failed os.kill: %s : %s' % (errno, strerror)
> +    return ret

why is this pulled out of the class?

::: talos/ffprocess_win32.py
@@ +42,2 @@
>  try:
> +  import mozprocess.wpk as wpk

do we need this in a try/except?

@@ +98,5 @@
> +                ret = wpk.kill_pid(pid)
> +                if result and ret:
> +                    result = result + ', '
> +                if ret:
> +                    result = result + process_name + '(' + str(pid) + '): ' + ret

could this just be:
ret = wpk.kill_pid(pid)
results = []
if ret:
    results.append(process_name + '(' + str(pid) + '): ' + ret)

return ', '.join(results)
Attachment #601365 - Flags: review?(jmaher) → review-
Attached patch Remove pywin32 dep, take 2 (obsolete) — Splinter Review
Hopefully addresses some of the feedback
Attachment #601365 - Attachment is obsolete: true
Attachment #601365 - Flags: feedback?(jhammel)
Just posted a new patch, plz hold off on review until it passes try, but I'll comment on stuff now:

(In reply to Joel Maher (:jmaher) from comment #9)
> Comment on attachment 601365 [details] [diff] [review]
> Remove dep on pywin32
> 
> Review of attachment 601365 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> now that mozprocess is in here, will we be able to use it in other talos
> parts instead of subprocess.popen, etc...?

Yup, though there are probably only a few places where this is really necessary.

> Lots of little details and I am concerned about the exceptions from
> cmanager_win32.py.
> 
> ::: setup.py
> @@ +9,5 @@
> >      description = ''
> >  
> >  version = "0.0"
> >  
> > +dependencies = ['PyYAML', 'mozdevice', 'mozhttpd >= 0.2', 'mozinfo', 'mozdevice', 'mozprocess']
> 
> two mozdevices in the list.

Fixed

> ::: talos/cmanager_win32.py
> @@ +38,5 @@
> > +from ctypes import windll
> > +from ctypes.wintypes import DWORD, HANDLE, LPSTR, LPCSTR, LPCWSTR, Structure, pointer, LONG
> > +from ctypes import byref, create_string_buffer, memmove, Union, c_double, c_longlong
> > +import struct
> > +pdh = windll.pdh
> 
> should we try/except around these imports?

I don't think we need to, as we only import this file on windows

> @@ +160,5 @@
> > +      # Add the counter path for the default process.
> > +      path = _makeCounterPath(process, counter)
> > +      if not path:
> > +        raise Exception("Could not create counter path for counter %s for %s" %
> > +                        (counter, process))
> 
> who catches these exceptions?

Ultimately I think it is test_file in run_tests.py.

I changed these to talosExceptions to be consistent with the rest of talos and how exceptions are processed.

> @@ +167,5 @@
> > +      if pdh.PdhOpenQuery(None, None, byref(hq)) != 0:
> > +          raise Exception("Could not open win32 counter query")
> > +      hc = HANDLE()
> > +      if pdh.PdhAddCounterA(hq, path, 0, byref(hc)) != 0:
> > +          raise Exception("Could not add win32 counter %s" % path)
> 
> mixing 2 and 4 space indentation

Fixed

> @@ +226,5 @@
> >      self.updateCounterPathsForChildProcesses(counter)
> >      hq = self.registeredCounters[counter][0]
> >  
> > +    # we'll just ignore the return value here, in case no counters are valid anymore
> > +    pdh.PdhCollectQueryData(hq)
> 
> not using try/except, is there a reason?

This doesn't throw an exception.

> ::: talos/ffprocess_linux.py
> @@ +52,5 @@
> > +        pid: integer process id of the process to terminate.
> > +    """
> > +    ret = ''
> > +    try:
> > +        if self.ProcessesWithNames(str(pid)):
> 
> referencing self. outside of a class

Fixed (I moved _TerminateProcess back inside the classes)

> @@ +65,5 @@
> > +                    os.kill(pid, signal.SIGKILL)
> > +                    ret = 'terminated with SIGKILL'
> > +    except OSError, (errno, strerror):
> > +        print 'WARNING: failed os.kill: %s : %s' % (errno, strerror)
> > +    return ret
> 
> why is this pulled out of the class?

Not really any good reason, aside from the fact that I felt that we shouldn't be calling this function directly. Except, we actually do. I just changed this, so _TerminateProcess is back inside ffprocess.

> ::: talos/ffprocess_mac.py
> @@ +59,5 @@
> > +    """
> > +    ret = ''
> > +    try:
> > +        for sig in ('SIGTERM', 'SIGKILL'):
> > +            if self.ProcessesWithNames(str(pid)):
> 
> referencing self. outside of a class

Fixed, as above.

> @@ +65,5 @@
> > +                time.sleep(timeout)
> > +                ret = 'terminated with %s' % sig
> > +    except OSError, (errno, strerror):
> > +        print 'WARNING: failed os.kill: %s : %s' % (errno, strerror)
> > +    return ret
> 
> why is this pulled out of the class?

Fixed, as above.

> ::: talos/ffprocess_win32.py
> @@ +42,2 @@
> >  try:
> > +  import mozprocess.wpk as wpk
> 
> do we need this in a try/except?

Yes, because we always try to import this file, even on non-win32 platforms

> @@ +98,5 @@
> > +                ret = wpk.kill_pid(pid)
> > +                if result and ret:
> > +                    result = result + ', '
> > +                if ret:
> > +                    result = result + process_name + '(' + str(pid) + '): ' + ret
> 
> could this just be:
> ret = wpk.kill_pid(pid)
> results = []
> if ret:
>     results.append(process_name + '(' + str(pid) + '): ' + ret)
> 
> return ', '.join(results)

Yeah, come to think of it we really should try to factor that bit of code out into a common function, as it is basically the same between win32, linux, and mac. I tried to do so in my new patch.
Try run for f648d95bb977 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=f648d95bb977
Results (out of 61 total builds):
    success: 60
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/wlachance@mozilla.com-f648d95bb977
Blocks: 734466
Attached patch Remove pywin32 dep, take 3 (obsolete) — Splinter Review
This fixes all the issues pointed out above, as well as making the counters behaviour more like the old pywin32, which fixes some errors we were seeing in staging.
Attachment #604120 - Attachment is obsolete: true
Attachment #605463 - Flags: review?(jmaher)
Comment on attachment 605463 [details] [diff] [review]
Remove pywin32 dep, take 3

Review of attachment 605463 [details] [diff] [review]:
-----------------------------------------------------------------

I don't like how in cmanager_win32.py we raise errors where the original handled it all internally and returned error codes.  But in ttest.py where we instantiate the cmanager, that code is in a try/except block and will work.

::: talos/cmanager_win32.py
@@ +50,5 @@
> +                ("szObjectName",LPSTR),
> +                ("szInstanceName", LPSTR),
> +                ("szParentInstance",LPSTR),
> +                ("dwInstanceIndex", DWORD),
> +                ("szCounterName",LPSTR)]

4 space indent, rest of the file is 2 space.

@@ +122,5 @@
> +        # Assume that this is a memory counter for the system, not a process
> +        # counter
> +        # If we got an error that has nothing to do with that, the exception
> +        # will almost certainly be re-raised
> +        self._addCounter(process, 'Memory', counter)

this seems fragile.

::: talos/ffprocess_mac.py
@@ +49,5 @@
>  import shutil
>  import utils
>  import platform
>  
> +

please don't add an extra line here

::: talos/ffprocess_win32.py
@@ +80,4 @@
>  
> +    def _TerminateProcess(self, pid, timeout):
> +        wpk.kill_pid(pid)
> +        return "terminated with PROCESS_TERMINATE"

I would adjust this to say "Terminated with MozProcess"
Attachment #605463 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #14)
> Comment on attachment 605463 [details] [diff] [review]
> Remove pywin32 dep, take 3
> 
> Review of attachment 605463 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't like how in cmanager_win32.py we raise errors where the original
> handled it all internally and returned error codes.  But in ttest.py where
> we instantiate the cmanager, that code is in a try/except block and will
> work.

Hmm, I'm actually not sure what you're talking about here: afaict the original did not return error codes. There were certain errors which the original silently ignored but we throw an exception for. I think that's probably better behaviour.

> ::: talos/cmanager_win32.py
> @@ +50,5 @@
> > +                ("szObjectName",LPSTR),
> > +                ("szInstanceName", LPSTR),
> > +                ("szParentInstance",LPSTR),
> > +                ("dwInstanceIndex", DWORD),
> > +                ("szCounterName",LPSTR)]
> 
> 4 space indent, rest of the file is 2 space.

Will fix.

> @@ +122,5 @@
> > +        # Assume that this is a memory counter for the system, not a process
> > +        # counter
> > +        # If we got an error that has nothing to do with that, the exception
> > +        # will almost certainly be re-raised
> > +        self._addCounter(process, 'Memory', counter)
> 
> this seems fragile.

It is indeed, but I'm just preserving the original behaviour. A better solution would be to actually define the type of counter in the configuration, but that's beyond the scope of this bug IMO.

> ::: talos/ffprocess_mac.py
> @@ +49,5 @@
> >  import shutil
> >  import utils
> >  import platform
> >  
> > +
> 
> please don't add an extra line here

Will fix.

> ::: talos/ffprocess_win32.py
> @@ +80,4 @@
> >  
> > +    def _TerminateProcess(self, pid, timeout):
> > +        wpk.kill_pid(pid)
> > +        return "terminated with PROCESS_TERMINATE"
> 
> I would adjust this to say "Terminated with MozProcess"

This message is meant to describe how the process was terminated. I think PROCESS_TERMINATE does a better job of doing this (as opposed to MozProcess, which just describes the mechanism which ultimately calls PROCESS_TERMINATE)
Updated patch for bitrot, carrying over r+
Attachment #605463 - Attachment is obsolete: true
Attachment #630535 - Flags: review+
Comment on attachment 675316 [details] [diff] [review]
updated for bitrot

> # use mozhttpd 0.4 excplitly
>-mozhttpd_src = 'https://raw.github.com/mozilla/mozbase/mozhttpd-0.4/'
>+mozhttpd_src = 'https://raw.github.com/mozilla/mozbase/mozhttpd-0.5/'

Drive-by nitpick: You might want to update the comment as well.
(In reply to Gary Kwong [:gkw, :nth10sd] from comment #18)
> Comment on attachment 675316 [details] [diff] [review]
> updated for bitrot
> 
> > # use mozhttpd 0.4 excplitly
> >-mozhttpd_src = 'https://raw.github.com/mozilla/mozbase/mozhttpd-0.4/'
> >+mozhttpd_src = 'https://raw.github.com/mozilla/mozbase/mozhttpd-0.5/'
> 
> Drive-by nitpick: You might want to update the comment as well.

Already committed: http://hg.mozilla.org/build/talos/rev/95422acc676a
I found a few other minor nits...new patch should be uploaded no later than tomorrow morning.
s/excplitly/explicitly too. Thanks Joel!
'Twould be nice if we could do this Q12013.  Not sure if it ever got written down, but this was sorta a goal for Q42012
updated for bitrot, tested on win32 try.  Need to test on all platforms and then deploy.
Assignee: wlachance → jmaher
Status: NEW → ASSIGNED
Attachment #784389 - Flags: review+
https://hg.mozilla.org/build/talos/rev/ae09eb897102
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
one fallout from this patch, ended up causing talos changes to be backed out of inbound.
Attachment #785067 - Flags: review?(jhammel)
Attachment #785067 - Flags: review?(jhammel) → review+
Blocks: 900913
You need to log in before you can comment on or make changes to this bug.