Closed Bug 964516 Opened 10 years ago Closed 10 years ago

Improve mozdevice's reboot method

Categories

(Testing :: Mozbase, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: wlach)

References

Details

Attachments

(1 file, 2 obsolete files)

Problems:

1. There is no standardized API to say "wait for reboot complete" in mozdevice. The parameters you have to pass to dmSUT and dmADB are different.
2. The dmSUT code requires you to pass ip address and port information, even though it can perfectly well figure this out itself.
3. There is no command line client to test this functionality.
4. Logging is spotty.
CCing peoples who might be interested in this.
Comment on attachment 8366313 [details] [diff] [review]
Improve mozdevice rebooting logic;r=bc

Review of attachment 8366313 [details] [diff] [review]:
-----------------------------------------------------------------

Let me know what you think. I tried to clean up the logic in the ways outlined in the bug while preserving backwards compatibility.
Attachment #8366313 - Flags: review?(bclary)
Attachment #8366313 - Flags: feedback?(gbrown)
wlach, you've completely dropped the ability to force the use of a particular ip address and rely entirely on NetworkTools.getLanIp ?

This will break on a OSX box with VMWware fusion where getLanIp will return the wrong address. I'd have to give this an r- based on that. I believe we can do what you want while providing real backwards compatibility for fixed ip addresses at least.
Comment on attachment 8366313 [details] [diff] [review]
Improve mozdevice rebooting logic;r=bc

Review of attachment 8366313 [details] [diff] [review]:
-----------------------------------------------------------------

I like the changes in general, and did not spot any problems. However, I have not used the reboot function much and am not familiar with this particular code path.

::: mozdevice/mozdevice/devicemanagerSUT.py
@@ +745,5 @@
> +        self._logger.debug('Created reboot callback server at %s:%d' %
> +                           serverSocket.getsockname())
> +        return serverSocket
> +
> +    def _waitForRebootPing(self, serverSocket):

Whenever I see this reboot server code, I wonder why we don't just poll the device for sut connectivity (check that we can connect and get a prompt). Is there a good reason? Maybe this is an opportunity to add a comment here.
Attachment #8366313 - Flags: feedback?(gbrown) → feedback+
Oh, ok. Didn't realize that functionality was actually used. Here's an updated patch.
Attachment #8366313 - Attachment is obsolete: true
Attachment #8366313 - Flags: review?(bclary)
Attachment #8366850 - Flags: review?(bclary)
(In reply to Geoff Brown [:gbrown] from comment #5)
> Comment on attachment 8366313 [details] [diff] [review]
> Improve mozdevice rebooting logic;r=bc
> 
> Review of attachment 8366313 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I like the changes in general, and did not spot any problems. However, I
> have not used the reboot function much and am not familiar with this
> particular code path.
> 
> ::: mozdevice/mozdevice/devicemanagerSUT.py
> @@ +745,5 @@
> > +        self._logger.debug('Created reboot callback server at %s:%d' %
> > +                           serverSocket.getsockname())
> > +        return serverSocket
> > +
> > +    def _waitForRebootPing(self, serverSocket):
> 
> Whenever I see this reboot server code, I wonder why we don't just poll the
> device for sut connectivity (check that we can connect and get a prompt). Is
> there a good reason? Maybe this is an opportunity to add a comment here.

We talked about this on irc yesterday. I think the rationale is that if we just poll the device, it may never actually reboot and we'll still think it succeeded.

http://logbot.glob.com.au/?c=mozilla%23ateam&s=27+Jan+2014&e=27+Jan+2014&h=reboot#c739252

A comment is a good idea, I'll add one before pushing.
Comment on attachment 8366931 [details] [diff] [review]
Update to include settling time in updateApp; add comment about callback server vs. polling the device to see if it's up)

Review of attachment 8366931 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozdevice/mozdevice/devicemanagerSUT.py
@@ +748,5 @@
> +        data = None
> +        startTime = datetime.datetime.now()
> +        while not data and \
> +              datetime.datetime.now() - startTime < datetime.timedelta(
> +                  seconds=self.reboot_timeout):

Maybe use a temporary to make it a bit more readable.
       waitTime = datetime.timedelta(seconds=self.reboot_timeout)
       while not data and datetime.datetime.now() - startTime < waitTime:

@@ +749,5 @@
> +        startTime = datetime.datetime.now()
> +        while not data and \
> +              datetime.datetime.now() - startTime < datetime.timedelta(
> +                  seconds=self.reboot_timeout):
> +            self._logger.info("Waiting for callback ping from device...")

Maybe "Waiting for reboot callback ping from device..." ?

@@ +756,4 @@
>                  # Receiving any data is good enough.
>                  data = conn.recv(1024)
>                  if data:
> +                    self._logger.info("Received callback ping from device!")

Maybe "Received reboot callback ping from device!" ?

@@ +767,3 @@
>  
> +        if not data:
> +            raise DMError('Timed out waiting for reboot callback.')

This is just an observation, so feel free to ignore it.

The serverSocket is still open when this exception is raised. It will be gc'd at some point where it will be closed but I think it would be better to be more deterministic about it. If we are relying on the socket being automatically closed when the object goes out of scope and is gc'd, then why bother explicitly closing them at all?

If conn.recv or conn.sendall raise an exception, conn won't be explicitly closed either.

Even when we close a socket, the actual connection is not closed immediately. What do you think of a) calling socket.shutdown before calling socket.close and b) making sure we close every socket explicitly regardless of exceptions?

@@ +776,5 @@
> +
> +
> +    def reboot(self, wait=False, ipAddr=None, port=30000):
> +        # ipAddr, port ^^^ are here for backwards compatibility only! use
> +        # wait if you want to wait for a reboot

Maybe a docstring more in the line of updateApp. Also, ipAddr isn't just for backwards compatibility now.

Note if we really do mean to keep backwards compatibility, then if someone has called reboot with positional arguments the way Autophone does with reboot(self.ipaddr, 30000+self.worker_num) we will want to move wait to after port in the argument list though you might consider that a bug in Autophone which I will happily fix, but it might bite you elsewhere.

def reboot(self, ipAddr=None, port=30000, wait=False):
"""
       Reboots the device.

       :param wait: block on device to come back up before returning
       :param ipAddr: if specified, try to make the device connect to this
                      specific IP address after rebooting (only works with
                      SUT; if None, we try to determine a reasonable address
                      ourselves)"""
        :param port: Only here for backwards compatibility. It is determined
                     automatically.
"""

@@ +875,5 @@
> +    def updateApp(self, appBundlePath, processName=None, destPath=None, wait=False,
> +                  ipAddr=None, port=30000):
> +        # ipAddr, port ^^^ are here for backwards compatibility only! use
> +        # wait if you want to wait for a reboot
> +        wait = (wait or ipAddr)

ditto docstring and ipAddr not being for backwards compatibility only like in reboot above and the change in order of the arguments which may break callers who used positional arguments.
Attachment #8366931 - Flags: review?(bclary) → review+
(In reply to Bob Clary [:bc:] from comment #9)
> Comment on attachment 8366931 [details] [diff] [review]
> Update to include settling time in updateApp; add comment about callback
> server vs. polling the device to see if it's up)
> 
> Review of attachment 8366931 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mozdevice/mozdevice/devicemanagerSUT.py
> @@ +748,5 @@
> > +        data = None
> > +        startTime = datetime.datetime.now()
> > +        while not data and \
> > +              datetime.datetime.now() - startTime < datetime.timedelta(
> > +                  seconds=self.reboot_timeout):
> 
> Maybe use a temporary to make it a bit more readable.
>        waitTime = datetime.timedelta(seconds=self.reboot_timeout)
>        while not data and datetime.datetime.now() - startTime < waitTime:

Will do.

> @@ +749,5 @@
> > +        startTime = datetime.datetime.now()
> > +        while not data and \
> > +              datetime.datetime.now() - startTime < datetime.timedelta(
> > +                  seconds=self.reboot_timeout):
> > +            self._logger.info("Waiting for callback ping from device...")
> 
> Maybe "Waiting for reboot callback ping from device..." ?

Sounds good. Will change.

> @@ +756,4 @@
> >                  # Receiving any data is good enough.
> >                  data = conn.recv(1024)
> >                  if data:
> > +                    self._logger.info("Received callback ping from device!")
> 
> Maybe "Received reboot callback ping from device!" ?

Sounds good. Will change.

> @@ +767,3 @@
> >  
> > +        if not data:
> > +            raise DMError('Timed out waiting for reboot callback.')
> 
> This is just an observation, so feel free to ignore it.
> 
> The serverSocket is still open when this exception is raised. It will be
> gc'd at some point where it will be closed but I think it would be better to
> be more deterministic about it. If we are relying on the socket being
> automatically closed when the object goes out of scope and is gc'd, then why
> bother explicitly closing them at all?
> 
> If conn.recv or conn.sendall raise an exception, conn won't be explicitly
> closed either.
> 
> Even when we close a socket, the actual connection is not closed
> immediately. What do you think of a) calling socket.shutdown before calling
> socket.close and b) making sure we close every socket explicitly regardless
> of exceptions?

On consideration, let's just take out the close(). I don't think it really matters and simpler is better.

> @@ +776,5 @@
> > +
> > +
> > +    def reboot(self, wait=False, ipAddr=None, port=30000):
> > +        # ipAddr, port ^^^ are here for backwards compatibility only! use
> > +        # wait if you want to wait for a reboot
> 
> Maybe a docstring more in the line of updateApp. Also, ipAddr isn't just for
> backwards compatibility now.

Right, will fix.

> 
> Note if we really do mean to keep backwards compatibility, then if someone
> has called reboot with positional arguments the way Autophone does with
> reboot(self.ipaddr, 30000+self.worker_num) we will want to move wait to
> after port in the argument list though you might consider that a bug in
> Autophone which I will happily fix, but it might bite you elsewhere.

Oh, ok. I'll change the order.

> def reboot(self, ipAddr=None, port=30000, wait=False):
> """
>        Reboots the device.
> 
>        :param wait: block on device to come back up before returning
>        :param ipAddr: if specified, try to make the device connect to this
>                       specific IP address after rebooting (only works with
>                       SUT; if None, we try to determine a reasonable address
>                       ourselves)"""
>         :param port: Only here for backwards compatibility. It is determined
>                      automatically.
> """
> 
> @@ +875,5 @@
> > +    def updateApp(self, appBundlePath, processName=None, destPath=None, wait=False,
> > +                  ipAddr=None, port=30000):
> > +        # ipAddr, port ^^^ are here for backwards compatibility only! use
> > +        # wait if you want to wait for a reboot
> > +        wait = (wait or ipAddr)
> 
> ditto docstring and ipAddr not being for backwards compatibility only like
> in reboot above and the change in order of the arguments which may break
> callers who used positional arguments.

Yup.
Blocks: 964301
Pushed with suggested changes.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Depends on: 972346
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: