Closed
Bug 969518
Opened 10 years ago
Closed 10 years ago
Autophone - attempt to use reverse tethering so we can use ethernet over usb instead of wifi
Categories
(Testing Graveyard :: Autophone, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bc, Assigned: bc)
Details
Attachments
(3 files, 1 obsolete file)
9.35 KB, patch (mcote: review+)
9.34 KB, patch
12.45 KB, patch (mcote: review+)
Using multiple phones and tablets in close proximity, all on the same wifi network, is an invitation to networking issues. We should attempt to use reverse tethering so we can use ethernet over usb, which will not be subject to the same network congestion as wifi.
Assignee | ||
Comment 1•10 years ago
I have left the wifi connection active on the phones so I can use it for controlling them via the SUTAgent. Trying to deal with receiving notification of phone reboots so that the usb network could be set up without already having a SUTAgent was a bridge too far. It will be interesting to see if moving the s1s2 traffic off of the wifi network will help improve the SUTAgent reliability. We will still see possible interference over wifi for the SUTAgent activities.

I've used a shell script to set up the usb network using ppp over usb. This requires the phone to run the shell as root and will require 're-rooting' of the phones/devices used in production. The script uses Linux-specific commands such as sysctl and iptables, so the autophone host will have to be converted from OS X to Linux. On the first invocation, the script sets up the ppp device and the iptables rules. On subsequent invocations, it skips creating the iptables rules if the ppp device has already been created.

The devices can access the entire local network over the usb network. I've used ip route on the phone to make sure the requests for the remote test pages travel over the usb connection.

I have a sample running from an old X41 laptop running Fedora 20 and posting results to <http://phonedash-dev.allizom.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2014-03-04/2014-03-05/notcached/errorbars/standarderror>. It is testing mozilla-central builds for a gs2 and nexus one for 2014-03-04 and has debug level logging turned on in Autophone. It should be done in a couple of hours.

I plan on using iproute2's tc command for traffic shaping. It is available on Linux and on the phones. I haven't tried it out yet, though. I likewise haven't measured the raw network throughput available over the usb network connection.

Some observations:

* Linux *may* produce more stable results than OS X without any use of usb networking.
* Physical proximity to other phones may affect the wifi remote results more than I appreciated.
* The production Autophone system has been running but has not been that active this morning, and its load may affect the results.
* It is unknown how running more phones over usb will scale.
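The first-invocation versus later-invocation behavior described for the script can be sketched roughly as follows. This is only an illustration in Python, not the contents of usbnet.sh: the function name `plan_usbnet_setup`, its parameters, and the specific MASQUERADE rule are assumptions; the comment above only says that sysctl and iptables are used and that the rules are skipped once the ppp device exists.

```python
def plan_usbnet_setup(ppp_device_exists, phone_usb_ip, gateway_iface):
    """Return the host-side commands one invocation would run.

    Hypothetical sketch: the exact rules used by the real script are
    not shown in this bug, so the NAT rule below is an assumption.
    """
    if ppp_device_exists:
        # Later invocation: the ppp device and its rules already exist,
        # so there is nothing more to create.
        return []
    return [
        # Let the host forward packets between the ppp link and the LAN.
        ['sysctl', '-w', 'net.ipv4.ip_forward=1'],
        # Assumed NAT rule for traffic arriving from the phone.
        ['iptables', '-t', 'nat', '-A', 'POSTROUTING',
         '-s', phone_usb_ip, '-o', gateway_iface, '-j', 'MASQUERADE'],
    ]
```

Returning the command lists instead of running them keeps the decision logic testable without root access on the host.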
Attachment #8386166 -
Flags: review?(mcote)
Comment 2•10 years ago
Comment on attachment 8386166
bug-969518-usbnet.patch

Review of attachment 8386166:
-----------------------------------------------------------------

Looks good to me, just tiny nits.

::: USAGE.md
@@ +32,5 @@
> + If specified, set up adb ppp over usb
> + connections 'so that all traffic from the
> + devices to the host or network 'specified by
> + usb_network passes through the 'ppp over usb
> + connection. Otherwise, use the default 'network.

Looks like you have some quotes from copying the options help.

::: autophone.py
@@ +363,5 @@
> self.logger.debug('Received registration message for known phone '
> '%s.' % phoneid)
> worker = self.phone_workers[phoneid]
> + if worker.phone_cfg == phone_cfg:
> + if phone_cfg['usb_network']:

I think you meant to use USB_NETWORK.

@@ +370,5 @@
> + worker.worker_num + 1)
> + phone_usb_ip = '.'.join(usb_ip_parts)
> + output = subprocess.check_output([
> + usbnet_script,
> + '-s', phone_cfg['serial'],

SERIAL
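The quoted hunk builds `phone_usb_ip` from `usb_ip_parts` and `worker.worker_num + 1`, but the lines that compute the parts are not shown. One plausible reading, given purely as an assumption, is that the last octet of the configured `usb_network` address is offset by the worker number. The helper name `derive_phone_usb_ip` is hypothetical.

```python
def derive_phone_usb_ip(usb_network, worker_num):
    """Offset the last octet of usb_network by worker_num + 1.

    Assumption: this is a guess at what the elided patch lines do,
    not a copy of the code in autophone.py.
    """
    usb_ip_parts = usb_network.split('.')
    if len(usb_ip_parts) != 4:
        raise ValueError('expected dotted-quad IPv4: %r' % usb_network)
    last_octet = int(usb_ip_parts[3]) + worker_num + 1
    if not 0 < last_octet < 255:
        raise ValueError('last octet %d is out of range' % last_octet)
    usb_ip_parts[3] = str(last_octet)
    return '.'.join(usb_ip_parts)
```

With this scheme, each worker gets a distinct address directly above the base address, so the number of phones per host is bounded by the room left in the last octet.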
Attachment #8386166 -
Flags: review?(mcote) → review+
Assignee | ||
Comment 3•10 years ago
damn, I saw the coloring in USAGE.md but was so tired I missed fixing it. Fixed the USB_NETWORK issue but I don't currently have a variable SERIAL or some of the other phone_cfg attributes. Is that something you would want for 'phoneid', 'serial', 'ip', etc?
Assignee | ||
Comment 4•10 years ago
https://github.com/mozilla/autophone/commit/14cdbd778803b0026c001c44a0e3627da6e12868 I'll leave this open while I finish the investigation and the switch over for the host from OSX to Linux.
Assignee | ||
Comment 5•10 years ago
* autophone.ini.example
** Update with usbnet examples.
* autophone.py
** Make the host port distinct for each ppp connection.
** Add debug logging about usbnet set up and catch errors from calling usbnet.sh.
* usbnet.sh
** Fix typo in usage.
** Restrict usbnet to devices where we can run adbd on device as root so we can create the ppp device on the phone.
** If we have udev rules set up we do not need suid adb except for when creating the ppp devices.
** Explicitly turn off deflate and bsd compression on the ppp connection.
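Catching errors from the usbnet.sh call could look roughly like the sketch below. The `-s` serial option appears in the reviewed patch; the function name `run_usbnet`, the `extra_args` parameter, and the `(ok, output)` return shape are assumptions made for illustration.

```python
import logging
import subprocess


def run_usbnet(usbnet_script, serial, extra_args=(), logger=None):
    """Run the usbnet script for one device and report success.

    Returns (ok, output). Hypothetical helper: only the '-s SERIAL'
    option is taken from the patch under review.
    """
    logger = logger or logging.getLogger('autophone')
    cmd = [usbnet_script, '-s', serial] + list(extra_args)
    logger.debug('usbnet: running %s', ' '.join(cmd))
    try:
        output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        # The script ran but exited with a non-zero status.
        text = e.output.decode('utf-8', 'replace')
        logger.error('usbnet: exit status %d: %s', e.returncode, text)
        return False, text
    except OSError as e:
        # The script could not be started at all (missing, not executable).
        logger.error('usbnet: could not run %s: %s', usbnet_script, e)
        return False, str(e)
    text = output.decode('utf-8', 'replace')
    logger.debug('usbnet: %s', text)
    return True, text
```

A caller can then decide what a failure means, for example skipping the device or shutting down, without the exception escaping into the registration handler.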
Attachment #8393564 -
Flags: review?(mcote)
Comment 6•10 years ago
Comment on attachment 8393564
bug-969518-followup-1.patch

Review of attachment 8393564:
-----------------------------------------------------------------

::: autophone.ini.example
@@ +2,5 @@
> #clear_cache = False
> #ipaddr = ...
> #port = 28001
> +#usb_network=192.168.1.50
> +#usb_gateway=br0

Nit: spacing is inconsistent with the rest of the file.

::: usbnet.sh
@@ +48,5 @@
> fi
>
> echo "waiting for device $serialno"
> +adb -s $serialno wait-for-device
> +adb -s $serialno root

Should you maybe abort (loudly) on error here?
Attachment #8393564 -
Flags: review?(mcote) → review+
Assignee | ||
Comment 7•10 years ago
(In reply to Mark Côté ( :mcote ) from comment #6)
> Comment on attachment 8393564
> bug-969518-followup-1.patch
> Should you maybe abort (loudly) on error here?

Good idea.
Assignee | ||
Comment 8•10 years ago
Not sure how much effort you want to put into reviewing this since it will all be ripped out in our movement to adb instead of SUTAgent.

* autophone.ini.example
** Update with usbnet examples.
* autophone.py
** Do not lower case registration url data as device serial numbers are case sensitive.
** Make the host port distinct for each ppp connection.
** Add debug logging about usbnet set up and catch errors from calling usbnet.sh. Terminate Autophone and send email notification if an error occurs calling usbnet.sh.
** In Autophone.stop, call stop on the workers before calling shutdown on the Autophone instance. This, along with the change to worker.py, helps prevent deadlocks when shutting down the server.
* worker.py
** Call terminate on worker process when stopping the worker.
* usbnet.sh
** Fix typo in usage.
** Implement a shell function wait_for_device using adb get-state which will time out and return an error after 30 seconds.
** Restrict usbnet to devices where we can run adbd on device as root so we can create the ppp device on the phone. Terminate with a non-zero exit code if adb root is not supported by a device.
** If we have udev rules set up we do not need suid adb except for when creating the ppp devices.
** Explicitly turn off deflate and bsd compression on the ppp connection.

I'd rather not spend much more time on this since it is a dead end. I intend to tag the repo with this revision so we can easily identify the last SUTAgent based revision.
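The wait_for_device idea (poll adb get-state, give up after 30 seconds) is implemented in the patch as a shell function. The same logic is sketched here in Python for illustration only. The injectable `get_state`, `sleep`, and `clock` parameters are assumptions added so the loop can be exercised without a device attached.

```python
import subprocess
import time


def adb_get_state(serial):
    """Return the output of `adb -s SERIAL get-state`, or '' on failure."""
    try:
        out = subprocess.check_output(
            ['adb', '-s', serial, 'get-state'], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return ''
    return out.decode('utf-8', 'replace').strip()


def wait_for_device(serial, timeout=30.0, interval=1.0,
                    get_state=adb_get_state, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll until the device reports state 'device' or the timeout expires.

    Returns True if the device came up, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state(serial) == 'device':
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Unlike a bare `adb wait-for-device`, which blocks indefinitely, a polling loop with a deadline lets the caller fail with an error when a phone never comes back.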
Attachment #8393564 -
Attachment is obsolete: true
Attachment #8400006 -
Flags: review?(mcote)
Assignee | ||
Comment 9•10 years ago
The usb networking is not reliable. Comparing an original usbnet run <http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/remote-twitter/norejected/2014-03-26/2014-03-26/notcached/noerrorbars/standarderror> to an additional usbnet run and a wifi based run <http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/remote-twitter/norejected/2014-03-26/2014-03-26/notcached/errorbars/standarderror> shows the usbnet runs are not reproducible. It appears that there is a secular trend after repeated reboots of the device. It may be related to the increasing 'device number' assigned to the device on each reboot, but I'm not certain. Mark and I discussed the situation and have decided that usbnet using ppp/adb is not workable and that wifi is also not viable due to its variability and its unsuitability for hosting in a colo environment. Once this is checked in, we will move towards a local test (from sdcard) only solution using adb instead of SUTAgent over tcp/ip.
Comment 10•10 years ago
Comment on attachment 8400006
bug-969518-followup-2.patch

Review of attachment 8400006:
-----------------------------------------------------------------

Looks good although I wonder about the terminate().

::: worker.py
@@ +206,4 @@
> """Call from main process."""
> if self.is_alive():
> self.cmd_queue.put_nowait(('stop', None))
> + self.p.terminate()

Hm, isn't it nicer to try to let the process stop gracefully rather than immediately SIGTERMing it? Pretty much no point in sending the stop command if you're going to terminate it right after.
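For reference, the graceful alternative raised in this review could be sketched as below: send the stop command, wait a bounded time, and terminate only if the process is still alive. This is not what the patch does. The function name `stop_worker` and the `grace_seconds` parameter are hypothetical, and the sketch assumes the worker actually reads its command queue and exits on 'stop'.

```python
def stop_worker(process, cmd_queue, grace_seconds=10.0):
    """Ask a worker to stop; terminate it if it does not exit in time.

    `process` is expected to behave like multiprocessing.Process
    (is_alive, join, terminate) and `cmd_queue` like
    multiprocessing.Queue (put_nowait). Returns True if the worker
    exited on its own, False if it had to be terminated.
    """
    if not process.is_alive():
        return True
    cmd_queue.put_nowait(('stop', None))
    # Give the worker a bounded amount of time to act on the command.
    process.join(grace_seconds)
    if process.is_alive():
        process.terminate()
        process.join()
        return False
    return True
```

The grace period only helps if the worker polls its queue often enough to see the command before the timeout; otherwise this degrades to a delayed terminate.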
Attachment #8400006 -
Flags: review?(mcote) → review+
Assignee | ||
Comment 11•10 years ago
Not really. It would just keep going and start running the tests. The whole stop, disable, etc. thing is and has been broken. If we want to be able to handle a failure to start the ppp/adb networking, we need to keep the device(s) from running the test and submitting results, then rebooting and doing the same thing over and over again. That is why I finally was forced to just stop Autophone altogether, but that wasn't enough since the phone would just keep running the next test. It would crap out when it tried to get the next one since Autophone would be 'sort of down', but it would always submit one set. I finally got tired of wasting time with the broken control system since we are going to have to change a lot of this anyway.
Comment 12•10 years ago
Okay, I'm just saying that you might as well take the 'stop' out of there, then, since I doubt it would be processed before terminate() acts.
Assignee | ||
Comment 13•10 years ago
Ah, right. Good point.

https://github.com/mozilla/autophone/commit/850f6a2f783737d8d62a86f0425d11afadc2cb80

We tried and failed, so I'm marking this fixed as filed. Maybe we can revisit some time.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 14•10 years ago
I've finally gotten my nexus 4, nexus 5 and nexus 7 devices flashed back to factory and rooted via changing the default.prop in the ramdisk.img to enable adb root so I *could* run a test of usb networking with them to see if they exhibit the same secular behavior. I'll try to pick a quiet time and run the same 2014-03-26 day to phonedash-dev and see how they behave.
Updated•2 years ago
Product: Testing → Testing Graveyard