Closed
Bug 1250909
Opened 8 years ago
Closed 8 years ago
Autophone - add ability for device to reboot when it detects a missing usb connection.
Categories
(Testing Graveyard :: Autophone, defect)
Testing Graveyard
Autophone
Tracking
(firefox47 affected)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox47 | --- | affected |
People
(Reporter: bc, Assigned: bc)
References
Details
Attachments
(7 files, 2 obsolete files)
821 bytes,
text/plain
|
Details | |
5.71 KB,
patch
|
gbrown
:
review+
|
Details | Diff | Splinter Review |
879 bytes,
text/plain
|
gbrown
:
review+
|
Details |
8.89 KB,
text/plain
|
gbrown
:
review+
gbrown
:
feedback+
|
Details |
37.29 KB,
application/x-gzip
|
Details | |
6.58 KB,
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
935 bytes,
text/plain
|
Details |
When a device loses usb connectivity, our current solution is to gracefully shutdown Autophone and reboot the host. This has the drawback of reducing our testing capacity while the shutdown/reboot is in progress and unfortunately does not always work. We have had too frequent occurrences where devices, in particular the Nexus 6P devices, do not reconnect after the host reboot. The solution has been to ask someone to physically unplug and plug in the usb cable on the device. I've experimented with shell based scripts which use the output from dumpsys battery or dumpsys usb to detect if the device has lost usb connectivity and to reboot if so. This works, but has issues due to the lack of nohup and running scripts as background processes. I still have a couple of ideas to try but so far the approaches I've tried aren't practical. It is possible to detect usb attach and detach events in Android apps. See http://developer.android.com/guide/topics/connectivity/usb/index.html. The host and accessory functionality is limited to Android 3.1/api 12+ though some parts of the accessory functionality may be available as an add-on library for Android 2.3.4/api 10. Considering api 10 support is approaching EOL, we might just ignore it for now. If I can't get the shell script approach to work, it will be necessary to use an app. The idea would be that the app would be installed on the device by default. When we start Autophone, Autophone could start the app and have it begin monitoring usb either via receiving events or polling dumpsys battery or dumpsys usb. If the device reboots itself, Autophone will need to restart the app in order to monitor usb as well as do other initialization task such as resetting SELinux to Permissive. There is some potential for conflicts/races between the current code which reboots the host in the event of a device usb disconnection and we will need to gracefully terminate whatever test might have been running when the disconnection occurred. The reason I've mentioned manually starting the app rather than having it automatically run as a service, is to make it impossible for us to get into an infinite reboot loop when the device is not connected via usb. With careful coding, we could eliminate that possibility but having Autophone initiate the usb monitoring seems reasonable to me. Thoughts?
Assignee | ||
Comment 1•8 years ago
|
||
I have a preliminary version of a service which monitors the usb connection via polling dumpsys battery. I tried implementing a BroadcastReceiver that would notify the service when usb was detached but couldn't get that to work. I'm not entirely unhappy with the polling solution since I think I can fit it into Autophone's ping logic pretty well. Once the service is installed, it will be started by the worker via am startservice com.mozilla.bclary.usbwatchdog The service does not start properly on Android 2.3 but does with my gs3/Android 4.0.4 and 6p/6.0.1. I think this will be ok, especially considering Android 2.3 will be dropping from trunk builds in the next couple of weeks. Currently, the service uses a hard coded sleep interval between polls. I don't see how to pass an argument to the service via am startservice since I didn't create any activities for it. Instead, I will have the service attempt to load a configuration file from /data/local/tmp and if it exists, use it to determine the polling interval. If the file is not found, a default value will be used. Autophone can control this setting via a command line option and by pushing the config file to the device before the service is started. When the service detects that the device is no longer USB Powered, it will use su -c reboot to reboot the device. I want the polling interval to be short enough that the most common case will be that the device detects that is has disconnected and reboots before Autophone notices but long enough that the repeated polls of dumpsys battery do not cause significant load. When checking the device's state in PhoneWorkerSubprocess.ping, if get_state() does not return 'device', then the worker will sleep for a time longer then the usbwatchdog polling interval to give the device time to reboot itself. If the device's state is 'device', the device's uptime will be retrieved and checked against a saved value to see if the device has rebooted itself since the last time it was checked. If the uptime is less than the saved uptime value, then additional work must be done to initialize SELinux etc and to make sure any currently running test is failed with a retry. There is some duplication between PhoneWorkerSubprocess.ping and ADBAndroid.is_device_ready, but I think I'll leave that to a follow up bug.
Assignee | ||
Comment 2•8 years ago
|
||
Latest wip version supports sdk 10+, automatically detects whether su 0 or su -c should be used and whether the arguments should be quoted, allows the poll interval to be passed as an int extra to the startservice command and works on Android 2.3+. It *requires* a non-standard su which provides su access for users other than root and shell for the reboot which I do not have on all devices. Currently, this works for nexus-4-{1,2,3,4} and nexus-6p-{1..8}. While this is not total coverage it is an improvement and will hopefuly help alleviate the semi-frequent nexus-6p disconnection problem. I'm not sure, but it might be possible to use the standard android su if the app were installed as a system app and the appropriate permission was requested. Is all that is needed is to be able to mount /system as rw and to place the apk in /system/app/ ?
Assignee | ||
Comment 3•8 years ago
|
||
Assignee | ||
Comment 4•8 years ago
|
||
Assignee | ||
Comment 5•8 years ago
|
||
Attachment #8727376 -
Flags: review?(gbrown)
Assignee | ||
Updated•8 years ago
|
Attachment #8727374 -
Flags: review?(gbrown)
Assignee | ||
Comment 6•8 years ago
|
||
fyi, adbd is running as root and I can remount /system rw for nexus-4-{5,6,7}; nexus-6-{1,2}; nexus-9-{1,2}. I'll experiment here locally to see if I can actually install USBWatchdog as a system app. I don't think this will allow me to run su even if it exists, but perhaps I can add the REBOOT permission and use PowerManager.reboot <http://developer.android.com/reference/android/os/PowerManager.html> when the app is installed as a system app. The current approach is sufficient for the moment, but if any one has suggestsions please feel free to share them.
Assignee | ||
Updated•8 years ago
|
Attachment #8727374 -
Flags: review?(gbrown)
Assignee | ||
Comment 7•8 years ago
|
||
Comment on attachment 8727375 [details]
USBService.java
I'm pretty sure I can do the system app approach for devices which support it. New version coming up.
Attachment #8727375 -
Flags: review?(gbrown)
Assignee | ||
Comment 8•8 years ago
|
||
Attachment #8727822 -
Flags: review?(gbrown)
Assignee | ||
Comment 9•8 years ago
|
||
Attachment #8727374 -
Attachment is obsolete: true
Attachment #8727375 -
Attachment is obsolete: true
Attachment #8727824 -
Flags: review?(gbrown)
Assignee | ||
Updated•8 years ago
|
Attachment #8727822 -
Attachment mime type: text/xml → text/plain
Assignee | ||
Comment 10•8 years ago
|
||
Tested this with my gs3, nexus one and nexus 6p locally. the gs3 and nexus one devices allowed me to make /system rw and I installed the app as a system app. the 6p does not permit remounting /system as rw so it was installed as a normal user app. Also tested with nexus-4-7 and nexus-s-10 in production installed as system apps. the nexus 4 allowed adb remount to work but the nexus s required the following from a shell: mount -o rw,remount -t ext4 /dev/block/platform/s3c-sdhci.0/by-name/system /system They showed adb -s $(mobile-inventory.sh -p nexus-4-7 -s) logcat -d -v time | grep USBWatch 03-08 13:49:17.943 I/USBWatchdog( 1845): Permissions granted 03-08 13:49:17.943 I/USBWatchdog( 1845): Checking battery adb -s $(mobile-inventory.sh -p nexus-s-10 -s) wait-for-device shell logcat -d -v time | grep USBWatchdog 03-08 13:58:29.175 I/USBWatchdog( 604): Permissions granted 03-08 13:58:29.175 I/USBWatchdog( 604): Checking battery I presume they'll reboot when required.
Comment 11•8 years ago
|
||
Comment on attachment 8727822 [details] AndroidManifest.xml > package="com.mozilla.bclary.usbwatchdog" Maybe "com.mozilla.autophone.usbwatchdog"?
Attachment #8727822 -
Flags: review?(gbrown) → review+
Comment 12•8 years ago
|
||
Comment on attachment 8727376 [details] [diff] [review] bug-1250909-v1.patch Review of attachment 8727376 [details] [diff] [review]: ----------------------------------------------------------------- I am a little uncomfortable with the approach. This feels like a return to sutagent and watcher. But I don't have a better idea. ::: autophone.py @@ -787,4 @@ > device_ready_retry_attempts=self.options.device_ready_retry_attempts, > verbose=self.options.verbose, > test_root=test_root) > - dm.power_on() This is moved to the worker now, right? ::: worker.py @@ +363,5 @@ > return success > > + def start_usbwatchdog(self): > + if not self.dm.is_app_installed(self.options.usbwatchdog_appname): > + return Would it be useful to log "usbwatchdog not installed"?
Attachment #8727376 -
Flags: review?(gbrown) → review+
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #11) > Comment on attachment 8727822 [details] > AndroidManifest.xml > > > package="com.mozilla.bclary.usbwatchdog" > > Maybe "com.mozilla.autophone.usbwatchdog"? I can do that. (In reply to Geoff Brown [:gbrown] from comment #12) > Comment on attachment 8727376 [details] [diff] [review] > bug-1250909-v1.patch > > Review of attachment 8727376 [details] [diff] [review]: > ----------------------------------------------------------------- > > I am a little uncomfortable with the approach. This feels like a return to > sutagent and watcher. But I don't have a better idea. > This is the only way I could think of even attempting to recover from a disconnection. We can't talk to it over usb. adb over tcp/ip seems like such a can of worms. > ::: autophone.py > @@ -787,4 @@ > > device_ready_retry_attempts=self.options.device_ready_retry_attempts, > > verbose=self.options.verbose, > > test_root=test_root) > > - dm.power_on() > > This is moved to the worker now, right? > Yeah > ::: worker.py > @@ +363,5 @@ > > return success > > > > + def start_usbwatchdog(self): > > + if not self.dm.is_app_installed(self.options.usbwatchdog_appname): > > + return > > Would it be useful to log "usbwatchdog not installed"? I don't think so. It is easy to check via pm list packages and we need to do the installation manually outside of Autophone anyway. Let me change the app name and set up a new android studio project and put it on github and we'll be ready to go.
Comment 14•8 years ago
|
||
Comment on attachment 8727824 [details] USBService.java Hmm. No splinter review for text? > mPermissions &= mPermissions && (permission == PackageManager.PERMISSION_GRANTED); Too many &'s here? I think: mPermissions = mPermissions && (permission == PackageManager.PERMISSION_GRANTED); would be sufficient. I ran this on my AOSP 4.0.1 Galaxy Nexus and got "Do not have su or sufficient permissions." I'm not sure why: root@android:/ # su -c 'ls /data/data' com.android.apps.tag ... root@android:/ # su -c 'id' uid=0(root) gid=0 groups=0 root@android:/ # run-as com.mozilla.bclary.usbwatchdog root@android:/data/data/com.mozilla.bclary.usbwatchdog $ id uid=10038 gid=10038 root@android:/data/data/com.mozilla.bclary.usbwatchdog $ su -c 'id' uid=0(root) gid=0 groups=0 Would you check this into autophone's github? What's the plan for building?
Attachment #8727824 -
Flags: review?(gbrown) → feedback+
Assignee | ||
Comment 15•8 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #14) > Comment on attachment 8727824 [details] > USBService.java > > Hmm. No splinter review for text? > I guess I'll have to learn that. > > mPermissions &= mPermissions && (permission == PackageManager.PERMISSION_GRANTED); > > Too many &'s here? I think: Thanks. > > mPermissions = mPermissions && (permission == > PackageManager.PERMISSION_GRANTED); > > would be sufficient. > > I ran this on my AOSP 4.0.1 Galaxy Nexus and got "Do not have su or > sufficient permissions." I'm not sure why: > That is a standard android build with standard su right? That does not support su for any users other than root or shell. That is why I needed to add the support for installation as a system app and the use of permissions and the PowerManager. If you still have it installed as a user app, do adb uninstall com.mozilla.bclary.usbwatchdog then adb root adb remount adb push app-debug.apk /data/local/tmp/ adb shell dd if=/data/local/tmp/app-debug.apk of=/system/app/app-debug.apk chmod 644 /system/app/app-debug.apk reboot then adb shell 'am startservice -n com.mozilla.bclary.usbwatchdog/.USBService --ei poll_interval 60' to check if it is running ok: adb logcat -d | grep USBWatchdog Then unplug the cable and wait 60 seconds. It should reboot. > Would you check this into autophone's github? What's the plan for building? I'm putting it in https://github.com/bclary/USBWatchdog It will build as an Android Studio project.
Comment 16•8 years ago
|
||
I re-installed with your procedure and re-tested -- it works fine now. This is an aosp full-eng build. I have had it for a while -- I'm not positive of the origin/history of su. I am surprised that 'run-as <package> su -c ...' was not an effective simulation. Interesting.
Updated•8 years ago
|
Attachment #8727824 -
Flags: review+
Assignee | ||
Comment 17•8 years ago
|
||
I'll add some instructions to the README.md and add a link from Autophone to the repo along with some more instructions/background. Thanks! I'm running a local test with the new apk. I'll check the apk into the autophone repo so people don't have to build it. It is debug so it includes the debug signing. I don't think it is necessary to do the whole full blown signature thing for now.
Assignee | ||
Comment 18•8 years ago
|
||
I had run the original app quite extensively over the weekend without triggering a disconnected device. While I hoped this had solved the problem of recovering a disconnected device I didn't have proof. I ran another long running test last night using the latest version with your changes and ran into problems unrelated to the changes. I am concerned this solution as currently implemented will not solve our problem. One of my nexus-one devices disconnected and it did not recover. I was running autophone without the reboot on error option so did not get the benefit of the host reboot. It wasn't clear to me that the reboot had occurred. I restarted and ran into the same problem some hours later with the same nexus one device. I carefully checked the uptime quickly after unplugging and plugging in the device before the running autophone instance could reboot it and it appeared it had rebooted but it wasn't absolutely clear. I changed autophone's configuration to reboot on error and started the test again. It ran well overnight up until a few minutes ago when my workstation rebooted. A different nexus one device had disconnected. I looked at the device itself and found it was still usb charging though the host did not detect it via adb devices. The use of the signal "USB powered: false" may not be sufficient. I may need to detect if adb debugging is active instead. I should also log the reboot since it is not possible to see anything from logcat from the time the disconnection occurs and when the reboot occurs.
Assignee | ||
Comment 19•8 years ago
|
||
I've added checking for the adbd process. During a test, the nexus 6p device disconnected but the USBWatchdog did not detect it. I shut down the autophone instance, and did adb kill-server then adb devices and it did not recover. I rebooted the workstation and it did recover. Attaching the logcat from the device in case someone sees something. The error occurred shortly after a reboot and the first invocation of fennec around 04:34.
Assignee | ||
Comment 20•8 years ago
|
||
Carrying forward gbrown's r+. I've pushed the USBWatchdog Android Studio project to https://github.com/bclary/USBWatchdog This version does not attempt to use dumpsys battery to check for USB Power or the existence of the adbd process to detect when the connection is lost. I found that neither of these would detect device disconnections. Instead it uses a "heartbeat" property which is periodically set by the worker process. When the USBWatchdog service detects that the heartbeat has not been updated within the specified polling interval it will reboot the device. This approach will accurately detect when the host can not longer communicate with the device over USB. I also made minor changes to include the ability to pass a debug argument when starting the USBWatchdog service. I am not certain that this will alleviate all USB disconnections which require physical intervention. I found some cases during local testing where a simple reboot of the device without rebooting the host did not reconnect the device. It is my hope this will at least reduce the number of required physical interventions.
Attachment #8731249 -
Flags: review+
Assignee | ||
Comment 21•8 years ago
|
||
https://github.com/mozilla/autophone/commit/4a9b86aaf016587acedcfbe9a8bf77ff1a5d9db1 I'll close when the apk is installed and the commit deployed to the servers.
Assignee | ||
Comment 22•8 years ago
|
||
Some of the devices report insufficient permissions. Need to look into that when time permits.
Assignee | ||
Comment 23•8 years ago
|
||
deployed 2016-03-15 ~22:46
Assignee | ||
Comment 24•8 years ago
|
||
doh, deployed 2016-03-16 ~22:46
Updated•2 years ago
|
Product: Testing → Testing Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•