Closed Bug 1504117 Opened Last year Closed Last year

[mozdevice] Handle sdcard mount race during ADBAndroid initialization

Categories

(Testing :: Mozbase, defect)

defect
Not set

Tracking

(firefox-esr60 unaffected, firefox63 unaffected, firefox64 unaffected, firefox65 fixed)

RESOLVED FIXED
mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 --- fixed

People

(Reporter: bc, Assigned: bc)

References

Details

(Keywords: regression)

Attachments

(3 files, 1 obsolete file)

Bug 1502190 changed ADBDevice to use /sdcard as the test directory when checking which version of the ls command is installed. This has an unintended race condition which is most apparent with emulators. When the emulator is started via 

$ ./mach android-emulator --version 'x86-7.0'
 0:00.93 Starting Android emulator running Android 7.0 x86...
accel:
0
KVM (version 12) is installed and usable.
accel
Error running mach:

    ['android-emulator', '--version', 'x86-7.0']

The error occurred in code that was called by the mach command. This is either
a bug in the called code itself or in the way that mach is calling it.

You should consider filing a bug for this issue.

If filing a bug, please include the full output of mach, including this error
message.

The details of the failure are as follows:

ADBError: ADBDevice.__init__: ls could not be found

  File "/home/bclary/mozilla/builds/inbound/mozilla/mobile/android/mach_commands.py", line 687, in emulator
    if emulator.wait_for_start():
  File "/home/bclary/mozilla/builds/inbound/mozilla/testing/mozbase/mozrunner/mozrunner/devices/android_device.py", line 562, in wait_for_start
    device = _get_device(self.substs, self.device_serial)
  File "/home/bclary/mozilla/builds/inbound/mozilla/testing/mozbase/mozrunner/mozrunner/devices/android_device.py", line 101, in _get_device
    device = ADBAndroid(adb=adb_path, verbose=verbose_logging, device=device_serial)
  File "/home/bclary/mozilla/builds/inbound/mozilla/testing/mozbase/mozdevice/mozdevice/adb_android.py", line 96, in __init__
    require_root=require_root)
  File "/home/bclary/mozilla/builds/inbound/mozilla/testing/mozbase/mozdevice/mozdevice/adb.py", line 655, in __init__
    raise ADBError("ADBDevice.__init__: ls could not be found")

While we could work around this in android_device.py, a better choice would be to fix it in mozdevice. ADBDevice in adb.py seems like a good choice but we can't guarantee that a generic device will actually support the sys.boot_completed property which is available for Android devices at least for 4.2+/17 so ADBAndroid in adb_android.py seems like the better choice.
My second attempt (the first was actually in ADBDevice) <https://hg.mozilla.org/try/rev/38ce1cd3d968e1e25955b5656254f1b2e32605cb> didn't work <https://treeherder.mozilla.org/logviewer.html#?job_id=209361054&repo=try&lineNumber=878-889> since I attempted to use self._adb_host before it was set. The whole decision way back when to overload the device argument has come back to bite us again. Funnily, I didn't see it just testing starting the emulator locally.

I don't want to jump through hoops to support b2g like devices which was the original motivation for the ADBDevice/ADBAndroid/ADBB2G hierarchy, so I think I will go back to ADBDevice and just ignore sys.boot_completed if it is not available.
Attached patch ADBDevice-init-race.patch (obsolete) — Splinter Review
Attachment #9022073 - Flags: review?(gbrown)
Attachment #9022074 - Flags: review?(gbrown)
note: must public new mozdevice 1.1.6 to pypi before landing this patch.
Attachment #9022075 - Flags: review?(gbrown)
Updated a stale commit message in the original version.
Attachment #9022073 - Attachment is obsolete: true
Attachment #9022073 - Flags: review?(gbrown)
Attachment #9022095 - Flags: review?(gbrown)
Depends on: 1504154
Attachment #9022074 - Flags: review?(gbrown) → review+
Comment on attachment 9022095 [details] [diff] [review]
ADBDevice-init-race.patch

Review of attachment 9022095 [details] [diff] [review]:
-----------------------------------------------------------------

fwiw, I have never encountered this problem!

::: testing/mozbase/mozdevice/mozdevice/adb.py
@@ +670,5 @@
> +                if 'No such file or directory' not in e.message:
> +                    boot_completed = True
> +                    self._ls += " -a"
> +            if not boot_completed:
> +                time.sleep(10)

10 seconds between attempts seems long to me. I would use 1 or 2 seconds.

@@ +672,5 @@
> +                    self._ls += " -a"
> +            if not boot_completed:
> +                time.sleep(10)
> +        if not boot_completed:
> +            raise ADBTimeoutError("ADBDevice: /sdcard not found.")

I'm very glad to see an explicit "/sdcard not found" vs "ls could not be found"!

Is ADBTimeoutError intentional? (vs ADBError maybe??)
Attachment #9022095 - Flags: review?(gbrown) → review+
Attachment #9022075 - Flags: review?(gbrown) → review+
(In reply to Geoff Brown [:gbrown] from comment #7)
> Comment on attachment 9022095 [details] [diff] [review]
> ADBDevice-init-race.patch
> 
> Review of attachment 9022095 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> fwiw, I have never encountered this problem!
> 

It's been pretty recent for me. I was kind of shocked it was broken and wondered why no one had bugged me about it.

> ::: testing/mozbase/mozdevice/mozdevice/adb.py
> @@ +670,5 @@
> > +                if 'No such file or directory' not in e.message:
> > +                    boot_completed = True
> > +                    self._ls += " -a"
> > +            if not boot_completed:
> > +                time.sleep(10)
> 
> 10 seconds between attempts seems long to me. I would use 1 or 2 seconds.
> 

Ok.

> @@ +672,5 @@
> > +                    self._ls += " -a"
> > +            if not boot_completed:
> > +                time.sleep(10)
> > +        if not boot_completed:
> > +            raise ADBTimeoutError("ADBDevice: /sdcard not found.")
> 
> I'm very glad to see an explicit "/sdcard not found" vs "ls could not be
> found"!
> 

yeah.

> Is ADBTimeoutError intentional? (vs ADBError maybe??)

Yeah, ADBTimeoutError since it waited for the full timeout period for the sdcard to become available. I would call that an unrecoverable error.
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aff117dd4d35
[mozdevice] Handle delayed sdcard mount when initializing ADBDevice, r=gbrown.
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a7adcaed611
[mozdevice] Release mozdevice 1.1.6, r=gbrown.
$ python setup.py sdist upload
running sdist
running egg_info
creating mozdevice.egg-info
writing requirements to mozdevice.egg-info/requires.txt
writing mozdevice.egg-info/PKG-INFO
writing top-level names to mozdevice.egg-info/top_level.txt
writing dependency_links to mozdevice.egg-info/dependency_links.txt
writing entry points to mozdevice.egg-info/entry_points.txt
writing manifest file 'mozdevice.egg-info/SOURCES.txt'
reading manifest file 'mozdevice.egg-info/SOURCES.txt'
writing manifest file 'mozdevice.egg-info/SOURCES.txt'
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md

running check
creating mozdevice-1.1.6
creating mozdevice-1.1.6/mozdevice
creating mozdevice-1.1.6/mozdevice.egg-info
copying files to mozdevice-1.1.6...
copying setup.py -> mozdevice-1.1.6
copying mozdevice/__init__.py -> mozdevice-1.1.6/mozdevice
copying mozdevice/adb.py -> mozdevice-1.1.6/mozdevice
copying mozdevice/adb_android.py -> mozdevice-1.1.6/mozdevice
copying mozdevice/adb_b2g.py -> mozdevice-1.1.6/mozdevice
copying mozdevice/version_codes.py -> mozdevice-1.1.6/mozdevice
copying mozdevice.egg-info/PKG-INFO -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/SOURCES.txt -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/dependency_links.txt -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/entry_points.txt -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/not-zip-safe -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/requires.txt -> mozdevice-1.1.6/mozdevice.egg-info
copying mozdevice.egg-info/top_level.txt -> mozdevice-1.1.6/mozdevice.egg-info
Writing mozdevice-1.1.6/setup.cfg
creating dist
Creating tar archive
removing 'mozdevice-1.1.6' (and everything under it)
running upload
Password: 
Submitting dist/mozdevice-1.1.6.tar.gz to https://upload.pypi.org/legacy/
Server response (200): OK
WARNING: Uploading via this command is deprecated, use twine to upload instead (https://pypi.org/p/twine/)
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3389e75974f7
[mozdevice] Bump mozdevice requirements to 1.1.6 for raptor, marionette, mozrunner, r=gbrown.
wlach: Could you bump mozregression's mozdevice requirement and do another release so we can see if this fixed the issues people are seeing in mozregression/mozdevice? Thanks!
Flags: needinfo?(wlachance)
(In reply to Bob Clary [:bc:] from comment #12)
> wlach: Could you bump mozregression's mozdevice requirement and do another
> release so we can see if this fixed the issues people are seeing in
> mozregression/mozdevice? Thanks!

Is a new version of mozrunner required to take advantage of the mozdevice changes? If not, I can release right away. If so, could you release a new version of mozrunner first?
Flags: needinfo?(wlachance)
No, I don't believe it is. The relevant code in mozrunner is ADBAndroid.launch_fennec which should use the new grant_permissions so long as mozdevice 1.1.6 is installed.
(In reply to Bob Clary [:bc:] from comment #14)
> No, I don't believe it is. The relevant code in mozrunner is
> ADBAndroid.launch_fennec which should use the new grant_permissions so long
> as mozdevice 1.1.6 is installed.

Cool! I just released a new version of mozregression with the updated dependency.
I'll do an update for mozrunner soonish as well and we can do mozregression again but it won't be today.
(In reply to Bob Clary [:bc:] from comment #16)
> I'll do an update for mozrunner soonish as well and we can do mozregression
> again but it won't be today.

Yup, no rush for that I don't think. Let's just leave it if there's no compelling reason to upgrade.
Blocks: 1483695
You need to log in before you can comment on or make changes to this bug.