Closed Bug 865944 Opened 7 years ago Closed 6 years ago

Use foreground activity instead of process name to determine if remote browser has terminated

Categories

(Testing :: General, defect)

x86
Android
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

Attachments

(8 files, 6 obsolete files)

5.66 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
1.98 KB, patch
jmaher
: review+
wlach
: feedback+
Details | Diff | Splinter Review
1.59 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
6.24 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
1.38 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
2.43 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
3.92 KB, patch
Details | Diff | Splinter Review
3.95 KB, patch
dminor
: review+
Details | Diff | Splinter Review
Bug 865311 discusses problems associated with using the process name for determining if a launched process -- typically the browser under test -- has completed.

This bug focuses on using the Android foreground activity instead of the process name as a possible solution.
Blocks: 865311
Attached patch very rough wip - sutagent (obsolete) — Splinter Review
Just something to discuss...
Ditto!
Comment on attachment 742132 [details] [diff] [review]
very rough wip - sutagent

Review of attachment 742132 [details] [diff] [review]:
-----------------------------------------------------------------

while I don't know the android system calls that well, this is generally a good approach!

::: build/mobile/sutagent/android/DoCommand.java
@@ +177,5 @@
>          TZGET ("tzget"),
>          TZSET ("tzset"),
>          ADB ("adb"),
>          CHMOD ("chmod"),
> +        FOREGROUNDACTIVITY ("foregroundactivity"),

I would love to go back to our 4 character limit imposed by blassey in the old days of sutagent :)
Comment on attachment 742133 [details] [diff] [review]
very rough wip - devicemanager and automation

Review of attachment 742133 [details] [diff] [review]:
-----------------------------------------------------------------

nice, this might be a simple fix.  If we agree on something here, we can get a sutagent upgrade followed by a talos update also ;)  Could we hack this for now without the sutagent to prove this helps?  i.e. use dumpsys window input method?  I agree for a final solution something integrated into sutagent would be ideal.
This approach makes sense to me.
Assignee: nobody → gbrown
Adds the "activity" command to sutagent, which returns the package name of the top / foreground activity. 

Bumped the sutagent version from 1.16 to 1.17.

Tested locally on panda and tegra. This is from a panda:

$ telnet 192.168.0.111 20701
Trying 192.168.0.111...
Connected to 192.168.0.111.
Escape character is '^]'.
$>ver
SUTAgentAndroid Version 1.17
$>activity
com.mozilla.SUTAgentAndroid
$>quit
quit
$>Connection closed by foreign host.
$ adb shell am start -a android.activity.MAIN -n org.mozilla.fennec_mozdev/.App
Starting: Intent { act=android.activity.MAIN cmp=org.mozilla.fennec_mozdev/.App }
$ telnet 192.168.0.111 20701
Trying 192.168.0.111...
Connected to 192.168.0.111.
Escape character is '^]'.
$>activity
org.mozilla.fennec_mozdev
$>quit
quit
$>Connection closed by foreign host.
$ adb shell am force-stop org.mozilla.fennec_mozdev
$ telnet 192.168.0.111 20701
Trying 192.168.0.111...
Connected to 192.168.0.111.
Escape character is '^]'.
$>activity
com.mozilla.SUTAgentAndroid
$>
Attachment #742132 - Attachment is obsolete: true
Attachment #742458 - Flags: review?(jmaher)
Comment on attachment 742458 [details] [diff] [review]
(1) add "activity" command to sutagent; bump version

Review of attachment 742458 [details] [diff] [review]:
-----------------------------------------------------------------

this looks great.  this can be landed as a DONTBUILD.  Lets put sutagent1.17.apk on a people account and file a bug for deployment.

::: build/mobile/sutagent/android/DoCommand.java
@@ +3856,5 @@
> +        List< ActivityManager.RunningTaskInfo > taskInfo = null;
> +        ComponentName componentInfo = null;
> +        if (aMgr != null)
> +            {
> +            taskInfo = aMgr.getRunningTasks(1); 

nit: trailing whitespace
Attachment #742458 - Flags: review?(jmaher) → review+
Depends on: 866228
Comment on attachment 742133 [details] [diff] [review]
very rough wip - devicemanager and automation

Drive by comment: I'm a bit concerned that this is adding something very Android-specific into the devicemanager code, which we now also use for testing FirefoxOS. Over the last few months I've put quite a bit of effort into documenting mozdevice and clearly seperating the Android functionality from that which is universal to all types of devices (see e.g. https://mozbase.readthedocs.org/en/latest/mozdevice.html#android-extensions). I'm afraid to say that this patch, as is, kind of undoes that. :(

It's a little bit of extra effort (since the harness code will likely need to be updated as well), but I'd much prefer if this method were added into droid.py: 

https://github.com/mozilla/mozbase/blob/master/mozdevice/mozdevice/droid.py

Another alternative might be to reimplement the "foregroundactivity" command as an "info" subcommand in the agent. Then we wouldn't need to add any new code to mozdevice at all (just the harness). Maybe just follow the model of bug 838844 for getting the user serial.
:wlach -- I'll just flag you for feedback on this one patch, but I am interested in your feedback / co-review for all of the patches to come, if you have time.
Attachment #742133 - Attachment is obsolete: true
Attachment #744712 - Flags: review?(jmaher)
Attachment #744712 - Flags: feedback?(wlachance)
This is mostly unrelated to activity-vs-process but affects the timing of our wait-for-launch / wait-for-completion, so wanted to include this here.
Attachment #744714 - Flags: review?(jmaher)
When running DroidADB.getTopActivity, devicemanagerADB will try to execute dumpsys as org.mozilla.fennec. In my experience, this usually fails, whereas running as the normal shell user succeeds.
Attachment #744719 - Flags: review?(jmaher)
Comment on attachment 744712 [details] [diff] [review]
(2) Makefile changes so that droid.py can be used from remote reftests and remote mochitests

Looks like the right thing. Always happy to provide feedback. :)
Attachment #744712 - Flags: feedback?(wlachance) → feedback+
Attached patch (6) Add getTopActivity to droid (obsolete) — Splinter Review
Attachment #744724 - Flags: review?(jmaher)
These tests run well on tegra and panda with these patches. The other 2 disabled tests appear to have other issues.
Attachment #744728 - Flags: review?(jmaher)
Try run with extra debugging: https://tbpl.mozilla.org/?tree=Try&rev=b5c7deee3b3f
Try run with cleaned up patches: https://tbpl.mozilla.org/?tree=Try&rev=10cc0ef4c1f3
Retries indicate there may be a (new, different?) problem in testFindInPage on panda -- I will follow-up.
Comment on attachment 744712 [details] [diff] [review]
(2) Makefile changes so that droid.py can be used from remote reftests and remote mochitests

Review of attachment 744712 [details] [diff] [review]:
-----------------------------------------------------------------

great, it doesn't look like droid.py requires additional files.
Attachment #744712 - Flags: review?(jmaher) → review+
Comment on attachment 744716 [details] [diff] [review]
(4) Use droid.getTopActivity in remoteautomation.py

Review of attachment 744716 [details] [diff] [review]:
-----------------------------------------------------------------

nice
Attachment #744716 - Flags: review?(jmaher) → review+
Comment on attachment 744719 [details] [diff] [review]
(5) Do not use adb run-as for dumpsys

Review of attachment 744719 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
@@ +540,5 @@
>              finalArgs.extend(['-s', self._deviceSerial])
>          # use run-as to execute commands as the package we're testing if
>          # possible
>          if not self._haveRootShell and self._useRunAs and \
> +                args[0] == "shell" and args[1] not in [ "run-as", "am", "dumpsys" ]:

this seems like a hack...can we document a bit more of this.  want to make it easy for future hackers to realize they can exclude commands (like dumpsys) and to ensure we understand why we do this.
Attachment #744719 - Flags: review?(jmaher) → review+
Comment on attachment 744724 [details] [diff] [review]
(6) Add getTopActivity to droid

Review of attachment 744724 [details] [diff] [review]:
-----------------------------------------------------------------

this will need to land on the github mozbase instance and then uplifted into mozilla-central.
Attachment #744724 - Flags: review?(jmaher) → review+
Comment on attachment 744728 [details] [diff] [review]
(7) Enable more robocop tests

Review of attachment 744728 [details] [diff] [review]:
-----------------------------------------------------------------

this gets us halfway towards running all tests!
Attachment #744728 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #22)
> Comment on attachment 744719 [details] [diff] [review]
> (5) Do not use adb run-as for dumpsys
> 
> Review of attachment 744719 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
> @@ +540,5 @@
> >              finalArgs.extend(['-s', self._deviceSerial])
> >          # use run-as to execute commands as the package we're testing if
> >          # possible
> >          if not self._haveRootShell and self._useRunAs and \
> > +                args[0] == "shell" and args[1] not in [ "run-as", "am", "dumpsys" ]:
> 
> this seems like a hack...can we document a bit more of this.  want to make
> it easy for future hackers to realize they can exclude commands (like
> dumpsys) and to ensure we understand why we do this.

:wlach -- any thoughts here? I find myself wondering if all the _useRunAs code might be better off in Droid.
(In reply to Geoff Brown [:gbrown] from comment #25)
> (In reply to Joel Maher (:jmaher) from comment #22)
> > Comment on attachment 744719 [details] [diff] [review]
> > (5) Do not use adb run-as for dumpsys
> > 
> > Review of attachment 744719 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
> > @@ +540,5 @@
> > >              finalArgs.extend(['-s', self._deviceSerial])
> > >          # use run-as to execute commands as the package we're testing if
> > >          # possible
> > >          if not self._haveRootShell and self._useRunAs and \
> > > +                args[0] == "shell" and args[1] not in [ "run-as", "am", "dumpsys" ]:
> > 
> > this seems like a hack...can we document a bit more of this.  want to make
> > it easy for future hackers to realize they can exclude commands (like
> > dumpsys) and to ensure we understand why we do this.
> 
> :wlach -- any thoughts here? I find myself wondering if all the _useRunAs
> code might be better off in Droid.

Just use shellCheckOutput as opposed to ._runAs, that should avoid this issue altogether. :)

I think transplanting run-as into droid would be rather involved, not sure. Personally I'm not really convinced that run-as has really proven to help our testing efforts, but that's a discussion we can have another day. :)
> Just use shellCheckOutput as opposed to ._runAs, that should avoid this
> issue altogether. :)

Sorry, I meant "_runCmd", not "_runAs".
Attachment #742458 - Attachment description: add "activity" command to sutagent; bump version → (1) add "activity" command to sutagent; bump version
(In reply to Geoff Brown [:gbrown] from comment #19)
> Retries indicate there may be a (new, different?) problem in testFindInPage
> on panda -- I will follow-up.

That's a pre-existing condition:
https://tbpl.mozilla.org/php/getParsedLog.php?id=22506130&tree=Mozilla-Inbound&full=1#error0
https://tbpl.mozilla.org/php/getParsedLog.php?id=22507286&tree=Mozilla-Inbound&full=1#error7
https://tbpl.mozilla.org/php/getParsedLog.php?id=22508644&tree=Mozilla-Inbound&full=1#error7

I won't try to address it in this bug.
(In reply to William Lachance (:wlach) from comment #26) 
> Just use shellCheckOutput as opposed to ._runAs, that should avoid this
> issue altogether. :)

That works just fine -- good idea!
Updated to use shellCheckOutput -- eliminates need for hack-ish patch (5).
Attachment #744719 - Attachment is obsolete: true
Attachment #744724 - Attachment is obsolete: true
Attachment #744944 - Flags: review?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #23)
> Comment on attachment 744724 [details] [diff] [review]
> (6) Add getTopActivity to droid
> 
> Review of attachment 744724 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> this will need to land on the github mozbase instance and then uplifted into
> mozilla-central.

Can I get a hand with this? I think patches (3) and (6) need to land on github.
Comment on attachment 744944 [details] [diff] [review]
(6) Add getTopActivity to droid

Review of attachment 744944 [details] [diff] [review]:
-----------------------------------------------------------------

good stuff!
Attachment #744944 - Flags: review?(jmaher) → review+
(In reply to Geoff Brown [:gbrown] from comment #31)
> (In reply to Joel Maher (:jmaher) from comment #23)
> > Comment on attachment 744724 [details] [diff] [review]
> > (6) Add getTopActivity to droid
> > 
> > Review of attachment 744724 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > this will need to land on the github mozbase instance and then uplifted into
> > mozilla-central.
> 
> Can I get a hand with this? I think patches (3) and (6) need to land on
> github.

Landed them on github:

https://github.com/mozilla/mozbase/commit/12baea63d19070ba5664a228b179baedcef779bf
https://github.com/mozilla/mozbase/commit/0c933d1867f58d2604d681b0ff0c77a45bb438ea
Patch (7) https://hg.mozilla.org/integration/mozilla-inbound/rev/548558263c23

(...but I kept testDoorHanger disabled -- see bug 868681.)
Those patches seem to have significantly reduced the "2400 seconds without output" failures. We should also port these changes to talos, in case there are similar problems there. Consider, for instance, bug 849478.
There were some formatting errors in the getTopActivity patch that we committed to mozbase (5 space instead of 5 space indentation). Fixed here:

https://github.com/mozilla/mozbase/commit/78c450cd0ed1243376749adc31ff347db9255681
Blocks: 839466
just a patch of what I think would need to be done.  Unfortunately when I ran with this, I continued to get DroidSUT error, getTopActivity() is not defined :(
The sequence of patches with the log information [1] break |make mochitest-remote| on all of my local devices.  Bisecting shows that running mochitests (via runtestsremote.py) succeeds for the Nightly build at

http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2013/05/2013-05-03-03-09-20-mozilla-central-android/

and fails for the build at

http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2013/05/2013-05-04-03-09-13-mozilla-central-android/

Testing a more recent branch with the four commits reverted allows the mochitests to run.

I will try to figure out what is happening locally and suggest a fix, but in the meantime, if I could get independent verification of my claims that would be great.  (I have verified that invoking via |make mochitest-remote| versus runtestsremote.py directly is not the issue.)

[1]  Candidate changes:

changeset:   130750:29a70c8e22cc
user:        Geoff Brown <gbrown@mozilla.com>
date:        Fri May 03 11:37:51 2013 -0600
summary:     Bug 865944 - Add droid.py to list of files used by remote reftests and mochitests; r=jmaher

changeset:   130751:87ef7a0e5f94
user:        Geoff Brown <gbrown@mozilla.com>
date:        Fri May 03 11:37:57 2013 -0600
summary:     Bug 865944 - Do not wait for "am instrument" to start; r=jmaher

changeset:   130752:d1651293c9d2
user:        Geoff Brown <gbrown@mozilla.com>
date:        Fri May 03 11:37:59 2013 -0600
summary:     Bug 865944 - Use top activity instead of process to check launch success; r=jmaher

changeset:   130753:d2688e330aa0
user:        Geoff Brown <gbrown@mozilla.com>
date:        Fri May 03 11:38:01 2013 -0600
summary:     Bug 865944 - Add getTopActivity to droid; r=jmaher
Are you using adb or sut?

I can check on this first thing tomorrow.
(In reply to Geoff Brown [:gbrown] from comment #42)
> Are you using adb or sut?

adb.

> I can check on this first thing tomorrow.

Now that I have a work-around, it's not urgent.  We just need to figure out the right thing for local and buildbot use.  Do /any/ of the buildbots use adb?  It is my understanding that none do.
Depends on: 876456
Will update this once I finish testing on try server.  This worked in my initial tests.  Also I have some debugging for tspaint in here.
Attachment #750521 - Attachment is obsolete: true
Comment on attachment 8361218 [details] [diff] [review]
updated patch for android_foreground (0.9)

Review of attachment 8361218 [details] [diff] [review]:
-----------------------------------------------------------------

try server is pretty green, I feel comfortable about landing this:
https://tbpl.mozilla.org/?tree=Try&rev=b757e8339944
Attachment #8361218 - Flags: review?(dminor)
Comment on attachment 8361218 [details] [diff] [review]
updated patch for android_foreground (0.9)

Review of attachment 8361218 [details] [diff] [review]:
-----------------------------------------------------------------

For the most part looks good, but you need to fix the hard coded activity name "org.mozilla.fennec".

::: talos/ffprocess_remote.py
@@ +89,3 @@
>          processes_with_names = []
>          for process_name in process_names:
> +            print self.testAgent

nit: I assume this was for debugging and can be removed.

@@ +152,4 @@
>                  total_time = 0
>                  while total_time < timeout:
>                      time.sleep(5)
> +#                    if not self.testAgent.processExist(cmd):

nit: remove commented line

@@ +153,5 @@
>                  while total_time < timeout:
>                      time.sleep(5)
> +#                    if not self.testAgent.processExist(cmd):
> +#TODO: make this a real processname, how do we get that?
> +                    if not self.testAgent.getTopActivity() == "org.mozilla.fennec":

This won't work with a local build where the activity will be e.g. org.mozilla.fennec_dminor.
Attachment #8361218 - Flags: review?(dminor) → review-
Comment on attachment 8366752 [details] [diff] [review]
updated patch for android_foreground (1.1)

Review of attachment 8366752 [details] [diff] [review]:
-----------------------------------------------------------------

::: talos/ffprocess_remote.py
@@ +89,3 @@
>          processes_with_names = []
>          for process_name in process_names:
> +            if self.testAgent.getTopActivity() == process_name:

I suggest pulling the call to getTopActivity() out of the loop -- faster, less SUT traffic, etc.
updated as per gbrown's comment!
Attachment #8361218 - Attachment is obsolete: true
Attachment #8366752 - Attachment is obsolete: true
Attachment #8366752 - Flags: review?(dminor)
Attachment #8366782 - Flags: review?(dminor)
Comment on attachment 8366752 [details] [diff] [review]
updated patch for android_foreground (1.1)

Review of attachment 8366752 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.

::: talos/ffprocess_remote.py
@@ +89,4 @@
>          processes_with_names = []
>          for process_name in process_names:
> +            if self.testAgent.getTopActivity() == process_name:
> +                processes_with_names.append((-1, process_name))

Is ProcessesWithNames still used anywhere or can it just be removed?
I couldn't find it called from inside talos using grep.

If it is still required, I think you should have a comment or something in the docstring that makes it clear than for the remote case you are looking at activities and not processes.
Attachment #8366752 - Attachment is obsolete: false
Comment on attachment 8366782 [details] [diff] [review]
updated patch for android_foreground (1.2)

Review of attachment 8366782 [details] [diff] [review]:
-----------------------------------------------------------------

See comment on earlier version of patch.
Attachment #8366782 - Flags: review?(dminor) → review+
ProcessesWithNames is a function that is called from ffprocess to checkallprocesses:
http://hg.mozilla.org/build/talos/file/tip/talos/ffprocess.py#l51

I will add a comment.
landed in talos:
https://hg.mozilla.org/build/talos/rev/78d138bbb33a

this bug is marked as leaveopen, is there a reason forthat still?
(In reply to Joel Maher (:jmaher) from comment #53)
> this bug is marked as leaveopen, is there a reason forthat still?

We are almost done, but there is one last issue I am mulling over -- best to keep it open still.
(In reply to Geoff Brown [:gbrown] from comment #54) 
> We are almost done, but there is one last issue I am mulling over -- best to
> keep it open still.

I was thinking that I would like to change the way sutagent's exec works, especially as it relates to robocop. Currently exec waits for up to 5 minutes for native processes (Robocop/am instrument...), but returns immediately for Java apps (mochitest/reftest/org.mozilla.fennec). So if waitForFinish breaks for Robocop (as it has a few times), it only causes a problem in the rare case that the test runs for more than 5 minutes. 

I tried changing the exec wait behavior for native processes and experienced some test failures, which seemed possibly related to the activity-waiting behavior of this bug, but I am not sure of the root cause. It's hard to justify investigating further given that everything seems to work fine now, so let's close this now.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
You need to log in before you can comment on or make changes to this bug.