Closed Bug 851288 Opened 8 years ago Closed 8 years ago

Investigate using quitter.xpi for terminating fennec in autophone s1s2 tests

Categories

(Testing :: Autophone, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bc, Unassigned)

Details

Attachments

(2 files, 1 obsolete file)

We currently terminate fennec in the s1s2 tests simply by killing it. This seems to be less than ideal to me. Jesse's quitter extension is almost tailor made in that it can exit the browser either after a specific period of time or on an event such as the load event.

killing fennec from the shell shows:

I/ActivityManager( 2810): Process org.mozilla.fennec (pid 6398) has died.
I/WindowManager( 2810): WIN DEATH: Window{4090f9b8 SurfaceView paused=false}
I/WindowManager( 2810): WIN DEATH: Window{4096f988 org.mozilla.fennec/org.mozilla.fennec.App paused=false}
D/Zygote  ( 2597): Process 6398 exited cleanly (15)

Modifying quitter to include fennec's uuid, installing it and loading an example page from sdcard shows that it will terminate the browser with the following message:

D/GeckoAppShell( 5905): Killing via System.exit()

It does appear that when killing on page load, the Throbber stop message is sent to logcat.
fwiw, I'm using an expanded version of quitter as a helper for my DOM fuzzer:
https://www.squarefree.com/extensions/domFuzzLite3.xpi
Attached patch patch (obsolete) — Splinter Review
contains a binary file (quitter.xpi), use git apply
Attachment #726165 - Flags: review?(mcote)
Attached file updated webfiles.zip
Need to update both the versions of the files in autophone but phonedash as well.
I ran this (plus the changes for caching/noncaching) for Jan-Mar a couple of times using my extra nexus s and gs2. It does appear smoother and may not show the apparent 'jan 16' regression. I need to rerun with the full set of phones to be sure.
Status: NEW → ASSIGNED
Comment on attachment 726165 [details] [diff] [review]
patch

Review of attachment 726165 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine.  I'm wondering if the first-run thing should be moved to the worker or something, since presumably this would affect other tests, but I guess it's okay for now.

::: INSTALL.md
@@ +55,5 @@
>      iterations = 20
>      resulturl = http://192.168.1.133:8100/api/s1s2_add/
> +    # initialize_url is loaded prior to the other urls in order to
> +    # initialize the profile.
> +    initialize_url = file://mnt/sdcard/s1test/initialize_profile.html

It would be better to build this URL in the test by looking at the device's test root, and then appending a path onto that.
Attachment #726165 - Flags: review?(mcote) → review+
As discussed, forget the comment about the URLs for now, since it was already a problem with our local-* tests.  I filed bug 852264 to fix that properly.
Comment on attachment 726165 [details] [diff] [review]
patch

Review of attachment 726165 [details] [diff] [review]:
-----------------------------------------------------------------

::: tests/s1s2test.py
@@ +108,5 @@
> +                except DMError:
> +                    self.logger.info('Attempt %d to kill fennec failed' % kill_attempt)
> +                    if kill_attempt == max_killattempts - 1:
> +                        raise
> +                    sleep(kill_wait-time)

Just realized that there is a typo here.  Should be sleep(kill_wait_time).
damnit. That's what I get for making a last minute change. I made that same mistake earlier too. :-(
Attached patch patch v2Splinter Review
This moves the initialize profile to just after the prepare_job since we don't need to initialize the profile for each test iteration.

It also changes the wait_for_fennec to return True if fennec exited without needing to be killed and to return False if it did need killing. This allows us to check that fennec exits normally after initializing the profile and to terminate the test if it does not. The helps prevent cases where a hung browser can cause the device to be stuck for hours testing a non-functional build.
Attachment #726165 - Attachment is obsolete: true
Attachment #726848 - Flags: review?(mcote)
Comment on attachment 726848 [details] [diff] [review]
patch v2

Review of attachment 726848 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just one question.

::: tests/s1s2test.py
@@ +203,1 @@
>              throbstop = int(throbstart) + max_time * 1000

This is from your last patch but I just realized--we keep a throbberstop of 0 if fennec *did* crash?
Attachment #726848 - Flags: review?(mcote) → review+
If we don't have a throbberstop and crashed, the throbberstop stays 0. That way it will try another attempt and if it can't get a throbberstop it won't get submitted as a result with a long throbber time which swamps the other data points around it.

With the earlier run today, I think the firefox process was kept alive by the crash reporter dialog and it ended up waiting for the full 90 seconds. Of course while testing my local fix for the crash reporter dialog, my builds haven't crashed...
https://github.com/mozilla/autophone/commit/cfce5c69f7240287967891aa65b3f142fee6b773
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.