Closed Bug 1592063 Opened 2 years ago Closed 2 years ago

Browsertime failing on fenix with 'browser failed to start' error

Categories

(Testing :: Raptor, defect, P1)

Version 3
defect

Tracking

(firefox73 fixed)

RESOLVED FIXED
mozilla73
Tracking Status
firefox73 --- fixed

People

(Reporter: rwood, Assigned: rwood)

References

(Blocks 2 open bugs)

Details

Attachments

(4 files)

Raptor-browsertime cold and warm page-load jobs are failing on Fenix with `[browsertime] Browser failed to start' errors.

For examples, see this try push [0].

[0] https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=3f342522698abbad2301448e94c43c75e362bdc7&searchStr=fenix

[task 2019-10-28T14:11:04.003Z] 14:10:03 INFO - [2019-10-28 14:10:03] INFO: [browsertime] Running tests using Firefox - 15 iteration(s)
[task 2019-10-28T14:11:04.003Z] 14:10:04 INFO - 1572271803995 webdriver::command WARN You are using deprecated legacy session negotiation patterns (desiredCapabilities/requiredCapabilities), see https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities#Legacy
[task 2019-10-28T14:11:04.003Z] 14:10:04 INFO - 1572271803998 mozdevice WARN adb server response contained hexstring length 103 and message length was 103 and message was "ZY32259STL device usb:1-4.1.1 product:cedric model:Moto_G__5_ device:cedric transport_id:1\n"
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - [2019-10-28 14:11:03] INFO: [browsertime] Browser failed to start, trying one more time: Failed to start browser in 60 seconds.
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - [2019-10-28 14:11:03] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - [2019-10-28 14:11:03] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - [2019-10-28 14:11:03] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - at SeleniumRunner.start (/builds/task_1572271699/fetches/browsertime/node_modules/browsertime/lib/core/seleniumRunner.js:82:13)
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - at <anonymous>
[task 2019-10-28T14:11:04.003Z] 14:11:03 INFO - [2019-10-28 14:11:03] ERROR: [browsertime] No data to collect

Looking at the adb logcat [0], the fenix app gets to 'state changed to LAUNCHED' often but never gets to 'state changed to RUNNING'. It's almost like it's being started over and over but never fully starts.

[0] https://taskcluster-artifacts.net/VZdoQhMKRcatMEXLp1rdow/0/public/test_info/logcat-ZY3223FBC4.log

Tried changing some browsertime timeout parameters to ensure browsertime is giving fenix enough time to start:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=80cf76b7c1efc147211ef9a21f44f5aedae590e1

(In reply to Robert Wood [:rwood] from comment #2)

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=61b318613ea5896eb882186b9513e265a27720c4

Nope that didn't make a difference.

:gbrown, do you see any clues in the logcat in Comment 1 above as to what is happening here? Any ideas why Fenix couldn't be started? Maybe a geckodriver issue? Works with refbrow though.

Flags: needinfo?(gbrown)

I'm not very familiar with Fenix logs, but it looks to me like Fenix started okay.

There are indications of adb trouble at that time, especially with respect to port 2829:

adbd    : failed to connect to socket 'tcp:2829': Connection refused

which I notice is used by geckodriver:

https://searchfox.org/mozilla-central/rev/59de675101da711520c0bb6e34a1ea2372e7ddbb/testing/geckodriver/src/android.rs#13

But actually, those adbd messages start long before the tests start failing...maybe they are normal??

Flags: needinfo?(gbrown)

Thanks Geoff. I'm not sure re: connection refused messages.

Try push to see where Fenix browsertime is at on production:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=e44335d64254f28907abafa5e930c51cba5b6fa0

I will try it out locally on my GP2 and see if I can get anywhere.

Try run in Comment 5 failed for a different infra issue (a 404). Pushed to try again with latest central:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=aca147e9eff7fe1614c9a726b09a040c1632707f

On my GP2 I can't seem to run Fenix preview anymore (it crashes on startup) I'm trying to figure out why my device isn't working now.

FYI to run browsertime on Fenix locally you would use this command line:

./mach raptor --test raptor-tp6m-amazon-fenix-cold --app fenix --binary org.mozilla.fenix.performancetest --browser-cycles 5 --browsertime

On my GP2 device, using the latest project.mobile.fenix.v2.performance-test.atest build (armeabi-v8a) build, Fenix crashes on startup. So when running via raptor (./mach raptor-test --test raptor-tp6m-amazon-fenix-cold --app fenix --binary org.mozilla.fenix.performancetest --browser-cycles 5 --browsertime) I reproduce the ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries error.

Note: The NON-performance test (regular geckoNightly build) starts up fine on my GP2. Raptor needs to use the 'performancetest' build variant.

(In reply to Robert Wood [:rwood] from comment #6)

Try run in Comment 5 failed for a different infra issue (a 404). Pushed to try again with latest central:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=aca147e9eff7fe1614c9a726b09a040c1632707f

This issue is bug 1601006; posting here for posterity. Should this bug also depend on that bug?

This issue is still happening in production on Fenix with the performancetest apk:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=f1f60fad7013f17e600a5467cea70e2d158ca842

(In reply to Stephen Donner [:stephend] from comment #10)

(In reply to Robert Wood [:rwood] from comment #6)

Try run in Comment 5 failed for a different infra issue (a 404). Pushed to try again with latest central:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=aca147e9eff7fe1614c9a726b09a040c1632707f

This issue is bug 1601006; posting here for posterity. Should this bug also depend on that bug?

Yes I filed Bug 1601006 and I just landed a fix for that a few min ago (yep I should have marked this bug as dependent on that, thanks).

Assignee: nobody → rwood
Status: NEW → ASSIGNED
Depends on: 1601006
Priority: P2 → P1

I fixed my GP2 and I can reproduce this locally now with the Fenix performancetest build.

./mach raptor --test raptor-tp6m-amazon-fenix-cold --app fenix --binary org.mozilla.fenix.performancetest --browser-cycles 5 --browsertime --cold
...
21:19:30 INFO - 1575580770980 geckodriver::marionette INFO Forwarding local port tcp:62621 to remote_port tcp:2829
21:19:31 INFO - 1575580770994 geckodriver::marionette INFO handshake, timeout is None
21:19:31 INFO - 1575580770995 geckodriver::marionette INFO Not connected, sleeping 120ms!
21:19:31 INFO - 1575580771121 geckodriver::marionette INFO handshake, timeout is None
21:19:31 INFO - 1575580771122 geckodriver::marionette INFO Not connected, sleeping 240ms!
21:19:31 INFO - 1575580771364 geckodriver::marionette INFO handshake, timeout is None
...
21:20:27 INFO - 1575580827211 geckodriver::marionette INFO Not connected, sleeping 960ms!
21:20:28 INFO - 1575580828175 geckodriver::marionette INFO handshake, timeout is None
21:20:28 INFO - 1575580828183 geckodriver::marionette INFO Not connected, sleeping 960ms!
21:20:28 INFO - [2019-12-05 16:20:28] INFO: [browsertime] Browser failed to start, trying one more time: Failed to start browser in 60 seconds.
21:20:28 INFO - [2019-12-05 16:20:28] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
21:20:28 INFO - [2019-12-05 16:20:28] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
21:20:28 INFO - [2019-12-05 16:20:28] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries

:ato, does geckodriver support Fenix?
:mcomella, are there any settings that the Fenix performancetest build might need to enable or something, in order for it to connect to geckodriver?

Thanks!

Flags: needinfo?(michael.l.comella)
Flags: needinfo?(ato)

Maybe it's an issue on my environment with whatever geckodriver is being used? There is a --browsertime-geckodriver command line argument that I am not using, so I'm not sure what geckodriver is used by default...

Even if I manually download the latest geckodriver (geckodriver-v0.26.0-macos.tar.gz) and point do it using the --browsertime-geckodriver command line arg, it still is giving the same error - 21:35:05 INFO - [2019-12-05 16:35:05] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries.

(In reply to Robert Wood [:rwood] from comment #14)

:ato, does geckodriver support Fenix?

To an extent. There is certain functionality we don’t support on GeckoView, see bug 1559120 for more details on those.

Since 0.26.0, geckodriver does support Android with the caveats outlined in the changelog: https://github.com/mozilla/geckodriver/releases/tag/v0.26.0

Flags: needinfo?(ato)

Thanks :ato!

Just discovered that we are using a 'custom' geckodriver for browsertime android:

https://bugzilla.mozilla.org/show_bug.cgi?id=1585484#c0

Note to self: I need to point my local raptor-browsertime to this geckoview version, also check in production that we are using that version too with Fenix. We must be already with GVE because it doesn't have this issue.

See Also: → 1585484

We should definitely move you towards using the in-tree toolchain builds and land any outstanding Fenix/Android specific patches (I don’t know if there are any?) upstream in testing/geckodriver/.

Latest btime fenix in CI:

(In reply to Andreas Tolfsen 「:ato」 from comment #21)

We should definitely move you towards using the in-tree toolchain builds and land any outstanding Fenix/Android specific patches (I don’t know if there are any?) upstream in testing/geckodriver/.

Sounds like a good plan - although myself I don't know what was done in the custom geckodriver android that we are using (I believe :nalexander did that work?).

This is the custom geckodriver that is used with browsertime (locally, and via fetch task in production):

https://searchfox.org/mozilla-central/rev/690e903ef689a4eca335b96bd903580394864a1c/tools/browsertime/mach_commands.py#245

Even though mach browsertime --setup will install that, just to be sure it's not an install problem, I downloaded that locally. Then ran raptor browsertime android fenix locally and pointed at that geckodriver:

./mach raptor --test amazon --app fenix --binary org.mozilla.fenix.performancetest --browser-cycles 5 --browsertime --cold --browsertime-geckodriver /Users/rwood/Downloads/geckodriver

Still the same issue, Fenix starts up, but geckodriver cannot connect:

19:46:32 INFO - 1576007192571 geckodriver::marionette INFO Not connected, sleeping 960ms!
19:46:33 INFO - 1576007193533 geckodriver::marionette INFO handshake, timeout is None
19:46:33 INFO - 1576007193539 geckodriver::marionette INFO Not connected, sleeping 960ms!
19:46:34 INFO - 1576007194499 geckodriver::marionette INFO handshake, timeout is None
19:46:34 INFO - 1576007194506 geckodriver::marionette INFO Not connected, sleeping 960ms!
19:46:35 INFO - [2019-12-10 14:46:35] INFO: [browsertime] Browser failed to start, trying one more time: Failed to start browser in 60 seconds.
19:46:35 INFO - [2019-12-10 14:46:35] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
19:46:35 INFO - [2019-12-10 14:46:35] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
19:46:35 INFO - [2019-12-10 14:46:35] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries

Attached file fenix_btime_logcat.txt

Attached adb logcat from when running browsertime fenix on my local GP2

Instead of using the custom geckodriver (comment 24) I downloaded the latest official geckodriver (v0.26.0) which has Firefox android support, and ran raptor-browsertime pointing at it:

https://github.com/mozilla/geckodriver/releases/tag/v0.26.0

./mach raptor --test amazon --app fenix --binary org.mozilla.fenix.performancetest --browser-cycles 5 --browsertime --cold --browsertime-geckodriver /Users/rwood/Downloads/geckodriver

Fenix starts on my GP2 but browsertime won't connect, different message in console (no marionette handshake messages):

20:11:12 INFO - [2019-12-10 15:11:12] INFO: [browsertime] Running tests using Firefox - 5 iteration(s)
20:11:12 INFO - 1576008672487 webdriver::command WARN You are using deprecated legacy session negotiation patterns (desiredCapabilities/requiredCapabilities), see https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities#Legacy
20:11:12 INFO - 1576008672491 mozdevice WARN adb server response contained hexstring length 105 and message length was 105 and message was "FA79F1A05596 device usb:340787200X product:walleye model:Pixel_2 device:walleye transport_id:1\n"
20:12:12 INFO - [2019-12-10 15:12:12] INFO: [browsertime] Browser failed to start, trying one more time: Failed to start browser in 60 seconds.
20:12:12 INFO - [2019-12-10 15:12:12] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
20:12:12 INFO - [2019-12-10 15:12:12] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.
20:12:12 INFO - [2019-12-10 15:12:12] ERROR: [browsertime] BrowserError: Could not start the browser with 3 tries

I am running OSX Catilina, but I did give perms for apple to open the downloaded 'geckodriver' exe.

adb logcat when trying btime fenix with the official release geckodriver v0.26.0

:whimboo, hey, if you have a chance could you please take a look at the adb logcats/console errors above? Any ideas why we can't seem to get geckodriver to work with Fenix? Also note that we use the Fenix 'performancetestbuild' build type, so maybe there is something not enabled there or something on the app side. Any ideas would be appreciated, thanks!

Flags: needinfo?(hskupin)

Geckodriver correctly creates the debug config file:

12-10 15:11:14.231 12513 12513 I GeckoRuntime: Adding debug configuration from: /data/local/tmp/org.mozilla.fenix.performancetest-geckoview-config.yaml
12-10 15:11:14.231 12513 12513 D GeckoDebugConfig: Adding environment variables from debug config: {MOZ_CRASHREPORTER=1, MOZ_CRASHREPORTER_NO_REPORT=1, MOZ_CRASHREPORTER_SHUTDOWN=1}
12-10 15:11:14.231 12513 12513 D GeckoDebugConfig: Adding arguments from debug config: [-marionette, -profile, /mnt/sdcard/org.mozilla.fenix.performancetest-geckodriver-profile]
12-10 15:11:14.231 12513 12513 D GeckoThread: State changed to LAUNCHED
12-10 15:11:14.232 12513 12544 I GeckoThread: preparing to run Gecko
12-10 15:11:14.232 12513 12544 D GeckoThread: env var: MOZ_CRASHREPORTER=1
12-10 15:11:14.232 12513 12544 D GeckoThread: env var: MOZ_CRASHREPORTER_NO_REPORT=1
12-10 15:11:14.232 12513 12544 D GeckoThread: env var: MOZ_CRASHREPORTER_SHUTDOWN=1

So it's hard to say right now without proper Marionette trace logs. Can you please set the preference marionette.log.level=Trace and try again?

Flags: needinfo?(hskupin)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #30)

So it's hard to say right now without proper Marionette trace logs. Can you please set the preference marionette.log.level=Trace and try again?

Thanks, set the pref but I don't see any 'geckodriver' or 'marionette' debug output in the console (or adb logcat). Must be because of browsertime or something; will try again tomorrow.

There is also the --log trace option for geckodriver, if that is something you can set. Sorry, but I don't know anything about browsertime and how it let you customize the webdriver binary.

Thanks Whinboo, I figured it out, I had to set verbose to 'level 3' for browsertime, on the browsertime command line (changed the line below to '-vvv'):

https://searchfox.org/mozilla-central/rev/62a130ba0ac80f75175e4b65536290b52391f116/testing/raptor/raptor/raptor.py#717

And with that (and the 'marionette.log.level=Trace' pref set) there are now a bunch of geckodriver::* debug messages output in the console. Will attach.

Console from when running raptor-browsertime on Fenix with the official geckodriver v0.26

Hopefully that helps (pls see attached log above thanks :whimboo)

Flags: needinfo?(hskupin)
Attachment #9115263 - Attachment mime type: application/octet-stream → text/plain
Flags: needinfo?(hskupin)

Please get rid of this failure first:

19:43:42 INFO - [2019-12-11 14:43:42] INFO: [browsertime] Browser failed to start, trying one more time: The previously configured GeckoDriver service is still running. You must shut it down before you may adjust its configuration.

Not sure which other service is still running here.

I think it's a problem with connecting to the port on Fenix (as :gbrown mentioned in comment 4):

12-12 11:40:07.246 800 800 E adbd : failed to connect to socket 'tcp:2829': Connection refused
...
12-12 11:40:07.334 11848 11880 I Gecko : 1576168807334 Marionette DEBUG Init aborted (running=false, enabled=true, finalUIStartup=false)

Where as in adb logcat with GVE I don't see those errors, I see it working:

12-12 10:33:58.941 9476 9491 I Gecko : 1576146838941 Marionette INFO Listening on port 2829

I don't see anything in the log that is already running on 'tcp:2829' before the attempt to connect to marionette.

I see this with the official geckodriver v0.26 release as well as our custom browsertime geckodriver. I think we should just use the official v0.26 since it now includes android support.

Then please upload a log which contains the Marionette trace logs. The last one you attached didn't. Also note that there is still bug 1559120 for Marionette and GV, but not sure if that has relevance here.

Blocks: 1585484

:mcomella, are there any settings that the Fenix performancetest build might need to enable or something, in order for it to connect to geckodriver?

I believe this may have been addressed in Slack and afaik, mleclair is working with rwood to make this happen. Please re-add NI if I'm still needed.

Flags: needinfo?(michael.l.comella)

Thanks to :mleclair and :acreskey I can get browsertime to start and connect to Fenix with:

args_list.extend(['--firefox.android.intentArgument=-a'])
args_list.extend(['--firefox.android.intentArgument', self.config['intent']])
args_list.extend(['--firefox.android.intentArgument=-d'])
args_list.extend(['--firefox.android.intentArgument', str('about:blank')])

Now in Fenix it tries to load the test page but just shows "no internet connection" which means the mozproxy is not working for some reason, so that's the next issue to fix...

Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/578f99cbbe1a
Browsertime failing on fenix with 'browser failed to start' error r=tarek
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73
You need to log in before you can comment on or make changes to this bug.