Closed Bug 1627867 Opened 4 years ago Closed 4 years ago

Intermittent android raptor-perftest Critical: Connection to Raptor webextension failed!

Categories

(Testing :: Raptor, defect, P5)

Version 3
ARM
Android
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [retriggered])

Filed by: btara [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=296539709&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/L0uThtLsTbuiEF5XFJ7XdQ/runs/0/artifacts/public/logs/live_backing.log


task 2020-04-07T03:12:30.125Z] 03:10:02     INFO -  raptor-webext-android Info: starting geckoview
[task 2020-04-07T03:12:30.125Z] 03:10:03     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell am force-stop org.mozilla.geckoview_example, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:03     INFO -  adb Granting important runtime permissions to org.mozilla.geckoview_example
[task 2020-04-07T03:12:30.125Z] 03:10:04     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.WRITE_EXTERNAL_STORAGE, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:05     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.READ_EXTERNAL_STORAGE, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:06     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.ACCESS_COARSE_LOCATION, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:08     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.ACCESS_FINE_LOCATION, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:09     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.CAMERA, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:10     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pm grant org.mozilla.geckoview_example android.permission.RECORD_AUDIO, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:10:10     INFO -  adb launch_application: am start -W -n org.mozilla.geckoview_example/org.mozilla.geckoview_example.GeckoViewActivity -a android.intent.action.MAIN -d about:blank
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell am start -W -n org.mozilla.geckoview_example/org.mozilla.geckoview_example.GeckoViewActivity -a android.intent.action.MAIN -d about:blank, timeout: None, root: False, timedout: None, exitcode: 0, output: Starting: Intent { act=android.intent.action.MAIN dat=about:blank cmp=org.mozilla.geckoview_example/.GeckoViewActivity }
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  Status: ok
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  Activity: org.mozilla.geckoview_example/.GeckoViewActivity
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  ThisTime: 652
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  TotalTime: 652
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  WaitTime: 664
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  Complete
[task 2020-04-07T03:12:30.125Z] 03:10:11     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell pidof org.mozilla.geckoview_example, timeout: None, root: False, timedout: None, exitcode: 0, output: 9482
[task 2020-04-07T03:12:30.125Z] 03:12:29     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell am force-stop org.mozilla.geckoview_example, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:30.125Z] 03:12:29     INFO -  adb shell_bool: adb -s ZY3223F92P wait-for-device shell test -d /sdcard/raptor/profile/minidumps, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell sync, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_bool: adb -s ZY3223F92P wait-for-device shell test -d /sdcard/raptor/profile/minidumps, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb command_output: adb -s ZY3223F92P wait-for-device pull /sdcard/raptor/profile/minidumps /tmp/tmpcfjh4R/minidumps, timeout: None, timedout: None, exitcode: 0, output: /sdcard/raptor/profile/minidumps/: 0 files pulled, 0 skipped.
[task 2020-04-07T03:12:55.977Z] 03:12:30 CRITICAL -  raptor-perftest Critical: Connection to Raptor webextension failed!
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-webext-android Info: removing reverse socket connections
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb command_output: adb -s ZY3223F92P wait-for-device reverse --remove-all, timeout: None, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-webext-android Info: skipping check_for_crashes: application has not been launched
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-mitmproxy Info: Stopping mitmproxy playback, killing process 894
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-mitmproxy Info: Successfully killed the mitmproxy playback process
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-mitmproxy Info: Reading confidence values from: /builds/task_1586227509/workspace/build/blobber_upload_dir/mitm_netlocs_mitm4-pixel2-fennec-facebook.json
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  perftest-results-handler Info: RaptorResultsHandler.add_supporting_data received mozproxy-replay data
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-webext Info: removing webext /builds/task_1586227509/workspace/build/tests/raptor/raptor/webextension/../../webext/raptor
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-webext-android Info: removing test folder for raptor: /sdcard/raptor
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell sync, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell rm -r /sdcard/raptor, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_output: adb -s ZY3223F92P wait-for-device shell sync, timeout: None, root: False, timedout: None, exitcode: 0, output:
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  adb shell_bool: adb -s ZY3223F92P wait-for-device shell test
[task 2020-04-07T03:12:55.977Z] 03:12:30     INFO -  raptor-control-server Info: shutting down control server
[task 2020-04-07T03:12:55.977Z] 03:12:31     INFO -  raptor-webext Info: finished
[task 2020-04-07T03:12:55.977Z] 03:12:31    ERROR - Return code: 1
[task 2020-04-07T03:12:55.977Z] 03:12:31  WARNING - setting return code to 1

Good to see that those failures are bubbling-up now! Thanks Marian!

As it looks like we fail after the first page cycle in the above Raptor job. Raptor starts geckoview_example again for the 2nd cycle but the web extension never gets installed as it looks like. Maybe the application also hangs during the startup? Interestingly I do not see any failures reported so it's kinda unclear what this could be.

So what could be very helpful here is if we would enable the extension logging preference in CI jobs as filed on bug 1626522. Maybe someone could have a look at this? I don't have time for this during that week. Thanks.

Depends on: 1626522

Lots of failures for this specific problem since we split this off from bug 1609295. So it needs to be prioritized. It's all happening on Android and specifically the G5. No failure yet for the Pixel2.

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
OS: Unspecified → Android
Priority: P5 → P1
Hardware: Unspecified → ARM
Whiteboard: [retriggered]
Summary: Intermittent raptor-perftest Critical: Connection to Raptor webextension failed! → Intermittent android raptor-perftest Critical: Connection to Raptor webextension failed!

There are 99 failures associated to this bug in the last 7 days. These are occurring on android-hw-g5-7-0-arm7-api-16, pgo builds.

:whimboo any updates besides what's discussed on 1624077 and 1622610?

Flags: needinfo?(hskupin)

Note that the conditioned profile changes don't fix this problem, so someone has to take a look at bug 1622610. It's the most reliable candidate for now for those problems.

Flags: needinfo?(hskupin)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][Comment 10]
Whiteboard: [retriggered][Comment 10][stockwell disable-recommended] → [retriggered][Comment 10]
Whiteboard: [retriggered][Comment 10][stockwell disable-recommended] → [retriggered][Comment 10]
Flags: needinfo?(hskupin)

(In reply to Cosmin Sabou [:CosminS] from comment #18)

Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=299501857&repo=autoland
Looks like even though https://bugzilla.mozilla.org/show_bug.cgi?id=1535365#c11 landed the failures here are still present.
Henrik, should we expect these failures to go away once bug 1622610 is fixed?

Yes, Bob is working on a couple of improvements for Android right now. Seeing those things landed soon will hopefully get us away from this failure.

Flags: needinfo?(hskupin)
Whiteboard: [retriggered][Comment 10][stockwell disable-recommended] → [retriggered][Comment 10]

Since the patch on bug 1632959 landed yesterday we haven't seen this failure anymore! Lets see how that continues over the next days. I will re-check on Monday.

Whiteboard: [retriggered][Comment 10][stockwell disable-recommended] → [retriggered][Comment 23]

There is only a single failure since April 28th:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=300415637&repo=autoland&lineNumber=2101

Bob, any idea what was wrong here? I assume that there was something wrong with starting geckoview_example at the end of the adb logcat.

Flags: needinfo?(bob)

No high frequent intermittent anymore. I'm going to unassign myself.

Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Priority: P1 → P5

I don't know. The logcat ends before the last am start so we don't see what is going on there. Is raptor not getting the full log or did we have a disconnection? Hard to tell.

Flags: needinfo?(bob)
Whiteboard: [retriggered][Comment 23][stockwell disable-recommended] → [retriggered]
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered]

https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
For more information, please visit auto_nag documentation.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.