Closed Bug 1518984 Opened 5 years ago Closed 2 years ago

Intermittent bitbar | Must have exactly one connected device. 0 found.

Categories

(Firefox for Android Graveyard :: Testing, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: intermittent-bug-filer, Assigned: aerickson)

References

Details

(Keywords: intermittent-failure)

Filed by: shindli [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=220941845&repo=mozilla-central

https://queue.taskcluster.net/v1/task/KTZbrK6CT6mHxwKbc_Dlaw/runs/0/artifacts/public/logs/live_backing.log

        "path": "/builds/worker/workspace/build/logs", 
        "expires": "2020-01-09T21:45:07.240Z", 
        "type": "directory", 
        "name": "public/logs/"
    }, 
    {
        "path": "/builds/worker/workspace/build/blobber_upload_dir", 
        "expires": "2020-01-09T21:45:07.240Z", 
        "type": "directory", 
        "name": "public/test_info/"
    }
], 
"command": [
    "./test-linux.sh", 
    "--installer-url=https://queue.taskcluster.net/v1/task/Db_Qz01uTROZWk4plALHiA/artifacts/public/build/geckoview_example.apk", 
    "--test-packages-url=https://queue.taskcluster.net/v1/task/Db_Qz01uTROZWk4plALHiA/artifacts/public/build/target.test_packages.json", 
    "--test=raptor-speedometer", 
    "--app=geckoview", 
    "--binary=org.mozilla.geckoview_example", 
    "--download-symbols=ondemand"
], 
"env": {
    "XPCOM_DEBUG_BREAK": "warn", 
    "MOZ_NODE_PATH": "/usr/local/bin/node", 
    "MOZ_HIDE_RESULTS_TABLE": "1", 
    "TASKCLUSTER_WORKER_TYPE": "proj-autophone/gecko-t-ap-perf-p2", 
    "GECKO_HEAD_REV": "8e746f670f430ceb0cb85fa8eebfe97cbe42ff01", 
    "WORKING_DIR": "/builds/worker", 
    "MOZHARNESS_SCRIPT": "raptor_script.py", 
    "NEED_XVFB": "false", 
    "MOZHARNESS_URL": "https://queue.taskcluster.net/v1/task/Db_Qz01uTROZWk4plALHiA/artifacts/public/build/mozharness.zip", 
    "MOZ_AUTOMATION": "1", 
    "NO_FAIL_ON_TEST_ERRORS": "1", 
    "GECKO_HEAD_REPOSITORY": "https://hg.mozilla.org/mozilla-central", 
    "MOZ_NO_REMOTE": "1", 
    "MOZHARNESS_CONFIG": "raptor/android_hw_config.py", 
    "MOZILLA_BUILD_URL": "https://queue.taskcluster.net/v1/task/Db_Qz01uTROZWk4plALHiA/artifacts/public/build/geckoview_example.apk", 
    "WORKSPACE": "/builds/worker/workspace"
}, 
"context": "https://hg.mozilla.org/mozilla-central/raw-file/8e746f670f430ceb0cb85fa8eebfe97cbe42ff01/taskcluster/scripts/tester/test-linux.sh"

}
setting HOME to /builds/worker
Creating /builds/worker/workspace
[]
TEST-UNEXPECTED-FAIL | bitbar | Must have exactly one connected device. 0 found.

gecko-t-bitbar-gw-unit-p2 shows 0 pending tasks https://tools.taskcluster.net/provisioners/proj-autophone/worker-types?layout=grid&orderBy=pendingTasks&lastActive=false&search=gecko-t-bitbar-gw-unit-p2

They're up to 9 jobs now, and the jobs are running normally. The failing jobs would die after 0 minutes, and the retriggers have run past that point.

It is strange to see such a large cluster of those errors.

The Bitbar web UI was slow/not working for me around that time. They could have had server issues. I'll keep an eye on things.

Flags: needinfo?(aerickson)

Andrew: pixel2-21 is the problem. I'll quarantine it.

Flags: needinfo?(bob) → needinfo?(aerickson)

I missed that pixel2-21 was the source of all of these failures. It seems to have gotten better soon after the issues were brought up yesterday. I checked the worker this morning and all of its past jobs were successful (https://tools.taskcluster.net/provisioners/proj-autophone/worker-types/gecko-t-bitbar-gw-unit-p2/workers/bitbar/pixel2-21).

I've removed it from quarantine and it's had 4 successful jobs since being added back into the pool. I'll keep watching it.

Flags: needinfo?(aerickson)
Priority: -- → P3

pixel2-21 has been fine for the last few days.

I've written a tool to scan all of our workers (it works for any provisioner actually) and calculate their 'success ratio' (successful jobs / completed jobs). It should allow us to actively find these problem hosts. I'm running it locally for now, but if the data looks good I can start logging/graphing it.
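For illustration only, the 'success ratio' idea can be sketched as below. This is not the actual tool: the input format (pairs of worker name and job result), the result labels, and the 0.5 threshold are all assumptions made for the example. A real version would pull job history from the Taskcluster queue instead.

```python
from collections import Counter

# Hypothetical result labels; only these count as "completed" jobs.
FINISHED = {"success", "failed"}


def success_ratios(jobs):
    """Return {worker: successful jobs / completed jobs}.

    `jobs` is an iterable of (worker, result) pairs. Jobs that have not
    finished (any result outside FINISHED) are ignored, so workers with
    no completed jobs do not appear in the output.
    """
    succeeded = Counter()
    completed = Counter()
    for worker, result in jobs:
        if result not in FINISHED:
            continue
        completed[worker] += 1
        if result == "success":
            succeeded[worker] += 1
    return {worker: succeeded[worker] / completed[worker] for worker in completed}


def problem_workers(ratios, threshold=0.5):
    """List workers whose success ratio falls below `threshold` (assumed cutoff)."""
    return sorted(worker for worker, ratio in ratios.items() if ratio < threshold)
```

Feeding it a job history where one worker mostly fails would flag that worker as a quarantine candidate, which is the kind of signal that would have pointed at pixel2-21 earlier.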

Whiteboard: [stockwell needswork:owner]

No failures since 08-13.

Whiteboard: [stockwell needswork:owner]

I don't think they've gone away. We pushed an image that changed the failure signature, resulting in https://bugzilla.mozilla.org/show_bug.cgi?id=1576031.

We have a new image in testing that should move them back to this bug/message.

See Also: → 1635749
See Also: → 1635752
We have completed the launch of our new Firefox on Android. Development of the new versions uses GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix), an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report, please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard

This has been permafailing since it first started running on the 20th (treeherder link). Andrew, could you please take a look at this bitbar issue? Thank you.

[task 2022-04-21T17:23:30.053Z] Dockerfile version 20220331T141348
[task 2022-04-21T17:23:30.053Z] Current working directory: /builds/task_165056180875661/fetches/condprofile
[task 2022-04-21T17:23:30.053Z] Bitbar test run: https://mozilla.testdroid.com/#testing/device-session/2200320/3219737/82555894
[task 2022-04-21T17:23:30.053Z] 
[task 2022-04-21T17:23:30.053Z] df -h
[task 2022-04-21T17:23:30.053Z] Filesystem      Size  Used Avail Use% Mounted on
[task 2022-04-21T17:23:30.053Z] overlay         458G   41G  394G  10% /
[task 2022-04-21T17:23:30.053Z] tmpfs            64M     0   64M   0% /dev
[task 2022-04-21T17:23:30.053Z] tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
[task 2022-04-21T17:23:30.053Z] shm              64M     0   64M   0% /dev/shm
[task 2022-04-21T17:23:30.053Z] /dev/sda1       458G   41G  394G  10% /test
[task 2022-04-21T17:23:30.053Z] tmpfs           7.8G     0  7.8G   0% /proc/acpi
[task 2022-04-21T17:23:30.053Z] tmpfs           7.8G     0  7.8G   0% /proc/scsi
[task 2022-04-21T17:23:30.053Z] tmpfs           7.8G     0  7.8G   0% /sys/firmware
[task 2022-04-21T17:23:30.053Z] 
[task 2022-04-21T17:23:30.053Z] 
[task 2022-04-21T17:23:30.053Z] 
[task 2022-04-21T17:23:30.053Z] ADBHost: {'_logger': <mozlog.structuredlog.StructuredLogger object at 0x7ff30b501cf8>, '_verbose': True, '_use_root': True, '_adb_path': 'adb', '_adb_host': None, '_adb_port': None, '_timeout': 300, '_polling_interval': 0.1, '_adb_version': ''}
[task 2022-04-21T17:23:30.157Z] command_output: adb devices -l, timeout: None, timedout: None, exitcode: 0, output: * daemon not running; starting now at tcp:5037
[task 2022-04-21T17:23:30.157Z] * daemon started successfully
[task 2022-04-21T17:23:30.157Z] List of devices attached
[task 2022-04-21T17:23:30.158Z] []
[task 2022-04-21T17:23:30.158Z] TEST-UNEXPECTED-FAIL | bitbar | Must have exactly one connected device. 0 found.
[taskcluster 2022-04-21T17:23:30.182Z]    Exit Code: 4
[taskcluster 2022-04-21T17:23:30.182Z]    User Time: 399.549ms
[taskcluster 2022-04-21T17:23:30.182Z]  Kernel Time: 129.425ms
[taskcluster 2022-04-21T17:23:30.182Z]    Wall Time: 662.329779ms
[taskcluster 2022-04-21T17:23:30.182Z]       Result: FAILED
[taskcluster 2022-04-21T17:23:30.182Z] === Task Finished ===
[taskcluster 2022-04-21T17:23:30.182Z] Task Duration: 664.093941ms
[taskcluster:error] Uploading error artifact public/condprof from file archive with message "Could not read directory '/builds/task_165056180875661/archive'", reason "file-missing-on-worker" and expiry 2023-04-21T15:27:47.133Z
[taskcluster:error] TASK FAILURE during artifact upload: file-missing-on-worker: Could not read directory '/builds/task_165056180875661/archive'
[taskcluster 2022-04-21T17:23:30.260Z] Uploading redirect artifact public/logs/live.log to URL https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/cDaR15ZtTKmbOzMH5xuu4g/runs/11/artifacts/public%2Flogs%2Flive_backing.log with mime type "text/plain; charset=utf-8" and expiry 2023-04-21T15:27:47.133Z
[taskcluster:error] Task appears to have failed intermittently - exit code 4 found in task payload.onExitStatus list
[taskcluster:error] file-missing-on-worker: Could not read directory '/builds/task_165056180875661/archive'
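
For readers unfamiliar with the check that produces this failure, here is a minimal sketch of the logic, assuming the harness simply counts the entries in the `adb devices -l` listing. It is not the real Bitbar entrypoint script; the function names and the `RuntimeError` are hypothetical.

```python
def connected_devices(adb_output):
    """Return the serials listed as 'device' in `adb devices -l` output.

    Skips blank lines, adb daemon status lines (which start with '*'),
    and the 'List of devices attached' header. Entries in other states
    such as 'offline' or 'unauthorized' are not counted as connected.
    """
    serials = []
    for line in adb_output.splitlines():
        line = line.strip()
        if not line or line.startswith("*") or line.startswith("List of devices"):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "device":
            serials.append(fields[0])
    return serials


def require_single_device(adb_output):
    """Return the only connected serial, or raise if there is not exactly one."""
    serials = connected_devices(adb_output)
    if len(serials) != 1:
        raise RuntimeError(
            "Must have exactly one connected device. %d found." % len(serials)
        )
    return serials[0]
```

With the output in the log above, the listing after "List of devices attached" is empty, so a check like this sees zero devices and fails immediately, which matches the sub-second task duration.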
Flags: needinfo?(aerickson)
Assignee: nobody → aerickson
Flags: needinfo?(aerickson)

Yes, I'm working with Bitbar on our various issues.

See Also: → 1765890

Bitbar issues have subsided after restarting the docker hosts (see https://bugzilla.mozilla.org/show_bug.cgi?id=1765890).

Status: REOPENED → RESOLVED
Closed: 3 years ago → 2 years ago
Resolution: --- → FIXED