Closed Bug 1754572 Opened 3 years ago Closed 3 years ago

Intermittent ipc/glue/test/browser/browser_utility_memoryReport.js | application terminated with exit code 11

Categories

(Core :: IPC, defect, P5)

Tracking

RESOLVED FIXED
99 Branch
Tracking Status
firefox99 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: gerard-majax)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(4 files)

Filed by: alissy [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=367295822&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/emp4g_ILT-G1pobDsHEUWA/runs/0/artifacts/public/logs/live_backing.log


I think this might be because I directly pass `Maybe<int32_t> utilityPid = utilityProc->ProcessPid();` to the `promise->MaybeResolve` without verifying that we have a `utilityPid` value?
Assignee: nobody → lissyx+mozillians

Locally forcing `Maybe<int32_t> utilityPid = Nothing();` gives me a similar stack, so I think this is the reason. According to the retries in https://treeherder.mozilla.org/jobs?repo=try&revision=0d55c5b72ff4a9e5be4d6939b7b21466b0e7edce it does not seem to be a frequent failure.
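
To make the suspected failure mode concrete, here is a minimal sketch of checking an optional pid before settling a promise with it. It is not the code from the patch: `std::optional` (C++17) stands in for `mozilla::Maybe`, and `Resolved`, `Rejected`, `Outcome` and `SettleWithPid` are hypothetical names invented for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two ways a promise can settle.
struct Resolved { int32_t pid; };
struct Rejected { std::string reason; };
using Outcome = std::variant<Resolved, Rejected>;

// std::optional plays the role of mozilla::Maybe in this sketch.
// The pid is only dereferenced after confirming that a value is present;
// an empty optional turns into a rejection instead of a crash.
Outcome SettleWithPid(const std::optional<int32_t>& utilityPid) {
  if (!utilityPid.has_value()) {
    return Rejected{"utility process has no pid"};
  }
  return Resolved{*utilityPid};
}
```

With this shape, forcing the empty case (the equivalent of `Nothing()`) yields a rejection that the caller can handle, rather than an unchecked access.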

(In reply to Alexandre LISSY :gerard-majax from comment #2)

try with no fix: https://treeherder.mozilla.org/jobs?repo=try&revision=0511d2555d6d5e072ac86f69a14abf675ad7189d
try with a fix?: https://treeherder.mozilla.org/jobs?repo=try&revision=5c45f30f6db2b800a6fbd25769a8a60907537b22

Yet https://treeherder.mozilla.org/jobs?repo=try&revision=e87a2ae1363152b8a966e354b912ad98bd866890 shows green with 2079 (to date) retries, while the better of those two shows at least two failures over 1486 runs.

This suggests that browser_utility_hard_kill.js is the culprit here. My guess for now is that the way we kill the process and quickly end the test leaves some dangling bits in UtilityProcessManager. When the browser_utility_memoryReport.js test then starts querying for a running process, it gets into an incorrect state where it behaves as if there is still a process alive (maybe there is one?), but by the time it queries the pid, the process is dead.

Original test:
WIP - hard_kill without the fix: https://treeherder.mozilla.org/jobs?repo=try&revision=804e59681853930b7ae9856aa4affa22bff52f9c (25 failures, 2478 runs)
Fix hard_kill test by waiting on process death: https://treeherder.mozilla.org/jobs?repo=try&revision=7950daf9937d70ebdb5665b965ca271c7ebc84b5 (no failure)

Adding new delayed kill test:
WIP - hard_kill_delayed to repro: https://treeherder.mozilla.org/jobs?repo=try&revision=10e6beae03181bdc572bd52365f84dfec25c728b (many failures, all runs)
WIP - fix hard_kill_delayed with process waiting: https://treeherder.mozilla.org/jobs?repo=try&revision=d96df0044fa92270d41a9647a4659a934b304382 (no failure)
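
The idea behind "waiting on process death" can be illustrated outside Firefox with a small POSIX sketch: after sending the kill signal, block until the OS reports that the child is gone, so nothing that runs afterwards can see a half-dead process. This is not the test code from the patches (those are browser tests); `SpawnSleepingChild`, `KillAndWait` and `IsChildGone` are hypothetical helpers, and the sketch assumes a POSIX system.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Spawn a child that only sleeps, standing in for the utility process.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t SpawnSleepingChild() {
  pid_t pid = fork();
  if (pid == 0) {
    for (;;) {
      pause();  // Child: sleep until a signal arrives.
    }
  }
  return pid;
}

// Hard-kill the child, then block until the kernel reports its death.
// Returns true only if the child was reaped and died from SIGKILL.
bool KillAndWait(pid_t pid) {
  if (pid <= 0) {
    return false;  // Never pass 0 or -1 to kill(): those target whole groups.
  }
  if (kill(pid, SIGKILL) != 0) {
    return false;
  }
  int status = 0;
  pid_t reaped;
  do {
    reaped = waitpid(pid, &status, 0);
  } while (reaped == -1 && errno == EINTR);
  return reaped == pid && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}

// After a successful wait, the pid no longer names one of our children.
bool IsChildGone(pid_t pid) {
  return waitpid(pid, nullptr, WNOHANG) == -1 && errno == ECHILD;
}
```

Returning right after `kill()` without the `waitpid()` loop would correspond to the racy situation described above, where the next test can still observe the process as present.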

Depends on D138482

Attachment #9263376 - Attachment description: WIP: Bug 1754572 - Fix hard_kill test by waiting on process death → Bug 1754572 - Fix hard_kill test by waiting on process death r?nika!
Attachment #9263377 - Attachment description: WIP: Bug 1754572 - Add a delayed kill test → Bug 1754572 - Add a delayed kill test r?nika!
Attachment #9263378 - Attachment description: WIP: Bug 1754572 - Ensure we stop utility process after running tests → Bug 1754572 - Ensure we stop utility process after running tests r?nika!
Pushed by alissy@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b4ed010bc971
Correct error handling in UtilityProcessTest r=nika
https://hg.mozilla.org/integration/autoland/rev/752d39a0201b
Fix hard_kill test by waiting on process death r=nika
https://hg.mozilla.org/integration/autoland/rev/5cea2c885fe7
Add a delayed kill test r=nika
https://hg.mozilla.org/integration/autoland/rev/364a06e45540
Ensure we stop utility process after running tests r=nika
Regressions: 1756069
No longer regressions: 1756069