Android child process fd limits are too low | Perma [Tier 2] dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | Test timed out. -
Categories
(Toolkit :: Startup and Profile System, defect, P1)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox99 | --- | unaffected |
firefox100 | --- | unaffected |
firefox101 | - | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: jld)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(1 file)
Filed by: smolnar [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=374941149&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Iksw5t3yQ8KYd2fQPBTtwA/runs/0/artifacts/public/logs/live_backing.log
INFO - TEST-PASS | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | successfullyParsed is true
[task 2022-04-19T08:52:06.733Z] 08:42:30 INFO - Buffered messages finished
[task 2022-04-19T08:52:06.733Z] 08:42:30 WARNING - TEST-UNEXPECTED-FAIL | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | Test timed out. -
[task 2022-04-19T08:52:06.733Z] 08:42:42 WARNING - TEST-UNEXPECTED-FAIL | SimpleTest | this test already called finish!
[task 2022-04-19T08:52:06.733Z] 08:42:42 WARNING - TEST-UNEXPECTED-ERROR | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | called finish() multiple times
[task 2022-04-19T08:52:06.733Z] 08:42:42 INFO - TEST-INFO took 327446ms
[task 2022-04-19T08:52:06.733Z] 08:43:04 WARNING - TEST-UNEXPECTED-FAIL | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | Test timed out. -
[task 2022-04-19T08:52:06.733Z] 08:43:04 WARNING - TEST-UNEXPECTED-FAIL | SimpleTest | this test already called finish!
[task 2022-04-19T08:52:06.733Z] 08:43:04 WARNING - TEST-UNEXPECTED-ERROR | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | called finish() multiple times
[task 2022-04-19T08:52:06.733Z] 08:43:04 INFO - TEST-INFO
[task 2022-04-19T08:52:06.733Z] 08:43:39 WARNING - TEST-UNEXPECTED-FAIL | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | Test timed out. -
[task 2022-04-19T08:52:06.733Z] 08:43:39 WARNING - TEST-UNEXPECTED-FAIL | SimpleTest | this test already called finish!
[task 2022-04-19T08:52:06.733Z] 08:43:39 WARNING - TEST-UNEXPECTED-ERROR | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | called finish() multiple times
[task 2022-04-19T08:52:06.733Z] 08:43:39 INFO - TEST-INFO
[task 2022-04-19T08:52:06.733Z] 08:44:02 WARNING - TEST-UNEXPECTED-FAIL | dom/canvas/test/webgl-conf/generated/test_2_conformance__extensions__webgl-compressed-texture-astc.html | Test timed out. -
[task 2022-04-19T08:52:06.733Z] 08:44:02 WARNING - TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up. -
[task 2022-04-19T08:52:06.733Z] 08:44:02 WARNING - TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 235 remaining tests. -
[task 2022-04-19T08:52:06.733Z] 08:44:02 WARNING - TEST-UNEXPECTED-FAIL | SimpleTest | this test already called finish!
[task 2022-04-19T08:52:06.733Z] 08:44:02 WARNING - TEST-UNEXPECTED-ERROR | (SimpleTest/TestRunner.js) | called finish() multiple times
[task 2022-04-19T08:52:06.733Z] 08:44:02 INFO - TEST-INFO
[task 2022-04-19T08:52:06.733Z] 08:51:31 INFO - wait for org.mozilla.geckoview.test_runner complete; top activity=org.mozilla.geckoview.test_runner
[task 2022-04-19T08:52:06.733Z] 08:51:31 INFO - org.mozilla.geckoview.test_runner unexpectedly found running. Killing...
[task 2022-04-19T08:52:06.733Z] 08:51:43 WARNING - TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) (finished) | application timed out after 370 seconds with no output
[task 2022-04-19T08:52:06.733Z] 08:51:43 INFO - runtestsremote.py | Application ran for: 0:14:49.535887
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - Stopping web server
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - Server shut down.
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - Web server killed.
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - Stopping web socket server
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - Stopping ssltunnel
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2022-04-19T08:52:06.733Z] 08:51:44 INFO - runtests.py | Running tests: end.
[task 2022-04-19T08:52:06.733Z] 08:51:48 INFO - Buffered messages finished
[task 2022-04-19T08:52:06.733Z] 08:51:53 INFO - 0 INFO TEST-START | Shutdown
Comment 1•2 years ago
Comment 2•2 years ago
Nika, the failures seem to have started from here.
Can you please take a look?
Comment 3•2 years ago
Seems like we're hitting FD limits when trying to create new shared memory regions, which is one of the situations I was worried about with that patch. The logcat logs seem to be emitting errors like:
04-19 08:37:09.155 7372 7396 E Gecko : ShmemAndroid::Create():open: Too many open files (24)
04-19 08:37:09.157 7372 7396 E Gecko : ShmemAndroid::Create():open: Too many open files (24)
It'll be a bit before I can figure out a good way to mitigate this, unfortunately. It might be best to back out bug 1757802 until we find some other approach.
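For readers unfamiliar with the error above: on Linux (and therefore Android), errno 24 is EMFILE, which open() returns once the process has used up its soft RLIMIT_NOFILE. The following is a minimal standalone sketch of that behaviour, not Gecko code; the function name ErrnoWhenFdsRunOut and the use of /dev/null are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cerrno>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>
#include <vector>

// Hypothetical demo helper (not part of Gecko). Temporarily caps the soft
// fd limit at aCap, opens /dev/null until open() fails, and returns the
// errno of the failing call (0 if nothing failed or the limit could not
// be changed). The original soft limit is restored before returning.
static int ErrnoWhenFdsRunOut(rlim_t aCap) {
  struct rlimit saved;
  if (getrlimit(RLIMIT_NOFILE, &saved) != 0) {
    return 0;
  }

  struct rlimit capped = saved;
  capped.rlim_cur = std::min(aCap, saved.rlim_cur);
  if (setrlimit(RLIMIT_NOFILE, &capped) != 0) {
    return 0;
  }

  std::vector<int> fds;
  int err = 0;
  // With a soft limit of N, only fd numbers 0..N-1 are usable, so at most
  // N opens can succeed; N + 1 attempts are enough to hit the limit.
  for (rlim_t i = 0; i <= capped.rlim_cur; ++i) {
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0) {
      err = errno;
      break;
    }
    fds.push_back(fd);
  }

  for (int fd : fds) {
    close(fd);
  }
  setrlimit(RLIMIT_NOFILE, &saved);  // put the original soft limit back
  return err;
}
```

Calling ErrnoWhenFdsRunOut(16) in a small test program should return EMFILE, the same condition the logcat lines report from ShmemAndroid::Create().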
Comment 4•2 years ago
Set release status flags based on info from the regressing bug 1757802.
Comment 5•2 years ago (Assignee)
I pushed a patch to Try to see what the per-process resource limits are, and got this result (the two numbers are the current value and the hard limit):
04-20 22:18:30.252 7215 7238 E Gecko : rlimits: 1024 4096
04-20 22:18:30.252 7215 7238 E Gecko : ShmemAndroid::Create():open: Too many open files (24)
So we're currently limited to 1024 fds, but we could raise it as high as 4096. Normally, we do raise the limit to (at least) 4096 if possible, but I think we only do that in the parent process: normally the child processes are direct descendants of the parent, so they inherit the change, but that's not the case on Android. As the de facto owner of things related to RLIMIT_NOFILE, I'll see if I can come up with a quick fix.
Incidentally, 1024 is pretty small, given that Necko will use up to 1000 on its own, and then there are other subsystems like IndexedDB (and maybe the cache?) that can have significant fd usage; this is why we had to increase it on desktop Linux.
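To make the mechanism concrete, here is a rough sketch of how a process can log its RLIMIT_NOFILE values and raise the soft limit toward 4096 without exceeding the hard limit. It uses only the POSIX getrlimit/setrlimit calls; the helper name RaiseFdLimit and the logging to stderr are assumptions for illustration, and this is not the code from the patch that eventually landed.

```cpp
#include <algorithm>
#include <cstdio>
#include <sys/resource.h>

// Hypothetical helper (not the landed Gecko patch). Raises the soft
// RLIMIT_NOFILE to min(aWanted, hard limit) if it is currently lower.
// Returns the soft limit in effect afterwards, or 0 if it cannot be read.
static rlim_t RaiseFdLimit(rlim_t aWanted) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
    std::perror("getrlimit");
    return 0;
  }

  // Print the current (soft) value and the hard limit, in that order.
  std::fprintf(stderr, "rlimits: %llu %llu\n",
               static_cast<unsigned long long>(lim.rlim_cur),
               static_cast<unsigned long long>(lim.rlim_max));

  if (lim.rlim_cur >= aWanted) {
    return lim.rlim_cur;  // already high enough, nothing to do
  }

  // An unprivileged process may raise its soft limit only up to the hard
  // limit, so clamp the request.
  rlim_t target = aWanted;
  if (lim.rlim_max != RLIM_INFINITY) {
    target = std::min(aWanted, lim.rlim_max);
  }

  const rlim_t previous = lim.rlim_cur;
  lim.rlim_cur = target;
  if (setrlimit(RLIMIT_NOFILE, &lim) != 0) {
    std::perror("setrlimit");
    return previous;  // limit unchanged
  }
  return lim.rlim_cur;
}
```

With the values logged above (1024 soft, 4096 hard), RaiseFdLimit(4096) would move the soft limit to 4096. Since Android child processes don't inherit the parent's adjusted limit, a call like this would have to run in each child's own startup path; exactly where is left open here.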
Comment 6•2 years ago (Assignee)
(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #5)
> Incidentally, 1024 is pretty small, given that Necko will use up to 1000 on its own, and then there are other subsystems like IndexedDB (and maybe the cache?) that can have significant fd usage; this is why we had to increase it on desktop Linux.
But this applies only to child processes, so Necko's usage probably isn't relevant. Even so, it's intended that all processes have at least 4k fds available (if the OS config allows it, which it does in this case), so that ought to be fixed. Raising the limit in child processes also fixes the test failure (on Try).
I'm mostly convinced that there isn't an actual leak here: the new Shmem is basically a fancy wrapper for RefPtr<mozilla::ipc::SharedMemory>, so refcount logging ought to pick up any leaks, at least if they live past shutdown.
Comment 7•2 years ago (Assignee)
Comment 8•2 years ago
We still want this change, but the regressing bug has also been backed out.
Comment 10•2 years ago
Not tracking for 101 anymore since the regressing change was backed out.
Comment 11•2 years ago
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/65b54546353e
Set fd resource limits correctly for child processes on Android. r=glandium
Comment 12•2 years ago