Intermittent infra TSAN <test-name> | application terminated with exit code -6 | ERROR: ThreadSanitizer failed to allocate 0xf69000490000 (271098340507648) bytes at address 80400003000 (errno: 12)
Categories
(Core :: Sanitizers, defect, P5)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: RyanVM)
References
()
Details
(Keywords: intermittent-failure, Whiteboard: [retriggered][stockwell infra])
Attachments
(2 files)
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=378871606&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/fldRecxoTu2V9qX4SFI2kA/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/fldRecxoTu2V9qX4SFI2kA/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
[task 2022-05-21T16:34:28.127Z] 16:34:28 INFO - REFTEST TEST-START | layout/reftests/css-enabled/input/input-fieldset-1.html == layout/reftests/css-enabled/input/input-fieldset-ref.html
[task 2022-05-21T16:34:28.131Z] 16:34:28 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/css-enabled/input/input-fieldset-1.html | 0 / 9 (0%)
[task 2022-05-21T16:34:29.320Z] 16:34:29 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2022-05-21T16:34:29.532Z] 16:34:29 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/css-enabled/input/input-fieldset-ref.html | 0 / 9 (0%)
[task 2022-05-21T16:34:29.810Z] 16:34:29 INFO - ==3900==ERROR: ThreadSanitizer failed to allocate 0xf69000490000 (271098340507648) bytes at address 80400003000 (errno: 12)
[task 2022-05-21T16:34:29.862Z] 16:34:29 INFO - Exiting due to channel error.
[task 2022-05-21T16:34:29.862Z] 16:34:29 INFO - Exiting due to channel error.
[task 2022-05-21T16:34:29.862Z] 16:34:29 INFO - Exiting due to channel error.
[task 2022-05-21T16:34:29.862Z] 16:34:29 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.14424) Exiting due to channel error.
[task 2022-05-21T16:34:29.864Z] 16:34:29 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.34402) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=16.4498) Exiting due to channel error.
[task 2022-05-21T16:34:29.864Z] 16:34:29 INFO - Exiting due to channel error.
[task 2022-05-21T16:34:30.890Z] 16:34:30 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=7.23217) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5.38764) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=1.71864)
[task 2022-05-21T16:34:30.890Z] 16:34:30 ERROR - TEST-UNEXPECTED-FAIL | layout/reftests/css-enabled/input/input-fieldset-1.html | application terminated with exit code -6
[task 2022-05-21T16:34:30.903Z] 16:34:30 INFO - REFTEST INFO | Process mode: e10s
[task 2022-05-21T16:34:30.903Z] 16:34:30 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2022-05-21T16:34:30.904Z] 16:34:30 INFO - REFTEST INFO | Running tests in file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/text-stroke/reftest.list
[task 2022-05-21T16:34:30.927Z] 16:34:30 INFO - REFTEST INFO | Running with e10s: True
[task 2022-05-21T16:34:30.927Z] 16:34:30 INFO - REFTEST INFO | Running with fission: True
[task 2022-05-21T16:34:30.928Z] 16:34:30 INFO - REFTEST INFO | INFO | runtests.py | TSan using symbolizer at /builds/worker/workspace/build/application/firefox/llvm-symbolizer
[task 2022-05-21T16:34:30.928Z] 16:34:30 INFO - REFTEST INFO | Application command: /builds/worker/workspace/build/application/firefox/firefox -marionette -profile /tmp/tmp5h_nhbc8.mozrunner
[task 2022-05-21T16:34:34.470Z] 16:34:34 INFO - 1653150874468 Marionette INFO Marionette enabled
[task 2022-05-21T16:34:34.481Z] 16:34:34 INFO - 1653150874480 Marionette TRACE Received observer notification final-ui-startup
[task 2022-05-21T16:34:34.490Z] 16:34:34 INFO - 1653150874490 Marionette INFO Listening on port 2828
[task 2022-05-21T16:34:34.492Z] 16:34:34 INFO - 1653150874490 Marionette DEBUG Marionette is listening
[task 2022-05-21T16:34:35.236Z] 16:34:35 INFO - 1653150875235 Marionette DEBUG Accepted connection 0 from 127.0.0.1:41650
[task 2022-05-21T16:34:35.503Z] 16:34:35 INFO - 1653150875502 Marionette DEBUG Closed connection 0
[task 2022-05-21T16:34:35.513Z] 16:34:35 INFO - 1653150875512 Marionette DEBUG Accepted connection 1 from 127.0.0.1:41652
[task 2022-05-21T16:34:35.725Z] 16:34:35 INFO - 1653150875723 Marionette DEBUG Accepted connection 2 from 127.0.0.1:41654
[task 2022-05-21T16:34:35.727Z] 16:34:35 INFO - 1653150875726 Marionette DEBUG Closed connection 1
[task 2022-05-21T16:34:37.161Z] 16:34:37 INFO - 1653150877159 Marionette DEBUG 2 -> [0,1,"WebDriver:NewSession",{"strictFileInteractability":true}]
[task 2022-05-21T16:34:37.188Z] 16:34:37 INFO - 1653150877187 Marionette DEBUG Waiting for initial application window
[task 2022-05-21T16:34:42.010Z] 16:34:42 INFO - console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/tmp5h_nhbc8.mozrunner/search.json.mozlz4", (void 0)))
[task 2022-05-21T16:34:49.216Z] 16:34:49 INFO - 1653150889215 Marionette TRACE Received observer notification browser-idle-startup-tasks-finished
[task 2022-05-21T16:34:49.270Z] 16:34:49 INFO - 1653150889268 RemoteAgent TRACE [24] Document already finished loading: about:blank
[task 2022-05-21T16:34:49.347Z] 16:34:49 INFO - 1653150889345 Marionette DEBUG 2 <- [1,1,null,{"sessionId":"ede296ff-1df8-40e7-8a54-2a188346b4b9","capabilities":{"browserName":"firefox","browserVersion":"102.0 ... wnTimeout":360000,"moz:useNonSpecCompliantPointerOrigin":false,"moz:webdriverClick":true,"moz:windowless":false,"proxy":{}}}]
[task 2022-05-21T16:34:49.413Z] 16:34:49 INFO - 1653150889411 Marionette DEBUG 2 -> [0,2,"Addon:Install",{"path":"/builds/worker/workspace/build/tests/reftest/specialpowers","temporary":true}]
[task 2022-05-21T16:34:49.579Z] 16:34:49 INFO - 1653150889577 Marionette DEBUG 2 <- [1,2,null,{"value":"special-powers@mozilla.org"}]
[task 2022-05-21T16:34:49.632Z] 16:34:49 INFO - 1653150889631 Marionette DEBUG 2 -> [0,3,"Addon:Install",{"path":"/builds/worker/workspace/build/tests/reftest/reftest","temporary":true}]
[task 2022-05-21T16:34:49.876Z] 16:34:49 INFO - 1653150889874 Marionette TRACE Received observer notification domwindowopened
[task 2022-05-21T16:34:49.896Z] 16:34:49 INFO - 1653150889895 Marionette DEBUG 2 <- [1,3,null,{"value":"reftest@mozilla.org"}]
[task 2022-05-21T16:34:49.932Z] 16:34:49 INFO - 1653150889931 Marionette DEBUG 2 -> [0,4,"WebDriver:DeleteSession",{}]
[task 2022-05-21T16:34:49.969Z] 16:34:49 INFO - 1653150889966 Marionette DEBUG 2 <- [1,4,null,{"value":null}]
[task 2022-05-21T16:34:49.989Z] 16:34:49 INFO - 1653150889988 Marionette DEBUG Closed connection 2
[task 2022-05-21T16:34:52.093Z] 16:34:52 INFO - REFTEST TEST-START | layout/reftests/text-stroke/webkit-text-stroke-property-001.html == layout/reftests/text-stroke/webkit-text-stroke-property-001-ref.html
[task 2022-05-21T16:34:52.095Z] 16:34:52 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/text-stroke/webkit-text-stroke-property-001.html | 0 / 6 (0%)
[task 2022-05-21T16:34:53.386Z] 16:34:53 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2022-05-21T16:34:53.581Z] 16:34:53 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/text-stroke/webkit-text-stroke-property-001-ref.html | 0 / 6 (0%)
[task 2022-05-21T16:34:54.009Z] 16:34:54 INFO - REFTEST INFO | REFTEST fuzzy test (0, 0) <= (64, 25) <= (64, 776)
Comment 3•2 years ago
First occurrence so far in this batch of retriggers and backfills: https://treeherder.mozilla.org/jobs?repo=autoland&searchStr=Linux%2C18.04%2Cx64%2CWebRender%2Ctsan%2Copt%2CReftests%2Cwith%2Cfission%2Cenabled%2Ctest-linux1804-64-tsan-qr%2Fopt-reftest-fis-e10s%2CR11&group_state=expanded&tochange=4b4bd8a70d278e4f03be15d2e09ee3eb1c5d9a09&fromchange=6fe984120bd951ef222718ab5688d6d2e889c7df&selectedTaskRun=YWxXDoq7S8aUlOLir3WBmw.0
Comment 19•2 years ago
There have been 53 total failures in the last 7 days, recent failure log.
Affected platforms are:
- linux1804-64-tsan-qr
[task 2022-09-14T11:04:50.374Z] 11:04:50 INFO - REFTEST TEST-START | layout/reftests/cssom/computed-style-cross-window.html == layout/reftests/cssom/computed-style-cross-window-ref.html
[task 2022-09-14T11:04:50.378Z] 11:04:50 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/cssom/computed-style-cross-window.html | 0 / 2 (0%)
[task 2022-09-14T11:04:53.081Z] 11:04:53 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2022-09-14T11:04:53.274Z] 11:04:53 INFO - ==4867==ERROR: ThreadSanitizer failed to allocate 0xf70000546000 (271579377590272) bytes at address 80200001000 (errno: 12)
[task 2022-09-14T11:04:53.331Z] 11:04:53 INFO - Exiting due to channel error.
[task 2022-09-14T11:04:53.332Z] 11:04:53 INFO - Exiting due to channel error.
[task 2022-09-14T11:04:53.335Z] 11:04:53 INFO - Exiting due to channel error.
[task 2022-09-14T11:04:53.336Z] 11:04:53 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=3.63616) Exiting due to channel error.
[task 2022-09-14T11:04:53.338Z] 11:04:53 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=18.0626) Exiting due to channel error.
[task 2022-09-14T11:04:53.338Z] 11:04:53 INFO - Exiting due to channel error.
[task 2022-09-14T11:04:54.357Z] 11:04:54 ERROR - TEST-UNEXPECTED-FAIL | layout/reftests/cssom/computed-style-cross-window.html | application terminated with exit code -6
[task 2022-09-14T11:04:54.367Z] 11:04:54 INFO - REFTEST INFO | Process mode: e10s
[task 2022-09-14T11:04:54.367Z] 11:04:54 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2022-09-14T11:04:54.367Z] 11:04:54 INFO - REFTEST INFO | Result summary:
[task 2022-09-14T11:04:54.367Z] 11:04:54 INFO - REFTEST INFO | Successful: 352 (352 pass, 0 load only)
[task 2022-09-14T11:04:54.367Z] 11:04:54 INFO - REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 failed load, 0 exception)
[task 2022-09-14T11:04:54.368Z] 11:04:54 INFO - REFTEST INFO | Known problems: 15 (6 known fail, 0 known asserts, 7 random, 2 skipped, 0 slow)
[task 2022-09-14T11:04:54.368Z] 11:04:54 INFO - REFTEST SUITE-END | Shutdown
Comment 20•2 years ago
It seems tsan is running out of memory, not sure what can be done about it... maybe gc'ing more aggressively? Something else? Christian, have you hit something like this before?
Comment 21•2 years ago
In general for OOMs, I would recommend disabling the test with TSan (TSan consumes more memory than normal).
However, this specific instance looks like a large allocation of some sort (271579377590272 bytes would fail no matter what). It might be worth checking where this allocation is coming from and fixing this at the source. This should also OOM or crash regular builds. Maybe the intermittent doesn't trigger there due to timing issues.
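To put the requested sizes in perspective, here is a minimal Python sketch (not part of the original report) that converts the byte counts quoted in the logs above into TiB. The helper name `to_tib` is hypothetical, and the 128 TiB user address space figure is an assumption that holds for typical x86-64 Linux with 4-level page tables; the actual worker configuration is not stated in this bug.

```python
import errno

# Byte counts copied from the "failed to allocate" lines quoted in this bug.
REQUESTED_BYTES = {
    "0xf69000490000": 271098340507648,
    "0xf70000546000": 271579377590272,
}

TIB = 2 ** 40  # one tebibyte in bytes

# Assumption: 47-bit user address space (typical x86-64 Linux, 4-level paging).
ASSUMED_USER_ADDRESS_SPACE = 2 ** 47


def to_tib(num_bytes: int) -> float:
    """Convert a byte count to tebibytes (hypothetical helper)."""
    return num_bytes / TIB


for hex_size, dec_size in REQUESTED_BYTES.items():
    # The hex and decimal figures in each log line describe the same number.
    assert int(hex_size, 16) == dec_size
    exceeds = dec_size > ASSUMED_USER_ADDRESS_SPACE
    print(f"{hex_size}: about {to_tib(dec_size):.1f} TiB, "
          f"exceeds assumed address space: {exceeds}")

# errno 12 in the log lines corresponds to ENOMEM on Linux.
print(errno.errorcode[12])
```

Under that assumption, each request is larger than the whole user address space, which is consistent with the remark that an allocation of this size would fail regardless of how much memory the worker has.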
Comment 22•2 years ago
There are 49 total failures in the last 7 days on linux1804-64-tsan-qr opt
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=390686861&repo=mozilla-central&lineNumber=4171
[task 2022-09-16T22:10:06.983Z] 22:10:06 INFO - REFTEST TEST-START | image/test/reftest/ico/ico-png/ico-size-1x1-png.ico == image/test/reftest/ico/ico-png/ico-size-1x1-png.png
[task 2022-09-16T22:10:06.985Z] 22:10:06 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/image/test/reftest/ico/ico-png/ico-size-1x1-png.ico | 0 / 19 (0%)
[task 2022-09-16T22:10:08.180Z] 22:10:08 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2022-09-16T22:10:08.388Z] 22:10:08 INFO - ==3204==ERROR: ThreadSanitizer failed to allocate 0xfe121c9a6000 (279353742745600) bytes at address 80000c30000 (errno: 12)
[task 2022-09-16T22:10:08.440Z] 22:10:08 INFO - Exiting due to channel error.
[task 2022-09-16T22:10:08.440Z] 22:10:08 INFO - Exiting due to channel error.
[task 2022-09-16T22:10:08.440Z] 22:10:08 INFO - Exiting due to channel error.
[task 2022-09-16T22:10:08.444Z] 22:10:08 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5.42805) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=7.96323) Exiting due to channel error.
[task 2022-09-16T22:10:08.446Z] 22:10:08 INFO - Exiting due to channel error.
[task 2022-09-16T22:10:08.448Z] 22:10:08 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.20013) Exiting due to channel error.
[task 2022-09-16T22:10:09.470Z] 22:10:09 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.57941) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=17.182)
[task 2022-09-16T22:10:09.471Z] 22:10:09 ERROR - TEST-UNEXPECTED-FAIL | image/test/reftest/ico/ico-png/ico-size-1x1-png.ico | application terminated with exit code -6
[task 2022-09-16T22:10:09.480Z] 22:10:09 INFO - REFTEST INFO | Process mode: e10s
[task 2022-09-16T22:10:09.480Z] 22:10:09 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2022-09-16T22:10:09.481Z] 22:10:09 INFO - REFTEST INFO | Running tests in file:///builds/worker/workspace/build/tests/reftest/tests/image/test/reftest/pngsuite-gamma/reftest.list
[task 2022-09-16T22:10:09.490Z] 22:10:09 INFO - REFTEST INFO | Running with e10s: True
[task 2022-09-16T22:10:09.491Z] 22:10:09 INFO - REFTEST INFO | Running with fission: True
[task 2022-09-16T22:10:09.492Z] 22:10:09 INFO - REFTEST INFO | INFO | runtests.py | TSan using symbolizer at /builds/worker/workspace/build/application/firefox/llvm-symbolizer
[task 2022-09-16T22:10:09.493Z] 22:10:09 INFO - REFTEST INFO | Application command: /builds/worker/workspace/build/application/firefox/firefox -marionette -profile /tmp/tmpram1uqtm.mozrunner
[task 2022-09-16T22:10:13.034Z] 22:10:13 INFO - 1663366213032 Marionette INFO Marionette enabled
[task 2022-09-16T22:10:13.046Z] 22:10:13 INFO - 1663366213045 Marionette TRACE Received observer notification final-ui-startup
[task 2022-09-16T22:10:13.056Z] 22:10:13 INFO - 1663366213055 Marionette INFO Listening on port 2828
[task 2022-09-16T22:10:13.058Z] 22:10:13 INFO - 1663366213056 Marionette DEBUG Marionette is listening
[task 2022-09-16T22:10:13.811Z] 22:10:13 INFO - 1663366213810 Marionette DEBUG Accepted connection 0 from 127.0.0.1:47276
[task 2022-09-16T22:10:14.009Z] 22:10:14 INFO - 1663366214007 Marionette DEBUG Closed connection 0
[task 2022-09-16T22:10:14.016Z] 22:10:14 INFO - 1663366214015 Marionette DEBUG Accepted connection 1 from 127.0.0.1:47278
[task 2022-09-16T22:10:14.273Z] 22:10:14 INFO - 1663366214271 Marionette DEBUG Accepted connection 2 from 127.0.0.1:47280
[task 2022-09-16T22:10:14.276Z] 22:10:14 INFO - 1663366214274 Marionette DEBUG Closed connection 1
[task 2022-09-16T22:10:15.701Z] 22:10:15 INFO - 1663366215700 Marionette DEBUG 2 -> [0,1,"WebDriver:NewSession",{"strictFileInteractability":true}]
[task 2022-09-16T22:10:15.727Z] 22:10:15 INFO - 1663366215726 Marionette DEBUG Waiting for initial application window
[task 2022-09-16T22:10:20.748Z] 22:10:20 INFO - console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/tmpram1uqtm.mozrunner/search.json.mozlz4", (void 0)))
[task 2022-09-16T22:10:28.454Z] 22:10:28 INFO - 1663366228452 Marionette TRACE Received observer notification browser-idle-startup-tasks-finished
[task 2022-09-16T22:10:28.513Z] 22:10:28 INFO - 1663366228511 RemoteAgent TRACE [9] Document already finished loading: about:blank
[task 2022-09-16T22:10:28.592Z] 22:10:28 INFO - 1663366228590 Marionette DEBUG 2 <- [1,1,null,{"sessionId":"e32f686f-c87f-4bb1-a881-96f54c7bf0de","capabilities":{"browserName":"firefox","browserVersion":"106.0 ... wnTimeout":360000,"moz:useNonSpecCompliantPointerOrigin":false,"moz:webdriverClick":true,"moz:windowless":false,"proxy":{}}}]
[task 2022-09-16T22:10:28.685Z] 22:10:28 INFO - 1663366228683 Marionette DEBUG 2 -> [0,2,"Addon:Install",{"path":"/builds/worker/workspace/build/tests/reftest/specialpowers","temporary":true}]
[task 2022-09-16T22:10:28.893Z] 22:10:28 INFO - 1663366228892 Marionette DEBUG 2 <- [1,2,null,{"value":"special-powers@mozilla.org"}]
[task 2022-09-16T22:10:28.935Z] 22:10:28 INFO - 1663366228934 Marionette DEBUG 2 -> [0,3,"Addon:Install",{"path":"/builds/worker/workspace/build/tests/reftest/reftest","temporary":true}]
[task 2022-09-16T22:10:29.243Z] 22:10:29 INFO - 1663366229242 Marionette TRACE Received observer notification domwindowopened
[task 2022-09-16T22:10:29.265Z] 22:10:29 INFO - 1663366229264 Marionette DEBUG 2 <- [1,3,null,{"value":"reftest@mozilla.org"}]
[task 2022-09-16T22:10:29.319Z] 22:10:29 INFO - 1663366229317 Marionette DEBUG 2 -> [0,4,"WebDriver:DeleteSession",{}]
[task 2022-09-16T22:10:29.337Z] 22:10:29 INFO - 1663366229335 Marionette DEBUG 2 <- [1,4,null,{"value":null}]
[task 2022-09-16T22:10:29.349Z] 22:10:29 INFO - 1663366229348 Marionette DEBUG Closed connection 2
Comment 24•2 years ago
There have been 43 total failures in the last 7 days, recent failure log.
Affected platforms are:
- linux1804-64-tsan-qr
Comment 27•2 years ago
There have been 37 total failures in the last 7 days, recent failure log.
Affected platforms are:
- linux1804-64-tsan-qr
Comment 30•2 years ago
There have been 43 total failures in the last 7 days, recent failure log.
Affected platforms are:
- linux1804-64-tsan-qr
Comment 32•2 years ago
This got frequent on 2022-09-06. Could it be related to the changes in bug 1789056?
Comment 33•2 years ago
(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #32)
This got frequent on 2022-09-06. Could it be related to the changes in bug 1789056?
No, I highly doubt it.
Comment 38•2 years ago
Update:
There have been 33 failures within the last 7 days, all of them on Linux 18.04 x64 WebRender tsan opt.
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=395731251&repo=autoland&lineNumber=5961
[task 2022-11-07T03:35:39.740Z] 03:35:39 INFO - REFTEST TEST-START | editor/reftests/xul/emptytextbox-4.xhtml != editor/reftests/xul/emptytextbox-ref.xhtml
[task 2022-11-07T03:35:39.745Z] 03:35:39 INFO - REFTEST TEST-LOAD | chrome://reftest/content/editor/reftests/xul/emptytextbox-4.xhtml | 0 / 1 (0%)
[task 2022-11-07T03:35:40.303Z] 03:35:40 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2022-11-07T03:35:40.536Z] 03:35:40 INFO - ==4884==ERROR: ThreadSanitizer failed to allocate 0xfe12ba1a6000 (279356385157120) bytes at address 80000c30000 (errno: 12)
[task 2022-11-07T03:35:40.582Z] 03:35:40 INFO - Exiting due to channel error.
[task 2022-11-07T03:35:40.582Z] 03:35:40 INFO - Exiting due to channel error.
[task 2022-11-07T03:35:40.588Z] 03:35:40 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=4.46218) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=3.97656) Exiting due to channel error.
[task 2022-11-07T03:35:40.589Z] 03:35:40 INFO - Exiting due to channel error.
[task 2022-11-07T03:35:40.589Z] 03:35:40 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=6.81243) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=6.55822) Exiting due to channel error.
[task 2022-11-07T03:35:41.609Z] 03:35:41 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=16.1584)
[task 2022-11-07T03:35:41.610Z] 03:35:41 ERROR - TEST-UNEXPECTED-FAIL | editor/reftests/xul/emptytextbox-4.xhtml | application terminated with exit code -6
[task 2022-11-07T03:35:41.618Z] 03:35:41 INFO - REFTEST INFO | Process mode: e10s
Comment 50•1 year ago
There have been 49 total failures in the last 7 days, recent failure log.
All failures are on linux1804-64-tsan-qr.
[task 2023-01-28T23:53:51.773Z] 23:53:51 INFO - REFTEST TEST-START | layout/reftests/css-scroll-snap/scroll-margin-on-anchor.html#target == layout/reftests/css-scroll-snap/scroll-margin-on-anchor-ref.html
[task 2023-01-28T23:53:51.778Z] 23:53:51 INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/css-scroll-snap/scroll-margin-on-anchor.html#target | 0 / 2 (0%)
[task 2023-01-28T23:53:52.746Z] 23:53:52 INFO - REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 800,1000; test browser size = 800,1000
[task 2023-01-28T23:53:52.990Z] 23:53:52 INFO - ==4805==ERROR: ThreadSanitizer failed to allocate 0xfe1a9b5a6000 (279390228996096) bytes at address 800e4def000 (errno: 12)
[task 2023-01-28T23:53:53.038Z] 23:53:53 INFO - Exiting due to channel error.
[task 2023-01-28T23:53:53.039Z] 23:53:53 INFO - Exiting due to channel error.
[task 2023-01-28T23:53:53.040Z] 23:53:53 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5.07075) Exiting due to channel error.
[task 2023-01-28T23:53:53.041Z] 23:53:53 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=1.23982) Exiting due to channel error.
[task 2023-01-28T23:53:53.042Z] 23:53:53 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=5.11883) Exiting due to channel error.
[task 2023-01-28T23:53:54.063Z] 23:53:54 INFO - Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=16.8043)
[task 2023-01-28T23:53:54.064Z] 23:53:54 ERROR - TEST-UNEXPECTED-FAIL | layout/reftests/css-scroll-snap/scroll-margin-on-anchor.html#target | application terminated with exit code -6
[task 2023-01-28T23:53:54.072Z] 23:53:54 INFO - REFTEST INFO | Process mode: e10s
[task 2023-01-28T23:53:54.072Z] 23:53:54 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2023-01-28T23:53:54.073Z] 23:53:54 INFO - REFTEST INFO | Result summary:
[task 2023-01-28T23:53:54.074Z] 23:53:54 INFO - REFTEST INFO | Successful: 358 (358 pass, 0 load only)
[task 2023-01-28T23:53:54.074Z] 23:53:54 INFO - REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 failed load, 0 exception)
[task 2023-01-28T23:53:54.075Z] 23:53:54 INFO - REFTEST INFO | Known problems: 10 (3 known fail, 0 known asserts, 0 random, 7 skipped, 0 slow)
[task 2023-01-28T23:53:54.076Z] 23:53:54 INFO - REFTEST SUITE-END | Shutdown
[task 2023-01-28T23:53:54.114Z] 23:53:54 ERROR - Return code: 250
[task 2023-01-28T23:53:54.115Z] 23:53:54 INFO - TinderboxPrint: reftest-reftest<br/>729/0/3
[task 2023-01-28T23:53:54.116Z] 23:53:54 ERROR - # TBPL FAILURE #
[task 2023-01-28T23:53:54.116Z] 23:53:54 WARNING - setting return code to 2
[task 2023-01-28T23:53:54.117Z] 23:53:54 ERROR - The reftest suite: reftest ran with return status: FAILURE
[task 2023-01-28T23:53:54.118Z] 23:53:54 INFO - Running post-action listener: _package_coverage_data
[task 2023-01-28T23:53:54.119Z] 23:53:54 INFO - Running post-action listener: _resource_record_post_action
[task 2023-01-28T23:53:54.119Z] 23:53:54 INFO - Running post-action listener: process_java_coverage_data
[task 2023-01-28T23:53:54.119Z] 23:53:54 INFO - [mozharness: 2023-01-28 23:53:54.117016Z] Finished run-tests step (success)
[task 2023-01-28T23:53:54.119Z] 23:53:54 INFO - [mozharness: 2023-01-28 23:53:54.117223Z] Running uninstall step.
[task 2023-01-28T23:53:54.120Z] 23:53:54 INFO - Running pre-action listener: _resource_record_pre_action
[task 2023-01-28T23:53:54.120Z] 23:53:54 INFO - Running main action method: uninstall
[task 2023-01-28T23:53:54.121Z] 23:53:54 INFO - Skipping uninstall for non-MSIX test
[task 2023-01-28T23:53:54.122Z] 23:53:54 INFO - Running post-action listener: _resource_record_post_action
[task 2023-01-28T23:53:54.122Z] 23:53:54 INFO - [mozharness: 2023-01-28 23:53:54.117769Z] Finished uninstall step (success)
[task 2023-01-28T23:53:54.122Z] 23:53:54 INFO - Running post-run listener: _resource_record_post_run
[task 2023-01-28T23:53:54.188Z] 23:53:54 INFO - Validating Perfherder data against /builds/worker/workspace/mozharness/external_tools/performance-artifact-schema.json
[task 2023-01-28T23:53:54.193Z] 23:53:54 INFO - PERFHERDER_DATA: {"framework": {"name": "job_resource_usage"}, "suites": [{"name": "reftest.reftest.15.overall", "extraOptions": ["e10s", "taskcluster-projects/887720501152/machineTypes/n2-standard-2"], "subtests": [{"name": "cpu_percent", "value": 75.07033492822967}, {"name": "io_write_bytes", "value": 2903367680}, {"name": "io.read_bytes", "value": 43061248}, {"name": "io_write_time", "value": 984516}, {"name": "io_read_time", "value": 600}]}, {"name": "reftest.reftest.15.start-pulseaudio", "subtests": [{"name": "time", "value": 0.029196739196777344}, {"name": "cpu_percent", "value": 0}]}, {"name": "reftest.reftest.15.install", "subtests": [{"name": "time", "value": 41.69308543205261}, {"name": "cpu_percent", "value": 50.75}]}, {"name": "reftest.reftest.15.stage-files", "subtests": [{"name": "time", "value": 0.00029468536376953125}, {"name": "cpu_percent", "value": 0}]}, {"name": "reftest.reftest.15.run-tests", "subtests": [{"name": "time", "value": 586.9088296890259}, {"name": "cpu_percent", "value": 76.81239316239316}]}, {"name": "reftest.reftest.15.uninstall", "subtests": [{"name": "time", "value": 0.00027251243591308594}, {"name": "cpu_percent", "value": 0}]}]}
[task 2023-01-28T23:53:54.194Z] 23:53:54 INFO - Total resource usage - Wall time: 628s; CPU: Can't collect data; Read bytes: 43061248; Write bytes: 2903367680; Read time: 600; Write time: 984516
[task 2023-01-28T23:53:54.195Z] 23:53:54 INFO - TinderboxPrint: I/O read bytes / time<br/>43,061,248 / 600
[task 2023-01-28T23:53:54.196Z] 23:53:54 INFO - TinderboxPrint: I/O write bytes / time<br/>2,903,367,680 / 984,516
[task 2023-01-28T23:53:54.197Z] 23:53:54 INFO - TinderboxPrint: CPU idle<br/>311.4 (24.9%)
[task 2023-01-28T23:53:54.198Z] 23:53:54 INFO - TinderboxPrint: CPU system<br/>78.3 (6.3%)
[task 2023-01-28T23:53:54.199Z] 23:53:54 INFO - TinderboxPrint: CPU user<br/>860.0 (68.7%)
[task 2023-01-28T23:53:54.200Z] 23:53:54 INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2023-01-28T23:53:54.201Z] 23:53:54 INFO - start-pulseaudio - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-01-28T23:53:54.201Z] 23:53:54 INFO - install - Wall time: 42s; CPU: 51%; Read bytes: 196608; Write bytes: 1704931328; Read time: 0; Write time: 921920
[task 2023-01-28T23:53:54.202Z] 23:53:54 INFO - stage-files - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-01-28T23:53:54.203Z] 23:53:54 INFO - run-tests - Wall time: 587s; CPU: 77%; Read bytes: 42504192; Write bytes: 1198436352; Read time: 600; Write time: 62596
[task 2023-01-28T23:53:54.204Z] 23:53:54 INFO - uninstall - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-01-28T23:53:54.269Z] 23:53:54 WARNING - returning nonzero exit status 2
Comment 55•1 year ago
Christian, could you have a look at this issue? Its failure rate has increased: it now has 130 total failures in the last 30 days, which puts it on the disable-recommended list.
Recent log: https://treeherder.mozilla.org/logviewer?job_id=404642447&repo=autoland
Comment 57•1 year ago
I already made a recommendation in comment 21. This should be fixed by the developers or the test should be disabled. It has nothing to do with TSan.
Comment 58•1 year ago
Daniel, is this something you can investigate given what Christian said above? Disabling is not an option here because we see this kind of failure across several reftests, so it is not test-related. The only thing in common that I can observe is that they all happen on ThreadSanitizer after this kind of failure line:
INFO - ==5055==ERROR: ThreadSanitizer failed to allocate 0xfe7433fa6000 (279775041708032) bytes at address 800e4def000 (errno: 12)
Failure log: https://treeherder.mozilla.org/logviewer?job_id=405000344&repo=autoland&lineNumber=2406
A weekly failure rate looks something like this: https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2023-02-01&endday=2023-02-08&tree=trunk&failurehash=all&bug=1770595
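As an illustration of how the failure lines could be compared across logs, here is a small Python sketch that extracts the PID, size, and address from each message and groups the requested sizes by address. The regular expression is written against the message format as it appears in this bug, the function name `group_by_address` is hypothetical, and the sample lines are the six failure lines quoted in this thread with the task timestamps dropped. Six samples are too few to draw conclusions from; the sketch only shows the mechanics.

```python
import re
from collections import defaultdict

# Pattern for the TSan message as it appears in the logs quoted in this bug.
TSAN_ALLOC_RE = re.compile(
    r"==(?P<pid>\d+)==ERROR: ThreadSanitizer failed to allocate "
    r"0x(?P<size_hex>[0-9a-f]+) \((?P<size_dec>\d+)\) bytes "
    r"at address (?P<addr>[0-9a-f]+) \(errno: (?P<errno>\d+)\)"
)

# Failure lines copied from this bug (timestamps removed).
SAMPLE_LINES = [
    "INFO - ==3900==ERROR: ThreadSanitizer failed to allocate 0xf69000490000 (271098340507648) bytes at address 80400003000 (errno: 12)",
    "INFO - ==4867==ERROR: ThreadSanitizer failed to allocate 0xf70000546000 (271579377590272) bytes at address 80200001000 (errno: 12)",
    "INFO - ==3204==ERROR: ThreadSanitizer failed to allocate 0xfe121c9a6000 (279353742745600) bytes at address 80000c30000 (errno: 12)",
    "INFO - ==4884==ERROR: ThreadSanitizer failed to allocate 0xfe12ba1a6000 (279356385157120) bytes at address 80000c30000 (errno: 12)",
    "INFO - ==4805==ERROR: ThreadSanitizer failed to allocate 0xfe1a9b5a6000 (279390228996096) bytes at address 800e4def000 (errno: 12)",
    "INFO - ==5055==ERROR: ThreadSanitizer failed to allocate 0xfe7433fa6000 (279775041708032) bytes at address 800e4def000 (errno: 12)",
]


def group_by_address(lines):
    """Map each target address to the list of requested sizes seen there."""
    groups = defaultdict(list)
    for line in lines:
        match = TSAN_ALLOC_RE.search(line)
        if match:
            groups[match.group("addr")].append(int(match.group("size_dec")))
    return dict(groups)


for addr, sizes in group_by_address(SAMPLE_LINES).items():
    print(f"address {addr}: {len(sizes)} failure(s), sizes {sizes}")
```

Run over full logs rather than these excerpts, the same grouping would show whether particular addresses or size ranges recur.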
Comment 59•1 year ago
(In reply to Christian Holler (:decoder) from comment #57)
I already made a recommendation in comment 21. This should be fixed by the developers [...] It has nothing to do with TSan.
It looks like the problematic allocation is coming from TSAN, not from Firefox, right? Googling for the message turns up other projects hitting the same issue, and some of the results mention huge allocations from TSAN's race detector specifically. I don't know that there's anything we can do on the Firefox/Gecko side to reduce the size of TSAN's own allocations.
or the test should be disabled.
There's no one specific test (or set of tests) that could be disabled here; it looks like this happens for lots of tests, per the bug title.
It also apparently can happen before we've even launched a test, e.g. here during Marionette DEBUG Waiting for initial application window:
https://treeherder.mozilla.org/logviewer?job_id=404646659&repo=mozilla-esr102&lineNumber=2316
...though in most cases it seems to happen just after we've logged a drawWindow call, i.e. it's usually associated with us taking a reftest snapshot (which isn't something that we can skip).
Comment 60•1 year ago
(and RE this from comment 21):
(In reply to Christian Holler (:decoder) from comment #21)
In general for OOMs, I would recommend disabling the test with TSan (TSan consumes more memory than normal).
However, this specific instance looks like a large allocation of some sort (271579377590272 bytes would fail no matter what). It might be worth checking where this allocation is coming from and fixing this at the source. This should also OOM or crash regular builds.
If this allocation is from TSAN itself (as part of its race detection bookkeeping), as I suspect it is, then that would explain why it doesn't happen in regular builds.
decoder, see comment 59 and this comment... Do you have any additional insight/suggestions here about how to reduce TSAN's overhead?
(It is interesting that this seems to be specific to reftest runs -- nearly all of the failures here are reftests. Maybe there's some sort of cumulative overhead that TSAN builds up, with each drawWindow call... I wonder if we might be able to avoid this intermittent to some extent by reducing the number of reftest tests we run in each task/bucket? i.e. increasing our number of TSAN reftest buckets.)
Assignee
Comment 61•1 year ago
Maybe we could try using xlarge instance sizes instead? We already do that for some other suites in TSAN mode.
Comment 62•1 year ago
If the TSAN message can be fully trusted, that probably wouldn't help. The allocation sizes mentioned in the error messages here are always on the order of 2.7*10^14 bytes, i.e. 270 terabytes. Quite a bizarrely giant allocation, and not really something we can reliably provide hardware to support.
(On the other hand: If the error message might be wrong or off by several orders of magnitude, then maybe a bit more memory would help. But taking it at face value, it doesn't seem like throwing beefier instances at the problem would help.)
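As a rough sanity check of that estimate, the following Python sketch converts the two sizes quoted in the error lines in this bug into terabytes and confirms that the hex and decimal figures in each message agree. The comparison against a 47-bit user address space is an assumption, not something stated in this bug: it is the usual limit for x86-64 Linux with 4-level paging, and the configuration of the CI workers is not confirmed here.

```python
# Sizes copied from the two ThreadSanitizer error lines quoted in this bug:
# hex size as printed -> decimal size as printed.
REPORTED_SIZES = {
    "0xf69000490000": 271098340507648,
    "0xfe7433fa6000": 279775041708032,
}

# ASSUMPTION: 47-bit user address space (typical x86-64 Linux, 4-level paging).
USER_ADDRESS_SPACE_BYTES = 2 ** 47


def describe(hex_size, decimal_size):
    """Convert one reported allocation size into human-scale units."""
    size = int(hex_size, 16)
    # The message prints the same number twice; make sure both forms agree.
    if size != decimal_size:
        raise ValueError(f"{hex_size} does not equal {decimal_size}")
    return {
        "bytes": size,
        "terabytes": size / 10 ** 12,   # decimal TB
        "tebibytes": size / 2 ** 40,    # binary TiB
        "exceeds_user_address_space": size > USER_ADDRESS_SPACE_BYTES,
    }


if __name__ == "__main__":
    for hex_size, decimal_size in REPORTED_SIZES.items():
        info = describe(hex_size, decimal_size)
        print(
            f"{hex_size}: {info['terabytes']:.1f} TB, "
            f"{info['tebibytes']:.1f} TiB, "
            f"larger than assumed user address space: "
            f"{info['exceeds_user_address_space']}"
        )
```

If the 47-bit assumption holds for the workers, a request of this size could not be satisfied regardless of how much RAM the instance has, which is consistent with the point above that larger instances are unlikely to help.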
Comment 63•1 year ago
One observation... It seems like this is reliably associated with canvas operations, at least in our four crashtest failures.
Looking at the orange-factor history for the crashtest testsuites, it looks like canvas is the common thread.
Crashtest runs:
- The two crashtest-swr-6 failures are during https://searchfox.org/mozilla-central/rev/e66cff951667dacd0faa95dfde830564a58a8a3f/dom/media/test/crashtests/1378826.html which draws to a canvas.
- The crashtest-swr-4 failure is during https://searchfox.org/mozilla-central/rev/e66cff951667dacd0faa95dfde830564a58a8a3f/dom/canvas/crashtests/360293-1.html
- The crashtest-11 failure is during https://searchfox.org/mozilla-central/rev/e66cff951667dacd0faa95dfde830564a58a8a3f/gfx/tests/crashtests/766422-2.html
...and all three of those tests use an HTML <canvas> with some scripted drawing onto it.
The mochitest failures don't have as clear of a connection to canvas. We do have two failures on mochitest-a11y-1proc where the issue is during https://searchfox.org/mozilla-central/rev/e66cff951667dacd0faa95dfde830564a58a8a3f/accessible/tests/mochitest/elm/test_HTMLSpec.html which does include a section for testing HTML:canvas -- but the other mochitest failures don't seem related to canvas. (Two of our mochitest failures are mis-stars, too -- a mochitest-plain-spi-nw-1 failure and a mochitest-browser-chrome-swr-16 failure.)
And then, as noted before, the reftest failures are nearly all during drawWindow calls, aside from the one that I linked in comment 59 where we failed a bit earlier. [EDIT: I initially said drawSnapshot but I meant drawWindow.]
So: focusing just on the (few) crashtest results and the (many) reftest results, it seems like this is typically a failure during canvas operations or drawWindow.
Comment 64•1 year ago
(In reply to Daniel Holbert [:dholbert] from comment #63)
So: focusing just on the (few) crashtest results and the (many) reftest results, it seems like this is typically a failure during canvas operations or drawWindow.
...and drawWindow is itself a canvas operation, under the hood, I think. So: to the extent that there's any Gecko code that we can identify as being related here, it'd be our canvas rendering code, and especially whole-reftest-area drawWindow calls.
Kind of a shot in the dark, but: it might be worth seeing if we get better results by doing these TSAN reftest runs with drawSnapshot instead? We do have useDrawSnapshot as a variable in the reftest harness, which defaults to false, but we do have one in-tree configuration that sets it to true:
https://searchfox.org/mozilla-central/rev/e66cff951667dacd0faa95dfde830564a58a8a3f/testing/mozharness/configs/unittests/linux_unittest.py#221-227
...and it appears that we run debug builds with that configuration (only debug builds for some reason), e.g. this recent one.
So: it might be worth seeing if a drawSnapshot-ified reftest run is any less prone to pushing TSAN over its limits in this respect.
Comment 65•1 year ago
(In reply to Daniel Holbert [:dholbert] from comment #59)
(In reply to Christian Holler (:decoder) from comment #57)
I already made a recommendation in comment 21. This should be fixed by the developers [...] It has nothing to do with TSan.
It looks like the problematic allocation is coming from TSAN, not from Firefox, right? Googling for the message turns up other projects hitting the same issue, and some of the results mention huge allocations from TSAN's race detector specifically. I don't know that there's anything we can do on the Firefox/Gecko side to reduce the size of TSAN's own allocations.
This cannot be determined from the given crash, unfortunately. The message we are seeing here can either be from TSan itself, or from an allocation in our code that doesn't go the standard malloc route (I believe certain types of allocations/mappings can fail this way despite the fact that we have set allocator_may_return_null=1 for options).
It would be useful if someone could reproduce this locally in a debugger and see where the allocation is actually coming from and when.
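For context, sanitizer runtime flags such as allocator_may_return_null are normally passed through the TSAN_OPTIONS environment variable as a delimited list of key=value pairs. The Python sketch below is illustrative only: the helper name with_tsan_option is hypothetical, this is not how the Firefox test harness sets its options, and it assumes colon-separated options with no colons inside the values.

```python
def with_tsan_option(env, key, value):
    """Return a copy of env with key=value merged into TSAN_OPTIONS.

    ASSUMPTION: options are colon-separated and values contain no colons
    (so this would mishandle e.g. a Windows-style path as a value).
    """
    existing = [opt for opt in env.get("TSAN_OPTIONS", "").split(":") if opt]
    # Drop any earlier setting of the same key so the new value wins.
    kept = [opt for opt in existing if not opt.startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    updated = dict(env)
    updated["TSAN_OPTIONS"] = ":".join(kept)
    return updated


if __name__ == "__main__":
    import os

    env = with_tsan_option(os.environ, "allocator_may_return_null", 1)
    # The resulting env could be handed to subprocess.run(..., env=env)
    # when launching a TSan-instrumented binary.
    print(env["TSAN_OPTIONS"])
```

As the comment above notes, this flag only affects allocations that go through the sanitizer's allocator path that honours it, so setting it would not necessarily prevent the abort seen in this bug.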
Comment 67•1 year ago
(In reply to Christian Holler (:decoder) from comment #65)
The message we are seeing here can either be from TSan itself, or from an allocation in our code that doesn't go the standard malloc route
Of those two options, though, it seems quite likely to be TSAN itself, given that we're seeing this before Firefox has gotten a chance to do much of anything (e.g. at startup in this log from comment 59), and given that we're not seeing OOMs from non-TSAN runs (which we would expect to see if Firefox itself were actually requesting multi-hundred-terabyte allocations when run under TSAN).
It would be useful if someone could reproduce this locally in a debugger and see where the allocation is actually coming from and when.
I'm happy to leave a TSAN build running under rr for a while, but for now I haven't been able to build locally; I'm hitting a build error when trying to build with TSAN, as noted in bug 1816588.
Comment 76•1 year ago
No; I ran reftests in a local TSAN build under rr for about a day and wasn't yet able to reproduce. So at this point, this still seems likely to be a bug (or weird edge case) in TSAN itself that we don't have any special insight into, though we might learn more if anyone is able to catch it in a debugger (as I've tried to do, so far unsuccessfully).
Worst-case, if this is causing sheriff time to be wasted classifying failures, I wonder if we could consider this to be a blue "infrastructure issue" type bug (considering TSAN to be part of the infrastructure on these test runs), and write some sort of log-processing rule to detect this and automatically categorize it as such? (And automatically retrigger the task, as we try to do for other infra failures.)
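A log-processing rule of the kind suggested above could look roughly like the Python sketch below. It is a minimal illustration, not an existing Treeherder or Taskcluster feature: the function names and the "infra-tsan-alloc" label are hypothetical, and the regular expression is based only on the error lines quoted in this bug.

```python
import re

# Pattern derived from the error lines quoted in this bug, e.g.
# "==5055==ERROR: ThreadSanitizer failed to allocate 0xfe7433fa6000
#  (279775041708032) bytes at address 800e4def000 (errno: 12)"
TSAN_ALLOC_FAILURE = re.compile(
    r"==(?P<pid>\d+)==ERROR: ThreadSanitizer failed to allocate "
    r"0x(?P<size_hex>[0-9a-fA-F]+) \((?P<size_dec>\d+)\) bytes "
    r"at address (?P<address>[0-9a-fA-F]+) \(errno: (?P<errno>\d+)\)"
)


def find_tsan_alloc_failure(log_lines):
    """Return details of the first matching line, or None if there is none."""
    for number, line in enumerate(log_lines, start=1):
        match = TSAN_ALLOC_FAILURE.search(line)
        if match:
            size = int(match["size_dec"])
            return {
                "line_number": number,
                "pid": int(match["pid"]),
                "size": size,
                # The message prints the size in hex and decimal; record
                # whether the two agree as a cheap sanity check.
                "size_consistent": int(match["size_hex"], 16) == size,
                "address": int(match["address"], 16),
                "errno": int(match["errno"]),
            }
    return None


def classify(log_lines):
    """Hypothetical labels: flag the log as infra if the signature is present."""
    if find_tsan_alloc_failure(log_lines) is not None:
        return "infra-tsan-alloc"
    return "unclassified"
```

Matching on the full message rather than on the exit code alone keeps unrelated aborts and timeouts from being swept into the same category.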
Comment 82•1 year ago
(Per recent dupes, this isn't reftest-specific, hence the s/reftest/test/ in the bug title.)
Comment 92•1 year ago
Hi Christian, could someone please investigate this failure, which is currently the #8 top orange? It looks like it has increased quite a lot. Thanks.
Comment 93•1 year ago
There is nothing to investigate from our side. This is likely a bug in TSan or the sanitizer runtime itself. I briefly spoke to the author of TSan last week and he mentioned that Chrome is seeing a similar failure. It could be that a change in our codebase or (more likely) a toolchain update made this more likely.
Comment 94•1 year ago
For reference:
Bug in Chrome: https://bugs.chromium.org/p/chromium/issues/detail?id=1275223
RCA: https://bugs.chromium.org/p/chromium/issues/detail?id=1275223#c15
Comment 101•1 year ago
There is a tentative fix for this issue from Google: https://reviews.llvm.org/D147459
Assignee
Comment 103•1 year ago
Try push with all the relevant upstream LLVM backports looks good. No exit code -6 failures other than a known data race intermittent.
https://treeherder.mozilla.org/jobs?repo=try&revision=d03ce314edc7b168e9a7086994ec3956def7a4ad
Assignee
Comment 104•1 year ago
Comment 105•1 year ago
Pushed by rvandermeulen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/229cf81f6bc1 Backport upstream LLVM fixes for TSAN allocation failures. r=firefox-build-system-reviewers,andi
Comment 106•1 year ago
bugherder
Assignee
Comment 107•1 year ago
Try push for Beta looks good too:
https://treeherder.mozilla.org/jobs?repo=try&revision=916f554e0e07e9a31c200a7cc92c67d39f9e6003
Unfortunately, ESR is on clang-14 and I don't have the heart to try to rebase these patches onto that old of a release.
Assignee
Comment 108•1 year ago
Original Revision: https://phabricator.services.mozilla.com/D175903
Comment 109•1 year ago
The patch landed in nightly and beta is affected.
:RyanVM, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set status-firefox113 to wontfix.
For more information, please visit auto_nag documentation.
Comment 112•1 year ago
Uplift Approval Request
- Risk associated with taking this patch: Low
- Steps to reproduce for manual QE testing: N/A
- Explanation of risk level: Only touches LLVM sanitizer code, so it shouldn't affect shipping builds
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Is Android affected?: no
- String changes made/needed: None
- Needs manual QE test: no
- User impact if declined: None
Assignee
Comment 113•1 year ago
bugherder uplift
Comment 136•9 months ago
This bug seems to be accumulating stars for random unrelated TEST-UNEXPECTED-TIMEOUT failures for TSAN runs on esr115, e.g. in comment 127 and comment 135.
CosminS, do you know why that's happening, and could you adjust things such that we stop starring these sorts of failures as being this bug?
This bug here wasn't about timeouts; it was specifically about random aborts where we were hitting ThreadSanitizer failed to allocate [giant number] bytes at address [something] in the log. This bug was fixed in 113 and beyond, so any failures in ESR115 would be surprising (which is why these reports caught my attention).
(One of the two failures from comment 135 seems to have really been bug 1836972; I fixed that annotation, but I didn't look in detail at the other one from comment 135, or at the ones from comment 127, except to check that they don't have failed to allocate in their log, to be sure they're not this issue here. Presumably we want to re-annotate those with their correct bugs, or file new bugs if appropriate.)
Comment 137•9 months ago
From what I see, some are coming from Bug 1773612, as it's a duplicate of this bug, and it's because we see this line present: https://treeherder.mozilla.org/logviewer?job_id=428773739&repo=mozilla-esr102&lineNumber=4170
In the last 30 days, this bug has 4 failures classified against it: 3 come from esr102 (which will no longer happen) and the other is a misclassification that will be corrected. Will let the other sheriffs know not to use it anymore.
Comment 138•9 months ago
Thanks. I'll adjust the title of that bug to make it less likely to be suggested as a match going forward.