Closed Bug 1871768 Opened 1 year ago Closed 1 year ago

Perma LeakSanitizer | leak at mozilla::layers::GPUVideoImage::GPUVideoImage, mozilla::RemoteImageHolder::TransferToImage, mozilla::RemoteVideoDecoderChild::ProcessOutput, operator

Categories

(Core :: Audio/Video: Web Codecs, defect, P5)

defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- unaffected
firefox122 --- unaffected
firefox123 --- disabled
firefox124 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: chunmin)

References

(Regression)

Details

(Keywords: intermittent-failure, intermittent-testcase, regression)

Attachments

(5 files)

Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=441264391&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/aA2BI3v1T9u8-9-6cSeETQ/runs/0/artifacts/public/logs/live_backing.log


[task 2023-12-23T16:59:27.625Z] 16:59:27     INFO - TEST-START | /webcodecs/videoFrame-texImage.any.worker.html
[task 2023-12-23T16:59:27.738Z] 16:59:27     INFO - Closing window d97685c9-9959-44cc-aa84-5e9b4ad3f93b
[task 2023-12-23T16:59:28.204Z] 16:59:28     INFO - PID 1146 | -----------------------------------------------------
[task 2023-12-23T16:59:28.207Z] 16:59:28     INFO - PID 1146 | Suppressions used:
[task 2023-12-23T16:59:28.207Z] 16:59:28     INFO - PID 1146 |   count      bytes template
[task 2023-12-23T16:59:28.208Z] 16:59:28     INFO - PID 1146 |       2        288 libfontconfig.so
[task 2023-12-23T16:59:28.209Z] 16:59:28     INFO - PID 1146 |      31      16288 style::global_style_data
[task 2023-12-23T16:59:28.209Z] 16:59:28     INFO - PID 1146 |       1       9504 style::sharing::SHARING_CACHE_KEY
[task 2023-12-23T16:59:28.209Z] 16:59:28     INFO - PID 1146 |       1       4112 style::bloom::BLOOM_KEY
[task 2023-12-23T16:59:28.210Z] 16:59:28     INFO - PID 1146 | -----------------------------------------------------
<...>
[task 2023-12-23T16:59:35.166Z] 16:59:35     INFO - PID 1146 | -----------------------------------------------------
[task 2023-12-23T16:59:35.169Z] 16:59:35     INFO - PID 1146 | Suppressions used:
[task 2023-12-23T16:59:35.170Z] 16:59:35     INFO - PID 1146 |   count      bytes template
[task 2023-12-23T16:59:35.170Z] 16:59:35     INFO - PID 1146 |       2         64 nsComponentManagerImpl
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |     633      18115 libfontconfig.so
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |      26       4799 libEGL_mesa.so
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |       2      16393 libglib-2.0.so
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |      20      12800 libX11.so
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |    3081    3093989 swrast_dri.so
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 |       1         32 XPCWrappedNativeJSOps
[task 2023-12-23T16:59:35.171Z] 16:59:35     INFO - PID 1146 | -----------------------------------------------------
[task 2023-12-23T16:59:35.268Z] 16:59:35     INFO - Browser exited with return code 0
[task 2023-12-23T16:59:35.268Z] 16:59:35     INFO - ERROR | LeakSanitizer | SUMMARY: AddressSanitizer: 824 byte(s) leaked in 4 allocation(s).
[task 2023-12-23T16:59:35.271Z] 16:59:35     INFO - LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2023-12-23T16:59:35.272Z] 16:59:35     INFO - This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2023-12-23T16:59:35.272Z] 16:59:35     INFO - Allowed depth was 4
[task 2023-12-23T16:59:35.272Z] 16:59:35     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at PLDHashTable::MakeEntryHandle, PLDHashTable::MakeEntryHandle, std::invoke_result, WithEntryHandle
[task 2023-12-23T16:59:35.273Z] 16:59:35     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::layers::GPUVideoImage::GPUVideoImage, mozilla::RemoteImageHolder::TransferToImage, mozilla::RemoteVideoDecoderChild::ProcessOutput, operator
[task 2023-12-23T16:59:35.274Z] 16:59:35     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::ipc::MessageChannel::MessageChannel, mozilla::ipc::IToplevelProtocol::IToplevelProtocol, mozilla::PRemoteDecoderManagerChild::PRemoteDecoderManagerChild, RemoteDecoderManagerChild
[task 2023-12-23T16:59:35.274Z] 16:59:35     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::RemoteDecoderManagerChild::OpenRemoteDecoderManagerChildForProcess, operator, InvokeMethod, InvokeCallbackMethod
[task 2023-12-23T16:59:35.276Z] 16:59:35     INFO - Closing logging queue
[task 2023-12-23T16:59:35.276Z] 16:59:35     INFO - queue closed
[task 2023-12-23T16:59:35.298Z] 16:59:35     INFO - INFO | runtests.py | ASan using symbolizer at /builds/worker/workspace/build/application/firefox/llvm-symbolizer
[task 2023-12-23T16:59:35.311Z] 16:59:35     INFO - LSan enabled.
[task 2023-12-23T16:59:35.313Z] 16:59:35     INFO - INFO | runtests.py | ASan running in default memory configuration
[task 2023-12-23T16:59:35.314Z] 16:59:35     INFO - UBSan enabled.
[task 2023-12-23T16:59:35.317Z] 16:59:35     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmp7lrngkmn
[task 2023-12-23T16:59:35.328Z] 16:59:35     INFO - Starting runner
[task 2023-12-23T16:59:35.482Z] 16:59:35     INFO - PID 11136 | [ERROR firefox_on_glean::private::string] Unable to set string metric in non-main process. Ignoring.
[task 2023-12-23T16:59:35.960Z] 16:59:35     INFO - PID 11136 | [ERROR firefox_on_glean::private::string] Unable to set string metric in non-main process. Ignoring.
[task 2023-12-23T16:59:36.915Z] 16:59:36     INFO - PID 11246 | ATTENTION: default value of option mesa_glthread overridden by environment.
[task 2023-12-23T16:59:37.001Z] 16:59:37     INFO - PID 11246 | 1703350777000	Marionette	INFO	Marionette enabled
[task 2023-12-23T16:59:37.225Z] 16:59:37     INFO - PID 11246 | ATTENTION: default value of option mesa_glthread overridden by environment.
[task 2023-12-23T16:59:37.285Z] 16:59:37     INFO - PID 11246 | 1703350777284	Marionette	INFO	Listening on port 32991
[task 2023-12-23T16:59:38.257Z] 16:59:38     INFO - PID 11246 | [ERROR firefox_on_glean::private::string] Unable to set string metric in non-main process. Ignoring.
[task 2023-12-23T16:59:38.645Z] 16:59:38     INFO - PID 11246 | [ERROR firefox_on_glean::private::string] Unable to set string metric in non-main process. Ignoring.
[task 2023-12-23T16:59:40.851Z] 16:59:40     INFO - PID 11136 | console.error: ({})
[task 2023-12-23T16:59:43.011Z] 16:59:43     INFO - PID 11136 | [ERROR firefox_on_glean::private::string] Unable to set string metric in non-main process. Ignoring.
[task 2023-12-23T16:59:44.079Z] 16:59:44     INFO - PID 11246 | console.error: ({})
[task 2023-12-23T16:59:44.337Z] 16:59:44     INFO - TEST-START | /eventsource/request-status-error.window.html

I've requested pernosco traces for this, but I think pernosco doesn't work with ASAN, so this may not work.

I noticed this is an ASAN no-fission run; my local testing was with fission. I'm retesting with ASAN no-fission

Note: we don't ship no-fission linux to users.

Set release status flags based on info from the regressing bug 1871736

To understand what happens, it's better to identify which test the leak is from, and enable RemoteDecode and PlatformDecoderModule LogModules, so we can get some logs from RemoteDecoderManager{Parent, Child} and RemoteMediaDataDecoder. Besides, adding a log in ~VideoDecoder might help to see if there is any pending video decoder in the wpt since some tests don't call .close() explicitly.

The log modules can be enabled by setting logging.RemoteDecode and logging.PlatformDecoderModule to 4.

(In reply to C.M.Chang[:chunmin] from comment #4)

To understand what happens, it's better to identify which test the leak is from

It looks like the leak is from videoFrame-createImageBitmap.https.any.js. If it's disabled:
https://treeherder.mozilla.org/jobs?repo=try&revision=42e1c3be923966b4ee08c81fed8596a997549659&selectedTaskRun=DPzosCrUQCaxKGmA9g0cyg.0
then the error is gone. Additionally, there is a leak log: Bug 1872311 comment #1

Besides, adding a log in ~VideoDecoder might help to see if there is any pending video decoder in the wpt since some tests don't call .close() explicitly.

Neither encoder.close() nor decoder.close() is called in videoFrame-createImageBitmap.https.any.js, maybe shutdown-ticket is not released and the ticket blocks the rdd shutdown?

The logs (enabled by these prefs) for the test is here: https://treeherder.mozilla.org/logviewer?job_id=442058969&repo=try&lineNumber=5158-5276

See Also: → 1872311
Attached file intermittent crash log
On my local ubuntu machine, running *videoFrame-createImageBitmap.https.any.js* crashes with "Texture deallocated too late during shutdown" error [1] intermittently. Although `videoFrame.close()` is not called in the test, calling `videoFrame.close()` doesn't solve this on my end. [1] https://searchfox.org/mozilla-central/rev/6321fb8f7533456a62dfa2aa68ee0477a6c8f693/gfx/layers/client/TextureClient.cpp#468

(In reply to C.M.Chang[:chunmin] from comment #7)

calling videoFrame.close() doesn't solve this on my end.

aha, there are two videoframes actually: https://searchfox.org/mozilla-central/rev/9c509b8feb28c1e76ad41e65bf9fd87ef672b00f/testing/web-platform/tests/webcodecs/videoFrame-createImageBitmap.https.any.js#15,19

The decoder's output frame needs to be closed as well. Calling both videoFrame.close() and frame.close() seems to solve the crash on my local machine. If this is the case, calling mResource.reset() somewhere should avoid the leak, presumably ~VideoFrame.

Try server results:


update: It looks like the one of the ~VideoFrame is not called in this case, so mResource.reset() need to be called somewhere else. I'll see if I can find another way (shutdown observer?) to release the mResource in that VideoFrame.

Assignee: nobody → cchang

This patch solves the issue on main thread on my end: https://github.com/ChunMinChang/gecko-dev/commit/8ed29e6555d4758b7c7d21ea1659657c79a9100b
I'll do the similar thing on worker case as well


update: this fixes the worker side as well:
https://github.com/ChunMinChang/gecko-dev/commit/2d637f9934de2087012a8251a5e60d6b69919ff8

Depends on D199178

Depends on D199179

This patch adds a mechanism to close VideoFrame when shutting down if
VideoFrame's Close() is never called. The image held in the VideoFrame
needs to be released so there won't be memory leaks.

Depends on D199180

Attachment #9375627 - Attachment description: WIP: Bug 1871768 - Add lifetime logs for VideoFrame → Bug 1871768 - Add lifetime logs for VideoFrame
Attachment #9375628 - Attachment description: WIP: Bug 1871768 - Add a log in VideoFrame::Close() → Bug 1871768 - Add a log in VideoFrame::Close()
Attachment #9375629 - Attachment description: WIP: Bug 1871768 - Add a log in VideoDecoder dtor → Bug 1871768 - Add a log in VideoDecoder dtor
Attachment #9375630 - Attachment description: WIP: Bug 1871768 - Close VideoFrame when shutting down if needed → Bug 1871768 - Close VideoFrame when shutting down if needed

The rework for https://phabricator.services.mozilla.com/D199181 is here: https://github.com/ChunMinChang/gecko-dev/commits/b1871768/
I'll submit it to phabricator once I tested it on the CI.

Attachment #9375630 - Attachment description: Bug 1871768 - Close VideoFrame when shutting down if needed → Bug 1871768 - Close VideoFrame if Close() is never called
Pushed by cchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/30bbdd6c6ce2 Add lifetime logs for VideoFrame r=padenot https://hg.mozilla.org/integration/autoland/rev/6a9336337948 Add a log in VideoFrame::Close() r=padenot https://hg.mozilla.org/integration/autoland/rev/52031505d893 Add a log in VideoDecoder dtor r=padenot https://hg.mozilla.org/integration/autoland/rev/7a5d29795917 Close VideoFrame if Close() is never called r=media-playback-reviewers,padenot,smaug

The patch landed in nightly and beta is affected.
:chunmin, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox123 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(cchang)

WebCodecs is only enabled in the Nightly, so it's ok to leave 123 as it is.

Flags: needinfo?(cchang)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: