Closed Bug 1675718 Opened 4 years ago Closed 3 years ago

JSRuntime leak MOZ_CRASH(mozilla::LinkedList<mozilla::dom::ContentParent>::~LinkedList() [T = mozilla::dom::ContentParent]

Categories

(Core :: DOM: Content Processes, defect, P3)

56 Branch
defect

RESOLVED DUPLICATE of bug 1677074

People

(Reporter: hdir.yassine, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [bugmon:confirm])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0

Steps to reproduce:

Bug found while fuzzing DOM
Build: with fuzzfetch --debug

[2020-11-06 09:12:44] Starting Grizzly Replay
[2020-11-06 09:12:44] Ignoring: log-limit, timeout
[2020-11-06 09:12:44] Repeat: 1, Minimum crashes: 1, Relaunch 1
[2020-11-06 09:12:44] Using prefs.js from testcase
[2020-11-06 09:12:48] Performing replay (1/1)...
[2020-11-06 09:12:48] Running test (1/1)...
[2020-11-06 09:13:10] Result: Hit MOZ_CRASH(mozilla::LinkedList<mozilla::dom::ContentParent>::~LinkedList() [T = mozilla::dom::ContentParent] has a buggy user: it should have removed all this list's elements before the list's destruction) at /builds/worker/workspace/obj-build/dist/include/mozilla/LinkedList.h:443 (267db2dc:00e76505)
[2020-11-06 09:13:10] Result successfully reproduced
[2020-11-06 09:13:10] Shutting down...
[2020-11-06 09:13:10] Done.

Actual results:

==10560==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7ffb4e7e71a2 bp 0x7ffd22ddc010 sp 0x7ffd22ddbff0 T10560)
==10560==The signal is caused by a WRITE memory access.
==10560==Hint: address points to the zero page.
#0 0x7ffb4e7e71a2 (/home/valentino/code/browsers/firefox/libxul.so+0x6f071a2)
#1 0x7ffb4e7a958b (/home/valentino/code/browsers/firefox/libxul.so+0x6ec958b)
#2 0x7ffb65f3e0f0 (/lib/x86_64-linux-gnu/libc.so.6+0x430f0)
#3 0x7ffb65f3e1e9 (/lib/x86_64-linux-gnu/libc.so.6+0x431e9)
#4 0x7ffb65f1cb9d (/lib/x86_64-linux-gnu/libc.so.6+0x21b9d)
#5 0x563aa9451749 (/home/valentino/code/browsers/firefox/firefox-bin+0x14749)

UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV (/home/valentino/code/browsers/firefox/libxul.so+0x6f071a2)
==10560==ABORTING

Blocks: grizzly

I'm now reducing the test case; I will upload it ASAP.

Moving this to a potential component. If this is not the right one, please move it to the correct one.

Component: Untriaged → DOM: Content Processes
Product: Firefox → Core

(In reply to hdir.yassine@gmailcom from comment #1)

I'm now reducing the test case; I will upload it ASAP.

Yassine, do you have a test case to reproduce this assertion failure?

Looks like a ContentParent is leaked until shutdown.
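
To illustrate why an object leaked until shutdown trips this kind of check, here is a minimal sketch of an intrusive list. It is not the actual mozilla::LinkedList code: `Node`, `IntrusiveList`, and `nonEmptyAtDestruction` are hypothetical names, and instead of crashing in a destructor the sketch returns a flag for the condition the MOZ_CRASH message describes (elements still in the list at destruction time).

```cpp
#include <cassert>

// Hypothetical, simplified intrusive list node (an assumption for this
// sketch; not mozilla::LinkedListElement).
struct Node {
  Node* prev = this;
  Node* next = this;

  Node() = default;
  Node(const Node&) = delete;
  Node& operator=(const Node&) = delete;

  // A node that points at itself is not in any list.
  bool isLinked() const { return next != this; }

  // Detach this node from the list it is currently in.
  void unlink() {
    prev->next = next;
    next->prev = prev;
    prev = next = this;
  }
};

// Hypothetical circular list built around a sentinel node.
class IntrusiveList {
 public:
  void pushBack(Node& n) {
    assert(!n.isLinked());
    n.prev = sentinel_.prev;
    n.next = &sentinel_;
    sentinel_.prev->next = &n;
    sentinel_.prev = &n;
  }

  bool isEmpty() const { return !sentinel_.isLinked(); }

 private:
  Node sentinel_;
};

// Returns true when the list would still hold an element at the point where
// it is about to be destroyed, which is the state the crash message reports.
bool nonEmptyAtDestruction(bool removeBeforeShutdown) {
  IntrusiveList list;
  Node contentParentStandIn;  // stands in for a ContentParent
  list.pushBack(contentParentStandIn);

  if (removeBeforeShutdown) {
    contentParentStandIn.unlink();  // normal teardown path
  }

  bool stillLinked = !list.isEmpty();

  // Tidy up so the sketch itself never leaves dangling links behind.
  if (contentParentStandIn.isLinked()) {
    contentParentStandIn.unlink();
  }
  return stillLinked;
}
```

Calling `nonEmptyAtDestruction(false)` models the leaked case and reports a non-empty list, while `nonEmptyAtDestruction(true)` models an element that was removed in time.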

I'm leaving this bug in the "needs triage" state as a reminder for Nika to look at this bug. :)

Flags: needinfo?(hdir.yassine)
Attached file testcase.zip

Yes, I attached the test case.
Regards

Flags: needinfo?(hdir.yassine)

ni? for myself so I look into this when I find the time

Flags: needinfo?(nika)

Looking at the log from the attached testcase, it definitely seems like there's some sort of big leak here. The log mentions:

WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME.  FIX THIS!

The crash here is probably just a symptom of that leak. Unfortunately the test case is rather large, so it may be a bit tricky to figure out why it's leaking. If it reproduces reliably, perhaps :mccr8 could get something useful out of the CC/GC logs?

Flags: needinfo?(nika) → needinfo?(continuation)

We don't know if this is really a DOM Content Processes leak. Should we clear our ContentParent bindings on shutdown?
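
As a rough sketch of what clearing the bookkeeping on shutdown could look like, the example below uses a hypothetical `ProcessRegistry` with a `clearOnShutdown` hook. None of these names come from the Firefox code base, and the real ContentParent tracking is more involved. Note the limitation: emptying the list this way would only avoid the destructor check. Any objects still registered at that point would remain leaked, so the returned count is best read as a list of leak suspects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tracked process object.
struct Process {
  bool registered = false;
};

// Hypothetical registry of live processes (an assumption for this sketch).
class ProcessRegistry {
 public:
  void add(Process& p) {
    p.registered = true;
    live_.push_back(&p);
  }

  // Shutdown hook: drop every remaining entry so the registry is empty
  // before it is destroyed. Returns how many entries were still registered.
  std::size_t clearOnShutdown() {
    std::size_t remaining = live_.size();
    for (Process* p : live_) {
      p->registered = false;
    }
    live_.clear();
    return remaining;
  }

  bool isEmpty() const { return live_.empty(); }

 private:
  std::vector<Process*> live_;
};

// Registers `count` processes that are never removed, then runs the
// shutdown hook and returns the number of entries it had to drop.
std::size_t simulateShutdown(std::size_t count) {
  std::vector<Process> processes(count);
  ProcessRegistry registry;
  for (Process& p : processes) {
    registry.add(p);
  }
  std::size_t dropped = registry.clearOnShutdown();
  assert(registry.isEmpty());
  return dropped;
}
```

For example, `simulateShutdown(3)` leaves the registry empty and reports three entries that were never removed through the normal path.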

Severity: -- → S4
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: Hit MOZ_CRASH(mozilla::LinkedList<mozilla::dom::ContentParent>::~LinkedList() [T = mozilla::dom::ContentParent] → JSRuntime leak MOZ_CRASH(mozilla::LinkedList<mozilla::dom::ContentParent>::~LinkedList() [T = mozilla::dom::ContentParent]

I'm unlikely to have time to dig into a leak with a giant test case like this.

Flags: needinfo?(continuation)
Keywords: bugmon
Whiteboard: [bugmon:confirm]

Bugmon Analysis
Bugmon was unable to identify a testcase that reproduces this issue.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon

(In reply to Bugmon [:jkratzer for issues] from comment #9)

Bugmon Analysis
Bugmon was unable to identify a testcase that reproduces this issue.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Hi Jason, does this mean Bugmon was not able to understand the attached testcase.zip?

Flags: needinfo?(jkratzer)

Jens, that's correct. Bugmon stumbled on the nested subdirectories. I'm triggering Bugmon manually and have a fix in place. The results should be posted here in a few minutes.

Flags: needinfo?(jkratzer)

I'm unable to reproduce the issue using the attached testcase with the oldest build available on Taskcluster (20201119-36ef6c97da5b). Could this be the same issue as reported in bug 1677074? The testcase there does trigger the issue on 20201119-36ef6c97da5b but not on tip.

(In reply to Jason Kratzer [:jkratzer] from comment #12)

I'm unable to reproduce the issue using the attached testcase with the oldest build available on Taskcluster (20201119-36ef6c97da5b). Could this be the same issue as reported in bug 1677074? The testcase there does trigger the issue on 20201119-36ef6c97da5b but not on tip.

Seems similar, yes. And if it no longer triggers here either, even more so.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE