684711 - Don't add objects that are used off of the main thread to the cycle collector

Assignee

Description

•

14 years ago

I stumbled on this while debugging bug 675605. In that case an nsXPCWrappedJS is added to the cycle collector and is then released by a non-main thread while we're running cycle collection. So part of this is bug 464678, but we're also adding XPCWrappedNatives that are not marked main-thread-only. This probably causes a number of crashes similar to bug 675605, when the cycle collector tries to traverse dangling pointers. I have a patch that fixes this, but it causes leaks. We apparently already rely on cycle collecting these objects to fix leaks :-(.

Andrew McCreight [:mccr8]

Comment 1

•

14 years ago

Maybe when you find a non-main thread nsXPCWrappedJS during cycle collection, you could throw it into a ref-counted array, which will addref it, and keep it alive during collection. You'd have to give it an extra +1 to the number of found references. Then, at the end of a CC, you throw away the extra array, which would drop the extra ref count. Kind of a hack. Is the addref stuff threadsafe? Hopefully...

Bobby Holley (:bholley)

Comment 2

•

14 years ago

Luke thinks that we'll have single-threaded JSRuntimes real soon now, and thinks that enforcing single-threaded XPConnect might be the next step. Is that a reasonable goal? I'd love to work on it, and it sounds like it would clear up nasty bugs like this one (in addition to simplifying a lot of code).

Peter Van der Beken [:peterv]

Assignee

Comment 3

•

14 years ago

(In reply to Andrew McCreight [:mccr8] from comment #1) > Maybe when you find a non-main thread nsXPCWrappedJS during cycle > collection, you could throw it into a ref-counted array, which will addref > it, and keep it alive during collection. You'd have to give it an extra +1 > to the number of found references. Then, at the end of a CC, you throw away > the extra array, which would drop the extra ref count. The problem is that the traversal info will potentially be wrong, since the refcount can change after we've traversed but before we try to unlink.

Andrew McCreight [:mccr8]

Comment 4

•

14 years ago

Hmm, yeah I forgot about that. The original cycle collection paper that the Firefox algorithm is based on deals with that by rechecking once you find some garbage. But that rechecking can also fail, somehow. And there was a lot more complexity that I never really tried to understand, so this probably is not a rabbit hole we want to go down.

Peter Van der Beken [:peterv]

Assignee

Comment 5

•

14 years ago

Looks like bug 674597 might help make this easier to fix.

Depends on: 674597

Peter Van der Beken [:peterv]

Assignee

Updated

•

14 years ago

Depends on: 684938

Luke Wagner [:luke]

Comment 6

•

14 years ago

Actually, xpcom proxies are almost gone.

Depends on: 675221
No longer depends on: 674597

Peter Van der Beken [:peterv]

Assignee

Comment 7

•

14 years ago

There are other ways than using proxies to hit this though.

Luke Wagner [:luke]

Comment 8

•

14 years ago

(In reply to Peter Van der Beken [:peterv] from comment #7) Sure. I was just commenting on why I changed the dependency from bug 674597 to bug 675221.

Daniel Veditz [:dveditz]

Comment 9

•

14 years ago

guessing sg:moderate because the timing dependency would make this an unreliable exploit, but is it common enough that a frame trying the same thing over and over again might have reasonable certainty? If so this might be sg:critical

Whiteboard: [sg:moderate]

Andrew McCreight [:mccr8]

Updated

•

14 years ago

Blocks: 686267, 635480

Andrew McCreight [:mccr8]

Comment 10

•

14 years ago

The bugs I added have stacks with nsXPCWrappedJS::cycleCollection::Traverse that calls into nsXPCOMCycleCollectionParticipant::CheckForRightISupports, that calls into some random unrelated function, or crashes with an invalid address. Kind of weird that it seems to often sink into SVG code. These may not all be related, but they look suspicious. Bug 604865 is another instance of this, but has been resolved invalid, because the random code that CheckForRightISupports was crashing in does not exist any more. Bug 633557, bug 547987 and bug 590721 also look a little suspicious, but the call stacks are not obviously mangled: there's a nsXPCWrappedJS::cycleCollection::Traverse that goes to a NoteXPCOMChild that crashes with a bad access. Bug 607867 and bug 505016 involve stacks that look similar to that.

Blocks: 542078, 602772, 604176

Andrew McCreight [:mccr8]

Comment 11

•

14 years ago

Peter talked about one line of attack for fixing this would involve marking off-thread objects black or gray as you go, in XPCJSRuntime::TraceJS. This would involve switching colors back and forth. Unfortunately, Bill pointed out that this isn't sound. If you have an object x reachable from both a gray and a black object, and you visit the gray one first, then x will end up getting colored gray, which is wrong. Generally speaking, it is only safe to switch colors to gray once per GC. I'd forgotten about that when I was talking to Peter. This could be worked around by scanning the objects twice: once looking for things you want to mark black, and once for things you want to mark gray. Kind of gross. This could be made slightly less bad by creating a list of gray objects to scan later while looking for black objects.

Andrew McCreight [:mccr8]

Comment 12

•

14 years ago

(Bill is planning to eventually split up the XPConnect root marking into separate gray and black callbacks, so the color switching will all be done on the GC side, which will make the color switching stuff a little less footgun-y.)

Daniel Veditz [:dveditz]

Updated

•

13 years ago

Group: core-security

Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]]

Updated

•

13 years ago

Keywords: sec-moderate

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 13

•

13 years ago

Is this still a problem?

Andrew McCreight [:mccr8]

Comment 14

•

13 years ago

JSRuntime should be single threaded now, so hopefully not...

Andrew McCreight [:mccr8]

Comment 15

•

12 years ago

Peter, is this still a problem given that we only allow XPCWJS on the main thread?

Flags: needinfo?(peterv)

Bobby Holley (:bholley)

Comment 16

•

12 years ago

Sure shouldn't be...

Andrew McCreight [:mccr8]

Updated

•

12 years ago

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → WORKSFORME

Andrew McCreight [:mccr8]

Updated

•

12 years ago

Flags: needinfo?(peterv)

Bugzilla

Don't add objects that are used off of the main thread to the cycle collector

Categories

(Core :: XPConnect, defect)

Tracking

()

People

(Reporter: peterv, Assigned: peterv)

References

Details

(Keywords: sec-moderate, Whiteboard: [sg:moderate])

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Comment 10

Comment 11

Comment 12

Updated

Updated

Comment 13

Comment 14

Comment 15

Comment 16

Updated

Updated