1.39 KB, patch
|Details | Diff | Splinter Review|
6.75 KB, patch
|Details | Diff | Splinter Review|
xpc::StringToJsval uses some static variables. When used from multiple threads this causes problems. Here's the stack: xul.dll!StringBufferToJSVal - xpcpublic.h:214 xul.dll!ReadableToJSVal - xpcstring.cpp:71 xul.dll!NonVoidStringToJsval - xpcquickstubs.cpp:791 xul.dll!NonVoidStringToJsval - xpcpublic.h:291 xul.dll!get_responseText - xmlhttprequestbinding.cpp:2492 xul.dll!genericGetter - xmlhttprequestbinding.cpp:2769 mozjs.dll!CallJSNative - jscntxtinlines.h:218 mozjs.dll!Invoke - interpreter.cpp:489 mozjs.dll!Invoke - interpreter.cpp:539 mozjs.dll!InvokeGetterOrSetter - interpreter.cpp:610 mozjs.dll!get - shape-inl.h:256 mozjs.dll!NativeGetInline<1> - jsobj.cpp:4029 mozjs.dll!GetPropertyHelperInline<1> - jsobj.cpp:4200 mozjs.dll!GetPropertyHelper - jsobj.cpp:4209 mozjs.dll!GetPropertyOperation - interpreter.cpp:292 mozjs.dll!Interpret - interpreter.cpp:2323 mozjs.dll!RunScript - interpreter.cpp:446 mozjs.dll!ExecuteKernel - interpreter.cpp:630 mozjs.dll!Execute - interpreter.cpp:667 mozjs.dll!Evaluate - jsapi.cpp:5156 xul.dll!WorkerRun - scriptloader.cpp:660 xul.dll!Run - workerprivate.cpp:1705 xul.dll!RunSyncLoop - workerprivate.cpp:3612 xul.dll!LoadAllScripts - scriptloader.cpp:717 xul.dll!Load - scriptloader.cpp:945 xul.dll!ImportScripts - workerscope.cpp:419 mozjs.dll!CallJSNative - jscntxtinlines.h:218 mozjs.dll!Invoke - interpreter.cpp:489 mozjs.dll!Interpret - interpreter.cpp:2484 mozjs.dll!RunScript - interpreter.cpp:446 mozjs.dll!ExecuteKernel - interpreter.cpp:630 mozjs.dll!Execute - interpreter.cpp:667 mozjs.dll!Evaluate - jsapi.cpp:5156 Marking s-s until we know otherwise.
Yeah, this is bad. Is there a convenient per-thread data structure we could stick this cache in or something?
I asked about that in #jsapi last week and apparently there isn't anything free in JSRuntime or Zone or such, but such could be added.
Something on a Zone would be ideal, actually. Right now StringBufferToJSVal has to check that the cached-string zone matches the context zone, but if I could just get the cached buffer of the zone directly that would rock. That would still make it easy to clear the cache when strings are finalized, because I should be able to get to the Zone from the finalizer...
Seems fine to add a jsfriendapi.h js::(Get|Set)ZoneUserData for now. If more cases like this come up, we'd probably want to be able to hang a general xpc::ZonePrivate struct off of the JS Zone (w/ hooks to create/destroy), but that seems like overkill atm.
(In reply to Luke Wagner [:luke] from comment #4) > Seems fine to add a jsfriendapi.h js::(Get|Set)ZoneUserData for now. If > more cases like this come up, we'd probably want to be able to hang a > general xpc::ZonePrivate struct off of the JS Zone (w/ hooks to > create/destroy), but that seems like overkill atm. If this is going to be used on multiple threads, I really don't want to get xpconnect involved.
Zones can't migrate between runtimes, so presumably the ZoneUserData would be used for per-thread caching, so it should be okay for XPConnect to be involved. Setting to high because it is probably difficult to exploit, but feel free to raise to critical if you want.
(In reply to Andrew McCreight [:mccr8] from comment #6) > Zones can't migrate between runtimes, so presumably the ZoneUserData would > be used for per-thread caching, so it should be okay for XPConnect to be > involved. Yes, I was talking about luke's suggestion for storing this information in a generalized xpc::ZonePrivate.
(In reply to Bobby Holley (:bholley) from comment #7) > Yes, I was talking about luke's suggestion for storing this information in a > generalized xpc::ZonePrivate. ... which would also hang off the zone (via a pointer in the Zone) and be only used on the JSRuntime's owner thread (so, off the Gecko main thread for workers). Maybe your point is just that xpc:: is the wrong namespace ;)
(In reply to Luke Wagner [:luke] from comment #8) > ... which would also hang off the zone (via a pointer in the Zone) and be > only used on the JSRuntime's owner thread (so, off the Gecko main thread for > workers). Maybe your point is just that xpc:: is the wrong namespace ;) Precisely. If it's general information, it should go in namespace js. If the values are embedder-dependent, we should have different machinery in xpc and mozilla::dom::workers.
I think having a mozilla::dom::ZoneUserData would be fine. ;)
(In reply to Boris Zbarsky [:bz] from comment #10) > I think having a mozilla::dom::ZoneUserData would be fine. ;) Yeah, that works too.
If XPConnect is intended to be mainthread-only, maybe it makes sense to move all of these string functions into some other namespace.
(In reply to Andrew McCreight [:mccr8] from comment #12) > If XPConnect is intended to be mainthread-only, maybe it makes sense to move > all of these string functions into some other namespace. Agreed. I have a heart attack every time I see "xpc" in dom/workers.
This looks easy enough, I'll prototype a JS API and run it past Luke or Bill.
Assignee: nobody → continuation
I noticed another problem while working on this. If, at the start of a GC, we have an otherwise unreachable string in the cache, and during marking we hand it out, we can end up with a dead pointer in a live object, which can lead to a use-after-free. The case where we allocate a string is safe, because we will never throw out a string allocated during the GC. The code currently deals with an equivalent problem during sweeping by calling ClearCache() during sweeping, so I can just fix this by moving ClearCache() to the start of GC rather than the start of sweeping.
For some reason, after running mochitest browser Chrome for about 10 minutes, it pretty consistently ends up pulling a dead JSString out of the cache.
Attachment #810132 - Attachment is obsolete: true
Thanks to Bill for spending a bunch of time helping me figure out why browser-chrome was hitting a use-after-free after ten minutes. If the browser starts a GC, then puts something in the cache, then gets a reset GC, then the cache purging callback isn't triggered, and the string doesn't end up marked, so the GC will collect it, then it can get handed out again. So, just a bug in my patch, not an existing GC problem.
This whole thing is turning into kind of a mess. I wonder how bad it would be to just disable this cache for workers, for backporting.
Having the cache just for main thread sounds ok to me.
Why would that be ok? We're trying to push more things on workers, and that will be much harder if they are fundamentally slower...
Sure, for trunk (and whatever is the equivalent of b2g trunk) I'm going to fix it for real, with per-zone caching. I guess none of the patches are particularly complex, so they shouldn't be very high risk, there's just a lot of them. I'm concerned the most about bug 924706, which reworks how an infrequent GC is done.
Bug 928066 looks like it is hitting this reproducibly, on B2G, with Ringmark.
Bug 924525 might be another case.
Comment on attachment 819854 [details] [diff] [review] only use the string cache on the main thread, at least semi-works I can confirm that applying this patch fixes bug 924525.
Have you tested the performance impact of this NS_IsMainThread() call on the hot path here?
No. Do you have a suggestion for a microbenchmark that could test that? For the bindings case, it looks like we codegen a version of eg get_responseText for workers and for non-workers, so we could add a template argument IsWorker or whatever and avoid any dynamic overhead. I'm not sure if that's possible in the 10 or so places that call this method outside of bindings generated code, or if those places are more or less perf sensitive.
We could just change the codegen to call a different function for worker string args.
> No. Do you have a suggestion for a microbenchmark that could test that? Sure. getAttribute is a good testcase, when the attribute has a value. Worth testing what the numbers look like without the patch, with the NS_IsMainThread() call, and with that call replaced by "false" just to see a baseline... > We could just change the codegen to call a different function for worker string args. No, because some bindings (e.g. EventTarget) are the same on workers and non-workers, and long term they all will be, no?
(In reply to On vacation Oct 12 - Oct 27 from comment #35) > No, because some bindings (e.g. EventTarget) are the same on workers and > non-workers, and long term they all will be, no? Actually, I wasn't sure about that. I mean, we could have two bindings that are generated that have the same methods in most cases, but not in all, right?
We don't have plans to ever converge the bindings for XHR, though I suppose we could if we needed to by introducing some sort of interface class abstraction.
We can generate two separate EventTarget bindings, but we don't particularly want to. In any case, I assume the main-thread-check bit is just for branches; I guess we could munge the codegen a bit for branches if we have to...
I tried out a microbenchmark I found http://jsperf.com/demo-of-microbenchmark-silliness that happens to be by Boris (top search result on Google for "getAttribute microbenchmark"!). The results are not great. In a regular opt build, I get about 22.3 million ops/sec. With the main thread checks, I get about 17.6 million ops/sec, so something like 20% slower. With the cache disabled entirely, I get about 14 million ops/sec. I'll try some tests with per-zone caching.
Attachment #810848 - Attachment is obsolete: true
Attachment #810850 - Attachment is obsolete: true
With the zone cache patch, I'm seeing 20.5 million ops/sec. On the fast path, there's an additional function call, JS_GetZoneUserData, and the additional indirection of reading data out of a struct. I could maybe reduce the former by moving the zone data into the shadow zone, and inline JS_GetZoneUserData.
Well, with further testing, the zone cache is about the same speed with my patch as the existing code, maybe a few % slower. Inlining the zone data access seems to close the gap. I tried caching the zone of the string, but that didn't seem to make it any faster. Just zone caching might avoid crashes we're seeing, but of course it won't help the raciness.
Boris, I assume the performance regressions in comment 39 for the patch with explicit main thread checks are unacceptable and I should pursue an alternate approach?
I only just now noticed the part in comment 3 where we don't need to check that the string is from the right zone because we're pulling from a per-zone cache. Oops. With this version of the patch, we get about the same performance on the getAttribute microbenchmark as without. I tried inlining JS_GetZoneUserData but that didn't seem to improve performance with this version. I'll attach that patch for posterity. I'm also working on a patch to disable the cache on workers without using NS_IsMainThread, by passing around a boolean that indicates if we're on workers or not. I'll see what kind of effect that has on performance on the main thread. As part of that, I'm planning to figure out in which version we started using this cache on workers, in the perhaps naive hope that we've only done it recently (eg not in b2g18). The initial landing of the string cache is bug 773520, Firefox 18. If b2g18 isn't affected by this problem, then I'm tempted to just use this version of the patch on branches. Bill said he's not too worried about regressions from his GC patch. Anyways, the regression from NS_IsMainThread is pretty bad, almost as bad as just disabling the cache, so I think we don't want to do that.
Comment on attachment 824390 [details] [diff] [review] alternate patch, pass in a boolean everywhere to indicate if you use the cache or not, WIP You could probably just define a 3-arg version that calls the 4-arg version with "true" and avoid most of the changes here, now that you've found all the bindings bits that use the function, right? >+ # I'm not sure what these are, but they don't seem to be workerAndNotWorker. They're descriptor providers as opposed to descriptors. Think dictionary codegen (for a worker or not). You want this to useCache = false, I believe.
Here's a list of files where we hit the provider case. I'll set them to not use the cache, as bz suggested. Hopefully nothing too critical in there. No elements.
Yeah, those are all dictionaries containing strings, basically.
Cleaned up a bit, including changes from bug 932505.
Attachment #824390 - Attachment is obsolete: true
Here's another approach. The goal is to have a version we are okay with on all branches. It uses per-zone caching, but does not require modifications to the GC. I still need to check the performance, but it is at least stable enough to run 40 minutes of Mochitests. Instead of clearing the caches at the beginning of a GC, this version uses a two-pronged approach: 1) First, if we read from the cache during the GC, we mark the string black. This handles the problem where a string is present only in the cache at the start of a GC, and we pull the string out during marking. This is hopefully inexpensive in the common case, when we're not actively GCing. (This case is not currently handled by the cache.) 2) Second, at the start of sweeping, we clear the caches. This is done currently, and handles the case where something is only present in the cache prior to sweeping, and it would get handed out during sweeping, when it becomes garbage. Because we now cache per-zone, and sweeping is done in groups that don't include all zones, I had to add a new callback that clears the cache of a single zone. We invoke that callback when we're sweeping the zone.
Linux64 debug run was green: https://tbpl.mozilla.org/?tree=Try&rev=d66269c58900 Performance was about the same as the other implementation of per-zone string caching, which is 0-10% slower than the current implementation.
Full try run was green: https://tbpl.mozilla.org/?tree=Try&rev=dcfdb01479f3 ...modulo some Win-debug build bustage that is fixed by bug 934990: https://tbpl.mozilla.org/?tree=Try&rev=9094f41d50c6
Comment on attachment 827057 [details] [diff] [review] part 1: implement a read barrier for GC things This is basically a hacked up version of ExposeGCThingToActiveJS, because we don't care about grayness for strings, and we're mostly concerned about whether something is alive at all. Hopefully the GGC parts make sense.
Attachment #827057 - Flags: review?(jcoppeard)
Attachment #827058 - Flags: review?(bzbarsky)
Comment on attachment 827057 [details] [diff] [review] part 1: implement a read barrier for GC things Review of attachment 827057 [details] [diff] [review]: ----------------------------------------------------------------- Looks good.
Attachment #827057 - Flags: review?(jcoppeard) → review+
Comment on attachment 827058 [details] [diff] [review] part 2: implement a per-zone string cache r=me. Thank you!
Attachment #827058 - Flags: review?(bzbarsky) → review+
Comment on attachment 827058 [details] [diff] [review] part 2: implement a per-zone string cache (this approval request is for both patches) [Security approval request comment] How easily could an exploit be constructed based on the patch? There are two problems here. One, the race condition, can probably be found by anybody with TSAN, so I don't think that's an issue in terms of discoverability from the patch. The other, the GC thing, isn't entirely obvious, but it is clear some GC stuff is being changed, which is a red flag. It depends on GC timing, so I'm not sure how easy it would be to actually make an exploit. Jesse hasn't fuzzed up a test case for this, with the additional GC control we have, which seems like a reasonably good sign for how hard it is to write an exploit. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. Which older supported branches are affected by this flaw? Everything. If not all supported branches, which bug introduced the flaw? Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? b2g18 will require some extra work, because it doesn't have bug 834877. But generally the code is fairly similar. How likely is this patch to cause regressions; how much testing does it need? This patch is a little tricky, but it is also code that is run a lot, so any regressions will hopefully be caught quickly.
Attachment #827058 - Flags: sec-approval?
Comment on attachment 827058 [details] [diff] [review] part 2: implement a per-zone string cache sec-approval+. Let's make Aurora, Beta, and ESR24 patches for this once it is in.
Attachment #827058 - Flags: sec-approval? → sec-approval+
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Can we get patches for uplift here (or noms on the existing patch if it applies?)
I started putting together backports, but I realized the fix here isn't quite right yet: in between when marking stops and an individual zone is swept, we can end up pulling dying objects out of the cache. I'll try to think of how to fix this.
(In reply to Andrew McCreight [:mccr8] from comment #63) > I started putting together backports, but I realized the fix here isn't > quite right yet: in between when marking stops and an individual zone is > swept, we can end up pulling dying objects out of the cache. I'll try to > think of how to fix this. Andrew, we're going to build our second-to-last Beta for FF26 on Monday, where is this at? What can we safely uplift or should we wontfix this for FF26 and get a proper fix into FF27?
Comment on attachment 827058 [details] [diff] [review] part 2: implement a per-zone string cache Yeah, I should just land this. I'll file a followup for whatever other problem I think there may be. [Approval Request Comment] Bug caused by (feature/regressing bug #): something last year User impact if declined: possible sec problems Testing completed (on m-c, etc.): it has been on m-c for a bit Risk to taking this patch (and alternatives if risky): shouldn't be too bad String or IDL/UUID changes made by this patch: none
(This also includes requesting approval for the various blocking bugs.)
Attachment #827058 - Flags: approval-mozilla-esr24?
Attachment #827058 - Flags: approval-mozilla-esr24+
Attachment #827058 - Flags: approval-mozilla-beta?
Attachment #827058 - Flags: approval-mozilla-beta+
Attachment #827058 - Flags: approval-mozilla-aurora?
Attachment #827058 - Flags: approval-mozilla-aurora+
Ryan, I'll land these patches myself. They require some tedious fixups for backporting. Unfortunately, my patches are on a machine that is currently not responding to requests due to the Great MV Power Shutdown of 2013, but hopefully I can retrieve the patches in an hour or so. If not, I'll start rebackporting.
https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/70e625ea860f https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/50f1542687a6 Andrew, can you please prepare patches for b2g18 as well? Being a sec-high and tracking an ESR release, the fix needs to land there as well unless wontfixed.
Oh, rats, I forgot that Zones didn't land until Fx22. It isn't really going to be possible to backport this as is. I'll have to think about that. Maybe a per-JSRuntime cache will work.
pre-zone, why not just a per-compartment cache?
(In reply to Boris Zbarsky [:bz] from comment #73) > pre-zone, why not just a per-compartment cache? Good idea. I suppose two pointers per compartment won't break the bank.
So, somehow or another way, on Esr24 I ended up only registering the new callbacks in XPCJSRuntime, which is bad (the other branches I've landed stuff for are post-CycleCollectedJSRuntime, and are ok). Kyle, where should I register JS engine callbacks on workers, prior to your CycleCollectedJSRuntime changes?
Never mind, it looks like it is CreateJSContextForWorker.
Minor followup to register the zone callbacks on workers in ESR24... https://hg.mozilla.org/releases/mozilla-esr24/rev/0457194b79d6
Bug file(s)? Test case(s)? Any other way to reproduce, test or verify?
Not really, unfortunately.
Thanks Andrew. Marking qa- based on this info.
I think this late in the b2g18 cycle we shouldn't fix it there. Given that I'd have to rewrite this patch from scratch, it would be risky, and I think it isn't as much of a problem on b2g18, as less things were converted to WebIDL there.
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.