Closed Bug 50602 Opened 25 years ago Closed 24 years ago

xpidl/xpconnect needs support to share refcounted strings

Categories

(Core :: XPConnect, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9

People

(Reporter: jband_mozilla, Assigned: shaver)

References

Details

Attachments

(14 files)

4.16 KB, patch
Details | Diff | Splinter Review
4.88 KB, patch
Details | Diff | Splinter Review
6.42 KB, patch
Details | Diff | Splinter Review
1.32 KB, patch
Details | Diff | Splinter Review
7.71 KB, patch
Details | Diff | Splinter Review
8.89 KB, patch
Details | Diff | Splinter Review
8.50 KB, patch
Details | Diff | Splinter Review
11.74 KB, patch
Details | Diff | Splinter Review
26.16 KB, patch
Details | Diff | Splinter Review
44.37 KB, patch
Details | Diff | Splinter Review
44.70 KB, patch
Details | Diff | Splinter Review
44.71 KB, patch
Details | Diff | Splinter Review
43.65 KB, patch
Details | Diff | Splinter Review
42.69 KB, patch
Details | Diff | Splinter Review
I talked with scc about sharing string data without copying between native and JS. Interestings points... - Scott's strings will expose a refcouting interface and immutable representation. - xpidl and our typelib system will need support to callout these string interfaces as special types - We only care about 16bit strings (as far as JS goes) - We can build a wrapper around JSString objects which implements scc's interfaces. - Going the other dirrection we want to be able to make JSStrings share the underlying data owned in the refcounted object. I think we don't want to change the struct layout of JSString. I proposed that we could add an out of band system to track JSStrings which are finalized by some means other than free of their buffer. This means that at JSString finalization time we'd incur a hashtable lookup for each string slated for removal. Is this acceptable? I don't see that there is a bit to spare in JSString that might not break some consumer. Maybe these 'shared' JSStrings live in a special arena and their specialness is easily detected by their address? Better plan, anyone? - python et. al. will want to be able to use this sharing too. issues... - I've been assuming that JSStrings are immutable. I'm wondering if this is strictly true. They are immutable in the language, but is this guarenteed in the api? I note that JS_GetStringChars returns "jschar *" with no mention of const. - Is fully safe sharing possible? Is potentially slightly unsafe sharing acceptable? - To add this new class of noxpcom but not exactly 'native' interfaces to xpidl we need to be fully clear of the ownership model. 'in' uses are clear. But 'out' uses are not so clear. As I understand it, getting and invoking the refcounting interface is potentially expensive.
We should appropriate GCX_DECIMAL (see jsgc.h) and call it GCX_STRING_PLUGIN or some such, and expose an API for setting the GC finalizer (indexed by the thing type-code). I blew it years ago when constipating, and didn't make JS_GetStringBytes return a const char *. JS_GetStringChars, which may have come after constipation, just followed blindly. Sorry; I think we should fix these oversights, but API users will pay with warnings (or errors? Ugh). Maybe an #ifdef in the API that we enable. Comments? /be
Status: NEW → ASSIGNED
That patch does not make the static gc_finalizers array, or at least its 4th element, be per-runtime. You call JS_SetExternalStringGCFinalizer to set the one true external string finalizer for all runtimes. Bug or feature? Right now, minimal change. /be
Brendan, This is very cool. Thanks. But, I think we are going to want more than one class of external strings. *Right now* xpconnect can save a strdup by writing a finalizer that uses nsMemory::Free for wstring cases. But coming soon we'll have recounted strings too and they'll need their own finalizer. It would be nice to have both flavors rather than sacrificing one for the other. It looks like you could widen this mask by a bit, no?
Brendan, If you buy the idea that we can have a range of external string types (maybe 4 or 5) and that JS_SetExternalStringGCFinalizer should allow you to set the particular type index, then it would be good to have JS_GetExternalStringGCFinalizer too so that callers can dynamically 'allocate' these indexes without side effects as they hunt for a free one.
It's imperative that these strings be JSStrings. In which case, for the refcounted ones, how are you going to find the ref-count and other "control" information from the jschar *chars member of the JSString? Don't tell me by playing BSTR-like tricks. /be
I figure you'll need a hash table mapping JSString* to the external string's ref-count-containing metadata structure, anyway. In that event, why would you need more than one external string type? That is, couldn't you put the logic in the API consumer, keyed by data stored in each hash table entry? Ok, maybe a really large, modular system would want discrete types to avoid having to layer or wrap, and forward calls to, the GC finalizer returned by JS_SetExternalStringGCFinalizer (if non-null). But WIB. For now, I don't see the need for 5, 4, 3, or only 2 external string finalizers instead of 1. What am I missing? /be
In case I wasn't clear in writing: > Ok, maybe a really large, modular system would want discrete types to avoid > having to layer or wrap, and forward calls to, the GC finalizer returned by > JS_SetExternalStringGCFinalizer (if non-null). I'm referring to the fact that JS_SetExternalStringGCFinalizer, in the patch and a la a bunch of other JS APIs like JS_SetErrorReporter, JS_SetGCCallback, etc., returns the old value of the external string GC finalizer. So your XPConnect code could look like this: JSStringFinalizeOp gSavedStringFinalizer = nsnull; . . . // startup, single threaded, GC can't be running here gSavedStringFinalizer = JS_SetExternalStringGCFinalizer(xpcXStrFinalizer); . . . JS_STATIC_DLL_CALLBACK(void) xpcXStrFinalizer(JSContext *cx, JSString *str) { if (str is in our hash table) { do your magic with scc's ref-counting stuff. . . } else { gSavedStringFinalizer(cx, str); } } Anyone else using JS_SetExternalStringGCFinalizer should do likewise. The extra layers incur extra hash table lookups only. /be
Your patches look fine to me. r=jband. Chaining with this simple system is messy if one also want to be able to unchain. But, I was not concerned with that. My real issue was that we are likely to want one external type that simply calls nsMemory::Free for all instances of the type (for 'out' wstring cases) and another type where a hashtable lookup is required to get to the actual native interface on which we can call release (for refcounted readable string interfaces). The thing I was trying to do was avoid doing a hashtable lookup just to discover that the object is of the first type and required no per-instance information in order to acomplish a Free. For now your code helps a lot. But when we have two types then we'll be paying a hashtable lookup for all external finalizations. If xpconnect is the only client then the logic could be: if( in hashtable ) call release on object; else call generic free. But, this monopolizes the external string system; i.e. if we want to chain then we'd *also* have to store an entry in the hashtable even for the generic Free cases. I'm OK with worse is better, but if we *can* extend the use of the bits in jsgc then we can avoid lots of unnecessary hashtable entries *and* lookups. FWIW, I did some simple instrumentation of the use of 'out' wstrings where this additional copy of the chars into JSStrings can be avoided. I see that the count gets into the thousands when I do a lot of mail or editor activity. This will be a good change.
Ok, gimme an r=, and don't be stealing any more lock bits! /be
Should there be a JS_RemoveExternalStringFinalizer too? It could be implemented in a lock-free way because it's only searching for a matching function pointer and then storing null. So long as one thread at a time ever calls it with a given finalizer pointer, there shouldn't be any bad races between removes and adds. Right? /be
I remembered beard pointing out some early (circa 1996, when JSRef was born) slate cleaning paranoia: js_FinalizeDouble. Why "finalize" a flat GC-thing like a double? js_FinalizeDouble sets the double to NaN, just to help when debugging the GC and staring at things in the heap. I should have made this #ifdef DEBUG, and have now: JSBool js_InitGC(JSRuntime *rt, uint32 maxbytes) { if (!gc_finalizers[GCX_OBJECT]) { gc_finalizers[GCX_OBJECT] = (GCFinalizeOp)js_FinalizeObject; gc_finalizers[GCX_STRING] = (GCFinalizeOp)js_FinalizeString; #ifdef DEBUG gc_finalizers[GCX_DOUBLE] = (GCFinalizeOp)js_FinalizeDouble; #endif } Hoping to check in soon. /be
r=jband. Looks good. Runs w/o crash for me (always a plus). I like not hardcoding the indexes for callers. We can always steal some bits back in the future if a better use is found. I anticipate using two slots - maybe three. The bit twiddling in jsgc.h got more opaque to potential maintainers with this change. Some comments might help. But I'm not worried. The thread warnings sound serious (ooh!). xpconnect will have to plant these callbacks in the runtime service rather than waiting for an xpconnect caller. This reminds me just how much the xpcom system monopolizes the JS engine. we demand that all interesting stuff happens on our runtime and we even call JS_ShutDown when we are done. This is going to make someone somewhere unhappy one day. I'm also thinking that when I start using this we increase the chance of hard to trace crashes. Now if someone passes an 'out' wstring without making a copy then then this is caught in the Free in xpconnect (after the copy into the JSString) while we are cleaning up after the call to the culprit. This can make it failry easy to figure out which method made the mistake if the C runtime screams at us during the unexpected call the Free. In the future this is going to happen sometime later when the JSString is finalized. We may have no way to trace this back. Also, we increase the chance that a badly written component will either change the underlying string char data or deallocate it underneath us while JS is still using the JSString. We knew this, of course, but we'll need to know to look in the right direction when these bugs pop up (and others think the JS engine is implicated). Thanks!
I tend to think that apps requiring more than one runtime are broken, but classic mozilla was that way. Hmm. The GCX_NTYPES_LOG2 addition actually removes repeated magic numbers in the jsgc.h macros, consolidating things so that we can add more types in the future. But my comments do not belabor this bit-magic; I grew up on HAKMEM. Your points about harder debugging and likelier blame-JS-or-XPConnect diagnoses are well-taken. Maybe we should associate the stack of the method returning the out param with the hash table entry (if any) for the handed-off string. More and more I think the callsite tree code in nsTraceMalloc.c should be available to everyone who wants to remember a stack backtrace using a serial number. Thanks, and let me know if there's more JS could do to help. /be
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Reopening. This was not just about the new support in the JS engine. There is plenty more to do to get sharing of refcounted strings working.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Duh! Sorry -- it had to happen, you and I have been trading bugs too much. /be
Status: REOPENED → ASSIGNED
BTW, it's gonna be a nightmare making JS_GetStringChars and JS_GetStringBytes return types be const. I would use a conditionally defined JS_CONST macro, but how to turn it on? Marauding through XPConnect doesn't look too bad, but the DOM looks like work (see dom/src/base/nsGlobalWindow.cpp's WinHasOption, which temporarily NUL-terminates substrings inside of deflated JS strings). /be
Yuk. For my purposes here I'm not worried about the 8bit strings - though it sounds like any code that is fiddling with them is really broken. Also, It turns out that my instrumentation was faulty when I claimed large numbers of 'out' wstrings. Stupid me, I put the counter in a place that was catching 'out' wstrings *and* the much more frequent 'in' wstrings. The real numbers are in the low hundreds of calls and tens of K of bytes after a few minutes of mail use. Editor and browser don't seem to make extensive use of 'out' wstrings. It is really the projected use by the future DOM and the string interfaces that the savings *ought* to kick in.
I'm signed up to finish the XPConnect work for this, but I don't see the interfaces I'll need in string/public, and there's no blocker bug marked, and I can't find such a bug to track. What do I need to get started on this?
Assignee: jband → shaver
Status: ASSIGNED → NEW
scc showed me the light. I'm on it.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
I haven't tested those changes yet, but since I can't find any implementation of GetSharedBufferHandle that doesn't just return 0, I don't think it'll go well anyway. scc: am I missing something? I just see http://lxr.mozilla.org/seamonkey/source/string/public/nsPrivateSharableString.h#55 and jag doesn't see anything else on the string branch.
It'll be obvious to even the most casual observer that this won't work, but it compiles! It has my first hack at the readable JSString wrapper, including a SharedBufferHandle interface, and some changes to js/src: including AddNamedRootRT, which this patch uses. I'll hack more on it tomorrow, but I wanted to put something up that people could make fun of in the meantime. I'm going to need some guidance on how to navigate the useAllocator stuff, I think.
Should we not CHECK_REQUEST inside the stubs (JS_PropertyStub, etc.)? Don't force a bogus cx param to js_AddRootRT; do test its return value in js_AddRoot and if false, report out-of-memory there. This avoids extra actual arg overhead for all calls, and the (trivial, but ugly) (!ok && cx) test in js_AddRootRT for all the calls that come from JS_AddNamedRootRT. /be
I'm not seeing justification for whacking all the CHECK_REQUESTs. What do they have to do with this bug? I'd like to see a string length threshold mechanism below which XPCJSReadableStringWrapper::GetSharedBufferHandle fails and let's the caller fallback and make a copy. We could tune the threshold when we have numbers to make an informed decision. We might be in the same boat with ReadableToJSString since we are paying the cost of a hastable entry. If this thing is going to live in xpconnect then you really don't need to go to the service manager to get the JSRuntimeService. I know I'm guilty of piling on in xpcconvert.cpp. But, this might be a good bunch of stuff to move to a new file. Then again, that would require you to declare a class or two in xpcprivate.h and maybe even not inline everything (ooh, real C++!). Man, that hashtable is killing me. Also, there's nothing here to keep us from making wrappers around wrappers, is there? It sucks to make a new JS_NewExternalString if the nsAReadableString is really a XPCJSReadableStringWrapper with a mStr just waiting to be snapped up. Does scc's stuff open *any* avenue for discovering the implementation class? On the whole though, I think I'm OK with the general scheme. Are you going to deal with the 'null' issue for DOMString too?
Sorry, the CHECK_REQUEST changes are for 72552 (overzealous CHECK_REQUEST calling in jsapi.c) and just happened to be in my tree alongside the AddNamedRootRT change, which I needed for this patch. I can take them out of future patches if you'd prefer. Do you want XPCJSReadableStringWrapper to make the no-sharing threshold decision, or should the caller know how relatively valuable the sharing is vs. another malloc and a hash calculation? (It looks like sizeof(SharedStringEntry) and sizeof(nsAReadableString) are the same, so I'm not sure where we win by not sharing in the ReadableToJSString case, unless we're really afraid of the hash calculation or finalization costs.) I guess I can call directly to the runtime service, yeah -- thanks. Do I even need to keep a reference to the service, or just pop in and ask for the singleton? I agree that it would suck to double-wrap strings; it doesn't look like there's any way to tell that it's avoidable, though, unless we track it ourselves with another hash table. And we're not going there. I'm more than happy to move it all to xpcstring.cpp, and use Real C++ and everything! Should I saunter on over to your branch, or is it OK to do this stuff on the trunk? Testing this well will be an unfortunate (for me) amount of work: there are no other string implementations that provide the refcounted interface, that I can tell, so I'll have to mock one up to test the ReadableToJSString stuff. And, of course, nothing else asks for the refcounted interface, so I'll have to write test code on that side, but I was expecting to do that anyway. I have this nagging fear that the SharedBufferHandle interfaces aren't solid yet (scc: any news on the |const CharT *| front?), either, so I may be playing catch-up with scc for the next little while. My life is so hard. How would you like me to deal with the null DOMString case? I think it's an error to pass JSVAL_NULL where a DOMString reference is expected -- these _are_ references we're dealing in, right? Of course, that means that it's OK to pass null for one kind of string, but not for another. Sucks to use these modern interfaces, I guess. (Ditto with undefined -- we should do as little type-coercion as possible here, I think.) Are you specifically happy with the idea of _always_ creating a ReadableStringWrapper[*], modulo the size cutoff mentioned above, rather than adapting the current pattern? I think it's the right thing here, but I'm not afraid to admit that I find the current useAllocator-and-friends logic a bit difficult to follow, so there might be some subtlety that I'm missing. [*] which I think that I will rename to XPCReadableJSStringWrapper to make it an itsy bit clearer. Brendan: I don't think we need to CHECK_REQUEST in the stubs: it's legal, though somewhat bizarre, to call them directly without a request active, and any path _through_ the engine that gets there had better have been CHECK_REQUESTed long before! I'll adjust AddNamedRootRT and co as you mention, thanks.
Another fine patch for your reviewing pleasure. - still the CHECK_REQUEST stuff - fixed-up JS_AddNamedRootRT stuff - moved readable-string magic into xpcstring.cpp - some spelling and warning cleanups (I can't help myself!) - complete implementation of XPCReadableJSStringWrapper, which permits exposure of JSStrings as refcountable nsAReadableStrings - tests for the string goop (though I still don't have a refcounting implementation other than mine to test with) I need someone to put the new files into a Mac project for me, and I'll chase that down well before I land.
Of course I have a refcounting string implementation -- I just _wrote_ one. I took advantage of the fact that we double-wrap strings and added a pass-through method to the test interface. And it pointed out a bug, of course (I wasn't setting entry->key). Fixed in the new patch.
Attached patch I am mortified.Splinter Review
Don't do this: +JS_AddNamedRootRT(JSRuntime *rt, void *rp, const char *name) +{ + /* CHECK_REQUEST(cx) */ + JSBool ok = js_AddRootRT(rt, rp, name); + if (!ok) { + JSContext *cx; + JSStackFrame *fp; + if ((cx = (JSContext *)&rt->contextList) && + (&cx->links != &rt->contextList)) { + fp = cx->fp; + cx->fp = NULL; + JS_ReportOutOfMemory(cx); + cx->fp = fp; You can't use some random context to report an error from a JS_BlahRT API, and should not try. Those are for context-less operation given just a JSRuntime*, and there is no good reason to report errors. Just let the false return unambiguously mean out-of-memory. Reporting on the first context in the list could be bad if that's in use by another thread. Contexts are mostly thread-private. BTW, you don't need to test whether (cx = (JSContext *)&rt->contextList) is non-null -- it must be. But kill all this code please, and just return js_AddRootRT's r.v. Symmetry (fearful or foolish) makes me want you to jihad thru the engine and rename js_RemoveRoot to js_RemoveRootRT. Buy my cereal! + // entry might be NULL, if we ran out of memory adding it to the hash + if (entry) entry cannot be null after JS_DHASH_LOOKUP. You want + if (JS_DHASH_ENTRY_IS_BUSY(entry)) methinks. More in a bit (after jband comments, too!). /be
Update: scc didn't like my plan to virtualize AcquireReference and ReleaseReference in the BufferHandle code, so he's going to give me another way to hook my root management in there in the meantime. I've got brendan's comments addressed in my tree, and I'll post another round after I hear from jband.
I need scc's new allocator-fu to avoid having to virtualize things in nsSharedBufferHandle. New patch coming up that uses the allocator-fu.
Depends on: 73297
No longer depends on: 73297
I've altered the jsval-tagging code in response to some comments from brendan: use of OBJECT_TO_JSVAL is misleading, because we're not dealing in JSObject *s. It now looks like: In Allocator::Deallocate: // store untagged to indicate that we're not rooted mutable_this->mStr = jsval(JSVAL_TO_STRING(mStr)); In Allocator::RootString: if (ok) { // Indicate that we've rooted the string by storing it as a // string-tagged jsval mStr = STRING_TO_JSVAL(jsval(mStr)); } (I'm not going to post another diff for that, if that's OK.) Looking to get this in today!
reviewing the last attachment as per shaver's directive shaver: if you just look at the xpcstring.cpp file and the changes to xpcprivate.h, it's a little easier to manage The code looks good, modulo my usual complaints about old-style casts. However, you do the null test for strings (1 from |GetXXXHandle|) backwards. After discussion with you on IRC, I see that our current scheme for expressing null-ness is onerous. I have updated bug #67876 with the scheme you and I agreed on. Your patch here will have to be updated with this knowledge. Made this bug depend on that one.
Depends on: 67876
I fixed the old-style casts, and added a compile-time check that JSVAL_STRING != 0, so that my root-management stuff is safe. I'll attach a new patch soon, though I'd love to incorporate jband comments when I do (nudge). scc: let's do the new NULL adaptation as a later bug -- I want to get this into the tree ASAP so jband and jst can beat on it.
I like it, but defer sr= to jband. r=brendan@mozilla.org. /be
When I tried to apply your patch the other day it was broken due to your need for scc's BufferHandle changes that weere not in the tree then. Are they now? AFAICT this is all nifty. I seriously dislike the need to root the wrapped JSString. I'm thinking it likely negates a good part of the benefit on anything other then big strings. That is partly my fault in requiring the 'useAlloctor' flag even for when building a string for use in the 'in' param calling case. This is there for ease of cleanup. Maybe I can work out something better. Is scc going to give a stamp of approval? Are you looking to land this on the trunk or my flattening branch or jst's branch? The more you put on the trunk the more risk that I'll screw up bits while moving it to the branch(es). jst and I are talking about just moving my work to his branch and abandoning mine. This presupposes that we are better off landing his changes and mine together rather than two cycles of testing and carpools. If you land anywhere other than the trunk then we'll need to merge the trunk into the branches in order to pick up changes you rely upon. Anyway. sr=jband.
The changes from scc that this code requires are indeed in the tree. scc has reviewed (if not Reviewed) the changes, and other than his cast concerns (fixed) and his desire to switch to a better model for expressing nullness (pending in a later effort), he seems happy. I dislike having to root too, but I didn't see a good way to temporarily root it, and I wasn't looking to do anything too invasive. I'd be happy to work with you to find a Better Way in the new world. I should hope that most of this work would port pretty cleanly to the new branch, as long as you merge the string dir (a good idea anyway, after scc lands his branch). Your advice to factor things into xpcstring.cpp should pay off quite nicely there, since the changes to other parts of the XPConnect system were minimal. Let me know if you have problems, though. I'll be landing this tonight -- thanks, everyone, for your help and comments.
All in the tree.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago24 years ago
Resolution: --- → FIXED
Marking Verified -
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: