Closed Bug 580314 Opened 14 years ago Closed 12 years ago

Move jstl into mozilla/tl

Categories

(Core :: MFBT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- -

People

(Reporter: dwitte, Unassigned)

Details

We're starting to get more code duplication between templated stl-ish classes -- jshashtable vs nsTHashtable, jsvector vs nsTArray. It would be much nicer if we had a common template library, with tunable parameters or specializations for whatever divergences need to exist between js* and nsT*.

This would probably best be done as a new toplevel dir, mozilla/tl, and namespaced as mozilla::tl. Whatever non-templated code we have could be built into an always-static lib that's linked into whatever dynamic libs we have (libmozjs and libxul, f.e.). Object duplication shouldn't be a big deal since we're not talking a whole lot of code or copies here.

I'm assuming that our decision to not pull in external tl's still stands, given that we can't deal with exceptions other than bad_alloc. And given that we already have multiple template implementations of vector and hash, unifying them is strictly better than the current situation. (We should endeavor to keep API close to stl where possible, such that familiarity is improved and the future possibility of switching to an external tl remains open.)
See also bug 550610. Most of the deps on that bug were fixed by doing STL wrappers to avoid exceptions and just using the STL classes directly.
I'm going to exaggerate slightly here, but not mainly for comic effect, rather to stiulate thinking.

After all these years, I am against grand-scale sharing. It often leads to bloat through over-generality, in which real bugs hide. C++ templates can help or hurt, so I won't assume they magically compile away the bug-inhabiting code.

Sharing designs is easier than sharing code. Flat is better than nested. Fewer deps are better than more. And WebKit/JavaScriptCore has its WTF subdir. Taking these all together I would much rather keep jstl.h where it is, or grow it into a js/src subdir.

This last point about source location may seem like a detail, but SpiderMonkey is still separately embeddable (so is JavaScriptCore; so is v8 [see nodejs.org]; this is a competitive arena in which we are an old and aging player). I do not want creeping mozilla:: dependencies in SpiderMonkey because it's easy to predict they'll become hard to remove, and inconvenient to remove at the needed time.

/be
> And WebKit/JavaScriptCore has its WTF subdir.

Yes, and they use this throughout their layout engine, not just in JavaScriptCore.

Are we ok with using jstl in the rest of Gecko?  (I am; as long as we don't go breaking it.)
(In reply to comment #3)
> > And WebKit/JavaScriptCore has its WTF subdir.
> 
> Yes, and they use this throughout their layout engine, not just in
> JavaScriptCore.

I know, that's why I cited it.

> Are we ok with using jstl in the rest of Gecko?  (I am; as long as we don't go
> breaking it.)

You bet! But the Gecko folks' preference for reducing deps may make them say otherwise, although they already depend on js/src all over.

/be
Yeah; are we ok with such a dep in libxpcom and the rest?  I have no problem with it in gklayout, for the reason you describe...  But the point would be to share infrastructure with the rest of xpcom if we can.
What in XPCOM needs a redo? What is the benefit? If we are making XPCOM the old way of binary extensibility, ctypes the new, then why bother?

Just asking, I don't know the answers. My conviction at this stage is that any cleanup to XPCOM to make (e.g.) component manager use jshashtable is not worth the effort, and carries enough risk to avoid on its face.

/be
> What in XPCOM needs a redo?

Making the templated xpcom hashtables maybe use jshashtable?  Being able to use LazilyConstructed in xpcom and other modules as needed?

> What is the benefit? 

jshashtable is faster than pldhash?  ;)

I'm not suggesting cleaning up xpcom guts; I'm suggesting that xpcom-exported datatypes (nsTArray, hashtables, etc) might benefit from the performance work the js team has been doing.  And I'd like to avoid things like plify_jsdhash.sed in the process.
Consistency of data structures would be nice, in the long term: having only a single vector, a single hash table (per implementation approach), a single (defaulted) string-hashing implementation to go with it, a single UTF-8-parsing implementation (to go slightly afield from data structures), etc. would have a multiplicative effect for optimization, would reduce code and therefore attack surface, and would make code more approachable because you'd only have to be familiar with one API.  Something WTF-like lets you do this.  And it doesn't seem to me that moving toward something like that means we have to sacrifice SpiderMonkey embeddability to do it, just might mean tarballs include an extra directory beyond js/src.
(In reply to comment #8)
> Consistency of data structures would be nice, in the long term: having only a
> single vector, a single hash table (per implementation approach), a single
> (defaulted) string-hashing implementation to go with it, a single UTF-8-parsing
> implementation (to go slightly afield from data structures), etc. would have a
> multiplicative effect for optimization, would reduce code and therefore attack
> surface, and would make code more approachable because you'd only have to be
> familiar with one API.

This is true up to some scale. My point in comment 2 is that it breaks down at a larger scale, for two reasons:

1. Overgeneralizing and overlayering hurt optimization, attack surface, and code approachability, and sharing at large scale tends to breed over-generality in the shared API and over-layering as different modules add layers where they can't put a generalization into the shared API.

2. Human trust issues, social mobility of developers and their code, and "it's too big, I couldn't find it so I wrote/used my own" happen at large scale. I've seen it over and over, pre-Mozilla and definitely in Mozilla. It is worth fighting the good fight against this tendency, but it's real and at some point it wins.

/be
I'm disappointed that we seem to have given up on the actual, compiler-provided STL again.  If we used that, we wouldn't have to have an argument about where to put it, because it wouldn't be in the tree at all.
Except for the platforms that don't have it (Sun Studio, Android) or those where the performance characteristics aren't what we need (Luke's jsvector bugs have detail on some such cases), in which case we would have to provide an alternative.

But that's not what this bug is about, so we shouldn't really be re-hashing it here.  jstl has proven itself quite useful thus far.
(In reply to comment #11)
> Except for the platforms that don't have it (Sun Studio, Android) or those
> where the performance characteristics aren't what we need (Luke's jsvector bugs
> have detail on some such cases), in which case we would have to provide an
> alternative.

I would argue that our effort would be better spent fixing such buggy platforms at source.
We don't have the source for some of these platforms, and others have a history of not taking obvious improvement patches...  Further, our timeframe for making stuff work (typically months) doesn't match their timeframe for getting deployed out to users (typically years).  Again, this is all off-topic for this bug.
Except Zack is helpfully making my point about sharing breaking down at scale -- thanks, Zack! :-P

One of the break-downs I didn't detail, in addition to generalizations not taken into the shared code (which bred over-layering), is optimizations not taken into the shared code. Including the compiler.

/be
stay on target...
blocking2.0: --- → -
(In reply to comment #7)
> > What in XPCOM needs a redo?
> 
> Making the templated xpcom hashtables maybe use jshashtable?

Assuming the two interfaces are compatible enough to simply do just that, I would be careful to measure the FF binary size before/after; this sucker has not been optimized to avoid code bloat.
I have no problem with leaving jstl in JS and using it from Gecko, providing it's possible to treat it as a separate component from JS, so you can build Gecko components that use jstl without pulling in the whole JS engine. And providing we can add features that JS itself doesn't use without Brendan getting upset.

While sharing code isn't always the right thing, it usually is, especially when we're talking about reusing code that's already under Mozilla control. I think Brendan's arguments about overgeneralization (if arguments they be) are themselves overgeneralization :-).
(In reply to comment #17)
> I have no problem with leaving jstl in JS and using it from Gecko, providing
> it's possible to treat it as a separate component from JS, so you can build
> Gecko components that use jstl without pulling in the whole JS engine. And
> providing we can add features that JS itself doesn't use without Brendan
> getting upset.

Sure, don't worry about it unless we get bit by bloat we can't avoid even if we don't instantiate the particular template.

> While sharing code isn't always the right thing, it usually is, especially when
> we're talking about reusing code that's already under Mozilla control. I think
> Brendan's arguments about overgeneralization (if arguments they be) are
> themselves overgeneralization :-).

They're generalizations, obviously.

I don't think they are over the top. We have lots of open source. Has everyone yet shared even one STL, never mind one hashmap implementation in C++? Why not?

/be
Because the cost of cooperation is sometimes high.

The cost of cooperation within Mozilla had better be very low. Same primary ship vehicle, same schedule, same org, same mission, same platform requirements, etc.
It seems to me that, due to the template-parameterization that is already in place to abstract over things like where memory comes from and how to report errors, there would not be a performance or usability overhead in moving some things into mozilla/tl.

The main benefits I see a shared mozilla/tl are:
 - To avoid template bloat, we will eventually need to have non-inline functions.  If js templates are being used outside libmozjs then, as comment 0 already points out, we want those definitions to be private to each module that uses mozilla/tl.  That is, we basically want to #include mozilla/tl into whatever module is being built.  Whatever build-system black magic is necessary to achieve this, it would nice to have it in one place so that it is easy to see what is going on and easy to add additional reusable libraries that receive the save treatment.
 - It's nice to at least attempt to keep a module boundary between js and mozilla.  Allowing/encouraging widespread #including of js/src/jshashtable.h and jstl.h seems to encourage breaking this boundary.  (For the same reason, we should come up with a way to export auto-rooters without #including jscntxt.h...)

Of course there would need to be a "no dependency on anything outside mozilla/tl (e.g. nspr)" rule -- dependencies can be injected via template parameterization in the style of the *AllocPolicies.  Thus, to ship standalone SpiderMonkey, we should be able to simply cp -r mozilla/tl js/src.
(In reply to comment #20)
> we
> should come up with a way to export auto-rooters without #including
> jscntxt.h...

When conservative GC will be ironed out there would be no need for most autorooters. The remaining cases could be served with public classes doing JS API calls.
A name suggestion for this: Mozilla Framework for (or of) Basic Templates.

(Yes, I know, I'll shut up now.)
Shaver suggested Mozilla Framework Based on Templates when we first added js::tl...
I think we have this now.
Assignee: general → nobody
Status: NEW → RESOLVED
Closed: 12 years ago
Component: JavaScript Engine → MFBT
QA Contact: general → mfbt
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.