Closed Bug 1330776 Opened 3 years ago Closed 5 months ago
Remove flat (null-terminated) strings
47 bytes, text/x-phabricator-request
|Details | Review|
Strings are a pointer and a length. Strings *can* contain '\0', so depending on null-termination in the general case can be lossy. Ergo, people shouldn't depend on null termination, we shouldn't offer it, and they should use the explicit string length we have on hand for them to consult. This also has the benefit of not requiring new-string allocations when a caller wants a null-terminated string, but the only thing handy is an arbitrary non-null-terminated string -- which is most any string that wasn't originally an atom. Most callers can probably handle a pointer/length combo with a little massaging. And the ones that can't? Well, at least they're the only ones paying the cost of bespoke flattening, not everyone. Searchfox sez there are ~265 uses of JSFlatString in the tree, so this will be a mini-slog, but definitely a good one.
Don't we depend on 3rd party libraries in places that use null terminated strings?
This is just inside the JS engine. (For now, at least -- removing null termination from nsString would be wonderful, but out of scope here.) If the JS engine itself needs to call something third-party that depends on null-termination, we can either fix it not to, or make a copy. (And if the third-party code depends on null-termination, it's already broken in the presence of JS strings containing nulls.) Note that ICU, as possibly the most notable third-party dependency, takes pointer-length combos in its APIs almost universally -- or at least in our use of it I remember that almost always being the case.
So I just looked at the DOM consumers. There's some JSID_TO_FLAT_STRING and JS_ASSERT_STRING_IS_FLAT uses, mostly to call js::CopyFlatStringChars. We could totally replace all that with linear strings, asserting things are linear, etc. There's uses of JS_PutEscapedFlatString, but again that can totally be linear; the string is coming from JS_GetFunctionId to start with. There's the IsPermitted stuff, but again it doesn't care about flatness per se. It just starts with a jsid and wants to call things like JS_FlatStringEqualsAscii and JS_GetFlatStringCharAt, both of which could totally work on linear strings. Esp. because it just wants the char at index 0. There's this lovely bit: char* propName = JS_EncodeString(cx, JS_FORGET_STRING_FLATNESS(JSID_TO_FLAT_STRING(aPropId))); (which then passes propName to nsPrintfCString; we could probably do something else here that never involved going through a char* at the expense of doing a bit more concatenation). In any case, this is clearly not using flat strings for much of anything, except they're the only sort of string you can get out of a jsid! That seems to be it in DOM code. I'm happy to convert it all to better APIs once those exist.
This should also be a small win for ThinInline and FatInline strings because we will be able to store an extra character instead of \0. On 32-bit, a TwoByte ThinInlineString will be able to store strings with length <= 4 instead of the current <= 3. For Latin1 strings, we will be able to store 8 chars instead of 7.
Summary: Remove flat strings → Remove flat (null-terminated) strings
Assignee: jwalden → jdemooij
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/02f8ab4ff124 Stop null-terminating JS strings, remove JSFlatString. r=jwalden,sfink
You need to log in before you can comment on or make changes to this bug.