XDR shouldn't assume that Vector<uint8_t> data has maximum alignment

RESOLVED FIXED in Firefox 67



2 months ago
2 months ago


(Reporter: Waldo, Assigned: Waldo)



Firefox Tracking Flags

(firefox67 fixed)



(6 attachments)



2 months ago

XDR encodes into a Vector<uint8_t, 0, SystemAllocPolicy>. Vector data is generally only aligned to uint8_t alignment -- i.e. one-byte alignment, not aligned.

In certain places, however, code wants to enforce an alignment on the encoded data, so it uses XDRState::codeAlign() and XDRState::isAligned()``. What these functions do, is enforce an alignment _with respect to the start of theVectordata_. So you might callcodeAlign` with alignment 2, then the data will end at a multiple of 2.

But this doesn't actually work, generally -- if the Vector's data is 1-byte-aligned, codeAlign(2) could result in an overall 3-byte alignment of end of data at runtime.

In practice SystemAllocPolicy is js_malloc and friends which use malloc, which is specified to return memory suitable for storing maximally aligned stuff like float and double. So we don't encounter problems now. But it'd be more optimal if these AllocPolicys actually used and understood alignment strictures.

There's a further problem with Vector data being maximally aligned. Right now Vector::begin() when there's no inline storage returns nullptr. This isn't good for users that want to generically consume data/length combinations and want to require data always be non-null. This is a very understandable sanity check: JS's Parser class would like to assume/require non-null data in all cases, even for parsing empty source, for sanity. But even further, SourceUnits that underlies it really wants this, because SourceUnits::isPoisoned() wants to use this forbidden nullptr value to encode a poisoned SourceUnits::ptr. And it can't do that if nullptr can actually show up in this one weird place.

The solution to all this, is to assume nothing about the alignment of any data in an XDR buffer, either direction. This gets a little tricky when 16-bit string data must be XDR'd. But other than that single place, alignment is completely unnecessary. So: let's remove it 1) to permit Vector::begin() to always be non-nullptr (bug 1531638, which for the zero-length, no-inline-storage case returns ((T*) sizeof(T)) which exposed the problem of codeAlign(2) not working correctly described above), and 2) to not have this latent alignment hazard if SystemAllocPolicy (or js_new<T> or js_pod_malloc<T> beneath it) is made to return only minimally-aligned memory.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0fea2664c6536465d0ac2f722757990d1c147680 is coming back as green as anything ever does in modern times for this change and others necessary to make the change to Vector in bug 1531638. Patch series coming shortly in that four-letter place.


Comment 7

2 months ago

The amount of code duplication of allocation path, to handle char16_t data in this somewhat esoteric case, is possibly unfortunate. But it's better than now where XDR data is only barely dubiously aligned and we're playing with fire, and also impeding making Vector::begin() sane which allows our parsing code to require a non-null source text pointer and use SourceUnits::ptr == nullptr as a true poison value.

FWIW I think it is probably possible that we could massage the "create an atom" code into a form that 1) returns the usual JS string pointer, and 2) returns a pointer to the character contents (by outparam) that the user must fill. If we did that, and the "create an atom" code merely took in the hash to be ascribed to the string, we could probably simplify a bunch of string allocation guts. That might pay down some of the code-path duplication that exists after these patches, if you're super-concerned about it.

Comment 8

2 months ago
Pushed by jwalden@mit.edu:
Define XDRBufferBase::{is,set}Aligned only once, and move #ifdef DEBUG into function definitions.  r=tcampbell
Remove an unused PodOperations.h #include from XDR code.  r=tcampbell
Define HashStringUntilZero and HashStringKnownLength template functions for code that wishes to invoke the HashString algorithm on non-pointer iterator types.  r=froydnj
Introduce fill-and-terminate functions for filling dest chars from source chars and null-terminating, when the filling will not lose information.  r=tcampbell
Don't bother explicitly aligning when XDRing char16_t data: the operations we use to perform the little/native-endian conversion will correctly translate into, or translate out of, unaligned memory.  r=tcampbell
Remove all remaining calls to XDRState::codeAlign as no longer necessary now that codeChars and XDRAtom don't require buffer alignment.  r=tcampbell
Assignee: nobody → jwalden
You need to log in before you can comment on or make changes to this bug.