Open Bug 1732213 Opened 4 years ago Updated 4 years ago

[meta] Review reinterpret_cast's, to remove UB or unnecessary uses

Categories

(Core :: MFBT, task)

task

Tracking

()

People

(Reporter: mozbugz, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: meta)

reinterpret_cast is used a lot. A quick grep reveals 13,435 of them!
Intuitively it may seem trivial to use ("just read this thing as if it was that other thing"), but in reality it has lots of subtleties in C++.

Bug 1731085 shows an example of incorrect usage, to convert a bunch of bytes into a 32-bit number. In that case it had real implications, by performing non-aligned reads which are forbidden on ARM.

According to https://en.cppreference.com/w/cpp/language/reinterpret_cast :

Only the following conversions can be done with reinterpret_cast, except when such conversions would cast away constness or volatility.

  1. An expression of integral, enumeration, pointer, or pointer-to-member type can be converted to its own type. The resulting value is the same as the value of expression.
  2. A pointer can be converted to any integral type large enough to hold all values of its type (e.g. to std::uintptr_t)
  3. A value of any integral or enumeration type can be converted to a pointer type. A pointer converted to an integer of sufficient size and back to the same pointer type is guaranteed to have its original value, otherwise the resulting pointer cannot be dereferenced safely (the round-trip conversion in the opposite direction is not guaranteed; the same pointer may have multiple integer representations) The null pointer constant NULL or integer zero is not guaranteed to yield the null pointer value of the target type; static_cast or implicit conversion should be used for this purpose.
  4. Any value of type std::nullptr_t, including nullptr can be converted to any integral type as if it were (void*)0, but no value, not even nullptr can be converted to std::nullptr_t: static_cast should be used for that purpose.
  5. Any object pointer type T1* can be converted to another object pointer type cv T2*. This is exactly equivalent to static_cast<cv T2*>(static_cast<cv void*>(expression)) (which implies that if T2's alignment requirement is not stricter than T1's, the value of the pointer does not change and conversion of the resulting pointer back to its original type yields the original value). In any case, the resulting pointer may only be dereferenced safely if allowed by the type aliasing rules (see below)
  6. An lvalue expression of type T1 can be converted to reference to another type T2. The result is an lvalue or xvalue referring to the same object as the original lvalue, but with a different type. No temporary is created, no copy is made, no constructors or conversion functions are called. The resulting reference can only be accessed safely if allowed by the type aliasing rules (see below)
  7. Any pointer to function can be converted to a pointer to a different function type. Calling the function through a pointer to a different function type is undefined, but converting such pointer back to pointer to the original function type yields the pointer to the original function.
  8. On some implementations (in particular, on any POSIX compatible system as required by dlsym), a function pointer can be converted to void* or any other object pointer, or vice versa. If the implementation supports conversion in both directions, conversion to the original type yields the original value, otherwise the resulting pointer cannot be dereferenced or called safely.
  9. The null pointer value of any pointer type can be converted to any other pointer type, resulting in the null pointer value of that type. Note that the null pointer constant nullptr or any other value of type std::nullptr_t cannot be converted to a pointer with reinterpret_cast: implicit conversion or static_cast should be used for this purpose.
  10. A pointer to member function can be converted to pointer to a different member function of a different type. Conversion back to the original type yields the original value, otherwise the resulting pointer cannot be used safely.
  11. A pointer to member object of some class T1 can be converted to a pointer to another member object of another class T2. If T2's alignment is not stricter than T1's, conversion back to the original type T1 yields the original value, otherwise the resulting pointer cannot be used safely.

Pfew, that's a lot to read and understand! Which makes it even more important to try and avoid reinterpret_cast whenever possible.
In many cases it could probably be replaced with other casts, or with memcpy when deserializing bytes into trivially-copyable objects (compilers are good at optimizing it away when allowed).

And ideally all remaining uses should have an explanation of why they're necessary and why they're safe.
I'm filing this bug in MFBT, maybe we could provide macros/functions to help with classifying valid reinterpret_casts, and even check their use when possible? E.g., for the 2nd allowed conversion:

template <typename IntType, typename PointerType>
IntType MOZ_REINTERPRET_POINTER_TO_INTEGRAL(PointerType aPtr) {
    static_assert(std::is_pointer_v<PointerType>);
    static_assert(std::is_integral_v<IntType>);
    static_assert(sizeof(IntType) >= sizeof(PointerType));
    return reinterpret_cast<IntType>(aPtr);
}

Given the many thousands of uses in our codebase, I realize that this may be an impossible task! But it would be good to at least keep it at the back of our collective mind somehow, and chip at it when possible...

You need to log in before you can comment on or make changes to this bug.