Closed Bug 1880776 Opened 2 years ago Closed 6 months ago

`'std::is_pointer_v<std::nullptr_t>' evaluates to false` static assertion failure in rlbox code when compiled with gcc -std=c++20

Categories

(Core :: Security: RLBox, task, P3)

task

Tracking

()

RESOLVED FIXED
Tracking Status
firefox138 --- fixed

People

(Reporter: cpeterson, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

When compiling Firefox with gcc -std=c++20 (but not gcc -std=c++17 or clang -std=c++20), the build fails with a static assertion failure in rlbox code: 'std::is_pointer_v<std::nullptr_t>' evaluates to false

https://searchfox.org/mozilla-central/rev/1aaacaeb4fa3aca6837ecc157e43e947229ba8ce/third_party/rlbox/include/rlbox.hpp#1027

The error has something to do with tainted_base_impl's operator!:

https://searchfox.org/mozilla-central/rev/1aaacaeb4fa3aca6837ecc157e43e947229ba8ce/third_party/rlbox/include/rlbox.hpp#462

invoked from here (and similar code in other libraries sandboxed using rlbox):

https://searchfox.org/mozilla-central/rev/1aaacaeb4fa3aca6837ecc157e43e947229ba8ce/gfx/thebes/gfxGraphiteShaper.cpp#141

I don't have a gcc build environment on my local (macOS) machine, so I haven't been able to debug the error much or reduce it to a smaller test case.

INFO -  In file included from /builds/worker/checkouts/gecko/gfx/thebes/ThebesRLBox.h:25,
INFO -                   from /builds/worker/checkouts/gecko/gfx/thebes/gfxGraphiteShaper.cpp:20,
INFO -                   from Unified_cpp_gfx_thebes1.cpp:38:
INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/rlbox/rlbox.hpp: In instantiation of 'rlbox::tainted<T, T_Sbx>::tainted(const nullptr_t&) [with T = std::nullptr_t; T_Sbx = rlbox::rlbox_wasm2c_sandbox; std::nullptr_t = std::nullptr_t]':
INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/rlbox/rlbox.hpp:791:1:   required from 'constexpr auto rlbox::operator==(const T_Lhs&, const rlbox::tainted_base_impl<T_Wrap, T, T_Sbx>&) [with T_Wrap = rlbox::tainted; T = gr_font*; T_Sbx = rlbox::rlbox_wasm2c_sandbox; T_Lhs = std::nullptr_t; std::enable_if_t<((! rlbox_is_wrapper_v<T_Lhs>) && (! rlbox_is_tainted_boolean_hint_v<T_Lhs>))>* <anonymous> = 0]'
INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/rlbox/rlbox.hpp:462:21:   required from 'auto rlbox::tainted_base_impl<T_Wrap, T, T_Sbx>::operator!() [with T_Wrap = rlbox::tainted; T = gr_font*; T_Sbx = rlbox::rlbox_wasm2c_sandbox]'
INFO -  /builds/worker/checkouts/gecko/gfx/thebes/gfxGraphiteShaper.cpp:141:8:   required from here
[task 2024-02-18T21:57:38.465Z] 21:57:38    ERROR -  /builds/worker/workspace/obj-build/dist/include/mozilla/rlbox/rlbox.hpp:1027:24: error: static assertion failed
INFO -   1027 |     static_assert(std::is_pointer_v<T>);
INFO -        |                   ~~~~~^~~~~~~~~~~~~~~
INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/rlbox/rlbox.hpp:1027:24: note: 'std::is_pointer_v<std::nullptr_t>' evaluates to false
INFO -  In file included from /builds/worker/workspace/obj-build/dist/include/nsIFrame.h:52,
INFO -                   from /builds/worker/workspace/obj-build/dist/include/mozilla/SVGUtils.h:25,
INFO -                   from /builds/worker/checkouts/gecko/gfx/thebes/gfxSVGGlyphs.cpp:13,
INFO -                   from Unified_cpp_gfx_thebes1.cpp:101:

Compiling Firefox as C++20 is not supported yet. I'm working through the build problems in bug 1768116. This gcc-specific error is one of the blockers for supporting C++20.

Type: defect → task
Summary: `'std::is_pointer_v<std::nullptr_t>' evaluates to false` static assertion failure in rlbox code when compiling with -std=c++20 → `'std::is_pointer_v<std::nullptr_t>' evaluates to false` static assertion failure in rlbox code when compiled with gcc -std=c++20
Depends on: 1880779

Note that, if you try to reproduce this C++20 build error, there are other build issues blocking both gcc and clang -std=c++20 builds on Linux (such as bug 1880779).

Blocks: C++20
No longer blocks: C++23

And I got nerd-sniped, so here's what I found:

RLBox provides a number of convenience-operator member functions on tainted types whose purpose appears to be to allow binary operations against (certain classes of) non-tainted types without explicitly wrapping them. It also provides convenience reverse operators as free functions which wrap the non-tainted LHS type to match the tainted RHS type, and then forward the call to the now-wrapped LHS's implementation.

Unfortunately, those reverse operators (and operator== specifically) wrap their LHS type unconditionally, and the wrapper types don't expect to handle nullptr_t . Thus, even under C++17, the expression nullptr == p (where p is of some tainted pointer type) would probably have resulted in a compile error on both GCC and clang. The "forward" operator== member-function doesn't try to wrap its RHS, though, so p == nullptr is fine.

However, C++20 allows an expression a == b to be "rewritten" to b == a for the purposes of overload resolution (N4868 §12.4.2.3 [over.match.oper] ¶3.4); and GCC appears to be choosing the rewritten convenience reverse-operator as more viable than the non-rewritten member function.

As far as I can tell, that shouldn't be happening. N4868 §12.4.4.1 [over.match.best.general] ¶2.8 explicitly says that rewritten candidates yield priority to non-rewritten candidates, and I can't find any overriding reason elsewhere in §12.4 that the member function would be dispreferred here. Overload resolution admittedly isn't a can of worms I've delved particularly deeply into before; but I think this is a GCC bug, and that clang's behavior is correct. Here's a reduced test case on godbolt.org showing the difference.

Regardless of which compiler is correct, two independent workarounds suggest themselves:

  1. Adjust the convenience reverse-operator for == (and !=) such that nullptr == p functions correctly.
  2. Remove the convenience reverse-operators for ==, !=, <, >, <=, and >= when compiling under C++20 or later, since C++20 effectively provides them regardless.

A try-build testing the latter is here. It failed, and the logs are sufficiently long that it's not obvious why; but the string is_pointer_t doesn't appear in them, at least?

ERROR - error: 'warning' diagnostics seen but not expected:
INFO - File /builds/worker/checkouts/gecko/build/clang-plugin/tests/TestNoRefcountedInsideLambdas.cpp Line 636: implicit capture of 'this' with a capture default of '=' is deprecated

Those errors are instances of -Wdeprecated-this-capture warnings to be fixed by bug 1775161. We can't (cleanly) fix -Wdeprecated-this-capture warnings in regular code until after we default to -std=c++20 because the code fix is not backwards compatible with C++17. However, we can fix these errors in the TestNoRefcountedInsideLambdas.cpp test using an ugly #if. Something like:

  std::function<void()>(
#if __cplusplus >= 202002L
[=, this]() {
#else
[=]() {
#endif
    this->method(); // expected-error{{Refcounted variable 'this' of type 'R' cannot be captured by a lambda}} expected-note{{Please consider using a smart pointer}}
  });

I can submit a patch to fix the test, but in the meantime you can temporarily comment out these test functions to get your Try run passing:

https://searchfox.org/mozilla-central/rev/6a2a2a52d7e544a2fd5678d04991a7e78b694f22/build/clang-plugin/tests/TestNoRefcountedInsideLambdas.cpp#635-640,659-664

Alas, no joy.

At a first glance, the early errors consisted largely or entirely of:

  • This segment failing because std::remove_cvref wasn't in scope, and suggesting the addition of the already-present line #include <type_traits> to resolve that — as though a new GCC were being used with old header files.
  • Similar errors like the following in new GCC header files (perhaps explicable by a new GCC being used with old and new header files, such that the old header files have higher precedence):
20:27:46     INFO -  In file included from /builds/worker/fetches/gcc/include/c++/11.4.0/bits/iterator_concepts.h:35,
20:27:46     INFO -                   from /builds/worker/fetches/gcc/include/c++/11.4.0/bits/ranges_base.h:36,
20:27:46     INFO -                   from /builds/worker/fetches/gcc/include/c++/11.4.0/span:44,
20:27:46     INFO -                   from /builds/worker/checkouts/gecko/intl/icu_capi/cpp/include/diplomat_runtime.hpp:11,
20:27:46     INFO -                   from /builds/worker/checkouts/gecko/intl/icu_capi/cpp/include/ICU4XLineBreakIteratorLatin1.hpp:10,
20:27:46     INFO -                   from /builds/worker/checkouts/gecko/intl/lwbrk/LineBreaker.cpp:18,
20:27:46     INFO -                   from Unified_cpp_intl_lwbrk0.cpp:2:
20:27:46    ERROR -  /builds/worker/fetches/gcc/include/c++/11.4.0/concepts:78:17: error: 'common_reference_t' was not declared in this scope; did you mean 'remove_reference_t'?
20:27:46     INFO -     78 |       = same_as<common_reference_t<_Tp, _Up>, common_reference_t<_Up, _Tp>>
20:27:46     INFO -        |                 ^~~~~~~~~~~~~~~~~~
20:27:46     INFO -        |                 remove_reference_t
20:27:46    ERROR -  /builds/worker/fetches/gcc/include/c++/11.4.0/concepts:78:9: error: parse error in template argument list
20:27:46     INFO -     78 |       = same_as<common_reference_t<_Tp, _Up>, common_reference_t<_Up, _Tp>>
20:27:46     INFO -        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20:27:46    ERROR -  /builds/worker/fetches/gcc/include/c++/11.4.0/concepts:78:9: error: template argument 1 is invalid
20:27:46    ERROR -  /builds/worker/fetches/gcc/include/c++/11.4.0/concepts:93:32: error: 'common_reference_t' was not declared in this scope; did you mean 'common_reference_with'?
20:27:46     INFO -     93 |                                common_reference_t<
20:27:46     INFO -        |                                ^~~~~~~~~~~~~~~~~~
20:27:46     INFO -        |                                common_reference_with

(In reply to Ray Kraesig [:rkraesig] from comment #5)

Alas, no joy.

At a first glance, the early errors consisted largely or entirely of:

  • This segment failing because std::remove_cvref wasn't in scope, and suggesting the addition of the already-present line #include <type_traits> to resolve that — as though a new GCC were being used with old header files.

Sorry, I forgot that was the case. (See comment 2 above.) Even when compiling with the latest GCC, we use old header files that match our "base toolchain" compiler versions. IIUC, this is necessary to build binaries that will be compatible with the older Linux distro versions we support. Updating our base toolchain's header files is bug 1880779, which depends on we can drop support for some older Linux distro versions.

(In reply to Ray Kraesig [:rkraesig] from comment #3)

As far as I can tell, that shouldn't be happening. N4868 §12.4.4.1 [over.match.best.general] ¶2.8 explicitly says that rewritten candidates yield priority to non-rewritten candidates, and I can't find any overriding reason elsewhere in §12.4 that the member function would be dispreferred here. Overload resolution admittedly isn't a can of worms I've delved particularly deeply into before; but I think this is a GCC bug, and that clang's behavior is correct. Here's a reduced test case on godbolt.org showing the difference.

Looks like this gcc behavior is a regression in gcc 10.1. I expanded your godbolt test and:

  • gcc versions >= 10.1 -std=c++2a return 1.
  • gcc versions >= 10.1 -std=c++17 returns 0.
  • gcc-9.5 -std=c++2a returns 0.
  • clang 18.1 -std=c++2a returns 0.
  • msvc 19.38 -std=c++20 returns 0.
  • Intel's icx 2024.0.0 -std=c++20 returns 0.
  • Intel's icc versions <= 2021.8.0 with -std=c++20 return 0, but curiously the last two versions before icc was EOL'd and superseded by icx (icc 2021.9.0 and 2021.10.0) changed to return 1. But icx versions from the same time period return 0.

I tried to file a bug in GCC's Bugzilla, but it's not letting me log into my existing account or reset my password...

I'm attaching Ray's patch to this bug for safekeeping. Since this appears to be a gcc bug, I won't submit this fix to upstream rlbox until we're ready to compile Firefox as C++20 (bug 1768116). Perhaps gcc's bug will have been fixed by then and we won't need this work around for rlbox.

I filed a gcc bug about gcc >= 10.1 selecting the wrong operator== overload:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114549

Severity: -- → N/A
Priority: -- → P3

The gcc bug has been fixed in gcc trunk in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53499. gcc versions <= 13.2.0 still have the bug.

The gcc bug fix has been released in gcc 14. If we want to update to -std=c++20 before our base toolchain's minimum gcc version is >= 14, we will need to apply or upstream the rlbox workaround patch attached to this bug.

Depends on: 1952606

Resolving this bug as fixed because:

  1. Ray's workaround for this gcc bug has been merged into the upstream rlbox repo: https://github.com/PLSysSec/rlbox/commit/945509a12e1e0cadeefdfd3f1b4c951a50776ab3
  2. And then vendored into mozilla-central in bug 1952606.
Status: NEW → RESOLVED
Closed: 6 months ago
No longer depends on: 1880779
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: