1070674 - MOZ_REALLY_CRASH should not invoke undefined behavior (implementation-defined behavior should be enough)

Reporter

Description

•

10 years ago

Currently MOZ_REALLY_CRASH (which is what MOZ_ASSERT failures trigger) is implemented by dereferencing the literal NULL pointer. That's undefined behavior, as far as I know. This complicates reasoning on the interaction between MOZ_ASSERT and other undefined behavior in bug 990764 comment 46 and below. It would be nice if we could avoid invoking undefined behavior; just implementation-defined behavior should be enough. For example, dereference the pointer value 1, or do an integer division by zero.

Benoit Jacob [:bjacob] (mostly away)

Reporter

Comment 1

•

10 years ago

Note: of course, this is a 'mild' form of UB, in the sense that compiler implementers know that it is almost always unintentional, and treat it accordingly. So of course, we are not seeing any crazy UB-induced effects from MOZ_REALLY_CRASH. Still --- the above-mentioned bug shows how this unnecessarily complicates some conversations.

Joshua Cranmer [:jcranmer]

Comment 2

•

10 years ago

Integer division by zero is undefined behavior.

I don't have corresponding definitions in C++11, but in C11, *(uintptr_t*)1 is definitely undefined behavior:
If an invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined.

It's not really possible to avoid undefined behavior to cause this sort of exit.

Benoit Jacob [:bjacob] (mostly away)

Reporter

Comment 3

•

10 years ago

How about calling abort() then? I don't suppose that has undefined behavior?

Benoit Jacob [:bjacob] (mostly away)

Reporter

Comment 4

•

10 years ago

Another avenue would be to replace the compiler-time bad NULL pointer by a runtime value to prevent the compiler from reasoning about it. In this way, even if it might in principle still be UB, that won't be a kind of UB that can affect the compiler's behavior.

andrew.c.morrow

Comment 5

•

8 years ago

I have some evidence that this undefined behavior can cause real problems. While debugging a misbehaving project embedding SpiderMonkey, I watched it emit multiple 'MOZ_CRASH' messages to the console, without the process actually terminating.

I believe that the compiler (GCC 4.9) had concluded that the call to ::abort could be eliminated since it followed a null pointer dereference and was therefore impossible/unreachable. Commenting out the null pointer dereference restored the expected behavior that MOZ_CRASH would lead to process termination, now by way of abort.

Jesse Ruderman

Comment 6

•

8 years ago

Did your compiler remove the null deref too, or did the null deref fail to crash for some other reason?

Jesse Ruderman

Comment 7

•

8 years ago

For gcc and clang, MOZ_CRASH should probably use __builtin_trap. It's efficient, it's not UB, and it's easy for compilers to optimize the surrounding code.

Jeff Walden [:Waldo]

Comment 8

•

8 years ago

Semi-issues with __builtin_trap().

One, it might well change crash signature to not be a crash at that exact location. For debugging purposes, we really don't want MOZ_CRASH not parking an attached debugger on the line containing the MOZ_CRASH, forcing every developer to 'up' one or more times before doing any debugging -- it's a really poor use of developer time. We've gone to some length to eliminate these 'up's (it once was two, then one, then zero).

Two, MOZ_CRASH writes the line number in the null-deref. This assists with crashdump-debugging and such by giving an extra hint to the developer, reading surrounding assembly, exactly which crash might have been hit, in code that inlines/coalesces multiple crashes to a single location. We used to just write in a zero, then we changed it to differentiate nearly-positioned crashes this way.

Three, __builtin_trap() or whatever could potentially not be as "safe" a crash as a null-deref is. To a very real extent we need a guarantee as to exactly what's done in MOZ_CRASH() so we can know this. Working from the assumption that writing to null dereferences the zero page, and that page is permanently unallocatable on all platforms, we're safe with what the current implementation causes most compilers to do. Even if technically compilers are within their rights to make the user spew nasal demons.

I suppose the best way around this -- maybe -- would to have the macro expand to inline assembly, written per-architecture. That would avoid UB problems, I guess, while still letting us deal with the others appropriately.

Jesse Ruderman

Comment 9

•

8 years ago

If #1 or #3 is a problem (seems unlikely), we can file bugs against clang, because they expect us to use __builtin_trap:

>    foo.c:4:2: warning: indirection of non-volatile null pointer will be deleted, not trap [-Wnull-dereference]
>            *(int *)0;
>            ^~~~~~~~~
>    foo.c:4:2: note: consider using __builtin_trap() or qualifying pointer with 'volatile'

For #2, we're already asking developers to look at surrounding assembly, so we can add our own assembly that does something harmless with the line number.

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Bugzilla

Quick Search

MOZ_REALLY_CRASH should not invoke undefined behavior (implementation-defined behavior should be enough)

Categories

(Core :: MFBT, defect)

Tracking

()

People

(Reporter: bjacob, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated