Closed Bug 860543 Opened 7 years ago Closed 6 years ago

clearColor assertion with NaN

Categories

(Core :: Canvas: WebGL, defect, critical)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: jruderman, Assigned: jgilbert)

References

(Blocks 2 open bugs)

Details

(Keywords: assertion, testcase)

Attachments

(5 files, 3 obsolete files)

Attached file testcase
Assertion failure: colorClearValue[0] == mColorClearValue[0] && colorClearValue[1] == mColorClearValue[1] && colorClearValue[2] == mColorClearValue[2] && colorClearValue[3] == mColorClearValue[3], at content/canvas/src/WebGLContext.cpp:1165

in mozilla::WebGLContext::ForceClearFramebufferWithDefaultValues

This assertion is part of code added in bug 716859.
Attached file stack
Cute, alright, thanks.
Attached patch Possible patch (obsolete) — Splinter Review
Assignee: nobody → dzbarsky
Comment on attachment 757695 [details] [diff] [review]
Possible patch

Review of attachment 757695 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/webidl/WebGLRenderingContext.webidl
@@ +27,5 @@
>  // Ideally the typedef below would use 'unsigned byte', but that doesn't currently exist in Web IDL.
>  typedef octet          GLubyte;        /* 'octet' should be an unsigned 8 bit type. */
>  typedef unsigned short GLushort;
>  typedef unsigned long  GLuint;
>  typedef unrestricted float GLfloat;

Switch `GLfloat` over (to restricted) too.
Attachment #757695 - Flags: review-
Let's just handle the NaNs. I'll take a look and see if we're going to hit this elsewhere, as well.
Assignee: dzbarsky → jgilbert
Attachment #757695 - Attachment is obsolete: true
Attachment #761260 - Flags: review?(bjacob)
Blocks: 878609
OS: Mac OS X → All
Hardware: x86_64 → All
Last hunk of the patch looks wrong.

Do you care what happens when you compare 0 against -0?
Comment on attachment 761260 [details] [diff] [review]
patch: Allow NaNs in prediction sanity checks

Review of attachment 761260 [details] [diff] [review]:
-----------------------------------------------------------------

An actual bug (see last comment) and a hijacking-the-review-process-to-get-it-done-my-pet-way (see big comment with example program).

::: content/canvas/src/WebGLContext.cpp
@@ +1150,5 @@
>      ForceClearFramebufferWithDefaultValues(clearMask);
>      mIsScreenCleared = true;
>  }
>  
> +bool IsPredictionCorrect(float predicted, float actual) {

This function name needs to be changed to something that more precisely describes what it does. How about "AreFloatsBinaryEqual" ... I think you want to convey that this is a variant of operator== that sidesteps the quirks of floating-point semantics and compares at the binary level.

@@ +1155,5 @@
> +    if (predicted == actual)
> +        return true;
> +
> +    // Not equal, but this might be because they're both NaNs.
> +    return IsNaN(predicted) && IsNaN(actual);

So instead of handling special cases like this, how about actually comparing data at binary level?

See this example program that I made just to try out using a union for performing this reinterpretation, as I think it's the right way, see the comment:

  #include <iostream>
  #include <cstdint>
  #include <limits>

  using namespace std;

  uint32_t ReinterpretFloatAsUint32(float x)
  {
    // unions are a better way than reinterpret_cast,
    // as they sidestep any issues with pointer aliasing.
    union FloatOrUint32 {
      uint32_t mUint32;
      float mFloat;
      FloatOrUint32(float f) : mFloat(f) {}
    };
    FloatOrUint32 u(x);
    return u.mUint32;
  }

  int main() {
    float x = 0.f;
    cout << "positive zero: as float " << x << ", as uint32 " << ReinterpretFloatAsUint32(x) << endl;
    x = -0.f;
    cout << "negative zero: as float " << x << ", as uint32 " << ReinterpretFloatAsUint32(x) << endl;
    x = 12;
    cout << "twelve: as float " << x << ", as uint32 " << ReinterpretFloatAsUint32(x) << endl;
    x = numeric_limits<float>::denorm_min();
    cout << "smallest positive denormal: as float " << x << ", as uint32 " << ReinterpretFloatAsUint32(x) << endl;
  }

@@ +1202,5 @@
>          gl->fGetBooleanv(LOCAL_GL_DEPTH_WRITEMASK, &depthWriteMask);
>          gl->fGetFloatv(LOCAL_GL_DEPTH_CLEAR_VALUE, &depthClearValue);
>  
> +        MOZ_ASSERT(depthWriteMask == mDepthWriteMask);
> +        MOZ_ASSERT(IsPredictionCorrect(mDepthWriteMask, depthClearValue));

It seems that this line introduces a bug by replacing depthClearValue by mDepthWriteMask.
Attachment #761260 - Flags: review?(bjacob) → review-
How about BitwiseCast?
(In reply to Jesse Ruderman from comment #6)
> Last hunk of the patch looks wrong.
> 
> Do you care what happens when you compare 0 against -0?

Note that 0.f == -0.f, so we don't need any special-casing here. The special-casing is for cases where operator== returns false even though we would like it to return true (NaN case).
(In reply to :Ms2ger from comment #8)
> How about BitwiseCast?

What's that? I DXR'd for it and didn't see any results. If we do already have some MFBT helper for that, then for sure we should use it.
Yeah, bug 798179 landed yesterday.
ReinterpretFloatAsUint32 is totally wrong here.

* There's no guarantee that sizeof(float) == sizeof(uint32_t).  If you're going to rely on it, at least add a static assertion.

* Aliasing through a union is undefined behavior. The right way to reinterpret bits is to use memcpy.  http://blog.regehr.org/archives/959

* ReinterpretFloatAsUint32 treats two NaNs with different low bits as unequal.  This is almost certainly not what you want.  As the JS team has recently learned, calling Math functions or even negating a NaN can easily give you "non-canonical" NaNs.

* ReinterpretFloatAsUint32 treats 0.f and -0.f as unequal.  It's not obvious that you want this here.
(In reply to Jesse Ruderman from comment #12)
> ReinterpretFloatAsUint32 is totally wrong here.
> 
> * There's no guarantee that sizeof(float) == sizeof(uint32_t).  If you're
> going to rely on it, at least add a static assertion.

You're right, this should be guarded by a static assert.

> 
> * Aliasing through a union is undefined behavior. The right way to
> reinterpret bits is to use memcpy.  http://blog.regehr.org/archives/959

Oh. That's news to me. memcpy would be really bad, though --- unless the compiler optimizes it, that would be a huge overhead.

What is, then, the right way? Does MFBT's new hotness do the right thing?

> * ReinterpretFloatAsUint32 treats two NaNs with different low bits as
> unequal.  This is almost certainly not what you want.

You're right, I overlooked that. Please ignore my entire "do a binary-level comparison" then, it's invalid.

> * ReinterpretFloatAsUint32 treats 0.f and -0.f as unequal.  It's not obvious
> that you want this here.

Ouch, that is not what I want here. Another way in which my binary-compare idea was wrong.

Jeff: please still address the other review comments.
Attached patch patch (obsolete) — Splinter Review
Let's do this one step at a time.

Let's start with doing a binary comparison. This does leave us open to messing up when values are clamped, but we should really handle these as well. Effectively, the values we shadow should *match exactly* the internal state GL has. Thus, we should only need to do a binary comparison.
Attachment #761260 - Attachment is obsolete: true
Attachment #761837 - Flags: review?(bjacob)
Sorry --- my suggestion about binary comparison above was wrong, see above comments. Please ignore it!
Attachment #761837 - Flags: review?(bjacob) → review-
Comment on attachment 761837 [details] [diff] [review]
patch

No, I think we should be doing simple binary comparisons until drivers throw other cases in our face, which I don't think is *that* likely. We should at least explore this path before investing a chunk more work in doing this correctly. (DEBUG builds to query info sent to GL immediately, check shadowing against that)
Attachment #761837 - Flags: review- → review?(bjacob)
Comment on attachment 761837 [details] [diff] [review]
patch

Review of attachment 761837 [details] [diff] [review]:
-----------------------------------------------------------------

"Doing this more correctly" won't require more work, it's just a matter of going back to the approach in your previous patch (with isnan), which was the right one, and I shouldn't have proposed another approach. Sorry!
Attachment #761837 - Flags: review?(bjacob) → review-
Comment on attachment 761837 [details] [diff] [review]
patch

Review of attachment 761837 [details] [diff] [review]:
-----------------------------------------------------------------

I think I actually still disagree. I'm all for more correct solutions, but it really seems like a rathole to me. I still think we're better off being very strict and simple, unless this blows up in our face.

Let's talk tomorrow.
Attachment #761837 - Attachment is obsolete: true
Attachment #763006 - Flags: review?(bjacob)
Attachment #763006 - Flags: review?(bjacob) → review+
https://hg.mozilla.org/mozilla-central/rev/4c8487a3c08a
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
See Also: → 864043
Yeah, that patch was bad. It only tested [0] and not 1/2/3.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla24 → ---
Comment on attachment 766140 [details] [diff] [review]
patch: actually check the all the channels, instead of just channel 0 four times.

Review of attachment 766140 [details] [diff] [review]:
-----------------------------------------------------------------

Oops, reviewer fail, sorry.
Attachment #766140 - Flags: review?(bjacob) → review+
https://hg.mozilla.org/mozilla-central/rev/01a1bbadba47
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Bad news. My linux+NV machine appears to convert NaNs to 0.0.
We should have a mochitest to test this.

I think we should probably give up on shadowing NaNs properly, and instead only assert shadowing was successful for non-NaNs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This actually fixes this testcase.
Attachment #769956 - Flags: review?(bjacob)
Comment on attachment 769956 [details] [diff] [review]
patch: Consider shadowed NaNs always matching

Review of attachment 769956 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/canvas/src/WebGLContext.cpp
@@ +1209,5 @@
> +        // GL is allowed to do anything it wants for NaNs, so if we're shadowing
> +        // a NaN, then whatever `actual` is might be correct.
> +        return true;
> +    }
> +    

trailing \w
Attachment #769956 - Flags: review?(bjacob) → review+
We will need to do a follow-up test with other floating-point toys, like -0 and +/-INF.
https://hg.mozilla.org/mozilla-central/rev/60baa46c0b44
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.