Closed Bug 998742 Opened 6 years ago Closed 6 years ago

webgl tests will fail on Windows reftest-no-accel when Gecko 31 merges to mozilla-aurora on April 28th

Categories

(Core :: Graphics: Layers, defect, blocker)

x86
Windows 7
defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla31
Tracking Status
firefox31 + fixed

People

(Reporter: philor, Assigned: jgilbert)

References

Details

(Whiteboard: webgl-correctness [qa-])

Attachments

(1 file, 4 obsolete files)

Starting sometime in the last couple of weeks, some change has resulted in https://tbpl.mozilla.org/?tree=Try&rev=4dd3d4a40c56 - current trunk built as it will be built on aurora when 31 merges there in a week completely fails multiple parts of webgl-clear-test.html, webgl-orientation-test.html, webgl-color-test.html and webgl-color-alpha-test.html on Windows reftest-no-accel. Rather surprising, since the only NIGHTLY_BUILD ifdefs in gfx seem to be MOZ_WIDGET_GTK and MOZ_X11 for non-NIGHTLY_BUILD.
We just disabled alpha reftests somewhere else too. We disabled them in another bug temporarily, but I lost it. I think it's time to block on this.

:mattwoodrow, do you still know which bug that was?
Flags: needinfo?(matt.woodrow)
That was bug 990510.
Flags: needinfo?(matt.woodrow)
See Also: → 990510
Failures:


webgl-clear-test.html?__&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?aa&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?__&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?aa&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?readback&__&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?readback&aa&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?readback&__&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-clear-test.html?readback&aa&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 65536
webgl-orientation-test.html?__&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?aa&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?__&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?aa&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?readback&__&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?readback&aa&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?readback&__&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-orientation-test.html?readback&aa&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 49152
webgl-color-test.html?__&alpha&_____&_______&________&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&_____&_______&________&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&depth&_______&________&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&depth&_______&________&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&_____&_______&preserve&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&_____&_______&preserve&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&depth&_______&preserve&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&depth&_______&preserve&_______ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&_____&_______&________&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&_____&_______&________&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&depth&_______&________&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&depth&_______&________&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&_____&_______&preserve&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&_____&_______&preserve&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?__&alpha&depth&_______&preserve&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?aa&alpha&depth&_______&preserve&stencil | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?readback&__&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?readback&aa&alpha&________ | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?readback&__&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-test.html?readback&aa&alpha&preserve | image comparison (==), max difference: 255, number of differing pixels: 32768
webgl-color-alpha-test.html?colorVal=1.0&alphaVal=0.5&alpha | image comparison (==), max difference: 128, number of differing pixels: 32768
webgl-color-alpha-test.html?colorVal=0.5&alphaVal=0.5&alpha | image comparison (==), max difference: 128, number of differing pixels: 65536
The failures are all alpha:true, premultAlpha:false.
Any reason non-accel premultAlpha:false might be broken?
Flags: needinfo?(bas)
Whiteboard: webgl-correctness
Tracking for now
Three days.
Perhaps it's:
http://mxr.mozilla.org/mozilla-central/source/browser/app/profile/firefox.js#1107

#if defined(NIGHTLY_BUILD)
// browser.tabs.remote is enabled on nightly. However, users won't
// actually get remote tabs unless they enable
// browser.tabs.remote.autostart or they use the "New OOP Window" menu
// option.
pref("browser.tabs.remote", true);
#else
pref("browser.tabs.remote", false);
#endif
pref("browser.tabs.remote.autostart", false);
Here's the reason it renders wrong:
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/LayerUtils.cpp#53

With normal trunk, we hit the IS_LITTLE_ENDIAN block.
BUT, with pseudo-aurora configs, we *don't*. We get the path as if we were on big endian.
Flags: needinfo?(bas)
I suspect this is an issue with unified builds.
Yep, this is a unified build issue. With unified builds disabled (like non-nightly is):
Assertion failure: !IsLittleEndian() (Defined as big endian, but actually little!)
Assignee: nobody → jgilbert
Component: Canvas: WebGL → Graphics: Layers
Attachment #8412993 - Flags: review?(bgirard)
Attachment #8412993 - Attachment description: assert-endian → patch 1: Assert correct endianness
Attachment #8412997 - Flags: review?(bgirard)
Comment on attachment 8412993 [details] [diff] [review]
patch 1: Assert correct endianness

Review of attachment 8412993 [details] [diff] [review]:
-----------------------------------------------------------------

Ok but please put a comment as to why you're doing this (unified build). People tend to go WTF when you assert (1 == 1)
Attachment #8412993 - Flags: review?(bgirard) → review+
I was told ehsan might be interested in this.
IS_LITTLE_ENDIAN is unfortunately defined by a bunch of unrelated headers (qcms, jscpuconfig, some NSPR stuff etc), so presumably there is a bug in one of them.  I think a better way to fix this bug is to use MOZ_LITTLE_ENDIAN from mozilla/Endian.h instead which is guaranteed to be defined only once.

Nathan, did we have plans to move away from IS_LITTLE_ENDIAN?  We probably should do that sooner rather than later.
Flags: needinfo?(nfroyd)
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #16)
> Nathan, did we have plans to move away from IS_LITTLE_ENDIAN?  We probably
> should do that sooner rather than later.

We did.  I have slowly been converting things to use it, but apparently not fast enough.
Flags: needinfo?(nfroyd)
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #16)
> IS_LITTLE_ENDIAN is unfortunately defined by a bunch of unrelated headers
> (qcms, jscpuconfig, some NSPR stuff etc), so presumably there is a bug in
> one of them.  I think a better way to fix this bug is to use
> MOZ_LITTLE_ENDIAN from mozilla/Endian.h instead which is guaranteed to be
> defined only once.
> 
> Nathan, did we have plans to move away from IS_LITTLE_ENDIAN?  We probably
> should do that sooner rather than later.

Alright, we can do this then.
Attached patch patch (obsolete) — Splinter Review
I only need one of you, but the sooner the better. We want this in before Monday.
Attachment #8412993 - Attachment is obsolete: true
Attachment #8412997 - Attachment is obsolete: true
Attachment #8412997 - Flags: review?(bgirard)
Attachment #8413042 - Flags: review?(bjacob)
Attachment #8413042 - Flags: review?(bgirard)
Attachment #8413042 - Flags: review?(bas)
Comment on attachment 8413042 [details] [diff] [review]
patch

Review of attachment 8413042 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/LayerUtils.cpp
@@ +26,5 @@
> +    uint16_t testShort;
> +    uint8_t* testBytes = (uint8_t*)&testShort;
> +    testBytes[0] = 0xAA;
> +    testBytes[1] = 0xBB;
> +    return testShort == 0xBBAA;

Please write this as:

uint16_t testShort;
static const uint8_t testBytes[] = { 0xAA, 0xBB };
memcpy(&testShort, testBytes, sizeof(testBytes));
return testShort == 0xBBAA;

so as to not fall afoul of aliasing violations.
Comment on attachment 8413042 [details] [diff] [review]
patch

Review of attachment 8413042 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/LayerUtils.cpp
@@ +23,5 @@
> +#ifdef DEBUG
> +static bool IsLittleEndian()
> +{
> +    uint16_t testShort;
> +    uint8_t* testBytes = (uint8_t*)&testShort;

This violates strict aliasing, but that's OK, because we disable it, and bug 414641 is WONTFIX.

Make this a reinterpret_cast though.
Attachment #8413042 - Flags: review?(bjacob) → review+
Ah, I hadn't seen Nathan's comment 25. Nathan, you know better, decide whether this overrides my r+ :-)
Flags: needinfo?(nfroyd)
Also, given the extreme time constraints there, maybe it's OK to land, and only do the right things as a follow-up.
Oops, I'm starting to reference future things, though. The causality violation is getting harder to conceal. I meant comment 20 :-)
Also, Nathan, how certain are we that a memcpy would actually avoid undefined behavior? Isn't *any* strict aliasing violation UB by definition? I understand that if memcpy is implemented as a non-inline function, then the problem doesn't happen in practice, but here, we have a 2-byte memcpy, which is likely implemented as a compiler builtin. So if we took this road, I would feel more comfortable about a custom __attribute__((noinline)) function instead of memcpy.
(In reply to Benoit Jacob [:bjacob] from comment #25)
> Also, Nathan, how certain are we that a memcpy would actually avoid
> undefined behavior? Isn't *any* strict aliasing violation UB by definition?
> I understand that if memcpy is implemented as a non-inline function, then
> the problem doesn't happen in practice, but here, we have a 2-byte memcpy,
> which is likely implemented as a compiler builtin. So if we took this road,
> I would feel more comfortable about a custom __attribute__((noinline))
> function instead of memcpy.

I think we should just do this, because we alias like this a lot in this sort of code. 
This is the comment I was going to leave in my version of the patch:

    // Violate strict aliasing, because violating strict aliasing is how
    // we always pack and unpack between uint32_t and uint8_t[].
(In reply to Nathan Froyd (:froydnj) from comment #20)
> Comment on attachment 8413042 [details] [diff] [review]
> patch
> 
> Review of attachment 8413042 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/LayerUtils.cpp
> @@ +26,5 @@
> > +    uint16_t testShort;
> > +    uint8_t* testBytes = (uint8_t*)&testShort;
> > +    testBytes[0] = 0xAA;
> > +    testBytes[1] = 0xBB;
> > +    return testShort == 0xBBAA;
> 
> Please write this as:
> 
> uint16_t testShort;
> static const uint8_t testBytes[] = { 0xAA, 0xBB };
> memcpy(&testShort, testBytes, sizeof(testBytes));
> return testShort == 0xBBAA;
> 
> so as to not fall afoul of aliasing violations.

Ok.
Attached patch patch (obsolete) — Splinter Review
r=bjacob
Attachment #8413042 - Attachment is obsolete: true
Attachment #8413042 - Flags: review?(bgirard)
Attachment #8413042 - Flags: review?(bas)
Attachment #8413084 - Flags: review+
Attached patch assert-endianSplinter Review
r=bjacob
Attachment #8413084 - Attachment is obsolete: true
Attachment #8413086 - Flags: review+
(In reply to Benoit Jacob [:bjacob] from comment #25)
> Also, Nathan, how certain are we that a memcpy would actually avoid
> undefined behavior? Isn't *any* strict aliasing violation UB by definition?
> I understand that if memcpy is implemented as a non-inline function, then
> the problem doesn't happen in practice, but here, we have a 2-byte memcpy,
> which is likely implemented as a compiler builtin. So if we took this road,
> I would feel more comfortable about a custom __attribute__((noinline))
> function instead of memcpy.

We are extremely certain that a memcpy would avoid undefined behavior.  The problem with:

  uint16_t tmp;
  uint8_t* ptr = (uint8_t*)&tmp;

is that we are now accessing the same storage (the storage for |tmp|) with two distinct types, which is disallowed (at least in this case, see below).  With memcpy, we are accessing two distinct storage locations and copying between them.  So, no aliasing problems, guaranteed.

The compiler is, of course, free to merge storage and whatnot here; its IR is not subject to this restriction.  I would expect that an optimizer would turn this "runtime" check into a compile-time constant.

(The nitpicky may point out that |uint8_t| is probably equivalent to |unsigned char|, which the C++ standard explicitly permits as a type through which you may access the storage for another object: [basic.lval]p10, n3337.  There are other cases where accessing things through not-exactly-the-same types is permitted, too; consult your C++ standard for the detailed summary.  So the original code is probably OK, though depending on how uint8_t is defined, the behavior may be undefined.  I think being explicit with the memcpy is a better habit to get into.)
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd (:froydnj) from comment #31)
> (In reply to Benoit Jacob [:bjacob] from comment #25)
> > Also, Nathan, how certain are we that a memcpy would actually avoid
> > undefined behavior? Isn't *any* strict aliasing violation UB by definition?
> > I understand that if memcpy is implemented as a non-inline function, then
> > the problem doesn't happen in practice, but here, we have a 2-byte memcpy,
> > which is likely implemented as a compiler builtin. So if we took this road,
> > I would feel more comfortable about a custom __attribute__((noinline))
> > function instead of memcpy.
> 
> We are extremely certain that a memcpy would avoid undefined behavior.  The
> problem with:
> 
>   uint16_t tmp;
>   uint8_t* ptr = (uint8_t*)&tmp;
> 
> is that we are now accessing the same storage (the storage for |tmp|) with
> two distinct types, which is disallowed (at least in this case, see below). 
> With memcpy, we are accessing two distinct storage locations and copying
> between them.  So, no aliasing problems, guaranteed.
> 
> The compiler is, of course, free to merge storage and whatnot here; its IR
> is not subject to this restriction.  I would expect that an optimizer would
> turn this "runtime" check into a compile-time constant.
> 
> (The nitpicky may point out that |uint8_t| is probably equivalent to
> |unsigned char|, which the C++ standard explicitly permits as a type through
> which you may access the storage for another object: [basic.lval]p10, n3337.
> There are other cases where accessing things through not-exactly-the-same
> types is permitted, too; consult your C++ standard for the detailed summary.
> So the original code is probably OK, though depending on how uint8_t is
> defined, the behavior may be undefined.  I think being explicit with the
> memcpy is a better habit to get into.)

Yep, I was seeing references to 'char' being immune to aliasing rules. I think we might just luck out in our uint8_t types. Regardless, the final version of the patch uses memcpy.
(In reply to Jeff Gilbert [:jgilbert] from comment #32)
> Yep, I was seeing references to 'char' being immune to aliasing rules. I
> think we might just luck out in our uint8_t types. Regardless, the final
> version of the patch uses memcpy.

jgilbert++

Thanks!
https://hg.mozilla.org/mozilla-central/rev/3d0432bbcb8f
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Whiteboard: webgl-correctness → webgl-correctness [qa-]
You need to log in before you can comment on or make changes to this bug.