Closed Bug 629016 Opened 13 years ago Closed 13 years ago

Permaorange reftests on OS X 64

Categories

(Core :: Graphics, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: joe, Assigned: joe)

References

Details

(Keywords: regression, Whiteboard: [hardblocker][has patch][blocked on os x 10.6.6 update])

Attachments

(2 files, 1 obsolete file)

REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/502288-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1a.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1b.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/526463-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/font-face/ex-unit-1-dynamic.html | image comparison (==)
These look like an invalidation problem.
Depends on: 629066
I pushed http://hg.mozilla.org/mozilla-central/rev/c06b04702cf8 to disable the reftests until we fix them.
Unfortunately one of the two snow leopard staging slaves I was using was updating to the wrong puppet server (it's supposed to talk with the puppet staging server rather than production).
I should have been able to catch this by mistrusting our staging systems.
joe I am very sorry.

Here is the run of the staging slave that actually had the right resolution:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1296066805.1296068280.30586.gz
and would have caught the perma-oranges.
I disabled another reftest which failed on OSX64 <http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296070581.1296071394.14625.gz>.  This was intermittent, but it was very similar to the rest of the failures in this bug.

http://hg.mozilla.org/mozilla-central/rev/7b2654a9c599
Armen: can you confirm that the test in comment 4 was also run on the same slave?
These failures kept happening on other tests too, which caused me to close mozilla-central.
I clarified on IRC but let me rephrase it here.

- a slave on staging was using the wrong screen resolution when we were baking there (it was receiving the screen resolution file from production rather than staging)
- the deployment this morning had no issues. all production slaves updated to the right new screen resolution
- up until now we were *blind* for reftests on 10.6; the new screen size allowed us to run reftests with hardware acceleration (disabling hw acceleration on 10.5 urged us to fix the screen resolution on 10.6)
- since the staging runs were wrong; they failed to catch the perma-oranges that joe disabled on comment 2

Here is my stub at trying to understand what our perma-oranges are.
If we look at [3], those are the last 4 perma-oranges we have revealed.

Before joe's change:
- 10.6 opt reftest - 5626/5/230 [1]
- 10.6 debug reftest - 5622/9/250 [2]

With joe's change:
- 10.6 opt reftest - 5626/0/235
- 10.6 debug reftest - 5622/4/255 [3]

Failures on [3] were already on [2].
Fixed failures on [1] are fixed as well on [3].

We need to disable the perma-oranges for [3]

reftest/tests/content/test/reftest/bug591981-1.html appeared after 8149e1a06476
and is a new perma-orange after that changeset.

###### [1] #######
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/502288-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1a.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1b.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/526463-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/font-face/ex-unit-1-dynamic.html | image comparison

###### [2] #######
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/502288-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1a.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/513153-1b.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/526463-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/font-face/ex-unit-1-dynamic.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-height-meet-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-height-slice-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-width-meet-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-width-slice-2.html | image comparison (==)

####### [3] #######
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-height-meet-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-height-slice-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-width-meet-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-width-slice-2.html | image comparison (==)

####### [4] #######
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/content/test/reftest/bug591981-1.html | image comparison (==)
I filed bug 629143 on investigating the img-* failures.  (I don't know about the other ones.)
One more near-permaorange since the resolution change (including the log from Comment 3):
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/content/test/reftest/bug591981-1.html | image comparison (==)
I tried repro'ing the bug591981-1.html failure and svg/as-image/* ones on desktop linux with GL enabled, but was unable.  Might be a cocoa widgetry issue or some characteristic of the test machines.
(In reply to comment #10)
Might be a cocoa widgetry issue or some characteristic of the test machines.
        ^exacerbated by
One more test disabled:

http://hg.mozilla.org/mozilla-central/rev/b3e0486f4e9b

Also, I made sure that other tests disabled because of this bug are only disabled on OSX64:

http://hg.mozilla.org/mozilla-central/rev/de488f6c12cf
(In reply to comment #10)
> I tried repro'ing the bug591981-1.html failure and svg/as-image/* ones on
> desktop linux with GL enabled, but was unable

Same here, on my 64-bit mac mini running OS X 10.6, using one of the failing tinderbox builds (built from 0e7b88440e17).

Not sure why it happens on the tinderboxen macs but not on my mac. :(
(GL was enabled by default in the tinderbox build that I tried, btw -- verified in about:support)
I have confirmed that this is caused by http://hg.mozilla.org/mozilla-central/rev/31b46f48f714 . Unfortunately, backing out this patch (Bug 621778) is non-trivial, because people built on top of it.

This is a hardblocker because it doesn't just happen on reftests, it happens on real webpages.

This might only happen on 10.6.2; I have not yet been willing to upgrade a computer past that because my own, running 10.6.6, can't reproduce this bug.
blocking2.0: --- → betaN+
Whiteboard: [hardblocker]
Blocks: 621778
Keywords: regression
(In reply to comment #17)
> This might only happen on 10.6.2; I have not yet been willing to upgrade a
> computer past that because my own, running 10.6.6, can't reproduce this bug.

I'm on 10.6.5, and I can lend my computer to you for testing tomorrow if you need.
Is there a value to make a list of which version of 10.6.x it works on?
Or just checking that it works for > 10.6.5 is good enough?
There is value in figuring out what versions this works on, if it is indeed version-specific, because that'll let us more accurately blacklist versions of OS X.
I have confirmed that upgrading a slave to 10.6.6 (from 10.6.2) fixes this bug.
This is not a bug in Firefox. I've verified that all rendering is done properly, but OpenGL puts it in the wrong position. Further, I've verified that updating to 10.6.3 or later fixes this bug.

Unfortunately, this means we need to update all our 10.6 slaves, since we need to blacklist OpenGL-accelerated layers on 10.6.2 and earlier.
Depends on: 629509
Attachment #507763 - Flags: review?(jmuizelaar)
Attachment #507763 - Flags: review?(joshmoz)
Comment on attachment 507763 [details] [diff] [review]
only use OpenGL acceleration on 10.6.3 and above

This will still enable gl in xul accelerated windows. I'm not sure if that's ok or not. Other than that it looks fine.
Attachment #507763 - Flags: review?(jmuizelaar) → review+
Attachment #507763 - Flags: review?(joshmoz) → review+
Comment on attachment 507763 [details] [diff] [review]
only use OpenGL acceleration on 10.6.3 and above

>+  if (err1 == noErr && err2 == noErr && err3 == noErr &&
>+      major >= 10 && minor >= 6 && bugfix >= 3) {
>+    accelerateByDefault = PR_TRUE;

This if test is not correct.  It will decide that version 11.0.0 is less than 10.6.3.
THis needs to be something more like:
     if (err1 == noErr && err2 == noErr && err3 == noErr &&
        (major > 10 || ((major == 10 && minor > 6) ||
        (major == 10 && minor == 6 && bugfix >=3)))
I might be simpler to do something like 

      (major * 1000000 + minor *1000 + bugfix) >= 10006003
(In reply to comment #26)
> I might be simpler to do something like 
> 
>       (major * 1000000 + minor *1000 + bugfix) >= 10006003

Indeed. I think this is the best option.
Hmm. Is the patch just checked in for Bug 628384 going to override this fix?
Yes, but that's ok; our test machines should always be in a working state.
Bill's right that the logic before was wrong. I don't see any reason why we should apply our logic to non-10.6 OSes, though, because 10.5 had no problems, and we know nothing about > 10.6. So let's tweak this a little.
Attachment #507987 - Flags: review?(jmuizelaar)
(In reply to comment #29)
> Yes, but that's ok; our test machines should always be in a working state.

I thought the point of this patch was to fix the orange on the test machines by blacklisting them because they were an old OS.
No, we're going to fix our test machines by updating them to 10.6.6 (bug 629509). This is to fix the few users who are still on 10.6.0, 10.6.1, and 10.6.2.
(In reply to comment #32)
> No, we're going to fix our test machines by updating them to 10.6.6 (bug
> 629509). This is to fix the few users who are still on 10.6.0, 10.6.1, and
> 10.6.2.

OK.  Thanks for the clarification.
Comment on attachment 507987 [details] [diff] [review]
v2: only apply our blocking logic on 10.6

Maybe this should've been done in the blacklisting code. That would make it easier to expose the reason for the block to about:support.
Attachment #507987 - Flags: review?(jmuizelaar) → review+
Whiteboard: [hardblocker] → [hardblocker][has patch]
Whiteboard: [hardblocker][has patch] → [hardblocker][has patch][blocked on os x 10.6.6 update]
Attachment #507763 - Attachment is obsolete: true
Now that we have 10.6.2 and below blocked for OpenGL, we can re-enable the tests we disabled because the OS X bug caused them to fail.
Attachment #509545 - Flags: review?(jmuizelaar)
Attachment #509545 - Flags: review?(jmuizelaar) → review+
http://hg.mozilla.org/mozilla-central/rev/0ab68a939a45
http://hg.mozilla.org/mozilla-central/rev/c6993d965d0a
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
is this included not in beta 11, but in beta 12 ?
I believe you're correct -- at least at one point, the beta11 was tagged at
  http://hg.mozilla.org/mozilla-central/rev/49c3ca45a79f
which is older than the csets in Comment 36 (based on |hg log|).
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: