Closed Bug 931344 Opened 11 years ago Closed 7 years ago

Make sure Shader Editor tests run on Linux machines

Categories

(DevTools Graveyard :: WebGL Shader Editor, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: vporof, Unassigned)

Details

Attachments

(1 file)

Most linux slaves run tests with webgl.force-enabled as true, so in many cases the shader editor tests don't run at all. I think we're unnecessarily paranoid about the force-enabled flag, but only time will tell :)
Sorry, backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/8a514666c2ac for ASan browser-chrome permaorange like https://tbpl.mozilla.org/php/getParsedLog.php?id=29714152&tree=Mozilla-Inbound. I presume I'm just wallpapering over bustage that still exists by backing it out, but at least with it out, we get to run the rest of the tests through ASan.
Attached file log
Relevant part of the log.
Halp! :)

There's a linux asan permaorange/crasher when enabling the shader editor tests (added in bug 910953) on that platform, when webgl is force-enabled. This happens when a fragment shader is recompiled with a different source and relinked with its parent program.

This has been silent until now because the tests weren't running when detecting force-enabled webgl, which seems to be the case on most linux slaves (see comment 0). Bypassing this check makes asan sad.
Flags: needinfo?(bjacob)
I would like to help, but am not sure that I understand the question. Is the problem that you are trying to turn on a test, but that test runs into a driver bug on some slaves? If so, are you looking for a work-around, or a way to selectively avoid running the code on affected drivers?
Flags: needinfo?(bjacob)
(In reply to Benoit Jacob [:bjacob] from comment #6)
> I would like to help, but am not sure that I understand the question. Is the
> problem that you are trying to turn on a test, but that test runs into a
> driver bug on some slaves? If so, are you looking for a work-around, or a
> way to selectively avoid running the code on affected drivers?

A short while ago a new developer tool landed (bug 910953 for the backend, bug 910955 for the frontend). It's a WebGL shader editor. It has some tests. Every time a test starts, it asks the following questions [0]:
- is WebGL *not* force-enabled?
- is WebGL marked as being supported?
- does creating a canvas with a WebGL context actually work?

If all of the above conditions are true, the test runs. Otherwise, it's finalized before anything WebGL-related is performed.

This bug is about removing the first check, "is WebGL *not* force-enabled?", for two reasons:
- gaining better coverage, because almost all linux slaves run tests with webgl being force-enabled
- theoretically, webgl being force-enabled shouldn't affect the usability of this tool

After removing the first check [1], asan linux has become perma-orange, with some shader editor tests failing [2] (there's only one failure in this link because the process was terminated immediately). All other plaforms and linux configurations seem to be fine [3].

I would say that *not* running the shader editor tests on almost all linux test slaves because the force-enabled flag is much too cautious and a bad idea, because we're not getting enough coverage. It would be much better to remove the force-enabled check for the tests (especially since regular linux builds seem to be fine), but it can't be done at the moment because of the asan permaorange.

[0] http://dxr.mozilla.org/mozilla-central/source/browser/devtools/shadereditor/test/head.js#l104
[1] https://hg.mozilla.org/integration/mozilla-inbound/rev/fb8d9cb4698b
[2] https://tbpl.mozilla.org/php/getParsedLog.php?id=29714152&tree=Mozilla-Inbound
[3] https://tbpl.mozilla.org/?tree=Try&rev=a6a89157fbcf
Flags: needinfo?(bjacob)
Priority: -- → P2
Oh, I see. I seem to remember that indeed some Mesa/llvmpipe software GL driver that we are using on some Ubuntu test slaves, have a memory safety bug that is triggered by WebGL, and that made it necessary to turn off at least some WebGL tests there. But I don't remember the details. :decoder is the one who was working on that; :jgilbert too is more current than I am on these matters.

Anyway, from what you say, it sounds like what we want is to run your test on standard builds, but specifically not on ASan builds. But better discuss with :decoder and :jgilbert.
Flags: needinfo?(bjacob)
Thanks for the reply Benoit! Deferring..

(see comment 7)
Flags: needinfo?(jgilbert)
Flags: needinfo?(choller)
Judging from the log, one of your tests triggers a use-after-free in the swrast_dri driver. I strongly recommend to *not* run the particular test on *any* Linux build with this driver. Even if the test is green, it can cause memory corruptions and lead to intermittent failures later. You should specifically try to figure out which subset of your tests makes ASan orange and then disable these on all Linux platforms using the buggy drivers.

If you need any further help, please let me know! :)
Flags: needinfo?(choller)
(In reply to Christian Holler (:decoder) from comment #10)
> You should specifically try to figure out which subset of your tests makes ASan orange
> and then disable these on all Linux platforms using the buggy drivers.
> 

The problem with the Shader Editor's tests is that almost all of them will cause the memory corruption, which is triggered by recompiling (*editing*) a shader :)

Is webgl.force-enabled true only on slaves with bad drivers?

If webgl.force-enabled is always true, no tests will ever run on Linux, which is very bad for coverage.

If webgl.force-enabled is only true for slaves with bad drivers, then this bug is already fixed, because the tests run only when force-enabled is false (although it still makes me nervous that I've never yet seen Shader Editor tests being run on Linux when looking at tbpl logs).

If weblg.force-enabled being true is a bad way of detecting faulty drivers, is there something better that can be done here instead?
I think all our Linux slaves use the bad drivers, so your tests will never run there. The proper solution is to switch to newer drivers on the slaves (I think the more modern llvmpipe would be right, not the old swrast_dri driver).
(In reply to Christian Holler (:decoder) from comment #12)
> I think all our Linux slaves use the bad drivers, so your tests will never
> run there.

All shader editor tests are green for try runs on linux builds without asan. However, that still might mean memory bugs are lurking and could lead to intermittent failures later.

Apparently, tests can be disabled on asan builds like this: [0]

[0] https://mxr.mozilla.org/mozilla-central/source/toolkit/components/aboutmemory/tests/Makefile.in#6
As I said before, the memory corruptions are present with and without ASan. It does not solve the problem to disable the tests under ASan, it just hides them. The consequences are unpredictable and imho it does not make much sense to run tests that cause memory corruptions and then rely on their results. The only proper way to do this is to get rid of the broken drivers. And we should do that anyway.
(In reply to Christian Holler (:decoder) from comment #14)
> As I said before, the memory corruptions are present with and without ASan.

Yeah, completely agreed.

> The only proper way to do this is to get rid of the broken drivers.
> And we should do that anyway.

That would be excellent.
Summary: Relax the check for webgl availability in the shader editor tests → Make sure Shader Editor tests run on Linux machines
Assignee: vporof → nobody
You should definitely land this fix (allow force-enabled test runs), but mark your now-failing tests as random/broken on the platforms where there are as such. This will at least give us partial coverage.
Flags: needinfo?(jgilbert) → needinfo?(vporof)
(In reply to Jeff Gilbert [:jgilbert] from comment #16)
> Mark your now-failing tests as random/broken on the platforms where there
> are as such. This will at least give us partial coverage.

How do I do that?
Flags: needinfo?(vporof)
Flags: needinfo?(jgilbert)
The drivers should be updated instead.
Flags: needinfo?(jgilbert)
Mass-closing inactive (2 years+) bugs on unmaintained devtools components.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Product: Firefox → DevTools
Product: DevTools → DevTools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: