Closed Bug 1338814 Opened 3 years ago Closed 2 years ago

Quantum Render Start up crash on Fedora 25 ESX VM

Categories

(Core :: Graphics: WebRender, defect)

54 Branch
x86_64
Linux
defect
Not set

Tracking

()

VERIFIED DUPLICATE of bug 1372880
Tracking Status
firefox54 --- affected

People

(Reporter: bc, Unassigned)

References

(Blocks 3 open bugs)

Details

(Keywords: crash)

Crash Data

Attachments

(4 files)

Attached file Crash Report opt
Attempting to get Quantum Render builds running in Bughunter, I found a startup crash on Fedora 25 kernel 3.9.8 for both opt and debug builds. This does not happen on Ubuntu 16.04 kernel 4.4.0-62. This is with the latest Linux64-qr builds from mozilla-central.

However, bp-f7217c3f-225a-42b6-b294-5eb9f2170211 doesn't contain useful symbols.

opt [@ mozilla::wr::WebRenderAPI::SetRootPipeline]
Attached file opt crash dump extra
Attached file Crash Report debug
Debug asserts on startup:

Assertion failure: api, at /home/worker/workspace/build/src/gfx/layers/ipc/CompositorBridgeParent.cpp:1594
PS. Doesn't crash my real Fedora 25 workstation though it tends to lock up my display for a bit when it's starting.
The most likely cause is that the graphics environment in the VM doesn't allow initializing the GL context, so the code at [1] is failing. That leaves the WrAPI pointer null, which would cause this crash.

FWIW we have a similar problem on Ubuntu 14.04, which is why we're not yet running the mochitests that run on the old Ubuntu 14.04 docker images.

Do you happen to have the console output from before the crash? That will probably help confirm my diagnosis.

[1] https://hg.mozilla.org/projects/graphics/file/3c300355a94d/gfx/webrender_bindings/WebRenderAPI.cpp#l63
When I said "Ubuntu 14.04" above I really meant "Ubuntu 12.04".

Also, if the VM environment only supports OpenGL 2.1 we won't support it in QR for now (https://bugzilla.mozilla.org/show_bug.cgi?id=1334189#c1). In that case we would need to either upgrade the VM's OpenGL support or just not run it in that environment.
Attached file debug console output
The opt crash showed

ATTENTION: default value of option force_s3tc_enable overridden by environment.
ExceptionHandler::GenerateDump cloned child 7415ExceptionHandler::WaitForContinueSignal waiting for continue signal...

ExceptionHandler::SendContinueSignalToChild sent continue signal to child

But that was it. Attaching the debug assertion console output.

Note this does work on Ubuntu 16.04 in a VM.

I can easily reproduce this on the VM. If you want to remote control me or want access to an appropriate vm, let me know. You'll need vpn + some special access sauce.
Yeah the debug output you attached has this:

[4432] WARNING: Failed to create GLXContext!: file /home/worker/workspace/build/src/gfx/gl/GLContextProviderGLX.cpp, line 885

which indicates a GL context initialization failure and confirms my theory. Ubuntu 16.04 supports a newer OpenGL, so it'll work there.
I have complete control over what is installed on the vms. If you think installing a system package or building OpenGL on these vms would work, I can do that.
$ glxinfo | grep OpenGL
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.8, 256 bits)
OpenGL version string: 2.1 Mesa 13.0.3
OpenGL shading language version string: 1.30
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 13.0.3
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16
OpenGL ES profile extensions:

cknowles: Any hope of an updated version from VMware?
Flags: needinfo?(cknowles)
ubuntu says:

mozauto@ubuntu-01 ~ (sisyphus-prd)
$ glxinfo | grep OpenGL
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.8, 256 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 12.0.6
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 12.0.6
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 12.0.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:
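
For comparing the two glxinfo dumps above, here is a minimal Python sketch that pulls the highest desktop OpenGL version out of `glxinfo | grep OpenGL` output. It is not part of Gecko or Bughunter; the function names are made up for illustration. The thread only establishes that 2.1 is too old for QR and does not state the exact minimum, so the check treats anything newer than 2.1 as a candidate, which is an assumption.

```python
import re

# Assumption: the thread only says OpenGL 2.1 is not supported by QR.
# The real minimum is not stated, so "newer than 2.1" is used as a rough check.
TOO_OLD = (2, 1)

# Matches "OpenGL version string: 2.1 Mesa ..." and
# "OpenGL core profile version string: 3.3 (Core Profile) ...".
# It does not match the "OpenGL ES profile" or "shading language" lines.
_VERSION_RE = re.compile(r"OpenGL (?:core profile )?version string: (\d+)\.(\d+)")


def best_gl_version(glxinfo_text):
    """Return the highest desktop OpenGL (major, minor) found, or None."""
    versions = []
    for line in glxinfo_text.splitlines():
        match = _VERSION_RE.match(line.strip())
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    return max(versions) if versions else None


def newer_than_gl21(glxinfo_text):
    """True if the reported desktop OpenGL version is newer than 2.1."""
    version = best_gl_version(glxinfo_text)
    return version is not None and version > TOO_OLD
```

Fed the Fedora 25 output, `best_gl_version` picks up only the 2.1 compatibility version; fed the Ubuntu 16.04 output, it picks the 3.3 core profile over the 3.0 compatibility version.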
I'm sorry - I'm not understanding - I think I'm seeing that OpenGL > 2.1 is needed - in comment 9 you show a VM that is 2.1 - and in comment 10 you show another VM that is 3.0.

You're at the latest VM hardware version. You're also at the latest tools version. From a VM perspective you're as up to date as I can get you. There is a newer version of ESX, and we're planning towards that upgrade, but there are currently blockers in our path.
Flags: needinfo?(cknowles)
cknowles: Thanks. The first was Fedora 25 and the second was Ubuntu 16.04.

I think I can enable testing on Ubuntu only and we can WONTFIX this if that is appropriate.
See Also: → 1336436
With bug 1372880 this should no longer crash on startup, but it will fall back to non-webrender if webrender startup fails. That's as good as it's going to get on the gecko side, I think. If it's still failing on Fedora and we want to run it there we'll need to update the Fedora VM to a newer OpenGL.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1372880
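
To illustrate the behavior described in the resolution (fall back to non-webrender when webrender startup fails, instead of dereferencing a null API pointer), here is a rough Python model of the pattern. It is not the actual Gecko change from bug 1372880, which is not shown in this thread; every name below is hypothetical.

```python
class GLContextError(Exception):
    """Hypothetical stand-in for 'Failed to create GLXContext!'."""


def create_webrender_api(create_gl_context):
    """Return a handle for the WebRender API, or None if GL setup fails.

    `create_gl_context` is a hypothetical callable that returns a GL
    context object, returns None, or raises GLContextError.
    """
    try:
        gl_context = create_gl_context()
    except GLContextError:
        return None
    if gl_context is None:
        return None
    return {"gl": gl_context}


def choose_compositor(create_gl_context):
    """Pick 'webrender' if its API could be created, else 'basic'."""
    api = create_webrender_api(create_gl_context)
    if api is None:
        # Using `api` here without this check is the analogue of the
        # null WrAPI pointer crash reported in this bug.
        return "basic"
    return "webrender"
```

On a VM like the Fedora 25 one, where GL context creation fails, this model would select the basic compositor rather than crash.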
That is fine, thanks. It will at least prevent others from automatically crashing if webrender is enabled.
Status: RESOLVED → VERIFIED