Closed Bug 1422051 Opened 7 years ago Closed 5 years ago

Categories

(Core :: Graphics: WebRender, defect, P5)

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr52 --- unaffected
firefox57 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox62 --- disabled
firefox63 --- disabled
firefox64 --- disabled

People

(Reporter: linuxhippy, Assigned: kvark, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: correctness, nightly-community, regression, Whiteboard: [wr-reserve])

Attachments

(4 files)

Attached image css-fps-webrender.png
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20171130101246

Steps to reproduce:

Enable webrender + blob images


Actual results:

open: http://keithclark.co.uk/labs/css-fps/nojs/


Expected results:

rendering completely broken
Thank you! Confirmed in Nightly 59 x64 20171130101246 de_DE @ Debian Testing (KDE, Radeon RX480).
fresh profile: layers.acceleration.force-enabled, gfx.webrender.enabled, gfx.webrender.blob-images, image.mem.shared, layout.display-list.retain;false

This problem does not occur in a fresh profile with layers.acceleration.force-enabled, layout.display-list.retain;false.
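
For anyone reproducing this, the pref list above can be written into a fresh profile's user.js. The sketch below only generates the lines; it assumes the four prefs listed without a value were set to true, since only layout.display-list.retain is given an explicit value (false) in the comment.

```python
# Prefs named in the confirmation comment. The True values are an assumption:
# the comment only gives an explicit value for layout.display-list.retain.
PREFS = {
    "layers.acceleration.force-enabled": True,
    "gfx.webrender.enabled": True,
    "gfx.webrender.blob-images": True,
    "image.mem.shared": True,
    "layout.display-list.retain": False,
}


def to_user_js(prefs):
    """Render a dict of boolean prefs as user.js lines."""
    lines = []
    for name, value in prefs.items():
        js_value = "true" if value else "false"
        lines.append(f'user_pref("{name}", {js_value});')
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the output into user.js in the test profile's directory.
    print(to_user_js(PREFS))
```

Dropping the three WebRender-related entries from PREFS gives the second configuration, where the problem was reported not to occur.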
Status: UNCONFIRMED → NEW
Has STR: --- → yes
Component: Untriaged → Graphics: WebRender
Ever confirmed: true
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
Summary: Servere corruptions on fps-css with webrender enabled → Several corruptions on http://keithclark.co.uk/labs/css-fps/nojs/
Whiteboard: [wr-mvp] [triage]
Priority: -- → P3
Whiteboard: [wr-mvp] [triage] → [wr-reserve]
See Also: → 1422116
OS: Linux → All
Hardware: x86_64 → All
Even worse, on my system (AMD APU, 1024MB reserved video memory, RadeonSI), running the mentioned URL with WebRender seems to exhaust all available GPU memory.

The radeon kernel driver became really angry:

[13175.320818] [TTM] Failed to find memory space for buffer 0xffff9edf05a97868 eviction
[13175.320824] [TTM] No space for ffff9edf05a97868 (262144 pages, 1048576K, 1024M)
[13175.320827] [TTM]   placement[0]=0x00060002 (1)
[13175.320829] [TTM]     has_type: 1
[13175.320830] [TTM]     use_type: 1
[13175.320832] [TTM]     flags: 0x0000000A
[13175.320833] [TTM]     gpu_offset: 0x80000000
[13175.320836] [TTM]     size: 524288
[13175.320837] [TTM]     available_caching: 0x00070000
[13175.320839] [TTM]     default_caching: 0x00010000
[13175.373055] [TTM] Failed to find memory space for buffer 0xffff9edf05a97868 eviction
[13175.373061] [TTM] No space for ffff9edf05a97868 (262144 pages, 1048576K, 1024M)
[13175.373063] [TTM]   placement[0]=0x00060002 (1)
[13175.373065] [TTM]     has_type: 1
[13175.373066] [TTM]     use_type: 1
[13175.373068] [TTM]     flags: 0x0000000A
[13175.373069] [TTM]     gpu_offset: 0x80000000
[13175.373070] [TTM]     size: 524288
[13175.373072] [TTM]     available_caching: 0x00070000
[13175.373073] [TTM]     default_caching: 0x00010000
[13175.373222] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to parse relocation -12!
[13175.387882] [TTM] Failed to find memory space for buffer 0xffff9edf05a97868 eviction
[13175.387888] [TTM] No space for ffff9edf05a97868 (262144 pages, 1048576K, 1024M)
[13175.387890] [TTM]   placement[0]=0x00060002 (1)
[13175.387892] [TTM]     has_type: 1
[13175.387893] [TTM]     use_type: 1
[13175.387894] [TTM]     flags: 0x0000000A
[13175.387895] [TTM]     gpu_offset: 0x80000000
[13175.387896] [TTM]     size: 524288
[13175.387898] [TTM]     available_caching: 0x00070000
[13175.387899] [TTM]     default_caching: 0x00010000
[13176.603421] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.613074] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.635943] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.636422] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.636642] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.636819] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637022] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637196] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637369] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637607] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637784] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.637960] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.638131] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.638297] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.638465] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.638640] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.638839] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.639005] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
[13176.639181] radeon 0000:00:01.0: va above limit (0x002098E7 >= 0x00200000)
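
The size figures in the TTM lines above can be checked with a little arithmetic. A minimal sketch, assuming the usual 4 KiB page size on x86_64 (the page size is not stated in the log): the buffer that failed to find space is 262144 pages, which works out to the full 1024 MB of reserved video memory.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages, not stated in the log


def buffer_size(pages, page_size=PAGE_SIZE_BYTES):
    """Return (KiB, MiB) for a buffer of the given number of pages."""
    total_bytes = pages * page_size
    return total_bytes // 1024, total_bytes // (1024 * 1024)


kib, mib = buffer_size(262144)
print(f"{kib}K, {mib}M")  # compare with "1048576K, 1024M" in the log
```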
This should be improved by an upcoming architectural change that Glenn is doing.
Assignee: nobody → gwatson
Depends on: wr-3d
Glenn-- How many sites are likely to be affected by this bug?
Flags: needinfo?(gwatson)
It's a little hard to say - 3d transforms in general are quite rare, and this page certainly is a very good stress test (mix-blend-mode, clips and 3d transforms all in one!).

The current state of the 3d transforms branch I have makes this site much better, but not (yet) perfect. However, it may well be enough that it fixes the 3d transforms usage on *most* pages that use them (of which there aren't many).

For what it's worth, this page is also completely broken in non-WR Gecko on Linux (which is using the basic compositor) - it probably works on Windows and/or Mac?
Flags: needinfo?(gwatson)
Thanks, Glenn.  Definitely not a blocker for taking WR to Beta.  Sounds like it's even questionable whether this should block MVP ("ride-to-release").  We can decide that after we get WR into Beta.
Priority: P1 → P2
QA Whiteboard: [retest]
Kvark, can you look at some of the remaining correctness issues here?
Assignee: gwatson → kvark
Flags: needinfo?(kvark)
FYI, there are a number of known performance issues here (mix-blend-mode, clip masks), but those are out of scope for this bug. The remaining work in this bug is the correctness issues - which I suspect are plane-splitting / accuracy / sorting issues?
Jeff, I'll look at those. We seem to be almost there :)
Flags: needinfo?(kvark)
When running this now I get severe corruption (even into the browser chrome). Dzmitry, when do you think you'll get to this?
Flags: needinfo?(kvark)
Right after the ClipId revolution...
Flags: needinfo?(kvark)
With WebRender enabled, it crashes Firefox Nightly on Linux within several seconds:



[ 2273.116894] WARNING: stack recursion on stack type 4
[ 2273.116902] WARNING: can't dereference registers at 000000007d37c53e for ip swapgs_restore_regs_and_return_to_usermode+0x7e/0x87
[ 3415.558768] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
[ 3415.558807] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for command submission!
[ 3417.366093] show_signal_msg: 63 callbacks suppressed
[ 3417.366097] Chrome_~dThread[12463]: segfault at 0 ip 00007f3a2fbd489f sp 00007f3a27526a90 error 6 in libxul.so[7f3a2bfdb000+520d000]
[ 3417.366116] Code: 8b 0d 8d 55 7d 03 48 89 01 c7 04 25 00 00 00 00 51 01 00 00 e8 92 00 41 fc 48 8d 05 b2 ad 76 01 48 8b 0d 6c 55 7d 03 48 89 01 <c7> 04 25 00 00 00 00 b7 09 00 00 e8 71 00 41 fc 90 55 48 89 e5 41
[ 3417.451779] Chrome_~dThread[12594]: segfault at 0 ip 00007fb9f93b489f sp 00007fb9f0d06a90 error 6
[ 3417.451782] Chrome_~dThread[12544]: segfault at 0 ip 00007f5dd433c89f sp 00007f5dcbc8ea90 error 6
[ 3417.451790]  in libxul.so[7fb9f57bb000+520d000]
[ 3417.451794] Code: 8b 0d 8d 55 7d 03 48 89 01 c7 04 25 00 00 00 00 51 01 00 00 e8 92 00 41 fc 48 8d 05 b2 ad 76 01 48 8b 0d 6c 55 7d 03 48 89 01 <c7> 04 25 00 00 00 00 b7 09 00 00 e8 71 00 41 fc 90 55 48 89 e5 41
[ 3417.451796]  in libxul.so[7f5dd0743000+520d000]
[ 3417.451809] Code: 8b 0d 8d 55 7d 03 48 89 01 c7 04 25 00 00 00 00 51 01 00 00 e8 92 00 41 fc 48 8d 05 b2 ad 76 01 48 8b 0d 6c 55 7d 03 48 89 01 <c7> 04 25 00 00 00 00 b7 09 00 00 e8 71 00 41 fc 90 55 48 89 e5 41
[ 3417.451936] Chrome_~dThread[12630]: segfault at 0 ip 00007efce727489f sp 00007efcdebc6a90 error 6 in libxul.so[7efce367b000+520d000]
[ 3417.451941] Code: 8b 0d 8d 55 7d 03 48 89 01 c7 04 25 00 00 00 00 51 01 00 00 e8 92 00 41 fc 48 8d 05 b2 ad 76 01 48 8b 0d 6c 55 7d 03 48 89 01 <c7> 04 25 00 00 00 00 b7 09 00 00 e8 71 00 41 fc 90 55 48 89 e5 41


System Info:
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: AMD KAVERI (DRM 3.27.0, 4.19.5-300.fc29.x86_64, LLVM 7.0.0) (0x1313)
    Version: 18.2.4
    Accelerated: yes
    Video memory: 1024MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 4.4
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 254 MB, largest block: 254 MB
    VBO free aux. memory - total: 3039 MB, largest block: 3039 MB
    Texture free memory - total: 254 MB, largest block: 254 MB
    Texture free aux. memory - total: 3039 MB, largest block: 3039 MB
    Renderbuffer free memory - total: 254 MB, largest block: 254 MB
    Renderbuffer free aux. memory - total: 3039 MB, largest block: 3039 MB
Demoting because it's not realistic and more of a nice to have.
Priority: P2 → P4
@Jeff: I agree the demo itself is "not realistic" - however it also shows Firefox's 3D CSS transforms have major issues - all other browsers render the demo fine (while Firefox with latest RadeonSI even crashes hard).

The demo is now rendering correctly. It is crashing about 20 seconds in, but that is a separate bug - https://bugzilla.mozilla.org/show_bug.cgi?id=1522015 . This is not about visual quality any more.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
  1. it is better, but I still get artifacts. please re-open.
  2. the WebRender crashes reported (resource leaks) are not what bug 1522015 is about.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: P4 → P5

Just wanted to chime in here and mention that on dev channel (66.0b10 (64-bit)) w/a geforce rtx 2080ti I get massive artifacts in both the interactive and automated demos for css-fps, and the entire browser chrome periodically turns solid white. The interactive demo also causes frequent hangs. The most significant artifact is that the perspective texturing on various elements is hosed, but sometimes things flicker in and out of visibility entirely.

This appears to be true with and without WebRender, somehow? It may be that about:support is lying about what backend is being used to do layout and rasterization for the page. I checked about:support before testing and with webrender on and off I see the same artifacts.

We'll double check the behavior on Windows/GeForce. It's rather strange that you'd get the same artifacts regardless of Webrender. Are you sure it reported D3D11 compositor when you disabled WR? Did you restart the browser upon flipping that pref?

Flags: needinfo?(kg)

At least on my system (Ubuntu 19.04+Ivy Bridge, FF 71.0a1 (2019-09-18) (64-bit) from PPA), it no longer renders incorrectly. Still drops to about 2FPS in many parts, but now at least some parts actually render with decent FPS.

Same for me when I recently tested this on an AMD 5700 GPU - no artifacts, very smooth 60 fps in most parts, but a few places where it drops to single digit fps.

The 5700 GPU is incredibly powerful, so we must be doing something very wrong to cause those frame drops. I didn't check whether it was a CPU or GPU issue. I wonder if we hit some kind of edge case in plane-split and generate thousands of splits, or something like that. Need to investigate further to see what this remaining issue is.
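
For readers unfamiliar with the term, plane splitting cuts intersecting 3D-transformed polygons along each other's planes so the pieces can be sorted back to front. The sketch below is a generic illustration of a single split of a convex polygon by a plane; it is not WebRender's implementation, and the function name and tolerance are made up for the example. Each split can turn one polygon into two, which is why a pathological scene could in principle produce a large number of pieces.

```python
def split_polygon(poly, normal, d, eps=1e-9):
    """Split a convex polygon by the plane dot(normal, p) + d = 0.

    poly is a list of (x, y, z) vertices in order. Returns (front, back),
    the vertex lists on the positive and negative side of the plane.
    Hypothetical helper for illustration only.
    """
    def dist(p):
        return sum(n * c for n, c in zip(normal, p)) + d

    front, back = [], []
    count = len(poly)
    for i in range(count):
        a, b = poly[i], poly[(i + 1) % count]
        da, db = dist(a), dist(b)
        # A vertex on the plane (within eps) belongs to both halves.
        if da >= -eps:
            front.append(a)
        if da <= eps:
            back.append(a)
        # If the edge a->b strictly crosses the plane, add the crossing point.
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)
            p = tuple(ac + t * (bc - ac) for ac, bc in zip(a, b))
            front.append(p)
            back.append(p)
    return front, back


# A square in the z = 0 plane, cut by the plane x = 0.
square = [(-1, -1, 0), (1, -1, 0), (1, 1, 0), (-1, 1, 0)]
front, back = split_polygon(square, normal=(1, 0, 0), d=0)
print(len(front), len(back))
```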

Glenn, this may happen if we end up with too many draw calls, since a powerful GPU will not help with the driver overhead. Part of the reason for many draw calls could be that we have multiple batch regions per tile. Regardless, I think we should be looking at the performance in a different issue.

Clemens, could you confirm if you are still seeing the issue?

Flags: needinfo?(linuxhippy)

Yep - agreed we should move the performance bits to a different bug.

FWIW, I just tested again - draw call count never exceeds 200, it's often < 100 when the single FPS drops occur. Interestingly, according to the profiler, the CPU threads (backend + renderer) and GPU all remain well under 16 ms / frame. So I wonder if it's actually something else in Gecko that causes the slowdown in these parts, or perhaps something that is not being reported by the WR profiler somehow.
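
To make the mismatch concrete: at 60 fps each frame has roughly a 16.7 ms budget, so if the profiler shows CPU and GPU work under 16 ms while the page runs at single-digit fps, most of each frame's wall-clock time is unaccounted for. A small sketch of that arithmetic, where the 5 fps and 16 ms inputs are illustrative values rather than measurements from this bug:

```python
def frame_time_ms(fps):
    """Wall-clock time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def unaccounted_ms(observed_fps, measured_ms):
    """Per-frame time not explained by the measured CPU/GPU work."""
    return max(0.0, frame_time_ms(observed_fps) - measured_ms)


budget = frame_time_ms(60)       # per-frame budget at 60 fps
gap = unaccounted_ms(5, 16.0)    # illustrative: 5 fps observed, 16 ms measured
print(f"budget at 60 fps: {budget:.2f} ms, unexplained at 5 fps: {gap:.0f} ms")
```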

I can confirm that this has been rendering correctly for a while on both my MacBook Pro (with AMD RX 560) and desktop (Win10 + 1070 GTX). The FPS issues are of course still there.

Retested on roughly the same config and can confirm for me that it mostly works great now and has no glitches. There are nasty frame drops later in the demo but I'm sure those are just driver / stack corner cases, so this is at least a dramatic improvement.

Flags: needinfo?(kg)

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Can someone close this? No one has reported any correctness issues here in a month, so things are probably fine.

Status: REOPENED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME