Closed Bug 1437572 Opened 2 years ago Closed 2 years ago

Update webrender to 4af31b8aa79d5a1f3c0242dea6be4876c71075c5

Categories

(Core :: Graphics: WebRender, enhancement, P3)

60 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: kats, Assigned: kats)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [gfx-noted])

Attachments

(4 files, 2 obsolete files)

+++ This bug was initially created as a clone of Bug #1436058 +++

I'm filing this as a placeholder bug for the next webrender update. I may be running a cron script [1] that does try pushes with webrender update attempts, so that we can track build/test breakages introduced by webrender on a rolling basis. This bug will hold the try push links as well as dependencies filed for those breakages, so that we have a better idea going into the update of what needs fixing. I might abort the cron job because once things get too far out of sync it's hard to fully automate fixing all the breakages.

When we are ready to actually land the update, we can rename this bug and use it for the update, and then file a new bug for the next "future update".

[1] https://github.com/staktrace/moz-scripts/blob/master/try-latest-webrender.sh
The last update (bug 1436058) updated WR to cset 342bc314db94aa439b2001249c5f24ccfcbccc22. There are regressions from newer WR csets, detailed in bug 1436058, but in summary:

- servo/webrender#2399 caused the number of Windows reftest failures to spike dramatically, to the point where the log exceeds the maximum allowed size and turns the job red.
- servo/webrender#2408 caused a couple of Linux reftest failures, one of which is fuzzable and one of which is not.
The Linux reftest failure is fixed. Re-posting my comment from the last wr-update bug here:

I can't reproduce the Windows reftest failures we're seeing on CI; they mostly run fine on my local Win10 machine. I do have a possible theory about the Windows bustage, though.

If I run the flexbox/ reftests on both Linux and Windows, the memory usage is relatively stable.

However, when I run the image/test/reftest reftests, there appears to be a significant memory leak. On Windows, the memory usage grows by ~100MB/sec until the process seems to go idle or time out at ~4 GB allocated. Since the default Windows build is 32-bit, an OOM is probably occurring somewhere. On Linux, I see the same memory growth, but since it's a 64-bit build, it doesn't have the same address space issues.

Is it possible that this could be the cause of the Windows issues on CI? There are a couple of patches in this changeset that deal with managing lifetime of image and pipeline resources, could they be related?
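
For a sense of scale, here is a back-of-the-envelope sketch of the numbers above. It assumes a constant growth rate and decimal units, neither of which is stated in the comment, and the function name is made up for illustration.

```python
# Rough arithmetic only; the constant rate and decimal units are assumptions.
GROWTH_MB_PER_SEC = 100   # "~100MB/sec" from the comment
STALL_AT_GB = 4           # "~4 GB allocated" from the comment


def seconds_to_reach(limit_gb: float, rate_mb_per_sec: float) -> float:
    """Time for a steadily growing allocation to reach limit_gb."""
    return limit_gb * 1000 / rate_mb_per_sec


# A 32-bit process can address at most 2**32 bytes (4 GiB) of virtual memory.
ADDRESS_SPACE_32BIT_BYTES = 2 ** 32

print(seconds_to_reach(STALL_AT_GB, GROWTH_MB_PER_SEC))  # 40.0 (seconds)
print(ADDRESS_SPACE_32BIT_BYTES / 2 ** 30)               # 4.0 (GiB)
```

At that rate the ~4 GB mark is reached in roughly 40 seconds, and ~4 GB coincides with the whole address space of a 32-bit process.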
(In reply to Glenn Watson [:gw] from comment #2)
> However, when I run the image/test/reftest reftests, there appears to be a
> significant memory leak. On Windows, the memory usage grows by ~100MB/sec
> until the process seems to go idle or time out at ~4 GB allocated. Since the
> default Windows build is 32-bit, an OOM is probably occurring somewhere. On
> Linux, I see the same memory growth, but since it's a 64-bit build, it
> doesn't have the same address space issues.
>
> Is it possible that this could be the cause of the Windows issues on CI?
> There are a couple of patches in this changeset that deal with managing
> lifetime of image and pipeline resources, could they be related?

The Windows builds on the try pushes are all 64-bit, so this seems unlikely. Also the patches in the changeset that deal with lifetimes were in a number of the bisection try pushes that I did (bug 1436058 comment 17) and the problem only started occurring with PR 2399. It's possible that it's a combination of the two changes that causes the failures.
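
The bisection idea can be sketched roughly as follows. This is a minimal illustration, not the script that was actually used: the PR list is made up, and `windows_reftests_red` is a hypothetical stand-in for "do a try push at this PR and check whether the Windows reftest job is red". It also assumes that once a PR is bad, every later one is bad too.

```python
from typing import Callable, Sequence


def first_bad(candidates: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Return the earliest candidate for which is_bad is true.

    Assumes candidates are in landing order, the last one is bad, and
    badness never goes away once it has appeared.
    """
    lo, hi = 0, len(candidates) - 1  # invariant: candidates[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid        # the first bad candidate is at mid or earlier
        else:
            lo = mid + 1    # the first bad candidate is after mid
    return candidates[lo]


# Hypothetical stand-in for a try push plus a check of the Windows reftest job.
def windows_reftests_red(pr: int) -> bool:
    return pr >= 2399


# Illustrative PR numbers only; this is not the real list of merged PRs.
prs = [2392, 2395, 2397, 2399, 2403, 2408]
print(first_bad(prs, windows_reftests_red))  # 2399
```

Each probe corresponds to one try push, so a range of N candidate PRs needs about log2(N) pushes rather than N.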

Since I could reproduce at least one of the Windows reftest issues on my Windows machine, I'll see if I can get a recording of it; maybe that will give us a clue as to what's going on.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> Since I could reproduce at least one of the Windows reftest issues on my
> Windows machine, I'll see if I can get a recording of it; maybe that will
> give us a clue as to what's going on.

Quick update: I can consistently reproduce a problem with #2399 applied which doesn't occur without that patch. To reproduce, I build Firefox and run:
  MOZ_WEBRENDER=1 ./mach run layout/reftests/reftest-sanity/647192-1.html
With #2399 the scrollbar area shows up as red, but without #2399 it renders fine.

I attempted to get a recording using ENABLE_WR_RECORDING=1 but discovered that turning on recording made the bug go away. After some investigation it looks like that happens because turning on recording turns off external images. I filed bug 1437925 and servo/webrender#2411 to fix that. At least that revealed the issue is related to external images and #2399, if that helps any. I also tried to get a wr-capture of the bad rendering - I can get the capture, but the capture is not complete (missing files in externals/) so attempting to load it in wrench doesn't work. Trying to dig into that now.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #5)
> I can get
> the capture, but the capture is not complete (missing files in externals/)

I fixed this with bug 1437949.

> so attempting to load it in wrench doesn't work. Trying to dig into that now.

So now I have a capture, attached. Attempting to load it in wrench on my Windows machine doesn't work (the window seems to hang) but I can load it on the same wrench version (built from 65c68cc1d1) on my OS X machine. And it shows the red scrollbars.

I'll make a capture of the good version as well, without #2399, and attach that for comparison purposes. Hopefully this will provide useful information to debug the problem.
Here's a capture of the good rendering. Unfortunately I can't get this capture to load on either Windows (same problem as before) or OS X (I get a panic about "Unsupported image shader kind"). Note that the WR revision of this capture is different from the other capture.
^ Windows was red as expected

WR @ 38df48c62c929a3603f3c6e6393c2be8f1c88903

https://treeherder.mozilla.org/#/jobs?repo=try&revision=064776f7e05b066c4ad40181ba397b751dcfd477
https://treeherder.mozilla.org/#/jobs?repo=try&revision=233f8acb03b49cc6f65cf29c350bec3a2bea1b8a

Linux is green; Windows builds are not starting yet (bug 1372172). This one should be green since it includes servo/webrender#2412, which was the fix.
Attached patch bindings.patch (obsolete)
Patch to webrender_bindings/ for rayon update.
Attached patch toolkit.patch (obsolete)
Patch to toolkit/ for rayon update.
Alias: wr-future-update
Assignee: nobody → bugmail
Summary: Future webrender update bug → Update webrender to 4af31b8aa79d5a1f3c0242dea6be4876c71075c5
Version: unspecified → 60 Branch
Comment on attachment 8951503
bindings.patch

This patch will go into my queue for bug 1438892.
Attachment #8951503 - Attachment is obsolete: true
Comment on attachment 8951652
Bug 1437572 - Update reftest fuzziness from WR PR 2408.

https://reviewboard.mozilla.org/r/220940/#review226876
Attachment #8951652 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8951651
Bug 1437572 - Update webrender to 4af31b8aa79d5a1f3c0242dea6be4876c71075c5.

https://reviewboard.mozilla.org/r/220938/#review226874
Attachment #8951651 - Flags: review?(jmuizelaar) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/37d984fce90f
Update webrender to 4af31b8aa79d5a1f3c0242dea6be4876c71075c5. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/751d00a65e68
Update reftest fuzziness from WR PR 2408. r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/37d984fce90f
https://hg.mozilla.org/mozilla-central/rev/751d00a65e68
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Depends on: 1440158
Depends on: 1441025
Depends on: 1442134
Depends on: 1442386
Depends on: 1442828
Depends on: 1444904
Depends on: 1447093