Update dav1d to new version 7d23ec4a042b2feb6c0d4c1b0618a87cb8c34dcb from 2023-03-13
Categories: Core :: Audio/Video: Playback, enhancement
People: Reporter: update-bot, Assigned: chunmin
Whiteboard: [3pl-filed][task_id: Zmj-RIa9Ra2hTFSGberHQw]
Attachments: 1 file, 1 obsolete file
This update covers 7 commits:
9b4b2448106f863bfa4789ba6e86555ad57c838f by Victorien Le Couviour--Tuffet
https://code.videolan.org/videolan/dav1d/commit/9b4b2448106f863bfa4789ba6e86555ad57c838f
Authored: 2023-02-09 16:25:09 +0100
Committed: 2023-02-10 15:11:32 +0100
drain: Properly fix a desync between next and first
The code in dav1d_drain_picture could result in a desync between
c->task_thread.first (oldest submitted frame) and c->frame_thread.next (first
frame to retrieve and/or next submit location).
As we loop through drain, we always increment next, but we increment first
only if the frame has data. If the frame is visible, we return. The problem
arises when we encounter one or more invisible frames and the following
entries haven't been fed yet: we keep looping, increasing next but not first,
because those entries have no data.
We should always return once we have encountered data (a visible or
invisible decoded frame). For a visible frame, the code already returns. For
an invisible frame, we can store a boolean indicating that we drained at least
one frame; whenever we reach an empty entry after that, we return without
incrementing next or first (all subsequent entries are guaranteed to be empty
anyway).
This has the effect of inserting the next frame at the first free spot
(which is much better than the weird skips it's doing now).
So basically, c->frame_thread.next could previously skip some (empty) entries.
Now it's contiguous.
Fixes #416.
Files Modified:
- src/lib.c
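The drain behaviour described in the commit message above can be modelled with a small ring buffer. The following is a minimal sketch in Python, not the dav1d code: the names Frame, Ring, drain, demo and the stop_after_data flag are hypothetical, and the model ignores threading, errors and reference counting. It only tracks how next and first move.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    name: str
    visible: bool


@dataclass
class Ring:
    slots: List[Optional[Frame]]  # None marks an entry with no data
    next: int = 0    # stands in for c->frame_thread.next
    first: int = 0   # stands in for c->task_thread.first


def drain(ring: Ring, stop_after_data: bool = True) -> Optional[Frame]:
    """Walk the ring once; return the first visible frame found, if any."""
    n = len(ring.slots)
    drained = False
    for _ in range(n):
        frame = ring.slots[ring.next]
        if frame is not None:
            # Only entries with data advance first.
            ring.first = (ring.first + 1) % n
            drained = True
        elif drained and stop_after_data:
            # Empty entry after at least one drained frame:
            # stop without touching next or first.
            return None
        ring.slots[ring.next] = None
        ring.next = (ring.next + 1) % n
        if frame is not None and frame.visible:
            return frame
    return None


def demo(stop_after_data: bool) -> Ring:
    # Two invisible frames followed by two entries that were never fed.
    ring = Ring(slots=[Frame("a", False), Frame("b", False), None, None])
    drain(ring, stop_after_data)
    return ring
```

In this model, demo(True) leaves next and first both at 2, while demo(False) (no early return) lets next wrap around to 0 with first still at 2, which is the kind of desync the commit message describes.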
3f19ece69ff5dcddf3791025dfaf5ab15196c899 by Victorien Le Couviour--Tuffet
https://code.videolan.org/videolan/dav1d/commit/3f19ece69ff5dcddf3791025dfaf5ab15196c899
Authored: 2023-02-09 16:19:00 +0100
Committed: 2023-02-09 16:36:57 +0100
Revert "Fix mismatch between first and next in drain"
This reverts commit a51b6ce417aab690bd0cde7e76124895e0e3adfe.
We can't increment first when no data is there; otherwise we might do it
while the first frame has not yet been decoded, messing up the ordering.
Imagine a framedelay of 8 and a file with 7 frames. We feed 7 frames over 8
slots; now next points to [7] (an empty entry), and we start draining because
of EOF. We do need next to be incremented to reach the first frame ([0]) so
that it can be output, and only then should first be incremented too.
Fixes #418.
Files Modified:
- src/lib.c
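The 7-frames-in-8-slots case from the revert message can be traced with index arithmetic alone. This is a rough sketch under the assumption that the reverted change advanced first on empty entries as well; drain_once and bump_first_on_empty are hypothetical names, not dav1d identifiers.

```python
def drain_once(filled, nxt, first, bump_first_on_empty):
    """Advance until one filled slot is consumed.

    Returns (output_slot, nxt, first); output_slot is None if nothing was found.
    """
    n = len(filled)
    for _ in range(n):
        has_data = filled[nxt]
        # first moves on data; the assumed reverted behaviour also moves it
        # on empty entries.
        if has_data or bump_first_on_empty:
            first = (first + 1) % n
        slot = nxt
        filled[nxt] = False
        nxt = (nxt + 1) % n
        if has_data:
            return slot, nxt, first
    return None, nxt, first


def seven_of_eight(bump_first_on_empty):
    # Slots [0]..[6] hold frames, [7] is empty, next points at [7], first at [0].
    filled = [True] * 7 + [False]
    return drain_once(filled, 7, 0, bump_first_on_empty)
```

With bump_first_on_empty=False the call outputs slot 0 and leaves next and first both at 1. With bump_first_on_empty=True it still outputs slot 0, but first ends at 2 while next is 1, so first has moved past a frame that has not been output yet.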
3b7b09613489bd45922ae76ff4b1dbbdf04ae7a6 by Henrik Gramner
https://code.videolan.org/videolan/dav1d/commit/3b7b09613489bd45922ae76ff4b1dbbdf04ae7a6
Authored: 2023-02-03 13:59:33 +0100
Committed: 2023-02-03 14:13:09 +0100
x86: Add 8-bit ipred z1 SSSE3 asm
Files Modified:
- src/x86/ipred.h
- src/x86/ipred_sse.asm
77b3955537c716acbbdf4bfc46ec2f6b0ccd683a by Martin Storsjö
https://code.videolan.org/videolan/dav1d/commit/77b3955537c716acbbdf4bfc46ec2f6b0ccd683a
Authored: 2023-01-26 15:18:41 +0200
Committed: 2023-01-31 15:33:58 +0200
checkasm: Add an --affinity= option for selecting a CPU core
Add an option for selecting the core where the single thread of
checkasm runs. This allows benchmarking on specific CPU cores on
heterogeneous CPUs, like ARM big.LITTLE configurations.
On Linux, one can easily wrap an invocation of checkasm with
"taskset -c <n> [...]" - so this option isn't very essential
there - however it is quite useful on Windows.
On Windows, it is somewhat possible to do the same by launching
the tool with "start /B /affinity <hexmask> [...]", but that
doesn't work well with scripting ("start" returns before the
command has finished running, and it's not obvious how to
invoke "start" from within WSL).
Using "taskset" to launch processes on specific cores within WSL
on Windows doesn't work - regardless of the Linux level affinity,
the process ends up running on the performance cores anyway.
Files Modified:
- meson.build
- tests/checkasm/checkasm.c
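For readers unfamiliar with CPU affinity, the effect of "taskset -c <n>" on Linux can be approximated from inside a process. The sketch below uses Python's os.sched_setaffinity, which is only available on some Unix platforms; it is an illustration of pinning a process to one core, not how checkasm implements its --affinity= option, and pin_to_core is a hypothetical helper name.

```python
import os


def pin_to_core(core: int) -> bool:
    """Restrict the calling process to a single CPU core.

    Returns False on platforms without os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return os.sched_getaffinity(0) == {core}
```

A benchmark script could call pin_to_core with a core index before starting its timing loop, falling back to an external tool where the function returns False.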
99956c737a6d0a70d8aab8a4ff4c15aee65fc197 by Martin Storsjö
https://code.videolan.org/videolan/dav1d/commit/99956c737a6d0a70d8aab8a4ff4c15aee65fc197
Authored: 2023-01-08 23:18:19 +0200
Committed: 2023-01-31 10:16:16 +0200
arm64: ipred: 8 bpc NEON implementation of the Z3 function
The implementation is a hybrid between two approaches. One is generic
(but non-ideal) and is used for cases with large max_base_y; it fills two
pixel columns at a time, i.e. it loops over pixels first vertically and
then horizontally, which is not optimal.
For cases with smaller max_base_y, it does two rows at a time, essentially
doing gathers with the TBX instruction.
Relative speedup over the C code:
Function | Cortex A53 | A55 | A72 | A73 | A76 | Apple M1 |
---|---|---|---|---|---|---|
intra_pred_z3_w4_8bpc_neon | 3.32 | 2.89 | 2.78 | 3.52 | 2.52 | 9.67 |
intra_pred_z3_w8_8bpc_neon | 6.24 | 5.55 | 4.76 | 5.60 | 4.11 | 6.40 |
intra_pred_z3_w16_8bpc_neon | 7.64 | 7.07 | 4.37 | 6.23 | 4.18 | 8.60 |
intra_pred_z3_w32_8bpc_neon | 7.51 | 7.21 | 4.34 | 5.92 | 4.27 | 7.88 |
intra_pred_z3_w64_8bpc_neon | 6.82 | 6.25 | 4.08 | 5.83 | 3.52 | 7.31 |
Files Modified:
- src/arm/64/ipred.S
- src/arm/ipred.h
fd4f348e7074fd35e23d6187241ab7b800b3bb99 by Martin Storsjö
https://code.videolan.org/videolan/dav1d/commit/fd4f348e7074fd35e23d6187241ab7b800b3bb99
Authored: 2023-01-02 15:07:24 +0200
Committed: 2023-01-27 23:54:44 +0200
arm64: ipred: 8 bpc NEON implementation of the Z1 function
Relative speedup over the C code:
Function | Cortex A53 | A55 | A72 | A73 | A76 | Apple M1 |
---|---|---|---|---|---|---|
intra_pred_z1_w4_8bpc_neon | 4.09 | 3.15 | 3.63 | 4.16 | 3.27 | 13.00 |
intra_pred_z1_w8_8bpc_neon | 6.93 | 5.66 | 5.57 | 6.76 | 5.51 | 5.50 |
intra_pred_z1_w16_8bpc_neon | 7.81 | 6.85 | 6.24 | 7.78 | 6.59 | 9.00 |
intra_pred_z1_w32_8bpc_neon | 10.56 | 9.95 | 8.72 | 10.95 | 8.28 | 13.33 |
intra_pred_z1_w64_8bpc_neon | 11.00 | 11.38 | 9.11 | 11.62 | 8.65 | 14.61 |
(The speedup numbers for M1 are kinda noisy due to the very coarse
granularity of the timer used there.)
Files Modified:
- src/arm/64/ipred.S
- src/arm/ipred.h
2e990b370e7a28d14e60f6ee1183fdac0d3470ac by Martin Storsjö
https://code.videolan.org/videolan/dav1d/commit/2e990b370e7a28d14e60f6ee1183fdac0d3470ac
Authored: 2023-01-19 16:02:58 +0200
Committed: 2023-01-27 23:54:20 +0200
checkasm: ipred: Iterate 5 times for each Z1/Z2/Z3 function
These functions contain a number of different codepaths; try to
make sure that we hit most codepaths for each size combination.
This gives better test coverage in a single run of checkasm, and it
should also give a better averaged runtime in benchmarks.
Files Modified:
- tests/checkasm/ipred.c
Comment 1 • 2 years ago (Reporter)
Zmj-RIa9Ra2hTFSGberHQw
I've submitted a try run for this commit: https://treeherder.mozilla.org/jobs?repo=try&revision=6a7fbf998b784d2ca94d99ba602cc515abf59e6f
Comment 3 • 2 years ago (Reporter)
e-1_VMyWRPaUeMcgL5AHOA
All jobs completed, we found the following issues.
Known Issues (From Push Health):
- browser/components/aboutlogins/tests/browser/browser_createLogin.js
  - 1 of 1 failed on the same (retriggered) task
    - test-linux1804-64-qr/debug-mochitest-browser-chrome-swr-2 (GUTRASPtT0O1LSEXOjoFAA)
- browser/modules/test/browser/browser_preloading_tab_moving.js
  - 1 of 1 failed on the same (retriggered) task
    - test-linux1804-64-qr/opt-mochitest-browser-chrome-swr-a11y-checks-7 (altEB10qSxCvdHzk9QoSwA)
- devtools/client/inspector/computed/test/browser_computed_getNodeInfo.js
  - 1 of 4 failed on the same (retriggered) task
    - test-linux1804-64-qr/opt-mochitest-devtools-chrome-spi-nw-2 (IQ4qxkoVSvmne8InO9UTDw)
- dom/base/test/test_eventsource_event_listener_leaks.html
  - 1 of 4 failed on the same (retriggered) task
    - test-linux1804-64-qr/opt-mochitest-plain-xorig-5 (MfOC19r6RHyG2ba9IDFvog)
- toolkit/components/passwordmgr/test/mochitest/test_autocomplete_basic_form.html
  - 1 of 1 failed on the same (retriggered) task
    - test-linux1804-64-asan-qr/opt-mochitest-plain-nofis-2 (eGyVIBzgTxaS5L2cTvtpQQ)
These failures may mean that the library update succeeded; you'll need to review
them yourself and decide. If there are lint failures, you will need to fix them in
a follow-up patch. (Or ignore the patch I made, and recreate it yourself with
./mach vendor media/libdav1d/moz.yaml.)
In either event, I have done all I can, so you will need to take it from here.
When reviewing, please note that this is external code, which needs a full and
careful inspection - not a rubberstamp.
Comment 4 • 2 years ago
A fix (https://code.videolan.org/videolan/dav1d/-/commit/cf617fdae0b9bfabd27282854c8e81450d955efa) landed on the dav1d repo today that should fix our bug 1814560, which is blocking turning on animated avif. How much work is it to update dav1d to tip March 13 or newer? Thanks!
Comment 6 • 2 years ago
The following patch is waiting for review from a reviewer who resigned from the review:
ID | Title | Author | Reviewer Status |
---|---|---|---|
D169659 | Bug 1816484 - Update dav1d to 9b4b2448106f863bfa4789ba6e86555ad57c838f | update-bot | chunmin: Resigned from review |
:update-bot, could you please find another reviewer?
For more information, please visit auto_nag documentation.