Closed Bug 1827248 Opened 3 years ago Closed 3 years ago

Update dav1d to new version 5aa3b38f9871859e14e55f18ab5e38318fe86305 from 2023-04-08 11:47:31

Categories

(Core :: Audio/Video: Playback, enhancement)

Tracking

()

RESOLVED FIXED
114 Branch
Tracking Status
firefox114 --- fixed

People

(Reporter: update-bot, Assigned: chunmin)

Details

(Whiteboard: [3pl-filed][task_id: MHY4awiHSJeNveMgYoS34A])

Attachments

(1 file)

This update covers 20 commits. Here are the overall diff statistics, followed by the commit information.


media/libdav1d/moz.yaml | 4 +-
media/libdav1d/vcs_version.h | 2 +-
third_party/dav1d/src/arm/64/ipred.S | 71 +-
third_party/dav1d/src/arm/64/ipred16.S | 1027 ++++++++++++++++++++++++++++
third_party/dav1d/src/arm/ipred.h | 24 +-
third_party/dav1d/src/obu.c | 19 +-
third_party/dav1d/src/picture.c | 26 +-
third_party/dav1d/src/picture.h | 6 +
third_party/dav1d/src/x86/ipred.h | 1 +
third_party/dav1d/src/x86/ipred_sse.asm | 652 +++++++++++++++++
third_party/dav1d/src/x86/itx.h | 4 +
third_party/dav1d/src/x86/itx16_avx2.asm | 47 +-
third_party/dav1d/src/x86/itx16_avx512.asm | 315 ++++++++
third_party/dav1d/src/x86/itx_avx512.asm | 4 +-
third_party/dav1d/src/x86/refmvs.asm | 128 +++-
third_party/dav1d/src/x86/refmvs.h | 2 +
third_party/dav1d/tests/checkasm/ipred.c | 1 +
third_party/dav1d/tests/checkasm/refmvs.c | 18 +-
18 files changed, 2280 insertions(+), 71 deletions(-)


5aa3b38f9871859e14e55f18ab5e38318fe86305 by Ronald S. Bultje

https://code.videolan.org/videolan/dav1d/commit/5aa3b38f9871859e14e55f18ab5e38318fe86305
Authored: 2023-04-05 09:20:22 -0400
Committed: 2023-04-08 11:47:31 +0000

x86: add AVX512-IceLake implementation of HBD 16x64 DCT^2

nop: 39.4
inv_txfm_add_16x64_dct_dct_0_10bpc_c: 2208.0 ( 1.00x)
inv_txfm_add_16x64_dct_dct_0_10bpc_sse4: 133.5 (16.54x)
inv_txfm_add_16x64_dct_dct_0_10bpc_avx2: 71.3 (30.98x)
inv_txfm_add_16x64_dct_dct_0_10bpc_avx512icl: 102.0 (21.66x)
inv_txfm_add_16x64_dct_dct_1_10bpc_c: 25757.0 ( 1.00x)
inv_txfm_add_16x64_dct_dct_1_10bpc_sse4: 1366.1 (18.85x)
inv_txfm_add_16x64_dct_dct_1_10bpc_avx2: 657.6 (39.17x)
inv_txfm_add_16x64_dct_dct_1_10bpc_avx512icl: 378.9 (67.98x)
inv_txfm_add_16x64_dct_dct_2_10bpc_c: 25771.0 ( 1.00x)
inv_txfm_add_16x64_dct_dct_2_10bpc_sse4: 1739.7 (14.81x)
inv_txfm_add_16x64_dct_dct_2_10bpc_avx2: 772.1 (33.38x)
inv_txfm_add_16x64_dct_dct_2_10bpc_avx512icl: 469.3 (54.92x)
inv_txfm_add_16x64_dct_dct_3_10bpc_c: 25775.7 ( 1.00x)
inv_txfm_add_16x64_dct_dct_3_10bpc_sse4: 1968.1 (13.10x)
inv_txfm_add_16x64_dct_dct_3_10bpc_avx2: 886.5 (29.08x)
inv_txfm_add_16x64_dct_dct_3_10bpc_avx512icl: 662.6 (38.90x)
inv_txfm_add_16x64_dct_dct_4_10bpc_c: 25745.9 ( 1.00x)
inv_txfm_add_16x64_dct_dct_4_10bpc_sse4: 2330.9 (11.05x)
inv_txfm_add_16x64_dct_dct_4_10bpc_avx2: 1008.5 (25.53x)
inv_txfm_add_16x64_dct_dct_4_10bpc_avx512icl: 662.3 (38.88x)
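The parenthesized factor on each benchmark line appears to be the C reference timing divided by that implementation's timing. A minimal sketch for recomputing it from a pasted listing follows. The line format is assumed from the output shown in this bug, and `speedups`, `LINE_RE`, and `SAMPLE` are illustrative names, not part of checkasm. Because the listing prints rounded timings, a recomputed ratio can differ from the printed one in the last digit.

```python
import re

# Matches lines such as "..._10bpc_avx2: 657.6 (39.17x)".
# The suffix list is an assumption based on the listings in this bug.
LINE_RE = re.compile(
    r"^(?P<name>\w+)_(?P<impl>c|sse4|avx2|avx512icl):\s+(?P<time>[\d.]+)"
)

def speedups(text):
    """Return {(function, impl): c_time / impl_time} for every non-C line."""
    times = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            times[(m["name"], m["impl"])] = float(m["time"])
    return {
        (name, impl): times[(name, "c")] / t
        for (name, impl), t in times.items()
        if impl != "c" and (name, "c") in times
    }

SAMPLE = """\
inv_txfm_add_16x64_dct_dct_1_10bpc_c: 25757.0 ( 1.00x)
inv_txfm_add_16x64_dct_dct_1_10bpc_avx2: 657.6 (39.17x)
inv_txfm_add_16x64_dct_dct_1_10bpc_avx512icl: 378.9 (67.98x)
"""

result = speedups(SAMPLE)
for (name, impl), factor in sorted(result.items()):
    print(f"{name} {impl}: {factor:.2f}x")
```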

Files Modified:

  • src/x86/itx.h
  • src/x86/itx16_avx512.asm
  • src/x86/itx_avx512.asm

380efd764ffe66ac7f36efe4695d31e59add88f6 by Matthias Dressel

https://code.videolan.org/videolan/dav1d/commit/380efd764ffe66ac7f36efe4695d31e59add88f6
Authored: 2023-03-06 23:45:42 +0100
Committed: 2023-04-06 07:52:12 +0000

CI: Add wasm{32,64} builds

Fixes #421

Files Added:

  • package/crossfiles/wasm32.meson
  • package/crossfiles/wasm64.meson

Files Modified:

  • .gitlab-ci.yml

0207e0fe9f688dfa3c80117f9002031eff68e624 by Matthias Dressel

https://code.videolan.org/videolan/dav1d/commit/0207e0fe9f688dfa3c80117f9002031eff68e624
Authored: 2023-03-31 17:54:28 +0200
Committed: 2023-03-31 18:41:54 +0200

x86/itx: Fix identation of macro instructions

Files Modified:

  • src/x86/itx16_avx2.asm

f6d4c0c473a8acc90a9746033837eadd9b2f0ea9 by Matthias Dressel

https://code.videolan.org/videolan/dav1d/commit/f6d4c0c473a8acc90a9746033837eadd9b2f0ea9
Authored: 2023-03-31 17:53:08 +0200
Committed: 2023-03-31 18:41:36 +0200

x86/itx: Add 32x32 12bpc AVX2 idtx

inv_txfm_add_32x32_identity_identity_0_12bpc_c: 5785.8 ( 1.00x)
inv_txfm_add_32x32_identity_identity_0_12bpc_avx2: 20.7 (279.65x)
inv_txfm_add_32x32_identity_identity_1_12bpc_c: 5896.9 ( 1.00x)
inv_txfm_add_32x32_identity_identity_1_12bpc_avx2: 20.7 (285.01x)
inv_txfm_add_32x32_identity_identity_2_12bpc_c: 5799.5 ( 1.00x)
inv_txfm_add_32x32_identity_identity_2_12bpc_avx2: 68.9 (84.20x)
inv_txfm_add_32x32_identity_identity_3_12bpc_c: 5798.1 ( 1.00x)
inv_txfm_add_32x32_identity_identity_3_12bpc_avx2: 140.6 (41.25x)
inv_txfm_add_32x32_identity_identity_4_12bpc_c: 5803.3 ( 1.00x)
inv_txfm_add_32x32_identity_identity_4_12bpc_avx2: 308.2 (18.83x)

Files Modified:

  • src/x86/itx.h
  • src/x86/itx16_avx2.asm

1e602b8b3395d53d35bbd2eddd74fb1abd3c2f18 by Matthias Dressel

https://code.videolan.org/videolan/dav1d/commit/1e602b8b3395d53d35bbd2eddd74fb1abd3c2f18
Authored: 2022-04-25 18:54:38 +0200
Committed: 2023-03-31 18:41:19 +0200

x86/itx: Add 32x16 12bpc AVX2 idtx

inv_txfm_add_32x16_identity_identity_0_12bpc_c: 4138.7 ( 1.00x)
inv_txfm_add_32x16_identity_identity_0_12bpc_avx2: 30.4 (136.26x)
inv_txfm_add_32x16_identity_identity_1_12bpc_c: 4147.5 ( 1.00x)
inv_txfm_add_32x16_identity_identity_1_12bpc_avx2: 30.7 (135.25x)
inv_txfm_add_32x16_identity_identity_2_12bpc_c: 4138.2 ( 1.00x)
inv_txfm_add_32x16_identity_identity_2_12bpc_avx2: 98.9 (41.84x)
inv_txfm_add_32x16_identity_identity_3_12bpc_c: 4136.6 ( 1.00x)
inv_txfm_add_32x16_identity_identity_3_12bpc_avx2: 167.7 (24.67x)
inv_txfm_add_32x16_identity_identity_4_12bpc_c: 4156.3 ( 1.00x)
inv_txfm_add_32x16_identity_identity_4_12bpc_avx2: 242.1 (17.17x)

Files Modified:

  • src/x86/itx.h
  • src/x86/itx16_avx2.asm

e6b194e7d29dee4b0b54a85e7f7659680be2a53e by Matthias Dressel

https://code.videolan.org/videolan/dav1d/commit/e6b194e7d29dee4b0b54a85e7f7659680be2a53e
Authored: 2022-04-25 18:50:46 +0200
Committed: 2023-03-31 18:40:35 +0200

x86/itx: Add 16x32 12bpc AVX2 idtx

inv_txfm_add_16x32_identity_identity_0_12bpc_c: 4287.9 ( 1.00x)
inv_txfm_add_16x32_identity_identity_0_12bpc_avx2: 31.4 (136.66x)
inv_txfm_add_16x32_identity_identity_1_12bpc_c: 4293.7 ( 1.00x)
inv_txfm_add_16x32_identity_identity_1_12bpc_avx2: 30.9 (139.07x)
inv_txfm_add_16x32_identity_identity_2_12bpc_c: 4273.8 ( 1.00x)
inv_txfm_add_16x32_identity_identity_2_12bpc_avx2: 97.3 (43.92x)
inv_txfm_add_16x32_identity_identity_3_12bpc_c: 4269.0 ( 1.00x)
inv_txfm_add_16x32_identity_identity_3_12bpc_avx2: 165.2 (25.83x)
inv_txfm_add_16x32_identity_identity_4_12bpc_c: 4284.4 ( 1.00x)
inv_txfm_add_16x32_identity_identity_4_12bpc_avx2: 235.2 (18.22x)

Files Modified:

  • src/x86/itx.h
  • src/x86/itx16_avx2.asm

922bd82b4e0b4df1e23d8d4325417be812dfa23a by Henrik Gramner

https://code.videolan.org/videolan/dav1d/commit/922bd82b4e0b4df1e23d8d4325417be812dfa23a
Authored: 2023-03-22 21:37:20 +0100
Committed: 2023-03-25 14:29:15 +0000

x86: Add 8-bit ipred z2 SSSE3 asm

Files Modified:

  • src/x86/ipred.h
  • src/x86/ipred_sse.asm
  • tests/checkasm/ipred.c

8c731791c7133c7954f2d7a18d9fbbba2412a302 by Victorien Le Couviour--Tuffet

https://code.videolan.org/videolan/dav1d/commit/8c731791c7133c7954f2d7a18d9fbbba2412a302
Authored: 2023-03-23 13:53:18 +0100
Committed: 2023-03-23 15:44:03 +0100

checkasm: Improve mv generation for refmvs.save_tmvs

Files Modified:

  • tests/checkasm/refmvs.c

1ed24f06832751b04a56488ce8fdcffea14a7a4b by James Almer

https://code.videolan.org/videolan/dav1d/commit/1ed24f06832751b04a56488ce8fdcffea14a7a4b
Authored: 2023-03-21 08:51:00 -0300
Committed: 2023-03-21 09:18:47 -0300

picture: fix attaching props to delayed output pictures

If a Metadata OBU appeared right before a Frame Header OBU with
show_existing_picture = 1, it was not being attached to it but to the next
assembled picture, which was in the following TU.

Signed-off-by: James Almer <jamrial@gmail.com>
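
The behaviour change described in this commit message can be pictured with a toy model. This is only an illustration of the ordering problem, not dav1d's actual code: `attach_props`, the event names, and the `fixed` switch are all hypothetical.

```python
def attach_props(obus, fixed=True):
    """Toy model: pair each output picture with pending metadata.

    obus is a list of (kind, payload) tuples, where kind is "metadata",
    "show_existing" (a frame header that re-shows a stored picture), or
    "frame" (a newly assembled picture). All names are hypothetical.
    """
    pending = None
    output = []
    for kind, payload in obus:
        if kind == "metadata":
            pending = payload
        elif kind == "show_existing":
            if fixed:
                # Fixed behaviour: the re-shown picture takes the metadata.
                output.append((payload, pending))
                pending = None
            else:
                # Old behaviour: metadata stays pending for a later picture.
                output.append((payload, None))
        elif kind == "frame":
            output.append((payload, pending))
            pending = None
    return output

stream = [("metadata", "hdr-A"), ("show_existing", "pic0"), ("frame", "pic1")]
print(attach_props(stream, fixed=False))
print(attach_props(stream, fixed=True))
```

With `fixed=False` the metadata ends up on `pic1`, the next assembled picture. With `fixed=True` it is attached to `pic0`, the picture shown by the frame header that directly follows it.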

Files Modified:

  • src/obu.c
  • src/picture.c
  • src/picture.h

e75caab99e54b4abd3134dfd98f61af38f11520f by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/e75caab99e54b4abd3134dfd98f61af38f11520f
Authored: 2023-03-08 09:50:42 +0200
Committed: 2023-03-21 08:57:44 +0200

arm64: ipred: 16 bpc NEON implementation of the Z3 function

Relative speedup over the C code:
                              Cortex A53    A55    A72    A73    A76  Apple M1
intra_pred_z3_w4_16bpc_neon:        3.06   2.87   2.17   1.97   2.33      7.75
intra_pred_z3_w8_16bpc_neon:        3.90   3.94   2.97   3.16   2.93      4.43
intra_pred_z3_w16_16bpc_neon:       4.08   4.48   3.31   4.68   3.13      5.00
intra_pred_z3_w32_16bpc_neon:       4.43   4.85   3.50   4.02   3.33      5.62
intra_pred_z3_w64_16bpc_neon:       4.68   5.30   3.72   3.96   3.52      5.78

Files Modified:

  • src/arm/64/ipred16.S
  • src/arm/ipred.h

2eb923910067f0872aae79b06e2e8fe4a18486e8 by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/2eb923910067f0872aae79b06e2e8fe4a18486e8
Authored: 2023-02-20 15:40:25 +0200
Committed: 2023-03-21 08:57:43 +0200

arm64: ipred: 16 bpc NEON implementation of the Z1 function

Relative speedup over the C code:
                              Cortex A53    A55    A72    A73    A76  Apple M1
intra_pred_z1_w4_16bpc_neon:        3.49   2.63   2.83   3.85   3.14      9.00
intra_pred_z1_w8_16bpc_neon:        6.19   4.39   3.65   6.58   4.99      6.50
intra_pred_z1_w16_16bpc_neon:       6.65   4.64   3.97   7.78   4.87      7.00
intra_pred_z1_w32_16bpc_neon:       7.76   5.49   5.17   7.83   5.59      8.24
intra_pred_z1_w64_16bpc_neon:       8.02   5.80   5.33   8.41   5.77      8.70

Files Modified:

  • src/arm/64/ipred16.S
  • src/arm/ipred.h

ec38062a1225be863f8ea5e3e56660727b1f3c09 by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/ec38062a1225be863f8ea5e3e56660727b1f3c09
Authored: 2023-03-06 13:31:11 +0200
Committed: 2023-03-21 08:57:43 +0200

arm: ipred: Make a SIMD pixel_set function for padding

For 8 bpc, there's probably not much difference to a decent memset,
but for 16 bpc, there might be a bigger difference.
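
One way to see why the 16 bpc case differs: a byte-wise fill such as memset can only produce 16-bit values whose two bytes are equal, so padding with an arbitrary 16-bit pixel needs an element-wise fill. The sketch below illustrates that point only; `pad_edge` is a hypothetical helper, not the function added by this commit.

```python
from array import array

def pad_edge(pixels, valid, total):
    """Fill pixels[valid:total] with the last valid pixel (hypothetical helper)."""
    pad = pixels[valid - 1]
    for i in range(valid, total):
        pixels[i] = pad
    return pixels

# 16-bit pixels: three valid samples followed by three slots to pad.
edge16 = array("H", [100, 200, 0x0123, 0, 0, 0])
pad_edge(edge16, 3, 6)

# A byte-wise fill with the low byte 0x23 yields 0x2323 in every slot,
# which is not the intended padding value 0x0123.
bytewise = array("H", bytes([0x23]) * 6)

print([hex(v) for v in edge16])
print([hex(v) for v in bytewise])
```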

Files Modified:

  • src/arm/64/ipred.S
  • src/arm/ipred.h

6f5bf165e4ef15ab83b2a646b5c12d5ae267c0f0 by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/6f5bf165e4ef15ab83b2a646b5c12d5ae267c0f0
Authored: 2023-03-08 11:16:31 +0200
Committed: 2023-03-21 08:57:43 +0200

arm64: ipred: Use fewer registers for table lookups in w=8 in z3_fill1 for 8bpc

Add comments explaining the exact dimensions of the gather tables
used currently. That reasoning shows that the w=8 case can do with
one register less.

Before:                       Cortex A53     A55     A72     A73     A76  Apple M1
intra_pred_z3_w8_8bpc_neon:        356.2   376.2   218.9   246.4   176.1       0.6
After:
intra_pred_z3_w8_8bpc_neon:        339.6   357.3   205.6   232.3   160.0       0.5

Files Modified:

  • src/arm/64/ipred.S

7be5347c97ea9c9fb470182ec6608204006767df by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/7be5347c97ea9c9fb470182ec6608204006767df
Authored: 2023-02-22 13:44:18 +0200
Committed: 2023-03-21 08:57:43 +0200

arm64: ipred: Improve accumulation ordering in 8bpc z1

Start out the multiplication/accumulation with a register that is
available sooner.

Before:                       Cortex A53     A55     A72     A73     A76  Apple M1
intra_pred_z1_w8_8bpc_neon:        266.3   268.9   146.6   155.3   103.9       0.4
intra_pred_z1_w16_8bpc_neon:       528.6   574.4   333.9   364.3   209.1       0.7
intra_pred_z1_w32_8bpc_neon:      1149.3  1245.4   752.3   811.5   503.4       1.3
intra_pred_z1_w64_8bpc_neon:      2198.4  2360.6  1462.9  1575.0  1007.6       2.4
After:
intra_pred_z1_w8_8bpc_neon:        266.3   269.1   146.6   155.0   100.1       0.4
intra_pred_z1_w16_8bpc_neon:       528.6   573.3   347.9   352.4   204.3       0.7
intra_pred_z1_w32_8bpc_neon:      1149.2  1245.3   763.4   759.6   474.8       1.3
intra_pred_z1_w64_8bpc_neon:      2198.8  2360.6  1430.0  1417.4   943.5       2.3

Files Modified:

  • src/arm/64/ipred.S

92d93f4b350fb9fdf54d87ee6a90b86591e3494e by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/92d93f4b350fb9fdf54d87ee6a90b86591e3494e
Authored: 2023-03-06 11:32:11 +0200
Committed: 2023-03-21 08:57:43 +0200

arm64: ipred: Optimize the 3tap filter padding in z1_filter_edge

The second register will at most contain one valid pixel, the
padding pixel. Thus skip padding the register and just fill it
with the padding pixel.

Files Modified:

  • src/arm/64/ipred.S

8ee450cbd0bcf2de926df8a746c8975f1183663a by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/8ee450cbd0bcf2de926df8a746c8975f1183663a
Authored: 2023-03-15 23:34:30 +0200
Committed: 2023-03-21 08:57:43 +0200

arm64: ipred: Remove leftover instructions at the start of z3_fill2

There were redundant leftovers from copypasting bits when writing this
function.

Files Modified:

  • src/arm/64/ipred.S

ab6977bc0456d1e9e8b1af171c5d782bf8eb4e82 by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/ab6977bc0456d1e9e8b1af171c5d782bf8eb4e82
Authored: 2023-03-15 15:56:55 +0200
Committed: 2023-03-21 08:57:42 +0200

arm64: ipred: Rename a misnamed local label in the assembly

This is for cases with h >= 16.

Files Modified:

  • src/arm/64/ipred.S

da9602a32b3607d4eece23e705078d4102b45bd1 by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/da9602a32b3607d4eece23e705078d4102b45bd1
Authored: 2023-03-15 15:48:43 +0200
Committed: 2023-03-21 08:57:42 +0200

arm64: ipred: Fix a misindented operand in the assembly

Files Modified:

  • src/arm/64/ipred.S

50a89b6383701e4eb2980e21584e14b75b7cee1f by Martin Storsjö

https://code.videolan.org/videolan/dav1d/commit/50a89b6383701e4eb2980e21584e14b75b7cee1f
Authored: 2023-03-08 11:15:57 +0200
Committed: 2023-03-21 08:57:42 +0200

arm: ipred: Fix a misindented line in the C wrapper

Files Modified:

  • src/arm/ipred.h

16c943484e63da7ed8a8a8d85af88995369f23cd by Victorien Le Couviour--Tuffet

https://code.videolan.org/videolan/dav1d/commit/16c943484e63da7ed8a8a8d85af88995369f23cd
Authored: 2023-03-13 16:08:56 +0100
Committed: 2023-03-16 16:09:46 +0100

x86: Add refmvs.save_tmvs AVX-512 (Ice Lake) asm

Files Modified:

  • src/x86/refmvs.asm
  • src/x86/refmvs.h

The try push is done, and it found jobs with unclassified failures.

Needs Investigation (From Push Health):

  • No tests were found for flavor 'plain' and the following manifest filters:
    skip_if, run_if, fail_if, subsuite(name=media), tags(['media-engine-compatible']), pathprefix(['dom/media/autoplay/test/mochitest/mochitest.ini', 'dom/media/test/mochitest_background_video.ini', 'dom/media/test/mochitest_seek.ini', 'dom/media/webrtc/tests/mochitests/identity/mochitest.ini', 'dom/media/webrtc/tests/mochitests/mochitest_datachannel.ini', 'dom/media/webrtc/tests/mochitests/mochitest_getusermedia.ini', 'dom/media/webspeech/recognition/test/mochitest.ini'])

    Make sure the test paths (if any) are spelt correctly and the corresponding
    --flavor and --subsuite are being used. See mach mochitest --help for a
    list of valid flavors.

    • 4 of 4 failed on the same (retriggered) task
      - test-windows11-64-2009-qr/opt-mochitest-media-wmfme (Vv3FfpYWRkS9lMrYXSxLFg)
      - test-windows11-64-2009-qr/opt-mochitest-media-wmfme (Gd0OB712T1qtMCWBVW_qDQ)
      - test-windows11-64-2009-qr/opt-mochitest-media-wmfme (KZdunINETBa4ktLZfyWPOw)
      - test-windows11-64-2009-qr/opt-mochitest-media-wmfme (Unn-CDJDRDKJPdmsHS9C3Q)

These failures could mean that the library update changed something and caused
tests to fail. You'll need to review them yourself and decide where to go from here.

In either event, I have done all I can and you will need to take it from here. If you
don't want to land my patch, you can replicate it locally for editing with
./mach vendor media/libdav1d/moz.yaml

When reviewing, please note that this is external code, which needs a full and
careful inspection - not a rubberstamp.

Assignee: nobody → cchang
Flags: needinfo?(cchang)
Pushed by cchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fbf7be8c7ad2 Update dav1d to 5aa3b38f9871859e14e55f18ab5e38318fe86305 r=chunmin
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch
Flags: needinfo?(cchang)