Closed Bug 1760804 Opened 2 years ago Closed 2 years ago

148.38 - 82.72% ts_paint_webext / startup_about_home_paint + 12 more (Windows) regression on Thu March 17 2022

Categories

(Core :: Audio/Video: Playback, defect, P1)

defect

Tracking

()

RESOLVED FIXED
101 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox99 --- unaffected
firefox100 + fixed
firefox101 --- fixed

People

(Reporter: aglavic, Assigned: Zaggy1024)

References

(Regression)

Details

(4 keywords)

Attachments

(2 files)

Perfherder has detected a talos performance regression from push 88a481adf83070d1cdcc2bb074f6545b5dfe97e7. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
148% ts_paint_webext windows10-64-shippable-qr e10s fission stylo webrender 417.54 -> 1,037.08
142% ts_paint windows10-64-shippable-qr e10s fission stylo webrender-sw 385.33 -> 930.75
128% ts_paint_webext windows10-64-shippable-qr e10s fission stylo webrender-sw 395.50 -> 900.75
126% ts_paint windows10-64-shippable-qr e10s fission stylo webrender 406.08 -> 916.92
122% ts_paint_webext windows10-64-shippable-qr e10s fission stylo webrender 426.33 -> 945.50
119% sessionrestore_many_windows windows10-64-shippable-qr e10s fission stylo webrender 465.75 -> 1,021.83
118% sessionrestore_many_windows windows10-64-shippable-qr e10s fission stylo webrender-sw 446.58 -> 971.42
113% sessionrestore windows10-64-shippable-qr e10s fission stylo webrender-sw 457.83 -> 977.00
105% startup_about_home_paint_cached windows10-64-shippable-qr e10s fission stylo webrender-sw 520.42 -> 1,066.17
98% startup_about_home_paint windows10-64-shippable-qr e10s fission stylo webrender-sw 557.25 -> 1,102.33
98% sessionrestore windows10-64-shippable-qr e10s fission stylo webrender 488.00 -> 965.17
91% startup_about_home_paint_realworld_webextensions windows10-64-shippable-qr e10s fission stylo webrender 616.62 -> 1,176.17
89% startup_about_home_paint_cached windows10-64-shippable-qr e10s fission stylo webrender 567.08 -> 1,072.25
83% startup_about_home_paint windows10-64-shippable-qr e10s fission stylo webrender 604.25 -> 1,104.08

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(Zaggy1024)

Hi :Zaggy1024, I just wanted to ask something about this regression.
The regression above was caused by 88a481adf83070d1cdcc2bb074f6545b5dfe97e7 and in that push there were 3 bugs, 1569686, 1652945, 1758605.
Do you know which of these 3 bugs might have caused this regression?

My first theory would be that Bug 1652945 caused startup to be slower when PDMFactory enumerates the supported media types due to using MFTEnumEx to get the usable WMF decoders. That function takes about 30-40ms to run to get one individual decoder, and running that once per format may be adding to the startup time.

It should be possible to rework the way this support check is done to only call MFTEnumEx once per process that supports WMF decoding, since we can just enumerate all the WMF decoders at once and then iterate over them to check each individual mimetype.

Is it somehow possible for me to run these performance tests as I work on a patch to improve this?

Flags: needinfo?(Zaggy1024)
Has Regression Range: --- → yes

:Zaggy1024, you can use the documentation here to run the Talos performance tests while working on a patch to improve the regression

WMFDecoderModule will now cache supported stream types upon decoder process startup. Only one task will be dispatched to the MTA thread where all support will be determined at once.

MFTDecoder now will call all functions necessary to fully free memory on its WMF objects.

Assignee: nobody → Zaggy1024
Status: NEW → ASSIGNED
Priority: -- → P1
Pushed by alwu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a8db18c94b60
Use MFT CLSIDs for decoder instantiation instead of MFTEnumEx, as it caused a performance regression in process startup. r=alwu
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch
Regressions: 1762728

Backed out for causing regression ( as per alwu request )

Backout link : https://hg.mozilla.org/integration/autoland/rev/caa02dcf67f0406200c4c1f9bb601d619c4d3eda

Status: RESOLVED → REOPENED
Flags: needinfo?(alwu)
Resolution: FIXED → ---
Flags: needinfo?(Zaggy1024)
Flags: needinfo?(alwu)

Fixes an issue where EMEs would check for WMF decoder support in the content process without having initialized PDMs.

Pushed by alwu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1848e97ba8f3
Use MFT CLSIDs for decoder instantiation instead of MFTEnumEx, as it caused a performance regression in process startup. r=alwu
https://hg.mozilla.org/integration/autoland/rev/db801d2822a8
Ensure that WMFDecoderModule::Init is called before MFT decoder support is returned. r=alwu
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED

Ni myself as an uplift reminder (if this patch doesn't cause any other regression on Nightly)

Flags: needinfo?(alwu)

My ni should be unnecessary now hopefully..

Flags: needinfo?(Zaggy1024)
Blocks: 1762125
Regressions: 1764032

Comment on attachment 9269294 [details]
Bug 1760804 - Use MFT CLSIDs for decoder instantiation instead of MFTEnumEx, as it caused a performance regression in process startup. r=alwu

Beta/Release Uplift Approval Request

  • User impact if declined: It causes performance degrade and crashes (bug 1760464).
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No (but the crash seems significantly reduced)
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The main change is basically reverting the change of how we request a decoder (bug 1652945), so we would back to the old way which we've been using for years. Because it's not adding a new way, the risk is relatively low.
  • String changes made/needed:
Flags: needinfo?(alwu)
Attachment #9269294 - Flags: approval-mozilla-beta?
Attachment #9270921 - Flags: approval-mozilla-beta?

Comment on attachment 9269294 [details]
Bug 1760804 - Use MFT CLSIDs for decoder instantiation instead of MFTEnumEx, as it caused a performance regression in process startup. r=alwu

Approved for 100.0b5

Attachment #9269294 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9270921 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Target Milestone: 100 Branch → 101 Branch
Blocks: 1760464
No longer regressed by: 1569686
No longer regressed by: 1758605
No longer blocks: 1762125

This was a large perf regression (and also a crash), so increasing severity to S2.

Severity: S3 → S2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: