Closed Bug 1701648 Opened 4 years ago Closed 4 years ago

2.11 - 4.26% compiler_metrics num_static_constructors (android-5-0-x86_64, linux64, osx-cross, windows2012-32, windows2012-64) regression on push 261f77f6af13bbbfd2df0b7e71098e061853d3c6 (Fri March 26 2021)

Categories

(Toolkit :: Telemetry, defect, P1)

Firefox 89
defect

Tracking

()

RESOLVED FIXED
89 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox87 --- unaffected
firefox88 --- unaffected
firefox89 --- fixed

People

(Reporter: alexandrui, Assigned: chutten)

References

(Regression)

Details

(Keywords: perf-alert, regression)

Attachments

(2 files)

Perfherder has detected a build_metrics performance regression from push 261f77f6af13bbbfd2df0b7e71098e061853d3c6. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
4% compiler_metrics num_static_constructors android-5-0-x86_64 94.00 -> 98.00
4% compiler_metrics num_static_constructors osx-cross 104.00 -> 108.00
3% compiler_metrics num_static_constructors windows2012-32 118.00 -> 122.00
3% compiler_metrics num_static_constructors android-5-0-x86_64 140.00 -> 144.00
3% compiler_metrics num_static_constructors linux64 141.00 -> 145.00
3% compiler_metrics num_static_constructors osx-cross 152.00 -> 156.00
3% compiler_metrics num_static_constructors windows2012-32 160.00 -> 164.00
3% compiler_metrics num_static_constructors windows2012-64 160.00 -> 164.00
2% compiler_metrics num_static_constructors linux64 190.00 -> 194.00

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(chutten)

Oh, static constructors are forbidden? I wasn't aware.

If I'm not mistaken, they four are:

  1. static StaticDataMutex<nsTHashMap<TimerId, TimeStamp>> gTimerIdToStarts("gTimerIdToStarts");
  2. static StaticDataMutex<SubmetricToLabeledMirrorMapType> gLabeledMirrors("gLabeledMirrors");
  3. static StaticDataMutex<nsTHashMap<ScalarIDHashKey, TimeStamp>> gTimespanStarts("gTimespanStarts");
  4. Android-only static Atomic<uint64_t> gNextTimerId(1);

I suppose I could work around them via factories with lazy initialization. Adds complexity, though.

What is the goal in keeping the number of static ctors low? I want to make sure if I rewrite these I'm actually serving the purpose and not just working around a test.

Flags: needinfo?(chutten) → needinfo?(aionescu)

Set release status flags based on info from the regressing bug 1685406

Ah, two links deep I was directed to the archived page https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics which says

Number of static constructors found by the compiler in the Firefox C++ codebase. Lower is better. Static constructors are undesirable because their initialization imposes an unavoidable time penalty every time Firefox is started.

Looks as though the pattern in the codebase to resolve this is to use StaticDataMutex<StaticRefPtr<T>>. That, plus a getter, should solve each of these... though I'm not sure that gNextTimerId is actually one of them.

How do I run this build metrics task locally? I grepped the codebase for num_static_constructors and only found something in crashreporter. Skimming mach mach-commands I didn't find anything terribly obvious.

Assignee: nobody → chutten
Status: NEW → ASSIGNED
Priority: -- → P1

Unfortunately I don't have knowledge about how to run these tests locally. Maybe :mhentges can help you more.

Flags: needinfo?(aionescu) → needinfo?(mhentges)

Hmm, good question. I'll dig through the CI tools and see if I can find answers, narrating my steps here because they aren't always obvious:

  1. From the alert summary, I went down to the line for the linux64-shippable (not instrumented) because, from my perspective, the Linux jobs are the most simple.
  2. I clicked the "Graph" link, which brought me to the num_static_constructors graph.
  3. Clicking on the dot where it jumps and clicking "Job" shows me this TreeHerder page.
  4. We're interested in seeing how that number on the graph is calculated, so I looked into the task, and found this line:
PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"name": "compiler_metrics", "subtests": [{"name": "num_static_constructors", "value": 143, "alertChangeType": "absolute", "alertThreshold": 3}]}]}
  1. This seems about right, but it's unclear how the compilation of Firefox is deciding to print this log line. It looks like it comes from this code.

My takeaways are:

TL;DR: You probably need to run a build locally with similiar options as that CI task.
Let me know if that's enough information, or if you'd prefer that I dig further.
Good luck! :)

Flags: needinfo?(mhentges)

I've had some small success! Looks like a way to make this work is:

$ ./mach python toolkit/crashreporter/tools/symbolstore.py --count-ctors ~/.mozbuild/dump_syms/dump_syms /tmp obj-x86_64-pc-linux-gnu/dist/bin/libxul.so
Beginning work for file: obj-x86_64-pc-linux-gnu/dist/bin/libxul.so
Processing file: obj-x86_64-pc-linux-gnu/dist/bin/libxul.so
/home/chutten/.mozbuild/dump_syms/dump_syms obj-x86_64-pc-linux-gnu/dist/bin/libxul.so
libxul.so/E664C2A677986D037D85079A4C3C09060/libxul.so.sym
PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"name": "compiler_metrics", "subtests": [{"name": "num_static_constructors", "value": 144, "alertChangeType": "absolute", "alertThreshold": 3}]}]}
Finished processing obj-x86_64-pc-linux-gnu/dist/bin/libxul.so in 26.82s

I'm using /tmp as a dumping ground for the generated symbols, and obj-x86_64-pc-linux-gnu is my objdir.

144 is one less than the number in the alert, which makes sense since I'd already converted one of the StaticDataMutex<T> to StaticDataMutex<UniquePtr<T>>. I think this works!

And while I'm here, add TimespanMetric::Cancel, which was missed.

Pushed by chutten@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f1a9aaf49557 Replace static ctors with factories r=TravisLong
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: