Open Bug 1543852 Opened 5 years ago Updated 2 years ago

Evaluate using dromaeo_css for PGO training

Categories

(Firefox Build System :: General, enhancement)


Tracking

(Not tracked)

People

(Reporter: erahm, Unassigned)

References

(Blocks 1 open bug)

Details

We would like to evaluate using the dromaeo_css perf test for PGO training. This exercises an extensive rust code base used by the style system and will be useful once we have rust PGO enabled.

Work can start by building on top of the initial patches for bug 1437452 that enable rust PGO on Linux.

Type: defect → enhancement

The results of the work in bug 1543853 (to integrate StyleBench into the PGO runs) were not super promising, but now that we do have rust profiling, I wonder if we can add to our existing (somewhat JS/DOM-focused) test suite and improve stylo/CSS/layout performance with something in this space. Emilio, do you have ideas about what we could/should try here?

Flags: needinfo?(emilio)
Whiteboard: [qf]

I think stylebench and other "large" benchmarks (speedometer, etc...) of sorts are probably the right thing to use for PGO training.

Using micro-benchmarks like our perf-reftests and such (which is what bug 1543853 regressed) probably isn't such a great thing, in terms of improving the performance of Firefox for real workloads.

Dromaeo is a weird one because it's a collection of smaller benchmarks which are supposed to be representative of what some pages do... I don't have all the context, though, so I don't have a sense of what's a better idea.

I'd probably do bug 1543853 again if we now can have rust profiling and we didn't when we tried it.

Flags: needinfo?(emilio)

Somewhat out of order for reasons that will become apparent:

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

I'd probably do bug 1543853 again if we now can have rust profiling and we didn't when we tried it.

Sorry, I shouldn't have implied that we didn't have rust profiling for that work; we did.

I think stylebench and other "large" benchmarks (speedometer, etc...) of sorts are probably the right thing to use for PGO training.

Using micro-benchmarks like our perf-reftests and such (which is what bug 1543853 regressed) probably isn't such a great thing, in terms of improving the performance of Firefox for real workloads.

Are you saying we should accept a regression on perf-reftest if it improves stylebench? (though even so the regression in the last iteration dmajor ran in bug 1543853 was "mixed results", apparently - but of course by now it's too old to know exactly what happened; the comments in bug 1603482 are interesting though)

Also, if we're talking real workloads, should we run raptor page load / running tests as part of profiling instead?

Dromaeo is a weird one because it's a collection of smaller benchmarks which are supposed to be representative of what some pages do... I don't have all the context, though, so I don't have a sense of what's a better idea.

:dmajor, do you feel there's anything I'm missing here that you could clear up? :-)

Flags: needinfo?(emilio)
Flags: needinfo?(dmajor)

I don't have any test-specific thoughts. In general, I'm happy to see more training get added, as long as: it doesn't take too long, it's easy to run locally, and it doesn't completely skew profiles like bug 1603482 ("don't exceed the existing top function's counts" might be a reasonable approximation for this goal).
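As a rough illustration of the "don't exceed the existing top function's counts" heuristic, here is a minimal Python sketch. It assumes the existing training profile and the candidate workload's profile have each been reduced to a plain mapping from function name to execution count; that shape, the function names, and the numbers are all hypothetical, and nothing here reads real PGO profile data.

```python
from typing import Dict, List

# Hypothetical per-function execution counts; real PGO profile data would
# have to be converted into this shape first.
Profile = Dict[str, int]


def top_count(profile: Profile) -> int:
    """Largest execution count of any single function (0 if empty)."""
    return max(profile.values(), default=0)


def functions_over_limit(existing: Profile, candidate: Profile) -> List[str]:
    """Names in the candidate profile whose counts exceed the existing top count."""
    limit = top_count(existing)
    return sorted(name for name, count in candidate.items() if count > limit)


def skews_profile(existing: Profile, candidate: Profile) -> bool:
    """True if the candidate workload would displace the existing top function."""
    return bool(functions_over_limit(existing, candidate))


# Made-up numbers purely for illustration.
existing_training = {"existing_hot_function": 1_000_000, "other_function": 250_000}
candidate_workload = {"style_function_a": 4_000_000, "style_function_b": 300_000}

print(skews_profile(existing_training, candidate_workload))
print(functions_over_limit(existing_training, candidate_workload))
```

A check like this only approximates the "doesn't completely skew profiles" goal; it says nothing about how long the training takes or how easy it is to run locally.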

Flags: needinfo?(dmajor)

(In reply to :Gijs (he/him) from comment #3)

Are you saying we should accept a regression on perf-reftest if it improves stylebench? (though even so the regression in the last iteration dmajor ran in bug 1543853 was "mixed results", apparently - but of course by now it's too old to know exactly what happened; the comments in bug 1603482 are interesting though)

Depends on the specifics of course, but yes, I think a progression in speedometer/stylebench/etc is better than an equivalent perf-reftest speedup, generally.

The JS team probably has more experience evaluating benchmarks, but I think that's also where things are shifting these days... Microbenchmarks like octane and so on receive very little attention these days, IIRC.

Also, if we're talking real workloads, should we run raptor page load / running tests as part of profiling instead?

Running raptor is probably better than running tests (if by that you meant wpt / mochitests). We have a lot of tests for obscure things that won't happen in real-world use, but raptor is supposed to load real websites, so training with that seems worth it.

Flags: needinfo?(emilio)
Whiteboard: [qf]
Severity: normal → S3