Open Bug 1402257 Opened 4 years ago Updated 4 months ago
Tiering: More sophisticated performance estimation in tiering decision
The tiering policy code (bug 1380033) makes a tier/dont-tier decision based on the estimated performance of the current computer, in order to decide whether an Ion compilation can meet an initial compilation deadline. Currently the estimation is crude and buckets the computer into one of essentially four buckets, (desktop,mobile)x(32bit,64bit), with some allowance for dont-know. At a minimum we want to classify devices also along the high/med/low performance spectrum within each of those buckets, if we can. This is hard, because CPU clock speed is not a great predictor of performance, one has to worry about CPU class, manufacturer, caches, NUMA, and so on. (Eg, older AMD have fast clocks but are slow; more recent Intel have slow clocks but are fast.) Some investigation is needed, and we want robustness above all. An older attachment to bug 1380033 (https://bugzilla.mozilla.org/attachment.cgi?id=8906635) has some prototype code for futzing with clock speeds. See also the last paragraph on https://bugzilla.mozilla.org/show_bug.cgi?id=1380033#c7.
A related matter is that we disable tiering for single-core systems because we don't want to risk saturating the single core with "background" compilation. The decision is historical but was just brought forward into the new tier control system. This scheme is suboptimal; really, tiering would benefit single-core systems greatly if we could only control background compilation better. Notably, background compilation must run at lower priority. See bug 1402265. That said, single-core systems are almost extinct, at 3.5% of our installed base and having dropped by 20 percent / one percentage point since May of this year (*after* the EOL of Windows XP, so not affected by that particular event). We live in a world that is uniformly multi-core. This should affect our testing practices too; testing with --no-threads is hardly meaningful.
As a first approximation we want (soon) to use at least two cores for background compilation. But we don't want to do that if those are two hyperthreads. So we may need that info, or to be a little conservative. We may wish to revive the patch that pushes this kind of info into the engine so that we can rely on vetted code in Firefox doing it, without duplication. Would be good even for our current CPU count.
This is strictly quick-and-dirty, it guesstimates the number of physical cores that we can use for the background compilation work. Let me know what you think. To do better than this we'll need to actually compute the physical cores, which is more complex, and at that point I'll probably insist that we make the embedder (the shell, or firefox) pump that information into the engine rather than the engine maintaining its own code for that. So it'll be a bigger job. Not saying it shouldn't be done but asking whether we need to do that right now.
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Attachment #8916702 - Flags: review?(luke)
Comment on attachment 8916702 [details] [diff] [review] bug1402257-tier2-scheduling.patch Review of attachment 8916702 [details] [diff] [review]: ----------------------------------------------------------------- Thinking about this a bit more, I think the two most interesting cases are dual-core w/ and w/o hyperthreading. I wish we exposed logical cores (https://github.com/mozilla/firefox-hardware-report/issues/60), but I'm going to guess it's 50/50 since the whole Core2 line didn't have HT. I think for dual-core-with-hyperthreading, this heuristic will give 3 threads to tier-2 which will eat into the foreground core's perf and so I think the ideal heuristic would choose "2". Now for 4-core chips, I'm going to guess that most or even all have HT based on the assumption that 4-core didn't become a commodity product until HT was re-enabled. If that's true, then we'd conveniently have the property that cpuCount=4 probably indicates a dual-core-with-hyperthreading. This suggests the alternative heuristic: ceil(double(cpuCount) / 3.0). WDYT?
Other data (just to have it, and to confuse the issue further): * AMD = 12% of the FF population; AMD had no hyperthreading until mid-2017 and we've seen no uptick on that graph. * The last Core 2 appears to have been the Core 2 Quad, in January 2010, and had 4 cores and no HT. I'm happy with the change you propose given that it's more conservative at low core counts. We should keep this bug open and work on getting an actual reading into the engine, so that we can use that.
With a computation as proposed and a comment to match.
Comment on attachment 8916963 [details] [diff] [review] bug1402257-tier2-scheduling-v2.patch Review of attachment 8916963 [details] [diff] [review]: ----------------------------------------------------------------- Hah, yes, that does muddy things somewhat. Makes sense to leave open and revisit later.
Attachment #8916963 - Flags: review?(luke) → review+
So Frank Bertsch got back to us in https://github.com/mozilla/firefox-hardware-report/issues/60#issuecomment-335587002 with a query with precise physical/logical data: https://sql.telemetry.mozilla.org/queries/47219/source So it looks like dual-core w/ and w/o hyperthreading is almost exactly 50/50 as guessed while the quad-core w/o hyperthreading is 71%, quite different than what I guess.
Digging around a little on the web, many of the recent mobile-oriented (low wattage) Pentium and Celeron chips (N and J series) are non-HT quad-core. These would be natural for low/mid range consumer laptops, probably.
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/c7c66aac19da use some fraction of physical cores for tier-2 compiles. r=luke
Assignee: lhansen → nobody
Status: ASSIGNED → NEW
Summary: Wasm tiering: More sophisticated performance estimation in tiering decision → Tiering: More sophisticated performance estimation in tiering decision
You need to log in before you can comment on or make changes to this bug.