Open Bug 1402257 Opened 7 years ago Updated 2 years ago

[exploration] Tiering: More sophisticated performance estimation in tiering decision

Tracking

()

Status:

NEW

People

(Reporter: lth, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: leave-open)

Attachments

(1 file, 1 obsolete file)

bug1402257-tier2-scheduling.patch 7 years ago Lars T Hansen [:lth] 2.27 KB, patch		Details \| Diff \| Splinter Review
bug1402257-tier2-scheduling-v2.patch 7 years ago Lars T Hansen [:lth] 2.31 KB, patch	luke : review+	Details \| Diff \| Splinter Review

Lars T Hansen [:lth]

Reporter

Description

•

7 years ago

The tiering policy code (bug 1380033) makes a tier/dont-tier decision based on the estimated performance of the current computer, in order to decide whether an Ion compilation can meet an initial compilation deadline.  Currently the estimation is crude and buckets the computer into one of essentially four buckets, (desktop,mobile)x(32bit,64bit), with some allowance for dont-know.

At a minimum we want to classify devices also along the high/med/low performance spectrum within each of those buckets, if we can.  This is hard, because CPU clock speed is not a great predictor of performance, one has to worry about CPU class, manufacturer, caches, NUMA, and so on.  (Eg, older AMD have fast clocks but are slow; more recent Intel have slow clocks but are fast.)  Some investigation is needed, and we want robustness above all.

An older attachment to bug 1380033 (https://bugzilla.mozilla.org/attachment.cgi?id=8906635) has some prototype code for futzing with clock speeds.  See also the last paragraph on https://bugzilla.mozilla.org/show_bug.cgi?id=1380033#c7.

Lars T Hansen [:lth]

Reporter

Updated

•

7 years ago

Priority: -- → P3

Lars T Hansen [:lth]

Reporter

Comment 1

•

7 years ago

A related matter is that we disable tiering for single-core systems because we don't want to risk saturating the single core with "background" compilation.  The decision is historical but was just brought forward into the new tier control system.  This scheme is suboptimal; really, tiering would benefit single-core systems greatly if we could only control background compilation better.  Notably, background compilation must run at lower priority.  See bug 1402265.

That said, single-core systems are almost extinct, at 3.5% of our installed base and having dropped by 20 percent / one percentage point since May of this year (*after* the EOL of Windows XP, so not affected by that particular event).  We live in a world that is uniformly multi-core.  This should affect our testing practices too; testing with --no-threads is hardly meaningful.

Lars T Hansen [:lth]

Reporter

Comment 2

•

7 years ago

As a first approximation we want (soon) to use at least two cores for background compilation.  But we don't want to do that if those are two hyperthreads.  So we may need that info, or to be a little conservative.  We may wish to revive the patch that pushes this kind of info into the engine so that we can rely on vetted code in Firefox doing it, without duplication.  Would be good even for our current CPU count.

Lars T Hansen [:lth]

Reporter

Comment 3

•

7 years ago

Attached patch bug1402257-tier2-scheduling.patch (obsolete) — Details — Splinter Review

This is strictly quick-and-dirty, it guesstimates the number of physical cores that we can use for the background compilation work.  Let me know what you think.

To do better than this we'll need to actually compute the physical cores, which is more complex, and at that point I'll probably insist that we make the embedder (the shell, or firefox) pump that information into the engine rather than the engine maintaining its own code for that.  So it'll be a bigger job.  Not saying it shouldn't be done but asking whether we need to do that right now.

Assignee: nobody → lhansen

Status: NEW → ASSIGNED

Attachment #8916702 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 4

•

7 years ago

Comment on attachment 8916702 [details] [diff] [review]
bug1402257-tier2-scheduling.patch

Review of attachment 8916702 [details] [diff] [review]:
-----------------------------------------------------------------

Thinking about this a bit more, I think the two most interesting cases are dual-core w/ and w/o hyperthreading.  I wish we exposed logical cores (https://github.com/mozilla/firefox-hardware-report/issues/60), but I'm going to guess it's 50/50 since the whole Core2 line didn't have HT.  I think for dual-core-with-hyperthreading, this heuristic will give 3 threads to tier-2 which will eat into the foreground core's perf and so I think the ideal heuristic would choose "2".

Now for 4-core chips, I'm going to guess that most or even all have HT based on the assumption that 4-core didn't become a commodity product until HT was re-enabled.  If that's true, then we'd conveniently have the property that cpuCount=4 probably indicates a dual-core-with-hyperthreading.  This suggests the alternative heuristic: ceil(double(cpuCount) / 3.0).  WDYT?

Lars T Hansen [:lth]

Reporter

Comment 5

•

7 years ago

Other data (just to have it, and to confuse the issue further):

* AMD = 12% of the FF population; AMD had no hyperthreading until mid-2017
  and we've seen no uptick on that graph.

* The last Core 2 appears to have been the Core 2 Quad, in January 2010,
  and had 4 cores and no HT.

I'm happy with the change you propose given that it's more conservative at low core counts.  We should keep this bug open and work on getting an actual reading into the engine, so that we can use that.

Lars T Hansen [:lth]

Reporter

Comment 6

•

7 years ago

Attached patch bug1402257-tier2-scheduling-v2.patch — Details — Splinter Review

With a computation as proposed and a comment to match.

Attachment #8916702 - Attachment is obsolete: true

Attachment #8916702 - Flags: review?(luke)

Attachment #8916963 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 7

•

7 years ago

Comment on attachment 8916963 [details] [diff] [review]
bug1402257-tier2-scheduling-v2.patch

Review of attachment 8916963 [details] [diff] [review]:
-----------------------------------------------------------------

Hah, yes, that does muddy things somewhat.  Makes sense to leave open and revisit later.

Attachment #8916963 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 8

•

7 years ago

So Frank Bertsch got back to us in https://github.com/mozilla/firefox-hardware-report/issues/60#issuecomment-335587002 with a query with precise physical/logical data:

  https://sql.telemetry.mozilla.org/queries/47219/source

So it looks like dual-core w/ and w/o hyperthreading is almost exactly 50/50 as guessed while the quad-core w/o hyperthreading is 71%, quite different than what I guess.

Lars T Hansen [:lth]

Reporter

Comment 9

•

7 years ago

Digging around a little on the web, many of the recent mobile-oriented (low wattage) Pentium and Celeron chips (N and J series) are non-HT quad-core.  These would be natural for low/mid range consumer laptops, probably.

Pulsebot

Comment 10

•

7 years ago

Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7c66aac19da
use some fraction of physical cores for tier-2 compiles. r=luke

Lars T Hansen [:lth]

Reporter

Updated

•

7 years ago

Keywords: leave-open

Ryan VanderMeulen [:RyanVM]

Comment 11

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c7c66aac19da

Lars T Hansen [:lth]

Reporter

Updated

•

6 years ago

Component: JavaScript Engine: JIT → Javascript: Web Assembly

Lars T Hansen [:lth]

Reporter

Updated

•

6 years ago

Assignee: lhansen → nobody

Status: ASSIGNED → NEW

BugBot [:suhaib / :marco/ :calixte]

Comment 12

•

5 years ago

The leave-open keyword is there and there is no activity for 6 months.
:ajones, maybe it's time to close this bug?

Flags: needinfo?(ajones)

Lars T Hansen [:lth]

Reporter

Comment 13

•

5 years ago

(In reply to Release mgmt bot [:sylvestre / :calixte] from comment #12)

The leave-open keyword is there and there is no activity for 6 months.
:ajones, maybe it's time to close this bug?

No, it's not time to close this bug.

Flags: needinfo?(ajones)

BugBot [:suhaib / :marco/ :calixte]

Comment 14

•

5 years ago

The leave-open keyword is there and there is no activity for 6 months.
:lth, maybe it's time to close this bug?

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

5 years ago

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

5 years ago

Depends on: 1590305

Lars T Hansen [:lth]

Reporter

Updated

•

5 years ago

Blocks: 1590305

No longer depends on: 1590305

Lars T Hansen [:lth]

Reporter

Updated

•

5 years ago

Summary: Wasm tiering: More sophisticated performance estimation in tiering decision → Tiering: More sophisticated performance estimation in tiering decision

BugBot [:suhaib / :marco/ :calixte]

Comment 15

•

4 years ago

The leave-open keyword is there and there is no activity for 6 months.
:lth, maybe it's time to close this bug?

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

4 years ago

Flags: needinfo?(lhansen)

BugBot [:suhaib / :marco/ :calixte]

Comment 16

•

4 years ago

The leave-open keyword is there and there is no activity for 6 months.
:lth, maybe it's time to close this bug?

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

4 years ago

Flags: needinfo?(lhansen)

BugBot [:suhaib / :marco/ :calixte]

Comment 17

•

3 years ago

The leave-open keyword is there and there is no activity for 6 months.
:lth, maybe it's time to close this bug?

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

3 years ago

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Updated

•

2 years ago

Type: enhancement → task

Summary: Tiering: More sophisticated performance estimation in tiering decision → [exploration] Tiering: More sophisticated performance estimation in tiering decision

BugBot [:suhaib / :marco/ :calixte]

Comment 18

•

2 years ago

The leave-open keyword is there and there is no activity for 6 months.
:rhunt, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.

Flags: needinfo?(rhunt)

Ryan Hunt [:rhunt] (on leave until early May)

Updated

•

2 years ago

Flags: needinfo?(rhunt)

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.