Closed Bug 1804104 Opened 2 years ago Closed 11 days ago

Support storing into available capacity in StoreDenseElementHole

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED
138 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox-esr115 --- wontfix
firefox-esr128 --- wontfix
firefox107 --- unaffected
firefox108 --- unaffected
firefox109 --- wontfix
firefox110 --- wontfix
firefox136 --- wontfix
firefox137 --- wontfix
firefox138 --- fixed

People

(Reporter: mayankleoboy1, Assigned: iain, Mentored)

References

(Blocks 2 open bugs, Regressed 1 open bug, Regression)

Details

(Keywords: perf-alert, regression)

Attachments

(2 files)

Keywords: regression

Iain, bug 1797486 is in the range in comment 0. Mind taking a quick look to see if this is reproducible and easily fixable?

Flags: needinfo?(iireland)

It's reproducible. Not immediately sure about easily fixable.

The relevant code looks something like this:

for (var i = 1; i <= 3; i++) {
  var m = (1 << i) * 10000;
  var flags = Array(m+1);
  for (var j = 2; j <= m; j++) {
    flags[j] = true;
  }
  // Do something with flags.
}

We allocate a large array, then we initialize all but the first two elements.

Before my patch, this code had bimodal behaviour:

  1. The first few iterations of the inner loop happen in the C++ interpreter. We don't attach any ICs.
  2. Eventually, we tier up to blinterp and attach a StoreDenseElementHole IC.
  3. We kick off an OSR compilation on a background thread.
    4A: In the fast case, we finish the compilation before the second iteration of the outer loop. Because the pre-patch code supported indices > initializedLength, we don't bail out when writing to flags[2]: the initialized length is 2, but the capacity is 2046. We continue running in Ion with StoreDenseElementHole.
    4B: In the slow case, we start the second iteration of the loop before tiering up to Ion. When we try writing to flags[2], the guard fails, and we instead attach (and eventually transpile) CallAddOrUpdateSparseElement.

After my patch, both baseline ICs and Ion only support writing one past the initialized length, even if we have the capacity.

One option for fixing this would be to add support (in both CacheIR and Ion) for the OOB case where we have the capacity. We'd have to add a loop to fill the gaps with the magic hole value.

Flags: needinfo?(iireland)

This bug has been marked as a regression. Setting status flag for Nightly to affected.

Component: General → JavaScript: GC
Product: Firefox → Core
Regressed by: 1797486
Component: JavaScript: GC → JavaScript Engine: JIT

Set release status flags based on info from the regressing bug 1797486

Severity: -- → S3
Priority: -- → P2
Whiteboard: [sp3]

Shold we accept the regression and close this bug?

Flags: needinfo?(iireland)

We can accept the regression (in the sense that there's no urgency to fix this for the sake of sunspider), but it would still be nice to fix this at some point. This is a reasonable good-second-bug candidate. It's not a high priority, but it's also fairly small and self-contained.

I'll bump down the priority, switch it to a task, and mark myself as a mentor.

Quick summary: StoreDenseElementHole is used in cases where we are writing to an object with dense elements, but we might be writing past the end of the current elements. In both the CacheIRCompiler and Ion implementations of StoreDenseElementHole, we only support writing to the next empty element, even if we have allocated capacity for more. The primary reason for this is that we have to initialize the gap between the existing elements and the new one we're adding with the magic hole value, which requires us to generate a loop. We also have to make sure that the NON_PACKED flag is set on the elements, now that we've introduced a hole.

Mentor: iireland
Severity: S3 → S4
Type: task → defect
Flags: needinfo?(iireland)
Priority: P2 → P3
Summary: 43% regression on AWFY-Sunspider-access-nsieve on 30-Nov → Support storing into available capacity in StoreDenseElementHole
Type: task → defect
Severity: S4 → --
Type: task → defect
Severity: -- → S4
Blocks: 1951338

The two code paths were already almost identical. The only differences were that the Ion code used setupAlignedABICall and had a small optimization to reuse the condition flags an earlier test (on architectures with condition flags).

Assignee: nobody → iireland
Status: NEW → ASSIGNED

With this patch, we have jitcode support for storing anywhere within the allocated elements capacity, or one item past.

The assertion in addDenseElementPure is no longer correct because we reallocate the capacity before we initialize the holes and update initLength.

We could add support for writing more than one element past the capacity, but we'd have to call a different VM function and think more carefully about when should go sparse.

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/75fc4a9ec050 Share more StoreDenseElementHole code r=jandem https://hg.mozilla.org/integration/autoland/rev/dbeb07f4be69 Initialize holes in prepareOOBStoreElement r=jandem
Status: ASSIGNED → RESOLVED
Closed: 11 days ago
Resolution: --- → FIXED
Target Milestone: --- → 138 Branch

Cross-posting from Bug 1951338:

Either this bug or Bug 1951338 lead to some improvements:

SP3:
Windows: 2.8% on SP3-perf-Dashboard-Total , which was driven by 7.2% improvement on SP3-perf-Dashboard-SelectingPoints-total
Android: 4.8% on Android SP3-perf-Dashboard-SelectingPoints-total

Sunspider
54% on Sunspider-access-nsieve (Probably due to bug 1804104)

Jetstream2
7.6% on Segmentation
3% on Jetstream2-Lebab-wtb

It was this bug.

Perf-Dashboard profiles before/after:
Before: https://share.firefox.dev/4c1XCCr (SetElem IC in findOptimalSegmentationInternal)
After: https://share.firefox.dev/3RjJo6g (no SetElem IC)

Regressions: 1954419

(In reply to Pulsebot from comment #9)

Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/75fc4a9ec050
Share more StoreDenseElementHole code r=jandem
https://hg.mozilla.org/integration/autoland/rev/dbeb07f4be69
Initialize holes in prepareOOBStoreElement r=jandem

Perfherder has detected the following browsertime performance change.

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
17% speedometer3 Perf-Dashboard/SelectingPoints/Sync linux1804-64-nightlyasrelease-qr fission webrender 45.08 -> 37.61 Before/After
16% speedometer3 Perf-Dashboard/SelectingPoints/total linux1804-64-nightlyasrelease-qr fission webrender 48.24 -> 40.67 Before/After
8% speedometer3 Perf-Dashboard/total linux1804-64-nightlyasrelease-qr fission webrender 101.38 -> 93.47 Before/After
6% speedometer3 Perf-Dashboard/SelectingPoints/Sync windows11-64-24h2-nightlyasrelease fission webrender 21.50 -> 20.16 Before/After
6% speedometer3 Perf-Dashboard/SelectingPoints/total windows11-64-24h2-nightlyasrelease fission webrender 23.53 -> 22.12 Before/After
... ... ... ... ... ...
3% speedometer3 Perf-Dashboard/total windows11-64-24h2-nightlyasrelease fission webrender 52.54 -> 51.04 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 44345

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: