Register allocation and Ion compile times are too slow on Android during Speedometer3
Categories: Core :: JavaScript Engine: JIT, defect, P2
Performance Impact: medium
People: (Reporter: denispal, Unassigned)
References: (Depends on 1 open bug, Blocks 2 open bugs)
Whiteboard: [sp3]
Attachments: 1 file (112.09 KB, image/png)
Ion compile times can be very slow on Android. A significant problem we have running Speedometer3 is that we do not spend enough time in Ion during the short subtest windows, and making compiles faster should theoretically help us reach Ion sooner for more functions.
I've attached some examples from perfetto of the NewsSite-Next subtest, where Ion compilations are taking up to 24ms but the subtest itself is only 83ms long. Since we start compiling so late due to thresholds, we don't usually have much time in this tier during execution at all.
A significant portion of the Ion compile time is spent during register allocation. It is often around 50% of the compile time, and can sometimes be as high as 80%+. It might be useful to experiment with a linear scan allocator to minimize this time.
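For context, a minimal sketch of what a linear scan allocator does (in the classic Poletto & Sarkar style, not SpiderMonkey code): live intervals are walked in order of start position, expired intervals free their registers, and when no register is free the interval ending furthest away is spilled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// One live range per virtual register, half-open [start, end).
struct Interval {
    uint32_t vreg;
    uint32_t start, end;
    int assigned = -1;  // physical register, or -1 if spilled
};

void linearScan(std::vector<Interval>& intervals, int numRegs) {
    std::sort(intervals.begin(), intervals.end(),
              [](const Interval& a, const Interval& b) { return a.start < b.start; });

    // Active intervals, ordered by increasing end point.
    auto byEnd = [](Interval* a, Interval* b) {
        return a->end != b->end ? a->end < b->end : a->vreg < b->vreg;
    };
    std::set<Interval*, decltype(byEnd)> active(byEnd);

    std::vector<int> freeRegs;
    for (int r = numRegs - 1; r >= 0; r--) freeRegs.push_back(r);

    for (auto& cur : intervals) {
        // Expire intervals that ended before the current one starts.
        while (!active.empty() && (*active.begin())->end <= cur.start) {
            freeRegs.push_back((*active.begin())->assigned);
            active.erase(active.begin());
        }
        if (!freeRegs.empty()) {
            cur.assigned = freeRegs.back();
            freeRegs.pop_back();
            active.insert(&cur);
        } else {
            // No register free: spill whichever interval ends last.
            Interval* victim = *active.rbegin();
            if (victim->end > cur.end) {
                cur.assigned = victim->assigned;
                victim->assigned = -1;
                active.erase(std::prev(active.end()));
                active.insert(&cur);
            }  // else cur itself stays spilled (assigned == -1)
        }
    }
}
```

The appeal for a JIT is the single O(n log n) pass over intervals, versus the repeated eviction/retry rounds a backtracking allocator performs; the cost is worse allocation quality on code with many overlapping ranges.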
Comment 1 • 5 months ago (Reporter)
A simpleperf profile of the compile times during the NewsSite-Next subtest: https://share.firefox.dev/4fOV4cq. Roughly 40% of the time is spent in regalloc.
Comment 2 • 5 months ago
Yulia mentioned a research paper last week in which they select which virtual registers get a chance to be in a register by measuring the density of each virtual register's uses in a window around the instruction being studied. Maybe a different approach like this one could be more efficient for a JIT.
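A rough sketch of that density idea, assuming a very simplified model (the function name, data layout, and tie-breaking are mine, not the paper's): for the instruction at index `at`, count each virtual register's uses within `window` instructions on either side, then hand the available registers to the densest vregs.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// usesAt[i] lists the virtual registers used by instruction i.
// Returns up to numRegs vreg ids, densest first (ties broken by lower id).
std::vector<uint32_t> densestVregs(const std::vector<std::vector<uint32_t>>& usesAt,
                                   size_t at, size_t window, size_t numRegs) {
    size_t lo = at > window ? at - window : 0;
    size_t hi = std::min(usesAt.size(), at + window + 1);

    // Count uses of each vreg inside the window.
    std::unordered_map<uint32_t, uint32_t> density;
    for (size_t i = lo; i < hi; i++)
        for (uint32_t vreg : usesAt[i])
            density[vreg]++;

    // Rank vregs by descending use count.
    std::vector<std::pair<uint32_t, uint32_t>> ranked(density.begin(), density.end());
    std::sort(ranked.begin(), ranked.end(), [](auto& a, auto& b) {
        return a.second != b.second ? a.second > b.second : a.first < b.first;
    });

    std::vector<uint32_t> winners;
    for (size_t i = 0; i < ranked.size() && i < numRegs; i++)
        winners.push_back(ranked[i].first);
    return winners;
}
```

The attraction for a JIT is that this is a local, single-pass heuristic: no global live-range analysis or backtracking, just a sliding-window count.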
Comment 3 • 5 months ago
> It might be useful to experiment with a linear scan allocator to minimize
> this time.
Building a new allocator and getting it production-ready is a big undertaking.
There are a couple of things we could try to make the existing allocator
modestly faster:
- On mobile, skip the spill-bundle allocation loop (tryAllocatingRegistersForSpillBundles). Per comments at [1], this chews up a bunch of time but almost never improves the allocation.
- We know that when allocating large functions, the RA causes a large number of cache misses because it repeatedly traverses large AVL trees (of register commitments). We could try to reduce the footprint of the trees by replacing the inter-node pointers with 32-bit array indices -- a relatively easy change. Or we could replace the trees with B-trees, which are claimed to be more cache-friendly.
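The second suggestion can be sketched as follows: keep all nodes in one contiguous pool and link them with 32-bit indices instead of 64-bit pointers, so nodes shrink and traversals touch fewer cache lines. For brevity this uses an unbalanced BST where the real allocator uses AVL trees, and all names are illustrative, not SpiderMonkey's.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kNil = 0xFFFFFFFF;  // "null" child index

struct Node {
    uint32_t left = kNil, right = kNil;  // 4-byte links vs 8-byte pointers
    int32_t key;
};

struct IndexTree {
    std::vector<Node> pool;  // all nodes live contiguously here
    uint32_t root = kNil;

    void insert(int32_t key) {
        uint32_t n = uint32_t(pool.size());
        pool.push_back(Node{kNil, kNil, key});
        // Walk down from the root, following the link we will overwrite.
        uint32_t* link = &root;
        while (*link != kNil) {
            Node& cur = pool[*link];
            link = key < cur.key ? &cur.left : &cur.right;
        }
        *link = n;
    }

    bool contains(int32_t key) const {
        uint32_t i = root;
        while (i != kNil) {
            if (key == pool[i].key) return true;
            i = key < pool[i].key ? pool[i].left : pool[i].right;
        }
        return false;
    }
};
```

Besides halving link size, the pool layout means successive insertions are adjacent in memory, which is where the cache-miss reduction for large trees would come from.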
> Yulia was mentioning a research paper last week where they select which
> virtual registers gets a chance to be in a register by measuring the density
> of the virtual register uses
The paper is an interesting read. If we do want to try out a new allocator
design, I think this might be worth trying instead of a linear-scan allocator.
[1] https://searchfox.org/mozilla-central/source/js/src/jit/BacktrackingAllocator.cpp#4626
Comment 4 • 5 months ago
I am going to try Julian's suggestion above of skipping the call to tryAllocatingRegistersForSpillBundles. I'm curious to see how much that changes compilation time and how it impacts overall performance.
Comment 5 • 5 months ago
I tried this out and here is the comparison report (on Linux and Windows 11 at least - the A51 jobs never ran for some reason). At least on those platforms, it seems to not make much difference.
I think that I also ran the comparison on Pixel 6 - let me see if I can find that.