1708381 - Tremendous slowdown for C64 emulator with emscripten 2.0.17 due to more aggressive inlining

Lars T Hansen [:lth]

Reporter

Description

3 years ago

Original issue here: https://github.com/emscripten-core/emscripten/issues/13899

STR (for now - the content at "Part 2" may change to mask this bug at some point in the nearish future):

Part 1

download this file and save it: https://sourceforge.net/p/vice-emu/code/HEAD/tree/trunk/vice/data/DRIVES/d1541II?format=raw
go to https://dirkwhoffmann.github.io/virtualc64web/
click on "Install Open Roms"
click on "Disk drive ROM" and open the file you saved
click Close
open developer console, and observe the frame rate
it should be "good", approaching 60fps

Part 2

repeat part 1 (you can reuse the downloaded file) but go instead to http://vc64web.github.io/ for the emulator
observe that the frame rate sucks (I get less than 20fps on my very beefy desktop system)

This was originally reported on an ancient mac but repros on a current Linux system too.
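
To put comparable numbers on the two runs, independently of whatever the emulator page itself logs, something like the following could be pasted into the developer console (after stripping the type annotations). This is a minimal sketch: `averageFps`, `sampleFps` and `FrameScheduler` are hypothetical helper names, not part of the emulator or of Firefox, and it assumes frame timestamps delivered by `requestAnimationFrame` are a fair proxy for the emulator's frame rate.

```typescript
// Schedules a callback for the next frame. In a browser, pass
// (cb) => requestAnimationFrame(cb). Hypothetical helper type.
type FrameScheduler = (callback: (timestampMs: number) => void) => unknown;

// Average frames per second over a list of frame timestamps (milliseconds).
function averageFps(timestampsMs: number[]): number {
  if (timestampsMs.length < 2) return 0;
  const elapsedMs = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  if (elapsedMs <= 0) return 0;
  // N timestamps span N - 1 frame intervals.
  return ((timestampsMs.length - 1) * 1000) / elapsedMs;
}

// Collects `frameCount` frame timestamps, then reports the average FPS.
function sampleFps(
  schedule: FrameScheduler,
  frameCount: number,
  report: (fps: number) => void
): void {
  const timestamps: number[] = [];
  const onFrame = (timestampMs: number): void => {
    timestamps.push(timestampMs);
    if (timestamps.length < frameCount) {
      schedule(onFrame);
    } else {
      report(averageFps(timestamps));
    }
  };
  schedule(onFrame);
}

// Browser usage (not executed here):
//   sampleFps((cb) => requestAnimationFrame(cb), 120, (fps) => console.log(fps));
```

Taking the scheduler as a parameter keeps the arithmetic testable outside a browser; in the console it is just a thin wrapper around `requestAnimationFrame`.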

Lars T Hansen [:lth]

Reporter

Comment 1

3 years ago

This is almost certainly a problem with Ion. Disabling baselinejit does not make a difference; disabling optimizingjit brings the FPS up to at least 50.

Lars T Hansen [:lth]

Reporter

Comment 2

3 years ago

This does not appear to be a problem on the M1 MacBook Pro, suggesting that it is a problem related to code generation or register allocation on x64.

Lars T Hansen [:lth]

Reporter

Comment 3

3 years ago

It is also a problem on Windows 10 x64, which is not totally surprising, though Windows has a different ABI, so there are differences in how register allocation is done.

Severity: -- → S3

OS: Unspecified → All

Priority: -- → P2

Lars T Hansen [:lth]

Reporter

Comment 4

3 years ago

Edited

Running Ion-only: the profile with the "bad" code shows massive jank and very long requestAnimationFrame callback running times. Most of the time (72%) is spent in a single wasm function (labeled "612"). The profile with the "good" code has none of those problems; it's mostly idle.

Running baseline-only, the profile for the "bad" code resembles the Ion profile for the "good" code: there's no jank, and animation callbacks take 11-14ms.
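
For context on those callback times: at 60fps each frame has a budget of 1000/60, about 16.7ms, so 11-14ms callbacks fit and anything much longer drops frames. A rough sketch of that comparison follows. `summarizeCallbacks` and `CallbackSummary` are hypothetical names, the 60 Hz refresh rate is an assumption, and the over-budget durations in the usage line are made-up illustration values, not measurements from the profile.

```typescript
// Frame budget for a 60 Hz display, in milliseconds (assumed refresh rate).
const FRAME_BUDGET_MS = 1000 / 60;

interface CallbackSummary {
  overBudget: number; // callbacks that took longer than the budget
  worstMs: number;    // longest callback seen
}

// Compares animation-callback durations against a per-frame budget.
function summarizeCallbacks(
  durationsMs: number[],
  budgetMs: number = FRAME_BUDGET_MS
): CallbackSummary {
  let overBudget = 0;
  let worstMs = 0;
  for (const durationMs of durationsMs) {
    if (durationMs > budgetMs) overBudget++;
    if (durationMs > worstMs) worstMs = durationMs;
  }
  return { overBudget, worstMs };
}

// Callbacks in the 11-14ms range all fit within a 60 Hz frame.
const withinBudget = summarizeCallbacks([11, 12, 14]);
// Illustrative (invented) durations for a janky run.
const janky = summarizeCallbacks([11, 40, 55]);
```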

Lars T Hansen [:lth]

Reporter

Comment 5

3 years ago

Attached file vC64.wat.gz

Disassembly of the bad code (vC64.wasm), 23MB. Function 612 is the mother of all br_tables.

Hard to say how to approach this. We no longer have an alternative register allocator, and simply turning off the other optimization passes will only tell us something if we can assume the allocator is correct.

The tremendous slowdown suggests that something goes very wrong - that a register or local is being clobbered, leading to a long-running loop for example. It's not just an optimization that fails, or something locally suboptimal. (In principle, what goes wrong could have gone wrong elsewhere, and set a status variable to a bogus value, say, but since fiddling with the inlining limit makes the bug come and go, it's more likely that the problem is within the one function.)
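
As background on where a very large br_table tends to come from: compilers typically lower a dense `switch` to a wasm `br_table`, and aggressive inlining of the case bodies makes the enclosing function grow. The sketch below shows that shape with a toy opcode dispatcher. It is an assumption, not something the report confirms, that function 612 is this kind of dispatch loop. The emulator itself is compiled to wasm by Emscripten, so the TypeScript here only illustrates the control-flow shape, and `ToyCpu`, `step` and `run` are hypothetical names.

```typescript
// Hypothetical toy CPU state; not taken from the emulator's source.
interface ToyCpu {
  a: number;
  x: number;
  pc: number;
  halted: boolean;
}

// Executes one instruction. A dense switch like this is the pattern that
// usually becomes a br_table; with every case body inlined, the function
// holding the switch can become very large.
function step(cpu: ToyCpu, memory: Uint8Array): void {
  const opcode = memory[cpu.pc++];
  switch (opcode) {
    case 0xa9: // LDA #imm: load the next byte into A
      cpu.a = memory[cpu.pc++];
      break;
    case 0xaa: // TAX: copy A to X
      cpu.x = cpu.a;
      break;
    case 0xe8: // INX: increment X, wrapping at 8 bits
      cpu.x = (cpu.x + 1) & 0xff;
      break;
    case 0xca: // DEX: decrement X, wrapping at 8 bits
      cpu.x = (cpu.x - 1) & 0xff;
      break;
    case 0xea: // NOP
      break;
    default: // anything else, including BRK (0x00), halts this toy CPU
      cpu.halted = true;
  }
}

// Runs until the CPU halts or maxSteps is reached; returns steps executed.
function run(cpu: ToyCpu, memory: Uint8Array, maxSteps: number): number {
  let steps = 0;
  while (!cpu.halted && steps < maxSteps) {
    step(cpu, memory);
    steps++;
  }
  return steps;
}
```

In a loop of this shape, state such as `pc` stays live across every case, which is one reason a register-allocation problem inside a single huge function could show up as a slowdown of the whole emulator.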

vC64.wat.gz (1.00 MB, application/gzip), attached 3 years ago by Lars T Hansen [:lth]
hotblocks-baseline.txt (23.21 KB, text/plain), attached 3 years ago by Julian Seward [:jseward]
hotblocks-ion-fast.txt (20.28 KB, text/plain), attached 3 years ago by Julian Seward [:jseward]
hotblocks-ion-slow.txt (113.31 KB, text/plain), attached 3 years ago by Julian Seward [:jseward]
The offending basic block -- the interesting bits, at least (3.69 KB, text/plain), attached 3 years ago by Julian Seward [:jseward]
Tiny standalone test case (1.09 KB, text/plain), attached 3 years ago by Julian Seward [:jseward]