Closed Bug 1728781 Opened 4 years ago Closed 4 years ago

Ion hangs on web.autocad.com

Tracking

()

Status:

RESOLVED FIXED

Milestone:

94 Branch

Tracking Flags:

Tracking

Status

firefox94

---

fixed

People

(Reporter: lth, Assigned: lth)

References

Details

Attachments

(1 file, 1 obsolete file)

WIP: Bug 1728781 - Short-circuit the regalloc in very large graphs. 4 years ago Lars T Hansen [:lth] 48 bytes, text/x-phabricator-request	nbp : feedback+ iain : feedback+	Details \| Review
Bug 1728781 - Do not merge in very large live ranges. r?nbp 4 years ago Lars T Hansen [:lth] 48 bytes, text/x-phabricator-request		Details \| Review

Lars T Hansen [:lth]

Assignee

Description

•

4 years ago

Create an account on web.autocad.com. Make your browser remember your login. If necessary, navigate into the webapp until you see a folder view with a folder "samples" but go no further.

Once logged in, open about:config and disable javascript.options.wasm_optimizingjit, make sure javascript.options.wasm_baselinejit is enabled, and enable javascript.options.wasm_verbose. Close all tabs and quit.

Reopen the browser from the command line. The folder view should be displayed. In the terminal window you should see something like this:

JavaScript warning: https://web.autocad.com/fabric.c89f4d74a33e58c54283eae0ff8d554c.js, line 2: WebAssembly verbose: async compileStreaming() started
JavaScript warning: https://web.autocad.com/fabric.c89f4d74a33e58c54283eae0ff8d554c.js, line 2: WebAssembly verbose: available wasm compilers: tier1=baseline tier2=none
JavaScript warning: , line 0: WebAssembly verbose: async compile succeeded

Open the samples folder, and open a demo project, the "Drive Roller Assembly" project is suitably small. It should display after a few seconds.

Now go to about:config and enable j.o.wasm_optimizingjit and disable j.o.wasm_baselinejit. Close all tabs and quit.

Reopen the browser from the command line, etc. Observe that the line 'async compile succeeded' never appears (I waited 12 minutes at one point). If you open a process viewer or htop or somesuch, you will see that Firefox is pegging one core. If you open the samples and try to open a sample, it will never load.

Lars T Hansen [:lth]

Assignee

Comment 1

•

4 years ago

•

Edited

Almost certainly this:

#0  0x00007ffff2bea207 in js::jit::CodePosition::operator<=(js::jit::CodePosition) const (this=0x7ffde21c3a20, other=...) at js/src/jit/RegisterAllocator.h:169
#1  js::jit::BacktrackingAllocator::tryMergeReusedRegister(js::jit::VirtualRegister&, js::jit::VirtualRegister&) (this=this@entry=0x7fffd87e5510, def=..., input=...) at js/src/jit/BacktrackingAllocator.cpp:1138
#2  0x00007ffff2be809a in js::jit::BacktrackingAllocator::mergeAndQueueRegisters() (this=this@entry=0x7fffd87e5510) at js/src/jit/BacktrackingAllocator.cpp:1273
#3  0x00007ffff2be7c0b in js::jit::BacktrackingAllocator::go() (this=0x7fffd87e5510) at js/src/jit/BacktrackingAllocator.cpp:885
#4  0x00007ffff2dce563 in js::jit::GenerateLIR(js::jit::MIRGenerator*) (mir=0x7fffd87e6090) at js/src/jit/Ion.cpp:1518
#5  0x00007ffff2f90484 in js::wasm::IonCompileFunctions(js::wasm::ModuleEnvironment const&, js::wasm::CompilerEnvironment const&, js::LifoAlloc&, mozilla::Vector<js::wasm::FuncCompileInput, 8ul, js::SystemAllocPolicy> const&, js::wasm::CompiledCode*, mozilla::UniquePtr<char [], JS::FreePolicy>*) (moduleEnv=..., compilerEnv=<optimized out>, lifo=..., inputs=..., code=code@entry=0x7fffc318f278, error=0x7fffd87e7b18) at js/src/wasm/WasmIonCompile.cpp:5826
#6  0x00007ffff2f796a7 in ExecuteCompileTask(js::wasm::CompileTask*, mozilla::UniquePtr<char [], JS::FreePolicy>*) (task=task@entry=0x7fffc318ef60, error=0x7ffdf2f80068, error@entry=0x7fffd87e7b18) at js/src/wasm/WasmGenerator.cpp:715
#7  0x00007ffff2f79504 in js::wasm::CompileTask::runHelperThreadTask(js::AutoLockHelperThreadState&) (this=0x7fffc318ef60, lock=...) at js/src/wasm/WasmGenerator.cpp:747
#8  0x00007ffff277e77d in js::GlobalHelperThreadState::runTaskLocked(js::HelperThreadTask*, js::AutoLockHelperThreadState&) (this=this@entry=0x7fffda707000, task=0x7fffc318ef60, locked=...) at js/src/vm/HelperThreads.cpp:2840
#9  0x00007ffff277e3b9 in js::GlobalHelperThreadState::runOneTask(js::AutoLockHelperThreadState&) (this=0x7fffda707000, lock=...) at js/src/vm/HelperThreads.cpp:2809
...

The program appears never to return from mergeAndQueueRegisters. Number of virtual registers = 332053... This could just be deadly slow.

Lars T Hansen [:lth]

Assignee

Comment 2

•

4 years ago

Attached file WIP: Bug 1728781 - Short-circuit the regalloc in very large graphs. (obsolete) — Details

Lars T Hansen [:lth]

Assignee

Comment 3

•

4 years ago

Comment on attachment 9239171 [details]
WIP: Bug 1728781 - Short-circuit the regalloc in very large graphs.

This patch papers over the problem by forcing a copy if we think the analysis would run too long, but the question is, is this the best we can do, and if it is, how should we best determine an "optimal" cutoff? (And if this is not the best we can do, how do we do better?)

Attachment #9239171 - Flags: feedback?(nicolas.b.pierron)

Attachment #9239171 - Flags: feedback?(jseward)

Attachment #9239171 - Flags: feedback?(jdemooij)

Attachment #9239171 - Flags: feedback?(iireland)

Lars T Hansen [:lth]

Assignee

Comment 4

•

4 years ago

I guess the cutoff could be a function of the number of times the analysis has been run and not the size of the vreg set, though on x64, almost everything is must-reuse, so it might not matter in practice.

Nicolas B. Pierron [:nbp]

Comment 5

•

4 years ago

Comment on attachment 9239171 [details]
WIP: Bug 1728781 - Short-circuit the regalloc in very large graphs.

I think it would make sense to apply a similar limit, not based on the total number of virtual registers, as this would apply to everything after the threshold is reached.

Instead, I suggest to apply a threshold based on the size of the LiveRange, either when attempting to merge, or after creating the a larger LiveRange after merging multiple: https://searchfox.org/mozilla-central/source/js/src/jit/BacktrackingAllocator.cpp#1096,1179-1180

Attachment #9239171 - Flags: feedback?(nicolas.b.pierron) → feedback+

Julian Seward [:jseward]

Comment 6

•

4 years ago

It seems to me important that we distinguish the cases

it takes too long, but would eventually terminate if we left it long enough
the allocator is in a loop and will never terminate

If we're 100% sure it's (1) then some kind of heuristic cutoff is good, at least
short term. But if there's any chance that it's (2) then we have a much more
serious problem, because then it might happen on other, smaller inputs too.

Lars T Hansen [:lth]

Assignee

Comment 7

•

4 years ago

(In reply to Julian Seward [:jseward] from comment #6)

It seems to me important that we distinguish the cases

it takes too long, but would eventually terminate if we left it long enough

the allocator is in a loop and will never terminate

If we're 100% sure it's (1)

In this case, there's no evidence of anything else. We know Ion has some serious nonlinearity that bites wasm hard, bug 1561127 is another case in point.

Comment 8

•

4 years ago

Attached file Bug 1728781 - Do not merge in very large live ranges. r?nbp — Details

Do not extend an input live range with a must-reuse definition if the
input range is already very large, but insert a copy. This has the
effect of splitting the live range at the definition.

The purpose of this change is to avoid creating live ranges that are
so large that they will make the compiler's nonlinear live range
processing algorithms dominate compilation time to the point where the
compiler does not, practically speaking, terminate.

Lars T Hansen [:lth]

Assignee

Comment 9

•

4 years ago

(In reply to Nicolas B. Pierron [:nbp] from comment #5)

Instead, I suggest to apply a threshold based on the size of the LiveRange, either when attempting to merge, or after creating the a larger LiveRange after merging multiple: https://searchfox.org/mozilla-central/source/js/src/jit/BacktrackingAllocator.cpp#1096,1179-1180

Thanks. I've created a patch around that idea and tried to reason about why it works (though it does work). With this applied and a limit on the live range size of 1e6, I get an 8s compile time for the application on my 2017-era Xeon. With a limit of 1e5 this drops to 5s, but I felt this was possibly too restrictive.

Julian Seward [:jseward]

Comment 10

•

4 years ago

(In reply to Lars T Hansen [:lth] from comment #9)
Maybe it's safe to set the limit to something way lower, say 1000 instead of 100,000 ?
The vast majority of live ranges are short, so this won't affect many, plus, any
live range that is 100k insns long is likely to get either split or spilled anyway.
So it doesn't sound like a big deal to skip a coalescing opportunity for it.

Lars T Hansen [:lth]

Assignee

Comment 11

•

4 years ago

(In reply to Julian Seward [:jseward] from comment #10)

(In reply to Lars T Hansen [:lth] from comment #9)
Maybe it's safe to set the limit to something way lower, say 1000 instead of 100,000 ?
The vast majority of live ranges are short, so this won't affect many, plus, any
live range that is 100k insns long is likely to get either split or spilled anyway.
So it doesn't sound like a big deal to skip a coalescing opportunity for it.

Yeah, "maybe" and maybe "not many". This is the issue - how do we tell? We have poor facilities for understanding wasm performance at the moment, and the impacts of compiler changes that are not about local code generation patterns.

1e6 is very conservative to be sure, and since it resolves the problem OK it seemed like an OK compromise. But it's just an initial guess and the only data points I have are the ones posted in comment 9. I think lowering it is fine, but arguments in favor of that probably need to be backed by data about typical live ranges in a substantial corpus of code (and of course, with our luck some large live ranges will be in somebody's performance-sensitive loop).

(Remember also that ion live ranges are conservative, there may be holes and they may be larger than how we normally think about them for that reason.)

Iain Ireland [:iain]

Updated

•

4 years ago

Attachment #9239171 - Flags: feedback?(iireland) → feedback+

Pulsebot

Comment 12

•

4 years ago

Pushed by lhansen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/14ed7588e639 Do not merge in very large live ranges. r=nbp

Sandor Molnar[:smolnar]

Comment 13

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/14ed7588e639

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox94: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 94 Branch

Phabricator Automation

Updated

•

4 years ago

Attachment #9239171 - Attachment is obsolete: true

Attachment #9239171 - Flags: feedback?(jseward)

Attachment #9239171 - Flags: feedback?(jdemooij)

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Ion hangs on web.autocad.com

Categories

(Core :: JavaScript: WebAssembly, defect, P2)

Tracking

()

People

(Reporter: lth, Assigned: lth)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated

Comment 12

Comment 13

Updated

Attachment

General

Description

File Name

Content Type