Bug 1714086 Comment 0 Edit History
(Spun off from bug 1714072.)

Debug code + metadata are very large and take a very long time to create. The very slow compilation is due to the presence of the debugger. Contrast these two programs:

```
var g = newGlobal({newCompartment: true});
g.eval(`new WebAssembly.Module(read("/home/lhansen/moz/rustc_binary.wasm","binary"))`);
```

which compiles the test case in about 0.5s, with this one:

```
var g = newGlobal({newCompartment: true});
var dbg = new Debugger(g);
g.eval(`new WebAssembly.Module(read("/home/lhansen/moz/rustc_binary.wasm","binary"))`);
```

which takes more than 20s.

The memory use is also due to the debugger: in the first case, memory use ends up at 215MB, while in the second case the shell grows to about 1.9GB. (More data at comment 3.)

A quick scan of the code emitted for debugging shows that a very large fraction of it is tls reloads; if these could be removed or avoided, the code would shrink dramatically. Also, there are a lot of back-to-back breakpoints in the emitted code, possibly for operations that emit no machine code at all:

```
00000065  4c 8b 74 24 10   movq %r14, 0x10(%rsp)
0000006A  0f 1f 44 00 00   nopl %eax, (%rax,%rax,1)
0000006F  4c 8b 74 24 10   movq 0x10(%rsp), %r14
00000074  0f 1f 44 00 00   nopl %eax, (%rax,%rax,1)
```

Memory consumption will be due to:

- the code bloat above
- creating a stack map for every breakpoint (even if they are merged, and even if the test case probably does not use reftypes and won't have maps)
- appending the CallSiteDesc to the vector of those for every breakpoint
- (other things)

Focus here should be on:

- emitting no pointless breakpoints: elide them if possible, or merge them if back-to-back
- avoiding the tls reload if this can be done cheaply
- looking for nonlinear algorithms that will bite when code or metadata become very large (e.g., masm buffers are presized for "normal" compiles)
- digging in with a profiler to look for other trouble spots
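The back-to-back merging idea above can be sketched as a small pass over breakpoint code offsets. This is a minimal illustration in plain JavaScript, not SpiderMonkey code; `coalesceBreakpoints` and its shape are hypothetical. It assumes each breakpoint site is a fixed-size patchable nop, so two sites with no real machine code between them differ by exactly that size:

```
// Hypothetical sketch: keep only the first breakpoint site of each
// back-to-back run. `offsets` are code offsets of emitted breakpoint
// nops; `siteSize` is the byte length of one patchable site (5 bytes
// for the nopl in the disassembly above).
function coalesceBreakpoints(offsets, siteSize) {
  const kept = [];
  let prev = -Infinity;
  for (const off of offsets) {
    if (off === prev + siteSize) {
      // Back-to-back with the previous site: same observable stop
      // point, so one patchable nop (and one stack map, one
      // CallSiteDesc) would suffice. Extend the run, emit nothing.
      prev = off;
      continue;
    }
    kept.push(off);
    prev = off;
  }
  return kept;
}
```

A run of three adjacent sites at offsets 0, 5, 10 collapses to the single site at 0, which would also drop the per-site stack map and CallSiteDesc entries mentioned above.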
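On the nonlinear-algorithm point: a buffer presized for "normal" compiles that then grows by a fixed increment copies a quadratic number of bytes as the output gets large, while geometric growth keeps total copying linear. A small illustrative sketch (hypothetical, not the actual masm buffer code) that counts bytes copied across reallocations:

```
// Count bytes copied while appending `total` bytes one at a time to a
// buffer with initial capacity `initial`; `grow` maps old capacity to
// new capacity on overflow.
function bytesCopied(total, initial, grow) {
  let cap = initial, copied = 0;
  for (let size = 0; size < total; size++) {
    if (size >= cap) {       // buffer full: reallocate and copy
      copied += size;        // everything written so far moves
      cap = grow(cap);
    }
  }
  return copied;
}
```

Appending 1024 bytes to a 16-byte buffer copies 32256 bytes with additive growth (`cap + 16`) but only 1008 with doubling (`cap * 2`), and the gap widens quadratically with module size, which is exactly where a 20s debug compile of a large wasm binary would hurt.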