Investigate lower stack sizes on threads than debug builds accept
Categories
(Core :: XPCOM, enhancement, P2)
People
(Reporter: alexical, Unassigned)
In bug 1587107 we ran into a problem where the stack size set on the StartupCache's write thread was too low. However, this only manifested in builds which inflate the stack size for safety checks. See the following disassembly:
Dump of assembler code for function LZ4_streamHC_t_alignment:
0x000055c89924bc80 <+0>: push %rbp
0x000055c89924bc81 <+1>: mov %rsp,%rbp
0x000055c89924bc84 <+4>: sub $0x40050,%rsp
0x000055c89924bc8b <+11>: mov %fs:0x28,%rax
0x000055c89924bc94 <+20>: mov %rax,-0x8(%rbp)
0x000055c89924bc98 <+24>: lea -0x40048(%rbp),%rdi
0x000055c89924bc9f <+31>: mov $0xaa,%esi
0x000055c89924bca4 <+36>: mov $0x40040,%edx
0x000055c89924bca9 <+41>: callq 0x55c89928ba80 <memset@plt>
=> 0x000055c89924bcae <+46>: mov %fs:0x28,%rdx
0x000055c89924bcb7 <+55>: mov -0x8(%rbp),%rdi
0x000055c89924bcbb <+59>: cmp %rdi,%rdx
0x000055c89924bcbe <+62>: mov %rax,-0x40050(%rbp)
0x000055c89924bcc5 <+69>: jne 0x55c89924bcd9 <LZ4_streamHC_t_alignment+89>
0x000055c89924bccb <+75>: mov $0x8,%eax
0x000055c89924bcd0 <+80>: add $0x40050,%rsp
0x000055c89924bcd7 <+87>: pop %rbp
0x000055c89924bcd8 <+88>: retq
0x000055c89924bcd9 <+89>: callq 0x55c89928ba50 <__stack_chk_fail@plt>
This is the function:
static size_t LZ4_streamHC_t_alignment(void)
{
    struct { char c; LZ4_streamHC_t t; } t_a;
    return sizeof(t_a) - sizeof(t_a.t);
}
In any optimized build this is just a compile-time constant (the function returns 8 here), but in our instrumented debug build we reserve 0x40050 bytes (roughly 256 KiB) of stack and memset 0x40040 of them to 0xaa. This radically inflates the stack size the thread needs.
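To illustrate why the local variable is not needed for the result, here is a minimal sketch. It assumes a hypothetical stand-in type, big_state_t, sized like the structure in the disassembly; it is not the real LZ4_streamHC_t, and the function names are made up for the example. Both functions compute the same alignment, but the second never declares a large local, so even a non-optimizing build has nothing to reserve or fill.

```c
#include <stddef.h>

/* Hypothetical stand-in for LZ4_streamHC_t: a large, 8-byte-aligned blob.
 * The element count is chosen to roughly match the frame seen above. */
typedef struct { unsigned long long table[32775]; } big_state_t;

/* Named probe type so that offsetof can be applied to it. */
struct align_probe { char c; big_state_t t; };

/* Same idiom as the function in the bug: declares a large local that an
 * unoptimized or instrumented build may really place on the stack. */
size_t alignment_with_local(void)
{
    struct { char c; big_state_t t; } t_a;
    return sizeof(t_a) - sizeof(t_a.t);
}

/* Alternative: no local object at all, only a constant expression. */
size_t alignment_without_local(void)
{
    return offsetof(struct align_probe, t);
}
```

Calling alignment_with_local() from a thread with a small stack is the risky case in an instrumented build; alignment_without_local() should have a tiny frame regardless of build flags. Whether such a change is acceptable for the vendored LZ4 sources is a separate question.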
So in optimized builds, it's reasonable to say we could save a lot by lowering the stack sizes of threads with well-defined workloads. However, optimized builds lack many of the safety checks which would tell us if we exceeded the stack size we allotted.
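For readers unfamiliar with how a per-thread stack size is requested at all, the following is a rough sketch using plain POSIX threads. This is an assumption for illustration only: it is not the XPCOM thread API, and run_with_stack_size and worker are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: just records that it ran. */
static void *worker(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Start a thread with an explicit stack size and wait for it to finish.
 * Returns 0 on success, or the pthread error code otherwise. */
int run_with_stack_size(size_t stack_bytes, int *ran)
{
    pthread_attr_t attr;
    pthread_t tid;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Fails with EINVAL if stack_bytes is below the platform minimum. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, ran);

    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

A caller could pass a smaller budget in optimized builds and a larger one in instrumented builds, but nothing in this sketch detects an overflow; that is exactly the missing safety check discussed here.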
So is there any way we can safely get these savings? I suspect most of these threads have no recursion or dynamic stack allocation, and a finite provable set of function pointers they can call, and thus have theoretically provable max stack sizes, but I think we'd have to have our fingers deep in the compiler / linker to leverage that.
So I'm calling out to the ether. Does anyone have any thoughts here?
Updated•5 years ago
Comment 1•4 years ago
Hi,
I've chosen a component for this bug in the hope that someone with more expertise may look at it. We'll await their answer. If you think another component is more appropriate for this case, you may change it.
Regards, Flor.
Comment 2•4 years ago
(Moving to XPCOM because my impression is that that's where the stack size is selected.)
Note that even in release builds, JS and wasm have non-optimizing baseline compilers that, when presented with non-optimized JS or wasm code, sometimes create stack frames much larger than those produced by the optimizing compilers; see e.g. bug 1409124.
Updated•2 years ago