Closed Bug 698528 Opened 9 years ago Closed 9 years ago

Missing guard pages on non-main-thread stacks

Categories

(Core :: XPCOM, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: jruderman, Assigned: glandium)

References

(Blocks 1 open bug)

Details

We seem to be missing guard pages on non-main-thread stacks. I'm not sure about this, but it would seem to explain some of the observations in bug 671797 comment 2 and bug 691096 comment 29.

Stack guard pages are a simple mechanism to prevent too-much-recursion or infinite-recursion situations from being exploitable. (As long as alloca or compiler weirdness doesn't lead to huge, unchecked stack frames.)
Isn't the OS typically responsible for managing the thread guard page?
This FAQ claims:
http://www.cognitus.net/html/howto/pthreadSemiFAQ_5.html

"5.13 Why doesn't setting both Guard Size AND Stack Attributes work as expected?

If you set the stack location or size using the stack attribute, the guard size attribute will be ignored in ALL cases. You will not get an error when setting any of these in combination as long as the function parameters are correct but, again, the guard size will be ignored.

The rationale is that if an application is setting the stack location or size, that it is using reasonable values for those attributes and should, if necessary, already have allocated it's own "guard buffer" area in the specified stack.
"
Is this mac- or *nix-only? Does this perhaps apply only to threads with a non-default stack size (which applies to the audio threads, IIRC).

http://hg.mozilla.org/mozilla-central/annotate/4271b14e0c0b/nsprpub/pr/src/pthreads/ptthread.c#l340
Do JS workers reduce their thread stack size? I know some media threads do (see bug 664341).
Assignee: nobody → mh+mozilla
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> Is this mac- or *nix-only? Does this perhaps apply only to threads with a
> non-default stack size (which applies to the audio threads, IIRC).

From my experimentation, it appears this is not a problem on Linux. The glibc default is to allocate one guard page (4096 bytes) at the bottom of the stack, effectively eating that page away from the stack size. Whether one page is enough is open to debate. Using pthread_attr_setguardsize also works to either disable that default behaviour or to allocate more guarding pages. This appears to work because we use pthread_attr_setstacksize, not pthread_attr_setstack or pthread_attr_setstackaddr.

Android has a different libc, so I will check the status there.

For Windows and OSX, I did a test on try, which, if I didn't screw up things, shows that OSX does have at least one page of guard by default as well, and that Windows either doesn't have a guard by default or actually allocates 1MB when you ask for 64KB of stack. I'll check further locally.
(In reply to Mike Hommey [:glandium] from comment #6)
> Android has a different libc, so I will check the status there.

Android does the same thing as Linux, and allows to change the guard size with pthread_attr_setguardsize, except it doesn't actually allows to disable the guard page(s): setting guard size to 0 still creates one guard page.

Also note that both Android and Linux don't put guard pages on the main thread stack, but in practice, the kernel sets things up such that there is unmapped memory before the stack, and nothing is going to be mapped there unless something explicitly does so.

> For Windows and OSX, I did a test on try, which, if I didn't screw up
> things, shows that OSX does have at least one page of guard by default as
> well, and that Windows either doesn't have a guard by default or actually
> allocates 1MB when you ask for 64KB of stack. I'll check further locally.

On Windows, specifying a stack size under 1MB to NS_NewThread leads to a 1MB stack. Sizes above 1MB are used properly. It looks like, from observation with vmmap and various attempts, that it's impossible to have two thread stacks adjacently allocated: there's always a gap between two, but the gap size varies for whatever reason. Someone with better knowledge should probably take a look.

On OSX, the behaviour is the same as on Linux: default guard size is one page, and it can be adjusted with pthread_attr_setguardsize, and disabled with the value 0. Interestingly, whatever the guard size is, it also allocates 8KB more than requested for the stack size.

The conclusion is that in practice, we have guard pages (or gaps of unmapped memory) on all four main platforms. On all but Windows, the size of these guards is 4KB, which may or may not be too small. Any opinion there?
So, reading the crash report that led to this bug being filed is actually enlightening:

> Crash reason:  EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE

In other words, access to unmapped memory, or protected memory.

> Crash address: 0x248c3ff8
>
> Thread 14 (crashed)
> 0  0x7fff5fc07c1d
>    rbx = 0x248c4970   r12 = 0x0101ecf0   r13 = 0x0101ecf0   r14 = 0x248c4970
>    r15 = 0x24a5b400   rip = 0x5fc07c1d   rsp = 0x248c4000   rbp = 0x248c4000
>    Found by: given as instruction pointer in context

The crash address is 8 bytes under rsp. Most likely, the crashing instruction was a push on stack, which would have updated rsp afterwards. Further stack frames (obviously) have a stack pointer after 0x248c4000.

The crash report unfortunately doesn't contain memory mapping information, but what is clear from the above is that there is a guard page, or at least a hole (my observation tells it would be a guard page). There is another thread stack below and near these stack addresses, in thread 13, at 0x24880b88 for the bottom of the stack. Top of that stack is most probably much closer to the bottom of the crashing thread stack.

What this means is that when overflowing the stack on that thread, we accessed the guard page, and fail with a segmentation fault, which is exactly what the guard page is supposed to make us do, instead of spilling on another thread's stack.

Without a guard page, the crash would have occurred at a random address depending on when the other thread has its stack overwritten, where it picks it up, and what it tries to do with it.

I'm thus closing this bug as invalid.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
Group: core-security
Whiteboard: [sg:high]
You need to log in before you can comment on or make changes to this bug.