Closed
Bug 846670
Opened 11 years ago
Closed 11 years ago
Assertion failure: stackMin <= stackEnd sometimes seen in xpcshell
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla22
People
(Reporter: dhylands, Assigned: dhylands)
References
Details
Attachments
(1 file, 2 obsolete files)
2.95 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
Under bionic, the size of the main thread stack is hard-coded at 128K See line 67 of bionic/libc/bionic/libc_init_common.c I've been noticing the following: Assertion failure: stackMin <= stackEnd, at /home/work/B2G-unagi/mozilla-inbound/js/src/gc/RootMarking.cpp:325 when running xpcshell in a debug build on my unagi. I traced this back to the stack overflowing the 128K stack from the main thread. So this is the stack size of the xpcshell executable. Some observations: There is no guard page for the main thread stack. All other threads seem to get a 1M stack (see line 83 of bionic/libc/bionic/pthread.c I confirmed that the b2g main thread has a 128K stack. Presumably this applies to the android firefox browser as well.
This sounds really bad. With the right JS we could easily pass this limit, right?
blocking-b2g: --- → tef?
Assignee | ||
Comment 2•11 years ago
|
||
It looks like the main thread stack can grow dynamically up to about 8Mb Around the time the stack got that big, maps reported: be19e000-be999000 rw-p 00000000 00:00 0 [stack] which is a size of 0x7fb000 (8172K) Reporting from within the program itself reported: stackBase: 0xbe979000 stackSize: 0x00020000 stackEnd: 0xbe999000 stackPtr: 0xbe19f308 stackBase and stackSize come from the pthread_attr_getstack call, so these don't get updated as the stack grows. So the problem I'm observing isn't caused by the stack being limited, but by something else. (because the assert I was observing happens when the top of stackMin gets > stackEnd which is an issue on the high side of the stack, not the low side)
Assignee | ||
Comment 3•11 years ago
|
||
I've updated the description to reflect more accurately what I observed
Summary: B2G MainThread stacks are too small → Assertion failure: stackMin <= stackEnd sometimes seen in xpcshell
Assignee | ||
Comment 4•11 years ago
|
||
The observation of the stack being around 8Mb was from a test program I wrote to play with stacks, and not B2G (just to be clear).
Assignee | ||
Comment 5•11 years ago
|
||
The 8Mb limit is determined by ulimit -s (8192 Kb). You can use ulimit -s 16384 to increase the max stack (of the main thread) to 16 Mb, and that also seems to working fine.
Assignee | ||
Comment 6•11 years ago
|
||
And since I had the test program, stacks for non-main threads are not dynamic. Their size is determined by the stack size specified in the pthread attrs at thread creation time. The default size is 1Mb.
Assignee | ||
Comment 7•11 years ago
|
||
So the root cause seems to be that pthread_attr_getstack isn't telling the truth for the main thread. I added some code in bionic's __libc_init_common function to stash away stacktop, stackbottom, stacksize, and the results of __get_sp and print them as the first line of code in xpcshell's main function: _stackbottom = 0xbea1b000 _stacksize = 0x00020000 _sp = 0xbea3af50 _stacktop = 0xbea3b000 So far everything looks nice and consistent. Calling pthread_attr_getstack returns: pthread_attr_getstack stackBase = 0xbea1b000 pthread_attr_getstack stackSize = 0x00020000 The kicker is that when main is called, __get_sp returns 0xbea3b600 which is outside the stacktop/stackbottom. I also wrote some code to grab the [stack] section from /proc/self/maps at xpcshell main time. It reports: bea1b000-bea3c000 rw-p 00000000 00:00 0 [stack] So stacktop should really have been 0xbea3c000 instead of 0xbea3b000. Later, when the ASSERT happens, nativeStackTop = 0xbea3b4f0 which is within the stack, but is outside the stackbottom/stacktop range. From this we can conclude that __libc_init_common must have a bunch of stuff on the stack that isn't on the stack when main is called.
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Boot2Gecko → Core
Assignee | ||
Updated•11 years ago
|
Assignee: general → dhylands
Assignee | ||
Comment 8•11 years ago
|
||
This patch causes the stack bounds to be determined properly for the main-thread, by scanning /proc/self/maps. https://tbpl.mozilla.org/?tree=Try&rev=3f5936accc9e
Attachment #720168 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 9•11 years ago
|
||
Oops. Now builds debug and non-debug. New try submission: https://tbpl.mozilla.org/?tree=Try&rev=09f05359f8ba
Attachment #720168 -
Attachment is obsolete: true
Attachment #720168 -
Flags: review?(wmccloskey)
Attachment #720198 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 10•11 years ago
|
||
New try run with #include's added https://tbpl.mozilla.org/?tree=Try&rev=163d253781a7 I'll wait until its green before updating the attached patch. I'll also have to see why B2G didn't generate the same errors as android when building.
Comment 11•11 years ago
|
||
(Clearing tef?. "when running xpcshell in a debug build on my unagi" doesn't sound like this issue can affect a release v1.0.1 build. But if it can then please re-nom)
blocking-b2g: tef? → ---
Comment 12•11 years ago
|
||
The conservative stack scanner in our JS engine depends on the correctness of the values here. We also only care about sections of the stack that are gecko related and can contain pointers into our JS heap. So if it turns out that this section can't contain any valid heap pointers, we shouldn't block on it.
Assignee | ||
Comment 13•11 years ago
|
||
Comment on attachment 720198 [details] [diff] [review] Fix compile problems with non-debug builds Gregor found a patch which does this in a slightly cleaner manner, so I'm going to rework bit. http://repository.timesys.com/buildsources/q/qt-embedded-linux/qt-embedded-linux-4.8.1/qt-embedded-linux-4.8.1-javascript-uclibc-0932.patch Thee key difference is that it looks for a section based on address, where mine looks for [stack]. I like find the section base on address better.
Attachment #720198 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 14•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=70d85faa4046
Assignee | ||
Updated•11 years ago
|
Attachment #720198 -
Attachment is obsolete: true
Assignee | ||
Comment 15•11 years ago
|
||
Comment on attachment 722370 [details] [diff] [review] Fix compile problems with non-debug builds. v3 I did a couple of try runs: https://tbpl.mozilla.org/?tree=Try&rev=70d85faa4046 https://tbpl.mozilla.org/?tree=Try&rev=bbc08cf88247
Attachment #722370 -
Flags: review?(wmccloskey)
Comment on attachment 722370 [details] [diff] [review] Fix compile problems with non-debug builds. v3 Review of attachment 722370 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, looks nice. It's awesome you tracked this down. ::: js/src/jsnativestack.cpp @@ +153,5 @@ > + unsigned int stackAddr = (unsigned int)(unsigned long)&sattr; > + while (fgets(line, sizeof(line), fs) != NULL) { > + unsigned int stackStart; > + unsigned int stackEnd; > + if (sscanf(line, "%x-%x ", &stackStart, &stackEnd) != 2) { In the JS engine, we don't put braces around single-line statements. So please remove these ones. I guess we don't need to use %lx since ARM is always 32-bit. @@ +156,5 @@ > + unsigned int stackEnd; > + if (sscanf(line, "%x-%x ", &stackStart, &stackEnd) != 2) { > + continue; > + } > + if ((stackAddr >= stackStart) && (stackAddr < stackEnd)) { These extra parens should also be dropped. @@ +174,2 @@ > # else > + DBG(rc =) pthread_attr_getstack(&sattr, &stackBase, &stackSize); If this code fails, we're really in trouble. Let's assert !rc even in opt builds. Can you convert JS_ASSERT(!rc) to this: if (!rc) MOZ_CRASH(); That will avoid all this ugly debug-only stuff, too.
Attachment #722370 -
Flags: review?(wmccloskey) → review+
Comment 17•11 years ago
|
||
Comment on attachment 722370 [details] [diff] [review] Fix compile problems with non-debug builds. v3 Review of attachment 722370 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsnativestack.cpp @@ +153,5 @@ > + unsigned int stackAddr = (unsigned int)(unsigned long)&sattr; > + while (fgets(line, sizeof(line), fs) != NULL) { > + unsigned int stackStart; > + unsigned int stackEnd; > + if (sscanf(line, "%x-%x ", &stackStart, &stackEnd) != 2) { Surely we have something better than using sscanf to parse input values, seeing as a too-big input for the output type triggers undefined behavior. :-( Okay, probably not. :-( I'd rather see %lx used everywhere, to be safe.
Assignee | ||
Comment 18•11 years ago
|
||
I could use stroul instead of sscanf. strtoul will parse hex, and will stop on the - and return the pointer to the character that stopped the scanning. Since this is at the beginning of the line, I think it would boil down to something like: s = line; stackStart = strtoul(s, &endPtr, 16); if (stackStart > 0 && *endPtr == '-') { s = endPtr + 1; stackEnd = strtoul(s, &endPtr, 16); if (stackEnd > 0 && *endPtr == ' ' && stackAddr >= stackStart && stackAddr < stackEnd) { ... Even though strtol can parse 0, 0 isn't a valid address for the start of a stack segment on linux.
(In reply to Dave Hylands [:dhylands] from comment #18) > I could use stroul instead of sscanf. strtoul will parse hex, and will stop > on the - and return the pointer to the character that stopped the scanning. > Since this is at the beginning of the line, I think it would boil down to > something like: > > s = line; > stackStart = strtoul(s, &endPtr, 16); > if (stackStart > 0 && *endPtr == '-') { > s = endPtr + 1; > stackEnd = strtoul(s, &endPtr, 16); > if (stackEnd > 0 && *endPtr == ' ' && stackAddr >= stackStart && > stackAddr < stackEnd) { > ... > > Even though strtol can parse 0, 0 isn't a valid address for the start of a > stack segment on linux. That seems strictly worse to me. As long as you use %lx, I don't think there's anything wrong with the scanf.
Assignee | ||
Comment 20•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/29230c04b932
Assignee | ||
Comment 21•11 years ago
|
||
Backed out: https://hg.mozilla.org/try/rev/3a2a6239763e Submitted corrected version to try: https://tbpl.mozilla.org/?tree=Try&rev=4aa39325677d
Assignee | ||
Comment 22•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b3c093d11eb
Comment 23•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8b3c093d11eb
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
You need to log in
before you can comment on or make changes to this bug.
Description
•