Closed Bug 89443 Opened 24 years ago Closed 24 years ago

Crash on large number of chained || operators

Categories

(Core :: JavaScript Engine, defect)

x86
All
defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: jester, Assigned: khanson)

References

()

Details

(Keywords: crash, js1.5, relnote, Whiteboard: Included new logic for trailing AND'a)

Attachments

(7 files)

From Bugzilla Helper: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) Gecko/20010628 BuildID: 2001062815 When loading a large (130KB) proxy autoconfig file, Mozilla crashes with an application error. The PAC file is used by a public library as a site selection tool: only the sites that appear in the PAC are accessible through the browser. All other sites redirect the user to a proxy server which tells the user that access at their particular station is restricted to pre-approved sites. The PAC file consists of a large number of or'd dnsDomainIs() calls. Eg: if (dnsDomainIs(host,".oclc.org") || dnsDomainIs(host,".collegesource.com") || dnsDomainIs(host,".grolier.com") || [...] ) return "DIRECT"; else return "PROXY ..."; I am reporting this on behalf of Glenn Peterson at Hennepin County Library. Glen notes that in his testing a 22KB PAC file works, but a 36KB does not. Reproducible: Always Steps to Reproduce: 1. Use http://www.hclib.org/kidlinks.proxy as the Automatic Proxy Config URL 2. Relaunch browser 3. Wait about 10 seconds for application error
Hennepin County 55405! Confirming on Linux, 2001070206 -- it segfaults on startup every time after setting the autoconfig pref. Ugly. Moving back to networking, as PAC stuff tends to live there.
Severity: normal → critical
Status: UNCONFIRMED → NEW
Component: Networking: HTTP → Networking
Ever confirmed: true
Keywords: crash
OS: Windows 2000 → All
Blocks: 79893
QA Contact: benc → pacqa
Summary: Proxy autoconfig files appear to have a maximum size before crashing → PAC: browser crashes on large PAC files
I got a stack trace using a linux cvs build from 20010705. The trace is quite large, but is dominated by thte 1350 levels of js_EmitTree. The crash is: 0x400dbdba in _js_LookupProperty (cx=0x8233e08, obj=0x827b7a8, id=136891936, objp=0xbfe020c8, propp=0xbfe020bc, file=0x40115aa0 "jsemit.c", line=548) at jsobj.c:2046
Workaround: the PAC consists of a billion lines of this form: || dnsDomainIs(host, ".acceptableviewingmaterial.com") Use sed/awk/perl/vim to convert these lines to the form: if (dnsDomainIs(host, ".sanitizedcontent.com") { return "PROXY proxy.hclib.org:80"; } Since if any of the domains match you'll be returning the proxy, there's no need to chain them all into a single megalithic if statement. I tried this, and I successfully started up without crashing.
I can get xpcshell to segfault merely by loading this PAC file. -> JS engine.
Assignee: neeti → rogerl
Component: Networking → Javascript Engine
QA Contact: pacqa → pschwartau
cc'ing Brendan on this one -
This is not the most intelligent way of controlling internet access. They should create a spoofing DNS server and map the offensive domains to a single webserver to handle errors. Or they should set up access control on a proxy server. We need to declare a maximum PAC size that will be supported. The user is going to need to find a sensible way to manage their network access. At some point, the string size that expresses the desirable or undesirable domains is going to exceed the practical limits of PAC. PAC was meant as a general routing and network access control mechanism, not as a way of distributing large lists of domains to many clients.
It may not be the most intellegent way, but it is arguably the easiest for a library with little technical expertise to accomplish it. Creating a PAC file based on simple instructions, storing on the server, and configuring clients to use it is a very simple thing to do. Granted, this is the largest PAC file I've seen. But it should be handled more gracefully than it is now. Although the execution of such a large JavaScript file may have an impact on client-side performance, I believe that is a choice the site makes for itself, not one imposed by the application. The workaround to convert the lines into: if (dnsDomainIs(host, ".sanitizedcontent.com") return "PROXY proxy.hclib.org:80"; ...worked in my testing, although it almost doubled the size of the PAC file. This by itself shouldn't be a big problem since this application of a PAC is intended for workstations on a LAN -- the PAC will probably never be loaded over a slow link. I'm waiting to hear back from Glenn to see if it works for him at his site. Thanks for the suggestion!
The crash is due to a stack overflow. The code generator uses recursion to walk parse trees that are right-heavy in the case of chained || operators. See the comment near the top of jsparse.h. There is no easy fix, and I hate to WONTFIX a crash bug, but I bet we all have better things to do than sweat a fix for this bug right now. For example, the proxy autoconfig function could be simplified quite a bit without changing its meaning, like so: var hclibProxyDomains = [ ".asahi.com", ... "zoom.baton-rouge.la.us" ]; function FindProxyForURL(url, host) { if (isPlainHostName(host) || dnsDomainIs(host, ".hennepin.lib.mn.us") || dnsDomainIs(host, ".hclib.org")) return "DIRECT"; for (var i = 0; i < hclibProxyDomains.length; i++) if (dnsDomainIs(host, hclibProxyDomains[i])) return "PROXY proxy.hclib.org:80"; return "PROXY 172.16.100.20:80"; } The function could be more efficient if it could avoid linear search by isolating the domain suffix or suffixes of host, and looking them up in an object that maps each string in hclibProxyDomains to true or some other "found" property value. Using JS objects as associative arrays gives O(1) lookup time complexity. /be
Reassigning to Kenton -
Assignee: rogerl → khanson
Compare bug 90445, "Crash compiling bloated function while loading page", which also involves || operators ...
Testcase added to JS testsuite: js1_5/Regress/regress-89443.js
change summary from PAC: browser crashes on large PAC files to PAC: crash on large number of chained '||' is this still considered a PAC bug?
Summary: PAC: browser crashes on large PAC files → PAC: crash on large number of chained '||'
No, I believe this is purely a JavaScript Engine issue. Resummarizing to remove mention of the PAC file per se -
Summary: PAC: crash on large number of chained '||' → Crash on large number of chained || operators
If there is no pre-existing JS bug, can you create a new one and let us be a depends on you?
(Showing my ignorance) Why is this hard? I.e., why doesn't the following work: 1. Reduce using ((A || B) || C) = (A || (B || C)) 2. Evaluate using (A || B) = (A ? true : B) [valid for js?] The evaluation of B would be a tail call. Since destructors might interfere, a suitable goto could be used.
Morten: thanks for the nudge -- of course you are right that the recursion is tail recursion, which can be eliminated, but we need storage as well, because all the JSOP_OR bytecodes must have as their immediate jump-offset operand the offset from each JSOP_OR to the bytecode after the entire chained-|| expression. This is not hard to do with back-patching, so I was wrong to say that there is no easy fix. Patch coming up. I did not do likewise for JSOP_AND, because I believe chained && expressions of such length as to threaten stack overflow to be unheard of, and so not worth the added code. I could be wrong.... /be
Looking for r= and sr= for 0.9.4. /be
sr=shaver (as brendan)
Ran the JS test suite on WinNT in the debug and optimized JS shell with the patch applied. No regressions were introduced by the patch. On the other hand, the testcase for this bug did not pass: Testcase js1_5/Regress/regress-89443.js failed Expected exit code 0, got 253 Testcase terminated with signal 0 Complete testcase output was: Testcase produced no output! http://lxr.mozilla.org/mozilla/source/js/tests/js1_5/Regress/regress-89443.js
Apologies for the indentation style in the testcase - the function it contains is a direct copy from the URL reported above... It's too big to fiddle with!
Worksforme still -- can you find a core file, gdb the shell with it to get a stack backtrace? /be
My blunder! I misused the command-line options for the test driver in such a way that it utilized a NON-patched JS shell I have in another directory. WORKSFORME, too - the patch makes the testcase pass!
+relnote - is there a practical limit I can give for this? If not, can someone pick a safe practical limit?
Keywords: relnote
benc: I don't know of a practical limit, because it depends on OS stack size limits, admin-settable in some cases (see the limit command on modern BSD-ish Unixes), and on how much is alreayd on the stack when JS starts nesting. kenton: please feel free to grill me on details of this patch, questions about the surrounding code, anything. I'd like to get this into 0.9.4 and need code review (r=) from a module owner/peer as well as shaver's sr=. /be
Can you say it's safe up to, says 50? That's a pretty large number to me...
I tested the patch with a an if statement with 707 ||'s that behaved well with the patch but failed without the patch. It seems like a good time to consider the corresponding patch for trailing multiple &&'s as well as trailing combinations of ||'s and &&'s. I will try to construct these patches to test my understanding of the current patch
benc: 50 sounds safe, but I am guessing. khanson: great! With a little more work, we could probably unify the code for the TOK_OR and TOK_AND cases. Note that pn->pn_op contains JSOP_OR for TOK_OR and JSOP_AND for TOK_AND, so we can use pn->pn_op as a parameter instead of hardcoding JSOP_OR as the patch and existing case-code does. Feel free to rip out JS_BUG_SHORT_CIRCUIT while you are at it -- from all the files in which it is tested or defined. /be
Removed all references to JS_BUG_SHORT_CIRCUIT. Applied the change from OR to AND code. Tested changes. Consolidated AND and OR code. However, this code was much less clear, so I did not include these changes. Including logic for combinations of trailing AND's and OR's seems a bit ambitious for this bug. I am willing to pursue it.
Whiteboard: Included new logic for trailing AND'a
I don't see why combinations of || and && is even interesting. What kind of "naturally-occurring" immensely deep expression are we worried about? It would have to be something like "A && (B || (C && (D || ... )))" should does not strike me as likely.
There are only 4 out of 29 non-comment/whitespace-only lines that differ between the TOK_OR and TOK_AND cases, so it seems worthwhile to me to unify the two. I don't find this unification unclear: if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; top = js_Emit3(cx, cg, pn->pn_op, 0, 0); if (top < 0) return JS_FALSE; jmp = top; pn2 = pn->pn_right; if (pn2->pn_type == pn->pn_type) { do { pn = pn2; if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; off = js_Emit3(cx, cg, pn2->pn_op, 0, 0); if (off < 0) return JS_FALSE; SET_JUMP_OFFSET(CG_CODE(cg, jmp), off - jmp); jmp = off; pn2 = pn->pn_right; } while (pn2->pn_type == pn->pn_type); } if (!js_EmitTree(cx, cg, pn2)) return JS_FALSE; for (off = CG_OFFSET(cg); top < jmp; top += tmp) { jsbytecode *pc = CG_CODE(cg, top); tmp = GET_JUMP_OFFSET(pc); CHECK_AND_SET_JUMP_OFFSET(cx, cg, pc, off - top); } CHECK_AND_SET_JUMP_OFFSET(cx, cg, CG_CODE(cg, jmp), off - jmp); break; What am I missing? Probably the lack of TOK_OR, JSOP_OR, etc. -- but a comment could make up for that lack, surely? /be
Status: NEW → ASSIGNED
Urk, I had a pn2 in that inline unified code fragment that stylistically ought to be pn -- pn and pn2 are == at the point in question, but pn is the "current" TOK_OR or TOK_AND node whose pn_op we want to emit: if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; top = js_Emit3(cx, cg, pn->pn_op, 0, 0); if (top < 0) return JS_FALSE; jmp = top; pn2 = pn->pn_right; if (pn2->pn_type == pn->pn_type) { do { pn = pn2; if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; off = js_Emit3(cx, cg, pn->pn_op, 0, 0); if (off < 0) return JS_FALSE; SET_JUMP_OFFSET(CG_CODE(cg, jmp), off - jmp); jmp = off; pn2 = pn->pn_right; } while (pn2->pn_type == pn->pn_type); } if (!js_EmitTree(cx, cg, pn2)) return JS_FALSE; for (off = CG_OFFSET(cg); top < jmp; top += tmp) { jsbytecode *pc = CG_CODE(cg, top); tmp = GET_JUMP_OFFSET(pc); CHECK_AND_SET_JUMP_OFFSET(cx, cg, pc, off - top); } CHECK_AND_SET_JUMP_OFFSET(cx, cg, CG_CODE(cg, jmp), off - jmp); break; I agree that combinations of && and || are invariable short enough that recursion never threatens a real-world OS stack limit. /be
Finally getting caffeine, spotting if(C)do{...}while(C) silliness in my patch, which should of course be simplified to while(C){...}. Final unified case code: if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; top = js_Emit3(cx, cg, pn->pn_op, 0, 0); if (top < 0) return JS_FALSE; jmp = top; pn2 = pn->pn_right; while (pn2->pn_type == pn->pn_type) { pn = pn2; if (!js_EmitTree(cx, cg, pn->pn_left)) return JS_FALSE; off = js_Emit3(cx, cg, pn->pn_op, 0, 0); if (off < 0) return JS_FALSE; SET_JUMP_OFFSET(CG_CODE(cg, jmp), off - jmp); jmp = off; pn2 = pn->pn_right; } if (!js_EmitTree(cx, cg, pn2)) return JS_FALSE; for (off = CG_OFFSET(cg); top < jmp; top += tmp) { jsbytecode *pc = CG_CODE(cg, top); tmp = GET_JUMP_OFFSET(pc); CHECK_AND_SET_JUMP_OFFSET(cx, cg, pc, off - top); } CHECK_AND_SET_JUMP_OFFSET(cx, cg, CG_CODE(cg, jmp), off - jmp); break; /be
I agree with all the changes including the simplification of the "if do while" going to a "while".
The while statement can be changed to while (pn2->pn_type == TOK_OR || pn2->pn_type == TOK_AND) { to handle mixed && and || tails.
khanson: righteous! new jsemit.c patch? /be
Nominating for js1.5 and mozilla0.9.4. /be
Keywords: js1.5, mozilla0.9.4
khanson: looks great, except for the ^M Mac line endings in the patch, and except for this over-indented loop body: + while (pn2->pn_type == TOK_OR || pn2->pn_type == TOK_AND) { + pn = pn2; + if (!js_EmitTree(cx, cg, pn->pn_left)) + return JS_FALSE; + off = js_Emit3(cx, cg, pn->pn_op, 0, 0); + if (off < 0) + return JS_FALSE; + SET_JUMP_OFFSET(CG_CODE(cg, jmp), off - jmp); + jmp = off; + pn2 = pn->pn_right; + + } Oh, and the extra newline before the closing brace seems unnecessary to me. Not sure if tabs existed in your version that made it look right in CodeWarrior, but the patch has no tabs, and the loop body should indent only by four space (and no tabs should get checked in -- beard can help with CW configuration to avoid this, I think). Pick these nits and sr=brendan@mozilla.org -- thanks. /be
Let's get this in for 0.9.4 -- who can r=? New patch for jsemit.c needed? I'm ok just asking for the overindentation/extra-newline nits to be picked, without seeing yet another patch, myself. /be
r=rogerl
I checked in khanson's patches (made a few whitespace/comment-only changes to the jsemit.c one). The JS testsuite passes except for the five (right, phil?) known failures. Thanks, all. /be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Trying to verify this one. The JS testsuite passes except for the five known failures. The test driver passes the specific testcase for this bug. However, if I go into the JS shell and load this testcase manually, I get this error: [//d/JS_TRUNK/mozilla/js/src/WINNT4.0_DBG.OBJ]./js js> load('../../tests/js1_5/shell.js') js> load('../../tests/js1_5/Regress/regress-89443.js') InternalError: else statement too large I'm guessing it comes from this in jsemit.c: JSBool js_SetJumpOffset(JSContext *cx, JSCodeGenerator *cg, jsbytecode *pc, ptrdiff_t off) { if (off < JUMP_OFFSET_MIN || JUMP_OFFSET_MAX < off) { ReportStatementTooLarge(cx, cg); return JS_FALSE; } SET_JUMP_OFFSET(pc, off); return JS_TRUE; } where JUMP_OFFSET_MAX is defined in jsopcode.h as: #define JUMP_OFFSET_MAX ((int16)0x7fff) = 32,767 The test driver doesn't detect this internal error because the JS shell gives exit code 0, exit signal 0 for the test... Is this the expected behavior here - or is there something more to do on this bug?
This is probably a caused by the fix to bug 90445. Prior to fix 90445 the JumpOffset was not being range checked.
Can the testcase be tweaked so as to avoid overflowing the jump offset operand but still have so many chained || operators that it overflows typical process stacks, when run against a version of js that lacks this bug's fix? /be
OK, I will try tweaking the testcase -
Testcase successfully tweaked. It now loads successfully by hand in the current JS shell on WinNT and Linux. Yet it crashes on JS shells as of: Recently (01 Aug 2001) Release Candidate 3 (11 May 2001) Release Candidate 2 (14 Aug 2000) Marking this bug VERIFIED FIXED
Status: RESOLVED → VERIFIED
That leaves one question: should I file a new bug on the behavior I observed above? That is, if a testcase triggers something like InternalError: else statement too large shouldn't the JS exit code be non-0?
Yes, any non-warning error detected by the compiler should cause an abnormal termination of the shell. Can you file a bug? This should be easy if a bit tedious to debug. /be
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: