Closed
Bug 326466
(geniter)
Opened 18 years ago
Closed 16 years ago
Implement Pythonic generators and iteration protocol support for JS1.7
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla1.9alpha1
People
(Reporter: brendan, Assigned: brendan)
References
()
Details
Attachments
(7 files, 23 obsolete files)
29.44 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
2.18 KB,
text/plain
|
Details | |
307.06 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
312.29 KB,
patch
|
brendan
:
review+
|
Details | Diff | Splinter Review |
34.10 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
2.51 KB,
text/plain
|
Details | |
74.62 KB,
patch
|
igor
:
review+
|
Details | Diff | Splinter Review |
This is a topic in ECMA TG1, not certain to be in Edition 4, worth pioneering in a popular implementation. It's also something lots of people have wanted for a long time in SpiderMonkey. /be
Assignee | ||
Updated•18 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•18 years ago
|
||
Try this in a patched js shell: function fib() { var a = 0, b = 1 while (true) { yield a var t = a a = b b += t } } var g = fib() g.next() g.next() g.next() g.next() /be
Assignee | ||
Comment 2•18 years ago
|
||
Last mountain to climb: recast for-in and for-each-in loops in terms of iteration protocol. /be
Assignee | ||
Comment 3•18 years ago
|
||
Attachment #211222 -
Attachment is obsolete: true
Attachment #211570 -
Attachment is obsolete: true
Comment 4•18 years ago
|
||
(In reply to comment #3) > Created an attachment (id=211947) [edit] The patch does not include jsproto.tbl.
Comment 5•18 years ago
|
||
Is the syntax [expr for (i in g)] also allowed (with the parentheses)? It is more JS-like, and people are certainly going to type it this way accidentally. Would it also be an option to make the for statement an expression? I.e. var a = for (i in g) expr;
Assignee | ||
Comment 6•18 years ago
|
||
(In reply to comment #5) > Is the syntax [expr for (i in g)] also allowed (with the parentheses)? It is > more JS-like, and people are certainly going to type it this way accidentally. It could be done, it's neither concise nor Pythonic. Parens should not be required in any case. Parens are required for control structures such as for and if, as in C, because of lack of "do" or "then" keywords, or ":" separators. But in an array comprehension, there's no need for such filigree. > Would it also be an option to make the for statement an expression? I.e. > > var a = for (i in g) expr; This was considered but it's neither Pythonic nor JS-ish -- JS is not an expression language. This looks a bit too much like a pair of statements from which the initializer of the var has been accidentally removed. It seems better a) to avoid special forms where possible, b) then if necessary to avoid special forms that look like errors, c) and if special forms are necessary, to re-use existing ones from "nearby" languages. /be
Assignee | ||
Comment 7•18 years ago
|
||
(In reply to comment #6) > (In reply to comment #5) > > Is the syntax [expr for (i in g)] also allowed (with the parentheses)? It is > > more JS-like, and people are certainly going to type it this way accidentally. > > It could be done, it's neither concise nor Pythonic. Parens should not be > required in any case. Parens are required for control structures such as for > and if, as in C, because of lack of "do" or "then" keywords, or ":" separators. > But in an array comprehension, there's no need for such filigree. What's more, if parens are allowed on the principle that users will expect this use of the for keyword to match the for/in loop, won't some users also want to use var after the opening paren? But the iteration variable (i in your example) is local to the bracketed array comprehension, and var otherwise does not scope that way. One good design principle is that different semantics should have different syntax. If we apply this principle to the exclusion of the "stand on Python's shoulders" one, we would want to avoid the for and in keywords altogether. Say using something Haskell-y, or Math-y (e.g., [i*i | i = g] -- but | is then not usable as bitwise or in the expression). I think that on balance, it is better to be Pythonic, even to the extent of dropping parens required in the C-like for loop header. /be
Comment 8•18 years ago
|
||
Just wondering, what is the value of: [i for i in [10, 20]] You'd expect [10, 20], but the normal for..in semantics of JS gives [0, 1]. And is this also supported: [x * y for x in gx for y in gy if x > y if y != 1] Then we have for, in and if that are used differently than in normal js! I like the Haskell option better, perhaps using 'where' instead of '|': [x * y where x <- gx, y <- gy, x > y, y != 1] <- would also be a nice operator: 10 <- [10, 20] == true
Assignee | ||
Comment 9•18 years ago
|
||
(In reply to comment #8) > Just wondering, what is the value of: > > [i for i in [10, 20]] > > You'd expect [10, 20], but the normal for..in semantics of JS gives [0, 1]. Yes, Arrays are not Python lists. E4X introduced for each (elem in list) to cope, and we're stuck with that doubling of keyword overhead, without clarity on "each" meaning "each value" rather than "each id". So to follow suit (following bad precedent is a bad idea :-/), we could have [i for each i in [10, 20]]. The slippery slope to Python turns into a cliff. Since we are not going to change for-in loops to iterate over values for certain object subtypes, and since E4X is an ECMA, now ISO (shudder), standard, Edition 4 should include backward-compatible for-in and for-each-in loops. If generators are considered different enough cases to warrant different syntax, *and* the value of being Pythonic is diminished so much by lack of value-rather-than-id enumeration that is specialized per Python type, then we ought to consider novel syntax. > And is this also supported: > > [x * y for x in gx for y in gy if x > y if y != 1] No. > Then we have for, in and if that are used differently than in normal js! > I like the Haskell option better, perhaps using 'where' instead of '|': > > [x * y where x <- gx, y <- gy, x > y, y != 1] > > <- would also be a nice operator: 10 <- [10, 20] == true "where" is nice, but do you really want to add <- in JS? How would you specify it fully? /be
Comment 10•18 years ago
|
||
> > [x * y where x <- gx, y <- gy, x > y, y != 1]
> >
> > <- would also be a nice operator: 10 <- [10, 20] == true
>
> "where" is nice, but do you really want to add <- in JS? How would you specify
> it fully?
a <- b would be the same as b.indexOf(a) != -1 (with the obvious implementation of indexOf for generators), or in math terms: a ∈ b (<- even looks like ∈).
Then the where part can be considered a boolean expression, with "," meaning "&&". We could even write it that way:
[x * y where x <- gx && y <- gy && x > y && y != 1]
x and y can be considered to range over all possible JS values, with <- simply as a boolean test (obviously should not be implemented that way). (<- also makes the variable in front of it local to the list comprehension.)
Then you can even write:
[x where x <- [10, 20, 30] && x <- [20, 30, 40]]
resulting in [20, 30]. The way to implement this is to use the first occurrence of <- per variable as the for each loop, and every next occurrence as a boolean test.
Comment 11•18 years ago
|
||
(bugzilla bug) ∈ is the "element of" unicode character, a rounded "E".
Comment 12•18 years ago
|
||
(In reply to comment #8) > I like the Haskell option better, perhaps using 'where' instead of '|': > > [x * y where x <- gx, y <- gy, x > y, y != 1] AFAIK list comprehension in Haskell is effectively a way to write iterators since the language is lazy. Thus for JS it would be nice to have an iterator comprehension or shorthand to replace for (var x in gx) for (var y in gx) if (x > y && y != 1) use_x_y by for (var x in gx, var y in gy, x > y, y != 1) use_x_y But since this is not compatible with the current JS, so what about abusing "do" keyword for that: do (var x: gx; var y: gy; x > y; y != 1) use_x_y where in do (term1; ... ;termN) each term is either "name: iterator" or a filter expression. One can even extent the syntax to allow: do (var x: 1,2, 7..10, gx; var y: gy, 0; x > y; y != 1) use_x_y Here 1,2, 7..10, gx would build an iterator that iterates over 1,2,7,8,9,10 and each value of gx iterator. Then array comprehension is easy: [ do (var x: 1,2, 7..10, gx; var y in gy, 0; x > y; y != 1) x * y ]
Assignee | ||
Comment 13•18 years ago
|
||
var x: T means type annotation, still -- I'll blog to give the latest on this, but that rules out Igor's proposal as written. Sjoerd's element-of is nice, however: "in" exists already and is overloaded. Rather than adding another operator (for value set membership rather than id set member ship testing), we might prefer either of these: 1. Provide a list type where "in" tests value membership. Thanks to the comma operator, we can't do this just via (a, b, c) -- we would need (a, b, c,) or some such ugliness. Whatever the syntax, then "in" would suffice. 2. Provide a function mapping values to ids, so we can test (i in values([10, 20, 30])) or some better name. Neither is Pythonic. In spite of the cliff, I still favor imitating Python for generator functions and simple array comprehensions. For more complicated cases such as values vs. ids and what Lua calls "stateless iterators" (there's state, but you don't need to create an iterator object), I'm still looking into the problems and potential solutions. More in a bit, feel free to comment. /be
Comment 14•18 years ago
|
||
>"in" exists already and is overloaded. Overloaded how? > 1. Provide a list type where "in" tests value membership. Thanks to the comma > operator, we can't do this just via (a, b, c) -- we would need (a, b, c,) or > some such ugliness. Whatever the syntax, then "in" would suffice. Perhaps list(a, b, c). But my oppion is that is unacceptable for a language like EcmaScript to have more than one kind of list when they cannot be used interchangeably. > 2. Provide a function mapping values to ids, so we can test > > (i in values([10, 20, 30])) > > or some better name. That wouldn't work. F.e. "10" in values([10, 20, 30]) would return true, because "in" only works with strings.
Assignee | ||
Comment 15•18 years ago
|
||
(In reply to comment #14) > >"in" exists already and is overloaded. > > Overloaded how? "in" in the for/in loop and as a relational operator tests whether a string property name is bound in an object (or its prototype, etc.). "in" in for/each/in loop has its meaning modified by "each" to enumerate property values, not property names. And "in" is vague. Does it include the prototype chain? It does, but why? I regret that design decision from Spring of 1995. It begot Object.prototype.hasOwnProperty in ECMA Edition 3, which also added Object.prototype.propertyIsEnumerable -- but propertyIsEnumerable returns false unless hasOwnProperty, which means its notion of enumerability is different from for/in and the "in" operator's notion. > > 1. Provide a list type where "in" tests value membership. Thanks to the comma > > operator, we can't do this just via (a, b, c) -- we would need (a, b, c,) or > > some such ugliness. Whatever the syntax, then "in" would suffice. > > Perhaps list(a, b, c). But my oppion is that is unacceptable for a language > like EcmaScript to have more than one kind of list when they cannot be used > interchangeably. Agreed, and no one in ECMA TG1 is proposing any such thing. This was just a thought-trial-balloon, now exploded. > > 2. Provide a function mapping values to ids, so we can test > > > > (i in values([10, 20, 30])) > > > > or some better name. > > That wouldn't work. F.e. > > "10" in values([10, 20, 30]) > > would return true, because "in" only works with strings. True, and I have a proposal in for Edition 4 that adds value-keyed hashes, with syntax for property access by value-name. But it's not obviously a winner, and we shouldn't depend on it here. This leaves your <- proposal standing, although "in" and <- (and for-each-in) are incoherent-looking. Is there a concise, more consistent or minimal way to say these things? I would be happy to relegate for-each-in to E4X compatibility mode status. /be
Assignee | ||
Comment 16•18 years ago
|
||
More thoughts. If I squint, <- looks like set membership, but if I open my eyes and don't skew my laptop screen, it looks more like a binding operator of some sort. It's not clear what the average JS hacker will make of it. For any Python/JS convergence however limited, "in" is ambiguous on its face, since JS users coming from Python are confused by JS's "in" always testing property key existence, rather than property value membership. During Python's evolution (see PEP-234), there was much discussion showing some confusion about whether for-in should iterate over keys or values, differing among lists and dicts. The weak symmetry between lists and dicts favored for-in considering values in list, but keys in dict. As noted, JS has no list type, although array initialisers look like lists from many other languages. Without adding lists (undesirable as noted) or changing how "in" and for-in work on arrays (gross incompatibility, out of bounds for Edition 4), we are left with no choice but to add something like <- under some surface syntax. For consonance with the keyword-happy nature of JS, and specifically instanceof, we could add a memberof relational operator. Then one would write the not so useful example from comment 8 like so: [i for (i memberof [10, 20])] to get [10, 20]. I've added parentheses here, because instead of for each (var p in o) ... to iterate a variable p over values in an object o, this proposal would want: for (var p memberof o) ... The E4X for-each-in loop could be kept as a backward compatibility measure by implementations that supported E4X prior to Edition 4 (ActionScript 3, recent versions of SpiderMonkey and Rhino), but not made a normative part of Edition 4. On the down side, memberof is long and unsightly because runtogether. We should struggle to find the best syntax, because the semantics are clear enough. Adding memberof, possibly requiring parens, not supporting nested for and/or optional if guard at the end... now we are sliding uphill away from Pythonic generators. A special binding keyword such as where starts to look better. But I'm reluctant to give up Python emulation when it seems in reach for common cases (array comprehensions that iterate over array initialisers seem contrived to me). Comments? /be
Assignee | ||
Comment 17•18 years ago
|
||
Just wanted to add that I haven't implemented nested fors or optional if guards, but there's no hardship in doing so. This bug and the proposal before ECMA TG1 say "be Pythonic", and while we have to draw a careful line before the cliff, nested for and if guard fall on the JS side of that line, modulo parenthesizing. /be
Comment 18•18 years ago
|
||
I realized <- wouldn't work. F.e. "x<-3" must mean "x < -3". "memberof" is the best alternative. Can't think of anything better. (from, of, at, IN, inside, all terrible). Other languages don't provide much inspiration either. http://merd.sourceforge.net/pixel/language-study/syntax-across-languages.html#BagAndLst Btw, JS may be "keyword-happy", but few operators are keywords. It's just that !, @ or # don't make much sense either. I don't really consider "in" as overloaded. "for (p in o)" reads to me as "for every p in the universe for which p in o is true". This doesn't look odd to me at all: for (p in o && o.hasOwnProperty(p)) {...} Or even for (p in o1 && p in o2) {...} for (p in o1 || p in o2) {...} All in all, this is the syntax that scares me the least: [x * y for (x memberof gx) for (y memberof gy) if (x > y && y != 1)] And I would require the parentheses, it reads so much better than this: [x * y for x memberof gx for y memberof gy if x > y && y != 1]
Comment 19•18 years ago
|
||
(In reply to comment #18) > All in all, this is the syntax that scares me the least: > > [x * y for (x memberof gx) for (y memberof gy) if (x > y && y != 1)] According to the general rules of JS that would add x and y as global variables unless x and y are already declared. On the other hand adding var prefixes looks ugly: [x * y for (var x memberof gx) for (var y memberof gy) if (x > y && y != 1)] It would be nice to have a form of loop that always introduces new local variable for its cursor.
Comment 20•18 years ago
|
||
After some thinking about the list comprehension I asked myself if it is really necessary? What if the iterator would have a method to generate array from its iterations? Then with a library of iterators like Iter.range(from, to, step) to iterate within the given bounds and step Iter.list(v1, v2, ..., v3) to iterate in sequence over v1, v2... nesting into vi if vi is iterator itself. Then one can have something like: var array = Iter.list(1, 2, Iter.range(100, 200)).toArray(); and for (var i in Iter.range(1, 100)) use_x; where I assume that "for (x in y) { ... } " would stand for: tmp_var = Iter(x); while (!tmp_var.done()) { x = tmp_var.next(); ... } Here Iter(x) is just x when x is iterator and Iter.getObjectKeyIterator(object) otherwise. Similarly "for each (x in y) { ... }" becomes a shortcut for "for (x in Iter.getValueIteretaor(y)) { ... }" The drawback is that instead of var a = [x * y for (var x memberof gx) for (var y memberof gy) if (x > y && y != 1)] one would be forced to write: var a = (function() { for (var x in gx) for (var y in gy) if (x > y && y != 1) yield x * y; })().toArray(); or just var a = []; for (var x in gx) for (var y in gy) if (x > y && y != 1) a.push(x*y); But that does not look bad at all.
Comment 21•18 years ago
|
||
If you don't want to overload the "in" keyword in the "for" loop, you may want to consider the Pythonic loop added in Java 5: http://java.sun.com/j2se/1.5.0/docs/guide/language/foreach.html A reasonable example in JS may be: for (var value : list) (equivalent to Python "for value in list:") And let me add that IMHO the "be Pythonic" proposal is a very good thing for the JS future.
Comment 22•18 years ago
|
||
(In reply to comment #21) > If you don't want to overload the "in" keyword in the "for" loop, you may want > to consider the Pythonic loop added in Java 5: > http://java.sun.com/j2se/1.5.0/docs/guide/language/foreach.html > > A reasonable example in JS may be: > for (var value : list) As Brendan already pointed in comment 13 ":" is reserved for JS2.0: "var x: T means type annotation"
Assignee | ||
Comment 23•18 years ago
|
||
Igor: [x*x for (x in generate(seed))] is a worthwhile idiom inspired by Python to save a few lines. Why make people type var a = [] and a.push if we can avoid it? Re: Iter, my proposal is almost exactly that: Iterator as in the patch, which is a function that can be used to get or construct an iterator for an arbitrary object argument. In Edition 3, we already have special binding forms (function, catch), so we need not use var for every local variable. The bracketing ([]) for comprehensions is sufficient to delimit the local scope. (Dave Herman has a righteous proposal for let block-expressions with local binding scope, which he may blog about -- after that, I think we're done!) It doesn't seem wise to cannibalize arbitrary syntax and semantics from other languages, even from Java. I continue to believe that Pythonic for-in loops and comprehensions (modulo parens and in vs. memberof) are a close fit. /be
How evil would it be to reverse the operands, giving something like if ([1, 2, 3] contains x) ? ("has"?)
Comment 25•18 years ago
|
||
(In reply to comment #23) > Igor: [x*x for (x in generate(seed))] is a worthwhile idiom inspired by Python > to save a few lines. Why make people type var a = [] and a.push if we can > avoid it? I did not see a lot of Phython code, but in the most cases I saw array compehension was used to create a temporary array to iterate over it later. In lazy languages like Haskell that is not a problem, but JS and Python are not lazy and here such usage causes unecessary memory consumption. Also with proper library support one can write: Iter.map(random(seed()), function(x) { return x * x; }).toArray() which does not look that bad while emphasizing the need to have a short for lambda expression. Something like: Iter.map(random(seed()), function(x) x * x).toArray() where function(args) expr is just a shorthand for function(args) { return expr; } would benefit not only iterator cases.
Assignee | ||
Comment 26•18 years ago
|
||
(In reply to comment #24) > How evil would it be to reverse the operands, giving something like > > if ([1, 2, 3] contains x) > > ? ("has"?) Loop control on the left wins. So if we were adding only contains, why not? But everyone (Python-heads, E4X, even you IIRC) wants a for/in variant that iterates over values. And for (a contains v) is just backwards. So yeah, evil. (In reply to comment #25) > (In reply to comment #23) Point about laziness in Haskell is worth attending to, but since Python already has comprehensions and is strict, and JS is closer to Python, the point of this bug still stands. > Also with proper library support one can write: > > Iter.map(random(seed()), function(x) { return x * x; }).toArray() > > which does not look that bad while emphasizing the need to have a short for > lambda expression. Something like: > > Iter.map(random(seed()), function(x) x * x).toArray() > > where > function(args) expr > > is just a shorthand for > function(args) { return expr; } > > would benefit not only iterator cases. These are orthogonal. The point of array comprehensions is to avoid requiring both lambdas (however short-handed) and more explicit array construction. The contest is not judged by what looks good or not-that-bad, but really by what wins by these benefits: * JS lacks a general iteration protocol and any kind of usably structured continuations. Adding generators simplifies many known use cases. Array comprehensions are rarer, but still winning in their own right. * Similarity to Python or related languages that have overlapping communities. In particular, adding Pythonic generators and iteration protocol but not adding comprehensions leaves something out, arbitrarily. So call this "sufficiently complete similarity". * Idiomatic syntax so we can optimize and/or add non-strictness without breaking the more general (map, array construction, etc.) non-idiomatic forms. Outweighing these costs: * Language complexity budget. We have much bigger concerns elsewhere. * Implementation cost. As you note, given generators that for-in iterates, array comprehensions are surface syntax, so I argue this is low. /be
Assignee | ||
Comment 27•18 years ago
|
||
I've been working on various JS2 / ECMAScript Edition 4 issues including this one. I'll have a new patch soon, but this makes the last one work in the mean time. /be
Assignee | ||
Updated•18 years ago
|
Alias: geniter
Comment 28•18 years ago
|
||
Python 2.5 added some new features to its generator protocol in order to support coroutine-like features. Generators in Python 2.5 support send(value) and throw(exc), and yield can be used as an expression. This kind of functionality would be quite useful in JS, especially if some kind of streaming protocol is added, as it saves you the trouble of building an explicit state machine or passing around a complex object from yield. http://www.python.org/dev/peps/pep-0342/ http://docs.python.org/dev/whatsnew/node9.html
Assignee | ||
Updated•18 years ago
|
Assignee | ||
Comment 29•18 years ago
|
||
Hey Bob, thanks for the pointers. I had read PEP-342 a while ago, but I was not tracking Python 2.5 closely. Although one ECMA TG1 member was concerned about adapting their AST-evaluating Edition 3 implementation to support generators, and seemed mollified at having to save and restore state only for statements (not expressions) with yield as a statement, the rest of us are tentatively in favor of yield expressions and send. I don't expect any barriers to implementation from other members. We are even planning to unreserve property identifiers in context, so g.throw(e) should work in Edition 4 -- no need to rename or quote. New patch soon. /be
Priority: -- → P1
Assignee | ||
Comment 30•18 years ago
|
||
I'm trying to cut the total patch into comprehensible pieces, starting with this one, which adds js_IsArrayLike (useful from Function.prototype.apply as well as for the iteration protocol's [key, value] return-value handling) and renames a misnamed macro. /be
Attachment #219943 -
Flags: review?(mrbkap)
Comment 31•18 years ago
|
||
Comment on attachment 219943 [details] [diff] [review] pre-patch to smooth the way >Index: jsarray.c >+js_IsArrayLike(JSContext *cx, JSObject *obj, JSBool *answerp, jsuint *lengthp) >+{ >+ JSClass *clasp; >+ >+ clasp = OBJ_GET_CLASS(cx, obj); >+ *answerp = (clasp == &js_ArgumentsClass || clasp == &js_ArrayClass); The name of this function would suggest that this would return true for things like DOM lists as well (e.g., things that have a length property and numerical indeces from 0 to 'length'). Is that not the definition of 'arraylike', or could you comment on what 'array like' means?
Attachment #219943 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 32•18 years ago
|
||
Array-like objects could be loosely defined, but right now this function is used only by fun_apply, the native implementing Function.prototype.apply. And array-like in Edition 3 is defined narrowly: only Array or an arguments object. For Edition 4, there should be a way to bless other objects as array-like, ideally using the type system rather than property id probing (what passes for "duck typing" too often). /be
Comment 33•18 years ago
|
||
Something in this patch broke Greasemonkey's GM_Hitch function (the function that hooks into the browser). This piece of code: var staticArgs = Array.prototype.splice.call(arguments, 2, arguments.length); used to return an array with length zero. After this patch, it returns null. Is this to be expected? Because that's going to break a lot of things, I'm sure.
Assignee | ||
Comment 34•18 years ago
|
||
(In reply to comment #33) > Something in this patch broke Greasemonkey's GM_Hitch function (the function > that hooks into the browser). This piece of code: > > var staticArgs = Array.prototype.splice.call(arguments, 2, arguments.length); > > used to return an array with length zero. After this patch, it returns null. Can you show a testcase including the function containing this line, and the call to it that fails? I can't make it fail: js> function f() { return Array.prototype.splice.call(arguments, 2, arguments.le ngth) } js> f() js> f(1) js> f(1,2) js> f(1,2,3) 3 js> f(1,2,3,4,5,6) 3,4,5,6 js> [].splice(2, 0) js> [1].splice(2, 1) js> [1,2].splice(2, 2) js> [1,2,3].splice(2, 3) 3 js> [1,2,3,4].splice(2, 4) 3,4 js> [1,2,3,4,5].splice(2, 5) 3,4,5 js> [1,2,3,4,5,6].splice(2, 6) 3,4,5,6 Note same results, and undefined return per spec when the start index 2 is >= the |this| array-like's length. > Is this to be expected? Not expected, and I'll fix it as fast as I can. I sure could use a failing testcase that is not all of GM. /be
Assignee | ||
Comment 35•18 years ago
|
||
(In reply to comment #34) Found the regression, which was not from the patch that landed for this bug, but from the patch for bug 325951 -- and remind me to read the splice spec (it hurts my eyes, but it made the regression obvious). Checking in jsarray.c; /cvsroot/mozilla/js/src/jsarray.c,v <-- jsarray.c new revision: 3.83; previous revision: 3.82 done Sorry for the screwup. /be
Comment 36•18 years ago
|
||
Huh. My bad. I looked over all the code checked in over the last 24 hours and it looked like this was the only thing that touched Array. Guess I didn't look hard enough. Glad that got fixed though. :)
Comment 37•18 years ago
|
||
Did this cause bug 336100?
Doubtful.
Assignee | ||
Comment 39•18 years ago
|
||
The work to track Python 2.5 will come in a separate patch, after a dependent bug or two are filed and fixed. This patch is a work in progress. Besides further work to track Python 2.5, SpiderMonkey will track ECMA-262 Edition 4. This may mean incompatible changes; caveat emptor. With this patch, SpiderMonkey supports PEP-255 style simple generators and array comprehensions: function count(n) { for (var i = 0; i < n; i++) yield i; } function f() { return [i*i for (i in count(10))]; } function g() { return [i*j for (i in count(10)) for (j in count(10))]; } function h() { return [i*j for (i in count(10)) for (j in count(10)) if (i != j)]; } var t; print(''); print(f, "=>", (t = f()).length, uneval(t)); print(''); print(g, "=>", (t = g()).length, uneval(t)); print(''); print(h, "=>", (t = h()).length, uneval(t)); Note that parentheses around the for control (the "for head") in array comprehensions, and around the final if condition if present, are required. The syntax is narrow here: the loop variables (i and j above) are lexically scoped by the array comprehension (the square brackets). No other iteration variable or declarative form (e.g., for (var i in count(10))) may be used. Output: function f() { return [i * i for (i in count(10))]; } => 10 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] function g() { return [i * j for (i in count(10)) for (j in count(10))]; } => 100 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 0, 6, 12, 18, 24, 30, 36, 42, 48, 54, 0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 0, 8, 16, 24, 32, 40, 48, 56, 64, 72, 0, 9, 18, 27, 36, 45, 54, 63, 72, 81] function h() { return [i * j for (i in count(10)) for (j in count(10)) if (i != j)]; } => 90 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 4, 5, 6, 7, 8, 9, 0, 2, 6, 8, 10, 12, 14, 16, 18, 0, 3, 6, 12, 15, 18, 21, 24, 27, 0, 4, 8, 12, 20, 24, 28, 32, 36, 0, 5, 10, 15, 20, 30, 35, 40, 45, 0, 6, 12, 18, 24, 30, 42, 48, 54, 0, 7, 14, 21, 28, 35, 42, 56, 63, 0, 8, 16, 24, 32, 40, 48, 56, 72, 0, 9, 18, 27, 36, 45, 54, 63, 72] A generator function is a function containing one or more yield statements. When called, a generator function binds arguments and returns a special iterator (a generator-iterator). This iterator generates successive values by resuming execution of the generator function after the last yield, or at the start of the function if this is the first iteration. A generator function may not return a value. It may contain a void return (return; or return followed by newline) statement, which stops the iteration. Control may return by "falling off the end" of the function, which also stops the iteration. To explicitly iterate a generator, call its next method: g = count(4); g.next(); => 0 g.next(); => 1 g.next(); => 2 g.next(); => 3 See below for more on the iteration protocol underlying implicit iteration via for-in loops and array comprehensions. Per PEP-255, yield is not allowed in a try block. Generator expressions are not supported yet. Generators build on a Pythonic iteration protocol: all objects are iterable, all iterables have iterator objects, some iterables are their own iterators, but not all iterables are iterators. An iterable has an __iterator__ method taking a boolean flag argument (unlike Python, see below) and returning an iterator for the given |this|-parameter iterable. To get or create an iterator for any value v, call Iterator(v, flag), where flag may be omitted. Object.prototype.__iterator__ returns a default iterator that implements the for-in loop enumeration, which in SpiderMonkey preserves property creation order (an extension to ECMA-262). An iterator has a next method taking no arguments. Given an iterator i for an iterable o, each call to i.next() returns the identifier of the next property in o. This identifier is called the key, and it may be of any type -- it need not be a string, as enumerable property identifiers returned by for-in loops must be. Iteration stops when the global singleton StopIteration object, which is the only instance of a StopIteration subtype of Error, is thrown. JS objects are most like Python dicts, but with keys that convert to the same string equated. Therefore an iterator may return a [key, value] pair for each enumerated property. When the 'for each (v in o)' loop is used, the iterator for o is discovered as if Iterator(o, true) were called, which is equivalent to o.__iterator__(true). Given the true flag parameter, the resulting iterator *must* return [key, value] pairs. Thus the flag parameter to Iterator and __iterator__ is an optimization that may be used by specific implementations of __iterator__ to return keys only (if flag is false), and not [key, value] pairs (as required if flag is true). (The [key, value] array return case is optimized for native default iterators to avoid an array construction, and generalized destructuring assignment in ES4/JS2 will motivate further optimizations to avoid arbitrary length array creation when returning array initialisers to callers that destructure immediately.) Notes on for-in "enumeration" vs. "iteration": The for-in loop has always walked up the prototype chain from the object given on the right of |in|, suppressing ("shadowing") prototype properties of the same name as properties already visited in the original object and any nearer objects along the prototype chain. Also, ECMA-262 requires suppressing properties that are deleted after enumeration starts but before enumeration visits those names, and the spec leaves it up to implementations whether properties added during enumeration are visited. This patch preserves ECMA compatibility, but since an iterator is an iterable, when you write 'for (i in j)' where j is an iterator, a new rule applies: only the returned keys or [key, value] pairs are enumerated. There is no walk up the prototype chain of j, and no attempt to check for shadowed or deleted properties of j named by the returned key. The only property expected of j, once it has been established as an iterator by being returned via its own __iterator__ method, is its next method. Furthermore, when enumerating an iterator with for-in, the loop control variable has arbitrary key type -- it does not always contain a string identifying the property. This avoids unintended string concatenation bugs, and is much saner than the ECMA-prescribed string-ization of property names, especially when dealing with array-like iterables. Unfortunately, backward compatibility means that 'for (i in [a,b,c])...' can't iterate i over 0, 1, and 2. At least for the moment, barring an intentionally incompatible change to ES4 (which i will try to persuade ECMA TG1 to make), i must take on the values "0", "1', and "2" (and in arbitrary order according to the spec! I'll try to get this fixed to be sane, too). General remarks: As in Python, the low-level __iterator__ and next methods are typically not exposed in everyday use, thanks to for-in loops and array comprehensions. But the __iterator__ meta-object protocol finally enables arbitrary JS objects to customize iteration. That's most of the point of this patch. Generators are a separate good thing, which should reduce the amount of manual callback-based state machine code written in JS, but generators fundamentally are iterators. It's likely that other meta-object protocols will be part of ES4/JS2, and the __iterator__ name may be moved into a namespace (e.g., MOP::iterator). We may want to keep __iterator__ in SpiderMonkey for backward compatibility. This patch survived a disk corruption plague that killed my Linux partition, then a motherboard crack that just took out the whole Thinkpad I've been living on for a year and a half, and who knows what other threats. I haven't had time to work consistently on it due to other demands until just recently. I am excessivly grateful to Mozilla's IT folks -- they have supplied me with this beautiful MacBook Pro on which I'm typing. More soon. Please help test this patch. /be
Attachment #211947 -
Attachment is obsolete: true
Attachment #213828 -
Attachment is obsolete: true
Attachment #220576 -
Flags: superreview?(shaver)
Attachment #220576 -
Flags: review?(mrbkap)
Comment 40•18 years ago
|
||
This produces an assertion (Assertion failure: (jsuint)i < JS_BIT(16), at jsinterp.c:5938) if the generator generates more than 65536 results: function g() { for (var i=0; i <= 0x10001; ++i) yield 1 } [i for (i in g())] Maybe there should be some sensible hard limit for the size of an array generated this way?
Some things I've found in scanning and testing: - the attached test case (which I hope shows how .close() would give us the equivalent of context managers!) triggers a handful of bogus frees for me. Stack of the first one is: #3 0x0000f1fc in JS_free (cx=0x500180, p=0x7) at jsapi.c:1680 #4 0x0000ff94 in JS_DestroyIdArray (cx=0x500180, ida=0x7) at jsapi.c:2085 #5 0x0003cbb8 in js_Enumerate (cx=0x500180, obj=0x1804e80, enum_op=JSENUMERATE_DESTROY, statep=0xbfffdb00, idp=0x0) at jsobj.c:3630 #6 0x000ec760 in iterator_finalize (cx=0x500180, obj=0x18055a8) at jsiter.c:158 #7 0x000ed710 in js_FinishIterator (cx=0x500180, iterobj=0x18055a8) at jsiter.c:375 #8 0x0009ea8c in js_Interpret (cx=0x500180, pc=0x1819c7e "?;", result=0xbfffe710) at jsinterp.c:5904 - if we iterate an object that has a non-function-valued __iterator__ property, we get a confusing error message: js> for (var i in { __iterator__: {}}) { } typein:15: TypeError: i is not a function - the #ifndef HAS_LVALUE_RETURN case in SetupKeyValueReturn looks kinda strange to me, calling js_NewObject with a second parameter of 2. - tyop: "/* Inline JS_ClearPendingExcpetions(cx). */"
Comment 42•18 years ago
|
||
Another crasher (in CloneBlockObject): function g() { try { eval("yield 5") } catch (ex) {} yield 1 } g().next()
Assignee | ||
Comment 43•18 years ago
|
||
(In reply to comment #40) > This produces an assertion (Assertion failure: (jsuint)i < JS_BIT(16), at > jsinterp.c:5938) if the generator generates more than 65536 results That was just a bogus assertion -- it's gone. (In reply to comment #41) > - the attached test case (which I hope shows how .close() would give us the > equivalent of context managers!) triggers a handful of bogus frees for me. Fixed (js_FinishIterator is an optimization to eagerly finalize native iterators, so it must class-check iterobj before assuming it's a native iter). > - if we iterate an object that has a non-function-valued __iterator__ property, > we get a confusing error message: > > js> for (var i in { __iterator__: {}}) { } > typein:15: TypeError: i is not a function Working on this, it involves your friend and mine, the decompiler. > - the #ifndef HAS_LVALUE_RETURN case in SetupKeyValueReturn looks kinda strange > to me, calling js_NewObject with a second parameter of 2. Obviously meant js_NewArrayObject -- fixed. > - tyop: "/* Inline JS_ClearPendingExcpetions(cx). */" Fixed. (In reply to comment #42) > Another crasher (in CloneBlockObject): > > function g() { > try { > eval("yield 5") > } > catch (ex) {} > yield 1 > } > g().next() I can't reproduce. Both return and yield are not allowed outside of a function, and eval treats its argument as program source, not function body source when called from a function. So I get the expected error: oops.js:3: SyntaxError: yield not in function: oops.js:3: yield 5 oops.js:3: ^ Can you say more about how to reproduce? What's the stack? /be
Assignee | ||
Comment 44•18 years ago
|
||
I fussed over the error message. Given: var s1 = "o = {__iterator__:", s2 = "}; for (var t in o);"; var values = [ "null", "undefined", "false", "42", "'hi'", "{}", "function () { throw StopIteration; }" ]; for (var i = 0; i < values.length; i++) { try { eval(s1 + values[i] + s2); } catch (e) { print(e); } } the patched shell produces: TypeError: o has invalid __iterator__ value null TypeError: o has invalid __iterator__ value undefined TypeError: o has invalid __iterator__ value false TypeError: o has invalid __iterator__ value 42 TypeError: o has invalid __iterator__ value "hi" TypeError: o has invalid __iterator__ value ({}) StopIteration /be
Attachment #220576 -
Attachment is obsolete: true
Attachment #220739 -
Flags: superreview?(shaver)
Attachment #220739 -
Flags: review?(mrbkap)
Attachment #220576 -
Flags: superreview?(shaver)
Attachment #220576 -
Flags: review?(mrbkap)
Assignee | ||
Comment 45•18 years ago
|
||
Simulated ref-counting in a GC'ed world in order to eagerly finalize native default iterators is hard work. /be
Attachment #220739 -
Attachment is obsolete: true
Attachment #220747 -
Flags: superreview?(shaver)
Attachment #220747 -
Flags: review?(mrbkap)
Attachment #220739 -
Flags: superreview?(shaver)
Attachment #220739 -
Flags: review?(mrbkap)
Comment 46•18 years ago
|
||
(In reply to comment #43) > I can't reproduce. Both return and yield are not allowed outside of a > function, and eval treats its argument as program source, not function body > source when called from a function. So I get the expected error: > > oops.js:3: SyntaxError: yield not in function: > oops.js:3: yield 5 > oops.js:3: ^ > > Can you say more about how to reproduce? What's the stack? Strange, this code crashes every time I try it (even after applying patch v3), with the following stack: > js32.dll!CloneBlockObject(JSContext * cx=0x0251ae20, JSStackFrame * fp=0x028b8f84, JSObject * obj=0xcdcdcdcd) Line 477 + 0x3 bytes C js32.dll!js_GetScopeChain(JSContext * cx=0x0251ae20, JSStackFrame * fp=0x028b8f84) Line 501 + 0x11 bytes C js32.dll!obj_eval(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, unsigned int argc=1, long * argv=0x028b8ff4, long * rval=0x0013d770) Line 1223 + 0xd bytes C js32.dll!js_Invoke(JSContext * cx=0x0251ae20, unsigned int argc=1, unsigned int flags=0) Line 1323 + 0x1a bytes C js32.dll!js_Interpret(JSContext * cx=0x0251ae20, unsigned char * pc=0x0285afb1, long * result=0x0013dc6c) Line 3965 + 0xf bytes C js32.dll!generator_next(JSContext * cx=0x0251ae20, JSObject * obj=0x025245a0, unsigned int argc=0, long * argv=0x0282c020, long * rval=0x0013dd64) Line 550 + 0x14 bytes C js32.dll!js_Invoke(JSContext * cx=0x0251ae20, unsigned int argc=0, unsigned int flags=0) Line 1323 + 0x1a bytes C js32.dll!js_Interpret(JSContext * cx=0x0251ae20, unsigned char * pc=0x02864fe7, long * result=0x0013e290) Line 3965 + 0xf bytes C js32.dll!js_Execute(JSContext * cx=0x0251ae20, JSObject * chain=0x02523e80, JSScript * script=0x02864fa8, JSStackFrame * down=0x00000000, unsigned int flags=0, long * result=0x0013e444) Line 1568 + 0x13 bytes C js32.dll!JS_ExecuteScript(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, JSScript * script=0x02864fa8, long * rval=0x0013e444) Line 4187 + 0x19 bytes C js.exe!Load(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, unsigned int argc=1, long * argv=0x0282c00c, long * rval=0x0013e570) Line 588 + 0x21 bytes C js32.dll!js_Invoke(JSContext * cx=0x0251ae20, unsigned int argc=1, unsigned int flags=0) Line 1323 + 0x1a bytes C js32.dll!js_Interpret(JSContext * cx=0x0251ae20, unsigned char * pc=0x02825fef, long * result=0x0013ea9c) Line 3965 + 0xf bytes C js32.dll!js_Execute(JSContext * cx=0x0251ae20, JSObject * chain=0x02523e80, JSScript * script=0x02825fb8, JSStackFrame * down=0x00000000, unsigned int flags=0, long * result=0x0013fca0) Line 1568 + 0x13 bytes C js32.dll!JS_ExecuteScript(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, JSScript * script=0x02825fb8, long * rval=0x0013fca0) Line 4187 + 0x19 bytes C js.exe!Process(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, char * filename=0x00000000, int forceTTY=0) Line 264 + 0x18 bytes C js.exe!ProcessArgs(JSContext * cx=0x0251ae20, JSObject * obj=0x02523e80, char * * argv=0x0240af74, int argc=2) Line 481 + 0x15 bytes C js.exe!main(int argc=2, char * * argv=0x0240af74, char * * envp=0x023b8f60) Line 2736 + 0x15 bytes C js.exe!__tmainCRTStartup() Line 586 + 0x19 bytes C js.exe!mainCRTStartup() Line 403 C kernel32.dll!7c8123cd() obj is always 0xcdcdcdcd, the pattern for allocated but not initialised bytes. Another question: The result of |Function("yield 1")()| is 1, not [object Generator]. Is that intentional, generators cannot be build with Function()?
Comment 47•18 years ago
|
||
Adding the line gen->frame.blockChain = fp->blockChain; after the line gen->frame.thisp = fp->thisp; in js_NewGenerator fixes the crash for me; I get the expected error message. I don't know if blockChain should be set to NULL instead.
Assignee | ||
Comment 48•18 years ago
|
||
(In reply to comment #47) > in js_NewGenerator fixes the crash for me; I get the expected error message. I > don't know if blockChain should be set to NULL instead. NULL, thanks. New patch in a trice. /be
Assignee | ||
Comment 49•18 years ago
|
||
Attachment #220747 -
Attachment is obsolete: true
Attachment #220859 -
Flags: superreview?(shaver)
Attachment #220859 -
Flags: review?(mrbkap)
Attachment #220747 -
Flags: superreview?(shaver)
Attachment #220747 -
Flags: review?(mrbkap)
Comment 50•18 years ago
|
||
This example produces "uncaught exception: 1", expected result is 2: function g() { try { throw 1 } catch(exc) { yield 2 } } g().next() Also |exc instanceof StopIteration| raises a TypeError, which prevents its use in catch guards.
Comment 51•18 years ago
|
||
I see, the raised StopIteration exception is always the same object, so you can use |catch (exc if exc == StopIteration)|. Another crasher, I hope you can reproduce this one because the stack trace is complete nonsense: [2 for (i in "a") for (i in "b")]
Assignee | ||
Comment 52•18 years ago
|
||
(In reply to comment #50) > This example produces "uncaught exception: 1", expected result is 2: > > function g() { > try { throw 1 } > catch(exc) { yield 2 } > } > g().next() I should read my own XXX comments: * XXX This assumes the null mark case implies XML filtering predicate * expression execution! from the exception handling code near the bottom of js_Interpret. I righteously blame E4X here. :-/ > > Also |exc instanceof StopIteration| raises a TypeError, which prevents its use > in catch guards. instanceof always looks up the proto chain one hop, alas. Its basis case is not reflexive. Sucks, I blame Edition 3. I doubt we will fix this for Edition 4, but I will put it on the radar. /be
(In reply to comment #52) > > Also |exc instanceof StopIteration| raises a TypeError, which prevents its use > > in catch guards. > > instanceof always looks up the proto chain one hop, alas. Its basis case is > not reflexive. Sucks, I blame Edition 3. I doubt we will fix this for Edition > 4, but I will put it on the radar. Custom hasInstance on js_StopIterationClass?
Assignee | ||
Comment 54•18 years ago
|
||
(In reply to comment #51) > Another crasher, I hope you can reproduce this one because the stack trace is > complete nonsense: > [2 for (i in "a") for (i in "b")] Oops, block local definition without lookup to check for rebinding of an identifier already defined for the same block. Fixed. (In reply to comment #53) > (In reply to comment #52) > Custom hasInstance on js_StopIterationClass? What a good idea. A one-liner, essentially. New patch as soon as I fix the filtering predicate drain bamage (for which I blame myself). /be
Assignee | ||
Comment 55•18 years ago
|
||
Attachment #220859 -
Attachment is obsolete: true
Attachment #221024 -
Flags: superreview?(shaver)
Attachment #221024 -
Flags: review?(mrbkap)
Attachment #220859 -
Flags: superreview?(shaver)
Attachment #220859 -
Flags: review?(mrbkap)
Assignee | ||
Comment 56•18 years ago
|
||
Includes a fix to a copy/paste bug in jsopcode.h that mrbkap noticed. I'm landing this on the JS1_7_ALPHA_BRANCH. /be
Attachment #221024 -
Attachment is obsolete: true
Attachment #221531 -
Flags: superreview?(shaver)
Attachment #221531 -
Flags: review?(mrbkap)
Attachment #221024 -
Flags: superreview?(shaver)
Attachment #221024 -
Flags: review?(mrbkap)
Comment 57•18 years ago
|
||
Comment on attachment 221531 [details] [diff] [review] iteration protocol + simple generators (after PEP-255), v6 *stamp*
Attachment #221531 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 58•18 years ago
|
||
JS_1_7_ALPHA_BRANCH merged to trunk, ready for checkin, including patch for this bug and for bug 336376. One change to Iterator's API: the optimization whereby a second flag parameter is passed to Iterator, and on to __iterator__ as the optional first parameter, where the flag is true to signal for-each-in rather than for-in, remains. But now the flag sense is reversed, so minimal calls are Pythonic: it = Iterator(obj) makes or gets an iterator that returns [key, value] pairs. You have to call Iterator(obj, true) to get an iterator that returns keys only. Landing tomorrow morning some time (late at night U.S. time). Testing welcome. /be
Attachment #221531 -
Attachment is obsolete: true
Attachment #222278 -
Flags: superreview?(shaver)
Attachment #222278 -
Flags: review?(mrbkap)
Attachment #221531 -
Flags: superreview?(shaver)
Comment on attachment 222278 [details] [diff] [review] iteration protocol + simple generators (after PEP-255), v7 Running e4x/Statements/12.3-01.js in the browser fails with this patch, as do I suspect all enumeration tests. More in a sec.
Attachment #222278 -
Flags: superreview?(shaver)
Attachment #222278 -
Flags: superreview-
Attachment #222278 -
Flags: review?(mrbkap)
Attachment #222278 -
Flags: review-
js_ThrowStopIteration is setting NULL as a pending exception, because that's what it's getting back from js_GetClassObject for JSProto_StopIteration. Digging more now.
Known issue with innies and outies and the getting of the class objects. On mrbkap's radar apparently, brendan's gonna try something and post a new patch.
Assignee | ||
Comment 62•18 years ago
|
||
Have to fix bug 337528 to get this working in the context of DOM inner and outer window objects. New patch soon. /be
Assignee | ||
Comment 63•18 years ago
|
||
Works in the js shell, but so did the last patch. My Firefox build is still chugging. This patch contains a brute-force fix for bug 337528. /be
Attachment #222278 -
Attachment is obsolete: true
Attachment #222328 -
Flags: superreview?(shaver)
Attachment #222328 -
Flags: review?(mrbkap)
Assignee | ||
Comment 64•18 years ago
|
||
For easier reviewing, it's best to interdiff (real interdiff) against the v7 patch. - Use reserved slots in the global object, indicated by JSCLASS_IS_GLOBAL among the class flags, as part of JSCLASS_GLOBAL_FLAGS (the other flags included are JSCLASS_HAS_RESERVED_SLOTS(JSProto_LIMIT)). See jsapi.h. This is a backward compatible extension. - Use js_FindClassObject in js_ThrowStopIteration, to cope with globals that do not use JSCLASS_GLOBAL_FLAGS. - Moved js_[GS]etClassObject to jsobj.[ch] from jscntxt.[ch]. Note that js_SetClassObject is fallible now (and in the last patch). - Incorporated followup sharp-variable-def bugfix from bug 336376. This was breaking nsExtensionManager.js, which uses sharp vars. /be
Attachment #222328 -
Attachment is obsolete: true
Attachment #222482 -
Flags: superreview?(shaver)
Attachment #222482 -
Flags: review?(mrbkap)
Attachment #222328 -
Flags: superreview?(shaver)
Attachment #222328 -
Flags: review?(mrbkap)
Assignee | ||
Comment 65•18 years ago
|
||
I have landed the latest patch on the JS_1_7_ALPHA_BRANCH, and updated that branch to include all trunk changes since it was cut. It applies only under js/src, but you should be able to pull from this tag in js/src after pulling a trunk tree, and build without problems. Let me know if something doesn't work. Thanks, /be
Comment 66•18 years ago
|
||
Comment on attachment 222482 [details] [diff] [review] iteration protocol + simple generators (after PEP-255), v9 r=mrbkap with the change mentioned in bug 336373 comment 7.
Attachment #222482 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 67•18 years ago
|
||
What I just checked in. Seems good from Firefox and js shell testing. Bob, I can't get these tests to work: 322135 324422-2 the two GC without recursion js1_5/Regress/regress-271716-n.js I had Mac OS X kernel panic at one point. Some or all of these tend to use up all available vm. /be
Attachment #222749 -
Flags: superreview?(shaver)
Attachment #222749 -
Flags: review+
Comment 68•18 years ago
|
||
Looks like something is broken in class initialization. This code javascript:function a(){yield 1};b=a();alert(b.next) shows undefined, but if you use for example [i for (i in [])] before it works correctly.
Comment 69•18 years ago
|
||
This fixes problem from comment #68. I forgot to tell that this problems is visible only in browser.
Comment 70•18 years ago
|
||
(In reply to comment #67) > > 322135 Bug 322135 is not fixed yet. > 324422-2 Crashes in windows (bug 332570 duped to Bug 324533 crash [@ js_HashString() line 2813]). Is this what you are seeing? > the two GC without recursion > js1_5/Regress/regress-271716-n.js > I saw a crash in my 20060516 (but not in the prior run on 20060508) run in windows trunk only but it had a strange exit code that matched the one I use to indicate timeouts, so I was waiting to see if it was real or not. I do see a crash in my 20060520 run in windows and mac ppc runs for this test. I tried to crash my recent trunk build on windows but could not although the test ended up eating all my real and virtual ram and I had to kill the process after a half hour or so. > I had Mac OS X kernel panic at one point. Some or all of these tend to use up > all available vm. > I ran my first test run on mac intel yesterday. From what I could tell the machine handled the 1.8.0.4, 1.8.1 tests fine for opt and debug shell and browser and 1.9 opt shell, but appears to have crashed the machine sometime during the 1.9 opt browser test. It was running the 1.9 opt browser test when I went to sleep this morning but the machine lb-qamini01 is now offline. If you are in the office and can restart the machine, I can recover which test caused the problem from the logs.
Comment 71•18 years ago
|
||
(In reply to comment #70) > > the two GC without recursion > > js1_5/Regress/regress-271716-n.js > > > See bug 306350
Assignee | ||
Comment 72•18 years ago
|
||
(In reply to comment #69) > Created an attachment (id=222791) [edit] > fix for problem from comment #68 > > This fixes problem from comment #68. I forgot to tell that this problems is > visible only in browser. Yes, we have not yet modified XPConnect to make its global objects (wrapped natives such as the DOM window object) have JSCLASS_GLOBAL_FLAGS. Thanks for the patch, I'll review in a minute. /be
Assignee | ||
Comment 73•18 years ago
|
||
Attachment 222791 [details] [diff] is fine for now, but when we soon fix XPConnect-wrapped DOM global windows and other global objects in Firefox and other Gecko apps to use the new JSCLASS_GLOBAL_FLAGS, it would result in redundant initialization of the generator class, and other anonymous classes, when the supposedly-unnamed class's name was used inadvertently.
This patch looks bigger than it is. It revises the standard_class_atoms to use the JSStdName type used by the other lazy-class-initialization tables, and adds a JSClass *clasp member to that struct. It factors more macro initializer madness for setting up these tables.
Then (the non-trivial change) the jsapi.c patch uses this new clasp member, if non-null, to avoid initializing (possibly redundantly, in any case unnecessarily) the anonymous class named by some reference to a global identifier that *happens to match* the class's name.
So, e.g., "Generator" should not trigger js_InitIteratorClasses once we have fixed XPConnect to set JSCLASS_GLOBAL_FLAGS in the classes of the DOM window and similar such global objects. The only way a generator construction triggers lazy init of js_GeneratorClass is via js_GetClassObject, which works for named and anonymous classes in the same way, given a global object that reserves the slots needed to cache class objects per ECMA-262.
/be
Attachment #222791 -
Attachment is obsolete: true
Attachment #222965 -
Flags: review?(mrbkap)
Updated•18 years ago
|
Attachment #222965 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 74•18 years ago
|
||
Trunk-based patch. Pretty straightforward! /be
Attachment #223254 -
Flags: superreview?(shaver)
Attachment #223254 -
Flags: review?(mrbkap)
Assignee | ||
Comment 75•18 years ago
|
||
Comment on attachment 223254 [details] [diff] [review] coroutine generators, a la Python 2.5 New patch coming. /be
Attachment #223254 -
Attachment is obsolete: true
Attachment #223254 -
Flags: superreview?(shaver)
Attachment #223254 -
Flags: review?(mrbkap)
Assignee | ||
Comment 76•18 years ago
|
||
Whew. Had to add a close protocol before finalization to the GC. Dig it, it avoids penalizing existing workloads. Also fixed a dumb bug in yield expression parsing. Testing help craved. This applies to the trunk. /be
Attachment #223347 -
Flags: superreview?(shaver)
Attachment #223347 -
Flags: review?(mrbkap)
Comment 77•18 years ago
|
||
On recent trunk builds "Call" is behaving strangely: var Call; // -> TypeError: redeclaration of const Call alert(Call); // -> can't convert Call to string Is this behaviour related to the patches committed as part of this bug?
Comment 78•18 years ago
|
||
Comment on attachment 223347 [details] [diff] [review] coroutine generators, a la Python 2.5, v2 Mm, this stuff's fairly spiffy, although I really don't like how send() and throw() have unintuitive behavior regarding exactly where the value/exception is propagated -- blame Python, I suppose. Some observations on behavior with the patch applied: I get the following error when using xpcshell to evaluate the following JavaScript in a trunk build with this patch applied ("^D" represents me pressing Ctrl+D, if it's not clear): js> function b() { yield 17; } x = b() [object Generator] js> ^D Assertion failure: !rt->gcRunning, at /home/jwalden/moz/trees/trunk/mozilla/js/src/jsgc.c:753 Trace/breakpoint trap Evaluating the following JavaScript (which is at least Python-ically legal after PEP 342) in xpcshell produces the following result: js> function t() { yield; } Segmentation fault What behavior is supposed to happen when throw() is called on a generator which has been closed? PEP 342 says the passed-in exception is raised at that point, but that doesn't happen with the current patch: js> function f() { yield 1; yield 2; yield 3; } js> x = f() js> x.close() js> x.throw(7) uncaught exception: StopIteration Along the same lines, I'm not entirely sure after reading the PEP whether the closed state is only entered when close() is called on a generator or whether it's also entered whenever the generator reaches its end and throws StopIteration. If the latter is true, the following code doesn't demonstrate correct results: js> function f() { if (false) yield 5; } js> x = f() [object Generator] js> x.throw(2) uncaught exception: 2 js> x.throw(7) uncaught exception: StopIteration
Assignee | ||
Comment 79•18 years ago
|
||
(In reply to comment #77) > On recent trunk builds "Call" is behaving strangely: > > var Call; // -> TypeError: redeclaration of const Call > alert(Call); // -> can't convert Call to string > > Is this behaviour related to the patches committed as part of this bug? Yes. Pending the fix for bug 339041, anonymous classes must be readonly and permanent properties of the global object. Did this break real-world code? Blake will fix bug 339041 very soon, there's a patch awaiting review (I'll help it get r=jst today). (In reply to comment #78) > (From update of attachment 223347 [details] [diff] [review] [edit]) > I get the following error when using xpcshell to evaluate the following > JavaScript in a trunk build with this patch applied ("^D" represents me > pressing Ctrl+D, if it's not clear): > > js> function b() { yield 17; } x = b() > [object Generator] > js> ^D > Assertion failure: !rt->gcRunning, at > /home/jwalden/moz/trees/trunk/mozilla/js/src/jsgc.c:753 > Trace/breakpoint trap Fixed -- the new close phase must all the GC allocator to nest in the collector. Big fun, wish me luck selling this to ECMA TG1 colleagues. > Evaluating the following JavaScript (which is at least Python-ically legal > after PEP 342) in xpcshell produces the following result: > > js> function t() { yield; } > Segmentation fault Fixed, a remnant from PEP 255 emulation in the code generator. > What behavior is supposed to happen when throw() is called on a generator which > has been closed? PEP 342 says the passed-in exception is raised at that point, > but that doesn't happen with the current patch: > > js> function f() { yield 1; yield 2; yield 3; } > js> x = f() > js> x.close() > js> x.throw(7) > uncaught exception: StopIteration PEP 342 says in the paragraph specifying throw(): If the generator raises another exception (this includes the StopIteration produced when it returns) that exception is raised by the throw() call. Your generator function f has no try block around its body, and it has not been started by a .next() call at the point where you close it, so close dutifully throws GeneratorExit at it and that exception propagates uncaught, ending the iteration and marking the generator exhausted. Quoting PEP 255: If an unhandled exception-- including, but not limited to, StopIteration --is raised by, or passes through, a generator function, then the exception is passed on to the caller in the usual way, and subsequent attempts to resume the generator function raise StopIteration. In other words, an unhandled exception terminates a generator's useful life. > Along the same lines, I'm not entirely sure after reading the PEP whether the > closed state is only entered when close() is called on a generator or whether > it's also entered whenever the generator reaches its end and throws > StopIteration. If the latter is true, the following code doesn't demonstrate > correct results: > > js> function f() { if (false) yield 5; } > js> x = f() > [object Generator] > js> x.throw(2) > uncaught exception: 2 > js> x.throw(7) > uncaught exception: StopIteration The latter (that close is implicitly called when the iterator is exhausted) is false. That would defeat the purpose of adding close as a way to run finally clause code. Quote: 4. Add a close() method for generator-iterators, which raises GeneratorExit at the point where the generator was paused. If the generator then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close() returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit. 5. Add support to ensure that close() is called when a generator iterator is garbage-collected. Nowhere else in the PEP is an implicit call to close specified. Having a less prosaic, more checkable or formal spec, would be great, but I think the PEPs are unambiguous here. New patch next. /be
Assignee | ||
Comment 80•18 years ago
|
||
> Fixed -- the new close phase must all the GC allocator to nest in the
"must allow", of course.
/be
Assignee | ||
Comment 81•18 years ago
|
||
Igor, please review the jsgc.c changes too. /be
Attachment #223347 -
Attachment is obsolete: true
Attachment #223826 -
Flags: superreview?(shaver)
Attachment #223826 -
Flags: review?(mrbkap)
Attachment #223347 -
Flags: superreview?(shaver)
Attachment #223347 -
Flags: review?(mrbkap)
Assignee | ||
Updated•18 years ago
|
Attachment #223826 -
Flags: review?(igor.bukanov)
Assignee | ||
Comment 82•18 years ago
|
||
(In reply to comment #81) > Created an attachment (id=223826) [edit] > coroutine generators, a la Python 2.5, v3 > > Igor, please review the jsgc.c changes too. Here is my comment to shaver over IRC, about the flaw in the patch whereby objects wanting close callbacks can be allocated during the close phase, and never get a close callback: brendan: shaver: big question for this patch: should the "if (rt->gcCloseWanted) {...}" block in js_GC be a while (rt->gcCloseWanted) {...}" block? brendan: (or equiv, inner do-while loop to share before/after code) It's easy to add this, but it means a misbehaving close implementation could make the GC diverge. Not good. /be
Assignee | ||
Comment 83•18 years ago
|
||
(In reply to comment #82) > Here is my comment to shaver over IRC, about the flaw in the patch whereby > objects wanting close callbacks can be allocated during the close phase, and > never get a close callback: > > brendan: shaver: big question for this patch: should the "if > (rt->gcCloseWanted) {...}" block in js_GC be a while (rt->gcCloseWanted) {...}" > block? > brendan: (or equiv, inner do-while loop to share before/after code) > > It's easy to add this, but it means a misbehaving close implementation could > make the GC diverge. Not good. GC connoisseurs will think of Java here. I do not want any object being closed to be able to resurrect itself, causing close to have to run a second time (or more times) later. This matches my understanding of Python's (CPython's, at any rate) behavior. Pythonistas, please chime in if this is wrong. Given the current patch, an object being closed cannot arrange via a scripted close function to resurrect itself or any nearly-garbage object it can reach (garbage objects may of course refer to live objects -- this is how the hazard arises). Attempts to store a reference to any closing or about-to-be-finalized object (which is unmarked) in a live object (which is marked at this point in the collection process) results in a null being stored. The alternative is to mark the closing or nearly-garbage object at the point where a reference to it is stored into a live object, thereby resurrecting the referenced object. I don't see a middle ground. Variations on storing null include throwing an exception, but seem worse. The win of Python-2.5-like close for generators seems worth the cost, since it allows yield from a try block and enables "context management" (before/after resource acquire/release) patterns. /be
Comment 84•18 years ago
|
||
(In reply to comment #79) > Your generator function f has no try block around its body, and it has not been > started by a .next() call at the point where you close it, so close dutifully > throws GeneratorExit at it and that exception propagates uncaught, ending the > iteration and marking the generator exhausted. Quoting PEP 255: > > If an unhandled exception-- including, but not limited to, > StopIteration --is raised by, or passes through, a generator function, > then the exception is passed on to the caller in the usual way, and > subsequent attempts to resume the generator function raise > StopIteration. In other words, an unhandled exception terminates a > generator's useful life. This behavior applies to .next(), not to .throw(). See the end of the first paragraph in the .throw() spec in PEP 342: If the generator is already in the closed state, throw() just raises the exception it was passed without executing any of the generator's code. The cited PEP 255 paragraph was meant to apply to exceptions thrown "internally" within generators during the normal course of execution, not to exceptions passed into generators (since such functionality hadn't even been proposed at the time).
Comment 85•18 years ago
|
||
Comment on attachment 223826 [details] [diff] [review] coroutine generators, a la Python 2.5, v3 >+ if (rt->gcCloseWanted) { A quick idea: why not to put all object with a close hook on a global list and walking the list here? Otherwise when extensions start to use the generators we would always get a relatively long extra GC phase in the browser. Such list would also allow to implement a proper finalization protocol. That is, when the initial mark phase is done, work through the list. When an entry with no mark found, remove it from the list and then mark recursively. Then for all removed entries their close handlers should be executed. If the close handler triggers allocation of GC things, then all such things should be born with GCG_MARK flag set to protect them from finalization stage. After that the finalization is run. This is robust AFAICS.
Comment 86•18 years ago
|
||
The above proposal from comment 85 allows for things to be resurrected, but only once.
Assignee | ||
Comment 87•18 years ago
|
||
(In reply to comment #86) > The above proposal from comment 85 allows for things to be resurrected, but > only once. Then you need state to flag the closed nature of the object, so you don't close it again before it is finalized by a later GC. This seems problematic both on paper and in real implementations. /be
Comment 88•18 years ago
|
||
(In reply to comment #87) > Then you need state to flag the closed nature of the object, so you don't close > it again before it is finalized by a later GC. This seems problematic both on > paper and in real implementations. But this exactly that proposal of a list of things with close handlers is about: things can only be put there once and when removed before the run of the close handlers, their close handlers would never run again.
Comment 89•18 years ago
|
||
(In reply to comment #88) > (In reply to comment #87) > > Then you need state to flag the closed nature of the object, so you don't close > > it again before it is finalized by a later GC. This seems problematic both on > > paper and in real implementations. > > But this exactly that proposal of a list of things with close handlers is > about: things can only be put there once and when removed before the run of the > close handlers, their close handlers would never run again. > Note also this requires to decide if a thing should have a close handler during thing creation or at least during property assignment and not during GC phase. But this true about generators.
Assignee | ||
Comment 90•18 years ago
|
||
(In reply to comment #89) > (In reply to comment #88) > > (In reply to comment #87) > > > Then you need state to flag the closed nature of the object, so you don't close > > > it again before it is finalized by a later GC. This seems problematic both on > > > paper and in real implementations. > > > > But this exactly that proposal of a list of things with close handlers is > > about: things can only be put there once and when removed before the run of the > > close handlers, their close handlers would never run again. Ok, that's what I was calling problematic, but it seems inevitable. New patch in a little bit. /be
Comment 91•18 years ago
|
||
Comment on attachment 223826 [details] [diff] [review] coroutine generators, a la Python 2.5, v3 >--- jsgc.c 22 May 2006 22:46:07 -0000 3.139 >+++ jsgc.c 30 May 2006 19:21:08 -0000 >@@ -745,31 +745,32 @@ js_NewGCThing(JSContext *cx, uintN flags > flbase[flindex] = thing->next; > METER(rt->gcStats.localalloc++); /* this is not thread-safe */ > goto success; > } > > JS_LOCK_GC(rt); > gcLocked = JS_TRUE; > #endif >- JS_ASSERT(!rt->gcRunning); >- if (rt->gcRunning) { >+ JS_ASSERT(!rt->gcRunning || rt->gcInClosePhase); >+ if (rt->gcRunning && !rt->gcInClosePhase) { > METER(rt->gcStats.finalfail++); > JS_UNLOCK_GC(rt); > return NULL; > } > >+ triedGC = rt->gcInClosePhase; > #ifdef TOO_MUCH_GC >+ if (!triedGC) { > #ifdef WAY_TOO_MUCH_GC >- rt->gcPoke = JS_TRUE; >+ rt->gcPoke = JS_TRUE; > #endif >- js_GC(cx, GC_KEEP_ATOMS | GC_ALREADY_LOCKED); >- triedGC = JS_TRUE; >-#else >- triedGC = JS_FALSE; >+ js_GC(cx, GC_KEEP_ATOMS | GC_ALREADY_LOCKED); >+ triedGC = JS_TRUE; >+ } > #endif The changes here should disable populating thread-local-free list when invoked in the close handler. Otherwise after the finalization the threda-local-list would point to deleted GC arenas.
Comment 92•18 years ago
|
||
(In reply to comment #90) > (In reply to comment #89) > > > But this exactly that proposal of a list of things with close handlers is > > > about: things can only be put there once and when removed before the run of the > > > close handlers, their close handlers would never run again. > > Ok, that's what I was calling problematic, but it seems inevitable. But how a list (or flag or hashtable etc.) can be more problematic then js_ClosePhaseInterrupt ;) ?
Assignee | ||
Comment 93•18 years ago
|
||
(In reply to comment #92) > But how a list (or flag or hashtable etc.) can be more problematic then > js_ClosePhaseInterrupt ;) ? It's not, but penalizing the close phase does not bother me so much as adding costs to allocations of close-able objects. I was hoping to use this protocol to eliminate the explicit GC root for the parent slot of all native iterators, using a native close hook (so no interrupt overhead). But iteration is not free, and what you propose works too -- thanks (patch very soon). /be
Comment 94•18 years ago
|
||
(In reply to comment #93) > ... but penalizing the close phase does not bother me so much as adding > costs to allocations of close-able objects. There is always bug 331966 to reduce allocation overhead.
Comment 95•18 years ago
|
||
(In reply to comment #93) > I was hoping to use this protocol > to eliminate the explicit GC root for the parent slot of all native iterators, > using a native close hook (so no interrupt overhead). BTW, that /* * Root the parent slot so we can get it even in our finalizer (otherwise, * it would live as long as we do, but it might be finalized first). */ if (!js_AddRoot(cx, &iterobj->slots[JSSLOT_PARENT], "iterobj->parent")) return JS_FALSE; looks wrong as slot array can be reallocated. Or native iterators are always sealed or inaccessible from scripts?
Assignee | ||
Comment 96•18 years ago
|
||
(In reply to comment #95) > (In reply to comment #93) > > I was hoping to use this protocol > > to eliminate the explicit GC root for the parent slot of all native iterators, > > using a native close hook (so no interrupt overhead). > > BTW, that > > /* > * Root the parent slot so we can get it even in our finalizer (otherwise, > * it would live as long as we do, but it might be finalized first). > */ > if (!js_AddRoot(cx, &iterobj->slots[JSSLOT_PARENT], "iterobj->parent")) > return JS_FALSE; > > looks wrong as slot array can be reallocated. Or native iterators are always > sealed or inaccessible from scripts? They have been, up till now! That's very old code, dating back to 1998 or so when fur introduced the OBJ_ENUMERATE protocol (which unfortunately requires obj, the iterable, to be kept alive until after the iterator is finalized, in order to make the JSENUMERATE_DESTROY callback). Native iterators are now exposed to script, where they were hidden before. So they must use the close phase instead of this rooting mess. However, GC will have to restart on shutdown to collect any closed object. D'oh! /be
Assignee | ||
Comment 97•18 years ago
|
||
(In reply to comment #84) > This behavior applies to .next(), not to .throw(). See the end of the first > paragraph in the .throw() spec in PEP 342: > > If the generator is already in the closed state, throw() just raises the > exception it was passed without executing any of the generator's code. Thanks, missed that -- easily fixed. New patch is coming, I've been held up by more pressing deadlines and some testing gremlins. /be
Comment 98•18 years ago
|
||
When loading in the browser, <script type="text/javascript;version=1.7"></script> is ignored. Do we want to make this available to all versions < 1.7? In the browser on windows, the test throws Operation Failed: TypeError: jsval has invalid __iterator__ value undefined during the page load. Using venkman shows this error occurs in several places. Brendan, is js1_7/generators/ a good place for these tests? FAILED!: Error loading script BUGNUMBER: 326466 STATUS: Implement Pythonic generators and iteration protocol support FAILED!: Implement Pythonic generators and iteration protocol support FAILED!: Expected value '0,1,1,2,3,5,8,13', Actual value '0,1,1,2,3,5,8,13' FAILED!: FAILED!: Error loading script
Assignee | ||
Comment 99•18 years ago
|
||
Attachment #223826 -
Attachment is obsolete: true
Attachment #223826 -
Flags: superreview?(shaver)
Attachment #223826 -
Flags: review?(mrbkap)
Attachment #223826 -
Flags: review?(igor.bukanov)
Assignee | ||
Comment 100•18 years ago
|
||
Comment 101•18 years ago
|
||
Comment on attachment 224118 [details] [diff] [review] diff -w version of last attachment Quick observations about jsgc.c changes: >Index: jsgc.c >=================================================================== > > void * > js_NewGCThing(JSContext *cx, uintN flags, size_t nbytes) .... > { > #ifdef JS_THREADSAFE > gcLocked = JS_FALSE; > JS_ASSERT(cx->thread); > flbase = cx->thread->gcFreeLists; > JS_ASSERT(flbase); > thing = flbase[flindex]; > if (thing) { > flagp = thing->flagp; > flbase[flindex] = thing->next; > METER(rt->gcStats.localalloc++); /* this is not thread-safe */ >+ if (inClosePhase) >+ flags |= GCF_MARK; > goto success; > } Since js_GC nukes thread-local free lists before the marking phase and other changes in NewGCThing ensure that the lists are not repopulated during the close phase, inClosePhase must always be here. Also perhaps it is simpler to add GCF_MARK once where flags are updated after successful allocation? >@@ -2224,19 +2237,82 @@ js_GC(): >+ JS_INIT_CLIST(&todo); >+ link = rt->gcCloseList.next; >+ do { >+ item = (JSCloseListItem *) link; >+ link = link->next; >+ obj = item->object; >+ if (!(*js_GetGCThingFlags(obj) & GCF_MARK)) { >+ JS_REMOVE_LINK(&item->links); >+ JS_APPEND_LINK(&item->links, &todo); >+ GC_MARK(cx, obj, "close-phase object"); >+ } >+ } while (link != &rt->gcCloseList); There is a missed call after the loop to ScanDelayedChildren(cx) here to finish marking in case of too deep recursion.
Comment 102•18 years ago
|
||
BTW, the last patch explicitly exposes the close handler list and leaves the management of the memory for list's cell to the user. This in turn forced to add a finalizer to Iterator. But what about having api just to register the object with close handler and hide the list completely from the user? Then the implementation can allocate the list cells from GC pages providing very cheap and mostly-lock-free allocation option.
Assignee | ||
Comment 103•18 years ago
|
||
(In reply to comment #101) > > #ifdef JS_THREADSAFE > > gcLocked = JS_FALSE; > > JS_ASSERT(cx->thread); > > flbase = cx->thread->gcFreeLists; > > JS_ASSERT(flbase); > > thing = flbase[flindex]; > > if (thing) { > > flagp = thing->flagp; > > flbase[flindex] = thing->next; > > METER(rt->gcStats.localalloc++); /* this is not thread-safe */ > >+ if (inClosePhase) > >+ flags |= GCF_MARK; > > goto success; > > } > > Since js_GC nukes thread-local free lists before the marking phase and other > changes in NewGCThing ensure that the lists are not repopulated during the > close phase, inClosePhase must always be here. "false here", of course -- thanks. > Also perhaps it is simpler to > add GCF_MARK once where flags are updated after successful allocation? I thought of that, but wanted to combine the test guarding the mark bit flipping with the tests you pointed out are needed to avoid repopulating the lists. I'll reconsider this when recovering from the merge conflicts from the patch for bug 338653. ;-) > There is a missed call after the loop to ScanDelayedChildren(cx) here to finish > marking in case of too deep recursion. Thanks again -- I need to pay attention to your post-Deutsch-Schorr-Waite work better here. (In reply to comment #102) > BTW, the last patch explicitly exposes the close handler list and leaves the > management of the memory for list's cell to the user. This in turn forced to > add a finalizer to Iterator. But what about having api just to register the > object with close handler and hide the list completely from the user? > > Then the implementation can allocate the list cells from GC pages providing > very cheap and mostly-lock-free allocation option. Sure, I thought of that. When only generators needed close, it seemed better to make only that class pay the price. Now that iterators do too, it's less clear. But OS mallocs are often thread-free and snappy these days. Are we sure we can beat 'em? You may have some synthetic benchmark results already. /be
Assignee | ||
Comment 104•18 years ago
|
||
Blake, please review too. This is much improved thanks to Igor's comments. I had fretted about adding another test to js_NewObject, to test for extended class with non-null close hook, but we want universal close phase support without requiring each such class to do extra work in its constructors. /be
Attachment #224117 -
Attachment is obsolete: true
Attachment #224118 -
Attachment is obsolete: true
Attachment #224157 -
Flags: superreview?(shaver)
Attachment #224157 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224157 -
Flags: review?(mrbkap)
Comment 105•18 years ago
|
||
> > Sure, I thought of that. When only generators needed close, it seemed better > to make only that class pay the price. Now that iterators do too, it's less > clear. Here is a possible implementation that saves memory compared with simple linked list approach even for the case of generators that have to allocate private data in any case. Suppose all objects with a close handler are put to a malloced array. Then during the close phase the array is scanned and all unmarked elements marked and moved to the array tail. Then the close handler is executed for all tail elements, they are removed from the array and the array is trimmed if necessary. With doubling of array capacity on a full array append this would gives the overhead of extra 1.5 word per each object with a close handler on average compared with 2 words when using a linked list. Alternatively one can use GCF_LOCK bit to mark things that wants close handles. Currently LOCK bit is set for all locked objects but this is unnecessary for JSObject since presence of the object in the lock hash table is enough. So GCF_LOCK can be used for marking objects with close handles with a linked list of GC pages that contains the iterators for faster scanning. So if details of the registration of closable objects are hidden, the there are lot of ways to experiment with touching few places in jsgc.c that would actually safe memory and improve performance. > But OS mallocs are often thread-free and snappy these days. Are we sure > we can beat 'em? Was malloc overhead the reason to use GC things for object slots despite the extra complexity the change brought?
Comment 106•18 years ago
|
||
Comment on attachment 224157 [details] [diff] [review] coroutine generators, a la Python 2.5, v5 There is a bad interaction in js_GC between the close phase and GC callback. When XPCJSRuntime::GCCallback from xpcjsruntime.cpp receives JSGC_MARK_END it marks autoroots and then immediately starts to prepare for finalization using JS_IsAboutToBeFinalized. But it does not work if the close phase resurrects the referenced objects. On the other hand the callback can not be called after the unreachable close handlers are marked. The handlers in this case can be referenced through autoroots and be in fact reachable. I hit a similar problem during implementation of that non-recursive GC - see comments in js_MarkGCThing where this addressed via cx->insideGCMarkCallback flag. But AFAICS it is not possible to address this for close handlers without splitting JSGC_MARK_END into something like JSGC_MARK and JSGC_FINALIZATION_START.
Assignee | ||
Comment 107•18 years ago
|
||
(In reply to comment #105) > So if details of the registration of closable objects are hidden, the there are > lot of ways to experiment with touching few places in jsgc.c that would > actually safe memory and improve performance. Details are hidden with the latest patch, so we can improve this in a followup patch. The schedule pressure is to get this right for the 1.8 branch soon, but followup patches are easier and do not face that pressure. So I'll try to get this to a better place today, and run a new patch by you. After that, if you can take over optimizing close-phase bookkeeping, that would be excellent. > > But OS mallocs are often thread-free and snappy these days. Are we sure > > we can beat 'em? > > Was malloc overhead the reason to use GC things for object slots despite the > extra complexity the change brought? That was not based on measurements, but was more driven toward the goal of fusing initial slots with objects, and proving that the size-classifying GC worked. We should do as you propose in bug 331966 -- I'll comment there shortly. (In reply to comment #106) > (From update of attachment 224157 [details] [diff] [review] [edit]) > I hit a similar problem during implementation of that non-recursive GC - see > comments in js_MarkGCThing where this addressed via cx->insideGCMarkCallback > flag. But AFAICS it is not possible to address this for close handlers without > splitting JSGC_MARK_END into something like JSGC_MARK and > JSGC_FINALIZATION_START. I'll do this too. No way to avoid an API change here, but I believe XPConnect is the only client of this API (I don't have insight into all SpiderMonkey embeddings of course, but the major ones such as tellme.com's don't use this API). Thanks, /be
Comment 108•18 years ago
|
||
Comment on attachment 224157 [details] [diff] [review] coroutine generators, a la Python 2.5, v5 >=================================================================== >RCS file: /cvsroot/mozilla/js/src/jsgc.c,v >@@ -1956,16 +1966,17 @@ void > js_GC(JSContext *cx, uintN gcflags) > { >+ /* >+ * Mark GC-things used to remember objects that need the close phase, but >+ * don't mark the objects, which are weakly referenced. >+ */ >+ itemp = &rt->gcCloseList; >+ for (item = *itemp; item; item = item->next) >+ GC_MARK(cx, item, "close-list item"); >+ ... >+ rt->gcClosePhase = 1; >+ >+ todop = &todo; >+ do { >+ obj = item->object; >+ if (*js_GetGCThingFlags(obj) & GCF_MARK) { >+ itemp = &item->next; >+ } else { >+ *itemp = item->next; >+ *todop = item; >+ todop = &item->next; >+ GC_MARK(cx, obj, "close-phase object"); >+ } >+ } while ((item = *itemp) != NULL); >+ *todop = NULL; It is necessary to mark only those rt->gcCloseList items that corresponds to marked item->object. The rest from todo list should be safe for finalization right after the close phase.
Assignee | ||
Comment 109•18 years ago
|
||
(In reply to comment #108) > It is necessary to mark only those rt->gcCloseList items that corresponds to > marked item->object. The rest from todo list should be safe for finalization > right after the close phase. Sure, but is it cheaper to mark everything, or call js_GetGCThingFlags on item->object and not mark item if object is not marked? Anyway I think I'll take your advice and try to reduce close-phase space overhead even more. It seems if I move the JSGC_MARK_END callback to after the close phase, all is well in XPConnect. Does that seem right? /be
Assignee | ||
Comment 110•18 years ago
|
||
I tried to minimize code for managing rt->gcCloseTable. Let me know if you can think of easy wins, so this can go in soon for more thorough baking. /be
Attachment #224157 -
Attachment is obsolete: true
Attachment #224291 -
Flags: superreview?(shaver)
Attachment #224291 -
Flags: review?(igor.bukanov)
Attachment #224157 -
Flags: superreview?(shaver)
Attachment #224157 -
Flags: review?(mrbkap)
Attachment #224157 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224291 -
Flags: review?(mrbkap)
Assignee | ||
Comment 111•18 years ago
|
||
Attachment #224291 -
Attachment is obsolete: true
Attachment #224292 -
Flags: superreview?(shaver)
Attachment #224292 -
Flags: review?(igor.bukanov)
Attachment #224291 -
Flags: superreview?(shaver)
Attachment #224291 -
Flags: review?(mrbkap)
Attachment #224291 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224292 -
Flags: review?(mrbkap)
Assignee | ||
Comment 112•18 years ago
|
||
Found the right patch, finally. /be
Attachment #224292 -
Attachment is obsolete: true
Attachment #224293 -
Flags: superreview?(shaver)
Attachment #224293 -
Flags: review?(igor.bukanov)
Attachment #224292 -
Flags: superreview?(shaver)
Attachment #224292 -
Flags: review?(mrbkap)
Attachment #224292 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224293 -
Flags: review?(mrbkap)
Comment 113•18 years ago
|
||
Comment on attachment 224293 [details] [diff] [review] coroutine generators, a la Python 2.5, v6b >Index: jsgc.c >=================================================================== >RCS file: /cvsroot/mozilla/js/src/jsgc.c,v >+ >+ while (index < count) { >+ obj = array[index++]; >+ xclasp = (JSExtendedClass *) LOCKED_OBJ_GET_CLASS(obj); >+ JS_ASSERT(xclasp->base.flags & JSCLASS_IS_EXTENDED); >+ xclasp->close(cx, obj); >+ } Since close handlers can add more entries to the close table, the code in the loop can not refer to the local copy of rt->gcCloseTable.array since it can reallocated. For the same reason when shrinking the table all entities inside [count, rt->gcCloseTable.count) that were added by the close handlers should be moved. > JS_ASSERT(!cx->insideGCMarkCallback); > if (rt->gcCallback) { > cx->insideGCMarkCallback = JS_TRUE; > (void) rt->gcCallback(cx, JSGC_MARK_END); > JS_ASSERT(cx->insideGCMarkCallback); > cx->insideGCMarkCallback = JS_FALSE; > } > JS_ASSERT(rt->gcUnscannedBagSize == 0); This does not yet add a new callback GC for the price of potentially invoking the close handlers too early as autoroots can point to closable objects?
Assignee | ||
Comment 114•18 years ago
|
||
(In reply to comment #113) > (From update of attachment 224293 [details] [diff] [review] [edit]) > Since close handlers can add more entries to the close table, the code in the > loop can not refer to the local copy of rt->gcCloseTable.array since it can > reallocated. For the same reason when shrinking the table all entities inside > [count, rt->gcCloseTable.count) that were added by the close handlers should be > moved. Yes, I knew that once -- when using the linked list approach -- but forgot it in recent days. Sleep is good. Thanks, fixing. > > JS_ASSERT(!cx->insideGCMarkCallback); > > if (rt->gcCallback) { > > cx->insideGCMarkCallback = JS_TRUE; > > (void) rt->gcCallback(cx, JSGC_MARK_END); > > JS_ASSERT(cx->insideGCMarkCallback); > > cx->insideGCMarkCallback = JS_FALSE; > > } > > JS_ASSERT(rt->gcUnscannedBagSize == 0); > > This does not yet add a new callback GC for the price of potentially invoking > the close handlers too early as autoroots can point to closable objects? It can (dbaron and I were discussing this yesterday), since XPConnect has a nasty habit of marking from JSGC_MARK_END. I've field bug 340212 on this issue. In practice it is probably not real, as iterators have strong refs from the JS stack that will mark them and prevent premature close. We should fix it, but it shouldn't hold up this bug. /be
Assignee | ||
Comment 115•18 years ago
|
||
Attachment #224293 -
Attachment is obsolete: true
Attachment #224317 -
Flags: superreview?(shaver)
Attachment #224317 -
Flags: review?(igor.bukanov)
Attachment #224293 -
Flags: superreview?(shaver)
Attachment #224293 -
Flags: review?(mrbkap)
Attachment #224293 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224317 -
Flags: review?(mrbkap)
Comment 116•18 years ago
|
||
Comment on attachment 224317 [details] [diff] [review] coroutine generators, a la Python 2.5, v6c >+ /* >+ * Poke the GC in case any marked object is really garbage. This will >+ * cause a restart to collect such objects. Since such objects are no >+ * longer in rt->gcCloseTable, they will not be closed again -- they >+ * will only finalized if they are indeed collectible. >+ */ >+ if (gcflags & GC_LAST_CONTEXT) >+ rt->gcPoke = JS_TRUE; >+ } >+ The last problem: if a close handler creates an object with the same close handler when GC_LAST_CONTEXT set, than we wood get infinite GC loop on shutdown. A simple workaround like ignoring ads to the close table during the shutdown should be good enough. With that I would add that r+ ;).
Comment 117•18 years ago
|
||
(In reply to comment #116) > A simple workaround like ignoring ads to the close table during the > shutdown should be good enough. With that I would add that r+ ;). Or even better would be to change js_AddObjectToCloseTable to fail on shutdown.
Assignee | ||
Comment 118•18 years ago
|
||
I chose to make js_AddObjectToCloseList no-op with a DEBUG message to stderr if called when landing the runtime. For Mozilla-like embeddings this means shutdown, but other embeddings may use many threads symmetrically, and can go through quiet periods of zero contexts live on a runtime. So this is not just a shutdown-time restriction on the new API -- really this says that JS API users must not allocate objects needing the close phase from other such objects' close hooks at any time. This restriction may be onerous. We shall see. An alternative rule that prevents ilooping requires either an arbitrary watchdog on the number of iterations, or some new fancy graph algorithm that can ensure convergence. I'll leave attempting removal of this restriction for later. Note that a misbehaving (native) GC callback handling JSGC_END can iloop the last GC already (see clarified comment in jscntxt.c), but that's not a hazard given the native-code-only nature of the GC callback. In contrast, close hooks call close methods for generators, which can run arbitrary JS, so they can't be trusted not be badly buggy or downright evil. /be
Attachment #224317 -
Attachment is obsolete: true
Attachment #224317 -
Flags: superreview?(shaver)
Attachment #224317 -
Flags: review?(mrbkap)
Attachment #224317 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224322 -
Flags: superreview?(shaver)
Attachment #224322 -
Flags: review?(igor.bukanov)
Assignee | ||
Updated•18 years ago
|
Attachment #224322 -
Flags: review?(mrbkap)
Comment 119•18 years ago
|
||
Comment on attachment 224322 [details] [diff] [review] coroutine generators, a la Python 2.5, v6d here is the promised r+.
Attachment #224322 -
Flags: review?(igor.bukanov) → review+
Comment 120•18 years ago
|
||
Strange stats. I added a printout about table size to JS_GC and got in js shell: js> for (var i in [1,2,3]) { gc(); } gc(); gc(); close table size: 4 before 9232, after 9232, break 097b4000 close table size: 3 before 9232, after 9232, break 097b4000 close table size: 3 before 9232, after 9232, break 097b4000 close table size: 5 before 9232, after 9232, break 097b4000 close table size: 3 before 9232, after 9232, break 097b4000 Strange: I assumed that one close entry would come from Iterator.__prototype__ and another from the iterator itself but apparaently at one momemnt I got 5 of them. I need to follow this in debugger... for (var i in [1]) gc(); gc(); gc();
Assignee | ||
Comment 121•18 years ago
|
||
With a printf like that at the very bottom of js_GC: $ ./Darwin_DBG.OBJ/js js> for(var i in [1])gc(); gc(); gc() close table count 3 before 9232, after 9232, break 01205000 close table count 3 before 9232, after 9232, break 01205000 close table count 3 before 9232, after 9232, break 01205000 js> quit() close table count 0 $ One for the Iterator prototype, one for the [1] array, and one for Array.prototype. No variation with more elements to iterate. /be
Comment 122•18 years ago
|
||
(In reply to comment #121) > With a printf like that at the very bottom of js_GC: My stats were for printout at the beginning. Apparently js_DefaultIterator is called 3 times producing 2 extra iterator objects at the beging of iterations. Weired.
Comment 123•18 years ago
|
||
(In reply to comment #122) > (In reply to comment #121) > > With a printf like that at the very bottom of js_GC: > > My stats were for printout at the beginning. Apparently js_DefaultIterator is > called 3 times producing 2 extra iterator objects at the beging of iterations. > Weired. > This was for iterators for prototype chain: one for array.prototype and one for Object.prototype. So this is not a bug but rather not-optimized code.
Assignee | ||
Comment 124•18 years ago
|
||
There is a bug: with lazy standard class init, checking for StopIteration to avoid reinitializing iterator-related classes is not enough, so we end up wasting time and space making a prototype for Iterator, then recreating it. v6e patch shortly. Optimizing prototype enumeration to avoid creating iterators if there are no enumerable properties is something we never did. There was always an internal iterator object. If it can be done cheaply (JSScope flag?), sure. /be
Assignee | ||
Comment 125•18 years ago
|
||
Oh good, I'm wrong: the double-init guard in js_InitIteratorClasses works as expected. The extra close-able object is the anonymous Generator class's prototype. So Iterator.prototype, <Generator>.prototype, [1], Array.prototype, and Object.prototype for 5 total. Whew. No new patch needed. /be
Comment 126•18 years ago
|
||
This checkin is causing a failure in the bloat test, which is the cause of the orange on balsa and nye. I can reproduce it in a local debug build, and backing out this patch fixes it. The line that is failing is: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/build/bloatcycle.html&mark=27#26 with: JavaScript error: , line 0: uncaught exception: [Exception... "Unexpected error" nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)" location: "JS frame :: resource:///res/bloatcycle.html :: loadURL :: line 27" data: no] Evaluating "w.location.href" at that line is raising a "Permission denied to get property Location.href" exception.
Comment 127•18 years ago
|
||
It looks like this bumped RLk from 600B to 700B on nye: XPCWrappedNative 192 0.00% nsXPCComponents 52 0.00% XPCWrappedNative 240 25.00% nsXPCComponents 104 100.00%
Comment 128•18 years ago
|
||
The stacks for the new leaks on nye are: (52 bytes leaked) operator new(unsigned int) nsXPCComponents::AttachNewComponentsObject(XPCCallContext&, XPCWrappedNativeScope*, JSObject*) nsXPConnect::InitClasses(JSContext*, JSObject*) jsdService::OnForRuntime(JSRuntime*) jsdASObserver::Observe(nsISupports*, char const*, unsigned short const*) nsAppStartupNotifier::Observe(nsISupports*, char const*, unsigned short const*) seamonkey-bin+6A0B seamonkey-bin+7B46 (48 bytes leaked) XPCWrappedNative::GetNewOrUsed(XPCCallContext&, nsISupports*, XPCWrappedNativeScope*, XPCNativeInterface*, XPCWrappedNative**) nsXPCComponents::AttachNewComponentsObject(XPCCallContext&, XPCWrappedNativeScope*, JSObject*) nsXPConnect::InitClasses(JSContext*, JSObject*) jsdService::OnForRuntime(JSRuntime*) jsdASObserver::Observe(nsISupports*, char const*, unsigned short const*) nsAppStartupNotifier::Observe(nsISupports*, char const*, unsigned short const*) seamonkey-bin+6A0B seamonkey-bin+7B46
Assignee | ||
Comment 129•18 years ago
|
||
(In reply to comment #126) > This checkin is causing a failure in the bloat test, which is the cause of the > orange on balsa and nye. Gavin kindly filed bug 340340 on this latent bug (or slew of bugs -- not sure how fixable the bad pattern is). /be
Updated•18 years ago
|
Comment 130•18 years ago
|
||
The currently implemented iterator protocol exposes the iterator object to scripts and thus allows to access and manipulate iterator's state object from another thread. This requires that all access to the state objects should be thread-safe. But all implementations of the enumeration hook that present in Mozilla CVS assumes single thread access and do not use any locking. AFAICS the only realistic solution to this is to require that the iterator can only be enumerated using for-in loop ensuring single-thread access to the state. Comments?
Assignee | ||
Comment 131•18 years ago
|
||
(In reply to comment #130) > AFAICS the only realistic solution to this is to require that the iterator can > only be enumerated using for-in loop ensuring single-thread access to the > state. Comments? I'll admit that thread-safety has been a low priority for me up to this point in development, given Gecko's main-thread-only restrictions, and in light of all the other work to do. The lock-free and related request-amortized jslock.h facilities should be usable to synchronize .next() mutators, although the locking cannot all be pushed under the JSObjectOps (and now JSXMLObjectOps) layers -- the state slot is updated by iterator_next. The internally-created iterators used by for-in loops when *enumerating* (i.e. walking up the prototype chain finding unshadowed, enumerable properties that have not been deleted after start of the loop) plain old iterables -- objects that are not iterators -- are almost always inaccessible. More in a bit. /be
Comment 132•18 years ago
|
||
(In reply to comment #131) > > The lock-free and related request-amortized jslock.h facilities should be > usable to synchronize .next() mutators, although the locking cannot all be > pushed under the JSObjectOps (and now JSXMLObjectOps) layers -- the state slot > is updated by iterator_next. <rant> So the idea is to pay an extremely high performance and code maintance price to allow to manipulate explicitly the iterator state just to follow a language that was never designed to cope with untrusted code and where thread safty and GC was mostly an afterthought? </rant> Seriously, just consider what has to be done to correctly deal with instanceOfGenerator.next() calling instanceOfGenerator.next() while another thread tries to call instanceOfGenerator.next().
Assignee | ||
Comment 133•18 years ago
|
||
(In reply to comment #132) > (In reply to comment #131) > > > > The lock-free and related request-amortized jslock.h facilities should be > > usable to synchronize .next() mutators, although the locking cannot all be > > pushed under the JSObjectOps (and now JSXMLObjectOps) layers -- the state slot > > is updated by iterator_next. > > <rant> > So the idea is to pay an extremely high performance and code maintance price to > allow to manipulate explicitly the iterator state just to follow a language > that was never designed to cope with untrusted code and where thread safty and > GC was mostly an afterthought? > </rant> No need to rant. I don't think Python's lack of embedding in mixed-trust web browsers, or notorious GIL, are relevant. Consider Lua, which has iterators of several flavors, including "stateless" (really, just local state that's well isolated). It's not obviously easier or harder to make thread-safe for all the iterator flavors. As always, the common case is single-threaded, so we can and will bias for that, just as we did with jslock.[ch]. > Seriously, just consider what has to be done to correctly deal with > instanceOfGenerator.next() calling instanceOfGenerator.next() while another > thread tries to call instanceOfGenerator.next(). Generators relieve the need for threads, and again we are not exposing threads to web JS, so this is a hard case. In fact it sounds like a nonsense case. We do not need to optimize it, or deoptimize the common ST case for it. We could even make it illegal. /be
Comment 134•18 years ago
|
||
(In reply to comment #133) > No need to rant. I don't think Python's lack of embedding in mixed-trust web > browsers, or notorious GIL, are relevant. Consider the following code: function make_iter() { var iter = (function() { yield 0; })(); iter.close = make_iter; } make_iter(); In the current implementation this will be executed on each following GC even when browser would go to a different page. This is bad so API should be provided that will allow to treat close handlers like timer hooks. Now lets modify the example: function make_iter() { var iter = (function() { yield 0; })(); iter.close = function() { while (true); } while (true) make_iter(); } make_iter(); This constantly generates objects with close handlers with infinite loops. Now the protection against too long running scripts would stop the first loop but what about all those close hooks? I.e. what should runtime do when it finds that a close hook takes too long time to execute? Should it just skip and go to another forever running hook? Or should it kill all close hooks including legitimate ones and introduce leaks? Python can simply ignore such questions since it does not need to execute untrusted scripts. But SM has to worry about them so any new feature should be considered for potential exploits/denial-of-service. A design should allow to address such problems. My experience so far with close hooks and explicit iterator state manipulation on top of current SM implementation tells that it would be a long story before all the problems would be resolved. Thus the idea is to remove the most problematic feature for now while supporting 95% of useful cases.
Assignee | ||
Comment 135•18 years ago
|
||
Good to get off threads, and onto threats the browser faces. I don't mean to be glib about threads, and SpiderMonkey is embedded in many symmetric massively MT systems, so we will care about making shared-iterator hard cases thread-safe. But not all before checking in the main patch for PEP-342-inspired generators. SpiderMonkey originally was ST only, and used ASCII. It has grown. I am still confident in the benefits of Pythonic iterators and generators, not only the mindshare benefit but also two-way benefits between JS and Python as we add Python scripting support to Gecko. On to the threats, which as you note re: timeouts, are not new: * We need an API to prevent close hooks from leaking across page loads. * We again need to police iloops in scripts. We do that today, it's costly and there are bugs on file (bug at least) asking for more policy control. The threats are not huge in practice. As James Gosling said 11 years ago, D.O.S. attacks are always possible in any useful system, but not serious threats in the real world (this is paraphrased from memory, and possibly some of it is me, not Gosling). Java has "user-defined" finalizers, but "user" here means something akin to trusted plugin code (which happens to be virtual machine bytecode rather than real machine code). AFAIK it does not defend. However small the threat in likelihood, the cost is high. If risk = cost x probability, cost is high, and we are not able to control probability, then we can't just do what Java did, and stipulated a certain amount of trust. So I agree, we should defend against misbehaving close implementations. Against the benefits and in light of the threats being akin to existing ones that we cope with, you want to go back to what? PEP-255-style generators, which have no send/throw/close methods? Or just get rid of close, but leave send and throw? Losing close means we must once again forbid yield from a try block. Besides the break in symmetry with Python, we lose useful cases (I know shaver cares about these use-cases -- see comment 41), where you can use try/finally with a PEP-342-style generator to implement automated resource acquire/release patterns in JS. These before/after automation cases could be supported differently, but they come for free with generators. I have a simple counter-proposal for you: let's restrict close hooks to trusted scripts (chrome, UniversalBrowserWrite). /be
Assignee | ||
Comment 136•18 years ago
|
||
(In reply to comment #135) > * We again need to police iloops in scripts. We do that today, it's costly and > there are bugs on file (bug at least) asking for more policy control. I meant "(bug 230909 at least)". /be
Comment 137•18 years ago
|
||
Here is another way to deal with the issues like thread-safty and DOS: 1. Allow to use iterations method of the iterator only from the thread that creates the iterator object. Accessing iter.next and friend from another thread should throw a well defined exception like BadIteratorAccess. This would take care of thread-safety without imposing any significan cost on the current implementation. 2. Properly detect and throw BadIteratorAccess for pathological cases like: var iter; function generator() { yield iter.next(); } iter = generator(); iter.next(); That is, calling iter.next and friends from inside iter.next and friends should generate BadIteratorAccess. A user-implemented iterator can ignore this requirements, but all system-defined iterators should enforce it. 3. Do not call iter.close during GC. Instead define that for (i in iter) loopBody; is equivalent to the following: try { for (;;) { var hiddenTemporary = iter.next(); try { i = hiddenTemporary; loopBody; } catch (e) { iter.throw(e); } } } catch (e if e == StopIterartion) { } finally { iter.close(); } To support scripts that want to code the loop explicitly and forget to call iter.close the runtime should properly release all its objects during GC with no attempt to call the close hook. In the current SpiderMonkey implementation it just require to finalize the iterator state before the corresponding object without any notion of close phase. 4. To support scripts that want to detect when some object becomes unreachable provide a call like: defineUnreachableObjectHook(object, hook); When the object becomes unreachable, the runtime will eventually call the hook *unless* the hook GC-ed itself. This is enough to supports scripts that want for debugging purposes to detect missed close calls for custom iterators or to cover other cases when a script want to deal with various forms of leakages. On the other hand the provision that the hook is not called if it is collected itself will take care about hooks defined for already GC-ed browser scripts. Then an API can be provided so an embedding can schedule calls to hooks in a manner similar to timer calls. In this way dealing with running forever close hooks becomes equivalent to dealing with forever running timer hooks.
Comment 138•18 years ago
|
||
(In reply to comment #137) > 3. Do not call iter.close during GC. Instead define that > > for (i in iter) loopBody; > > is equivalent to the following: > > try { > for (;;) { > var hiddenTemporary = iter.next(); > try { > i = hiddenTemporary; > loopBody; > } catch (e) { > iter.throw(e); > } > } > } catch (e if e == StopIterartion) { > } finally { > iter.close(); > } > Or even remove iter.close completely from the picture and just require that for (i in iter) { loopBody; } is equivalent to the following: try { for (;;) { var hiddenTemporary = iter.next(); try { i = hiddenTemporary; loopBody; } catch (e) { iter.trow(e); } } } catch (e) { if (e !== StopIteration) throw e; } with "break" executed inside loopBody is replaced by iter.break(); break; or if iter.break is not defined then by try { iter.throw(StopIteration); } catch (e) { if (e !== StopIteration) throw e; } break; Similarly break labelOfOuterLoop would translates into iter1.break(); iter2.break(); ... iterN.break(); break; This would fit nicely to the current implementation of native iteartors that close internal state when iter.next would throw and to generators. There if one want to use yeild inside try with finally then function gen() { try { yield 1; } finally { doSomething(); } } can be implemented as: function gen() { try { if (called-outside-for-in) { defineUnreachableObjectHook( currentIter, function() { doSomething()}); } transfer control to the caller with 1 as the result; } finally { doSomething(); } } This penalizes only users who want to mess with own for-in loop replacement while avoiding introducing new cases of DOS since finally would not be called if gen itself is collected.
Comment 139•18 years ago
|
||
Here is yet another suggestion how to avoid close phase while allowing explicit access to the iterator. One can just require that standard iterators/generators can only be used with for-in loop. Of cause it would prevent explicit manipulation like calling iter.next() at random places in code but AFAICS all such manipulations requires quite complex code to correctly implement the iterator protocol. On the hand a useful case like: function skip_even_iteration(iterator) { for (var elem in iterator) { yield elem; iterator.next(); } } would be supported.
Comment 140•18 years ago
|
||
(In reply to comment #139) > Here is yet another suggestion how to avoid close phase while allowing explicit > access to the iterator. One can just require that standard iterators/generators > can only be used with for-in loop. Of cause it would prevent explicit > manipulation like calling iter.next() at random places in code but AFAICS all > such manipulations requires quite complex code to correctly implement the > iterator protocol. The code to do iteration isn't very complex at all (verbose maybe). MochiKit impements that protocol simply and natively, and forcing for loops would definitely break support for MochiKit's iteration functions. close() shouldn't be implemented at all if it's going to be restricted like this. PEP 342 (close/send/throw) was introduced to allow iterators to be used more like co-routines -- which are *not* used in for loops. Before this addition, there was no need for close(). -bob
Assignee | ||
Comment 141•18 years ago
|
||
(In reply to comment #140) > close() shouldn't be implemented at all if it's going to be restricted like > this. PEP 342 (close/send/throw) was introduced to allow iterators to be used > more like co-routines -- which are *not* used in for loops. Before this > addition, there was no need for close(). It's clear we can support close already -- it is implemented. And we *should* support close and all of the PEP 342 inspired coroutine generator methods, this is proposed before ECMA TG1 and likely to be in ES4. IMHO we should do one of the following: 1. Restrict close usage to privileged scripts. 2. Run close hooks as if by setTimeout(bind(close, gen), 0), which we can control as we do (as well as we do...) setTimeout and setInterval chains today (there was a bug where some code created 2^n intervals, using setInterval to arm itself -- IE wackily suppresses the setInterval calls from the evaluated expression, but Gecko does not -- this made for a pretty effective D.O.S.). Implementing 1 is easy, but it pushes hard problems off on policy modules, signed script developers, and end-users. Implementing 2 can't be that hard. Igor, what do you say? /be
Comment 142•18 years ago
|
||
fyi js1_7/geniter/326466-01.js browser trunk crashes across platforms.
on winxp you have a deleted pointer
JSTRAP_CONTINUE 0x00000001 int
+ pc 0x06a53973 "ннннннннннннннннннннннннннннннннннннннннннннннннннннн" unsigned char *
+ rt 0x00fb3ff8 {state=JSRTS_UP gcArenaList=0x00fb3ffc gcRootsHash={...} ...} JSRuntime *
+ script 0x06a53930 {code=0xdddddddd <Bad Ptr> length=0xdddddddd main=0xdddddddd <Bad Ptr> ...} JSScript *
> js3250.dll!js_Interpret(JSContext * cx=0x067508f8, unsigned char * pc=0x06a53973, long * result=0x0012ab40) Line 6140 + 0x42 bytes C
Updated•18 years ago
|
Comment 143•18 years ago
|
||
I get syntax errors with the following expressions copied from PEP 342: x = 12 + (yield 42) x = 12 + (yield) foo(yield 42) foo(yield) OTOH, the following is allowed in JS but not in Python 2.5b2: yield yield yield Is this a bug or an intentional deviation from Python?
Assignee | ||
Comment 144•18 years ago
|
||
(In reply to comment #143) > I get syntax errors with the following expressions copied from PEP 342: > > x = 12 + (yield 42) > x = 12 + (yield) > foo(yield 42) > foo(yield) Works for me in current shell: js> function f() { var x x = 12 + (yield 42) print(yield x) yield 0 } js> g = f() [object Generator] js> g.next() 42 js> g.send(1) 13 js> g.send(2) 2 0 js> g.next() uncaught exception: StopIteration What are you testing in? Can you attach a testcase? > OTOH, the following is allowed in JS but not in Python 2.5b2: > > yield yield yield > > Is this a bug or an intentional deviation from Python? I meant to allow it. Any reason not to, other than to be Pythonic? Rather, why doesn't Python allow it? /be
Assignee | ||
Comment 145•18 years ago
|
||
Ah, I see the one bug you did include: js> function f() { print(12 + (yield)) typein:32: SyntaxError: syntax error: typein:32: print(12 + (yield)) typein:32: ...................^ Could you please file it separately? Also, I'm interested in learning why yield yield ... should be illegal -- if there's a good reason, that should be a separate bug. This bug should be closed when bugs it tracks are fixed and the ES4 generators and iterators spec is more final. Thanks, /be
Comment 146•18 years ago
|
||
In 1.9 20060821 I get a failure in js1_7/geniter/326466-01.js: expected: [object Generator] actual: function fib() { var a = 0, b = 1;...}
Assignee | ||
Comment 147•18 years ago
|
||
(In reply to comment #146) > In 1.9 20060821 I get a failure in js1_7/geniter/326466-01.js: > > expected: [object Generator] actual: function fib() { var a = 0, b = > 1;...} That's bug 349362. /be
Depends on: 349362
Comment 148•18 years ago
|
||
In Python, the following code doesn't propagate StopIteration: def gen(): try: yield finally: raise StopIteration g = gen() g.next() g.close() But JS does: js> function gen() { try { yield; } finally { throw StopIteration; } } js> var g = gen(); js> g.next(); js> g.close(); uncaught exception: [object StopIteration] Is this a bug or an intended behavior?
Comment 149•18 years ago
|
||
(In reply to comment #148) > In Python, the following code doesn't propagate StopIteration: > > def gen(): > try: > yield > finally: > raise StopIteration > > g = gen() > g.next() > g.close() > > But JS does: > > js> function gen() { > try { > yield; > } finally { > throw StopIteration; > } > } > js> var g = gen(); > js> g.next(); > js> g.close(); > uncaught exception: [object StopIteration] > > Is this a bug or an intended behavior? This went together with a patch for bug 349331. With that bug fixed there is no need to catch no longer existing GeneratorExit exception and together with GeneratorExit-catch-code I removed the code to catch StopIteration exception. That was not done on purpose. But I rather like the result! It made it explicit that the normal way to quit the generator is to execute the return. Moreover, catching and ignoring StopIteration on close can easily hide bugs in a generator that explicitly manipulates some other iterator and forgot to catch StopIteration in that code. Still if the consequence is that StopIteration should be ignored, then I will fix that in a separated bug.
Assignee | ||
Comment 150•18 years ago
|
||
(In reply to comment #149) > (In reply to comment #148) > > js> function gen() { > > try { > > yield; > > } finally { > > throw StopIteration; > > } > > } > > js> var g = gen(); > > js> g.next(); > > js> g.close(); > > uncaught exception: [object StopIteration] > > > > Is this a bug or an intended behavior? > > This went together with a patch for bug 349331. > [snip] > Still if the consequence is that StopIteration should be ignored, then I will > fix that in a separated bug. Igor filed 352580 on documenting this difference from Python 2.5. /be
Attachment #222482 -
Flags: superreview?(shaver)
Attachment #222749 -
Flags: superreview?(shaver)
Attachment #224322 -
Flags: superreview?(shaver)
Updated•17 years ago
|
Attachment #224322 -
Flags: review?(mrbkap)
Comment 151•17 years ago
|
||
> extern JS_FRIEND_API(const char *)
> -js_ValueToPrintableString(JSContext *cx, jsval v);
> +js_ValueToPrintable(JSContext *cx, jsval v, JSValueToStringFun v2sfun);
> +
> +#define js_ValueToPrintableString(cx,v) \
> + js_ValueToPrintable(cx, v, js_ValueToString)
> +
> +#define js_ValueToPrintableSource(cx,v) \
> + js_ValueToPrintable(cx, v, js_ValueToSource)
This is very nice, but breaks ABI.
Assignee | ||
Comment 152•16 years ago
|
||
Besides bug 352580, there's nothing left to do that I can see. If anyone sees something, please file new bugs. /be
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•16 years ago
|
Flags: in-testsuite-
Flags: in-litmus-
You need to log in
before you can comment on or make changes to this bug.
Description
•