Closed Bug 657367 Opened 14 years ago Closed 14 years ago

eval('("\u2028")') is evaluated as valid, but, this is illegal ECMAScript source.

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla7

People

(Reporter: utatane.tea, Assigned: Waldo)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

add EvalJSON mode to JSONParser 14 years ago Yusuke Suzuki 2.99 KB, patch		Details \| Diff \| Splinter Review
patch v2. move parsingMode condition check to out of the second loop. 14 years ago Yusuke Suzuki 3.31 KB, patch		Details \| Diff \| Splinter Review
patch v3, template based check code dispatch 14 years ago Yusuke Suzuki 6.28 KB, patch		Details \| Diff \| Splinter Review
Make eval not use the JSON parser if it finds U+2028 or U+2029 in the code string 14 years ago Jeff Walden [:Waldo] 4.53 KB, patch	n.nethercote : review+	Details \| Diff \| Splinter Review

Yusuke Suzuki

Reporter

Description

•

14 years ago

User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.68 Safari/534.24 Build Identifier: JSON is not subset of ECMAScript. StringLiteral "\u2028" (or "\u2029") is valid JSON String, but invalid ECMAScript String. but, SpiderMonkey eval parse "(...)" expression as JSON at first, so eval('("\u2028")') dosen't raise SyntaxError. I suggests removing JSON shortcut in eval. Reproducible: Always Steps to Reproduce: 1. evaluate eval('"\u2028"') 2. evaluate eval('("\u2028")') Actual Results: 1 raises SyntaxError, but 2 doesn't raise SyntaxError. Expected Results: 1 and 2 should raise SyntaxError. I saw this blog post http://timelessrepo.com/json-isnt-a-javascript-subset and remembered SpiderMonkey's eval evaluates source as JSON at first.

Tom S. (please needinfo tschuster)

Updated

•

14 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Tom S. (please needinfo tschuster)

Updated

•

14 years ago

Blocks: es5

Alice0775 White

Comment 1

•

14 years ago

Regressed Pushlog: http://hg.mozilla.org/tracemonkey/pushloghtml?fromchange=fa851447f195&tochange=b6179390c115 Suspected: 5685f8de41fa Nicholas Nethercote — Bug 578216 - Make eval(json-like string) fast. r=sayrer

Paul Biggar

Comment 2

•

14 years ago

Related discussion: http://timelessrepo.com/json-isnt-a-javascript-subset

Boris Zbarsky [:bzbarsky]

Comment 3

•

14 years ago

I think the right way to do this is to have the json parser set a flag if it sees that char and if that happens we throw it all away even if the json parse succeeded and eval instead. There's no reason to slow down the fast path for this edge case, and slowing down the edge case should be fine, imo.

Blocks: 578216

Yusuke Suzuki

Reporter

Comment 4

•

14 years ago

Attached patch add EvalJSON mode to JSONParser (obsolete) — Details — Splinter Review

> have the json parser set a flag Yes, Good idea. This patch adds EvalJSON flag (StrictJSON and reject "\u2028" + "\2029") to JSONSourceParser. thanks.

Attachment #532706 - Flags: superreview?(dmandelin)

Attachment #532706 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 5

•

14 years ago

Comment on attachment 532706 [details] [diff] [review] add EvalJSON mode to JSONParser I can probably sr if needed, but waldo's the right reviewer here.

Attachment #532706 - Flags: review?(bzbarsky) → review?(jwalden+bmo)

David Mandelin [:dmandelin]

Comment 6

•

14 years ago

Comment on attachment 532706 [details] [diff] [review] add EvalJSON mode to JSONParser No super-review needed here--just regular review.

Attachment #532706 - Flags: superreview?(dmandelin)

Nicholas Nethercote [inactive]

Comment 7

•

14 years ago

Oh how I hate \u2028 and \u2029. One comment about the patch -- that loop in readString() is very hot. It would be better to have two versions of the loop and test EvalJSON to determine which one to use.

Yusuke Suzuki

Reporter

Comment 8

•

14 years ago

Attached patch patch v2. move parsingMode condition check to out of the second loop. (obsolete) — Details — Splinter Review

> that loop in readString() is very hot thanks, I fixed this patch. move parsingMode condition (in second loop) to out of the second loop. The first loop is a little big to write twice, and iterator "start" is used in the second loop too, so I only moved parsingMode cond in the second loop and didn't create function. Buf if this cost is too heavy, it might be good idea to write loop twice or, to create boolean template parameter to specialize character check function like, template<bool AcceptLineParagraphSeparator> inline bool isJSONStringLiteralRejectChar(jschar ch) { return ch <= 0x001F; } template<> inline bool isJSONStringLiteralRejectChar<false>(jschar ch) { return ch <= 0x001F || ch == 0x2028 || ch == 0x2029; }

Attachment #532706 - Attachment is obsolete: true

Attachment #532706 - Flags: review?(jwalden+bmo)

Attachment #532845 - Flags: review?(jwalden+bmo)

Jeff Walden [:Waldo]

Assignee

Comment 9

•

14 years ago

I looked into what the other open-source engines do here. v8 doesn't try to parse likely JSON passed to eval as JSON, so they don't encounter this issue. WebKit does try to parse as JSON. They use a parser in NonStrictJSON mode. In that mode, they use this method to check string characters: template <LiteralParser::ParserMode mode> static inline bool isSafeStringCharacter(UChar c) { return (c >= ' ' && (mode == LiteralParser::StrictJSON || c <= 0xff) && c != '\\' && c != '"') || c == '\t'; } So they avoid the problem by only fast-pathing for strings containing (non-control) characters all under U+00FF, and thus for \u2028 and \u2029 they fall back to the slow path. I wonder if something similar might perhaps be the best solution. I'm also thinking about how to get the one-off bad-character check out of the first loop. That's the fast path, so keeping it lean as comment 7 notes is important. Anyway: looking at this now, thinking the patch is correct but possibly not perfection as regards performance, noodling over ideas on how to address.

Yusuke Suzuki

Reporter

Comment 10

•

14 years ago

Attached patch patch v3, template based check code dispatch — Details — Splinter Review

Thanks for your code review and very good advice. I read JSC "eval" code, and write patch that uses template specialization for char check function. When using JSON.parse, isBadStringLiteralChar is specialized static inline bool isBadStringLiteralChar(jschar c) { return c <= 0x001F; } not run \u2028 \u2029 check code. And followed recommendation, add c >= 0x00FF guard against running \u2028 \u2029 check for char like ASCII. inline bool isBadStringLiteralChar<JSONParser::EvalStringLiteral>(jschar c) { return c <= 0x001F || ((c >= 0x00FF) && (c == 0x2028 || c == 0x2029)); }

Attachment #532845 - Attachment is obsolete: true

Attachment #532845 - Flags: review?(jwalden+bmo)

Attachment #535396 - Flags: review?(jwalden+bmo)

Jeff Walden [:Waldo]

Assignee

Comment 11

•

14 years ago

The other thing that worries me about this is that, really, the parsing mode is a boolean. The only reason it's an enum is that enums are more readable. Actually making it a full-fledged enum by adding a third value to it raises my hackles. But as I said, still thinking this through.

Yusuke Suzuki

Reporter

Comment 12

•

14 years ago

Oh, sorry. I misunderstood that I was said that I should noodle this patch.

Jeff Walden [:Waldo]

Assignee

Comment 13

•

14 years ago

Oops, didn't mean to imply that -- just trying to decide what's the best-est course of action here, if we can't just not do this parse-as-JSON thing (by far the easiest solution, but also the slowest solution). One idea I keep wondering about. How much bad would it really be to just search potential candidates for JSON-parsing (i.e. start/end with '('/')') for U+2028 and U+2029, and if you find either bail on trying to use the JSON parser? Actual programs don't usually trigger the JSON parser, so most execution would be unaffected. If candidate strings are small enough, search time would be negligible. And even if they're large, it'd still be a very tight loop to search the entire string. The JSON parser wouldn't have to be touched at all, and the cost of U+2028/U+2029 checking would be paid entirely by the code that had to bear it. Thoughts?

Nicholas Nethercote [inactive]

Comment 14

•

14 years ago

> Thoughts? Try it and see. The original motivation for the parse-eval-as-JSON optimization was string-tagcloud in Sunspider, apparently a 30%-ish (13ms) win according to bug 578216; see if it's hurt by your 2028/2029 pre-scan. One thing that's changed since then is that I've sped up vanilla JS parsing by another 30% or so, so the benefit of this whole optimization will have reduced. While you're measuring, you could try turning it off altogether and see what happens.

Jeff Walden [:Waldo]

Assignee

Comment 15

•

14 years ago

Attached patch Make eval not use the JSON parser if it finds U+2028 or U+2029 in the code string — Details — Splinter Review

I tested this patch, and it doesn't change SunSpider scores. (v8 doesn't use eval in a way which could trigger JSON parsing.) I didn't try altogether removing the JSON parser invocation; the JavaScript parser may have gotten faster, but so has the JSON parser. And while I'm more than happy to slow down eval-of-JSON to check for U+2028 and U+2029 on the theory that modern code and libraries now use JSON.parse, and developers should adapt if they want the utmost performance, I hesitate at believing they've switched so completely to warrant removing the JSON parser hack altogether. So for now I think this is the best solution.

Attachment #537273 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Comment 16

•

14 years ago

Comment on attachment 537273 [details] [diff] [review] Make eval not use the JSON parser if it finds U+2028 or U+2029 in the code string Review of attachment 537273 [details] [diff] [review]: ----------------------------------------------------------------- Seems like a good compromise. Thanks for fixing this.

Attachment #537273 - Flags: review?(nnethercote) → review+

Jeff Walden [:Waldo]

Assignee

Comment 17

•

14 years ago

Comment on attachment 535396 [details] [diff] [review] patch v3, template based check code dispatch I think we're going with the eval-specific check that doesn't touch the JSON parser. But thanks for the work on the patch, and the patience with our review process and deciding exactly how to fix this! I should be able to land the patch tomorrow in the TM repository.

Attachment #535396 - Flags: review?(jwalden+bmo)

Yusuke Suzuki

Reporter

Comment 18

•

14 years ago

Thank you for your review. I'm happy that this bug is fixed!

Jeff Walden [:Waldo]

Assignee

Comment 19

•

14 years ago

http://hg.mozilla.org/tracemonkey/rev/ef4d79fc60c0

Assignee: general → jwalden+bmo

Status: NEW → ASSIGNED

Whiteboard: fixed-in-tracemonkey

Target Milestone: --- → mozilla7

Chris Leary [:cdleary] (not checking bugmail)

Comment 20

•

14 years ago

cdleary-bot mozilla-central merge info: http://hg.mozilla.org/mozilla-central/rev/ef4d79fc60c0

Chris Leary [:cdleary] (not checking bugmail)

Updated

•

14 years ago

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.