Last Comment Bug 486643 - deprecate sharp variables
: deprecate sharp variables
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 1 vote (vote)
: ---
Assigned To: Jason Orendorff [:jorendorff]
:
:
Mentors:
Depends on: WeakMap
Blocks: 566700
  Show dependency treegraph
 
Reported: 2009-04-03 01:40 PDT by Igor Bukanov
Modified: 2012-01-26 18:44 PST (History)
17 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
toSource.js - a JS implementation of {Array,Object}.prototype.toSource (16.09 KB, application/x-javascript)
2011-02-18 09:13 PST, Jason Orendorff [:jorendorff]
no flags Details
toSource.js v2 (17.14 KB, text/plain)
2011-06-16 10:20 PDT, Jason Orendorff [:jorendorff]
no flags Details
WIP 1 (22.26 KB, patch)
2011-06-16 11:41 PDT, Jason Orendorff [:jorendorff]
no flags Details | Diff | Splinter Review

Description Igor Bukanov 2009-04-03 01:40:49 PDT
+++ This bug was initially created as a clone of Bug #485164 +++

The sharp variables is a SpiderMonkey extension to JS that allows to represent cycles in the object graph. But this feature is not going to be standardized and no other implementation is going to implement it. For this reason I suggest to deprecate the sharps so they could be retired later. This would make sure that at some feature point we would avoid paying extra complexity tax for the feature that is rarely used even in SpiderMonkey-specific code. 

As alternatives to sharps during serialization of an object graph with cycles the code may use a function block. For example, #1=[#1#, #2={}, #2] would become:

(function() { var a=[, {}]; a[0] = a; a[2] = a[1]; return a; })()

Although this is definitely longer, it is compatible with EcmaScript 3. In addition, it would be closer to the original source that generated the serialized array in the first place.

With this change the compiler may start to generate warnings on any usage of sharps in the source.
Comment 1 Brendan Eich [:brendan] 2010-04-09 14:58:13 PDT
Functional abstraction violates Tennent's Correspondence Principle, even for expressions (where sharps can occur -- specifically only in initialisers): |this|, |yield|, and arguments are three examples.

The let expression proposal looks promising, however:

http://wiki.ecmascript.org/doku.php?id=strawman:let_expressions

We need a bug on that, which should block this one.

/be
Comment 2 Jeff Walden [:Waldo] (remove +bmo to email) 2010-04-09 15:13:42 PDT
(In reply to comment #1)
> Functional abstraction violates Tennent's Correspondence Principle, even for
> expressions (where sharps can occur -- specifically only in initialisers):
> |this|, |yield|, and arguments are three examples.

Not necessarily.  I think, given enough pain, you could make it work (pretend all MemberExpressions below really use some special form to avoid calling getters or setters on Object.prototype, of course):

  var a = #1={ a: this, b: (yield 17), c: arguments, d: #1# };

  (function() {
     var a = {};
     a.a = arguments[0];
     a.b = arguments[1];
     a.c = arguments[2];
     a.d = a;
     return a;
   })(this, (yield 17), arguments)

Decompiling to function abstraction is the really hard part; this little tweak is potentially (hard to say without doing it) mere icing on that cake.
Comment 3 Brendan Eich [:brendan] 2010-04-09 15:23:25 PDT
"Not necessarily" doesn't help if you do not have a general solution. You don't (because of eval -- |this| and arguments can be in eval'ed source where the eval is called in the initialiser and its source is not analyzable statically).

/be
Comment 4 Brendan Eich [:brendan] 2010-05-27 18:32:23 PDT
Plus, |a| in the example (or one recast using a let expression) has to have a name that is different from any name used in the initialiser expression that refers to a name free in that expression, defined in outer contexts.

EphemeronTables moved from strawman to harmony at http://wiki.ecmascript.org/ and with them one can build an efficient user-code library version of toSource and uneval. OTOH, let expressions moved to strawman:deferred.

Rather than changing syntax and working harder to rearrange deck chairs on what may be a sinking ship (bug 566700), I think we should work on EphemeronTables and self-host sharps on the output side. The input side is less of a problem, but we could get rid of the lexer and parser support and require a library-based parser (like JSON) to be used -- once the EphemeronTable support is in and solid.

/be
Comment 5 Jason Orendorff [:jorendorff] 2011-02-18 09:13:52 PST
Created attachment 513490 [details]
toSource.js - a JS implementation of {Array,Object}.prototype.toSource

A translation of Object.prototype.toSource and Array.prototype.toSource from C++ to JS. It should be fairly easy (not exactly trivial) to tweak this to spit out let-expressions or function calls.

The Map implementation in this script is of course silly. WeakMap will serve as a drop-in replacement once it's done.
Comment 6 Brendan Eich [:brendan] 2011-02-18 13:15:07 PST
Cool code. I was corresponding on IRC with jorendorff about a few nits.

When blockers allow the patch will object-detect WeakMap and use it instead of self-hosted and slow Map, and be ready for user-testing. Although a sharpEval or fromSource method would be welcome as well.

/be
Comment 7 Brendan Eich [:brendan] 2011-02-18 14:11:21 PST
The prototypal get and put (-> set) in Map are good but I like the parallel keys and values arrays in

http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps

for fewer allocations (2 arrays, instead of (1 + map-length) arrays).

/be
Comment 8 Jason Orendorff [:jorendorff] 2011-02-21 10:23:36 PST
In order to serve as a replacement for JS_HAS_TOSOURCE, toSource.js needs an implementation of {Date,Boolean,Number,String,RegExp,Function}.prototype.toSource.

These are all easy. Function is no problem; it can just call Function.prototype.toString.
Comment 9 Brendan Eich [:brendan] 2011-06-09 21:30:31 PDT
Jason, are you willing to work on this more? I can post to the newsgroup but I'd rather you take this home since you gave it a great boost and it's "almost" done.

/be
Comment 10 Jason Orendorff [:jorendorff] 2011-06-15 10:36:39 PDT
I will post an updated toSource.js tomorrow, and I'll produce a patch that turns off JS_HAS_TOSOURCE.

If it breaks the Web or too many addons, we'll discuss how to proceed. In that case I might not be able to take. It depends.
Comment 11 Jason Orendorff [:jorendorff] 2011-06-16 10:20:35 PDT
Created attachment 539819 [details]
toSource.js v2

Updated toSource.js.
Comment 12 Jason Orendorff [:jorendorff] 2011-06-16 11:41:24 PDT
Created attachment 539848 [details] [diff] [review]
WIP 1

Here's the trivial patch to turn off toSource and uneval. It causes a lot of tests to fail; more on that below. First a few notes about the patch.

  - JS_ValueToSource remains, but when the argument is an object, it tries to
    call .toSource() on it, which isn't there anymore. It falls back on
    .toString().

  - ctypes defines some toSource methods, which I'm inclined to leave alone.
    They were designed to cooperate with JS_HAS_TOSOURCE, but they don't
    depend on it.

I surveyed all uses of toSource and uneval in our codebase. They are most commonly used in tests and test support code.

  - Some are actually testing toSource and uneval themselves. It'll be easy
    to delete those, or just change them to test the script.

  - Some use x.toSource() == y.toSource() as a cheap way of saying
    equal(x, y), that is, comparing structural equality. I posted something
    sort of like an equal() function in bug 613066; we can add an appropriate
    function to our test suites.

  - Some use toSource/uneval in error messages. The implementation of assertEq
    and jsapi-test's similar CHECK_SAME macro both do so. Individual tests do
    too.

  - I can't immediately tell what the CrossSiteXHR tests use it for. https://mxr.mozilla.org/mozilla-central/source/content/base/test/test_CrossSiteXHR_origin.html?force=1#124

  - Apart from tests, ConsoleAPIObserver.formatResult uses it: https://mxr.mozilla.org/mozilla-central/source/mobile/chrome/content/content.js#1200

  - Oh look, somebody copied it: https://mxr.mozilla.org/mozilla-central/source/toolkit/components/console/hudservice/HUDService.jsm#4794

  - It's used in some debug logging stuff: https://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/nsSearchService.js#1953 https://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/XPIProvider.jsm#2842

My own inclination would be to leave toSource alone, but remove sharp variables from the parser. Functions would still round-trip through eval(uneval(v)), but cyclic object graphs would not. Brendan, are you still opposed to that approach?
Comment 13 Jason Orendorff [:jorendorff] 2011-06-16 11:51:06 PDT
That nsSearchService stuff is better off using JSON.stringify, and maybe a lot of the other uses could do the same. (JSON.stringify does not preserve infinities, NaN, or -0; or accessors or cyclic objects. Otherwise it's a pretty good substitute.)

If we do remove .toSource and eval, we need to include a copy of toSource.js someplace (at least one place) where all tests can conveniently reach it.

I still worry about breaking the Web, even though no other browser implements this.
Comment 14 Jesse Ruderman 2011-06-16 16:29:39 PDT
I use toSource/uneval all the time, but rarely on cyclic object graphs.  I don't understand why you're removing it.
Comment 15 Jesse Ruderman 2011-06-16 17:04:00 PDT
Entirely removing toSource/uneval would break hundreds of extensions and thousands of Greasemonkey scripts.

Addons (https://mxr.mozilla.org/addons/)
  uneval:                   Found 526   matching lines in 140 files
  toSource, case sensitive: Found 1000+ matching lines in 695 files

User scripts hosted on userscripts.org:
  uneval:                   About 1350 results (many using GM_setValue)
  toSource:                 About 1220 results (some using function.toSource)
Comment 16 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-16 17:22:30 PDT
Erm, reduced attack surface (among others)?  Surely you of all people know the dangers of the decompiler.
Comment 17 Chris Leary [:cdleary] (not checking bugmail) 2011-06-16 17:26:23 PDT
(In reply to comment #14)
> I use toSource/uneval all the time, but rarely on cyclic object graphs.  I
> don't understand why you're removing it.

Why is self-hosted as bad as removed? It looks like jorendorff's patch intends to introduce a version of them implemented in JS.
Comment 18 Jesse Ruderman 2011-06-16 17:38:50 PDT
Maybe I misunderstood the patch. Self-hosted sounds great.
Comment 19 Brendan Eich [:brendan] 2011-06-17 11:37:59 PDT
We are not removing, just rehosting from C++ to JS on the output side. On the input side, you'll need to call a new sharp decoder entry point instead of eval. For symmetry this suggests a porting speedbump: Sharp.eval and Sharp.uneval, with a Sharp module bound to that name. Comments?

/be
Comment 20 Jason Orendorff [:jorendorff] 2011-06-17 13:51:53 PDT
Here's what I think we should do:
  - remove sharp variables from the parser and runtime
  - make toSource and uneval not emit sharp variables

Self-hosting toSource and uneval is really a separate track. toSource.js stands ready once we have self-hosting machinery (see bug 462300 for a little hope on that front).

Providing a new serialization/deserialization library for JS sounds like an adventure in yak-shaving. If that's what it takes to get rid of sharps, I'll sign up, but here's where the requirements start:

  - portable enough to use on the web now
  - API suitable for eventual standardization, the path JSON took
  - when you round-trip |new Gromit()| you get a Gromit object, preferably
    without any extra work
  - user-extensible (objects can tell the library what to serialize) in a way
    that avoids .toSource's unfixable bugs
  - can handle arbitrary object graphs, not just trees like JSON

Less than that and I don't know who would use it or why. Sharps only solve part of the serialization problem.
Comment 21 Brendan Eich [:brendan] 2011-06-17 17:31:10 PDT
"Sharps only solve part of the serialization problem" is true of any solution. This was why they were shot down back in ES3 days.

But it does not mean sharps are not useful. They have users. Such users could use a library, starting now. This is not a "new serialization/deserialization library" at all, it's an implementation in JS of the old SpiderMonkey feature, ya know, the one with users.

Multiple browser support is fine, just object-detect WeakMap and if missing, polyfill the O(n^2) and strong-ref-leaky JS implementation.

/be
Comment 22 Brendan Eich [:brendan] 2011-06-17 17:39:40 PDT
(In reply to comment #20)
> Here's what I think we should do:
>   - remove sharp variables from the parser and runtime
>   - make toSource and uneval not emit sharp variables

That is bug 566700, which depends on this bug. We need the library first, so Jesse can switch to it and user-test it till he's satisfied (repeat for other users).

That's what I meant by re-hosting in comment 19 -- not transparent self-hosting under the current APIs. I am open to shipping the new library, somehow (TBD).

> Self-hosting toSource and uneval is really a separate track. toSource.js
> stands ready once we have self-hosting machinery (see bug 462300 for a
> little hope on that front).

Fair enough, and I'd rather not self-host under the current APIs and integration points in the engine (the compiler for sure) if we can deprecate via library code, then simply remove sharps from SpiderMonkey's C++ source.

> Providing a new serialization/deserialization library for JS sounds like an
> adventure in yak-shaving. If that's what it takes to get rid of sharps,

What it takes to get rid of sharps from the C++ code is at least a JS library implementing (to within epsilon) the same functionality. That's why this bug blocks bug 566700.

So, you game?

/be
Comment 23 Chris Leary [:cdleary] (not checking bugmail) 2011-06-17 18:01:51 PDT
(In reply to comment #21)
> But it does not mean sharps are not useful. They have users.

Out of curiosity, who are these users? I'm surprised if you're saying that removing sharps will break the web. (But I'm not sure that's what you're saying.)
Comment 24 Brendan Eich [:brendan] 2011-06-18 13:15:23 PDT
(In reply to comment #23)
> (In reply to comment #21)
> > But it does not mean sharps are not useful. They have users.
> 
> Out of curiosity, who are these users? I'm surprised if you're saying that
> removing sharps will break the web. (But I'm not sure that's what you're
> saying.)

What's the mystery? SpiderMonkey users, Jesse is one of them, use sharps. Web developers in general do not.

/be
Comment 25 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-19 18:02:56 PDT
Just to be sure, Jesse, do you yourself use sharps, or is it merely your fuzzers?  I had always been under the impression everything you file that's sharp-y is also fuzz-y.
Comment 26 Jesse Ruderman 2011-06-19 19:00:18 PDT
I don't use sharps intentionally except to test sharps.

I do use uneval frequently. But I usually don't give it cyclic graphs, and I don't care what kind of output I get for cyclic graphs. Sometimes sharps make things more readable and sometimes they make things more confusing.
Comment 27 Brendan Eich [:brendan] 2011-06-20 12:35:34 PDT
"Sharps" is the catch-all for both the uneval/toSource "encoding" side (including the sometimes helps and other times hurts sharp output notation for cycles and join points), and the eval "decoding" side.

So comment 14 was about uneval/toSource (the encoding side), and Jesse is a user.

There is no point separating encoding and decoding in this bug or the one that it blocks. We are not taking away half or breaking both sides before the library solution is available and notice has been given. The library approach works well enough for decoding too.

/be
Comment 28 Brendan Eich [:brendan] 2011-06-20 12:44:03 PDT
(In reply to comment #12)
> My own inclination would be to leave toSource alone, but remove sharp
> variables from the parser. Functions would still round-trip through
> eval(uneval(v)), but cyclic object graphs would not. Brendan, are you still
> opposed to that approach?

Yes, and sorry for missing this question. This bug is about deprecation, meaning fair warning and a library-level replacement for people to use (including those who need something like this for any of our XUL JS, tests, etc.).

Removal, even turning the jsversion.h macro to 0 from 1, should happen after, in the bug blocked by this bug, bug 566700.

IINM Python has deprecation -> obsolescence down to an art, and does not half-break things without warning. We should try to do the same, with whatever means we have. If modules were implemented, we'd be able to ship a built-in and self-hosted library. We might still do something like that with the module pattern, as the "WIP 1" attachment does -- but that patch shouldn't turn off the macro yet.

/be
Comment 29 Jason Orendorff [:jorendorff] 2011-06-20 13:04:40 PDT
OK, I'll write a function, portable across browsers, that parses the possibly-sharp-variable-containing output of uneval/toSource.

I don't intend to support sharp-variable uses nested in function bodies, like
   Sharps.eval("function () { return #1=[#1#]; }")
toSource does not return output like that unless the function used sharp variables to begin with.

(FWIW: I plan to do this using string hackery and eval. It shouldn't be too bad. The worst part is that it'll have to do enough parsing to be able to skip over function-expressions, which means writing a correct tokenizer, which means handling '/' characters correctly.)
Comment 30 Jason Orendorff [:jorendorff] 2011-06-22 11:25:35 PDT
Three caveats about the code I'm writing here:

1. I plan to support only output that toSource/eval actually could generate. That means #1# will be supported only where it occurs as an element of an ArrayLiteral or a property value (following a ':' token) in an ObjectLiteral. And #1= will be supported only immediately before an ArrayLiteral or ObjectLiteral.

SpiderMonkey's parser allows #1= and #1# in other places, but I think that is a bug.

2. I don't plan to support E4X. I don't think it's necessary, and it would be a ton more work to tokenize.

3. As far as I can tell, there are only two uses of sharps in the entire codebase that could actually be produced by toSource. That is to say, we effectively have no regression tests for sharp variables. So I'll be writing a handful of tests myself and that'll be it.

Speak up if this isn't good enough!
Comment 31 Brendan Eich [:brendan] 2011-06-23 00:34:50 PDT
+1 on comment 30.

/be
Comment 32 Jason Orendorff [:jorendorff] 2011-08-15 06:36:48 PDT
https://github.com/jorendorff/sharps
Comment 33 Dave Herman [:dherman] 2011-08-15 09:36:59 PDT
> https://github.com/jorendorff/sharps

Now that this is turning into a JSON-like serialization library, are we constrained to the exact same design as the legacy SM feature? In particular, ISTM it would be more user-friendly to allow arbitrary identifiers as sharp variables, in addition to numbers.

Is this bug a good place for such feedback, or should I pile issues in the Github project?

Dave
Comment 34 Dave Herman [:dherman] 2011-08-15 09:41:24 PDT
> or should I pile issues

*file*
Comment 35 Jason Orendorff [:jorendorff] 2011-08-18 09:59:14 PDT
> https://github.com/jorendorff/sharps

I've added toSource.js to this, renaming it to uneval.js. So now we have both aspects of sharp variables implemented in pure ES5.

(In reply to Dave Herman [:dherman] from comment #33)
> Now that this is turning into a JSON-like serialization library, are we
> constrained to the exact same design as the legacy SM feature? In
> particular, ISTM it would be more user-friendly to allow arbitrary
> identifiers as sharp variables, in addition to numbers.

I'm not personally interested in developing it in that direction. It's really meant as an example of how to achieve all the things sharps do without sharps being built into the language.

It could also serve as a migration path. For people who want to use it that way, any further extensions to the syntax just add uncertainty.
Comment 36 Jason Orendorff [:jorendorff] 2011-08-18 10:14:58 PDT
(In reply to Brendan Eich [:brendan] from comment #28)
> [...] This bug is about deprecation,
> meaning fair warning and a library-level replacement for people to use
> (including those who need something like this for any of our XUL JS, tests,
> etc.).

I think we now have the library-level replacement, so all that remains is giving users fair warning. Earlier you said you could take that task, but I'll be happy do it if you're busy. I think the main things users need to know are:

  * What is changing? We're removing sharp variables, the nonstandard
    language feature described here:
    <https://developer.mozilla.org/en/Sharp_variables_in_JavaScript>

  * How do I know if my code is affected? Search it for #1#. You only need
    to search Firefox-only/SpiderMonkey-only code, because the other JS
    engines have never supported sharp variables.

    If you have SpiderMonkey-only code that uses eval() and uneval() for
    data serialization, you must search your data for #1# as well as your
    code. The change will affect you only if the data you're serializing
    actually contains cycles.

  * What if my code is affected? It'll stop working when you or your users
    upgrade to Firefox 9, which will be released 18 weeks from now.
    You'll have to migrate to something like sharps-lite.js
    <https://github.com/jorendorff/sharps> before then.
Comment 37 Brendan Eich [:brendan] 2011-08-18 11:23:14 PDT
(In reply to Dave Herman [:dherman] from comment #33)
> > https://github.com/jorendorff/sharps
> 
> Now that this is turning into a JSON-like serialization library, are we
> constrained to the exact same design as the legacy SM feature? In
> particular, ISTM it would be more user-friendly to allow arbitrary
> identifiers as sharp variables, in addition to numbers.
> 
> Is this bug a good place for such feedback, or should I pile issues in the
> Github project?

This is pure scope creep for this bug, and therefore should be avoided at all cost. Separate bug, or github project (better), no problem.

/be
Comment 38 Brendan Eich [:brendan] 2011-08-18 11:28:59 PDT
Jason, happy if you announce. I will retweet. What's the plan for the js shell, where uneval/toSource are currently put to good use and will be missed?

/be
Comment 39 Jason Orendorff [:jorendorff] 2011-08-18 15:10:32 PDT
(In reply to Brendan Eich [:brendan] from comment #38)
> Jason, happy if you announce. I will retweet.

OK. I'll do it tomorrow.

> What's the plan for the js
> shell, where uneval/toSource are currently put to good use and will be
> missed?

Those are too widely used to remove. We can rehost them to JS, as you wrote in comment 19. If the infrastructure were there I would do it now; but it's not. Anyway, that's a backward-compatible change. We can do it any time without advance notice.

Orthogonal to that, it would be possible remove the code in uneval/toSource that produces sharp-variable output, replacing it with Python's "..." or let syntax. That's a potentially breaking change (though we have the library-level migration path for it in hand), so it would require an announcement.

Which change(s) did you have in mind with this question, and what do you think we should do? I prefer to remove sharp variables from the grammar and runtime (the input side) but leave uneval/toSource alone.
Comment 40 Jason Orendorff [:jorendorff] 2011-08-19 13:43:09 PDT
Ran out of time today; I'll post to dev.tech.js-engine about the deprecation on Monday instead. That also allows more time to decide whether to change the output of uneval/toSource.
Comment 41 Brendan Eich [:brendan] 2011-08-19 13:48:26 PDT
(In reply to Jason Orendorff [:jorendorff] from comment #39)
> I prefer to remove sharp variables from the grammar and
> runtime (the input side) but leave uneval/toSource alone.

Sounds good to me.

/be
Comment 42 Benjamin Smedberg [:bsmedberg] 2011-08-19 13:56:29 PDT
Do we have stats on how many addons currently use sharp variables? dherman indicated during the e10s analysis that a not-insignificant percentage of addons were failing to parse because of things like sharp variables. If so, would that cause us to reconsider this deprecation now, or merely issue warnings to those extension authors?
Comment 43 Brendan Eich [:brendan] 2011-08-19 13:58:14 PDT
bsmedberg, good point. We can find out easily with existing tools. dherman may have stats.

Remember, this bug is about deprecation. Nothing is being removed right now. We need a decent interval (one release cycle?) to get the word out.

/be
Comment 44 Dave Herman [:dherman] 2011-08-19 14:53:54 PDT
I don't yet know how many addons are using E4X or sharp-literals. But I have a relatively easy way to find out. Will let you know ASAP.

Dave
Comment 45 Jesse Ruderman 2012-01-26 18:44:56 PST
Sharps have been removed (in bug 566700).  I think that counts as deprecation ;)

Note You need to log in before you can comment on or make changes to this bug.