Last Comment Bug 718969 - Remove the decompiler
: Remove the decompiler
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 3 votes (vote)
: mozilla21
Assigned To: :Benjamin Peterson
:
Mentors:
Depends on: savesource 767274 777474
Blocks: 838960
  Show dependency treegraph
 
Reported: 2012-01-18 03:13 PST by Luke Wagner [:luke]
Modified: 2013-05-06 19:36 PDT (History)
20 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta. (184.42 KB, patch)
2013-01-31 16:43 PST, :Benjamin Peterson
luke: review+
Details | Diff | Splinter Review

Description Luke Wagner [:luke] 2012-01-18 03:13:38 PST
The decompiler is a significant maintenance burden and source of complexity; we should consider strategies to remove it.

Although my understanding is far from complete, I think the decompiler has two main functions:
 1. decompile the expression that produced a given erroneous value (js_DecompileValueGenerator)
 2. Function.prototype.{toString,toSource} (JS_DecompileFunction)

For (1), it seems like we could implement a drastically simpler subset of the decompiler that just decompiled expressions.  Such an expression decompiler could use a simple, direct recursive strategy (instead of the highly indirect Decompile -> GetOff -> Decompile recursive strategy used by js_DecompileValueGenerator).  While there are expressions that are quite complicated to decompile, it seems like we could just chicken out and print something not-exact-but-useful.

For (2), we could just save the compressed source chars.  It seems like we could then remove all srcnotes (other than trynotes) which would offset the cost of the compressed source.  To get a quick estimate, I cat'd together all of parsemark (attachment 429916 [details]) and measured compiling the whole thing:

  Size on disk:                 5.9M
  Size gziped:                  1.6M
  Size bzip2ed:                 1.4M
  Aggregate JSScript::dataSize: 6.0M
  Aggregate srcnote size:       1.0M

So that is a 7% increase with bzip2 or 10% increase with gzip.  (Note: the JSBug 716068 comment 0 shows that js-total-scripts is roughly 7-10% of the total browser explicit allocation, so that means a total browser increase of .5-1%.

This is all pretty back of the envelope, but it does seem to give hope that this is a viable strategy.  Perhaps it would dovetail nicely with the lazy parsing (bug 716068, which would also want to keep around compressed chars).

I have no immediate plans for this bug; I just wanted to put the bug on the map, post the memory results and gather feedback.
Comment 1 Nicholas Nethercote [:njn] 2012-01-18 03:36:37 PST
>   Size on disk:                 5.9M
>   Size gziped:                  1.6M
>   Size bzip2ed:                 1.4M
>   Aggregate JSScript::dataSize: 6.0M
>   Aggregate srcnote size:       1.0M
> 
> So that is a 7% increase with bzip2 or 10% increase with gzip.

Sorry, where do the 7% or 10% numbers come from?

> Perhaps it would dovetail nicely with the lazy
> parsing (bug 716068, which would also want to keep around compressed chars).

I think you mean bug 678037?
Comment 2 Brian Hackett (:bhackett) 2012-01-18 05:59:23 PST
(In reply to Luke Wagner [:luke] from comment #0)
> For (1), it seems like we could implement a drastically simpler subset of
> the decompiler that just decompiled expressions.  Such an expression
> decompiler could use a simple, direct recursive strategy (instead of the
> highly indirect Decompile -> GetOff -> Decompile recursive strategy used by
> js_DecompileValueGenerator).  While there are expressions that are quite
> complicated to decompile, it seems like we could just chicken out and print
> something not-exact-but-useful.

I think that we could avoid even this by allowing parse trees to be kept around after parsing itself, and using script->analysis() to map each opcode to the node in the parse tree it corresponds to.  This would also be helpful for the operation level decompilation information used in the code inspector extension.  To avoid bloat the parse trees would normally be thrown away immediately after generating the script, but if they were regenerated for a thrown exception or something then they would stick around until the next GC.
Comment 3 Luke Wagner [:luke] 2012-01-18 09:13:00 PST
(In reply to Nicholas Nethercote [:njn] from comment #1)
> >   Size on disk:                 5.9M
> >   Size gziped:                  1.6M
> >   Size bzip2ed:                 1.4M
> >   Aggregate JSScript::dataSize: 6.0M
> >   Aggregate srcnote size:       1.0M
> > 
> > So that is a 7% increase with bzip2 or 10% increase with gzip.
> 
> Sorry, where do the 7% or 10% numbers come from?

((size zipped) - (aggregate srcnote size)) / (aggregate JSScript::dataSize) * 100%

> I think you mean bug 678037?

Oops, yes, thanks
Comment 4 Luke Wagner [:luke] 2012-01-18 09:39:39 PST
(In reply to Brian Hackett (:bhackett) from comment #2)
\> I think that we could avoid even this by allowing parse trees to be kept
> around after parsing itself, and using script->analysis() to map each opcode
> to the node in the parse tree it corresponds to.

So, IIUC, if we throw away the parse tree and then try to decompile an expression, we re-parse the source chars and use script->analysis() to map from pc to parse nodes?  I think that would work.  Decompiling parse tree definitely sounds easier than decompiling bytecode; my question is whether it is *so* much easier (for the simple subset of bytecode) as to warrant the extra architectural support required.  It could very well be.
Comment 5 Luke Wagner [:luke] 2012-01-18 10:52:00 PST
Oh, and, the third option that has been going around: if we have column information (which I think we want for this source-map feature), then we may not even have to "decompile" expressions, but just read a subset of the source.
Comment 6 Nicholas Nethercote [:njn] 2012-01-18 14:29:24 PST
(In reply to Luke Wagner [:luke] from comment #5)
> Oh, and, the third option that has been going around: if we have column
> information (which I think we want for this source-map feature), then we may
> not even have to "decompile" expressions, but just read a subset of the
> source.

That sounds absolutely the right way to go to me!
Comment 7 Nicholas Nethercote [:njn] 2012-06-20 21:07:14 PDT

*** This bug has been marked as a duplicate of bug 761723 ***
Comment 8 :Benjamin Peterson 2012-06-22 21:55:55 PDT
Reopening as a tracking bug for the steps needed to finally remove the decompiler.
Comment 9 Luke Wagner [:luke] 2013-01-31 11:27:00 PST
Just wanted to point out that it's about time to do the deed; the expression decompiler and save-the-source have landed, stuck and shipped, all due to the hard work of Benjamin.  For jit-inspector, which still uses the decompiler, bhackett agreed we could just print the information as a simple list of [opcode: info].
Comment 10 :Benjamin Peterson 2013-01-31 16:43:09 PST
Created attachment 708847 [details] [diff] [review]
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta.

This patch is conditional on the acceptance of my patch for bug 777474.

Before:
wc -l jsopcode.cpp
7186 jsopcode.cpp

After:
wc -l jsopcode.cpp 
2414 jsopcode.cpp
Comment 11 Luke Wagner [:luke] 2013-01-31 17:26:42 PST
Comment on attachment 708847 [details] [diff] [review]
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta.

It gives me no small amount of delight r+ing this patch.

(In reply to :Benjamin Peterson from comment #10)
I am remembering that scene in Last of the Mohicans where Magua cuts out Colonel Munro's heart.
Comment 12 Nicholas Nethercote [:njn] 2013-01-31 17:50:09 PST
Comment on attachment 708847 [details] [diff] [review]
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta.

Review of attachment 708847 [details] [diff] [review]:
-----------------------------------------------------------------

I suspect there's a ton of code in other files that is removable as well.  E.g. jsopcode.tbl mentions the decompiler a couple of times (but don't worry about the XML stuff, I'm taking care of that separately).  And BytecodeEmitter.cpp mentions the decompiler lots of times in comments.
Comment 13 Luke Wagner [:luke] 2013-01-31 18:05:43 PST
More than just mentioning it, we can avoid emitting most srcnotes (only line/column and control-flow annotations used by IM, iiuc).  That should be a tidy memory savings but, given that it is careful work, it should definitely be a follow-up bug.
Comment 14 :Benjamin Peterson 2013-01-31 18:07:07 PST
Yes, there's definitely a lot more to be removed. I think it's best to remove that stuff in individual followup patches, so we don't have to shift through a 200KB patch if something subtle breaks.
Comment 15 Jeff Walden [:Waldo] (remove +bmo to email) 2013-01-31 18:12:39 PST
We'll be removing stuff for years after this.  There's *so* not any need to rush it.  :-)
Comment 16 Bill McCloskey (:billm) 2013-01-31 18:15:45 PST
(In reply to Luke Wagner [:luke] from comment #11)
> I am remembering that scene in Last of the Mohicans where Magua cuts out
> Colonel Munro's heart.

Before he dies, Magua will put his children under the knife, so the Grey Hair will know his seed is wiped out forever.
Comment 17 :Benjamin Peterson 2013-01-31 21:22:39 PST
Our national nightmare is over.

https://hg.mozilla.org/integration/mozilla-inbound/rev/fa772ccdf19d
Comment 18 Phil Ringnalda (:philor, back in August) 2013-01-31 22:34:00 PST
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/a62ffd085b9d to clear a path to back out de-e4x.
Comment 19 Phil Ringnalda (:philor, back in August) 2013-02-01 00:38:16 PST
Relanded in https://hg.mozilla.org/integration/mozilla-inbound/rev/9069c3f568f7
Comment 20 Ryan VanderMeulen [:RyanVM] 2013-02-01 13:16:57 PST
https://hg.mozilla.org/mozilla-central/rev/9069c3f568f7

Note You need to log in before you can comment on or make changes to this bug.