Closed Bug 718969 Opened 13 years ago Closed 12 years ago

Remove the decompiler

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: luke, Assigned: Benjamin)

References

Details

Attachments

(1 file)

The decompiler is a significant maintenance burden and source of complexity; we should consider strategies to remove it.

Although my understanding is far from complete, I think the decompiler has two main functions:
 1. decompile the expression that produced a given erroneous value (js_DecompileValueGenerator)
 2. Function.prototype.{toString,toSource} (JS_DecompileFunction)

For (1), it seems like we could implement a drastically simpler subset of the decompiler that just decompiled expressions.  Such an expression decompiler could use a simple, direct recursive strategy (instead of the highly indirect Decompile -> GetOff -> Decompile recursive strategy used by js_DecompileValueGenerator).  While there are expressions that are quite complicated to decompile, it seems like we could just chicken out and print something not-exact-but-useful.

For (2), we could just save the compressed source chars.  It seems like we could then remove all srcnotes (other than trynotes) which would offset the cost of the compressed source.  To get a quick estimate, I cat'd together all of parsemark (attachment 429916 [details]) and measured compiling the whole thing:

  Size on disk:                 5.9M
  Size gziped:                  1.6M
  Size bzip2ed:                 1.4M
  Aggregate JSScript::dataSize: 6.0M
  Aggregate srcnote size:       1.0M

So that is a 7% increase with bzip2 or 10% increase with gzip.  (Note: the JSBug 716068 comment 0 shows that js-total-scripts is roughly 7-10% of the total browser explicit allocation, so that means a total browser increase of .5-1%.

This is all pretty back of the envelope, but it does seem to give hope that this is a viable strategy.  Perhaps it would dovetail nicely with the lazy parsing (bug 716068, which would also want to keep around compressed chars).

I have no immediate plans for this bug; I just wanted to put the bug on the map, post the memory results and gather feedback.
>   Size on disk:                 5.9M
>   Size gziped:                  1.6M
>   Size bzip2ed:                 1.4M
>   Aggregate JSScript::dataSize: 6.0M
>   Aggregate srcnote size:       1.0M
> 
> So that is a 7% increase with bzip2 or 10% increase with gzip.

Sorry, where do the 7% or 10% numbers come from?

> Perhaps it would dovetail nicely with the lazy
> parsing (bug 716068, which would also want to keep around compressed chars).

I think you mean bug 678037?
(In reply to Luke Wagner [:luke] from comment #0)
> For (1), it seems like we could implement a drastically simpler subset of
> the decompiler that just decompiled expressions.  Such an expression
> decompiler could use a simple, direct recursive strategy (instead of the
> highly indirect Decompile -> GetOff -> Decompile recursive strategy used by
> js_DecompileValueGenerator).  While there are expressions that are quite
> complicated to decompile, it seems like we could just chicken out and print
> something not-exact-but-useful.

I think that we could avoid even this by allowing parse trees to be kept around after parsing itself, and using script->analysis() to map each opcode to the node in the parse tree it corresponds to.  This would also be helpful for the operation level decompilation information used in the code inspector extension.  To avoid bloat the parse trees would normally be thrown away immediately after generating the script, but if they were regenerated for a thrown exception or something then they would stick around until the next GC.
(In reply to Nicholas Nethercote [:njn] from comment #1)
> >   Size on disk:                 5.9M
> >   Size gziped:                  1.6M
> >   Size bzip2ed:                 1.4M
> >   Aggregate JSScript::dataSize: 6.0M
> >   Aggregate srcnote size:       1.0M
> > 
> > So that is a 7% increase with bzip2 or 10% increase with gzip.
> 
> Sorry, where do the 7% or 10% numbers come from?

((size zipped) - (aggregate srcnote size)) / (aggregate JSScript::dataSize) * 100%

> I think you mean bug 678037?

Oops, yes, thanks
(In reply to Brian Hackett (:bhackett) from comment #2)
\> I think that we could avoid even this by allowing parse trees to be kept
> around after parsing itself, and using script->analysis() to map each opcode
> to the node in the parse tree it corresponds to.

So, IIUC, if we throw away the parse tree and then try to decompile an expression, we re-parse the source chars and use script->analysis() to map from pc to parse nodes?  I think that would work.  Decompiling parse tree definitely sounds easier than decompiling bytecode; my question is whether it is *so* much easier (for the simple subset of bytecode) as to warrant the extra architectural support required.  It could very well be.
Oh, and, the third option that has been going around: if we have column information (which I think we want for this source-map feature), then we may not even have to "decompile" expressions, but just read a subset of the source.
(In reply to Luke Wagner [:luke] from comment #5)
> Oh, and, the third option that has been going around: if we have column
> information (which I think we want for this source-map feature), then we may
> not even have to "decompile" expressions, but just read a subset of the
> source.

That sounds absolutely the right way to go to me!
Depends on: 747831
No longer depends on: 747831
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Reopening as a tracking bug for the steps needed to finally remove the decompiler.
Assignee: general → bpeterson
Status: RESOLVED → REOPENED
Depends on: 767274, savesource
Resolution: DUPLICATE → ---
Summary: consider removing the decompiler → Remove the statement decompiler
Summary: Remove the statement decompiler → [meta] Remove the statement decompiler
Depends on: 777474
Just wanted to point out that it's about time to do the deed; the expression decompiler and save-the-source have landed, stuck and shipped, all due to the hard work of Benjamin.  For jit-inspector, which still uses the decompiler, bhackett agreed we could just print the information as a simple list of [opcode: info].
This patch is conditional on the acceptance of my patch for bug 777474.

Before:
wc -l jsopcode.cpp
7186 jsopcode.cpp

After:
wc -l jsopcode.cpp 
2414 jsopcode.cpp
Attachment #708847 - Flags: review?(luke)
Comment on attachment 708847 [details] [diff] [review]
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta.

It gives me no small amount of delight r+ing this patch.

(In reply to :Benjamin Peterson from comment #10)
I am remembering that scene in Last of the Mohicans where Magua cuts out Colonel Munro's heart.
Attachment #708847 - Flags: review?(luke) → review+
Comment on attachment 708847 [details] [diff] [review]
Arrange for the decompiler to be steamrolled, beheaded, eviscerated, bathed in acid, and then torn to bits by club-bearing monkeys with a vendetta.

Review of attachment 708847 [details] [diff] [review]:
-----------------------------------------------------------------

I suspect there's a ton of code in other files that is removable as well.  E.g. jsopcode.tbl mentions the decompiler a couple of times (but don't worry about the XML stuff, I'm taking care of that separately).  And BytecodeEmitter.cpp mentions the decompiler lots of times in comments.
More than just mentioning it, we can avoid emitting most srcnotes (only line/column and control-flow annotations used by IM, iiuc).  That should be a tidy memory savings but, given that it is careful work, it should definitely be a follow-up bug.
Yes, there's definitely a lot more to be removed. I think it's best to remove that stuff in individual followup patches, so we don't have to shift through a 200KB patch if something subtle breaks.
We'll be removing stuff for years after this.  There's *so* not any need to rush it.  :-)
(In reply to Luke Wagner [:luke] from comment #11)
> I am remembering that scene in Last of the Mohicans where Magua cuts out
> Colonel Munro's heart.

Before he dies, Magua will put his children under the knife, so the Grey Hair will know his seed is wiped out forever.
Our national nightmare is over.

https://hg.mozilla.org/integration/mozilla-inbound/rev/fa772ccdf19d
Summary: [meta] Remove the statement decompiler → Remove the decompiler
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/a62ffd085b9d to clear a path to back out de-e4x.
https://hg.mozilla.org/mozilla-central/rev/9069c3f568f7
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Blocks: 838960
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: