Open Bug 1478034 Opened 6 years ago Updated 1 year ago

[meta] Add an interface to bytecode

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

Tracking Status
firefox63 --- affected

People

(Reporter: mgaudet, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: meta)

Attachments

(1 file, 2 obsolete files)

Currently access into the bytecode often happen relatively loosely, and many parts of the VM look directly at bytecode. This is a tight coupling that reduces the agility of the team, making experimentation and potential evolution more expensive that necessary. 

A bytecode interface can raise the level of discussion about program text to semantics, rather than structure. 

Design Goals: 

* Provide a framework to write semantic queries about bytecode, without exposing bytecode specifics.
* Ease of iteration and analysis
* Encapsulation: longer term only very specific clients outside the interface should directly reference the bytecode (jsbytecode*, pcOffset).
This is a functional first draft of an inteface for bytecode. There are xxx
major portions included in this patch.

- RawBytecode: A typedef to expose jsbytecode* in a controlled manner, to allow
  easy future audits.
- ScriptLocation: This is an immutable view of a single program operation (a
  bytecode). There is is some basic functionality in here included for
  experimentation sake.
- ScriptLocationOffset: An immutable view of a source location offset, (and
  RawscriptLocationOffset, another typedef to ease future auditing)
- ScriptIterator: A forward iterator over script locations, using program
  text order.
- AllBytecodeIterable: A wrapper class that allows the use of a range-based
  for-loop over ScriptLocations in a Script.
- Modifications to JSScript to expose ScriptLocations and answer some queries
  about containment.

The interface is experimentally tested using JSScript::initScriptCounts as a
simple playground for demonstrating the use of the interface.
Assignee: nobody → mgaudet
Status: NEW → ASSIGNED
The interface is expanded and refined in this patch, showing a more
complicated set of requirements and the answers to those requirements
Comment on attachment 8994540 [details] [diff] [review]
[Part 1] Strawman implementation of a bytecode interface

I'd welcome suggestions for anyone else to rope in on feedback!
Attachment #8994540 - Flags: feedback?(tcampbell)
Attachment #8994540 - Flags: feedback?(kvijayan)
Attachment #8994541 - Flags: feedback?(tcampbell)
Attachment #8994541 - Flags: feedback?(kvijayan)
Comment on attachment 8994540 [details] [diff] [review]
[Part 1] Strawman implementation of a bytecode interface

Review of attachment 8994540 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/ScriptIterator.h
@@ +10,5 @@
> +#include "vm/ScriptLocation.h"
> +
> +namespace js {
> +
> +class ScriptIterator

nit: Script Iterator sounds like this would be an Iterator which returns a Script at each step.
just suggesting random ideas:  CodePointIterator ? BytecodeIterator? InstructionIterator? OpcodeIterator? …

::: js/src/vm/ScriptLocation.h
@@ +41,5 @@
> +//   JSBytecode. However, a space efficient implementation of
> +//   an iterator's ::end() seems like it would really want to
> +//   point to the invalid location just past the end of the
> +//   script.
> +class ScriptLocation

nit: ScriptLocation sounds like it would contain a line & column fields, where it is more like a CodePoint / Bytecode / Instruction / Opcode / … (pick one)
I did a quick audit of all the places we use a base JSOP_* in the codebase and identified a few common patterns.

* Asserts - Debug-only asserts as sanity checks.
* ParseNode - Some ParseNode use a small number of opcodes as enums.
* Compare ops - Parts of MacroAssembler use a small number of opcodes as enums for basic comparison operations.
* Bytecode emitter - Obviously we need opcodes to generate bytecode.
* Dump internals - Some debug dump information directly pokes into opcodes.
* JIT translation - Our JITs translate from bytecode, so no surprise.
* Bytecode metadata - Helper methods to query metadata about opcode definitions.
* VM Operation - A few VM options use opcodes instead of enum options.

* Pattern match exact bytecode - A few parts of code pattern match exact bytecode patterns. At the very least they should be defined in a common place.
> builtin/Array.cpp
> builtin/RegExp.cpp
> vm/NativeObject.cpp
> vm/ObjectGroup.cpp

* Control flow - A few places inspect control flow opcodes. These probably should be provided a well designed interface.
> vm/CodeCoverage.cpp
> vm/Debugger.cpp
> vm/GeneratorObject.cpp
> vm/GeneratorObject.h
We should consider the cases that use opcodes as a convenient enum value to instead define their own enum and explictly convert. Assigning same encoding is fine so the conversion is a no-op, but it does better document things and de-couples the bytecode.
(In reply to Ted Campbell [:tcampbell] from comment #5)
> * Compare ops - Parts of MacroAssembler use a small number of opcodes as
> enums for basic comparison operations.

Oops, the only use in macroassembler is a helper function to convert JSOP_* to a condition code. Maybe these should just be pushed to the JITs.
Attachment #8994540 - Flags: feedback?(tcampbell)
Attachment #8994540 - Flags: feedback?(kvijayan)
Attachment #8994541 - Flags: feedback?(tcampbell)
Attachment #8994541 - Flags: feedback?(kvijayan)
Comment on attachment 8994540 [details] [diff] [review]
[Part 1] Strawman implementation of a bytecode interface

Review of attachment 8994540 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for taking a while to look at this.  Looks like a good start, aside from naming nit.  Can we land this somehow and then build some sort of "isbytecodeprivateyet" metric to measure incremental conversion of all usages of jsbytecode* and pcOffset to this interface?

::: js/src/vm/ScriptIterator.h
@@ +10,5 @@
> +#include "vm/ScriptLocation.h"
> +
> +namespace js {
> +
> +class ScriptIterator

I agree with :nbp - ScriptIterator is confusing.  BytecodeIterator seems more appropriate.

::: js/src/vm/ScriptLocation.h
@@ +41,5 @@
> +//   JSBytecode. However, a space efficient implementation of
> +//   an iterator's ::end() seems like it would really want to
> +//   point to the invalid location just past the end of the
> +//   script.
> +class ScriptLocation

Agree with :nbp again.  BytecodeLocation goes along well with BytecodeIterator.
Priority: -- → P3
QA Contact: sdetar
QA Contact: sdetar
Attachment #8994540 - Attachment is obsolete: true
Attachment #8994541 - Attachment is obsolete: true
Depends on: 1499544
Converting this to a meta-bug
Assignee: mgaudet → nobody
Status: ASSIGNED → NEW
Keywords: meta
Attachment: These are notes from a meeting (I believe in Paris) where we discussed this. Of dubious value, attached just in case.
Blocks: 1503864
Depends on: 1531479
Depends on: 1531480
Summary: Add an interface to bytecode → [meta] Add an interface to bytecode
Depends on: 1538557
Depends on: 1566057
Depends on: 1572504
Depends on: 1572870
Depends on: 1573062
Depends on: 1584758
Depends on: 1586030
Depends on: 1586428
Depends on: 1586439
Depends on: 1586440
Depends on: 1586441
Depends on: 1586443
Depends on: 1587564
Depends on: 1587575
Depends on: 1598786
Depends on: 1668249
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.