Closed
Bug 496693
Opened 16 years ago
Closed 16 years ago
nanojit: comment and refactor DeadCodeFilter vs Assembler::gen interactions
Categories
(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect)
Tamarin Graveyard
Baseline JIT (CodegenLIR)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: jseward, Unassigned)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 2 obsolete files)
11.45 KB,
patch
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.10) Gecko/2009042700 SUSE/3.0.10-1.1 Firefox/3.0.10
Build Identifier:
Summary
~~~~~~~
DeadCodeFilter and Assembler::gen interact in a subtle and
undocumented way. This bug report supplies a patch whose purpose is
primarily to document the connection, to the extent that I understand
it. The patch also deletes DeadCodeFilter and inserts equivalent
logic at the head of the LIR-scanning loop in Assembler::gen.
The patch does not change the generated code, nor should it increase
the costs of running the JIT.
Rationale & Background
~~~~~~~~~~~~~~~~~~~~~~
DeadCodeFilter is not a standalone LIR-to-LIR transformation. It only
works if Assembler::gen is somewhere downstream, because it needs
Assembler::gen to poke the Reservation fields on value-producing
nodes, so DeadCodeFilter knows which of such nodes are actually
necessary.
This is confusing for newcomers to the code. What it means is,
although Assembler::gen is reading LIRs from the end of the reader
pipeline, it is modifying what goes in the start of the pipeline, by
poking their Reservation fields.
Assembler::gen also appears to depend on DeadCodeFilter for correct
operation. In particular, in some circumstances it needs
DeadCodeFilter upstream in order to stop it visiting value producing
nodes more than once. I don't understand this part of the mechanism
yet.
Overall, the code generator would be easier to follow if:
* Assembler::gen was standalone, so that it did not require an
upstream DeadCodeFilter for correct operation. We should be able to
chuck any old LIR into Assembler::gen (providing it follows type
rules, and doesn't have a cyclic graph) without it breaking.
* DeadCodeFilter worked as a standalone LIR-to-LIR transformation, so
as to facilitate experimentation with different orderings of LIR
transformations.
In effect, Assembler::gen is at the moment tightly coupled to
preceding transformation pipeline. Removing DeadCodeFilter and
placing equivalent logic at the start of Assembler::gen is a good
first step to decoupling the two.
Finally, w.r.t. definition of "bool required = " in the patch, this
change makes it clear which kinds of LIR nodes Assembler::gen
considers must always be retained, and which need to be retained only
if their output is used. I think this provides a defendable basis on
which to formally split the LIR definition into LIR statements and
LIR values. (Future work).
Reproducible: Always
Reporter | ||
Comment 1•16 years ago
|
||
Reporter | ||
Updated•16 years ago
|
Comment 2•16 years ago
|
||
is this ready for review?
Reporter | ||
Updated•16 years ago
|
Attachment #381915 -
Flags: review?(edwsmith)
Reporter | ||
Comment 3•16 years ago
|
||
yes; request is now filed.
Comment 4•16 years ago
|
||
I agree in general with the idea of cleaning up the DeadCodeFilter logic and decoupling it, but I'd rather not see it land in gen(). Could we make it a separate callable routine?
> Assembler::gen also appears to depend on DeadCodeFilter
> for correct operation.
I'd be very surprised if this was the case, as DeadCodeFilter was added much later to the back-end. But I can only assume that an unintended dependency has crept in.
Reporter | ||
Comment 5•16 years ago
|
||
(in reply to Comment #4)
> Could we make it a separate callable routine?
Do you mean: change the start of the gen() loop to be like this:
bool required = ins->hasEffects() || ins->resv()->used
if (!required)
continue;
and then put this somewhere else:
bool hasEffects ( LIns* ins ) {
return
ins->isGuard()
|| ins->isBranch()
|| (ins->isCall() && !ins->isCse(functions))
|| ins->isStore()
|| op == LIR_loop || op == LIR_label || op == LIR_live
|| ins->isRet();
}
or something else?
Reporter | ||
Comment 6•16 years ago
|
||
Ed, any chance you could have a look at this?
Comment 7•16 years ago
|
||
re #4; sorry my bad, i was looking at the wrong diff. The change looks fine to me.
Comment 8•16 years ago
|
||
Hi,
sorry for the slow review
This might be an okay patch but i'm concerned about the motivating assumption that each LirReader filter needs to work standalone; that's not at all the intention. This is also why Assembler.assembler() creates the DeadCodeFilter instance -- it's a required piece of the pipeline if you're using Assembler. it's not a standalone component. making it standalone is a big change, and we'd want to move it out of the Assembler impl to make it standalone too.
Before DeadCodeFilter was added, the dead code skipping logic actually was in gen() and we factored it out to clean up the code (separation of concerns). Yes, DCF and Assembler must run synchronously, but that alone shouldn't justify putting the code back inline in gen().
so, R+ on the cleanup, and R+ on adding the clarifying comments, but I think we need a stronger reason to un-factor the code. Performance?
We know from profiling that read() is hot, I dont know if dropping one filter stage moves the needle significantly, however.
Updated•16 years ago
|
Attachment #381915 -
Flags: review?(edwsmith) → review-
Comment 9•16 years ago
|
||
(In reply to comment #8)
> This might be an okay patch but i'm concerned about the motivating assumption
> that each LirReader filter needs to work standalone; that's not at all the
> intention.
You've correctly identified this concern; but it seems rather like DCF is the only one for which this inseparability is true. The other read-side filters can be rearranged willy-nilly and have this nice "modular" independent feel to them. As far as I can tell anyway. I think the motivation of the patch is aesthetic: to avoid misleading readers by supporting this separability/pipeline notion everywhere-except-here.
Would comment #5 -- moving the DCF logic from a tightly-coupled "filter" to a helper function -- satisfy everyone? You'll avoid making an object that appears separable but isn't, yet also avoid cluttering up ::gen() with the logic in-line.
Comment 10•16 years ago
|
||
i'd be satisfied with that.
I've written LirWriter filters that are coupled (analyzer feeding information to a later filter that requires the information). We shouldn't try hard to make every filter work standalone, but we absolutely should be clear about where the inter-filter dependencies are.
Reporter | ||
Comment 11•16 years ago
|
||
* splits keep/no-keep logic into a helper function LIns::isStmt
* shorten up & rewrite comments in Assembler::gen
Generated code is unaffected. Both before and after the patch, a
non-debug build using gcc-4.3.1 with flags -g -O running trace-tests.js
requires 57.3 billion instructions (x86-linux).
Attachment #381915 -
Attachment is obsolete: true
Attachment #384787 -
Flags: review?(edwsmith)
![]() |
||
Comment 12•16 years ago
|
||
Just to note: I was just looking at DeadCodeFilter, trying to understand what it does. I remembered Julian talking about this stuff and so talked to him and he pointed me at this bug. Without that help I would have never worked out the coupling between DeadCodeFilter and Assembler. So +1 for the patch.
Updated•16 years ago
|
Attachment #384787 -
Flags: review?(edwsmith) → review+
Reporter | ||
Comment 13•16 years ago
|
||
Attachment #384787 -
Attachment is obsolete: true
Reporter | ||
Updated•16 years ago
|
Keywords: checkin-needed
Comment 14•16 years ago
|
||
I still wonder whether a bitmap or bytemap (derived from the tbl) mapping opcode to stmt yes/no would be faster than the fairly hairy sequence of jumps we do right now in isStmt(). In the alternative we could always set ->used for statements when we construct them.
![]() |
||
Comment 15•16 years ago
|
||
(In reply to comment #14)
> I still wonder whether a bitmap or bytemap (derived from the tbl) mapping
> opcode to stmt yes/no would be faster than the fairly hairy sequence of jumps
> we do right now in isStmt(). In the alternative we could always set ->used for
> statements when we construct them.
I have a patch doing the former, it didn't make any difference on SunSpider, but the code is a bit nicer so I'll put it in a separate bug report. The latter idea sounds error-prone to me.
Comment 16•16 years ago
|
||
Keywords: checkin-needed
Whiteboard: fixed-in-tracemonkey
![]() |
||
Comment 17•16 years ago
|
||
(In reply to comment #14)
> I still wonder whether a bitmap or bytemap (derived from the tbl) mapping
> opcode to stmt yes/no would be faster than the fairly hairy sequence of jumps
> we do right now in isStmt().
Bug 503990 has the details of this.
Comment 18•16 years ago
|
||
Status: UNCONFIRMED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•