Closed Bug 545406 Opened 14 years ago Closed 13 years ago

TM: loop invariant code motion (LICM)

Categories

(Core :: JavaScript Engine, defect)


Tracking


RESOLVED WONTFIX

People

(Reporter: jseward, Assigned: n.nethercote)

References

Details

(Whiteboard: PACMAN)

Attachments

(1 file, 4 obsolete files)

With bug 545274 (proper alias info for loads, stores and calls)
in place, it's possible to contemplate how to do loop invariant
code hoisting on traces.  Implementation sketch follows
(warning: at most 60% baked, possibly less)

We want to compute the transitive closure of all value-producing
nodes whose value does not depend on anything computed during the
loop.  So the root nodes are (a) those producing constants
and (b) loads whose alias set does not intersect the alias set of
any store in the trace nor the store-set of any call in the
trace.  Then visit all nodes that would be reachable from these
in a forward traversal of the LIR.

I believe the complete algorithm can be achieved in 3 passes over
the LIR buffer.

Pass 1: compute aggregate store alias set, and auxiliary forward
   index table:

   let S:AliasSet = empty
       T:Array of LIR* = empty

   for each node in LIR buffer, working backwards {
      if node is the trace-branch-back label (head of the loop)
         break
      if node is a store
         S = S union alias-set(node)
      if node is a call
         S = S union store-alias-set(node)

      add node to the end of T
   }

   so now   S tells us which loads can't be considered loop
            invariant due to being stored to inside the loop proper

            T makes it possible to scan forwards during Pass 2
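
   For concreteness, here is Pass 1 rendered as a small, self-contained C++
   sketch.  The Ins/Kind types and the bitmask alias sets are stand-ins
   invented for this example (they are not Nanojit's LIns/AccSet), but the
   control flow matches the pseudocode above.

  #include <cstdint>
  #include <vector>

  enum Kind { Const, Load, Store, Call, Arith, LoopLabel };

  struct Ins {
      Kind     kind;
      uint32_t aliasSet;       // regions read (Load) or written (Store)
      uint32_t storeAliasSet;  // regions a Call might write
      Ins*     args[2];        // operands; null if unused
  };

  // Walk the buffer backwards to the loop-back label, accumulating the
  // aggregate store alias set S and the table T used for forward scans later.
  static void pass1(const std::vector<Ins*>& buf,   // oldest instruction first
                    uint32_t& S, std::vector<Ins*>& T)
  {
      S = 0;
      for (auto it = buf.rbegin(); it != buf.rend(); ++it) {
          Ins* ins = *it;
          if (ins->kind == LoopLabel)
              break;                         // reached the head of the loop
          if (ins->kind == Store)
              S |= ins->aliasSet;
          if (ins->kind == Call)
              S |= ins->storeAliasSet;
          T.push_back(ins);                  // T ends up newest-first
      }
  }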


Pass 2: compute transitive closure of loop invariant nodes

   let I:Set of LIR* = empty    // the invariant nodes

   for each node in T, working backwards (hence fwds in LIR buf) {
      if node denotes a literal
         I = I union {node}
      else
      if node is a load 
         and intersect( alias-set(node), S ) == empty
         and node's address argument is in I
        then
         I = I union {node}
      else
      if node is an arithmetic node of some kind
         and all of node's arguments are in I
        then
         I = I union {node}
   }

   so now I is the set (or, a conservative subset) of loop invariant
   nodes.  Note that because Pass 1 stopped at the loop-back label, we
   never visit the existing loop pre-header, even though all
   value-producing nodes in the pre-header ought to be added to I too.
   So this loop is a bit too conservative.
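
   And a matching C++ sketch of Pass 2, reusing the toy Ins/Kind types from
   the Pass 1 sketch above (again, invented for illustration, not Nanojit's
   real classes).

  #include <unordered_set>

  static void pass2(const std::vector<Ins*>& T,      // newest-first, from Pass 1
                    uint32_t S,                      // aggregate store alias set
                    std::unordered_set<Ins*>& I)     // out: the invariant nodes
  {
      auto invariant = [&I](Ins* a) { return a == nullptr || I.count(a) != 0; };

      // Scan T back-to-front, i.e. oldest instruction first, so operands are
      // classified before their users.
      for (auto it = T.rbegin(); it != T.rend(); ++it) {
          Ins* ins = *it;
          switch (ins->kind) {
          case Const:
              I.insert(ins);
              break;
          case Load:
              // Invariant if nothing in the loop writes the loaded region
              // and the address operand is itself invariant.
              if ((ins->aliasSet & S) == 0 && invariant(ins->args[0]))
                  I.insert(ins);
              break;
          case Arith:
              if (invariant(ins->args[0]) && invariant(ins->args[1]))
                  I.insert(ins);
              break;
          default:
              break;
          }
      }
  }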


Pass 3: move I into the pre-header.

   Not sure it's possible to rearrange LIR buffer in-place in O(#
   nodes) time.  Hence rearrange while copying into a new LIR
   buffer of the same size.  The copy has to be done in four stages:

   (a) copy the pre-header (all stuff before the loopback label)
   (b) copy nodes in I that aren't already in the pre header
   (c) (now copy the loopback label)
   (d) copy all nodes after the loopback label, that aren't in I

   again using T to facilitate a forwards traversal



Uh, ok.  So this requires some further study on the details of
the pre-header.  

Also, there are conflicting requirements on the representation of I.
We require I to be some kind of set so that we can quickly determine
whether a node is already part of it, in Pass 2.  But Pass 3 (b)
requires I to be array(ish) so that the order in which I's nodes
appear in the pre-header (after the copy) is the same as it was
to start with.
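
One way to reconcile those two requirements is an insertion-ordered set: a hash
set for the O(1) membership tests Pass 2 needs, plus a vector recording
insertion order for Pass 3 (b).  A minimal sketch, again using the toy Ins type
from the sketches above rather than Nanojit's LIns:

  #include <unordered_set>
  #include <vector>

  struct InvariantSet {
      std::unordered_set<Ins*> members;   // O(1) membership tests for Pass 2
      std::vector<Ins*>        order;     // insertion order for Pass 3 (b)

      bool contains(Ins* ins) const { return members.count(ins) != 0; }

      void add(Ins* ins) {
          if (members.insert(ins).second)
              order.push_back(ins);       // recorded once, in first-seen order
      }
  };

Since Pass 2 visits nodes in forward LIR-buffer order, 'order' automatically
ends up matching the nodes' original order, which is exactly what Pass 3 (b)
wants.
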
This bug is motivated by the fact that TM (not sure about TR, but probably) does lots of redundant stuff inside loops.  Eg. in this:

  for (var i = 0; i < N; i++) {
    x += a[i];
  }

various checks (is 'a' an array?  does it have 'dslots' allocated?) are done every time around the loop.  Loop invariant code hoisting will improve this by hoisting the checks outside the loop so they're only done once.

Another approach is to do some static analysis of the loop body to decide if the checks can be omitted altogether.  The above example doesn't have enough context to tell if the checks can be omitted, but if 'a' was accessed as an array just before the loop they probably could be, for example.  This static analysis wouldn't necessarily be at the LIR level -- in TM it would probably be easier at the JS bytecode level.

I'm not sure which of these ideas is better for handling this kind of thing.  There may be room for both of them.
Depends on: 517910
JS is full of annoying memory and control flow ambiguities, due to things such as getters and setters. Linear SSA helps a lot, though, so I'd focus on LIR level, not JS bytecode level -- unless you meant linearized JS bytecode as seen by the TM recorder.

/be
> so I'd focus on LIR level, not JS bytecode level

Yes, pushing optimizations into the level that generates the LIR tends
to make it very complex.  Where I was trying to get to with the
original alias set proposal in bug 517910 was to annotate LIR with
whatever information is needed to do the transformations we really
want.  Then, NJ can optimise the LIR without having to know anything
about JS semantics.  I'm all in favour of modularity in compiler
internals.
I totally agree NJ shouldn't have to know anything about the semantics of the front-end language.  I'm just saying that we should do optimisations wherever they have the best cost/benefit ratio -- that might be in NJ, it might be in the front-end.
Getting back to the general approach...

I think values that get hoisted may also need to be marked as live with a 'live' instruction at the end of the loop.

Hoisting increases register pressure, potentially quite a lot.

Also, this is interesting.  The first example I looked at:

  state = iparam 0 ecx           state

  label1:                        state label1
  sp = ld state[8]               state sp
  cx = ld state[0]               state sp cx
  eos = ld state[12]             eos state sp cx
  ld1 = ld cx[0]                 eos ld1 state sp
  eq1 = eq ld1, 0                eos eq1 state sp
  xf1: xf eq1 -> pc=0x955fc37 imacpc=(nil) sp+0 rp+0 (GuardID=001)
                                 eos state sp

ld1 is JSContext's 'operationCallBack'.  AFAICT this is set when a script has been running for too long and should be aborted.  But if you consider the LIR code in isolation it looks like you can hoist the 'ld1 = ld cx[0]' because there are no stores that can alias with cx[0] elsewhere in the loop.  But really cx[0] can be set by some other code that the LIR/Nanojit doesn't know about, so hoisting it is unsafe.

To handle that properly would require an annotation to say that a load is volatile.  I wonder how many other such loads there are.
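
A tiny sketch of what such an annotation could look like; the names here are
invented for illustration and are not existing Nanojit or TM API:

  #include <cstdint>

  struct LoadDesc {
      uint32_t aliasSet;    // regions this load reads
      bool     isVolatile;  // e.g. the JSContext word in the example above,
                            // which can be written behind the JIT's back
  };

  // A load may seed the invariant set only if it isn't volatile and nothing in
  // the loop writes the region it reads (S is the aggregate store alias set).
  static bool canSeedInvariants(const LoadDesc& ld, uint32_t S)
  {
      return !ld.isVolatile && (ld.aliasSet & S) == 0;
  }
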
Another complication is the fact that LIR pretends to be SSA but really isn't because it has jumps but no phi nodes.  If there are any jumps other than the back-to-the-top-of-the-loop jump things get more complicated.
Also, hoisting guards is tricky.  We have to make sure the VM state is consistent before taking a guard;  this mostly consists of doing stack stores.  I think we can't hoist any guards that are preceded by such stores.  But I think there's also no clear identification of which stores are the make-the-VM-state-consistent stores.
Whiteboard: PACMAN
Target Milestone: --- → Future
Attached patch extremely rough starting patch (obsolete) — Splinter Review
This is an extremely rough patch.  It works on this code:

  function f() {
      var sum = 0.5;
      var a = [1, 1];
      for (var i = 0; i < 1000; i++) {
          sum += a[1]; 
      }
      return sum;
  }
  print(f());

But I haven't really tried it on anything beyond that.  It successfully hoists all the constants (which doesn't make much difference as they are mostly folded into other operations, on x86 at least) and a couple of loads from InterpState.
That's not much but it shows this can work.

There's potential for hoisting more but it'll require alias analysis to do a much better job with the stack, ie. be able to track every stack slot separately.  That way any local variables that are read but not written can be hoisted (eg. 'a' above), which is crucial for this to be effective.

One annoying thing about the patch: when copying/rearranging the instructions into the new buffer, because of the pointer operands I have to maintain a table mapping the old LIns pointers to the new ones, and change operands as they're copied.  It's doable, though.

Another annoying thing is that you end up with lots of LIR_live instructions at the end, which fills up the AR.

I have no idea about the cost of this, the code is extremely rough and I haven't measured it at all.
Attached patch WIP patch, v1 (obsolete) — Splinter Review
Still very rough, but much better than the previous one.  I realised that I can do alias analysis on stack accesses accurately quite easily -- stack loads/stores are already marked with ACC_STACK, and then you can just look at the 'disp' value and treat all disp values as different regions.  So eg. sp[0] is a different region to sp[8], etc.
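
Here's a minimal sketch of that disp-based scheme, written against invented types rather than the real ACC_STACK machinery: each distinct displacement gets its own region bit so it slots into a bitmask alias-set representation.  (A real version would also have to consider access widths, which this toy ignores; and sharing bits once we run out only makes the analysis more conservative, never unsound.)

  #include <cstdint>
  #include <unordered_map>

  struct StackRegions {
      std::unordered_map<int32_t, uint32_t> bitForDisp;
      unsigned nextBit = 0;

      // Map sp[disp] to a region bit; distinct disps get distinct bits until
      // we run out, after which bits are shared (conservative but safe).
      uint32_t regionFor(int32_t disp) {
          auto it = bitForDisp.find(disp);
          if (it == bitForDisp.end())
              it = bitForDisp.emplace(disp, 1u << (nextBit++ % 32)).first;
          return it->second;
      }
  };

  // sp[0] and sp[8] now land in different regions, so a store to sp[0] no
  // longer kills the invariance of a load from sp[8].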

This allows many more values to be found as loop-invariant.  I've been using this code as my test case:

function f() {
    var sum = 0.5;
    var a = [0, 1, 2, 3, 4, 5, 6, 7];
    for (var i = 0; i < 100000000; i++) {
        sum += a[i & 0x7]; 
    }
    return sum;
}
print(f());

It's successfully hoisting everything except the guards.

This increases register pressure a lot.  In particular, handleLoopCarriedExprs() was trying to register-allocate every loop-live value.  This worked fine when there were only one or two (eg. 'state'), but was a disaster when there were a dozen or more.  So I turned that off and things got a lot better.

Despite all the hoisting, I'm only getting a 5% speedup on that loop so far.  This is because many of the hoisted values have to be spilled before the loop starts and thus reloaded within the loop at their use point.  And so far the hoisted values aren't any more expensive to compute than a store.  Hoisting guards will be crucial, I'll do that next.
Assignee: nobody → nnethercote
Attachment #456394 - Attachment is obsolete: true
Status: NEW → ASSIGNED
I've almost got guard-hoisting working, but there's a big snag.  If I hoist a guard I need to generate a new GuardRecord, which involves calling snapshot().  Unfortunately snapshot() depends on and updates various bits of TraceRecorder state (regs->pc, cx->regs, cx->fp) which I think are only correct mid-trace.  

Whereas I need to generate these snapshots once the recording of the fragment is finished, ie. I'm rewriting the final LIR.

Any suggestions?  I'm not sure how to proceed further.  I'm going to need help from someone who understands GuardRecords better than I do (which isn't difficult!)  This is currently a showstopper for a very promising optimisation...
(In reply to comment #10)
> I've almost got guard-hoisting working, but there's a big snag.  If I hoist a
> guard I need to generate a new GuardRecord, which involves calling snapshot(). 
> Unfortunately snapshot() depends on and updates various bits of TraceRecorder
> state (regs->pc, cx->regs, cx->fp) which I think are only correct mid-trace.

Only correct mid-recording, because the interpreter is calling the recorder for each (non-fused) opcode.

> Whereas I need to generate these snapshots once the recording of the fragment
> is finished, ie. I'm rewriting the final LIR.

If you always hoist to a known point before the loop, we could try to snapshot there and save the guard record in the trace recorder, for you to steal away.

IOW, is the hoisting point well-defined and easy to identify from the bytecode at recording time?

/be
(I'm reclassifying this from NJ to TM, because the current code is awfully TM-specific.)


(In reply to comment #11)

> If you always hoist to a known point before the loop, we could try to snapshot
> there and save the guard record in the trace recorder, for you to steal away.

That is the best hack, er, idea I've come up with as well :)

> IOW, is the hoisting point well-defined and easy to identify from the bytecode
> at recording time?

I think so.

Here's some TMFLAGS=recorder output:

    start
    ebx = parami 0 ebx
    esi = parami 1 esi
    edi = parami 2 edi
    state = parami 0 ecx
  **
    About to try emitting guard code for SideExit=0x86d804 exitType=TIMEOUT
    label1:
    sp = ldi.o state[8]
    rp = ldi.o state[24]
    cx = ldi.o state[0]
    eos = ldi.o state[12]
    eor = ldi.o state[28]
    ldi1 = ldi.csro cx[0]
    immi1 = immi 0
    eqi1 = eqi ldi1, immi1/*0*/
    xf1: xf eqi1 -> pc=0x40d566 imacpc=0x0 sp+0 rp+0 (GuardID=001)

  00022:  27  trace
  00023:  28  bindname "bitwiseAndValue"
  ...

I'm hoisting code to the "**" point, just before the loop header label.  I find the interleaving of bytecodes and LIR given by TMFLAGS=recorder very confusing, but I think that point corresponds to the TRACE bytecode.  If I do a speculative snapshot just before emitting the label I think/hope it'll work.

This is assuming that I don't hoist any stores.  AFAICT stores are the important part when it comes to connecting bytecode to LIR.  For example, for an ADD bytecode you'll have something like this LIR:

  x = ldd.s sp[0]
  y = ldd.s sp[8]
  z = addd x, y
  sp[0] = std.s z

It's the final store that commits the result, so you can hoist the two loads and the add (assuming they're loop invariant... sp[0] clearly isn't, but that's beside the point) without changing the meaning of anything.  What this means is that you can end up with lots of hoisted loads and arithmetic ops in the loop pre-header, which correspond to lots of bytecodes within the loop, but because the stores haven't moved the entire pre-header still corresponds to the TRACE bytecode.

If I start hoisting stores (I'm not sure if that's a good idea/worthwhile yet) then I'll need to make sure that any hoisted guards precede any hoisted stores in the pre-header, else the speculatively snapshotted GuardRecord will be wrong.

That's a lot of hand-waving, hopefully it's correct.  I'll try it on Monday.
Summary: NJ: loop invariant code hoisting → TM: loop invariant code motion (LICM)
Component: Nanojit → JavaScript Engine
QA Contact: nanojit → general
Target Milestone: Future → ---
(In reply to comment #12)
> (I'm reclassifying this from NJ to TM, because the current code is awfully
> TM-specific.)

Bummer -- I saw the bug description and thought that TR would get this for free :-) [Still, if this pans out I'm guessing we'll try adapting it for TR...]
(In reply to comment #13)
> 
> Bummer -- I saw the bug description and thought that TR would get this for free
> :-) [Still, if this pans out I'm guessing we'll try adapting it for TR...]

I'd be happy if it worked for TR as well, but I have no idea what TR code tends to look like, and thus how applicable this is.
TR-generated code is generally a plain old linearized control flow graph in the same order that our ABC bytecodes were generated.  I am guessing, without looking, that the TM trace recorder is identifying loops and therefore where expressions can be hoisted to.  TR would have to do its own analysis, but in an ideal world, once that was done, we could reuse some NJ infrastructure to do the hoisting.  (I freely admit the world is not always ideal.)

> One annoying thing about the patch is When copying/rearranging the instructions
> into the new buffer, because of the pointer operands I have to maintain a table
> mapping the old LIns pointers to the new ones, and change operands as they're
> copied.  It's doable, though.

I've run into this during experiments too.  Are you using a map indexed by LIns* or something else?  I was thinking that if we think of a LIR->LIR transformer as a LIR backend, then it could be sensible to add a new entry to the union containing SharedFields for an ID field.  This could be used to make an array that maps old LIns to new LIns during a full rewrite.

> Another annoying thing is that you end up with lots of LIR_live instructions at
> the end, which fills up the AR.

yeah, hoisting increases register/stack pressure, fact of life.
(In reply to comment #15)
> TR generated code is generally a plain old linearized control flow graph in the
> same order that our ABC bytecodes were generated.  I am guessing without
> looking, that the TM trace recorder is identifying loops and therefore where
> expressions can be hoisted to.  TR would have to do its own analysis, but in
> ideal world once that was done we could reuse some NJ infrastructure to do the
> hoisting.  (freely admit the world is not always ideal).

Yes, it might be possible.

> I've run into this during experiments too.  Are you using a map indexed by
> LIns* or something else?  I was thinking that If we think of a LIR->LIR
> transformer as a LIR backend, then it could be sensible to add a new entry to
> the union containing SharedFields, for an ID field. this could be used to make
> an array that maps old LIns to new LIns during a full rewrite.

I'm just using HashMap<LIns*,LIns*>, see "OldNewTable" in the patch.  As for the array, we can generate large loops in TM -- crypto-md5 in SunSpider has one with more than 10,000 instructions in it -- which might make an array map problematic.  Besides, I need to demonstrate good speed-ups on the generated code before worrying about reducing compile-time...
> 
> I'm just using HashMap<LIns*,LIns*>, see "OldNewTable" in the patch.

This works fine in most cases because def-points always precede use-points for data values.  But that isn't true for control flow -- for forward jumps a label is effectively used before it's def'd.  So I'll need an extra table that lets me go back and patch jumps when their target labels are copied.  Sigh, what a hassle.
Depends on: 552812
I found a way to avoid the hashtable lookup for converting operand pointers from old-to-new -- when an instruction is copied I overwrite the old copy with a pointer to the new copy.  This makes subsequent old-to-new lookups O(1).  It's a bit of a hack, but it reduces the overhead for crypto-md5 from 1.30x to 1.10x so I'll take it.
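
The trick in toy form (the real code does this to Nanojit LIns in place; these structs are invented for illustration): once an instruction has been copied, its old storage is reused to hold a forwarding pointer, and since defs are copied before their uses the pointer is always there by the time an operand needs translating.  (Forward branches to labels still need the separate patch table from comment 19.)

  #include <cassert>

  struct NewIns;                  // instruction in the new buffer

  struct OldIns {                 // instruction in the old buffer
      bool    copied  = false;
      NewIns* forward = nullptr;  // valid once 'copied' is set
      // ... original opcode/operand fields ...
  };

  static void noteCopied(OldIns* o, NewIns* n) {
      o->forward = n;             // clobber the old instruction's payload
      o->copied  = true;
  }

  static NewIns* translateOperand(OldIns* o) {
      assert(o->copied);          // data defs precede uses, so this holds
      return o->forward;          // O(1), no hash lookup
  }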

Still lots more to do, but alias analysis is going to have to improve further so more instructions can be marked as loop-invariant.  Bug 552812 is the first step for that and I just posted patches for it.
Attachment #456535 - Attachment is obsolete: true
Ah, the old leave-a-forwarding-address trick.

I keep telling you hacks are good things :-P. Great work here and in bug 552812.

/be
Blocks: 579285
Depends on: 584279
New version.  I found that register pressure was being increased a lot -- eg. it could hoist a dozen or more values that are used in the loop.  So I made the following changes:

- Only hoist invariant guards (and everything they depend on).

- If a hoisted value is used in both the loop pre-header and the loop body,
  I copy it twice, once in the pre-header and once in the body.  The idea is
  to not add any additional register pressure across the loop boundary.

Perf is something of a wash for SS -- some get better, some get worse (mostly due to compile-time, esp. crypto-md5 which is still 1.10x slower).
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|   111.182   111.939 (0.993x) |    20.798    20.817 (0.999x) | 3d-cube
|    56.986    57.038 (0.999x) |    22.313    22.314 (------) | 3d-morph
|   147.869   148.886 (0.993x) |    21.982    22.887 (0.960x) | 3d-raytrace
|    77.551    77.482 (1.001x) |    17.401    17.400 (------) | access-binary-trees
|   183.724   183.858 (0.999x) |   103.707   103.709 (------) | access-fannkuch
|    41.567    42.080 (0.988x) |    17.553    17.759 (0.988x) | access-nbody
|    60.019    60.037 (------) |    27.013    27.013 (------) | access-nsieve
|    15.687    14.833 (1.058x) |     2.862     1.969 (1.453x) | bitops-3bit-bits-in-byte
|    43.306    43.316 (------) |    30.280    30.281 (------) | bitops-bits-in-byte
|    22.773    22.775 (------) |    10.219    10.219 (------) | bitops-bitwise-and
|    71.623    71.644 (------) |    39.453    39.453 (------) | bitops-nsieve-bits
|    42.397    42.363 (1.001x) |    26.655    26.654 (------) | controlflow-recursive
|    47.105    52.594 (0.896x) |     5.541     5.657 (0.979x) | crypto-md5
|    31.332    31.460 (0.996x) |     7.030     7.032 (------) | crypto-sha1
|   127.749   127.150 (1.005x) |    10.457     9.846 (1.062x) | date-format-tofte
|    99.146    99.183 (------) |    11.118    11.116 (------) | date-format-xparb
|    52.625    52.749 (0.998x) |    27.614    27.715 (0.996x) | math-cordic
|    38.424    38.357 (1.002x) |     6.252     6.065 (1.031x) | math-partial-sums
|    28.581    28.517 (1.002x) |    10.286     9.944 (1.034x) | math-spectral-norm
|   100.938   100.952 (------) |    75.045    75.045 (------) | regexp-dna
|    44.074    43.726 (1.008x) |     8.098     7.626 (1.062x) | string-base64
|   123.879   123.184 (1.006x) |    24.544    24.545 (------) | string-fasta
|   272.585   274.830 (0.992x) |     6.839     6.775 (1.009x) | string-tagcloud
|   251.380   251.376 (------) |     5.637     5.636 (------) | string-unpack-code
|    64.500    64.405 (1.001x) |     7.401     7.175 (1.031x) | string-validate-input
-------
|  2157.013  2164.746 (0.996x) |   546.111   544.666 (1.003x) | all
(In reply to comment #20)
> Perf is something of a wash for SS -- some get better, some get worse (mostly
> due to compile-time, esp. crypto-md5 which is still 1.10x slower).

That's a bit disappointing for something which probably took a lot of
futzing around to make work, I'd guess.

I was wondering if d{vander,mandelin}'s work to tune the JM/tracer
integration (bug 580468, excellent stuff btw) might tilt the field in
your favour.  As I read it, that might result in fewer absolutely huge
loops being traced, which would help offset the compile time losses
you're getting here, and might help reduce the extra register
pressure.
(In reply to comment #22)
> 
> I was wondering if d{vander,mandelin}'s work to tune the JM/tracer
> integration (bug 580468, excellent stuff btw) might tilt the field in
> your favour.  As I read it, that might result in fewer absolutely huge
> loops being traced, which would help offset the compile time losses
> you're getting here, and might help reduce the extra register
> pressure.

I'm currently avoiding the register pressure issue fairly well, but yes, this'll be worth looking at again once all the current big changes have landed.  One issue is that we don't spend that much time in trace code -- it's about 25% of total SunSpider time, for example, and in V8 the proportion is even less.  

I could also make it stop if the number of instructions hoisted isn't high enough -- eg. for the main loop of crypto-md5 it hoists 4 instructions out of 19,000+ which isn't exactly worth it.  But then, even working out those numbers requires a couple of passes over the code.
This version is much closer to something landable.  Notable things:

- When hoisting is done in a fragment, instead of rewriting the entire
  fragment, which is difficult and slow, we now insert the hoisted
  instructions by using LIR_skip instructions to trampoline to an
  out-of-line sequence and then back.  This also means we don't need to do
  any forward passes, so there's no need to build the vector of LIns
  pointers.
  
  crypto-md5 now only has a 9% instruction count increase when compiled with
  -j, and it's a pathologically bad case for LICM, having one loop with
  36,000 LIR instructions of which only two are hoisted, combined with a
  very short running time that means compile-time increases really hurt.
  Other than crypto-md5, the worst instruction count increase is 1% for all
  of Sunspider when running with -j (running with -j -m -p the increases are
  much less because less code is traced).

  (Full numbers are below.)

- Most of the new code is in two new files, tracejit/LICM.{cpp,h}.

- Kraken results aren't bad, though not as good as I'd like given how much
  effort went into the patch.  With -j the timing improvement is about
  1.03x, mostly due to big wins in ai-astar and audio-dft.  With -j -m -p
  the results are similar except that ai-astar currently doesn't trace (bug
  606890 is open to fix this) so the improvement is less.

- The one big question mark over this patch is the fact that it reorders
  guards.  It's not clear to me that this is safe in general (although
  jit-tests are all passing).  For example, if a guard that when taken
  causes an exception to be thrown is moved earlier than another guard, the
  exception behaviour of the program may change.

  This question needs to be resolved before this patch can be considered for
  landing.


Sunspider with -j (this is mostly a measure of how much extra compile-time
LICM incurs):
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|    76.997    77.310 (0.996x) |    22.622    22.628 (------) | 3d-cube
|    38.279    38.367 (0.998x) |    23.129    23.131 (------) | 3d-morph
|   167.503   168.298 (0.995x) |    15.688    16.068 (0.976x) | 3d-raytrace
|   130.658   130.662 (------) |    537758    537875 (------) | access-binary-
|    83.818    83.768 (1.001x) |    74.571    74.317 (1.003x) | access-fannkuc
|    27.568    27.585 (0.999x) |    15.193    14.933 (1.017x) | access-nbody
|    30.483    30.171 (1.010x) |    23.624    23.277 (1.015x) | access-nsieve
|     7.005     6.136 (1.142x) |     2.858     1.965 (1.454x) | bitops-3bit-bi
|    34.364    34.380 (------) |    30.097    30.097 (------) | bitops-bits-in
|    14.069    14.074 (------) |    10.215    10.215 (------) | bitops-bitwise
|    36.272    34.723 (1.045x) |    31.217    29.621 (1.054x) | bitops-nsieve-
|   149.555   149.552 (------) |         0         0 (------) | controlflow-re
|    32.863    35.910 (0.915x) |     4.284     4.368 (0.981x) | crypto-md5
|    19.040    19.158 (0.994x) |     6.269     6.272 (------) | crypto-sha1
|    99.440    97.823 (1.017x) |    11.624    10.159 (1.144x) | date-format-to
|    69.238    69.295 (0.999x) |     9.548     9.547 (------) | date-format-xp
|    42.243    42.843 (0.986x) |    28.678    29.254 (0.980x) | math-cordic
|    22.397    22.315 (1.004x) |     6.158     6.002 (1.026x) | math-partial-s
|    21.533    21.468 (1.003x) |    13.190    12.938 (1.019x) | math-spectral-
|    48.510    48.515 (------) |    34.578    34.578 (------) | regexp-dna
|    27.459    27.486 (0.999x) |     9.026     9.026 (------) | string-base64
|    84.275    84.329 (0.999x) |    23.421    23.422 (------) | string-fasta
|   141.033   141.056 (------) |    12.243    12.243 (------) | string-tagclou
|   123.671   123.519 (1.001x) |     7.663     7.664 (------) | string-unpack-
|    42.286    42.316 (0.999x) |     8.212     8.213 (------) | string-validat
-------
|  1570.572  1571.069 (------) |   424.678   420.503 (1.010x) | all


Sunspider with -j -m -p:
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|    68.494    68.649 (0.998x) |    43.768    43.771 (------) | 3d-cube
|    39.728    39.788 (0.998x) |    24.716    24.718 (------) | 3d-morph
|    65.557    65.555 (------) |    37.044    37.044 (------) | 3d-raytrace
|    24.407    24.403 (------) |    11.010    11.010 (------) | access-binary-
|    88.397    88.394 (------) |    83.306    83.306 (------) | access-fannkuc
|    28.390    28.258 (1.005x) |    16.264    16.008 (1.016x) | access-nbody
|    35.169    35.166 (------) |    28.735    28.735 (------) | access-nsieve
|     7.524     6.656 (1.130x) |     3.255     2.363 (1.378x) | bitops-3bit-bi
|    35.393    35.391 (------) |    30.400    30.400 (------) | bitops-bits-in
|    15.917    15.923 (------) |    12.019    12.019 (------) | bitops-bitwise
|    38.143    36.582 (1.043x) |    32.966    31.370 (1.051x) | bitops-nsieve-
|    17.137    17.134 (------) |    12.875    12.875 (------) | controlflow-re
|    24.011    24.015 (------) |    11.836    11.836 (------) | crypto-md5
|    19.922    20.035 (0.994x) |     8.528     8.531 (------) | crypto-sha1
|    67.306    67.310 (------) |    21.026    21.026 (------) | date-format-to
|    63.581    63.578 (------) |     9.945     9.945 (------) | date-format-xp
|    43.648    44.242 (0.987x) |    29.513    30.086 (0.981x) | math-cordic
|    22.939    22.852 (1.004x) |     6.310     6.154 (1.025x) | math-partial-s
|    31.372    31.390 (0.999x) |    26.189    26.189 (------) | math-spectral-
|    48.370    48.362 (------) |    34.580    34.580 (------) | regexp-dna
|    28.804    28.830 (0.999x) |     9.277     9.277 (------) | string-base64
|    63.958    63.966 (------) |    32.064    32.064 (------) | string-fasta
|   103.304   103.182 (1.001x) |    17.211    17.211 (------) | string-tagclou
|    97.861    97.671 (1.002x) |    12.829    12.829 (------) | string-unpack-
|    43.169    43.203 (0.999x) |     8.573     8.573 (------) | string-validat
-------
|  1122.513  1120.548 (1.002x) |   564.251   561.933 (1.004x) | all


Kraken with -j (this is a best-case measurement for LICM):
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|  3217.152  2646.218 (1.216x) |  3001.706  2430.546 (1.235x) | ai-astar
|  1860.749  1870.246 (0.995x) |  1536.491  1545.498 (0.994x) | audio-beat-det
|  1257.911   985.993 (1.276x) |  1098.428   826.386 (1.329x) | audio-dft
|  1685.241  1664.531 (1.012x) |  1225.797  1204.295 (1.018x) | audio-fft
|  2392.974  2367.852 (1.011x) |  1588.545  1563.964 (1.016x) | audio-oscillat
|  5871.686  5897.334 (0.996x) |  4693.845  4719.372 (0.995x) | imaging-gaussi
|  2327.179  2326.063 (------) |   739.492   737.894 (1.002x) | imaging-darkro
|  4927.677  4969.903 (0.992x) |  3752.846  3795.033 (0.989x) | imaging-desatu
|   659.733   660.350 (0.999x) |    10.214    10.214 (------) | json-parse-fin
|   455.590   455.667 (------) |     5.148     5.148 (------) | json-stringify
|  1547.472  1541.430 (1.004x) |   599.730   594.972 (1.008x) | stanford-crypt
|   867.638   857.053 (1.012x) |   319.781   313.732 (1.019x) | stanford-crypt
|  1133.817  1131.367 (1.002x) |   609.569   606.929 (1.004x) | stanford-crypt
|   555.423   552.587 (1.005x) |   199.499   196.206 (1.017x) | stanford-crypt
-------
| 28760.248 27926.599 (1.030x) | 19381.099 18550.196 (1.045x) | all


Kraken with -j -m -p:
---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|  4198.443  4198.522 (------) |  3907.436  3907.436 (------) | ai-astar
|  1979.742  1960.657 (1.010x) |  1353.759  1334.330 (1.015x) | audio-beat-det
|  1302.076  1025.964 (1.269x) |  1135.932   859.698 (1.321x) | audio-dft
|  1709.187  1688.923 (1.012x) |  1248.676  1227.172 (1.018x) | audio-fft
|  2585.718  2585.346 (------) |  1778.296  1778.290 (------) | audio-oscillat
|  6973.203  6999.127 (0.996x) |  4784.010  4809.827 (0.995x) | imaging-gaussi
|  3337.549  3334.614 (1.001x) |   748.425   745.135 (1.004x) | imaging-darkro
|  6017.814  6060.041 (0.993x) |  3836.515  3878.702 (0.989x) | imaging-desatu
|   659.792   660.409 (0.999x) |    10.221    10.221 (------) | json-parse-fin
|   496.092   496.164 (------) |     5.932     5.932 (------) | json-stringify
|  1288.445  1272.218 (1.013x) |   746.812   746.486 (------) | stanford-crypt
|   704.767   699.804 (1.007x) |   361.605   356.196 (1.015x) | stanford-crypt
|  1153.845  1144.825 (1.008x) |   585.183   576.007 (1.016x) | stanford-crypt
|   528.984   525.988 (1.006x) |   199.318   196.024 (1.017x) | stanford-crypt
-------
| 32935.664 32652.608 (1.009x) | 20702.127 20431.463 (1.013x) | all
Attachment #459330 - Attachment is obsolete: true
Attachment #463072 - Attachment is obsolete: true
Comment on attachment 491065 [details] [diff] [review]
patch, v4 (against TM 57130:bc000c1509ac)

Sayre has expressed interest in getting this into the tree, even if it's turned off for Firefox 4.0.  So I'm asking for reviews.

Ed, I'm asking you for feedback about just the LIR.h/LIR.cpp changes.  Most notably, the whole idea of using LIR_skip to trampoline to out-of-line chunks of code.  If we start allowing this it will preclude us ever changing our LIR operands from pointers to offsets.  That will preclude bug 578264, for one.  I'm ok with this -- I think the out-of-line trampolines could end up being used quite extensively, as they're a low-cost way to rewrite LIR that's already been written into a buffer -- but I'd like your feedback.
Attachment #491065 - Flags: review?(gal)
Attachment #491065 - Flags: feedback?(edwsmith)
Attachment #491065 - Flags: review?(jseward)
Comment on attachment 491065 [details] [diff] [review]
patch, v4 (against TM 57130:bc000c1509ac)

Commentary, musing, not for/against the patch:

I've been tempted several times to stuff extra info into SharedFields,
so I like the getExtra/setExtra() tweak.  I always lament the fact that
on x64, a whole machine word is there, sitting idle (maybe we could use it,
but pragmatically, we can't).

Tamarin's SeqReader class (CodegenLIR.cpp) is used to stitch together
two LIR buffers that we generate simultaneously.  One is the prologue.
If we could use LIR_skip to connect them, we can drop the reader and
save one virtual call per emitted LIR instruction (real costs that add up).

We also (still) use LIR_skip to erase instructions;  much as I'd like to 
clean this up, the cleanup is to incorporate dead-store elimination in-line with
the assembly pass, which ends up looking remarkably similar to StackFilter.
The fact that it hasn't been worth the work to do it yet is a data point
when considering the fate of LIR_skip.  It seems genuinely useful, is all.

(I'm really off topic now, please forgive me).  

Heck, if getExtra/setExtra wasn't used for LICM, then could it be used to
hold a sequence number, which then could be used to optimize CSE lua-style?

A sequential id in each LIns could be useful several ways... lua-style cse, and maybe also for making fast-to-index maps using LIns->id as a key, instead of a hashtable using LIns* as a key.  It would burn one word on 32bit machines, and fit inside wasted padding space on 64bit machines.

Replacing the whole SharedFields struct with a 32-bit id (or 8-bit opcode + 24-bit id) would a) give us a sequence number, and b) let stages build arrays indexed by LIns->id, for their own data.  (24 bits is enough: "16M instructions is enough for anybody".)  Things like inReg/inAr could pack into bitsets if desired, arIndex could be bigger if desired, and so on.  The extra load per access (arIndex[lins->id] vs lins->arIndex) could be a deal killer, but maybe not.

I've also experimented somewhat with a traditional CFG that holds
lists of LIns*, then a Reader that feeds them to Assembler in the required
order.  Nothing worth landing, but it makes me think that the super-tight
LuaJIT-style IR (integer offsets, contiguous instructions) is not in our future.
Emitting code with some extra LIR_skips here and there would allow reordering
blocks or inserting code somewhat arbitrarily, without extra buffer copying
or chained LirReaders.

Nitty things about the patch:

Is LIR_comment supposed to be in there or is that an accident?   (okay with me either way, it looks useful).

I take it we should update Tamarin to use overwriteWithSkip().

I skimmed LICM.cpp/h but didn't go looking for bugs.

A comment on arIndex's definition that it's invalid during LIR generation, and only holds a real arIndex when inAr==1, would be a good idea.  Should get/setExtra assert when inAr==1?
Attachment #491065 - Flags: feedback?(edwsmith) → feedback+
(In reply to comment #26)
> 
> Heck, if getExtra/setExtra wasn't used for LiCM, then could it be used to
> hold a sequence number, which then could be used to optimize CSE lua-style?
> 
> A sequential id in each LIns could be useful several ways... lua-style cse, and
> maybe also for making fast-to-index maps using LIns->id as a key, instead of a
> hashtable using LIns* as a key.  It would burn one word on 32bit machines, and
> fit inside wasted padding space on 64bit machines.

The 24 non-opcode bits in SharedFields are only used during assembly.  So we could use them for a sequential id in any of the other passes.  The fast-to-index maps are tricky, though, because you don't know how big to make them in the writer pipeline.

(In reply to comment #26)
> Is LIR_comment supposed to be in there or is that an accident?   (okay with me
> either way, it looks useful).

Not sure what you mean, can you clarify?
 
> A comment on arIndex's definition that its invalid during LIR generation, and
> only holds a real arIndex when inAr==1, would be a good idea.  Should
> get/setExtra assert when inAr==1?

Good idea.
> > Is LIR_comment supposed to be in there or is that an accident?   (okay with me
> > either way, it looks useful).
> 
> Not sure what you mean, can you clarify?

When I first glanced over the patch, it looked as though it included both the LICM code and the new LIR_comment opcode, which I remembered from a different patch, and wondered if they got lumped together by accident.  Looking again, that's not the case -- you just added LIns.getComment(), which is no big deal.
(In reply to comment #24)
> 
> - The one big question mark over this patch is the fact that it reorders
>   guards.  It's not clear to me that this is safe in general (although
>   jit-tests are all passing).  For example, if a guard that when taken
>   causes an exception to be thrown is moved earlier than another guard, the
>   exception behaviour of the program may change.
> 
>   This question needs to be resolved before this patch can be considered for
>   landing.

http://article.gmane.org/gmane.comp.lang.lua.general/58908 says this:

  Code hoisting via unrolling and copy-substitution (LOOP):
  Traditional loop-invariant code motion (LICM) is mostly useless
  for the IR resulting from dynamic languages. The IR has many
  guards and most subsequent instructions are control-dependent on
  them. The first non-hoistable guard would effectively prevent
  hoisting of all subsequent instructions.

It then describes what LuaJIT2 does instead:

  The LOOP pass does synthetic unrolling of the recorded IR,
  combining copy-substitution with redundancy elimination to
  achieve code hoisting. The unrolled and copy-substituted
  instructions are simply fed back into the compiler pipeline,
  which allows reuse of all optimizations for redundancy
  elimination. Loop recurrences are detected on-the-fly and a
  minimized set of PHIs is generated.

which I don't understand at all.
When I read that I thought he was talking about loop peeling; make one copy of the loop body, and put it first, so it dominates the loop body, and redundant expressions in the body get deleted.

Back on TT, one way I thought about doing it was to keep the trace recorder recording for one extra loop iteration.  The code fragment is shaped like a little 'b' with a longer stem.  I never got to it, and I'd imagine it would really hurt compile time for long loop bodies (crypto?)
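
To make the peeling idea concrete, here is a purely source-level C++ illustration (hypothetical code, not what LuaJIT or TM actually emit): once one iteration is copied in front of the loop, the guard inside the loop is dominated by the peeled copy, so ordinary redundancy elimination can delete it without a separate LICM pass.

  #include <cstdlib>

  struct Obj { bool isDenseArray; const double* elems; };

  static bool guardDenseArray(const Obj* a) { return a->isDenseArray; }
  static void sideExit()                    { std::abort(); }  // stand-in for leaving the trace

  // Before peeling: the guard runs on every iteration.
  static double sumBefore(const Obj* a, int n) {
      double sum = 0.0;
      for (int i = 0; i < n; i++) {
          if (!guardDenseArray(a)) sideExit();
          sum += a->elems[i];
      }
      return sum;
  }

  // After peeling one iteration: the guard in the steady-state loop is
  // dominated by the peeled copy, so redundancy elimination removes it.
  static double sumAfter(const Obj* a, int n) {
      double sum = 0.0;
      if (n > 0) {
          if (!guardDenseArray(a)) sideExit();   // peeled, runs once
          sum += a->elems[0];
          for (int i = 1; i < n; i++)
              sum += a->elems[i];                // no guard here any more
      }
      return sum;
  }
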
(In reply to comment #30)
> When I read that I thought he was talking about loop peeling; make one copy of
> the loop body, and put it first, so it dominates the loop body, and redundant
> expressions in the body get deleted.

Ah, that makes sense.  Why couldn't he have just said that? :)

> Back on TT, one way I thought about doing it was to keep the trace
> recorder recording for one extra loop iteration.  The code fragment is shaped
> like a little 'b' with a longer stem.  I never got to it, and I'd imagine it
> would really hurt compile time for long loop bodies (crypto?)

Just about anything extra done by the compiler hurts compile-time of crypto-md5, it's such a pathological case.  E.g. LICM causes a 9% instruction count increase.
Blocks: 622494
Attachment #491065 - Flags: review?(jseward)
Attachment #491065 - Flags: review?(gal)
With JM can we re-try this?
(In reply to comment #32)
> With JM can we re-try this?

Comment 24 already did so.  It's maybe a 1% speedup for Kraken.  Combine that with (a) the fact that I'm not confident the transformation is valid, and (b) the spectre of IonMonkey coming over the horizon and trampling TraceMonkey into the dust, and my motivation to work on this further is limited.
TM's days are numbered:  WONTFIX.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX