Closed
Bug 527754
Opened 15 years ago
Closed 15 years ago
CseFilter not able to handle downstream modification of instructions
Categories
(Core Graveyard :: Nanojit, defect)
Core Graveyard
Nanojit
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rreitmai, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)
Attachments
(1 file, 1 obsolete file)
10.26 KB,
patch
|
Details | Diff | Splinter Review |
The new cse filter as introduced by bug 502778 subtly changed behaviour for downstream filters. It is no longer possible to return a different instruction from a filter than the one received; e.g SoftFloatFilter input=call output=qjoin. As an example, the following code from CseFilter::insCall() expects that out->insCall() returns an ins where isCall() is true. return exprs->add(LInsCall, out->insCall(ci, args), k);
Assignee | ||
Comment 1•15 years ago
|
||
(In reply to comment #0) > > It is no longer possible to return a different instruction from a filter than > the one received; e.g SoftFloatFilter input=call output=qjoin. That's almost right... it's possible to return a different instruction so long as it has the same LInsHashKind. But in the example case, LIR_qjoin has kind LIns2 instead of LInsC so the qjoin is incorrectly put into the LInsC hashtable. Ugh. This wasn't intentional, sorry. Thanks for the diagnosis, I never would have worked that out with Tracemonkey because it doesn't do that kind of optimisation AFAIK. I can see a fix that involves an opcode-to-hashkind lookup table which would change the code above to this: LIns* ret = out->insCall(ci, args); exprs->add(hashkind[ret->opcode()], ret, k); return ret; Either we do that or we have to revert to a single hash table for all instructions. I think the opcode-to-hashkind table sucks less. And I might be able to remove LInsHashKind by using LInsRepKind... I'll take a look.
Assignee: nobody → nnethercote
Assignee | ||
Comment 2•15 years ago
|
||
I've determined that this problem existed even before the patch for bug 502778 was landed. Consider this function from before the landing: LIns* CseFilter::insImm(int32_t imm) { uint32_t k; LInsp found = exprs.find32(imm, k); if (found) return found; return exprs.add(out->insImm(imm), k); } find32() looks up the hash table, returning a previous identical instruction (if found) or NULL (if not). 'k' is a reference argument that gets set to the index where the instruction was found, or the empty slot where it wasn't found and thus should be inserted. In the not-found case, add() then sticks the result of 'out->insImm(imm)' directly into that slot without rehashing. This assumes that 'out->insImm(imm)' belongs in the same slot, ie. that no downstream stages have changed the instruction. So if SoftFloatFilter changes a LIR_call to a LIR_qjoin, that assumption is violated. So it's always been a problem -- CseFilter doesn't correctly handle downstream filters changing instructions -- it just hasn't been noticed. Probably what was happening pre-502778 was that the LIR_qjoin was being put into the wrong place in the hash table, which means that any subsequent identical LIR_qjoin wasn't being CSEd... unless that identical LIR_qjoin had been transformed from a correspondingly identical LIR_call(!) It probably worked most/all of the time because the bug occurred consistently and in the same direction (if you see what I mean). With the 502778 landing the problem was made more obvious because the transformed instructions were not just being put in the wrong slot, but instead in the wrong hashtable. The obvious fix is, in the not-found case, to rehash the result of 'out->insImm(imm)' when adding it. Unfortunately, this will probably cause a non-negligible slowdown as the cost of hashing is already non-negligible, and this fix would almost double the amount of hashing. And it's annoying to have to do, because downstream transformations are extremely rare and so the rehash will almost always give the same value as the hash. Another possibility is to check that the instruction hasn't been transformed downstream, something like this: LIns* CseFilter::insImm(int32_t imm) { uint32_t k; LInsp found = exprs.find32(imm, k); if (found) return found; LInsp new = out->insImm(imm); if (new->opcode == LIR_int && new->imm32() == imm) return exprs.add(new, k); // put straight into slot k else return exprs.add(new); // rehash before adding } The equality test is probably faster than a rehash, but it still sucks. I'm not sure what to do. CSE is a black art, and the writer pipeline's behaviour is a lot more subtle and complex than it first appears. I can backout 502778 and the follow-up 527288 if people think that is a good idea. That at least gets us back to a better-behaving buggy version.
Reporter | ||
Comment 3•15 years ago
|
||
I'm comfortable disallowing downstream transforms, but it would be good to have a mechanism in place to enforce it. I suppose one option is to introduce a filter whose sole purpose is to ensure this behaviour. We'd then simply place it upstream of the CseFilter; probably for debug builds only.
Assignee | ||
Comment 4•15 years ago
|
||
Ah, that makes things easier. An extra pass is overkill, we can just do this: LIns* CseFilter::insImm(int32_t imm) { uint32_t k; LInsp found = exprs.find32(imm, k); if (found) return found; LInsp new = out->insImm(imm); NanoAssert(new->opcode == LIR_int && new->imm32() == imm); return exprs.add(new, k); }
Reporter | ||
Comment 5•15 years ago
|
||
OK, works for me. The advantage of an external filter though is that it won't be tied to cse-only thus allowing it to be re-used in alternate situations where a similar issue may arise. But either approach is fine by me.
Assignee | ||
Comment 6•15 years ago
|
||
This patch adds assertions that check that filters downstream from CseFilter don't change any instructions, because CseFilter assumes they don't. It also changes the signature of findImmf() in order to minimise the amount of uint64_t/double type-punning required. (I didn't use an external filter as comment 5 suggests; in-CseFilter assertions are less work for now, it can be added if it's warranted in the future.) Rick, Brendan: this means that SoftFloatFilter can no longer be downstream from CseFilter. Is this a problem for TR/TM? I don't have patches for TR/TM. AFAIK SoftFloatFilter is broken for TM anyway.
Attachment #412787 -
Flags: review?(rreitmai)
Assignee | ||
Updated•15 years ago
|
Attachment #412787 -
Flags: review?(brendan)
Comment 7•15 years ago
|
||
Comment on attachment 412787 [details] [diff] [review] patch graydon could review too, I'm gonna duck on account of my TM bugs. /be
Attachment #412787 -
Flags: review?(brendan) → review?(gal)
Reporter | ||
Updated•15 years ago
|
Attachment #412787 -
Flags: review?(rreitmai) → review+
Reporter | ||
Comment 8•15 years ago
|
||
Comment on attachment 412787 [details] [diff] [review] patch Looks good and I already moved the filter upstream in TR.
Comment 9•15 years ago
|
||
Comment on attachment 412787 [details] [diff] [review] patch Nice. Please switch the order of the filters in TM. Duping a bug I just filed against this.
Attachment #412787 -
Flags: review?(gal) → review+
Assignee | ||
Comment 10•15 years ago
|
||
Actually, this one depends on 531165 -- if we commit this patch first then platforms using SoftFloatFilter will hit the assertions in this patch. We need to switch the pass order first.
Assignee | ||
Comment 11•15 years ago
|
||
http://hg.mozilla.org/projects/nanojit-central/rev/8d7d6dcce7eb
Whiteboard: fixed-in-nanojit
Assignee | ||
Comment 12•15 years ago
|
||
Backed out due to ARM and WinNT bustage: http://hg.mozilla.org/projects/nanojit-central/rev/2d17a9b2e8da
Assignee | ||
Comment 13•15 years ago
|
||
This patch fixes the WinNT bustage, whereby a uint64_t arg was being passed to hashImmf() which expected a double. Not sure why Linux GCC didn't complain about that. But the ARM problem remains. This bug is all about forbidding anything downstream of CseFilter from modifying instructions. But LirBufWriter::insLoad() does exactly that -- since bug 527178's patch went in, insLoad() can change a load's base by adding a LIR_piadd to it. The same thing is true of LirBufWriter::insStorei(). Argh. It seems like changing the load base should happen further upstream, that LirBufWriter shouldn't be changing instructions as its putting them in the buffer. Graydon, any thoughts?
Attachment #412787 -
Attachment is obsolete: true
Updated•15 years ago
|
Component: JIT Compiler (NanoJIT) → Nanojit
Product: Tamarin → Core
QA Contact: nanojit → nanojit
Comment 14•15 years ago
|
||
(In reply to comment #13) > This bug is all about forbidding anything > downstream of CseFilter from modifying instructions. This limitation is going to be a problem in the long run, IMHO... it effectively means that we can't do peephole optimizations downstream of cse, and trying to do them upstream seems to be painful (if not impossible) without some elaborate machinery to return faux LInsp values (so that things upstream of said peephole filter still "see" the instructions they expect). Granted, this is probably more a general issue with the reader/writer architecture we have, which really seems to be optimized for single-instruction-at-a-time optimizations; adding optimizations that are N-to-1 transformations seems to be quite painful.
Assignee | ||
Updated•15 years ago
|
Whiteboard: fixed-in-nanojit
Assignee | ||
Comment 15•15 years ago
|
||
TM merge (and backout): http://hg.mozilla.org/tracemonkey/rev/04951a8e1c25 http://hg.mozilla.org/tracemonkey/rev/1de57866b40b
Comment 16•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/04951a8e1c25
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 17•15 years ago
|
||
Sayre, this bug hasn't been fixed in NJ-central or TM yet -- I pushed a patch but backed it out. (See comment 11, comment 12 and comment 15.) Looks like you merged the push but not the backout?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 18•15 years ago
|
||
(In reply to comment #17) > Sayre, this bug hasn't been fixed in NJ-central or TM yet -- I pushed a patch > but backed it out. (See comment 11, comment 12 and comment 15.) Looks like > you merged the push but not the backout? I pushed the backout as well, the comment just had the nanojit-central cset id, so I didn't notice it was backing out this check-in. http://hg.mozilla.org/mozilla-central/rev/1de57866b40b
Assignee | ||
Comment 19•15 years ago
|
||
(In reply to comment #14) > (In reply to comment #13) > > This bug is all about forbidding anything > > downstream of CseFilter from modifying instructions. > > This limitation is going to be a problem in the long run, IMHO... it > effectively means that we can't do peephole optimizations downstream of cse, > and trying to do them upstream seems to be painful (if not impossible) without > some elaborate machinery to return faux LInsp values (so that things upstream > of said peephole filter still "see" the instructions they expect). The limitation was always there, it's just that it only became obvious in really obscure cases before bug 502778 landed. (Eg. it was lucky that SoftFloatFilter worked, see comment 2.) The limitation can be revisited in the future if necessary, but it will involve a compile-time performance hit. As for peephole optimisations, a LIR_nop could help a lot. > Granted, this is probably more a general issue with the reader/writer > architecture we have, which really seems to be optimized for > single-instruction-at-a-time optimizations; adding optimizations that are > N-to-1 transformations seems to be quite painful. Yes, the current pipelines are very fast but very restrictive. Eg. in the writer pipeline, instructions can't be modified (without great difficulty) once they've been written into the LirBuffer. And the only optimisations that the reader pipeline can do is to ignore instructions.
Assignee | ||
Comment 20•15 years ago
|
||
http://hg.mozilla.org/projects/nanojit-central/rev/b9640e93e1ef
Whiteboard: fixed-in-nanojit
Assignee | ||
Comment 21•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/507718e9dcf8
Whiteboard: fixed-in-nanojit → fixed-in-nanojit, fixed-in-tracemonkey
Comment 22•15 years ago
|
||
(In reply to comment #19) > As for peephole optimisations, a LIR_nop could help a lot. Yep. I'm also hoping that bug 522593 could make some optimizations a lot more doable, as it would seem to allow construction of meta-opcodes without overhead...
Comment 23•15 years ago
|
||
http://hg.mozilla.org/tamarin-redux/rev/d1bb609bb3fe
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
Comment 24•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/507718e9dcf8
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•