Created attachment 418579
patch

This patch tweaks intersectRegisterState() and (less importantly) unionRegisterState():

- Instead of looping over all registers twice, it loops over all of them once and then, the second time around, only over the ones that need processing. It does this by recording those registers in an array (previously we recorded the ones to skip in a bitmask). This reduces the number of iterations in the second loop and also avoids some unpredictable branches.

- In intersectRegisterState() it replaces calls to findSpecificRegFor() with the faster findSpecificRegForUnallocated(). (This wasn't applicable in unionRegisterState().) Doing this required inlining assignSaved().

- It avoids printing extraneous whitespace for the union case when dumping assembly code.

- It clarifies their top-level comments slightly and improves their formatting.

This reduces the number of instructions executed in SunSpider on X64 by almost 1%, and I'm seeing a 6-8 ms SunSpider speedup on X64 and about 3 ms on i386.
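To illustrate the first point, here is a minimal standalone sketch of the two looping strategies. It is not the nanojit code: RegState, kNumRegs, MergeResult, both function names, and the "copy the saved entry" fix-up are hypothetical stand-ins, used only to show how a work-list array shrinks the second loop compared with a skip bitmask.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not nanojit's types.
constexpr int kNumRegs = 16;
using RegState = std::array<int, kNumRegs>;  // 0 = free, otherwise an instruction id

struct MergeResult {
    RegState merged;
    int secondLoopIterations;  // how many times the second loop body ran
};

// Old shape: record the registers to skip in a bitmask, then walk every
// register again and branch on the mask.
MergeResult mergeWithSkipMask(RegState current, const RegState& saved) {
    std::uint32_t skip = 0;
    for (int r = 0; r < kNumRegs; ++r) {
        if (current[r] == saved[r]) skip |= (1u << r);
    }
    int iterations = 0;
    for (int r = 0; r < kNumRegs; ++r) {
        ++iterations;
        if (skip & (1u << r)) continue;  // data-dependent branch per register
        current[r] = saved[r];           // placeholder for the real fix-up work
    }
    return {current, iterations};
}

// New shape: record the registers that need work in an array, then walk
// only those entries.
MergeResult mergeWithWorkList(RegState current, const RegState& saved) {
    int pending[kNumRegs];
    int count = 0;
    for (int r = 0; r < kNumRegs; ++r) {
        if (current[r] != saved[r]) pending[count++] = r;
    }
    assert(count <= kNumRegs);
    int iterations = 0;
    for (int i = 0; i < count; ++i) {
        ++iterations;
        const int r = pending[i];
        current[r] = saved[r];           // placeholder for the real fix-up work
    }
    return {current, iterations};
}
```

Both functions produce the same merged state. The second loop of the work-list version runs once per mismatched register rather than once per register, and it has no per-register skip test.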
Comment on attachment 418579
patch

I was trying to unify union and intersect using a template function and traits. It never really looked pretty, though.
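For readers unfamiliar with the idea, a template-plus-traits unification could look roughly like the sketch below. This is not the code from that attempt. It assumes, purely for illustration, that the two merges differ only in what happens when a register is live in the current state but free in the saved one (intersect evicts it, union leaves it alone). Regs, kRegs, both traits structs, and mergeRegs() are made-up names.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for illustration; these are not nanojit's types.
constexpr int kRegs = 8;
using Regs = std::array<int, kRegs>;  // 0 = free, otherwise an instruction id

// Each traits struct captures the one behavior assumed to differ.
struct IntersectTraits { static constexpr bool kEvictWhenSavedFree = true; };
struct UnionTraits     { static constexpr bool kEvictWhenSavedFree = false; };

template <typename Traits>
Regs mergeRegs(Regs current, const Regs& saved) {
    for (int r = 0; r < kRegs; ++r) {
        if (current[r] == saved[r]) continue;  // already in agreement
        if (saved[r] != 0) {
            current[r] = saved[r];             // saved entry wins in both modes
        } else if (Traits::kEvictWhenSavedFree) {
            current[r] = 0;                    // intersect only: evict current
        }
    }
    return current;
}

// Usage sketch: the same inputs merged under each policy.
void exampleUsage() {
    Regs cur{};
    Regs saved{};
    cur[1] = 3;    // live now, free in the saved state
    saved[0] = 5;  // free now, live in the saved state
    const Regs i = mergeRegs<IntersectTraits>(cur, saved);
    const Regs u = mergeRegs<UnionTraits>(cur, saved);
    assert(i[0] == 5 && i[1] == 0);
    assert(u[0] == 5 && u[1] == 3);
}
```

Even in this toy form, the policy branch sits inside the shared loop, which hints at why such a unification can end up less readable than two separate functions.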
http://hg.mozilla.org/projects/nanojit-central/rev/12013e9b8fab

I changed findSpecificRegForUnallocated() back to findSpecificRegFor(). Even though it seemed to work, my prior reasoning about why it was OK was faulty, and I couldn't convince myself that it would always be safe. Unfortunately, this negated much of the speedup, but it's still a clean-up.