Right now, the final codegen step of compilation happens on the main thread. This mostly falls out of having a single MacroAssembler, so that all asm.js code is in a single linear slab. The main challenge in parallelizing codegen is therefore to give each AsmJSParallelTask its own MacroAssembler and to replace the current cross-function labels (viz., the ones used for function calls and exit trampolines) with entries in AsmJSStaticLinkData. From my measurements, this would win more than 1 second on Epic Citadel uncached startup (whose compilation takes about 5s).
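To make the idea concrete, here is a minimal sketch of per-task buffers whose cross-function calls are recorded as link entries and patched only once everything is concatenated. It is a toy model, not SpiderMonkey code: `TaskBuffer`, `emit_call`, and `link` are hypothetical names, and the x86 `CALL rel32` encoding (opcode `0xE8`, displacement relative to the end of the instruction) is used purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskBuffer:
    """Hypothetical stand-in for one task's private assembler output."""
    name: str
    code: bytearray = field(default_factory=bytearray)
    # Each entry: (offset of the 4-byte call operand in this buffer, callee name).
    call_sites: list = field(default_factory=list)

    def emit(self, data: bytes) -> None:
        self.code += data

    def emit_call(self, callee: str) -> None:
        # Emit the opcode, remember where the operand lives, leave a placeholder.
        self.code += b"\xe8"
        self.call_sites.append((len(self.code), callee))
        self.code += b"\x00\x00\x00\x00"


def link(buffers):
    """Concatenate all buffers into one slab, then patch every recorded call."""
    base = {}
    slab = bytearray()
    for buf in buffers:
        base[buf.name] = len(slab)
        slab += buf.code
    for buf in buffers:
        for offset, callee in buf.call_sites:
            site = base[buf.name] + offset
            # rel32 is measured from the end of the 4-byte operand.
            rel = base[callee] - (site + 4)
            slab[site:site + 4] = rel.to_bytes(4, "little", signed=True)
    return bytes(slab), base
```

Because each buffer only records the callee's name, the buffers can be filled independently and in any order; function addresses are needed only inside `link`.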
It looks like it would be easiest to wait until bug 760642 lands before trying to do this; otherwise, this patch will just end up getting in its way. Specifically, bug 760642 changes how calls are patched so that we use the single-instruction encoding in the common case and the 3-instruction call only when the distance between the call and its target is bigger than 32MB. This requires having all the IonAssemblerBuffers together at link time, which runs counter to what I was thinking in comment 0.
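A rough sketch of the distance check behind that choice, assuming the 32MB limit is the reach of the ARM `BL` instruction (a signed 24-bit word offset, taken relative to the call address plus 8). The function name and the bare instruction count it returns are illustrative only and are not taken from the patch in bug 760642.

```python
# Reach of an ARM-mode BL: signed 24-bit immediate, shifted left by 2.
BL_MIN = -(1 << 25)
BL_MAX = (1 << 25) - 4


def call_instruction_count(call_addr: int, target_addr: int) -> int:
    """Return 1 if a single BL can reach the target, else 3 (long call)."""
    # In ARM mode the PC reads as the address of the instruction plus 8.
    disp = target_addr - (call_addr + 8)
    if disp % 4 == 0 and BL_MIN <= disp <= BL_MAX:
        return 1
    return 3
```

The check needs both the final call address and the final target address, which is why the decision can only be made once all the buffers have been laid out together.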
Actually, codegen is a pretty significant part of main-thread compilation these days: 2.1s (out of 8s) on Unity DT2 and 0.47s (out of 1.9s) on WMW. Given that we're only saturating ~2 cores at the moment, parallelizing codegen would mean a roughly 25% wall-clock reduction.
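The arithmetic behind the 25% figure, as a small sketch. It assumes the main thread is the critical path, so that removing codegen from it shortens wall-clock time by codegen's share of main-thread time; `codegen_share` is a hypothetical helper.

```python
def codegen_share(codegen_s: float, main_thread_s: float) -> float:
    """Fraction of main-thread compile time spent in codegen."""
    return codegen_s / main_thread_s


unity_dt2 = codegen_share(2.1, 8.0)   # 0.2625
wmw = codegen_share(0.47, 1.9)        # about 0.247
```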
Bug 1157624 would mostly subsume this in a much simpler way: it takes codegen off the main parsing thread but still keeps it sequential (on its own thread). Assuming parsing is the bottleneck (it is), this would achieve the desired effect.
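The shape of that arrangement can be sketched as a two-stage pipeline: the calling thread parses, and a single worker thread runs codegen in order. This only illustrates the structure and is not the design from bug 1157624; `compile_pipeline`, `parse`, and `codegen` are hypothetical names, and Python threads would not give a real speedup for CPU-bound work.

```python
import queue
import threading


def compile_pipeline(functions, parse, codegen):
    """Parse on the calling thread; run codegen sequentially on one worker."""
    pending = queue.Queue()
    results = []

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: parsing has finished
                break
            results.append(codegen(item))

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        for fn in functions:
            pending.put(parse(fn))
    finally:
        pending.put(None)
        thread.join()
    return results
```

With a single worker and a FIFO queue, codegen output stays in source order, so the result can still be treated as one linear slab.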