Closed Bug 1175397 Opened 5 years ago Closed 3 years ago

EliminateDeadResumePointOperands unsound to do after GVN

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: shu, Unassigned)

References

Details

Attachments

(1 file)

I ran into this bug while trying to land bug 1165486.

This crashes with JS_GC_ZEAL=14 (it's a modified bug847682.js):

function computeInputVariants(len) {
  var orig = "";
  var pointed = "";
  for (var i = 0; i < len; i++) {
    orig += (i%10)+"";
    pointed += (i%10)+".";
  }
  return [orig, pointed];
}

function test() {
  var re11 = /\./g;
  for (var i=0; i < 40; i++) {
    var a = computeInputVariants(i);
    assertEq(a[0], a[1].replace(re11, ''))
  }
}

for (var i = 0; i < 10; i++)
  test();

What causes the crash is |(i%10)+"."|

This creates:

mod65 = mod phi46 constant64
nop66 = nop
resumepoint mode=After (caller in block4) constant26 constant26 constant26 phi29 phi44 phi45 phi46 phi44 mod65
tostring67 = tostring mod65
nop68 = nop
resumepoint mode=After (caller in block4) constant26 constant26 constant26 phi29 phi44 phi45 phi46 phi44 tostring67
concat69 = concat phi44 tostring67
nop70 = nop
resumepoint mode=After (caller in block4) constant26 constant26 constant26 phi29 concat69 phi45 phi46 phi45 mod65

The problem is that ConcatStrings<CanGC> can, well, GC, and trigger OSI. In this particular case, the OSI point associated with concat69 is actually resuming right after JSOP_MOD, with nop70's resume point. nop70's resume point, however, optimized out mod65 because nop70 is dominated by the last live use of mod65, tostring67. Ion doesn't understand that nop70 actually resumes before the tostring67 is executed.

I'm not sure why nop70's resume point, which comes after concat69, doesn't have the pc right after the concat. Still investigating.
Oh, I think the problem might be that the two |(i%10)|s gets the folded to mod65. The original first mod is indeed dead by the time of |(i%10)+"."|, but we can't optimize it out of the second resume point.

Hm, how to fix this?
The EliminateDeadResumePointOperands phase is actually before GVN, so this usually doesn't happen, *unless* scalar replacement triggered a second iteration. Perhaps subsequent iterations should not eliminate dead resume point operands.
Summary: Some resume point operands are incorrectly optimized out → EliminateDeadResumePointOperands unsound to do after GVN
Okay, here's a better explanation of the bug:

Consider the following example, where def1 dominates, and is congruent with
def4, and use3 dominates, and is congruent with, use6.

def1
nop2
  resumepoint def1
use3 def1
def4
nop5
  resumepoint def4
use6 def4
use7 use3 use6

Assume that use3, use6, and use7 can cause OSI and are non-effectful. That is,
use3 will resume at nop2, and use6 and use7 will resume at nop5.

After GVN, since def1 =~ def4, we have:

def4 - replaced with def1 and pruned
use6 - replaced with use3 and pruned
use7 - renumbered to use5

def1
nop2
  resumepoint def1
use3 def1
nop4
  resumepoint def1
use5 use3 use3

nop4's resumepoint's operand of def1 is considered dead, because it is
dominated by the last use of def1, use3.

However, if use5 causes OSI, we will resume at nop4's resume point. The
baseline frame at that point expects the now-pruned def4 to exist. However,
since it was replaced with def1 by GVN, and def1 is dead at the point of nop4,
the baseline frame incorrectly gets an optimized out value.
There is probably a better solution here. I'll NI sunfish, but I'd like to be able to land the quick fix so I can land bug 1165486.
Blocks: 1165486
See comment 3.
Flags: needinfo?(sunfish)
A slightly better fix would be to have resumepoints artificially extend the live ranges of its operands that *used to* have a use after the resumepoint. I'm not going to be trying this fix though.
Comment on attachment 8623472 [details] [diff] [review]
Do not eliminate dead resume point operands after GVN.

Review of attachment 8623472 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks,

Indeed, EliminateDeadResumePoint is no longer safe as soon as we start moving code.
(doRepeatOptimizations == 1)  condition ensure that we did not moved any code yet.
Attachment #8623472 - Flags: review?(jdemooij) → review+
Blocks: 1165348
No longer blocks: 1166711
Went ahead and landed this now to hopefully get SM tests greener going forward:

https://hg.mozilla.org/integration/mozilla-inbound/rev/a8e0bde30bd4
https://hg.mozilla.org/mozilla-central/rev/a8e0bde30bd4
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
also backed out in https://hg.mozilla.org/mozilla-central/rev/b772e603c42f by nbp
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This bug should be fixed with the backout of looping over GVN, and EliminateDeadResumePointOperands.
I double checked the test case worked fine today.
Status: REOPENED → RESOLVED
Closed: 5 years ago3 years ago
Flags: needinfo?(sunfish)
Priority: -- → P3
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.