Closed Bug 1759029 Opened 2 years ago Closed 2 years ago

Differential testing: incorrect computation involving exceptions

Categories

(Core :: JavaScript Engine: JIT, defect, P2)


Tracking


RESOLVED FIXED
100 Branch
Tracking Status
firefox100 --- fixed

People

(Reporter: lukas.bernhard, Assigned: iain)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

Steps to reproduce:

On git commit b96000c5311dd5f2065ff7fc97d92bb8753dc3f4 I encountered a miscomputation during differential fuzzing. This seems to be an issue within Ion, as Baseline and V8 compute the same result.

sample.js:

function main() {
  const v5 = [1];
  v5[NaN] = 0;  // if this is the same as v5[0] the computation is correct

  let v73 = 0;
  let bla = 0;
  do {
      function v75() {
          for (let v82 = 0; v82 < 7; v82++) {
              const v85 = v82 % v82;    // NaN or 0
              const v86 = v85 >>> v85;  // 0
              bla += v86;

              try {
                  const v89 = Int32Array(1); // throws
                  v89[0] = 1;                // never reached but still necessary to reproduce
              } catch(v90) {
                  const v92 = v5[v85];
                  bla += v92;
              }
          }

          for (let v95 = 0; v95 < 100; v95++) { }
      }

      v75();
      v73++;
  } while (v73 < 6);

  print(bla);  // 36 without ion and in v8, 41 with ion
}

(Unfortunately, the sample didn't minimize further)
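For context on the `v5[NaN]` comment above: a NaN index is converted to the string key `"NaN"`, so it creates a named property that is distinct from element 0. A minimal illustration, independent of any engine internals:

```javascript
const v5 = [1];
v5[NaN] = 0;             // property key is the string "NaN"
console.log(v5[0]);      // 1: element 0 is untouched
console.log(v5["NaN"]);  // 0: the named property
console.log(v5.length);  // 1: named properties don't affect length
```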

Actual results:

// obj-x86_64-pc-linux-gnu/dist/bin/js --no-threads --cpu-count=1 --ion-offthread-compile=off --fuzzing-safe --differential-testing --no-ion sample.js
=> computes: 36

// obj-x86_64-pc-linux-gnu/dist/bin/js --fast-warmup --no-threads --cpu-count=1 --ion-offthread-compile=off --fuzzing-safe --differential-testing sample.js
=> computes: 41

Component: Untriaged → JavaScript Engine: JIT
Product: Firefox → Core

Good find!

We're going wrong in our truncation code. Here's a reduced testcase, which gives different results with --fast-warmup --no-threads.

var arr = [];
arr[0] = 1;
arr[NaN] = 0;

function foo() {
  for (let i = 0; i < 7; i++) {
    const a = i % i;    // NaN or 0
    counter += a >>> a; // 0

    try {
      throw 3;
    } catch {
      counter += arr[a];
    }
  }
  for (let i = 0; i < 100; i++) { }
}

let counter = 0;
for (var i = 0; i < 10; i++) {
  foo();
}

assertEq(counter, 60);
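The expected value of 60 follows from the semantics: `0 % 0` is NaN, so on the first iteration `arr[a]` reads the `"NaN"` property (0), while iterations 1 through 6 read `arr[0]` (1); `a >>> a` is always 0. That is 6 per call, times 10 calls. A quick check of the per-call arithmetic:

```javascript
const arr = [];
arr[0] = 1;
arr[NaN] = 0;

let perCall = 0;
for (let i = 0; i < 7; i++) {
  const a = i % i;      // NaN when i === 0, otherwise 0
  perCall += a >>> a;   // NaN >>> NaN === 0 and 0 >>> 0 === 0
  perCall += arr[a];    // arr[NaN] === 0 once, arr[0] === 1 six times
}
console.log(perCall);   // 6, so ten calls give 60
```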

The catch block is unreachable in Warp, so we end up with two disjoint components in the CFG for foo: one that enters via the normal entry, and one that OSRs into the second loop. The code of interest is in the former. We end up with something along the lines of:

1 constant 0x0
2 mod constant1 constant1
3 ursh mod2 mod2
4 throw <...>
    resumepoint 2 1 ...

When we call ComputeTruncateKind on the mod instruction, we determine that:

  • the only visible uses are in the ursh, which explicitly truncates
  • it is not observable from outside the frame
  • it is not implicitly used

Therefore, we decide that we can safely encode the truncated version as part of the resume point operands. This is incorrect: if we do so, we recover the wrong value of a in the catch block (0 instead of NaN).
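To see why recovering 0 instead of NaN is observable: the truncation is harmless for the `ursh` itself, but the catch block indexes the array with the un-truncated value, and the two indices name different properties:

```javascript
const arr = [];
arr[0] = 1;
arr[NaN] = 0;

const a = 0 % 0;            // NaN, the value the resume point must restore
console.log(a >>> a);       // 0: truncation gives the same result here...
console.log(arr[a]);        // 0: ...but indexing with NaN reads the "NaN" property
console.log(arr[0 >>> 0]);  // 1: indexing with the truncated value reads element 0
```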

It's not immediately obvious to me where we should be fixing this. Maybe we should mark all instructions inside of try blocks as implicitly used?

nbp: Do you have any thoughts on how this is supposed to work?

Flags: needinfo?(nicolas.b.pierron)

(In reply to Iain Ireland [:iain] from comment #1)

> nbp: Do you have any thoughts on how this is supposed to work?

At CFG creation time, a (Mod2) should be flagged as being ImplicitlyUsed, since the catch block uses it explicitly but that use is not encoded in the CFG.
The truncation can then still happen, but the truncation should create a clone of the untruncated value, which will be captured in the resume point.

Flags: needinfo?(nicolas.b.pierron)

Multiple options in order of implementation complexity:

  1. A naïve implementation would be to flag all stack slots of the last frame as being ImplicitlyUsed when there is a non-empty catch block.
  2. Another would be to generate the catch block and remove it afterwards; the uses would then be generated, and the removal of the block would annotate the instructions with ImplicitlyUsed.
  3. Otherwise, we could build a new way to visit the CFG which does not generate any instructions (thus avoiding allocating memory) and annotate all operands of the would-have-been-generated instructions as being ImplicitlyUsed.

I am in favor of (1) and (2), as these are the simplest and most future-proof solutions, whereas (3) is complex and commits to the lack of support for catch blocks. Solution (1) can have performance issues, as we might conservatively flag too many instructions as being ImplicitlyUsed.

In addition to local fixed slots, this problem can also affect arguments in strict functions:

// This version of foo is also buggy
function foo(a) {
  "use strict";
  for (let i = 0; i < 7; i++) {
    a = i % i;
  ...

Non-strict functions aren't affected because the existence of Function.arguments means that arguments are always observable.

It's probably cleanest to address this from the point of view of slots, rather than instructions/dependencies. Consider the following code:

var x = a(); // This value is implicitly used in the catch block.
try {
  maybeThrows();
  x = b();
} catch {
  use(x);
}

A JSOp::GetLocal inside a catch block may use any definition that reaches a potentially-throwing operation inside the try block. This is not limited to the most recent definition, or to definitions inside the try block. I think option (2) from comment 3 would therefore be difficult to implement. AFAICT, we would have to add a control flow edge from every potentially throwing operation to the catch block. Among other things, that might force us to allocate a lot more blocks in our CFG.
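The reaching-definitions point can be demonstrated concretely (with hypothetical stand-ins for `a`, `b`, and `maybeThrows`): the catch block observes either definition of `x`, depending on which operation throws:

```javascript
function run(throwEarly) {
  const a = () => "from a";
  const b = () => "from b";
  const maybeThrows = () => { if (throwEarly) throw new Error("early"); };

  let observed;
  let x = a();          // this definition reaches the catch if maybeThrows() throws
  try {
    maybeThrows();
    x = b();            // this definition reaches it if a later operation throws
    throw new Error("late");
  } catch {
    observed = x;
  }
  return observed;
}
console.log(run(true));  // "from a"
console.log(run(false)); // "from b"
```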

One easy fix, which is more or less equivalent to option (1) from comment 3, is to just check in ComputeRequestedTruncateKind whether the graph contains a try block, and force safeToConvert to false if it does. Note that we're already disabling some optimizations if there's a try block to work around this problem, so we've already accepted a performance hit here.

If we want to be more precise, BytecodeAnalysis already tracks whether ops are reachable via "normal" execution (without going through a catch block) or only via exception unwinding. (Note that catch-only code is not limited to the inside of a catch block; for example, if a try block returns unconditionally, then code following that try block will be catch-only even if it's not in the catch block.) If we extended BytecodeAnalysis to look for catch-only GetLocal and GetArg ops, and then passed a list to WarpBuilder of the stack slots accessed by those ops, then we could use that list to more precisely mark the correct instructions as implicitly used. This might be a way to implement option (3).
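An example of catch-only code outside a catch block, as described above: if the try block returns unconditionally, the statement after the try/catch only runs when an exception unwound through the catch:

```javascript
function f(shouldThrow) {
  let x = "normal";
  try {
    if (shouldThrow) throw new Error("boom");
    return "returned from try";   // the normal path always returns here
  } catch {
    x = "caught";
  }
  // This line is catch-only: it is reached only via exception unwinding.
  return "after try/catch: " + x;
}
console.log(f(false)); // "returned from try"
console.log(f(true));  // "after try/catch: caught"
```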

My instinct is to start with the easy fix for now.

Blocks: sm-opt-jits
Severity: -- → S4
Priority: -- → P2

This is the simplest fix, and is consistent with what we're doing in e.g. EliminateDeadResumePointOperands. Our Speedometer score is unaffected.

Assignee: nobody → iireland
Status: NEW → ASSIGNED
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/20bbbd2b1bc0
Make ComputeRequestedTruncateKind more conservative for try/catch r=nbp
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch