Closed Bug 1807669 Opened 3 years ago Closed 3 years ago

Something not quite right with debug-only poisoning on 32-bit targets?

Categories

(Core :: JavaScript: GC, defect, P2)

defect

Tracking

()

RESOLVED FIXED
110 Branch
Tracking Status
firefox110 --- fixed

People

(Reporter: jseward, Assigned: jonco)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Attached file valgrind complainage

Recently I ran all of jit-tests on valgrind on both 64- and 32-bit x86.
I noticed that, for many of the 32-bit runs, valgrind reports uninitialised
value uses in template <typename T> bool IsThingPoisoned(T* thing), but
these are never reported for 64-bit runs.

Looking at the code, I wonder if there's something not quite right in the
marking of areas as debug-poisoned in the 32-bit case. It strikes me as
unlikely that there is any real bug here, but it would be nice to fix this if
it is easy, since it generates a lot of noise in the 32-bit case, making it
harder to spot the (very rare) real problems that are detected (eg bug 1807653).

A typical report (see attachment) is

Conditional jump or move depends on uninitialised value(s)
   at 0x2B6CBF3: bool IsThingPoisoned<JSObject>(JSObject*)
                 (src/gc/Marking.cpp:128)
   by 0x2B937BC: void js::CheckTracedThing<JSObject>(JSTracer*, JSObject*)
                 (src/gc/Marking.cpp:214)
   by 0x2BB5DFC: auto js::CheckTracedThing<JS::Value>(JSTracer*, [..]
                 const (src/gc/Marking.cpp:223)
   ...
 Uninitialised value was created by a client request    
   at 0x2B8BD56: SetMemCheckKind(void*, unsigned int, MemCheckKind)
                 (src/util/Poison.h:116)
   by 0x2B8C26A: js::Poison(void*, unsigned char, unsigned int, MemCheckKind)
                 (src/util/Poison.h:204)
   by 0x2B8C2C2: js::DebugOnlyPoison(void*, unsigned char, unsigned int,
                 MemCheckKind) (src/util/Poison.h:212)
   by 0x2BAABD0: js::Nursery::allocate(unsigned int) (src/gc/Nursery.cpp:554)

The error is reported in IsThingPoisoned(), here

  if ((*p & 1) == 0) {  <--- here
    return false;
  }

and at this point, the least significant bit of *p is zero. It seems as if
IsThingPoisoned wants to say "no, this thing is not poisoned". However,
that seems to be contradicted by the fact that the undefinedness (of *p) was
created by a call to DebugOnlyPoison at Nursery.cpp:554.

So I'm wondering if there is some word-size or off-by-one not-quite-rightness
in the poisoning logic.

All hints appreciated, since I have no idea how this stuff works :-)

STR: configure m-c for a 32-bit build:

CC="ccache clang -m32 -msse -msse2 -mfpmath=sse" \
  CXX="ccache clang++ -m32 -msse -msse2 -mfpmath=sse" \
  ../src/configure   --target=i686-pc-linux --enable-debug \
  --enable-optimize="-g -O0"   --enable-valgrind --disable-jemalloc \
  --disable-tests --disable-shared-js

RUN:

valgrind --track-origins=yes \
  ./dist/bin/js -p /nfs/compx/MC_GC/js/src/jit-test/lib/prologue.js \
  -e "const platform='linux'" \
  -e "const libdir='/nfs/compx/MC_GC/js/src/jit-test/lib/'" \
  -e "const scriptdir='/nfs/compx/MC_GC/js/src/jit-test/tests/'" \
  --module-load-path /nfs/compx/MC_GC/js/src/jit-test/modules/ \
  -f /nfs/compx/MC_GC/js/src/jit-test/tests/backup-point-bug1315634.js

In PoisonImpl we have some 32-bit specific code in debug builds. You could try doing a plain memset there (like the non-DEBUG code) in your local build to see if that changes anything.

In IsThingPoisoned, we should then get an odd value for *p (JS_ALLOCATED_NURSERY_PATTERN bytes, 0x2D). If you don't get that value, we probably wrote some value there without marking the memory as "defined".

Assignee: nobody → jcoppeard
Blocks: GC
Severity: -- → S3
Priority: -- → P2

This check is pretty sketchy as it depends on specific bit patterns occuring
poisoned GC cells. And it turns out that it's wrong anyway. Since we now use an
PoisonedObjectValue, if we look at the second word of such a value in a 32 bit
build this function will return false instead of true (since the object tag is
even).

Instead, just check the free list, but restrict this to GC marking so it
doesn't have too much impact. I wasn't able to detect and increase in time
taken to run jit tests with GC zeal 10 with this change.

This fixes the valgrind failure. If there is stil a problem with accessing
uninitialized cell contents it will show up somewhere else.

Pushed by jcoppeard@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/da122a067957 Remove faulty check for poisoned cell contents r=sfink
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: