Something not quite right with debug-only poisoning on 32-bit targets?
Categories
(Core :: JavaScript: GC, defect, P2)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox110 | --- | fixed |
People
(Reporter: jseward, Assigned: jonco)
References
(Blocks 1 open bug)
Details
Attachments
(2 files)
Recently I ran all of jit-tests on valgrind on both 64- and 32-bit x86.
I noticed that, for many of the 32-bit runs, valgrind reports uninitialised
value uses in template <typename T> bool IsThingPoisoned(T* thing), but
these are never reported for 64-bit runs.
Looking at the code, I wonder if there's something not quite right in the
marking of areas as debug-poisoned in the 32-bit case. It strikes me as
unlikely that there is any real bug here, but it would be nice to fix this if
it is easy, since it generates a lot of noise in the 32-bit case, making it
harder to spot the (very rare) real problems that are detected (eg bug 1807653).
A typical report (see attachment) is
Conditional jump or move depends on uninitialised value(s)
at 0x2B6CBF3: bool IsThingPoisoned<JSObject>(JSObject*)
(src/gc/Marking.cpp:128)
by 0x2B937BC: void js::CheckTracedThing<JSObject>(JSTracer*, JSObject*)
(src/gc/Marking.cpp:214)
by 0x2BB5DFC: auto js::CheckTracedThing<JS::Value>(JSTracer*, [..]
const (src/gc/Marking.cpp:223)
...
Uninitialised value was created by a client request
at 0x2B8BD56: SetMemCheckKind(void*, unsigned int, MemCheckKind)
(src/util/Poison.h:116)
by 0x2B8C26A: js::Poison(void*, unsigned char, unsigned int, MemCheckKind)
(src/util/Poison.h:204)
by 0x2B8C2C2: js::DebugOnlyPoison(void*, unsigned char, unsigned int,
MemCheckKind) (src/util/Poison.h:212)
by 0x2BAABD0: js::Nursery::allocate(unsigned int) (src/gc/Nursery.cpp:554)
The error is reported in IsThingPoisoned(), here
if ((*p & 1) == 0) { <--- here
return false;
}
and at this point, the least significant bit of *p is zero. It seems as if
IsThingPoisoned wants to say "no, this thing is not poisoned". However,
that seems to be contradicted by the fact that the undefinedness (of *p) was
created by a call to DebugOnlyPoison at Nursery.cpp:554.
So I'm wondering if there is some word-size or off-by-one not-quite-rightness
in the poisoning logic.
All hints appreciated, since I have no idea how this stuff works :-)
| Reporter | ||
Comment 1•3 years ago
|
||
STR: configure m-c for a 32-bit build:
CC="ccache clang -m32 -msse -msse2 -mfpmath=sse" \
CXX="ccache clang++ -m32 -msse -msse2 -mfpmath=sse" \
../src/configure --target=i686-pc-linux --enable-debug \
--enable-optimize="-g -O0" --enable-valgrind --disable-jemalloc \
--disable-tests --disable-shared-js
RUN:
valgrind --track-origins=yes \
./dist/bin/js -p /nfs/compx/MC_GC/js/src/jit-test/lib/prologue.js \
-e "const platform='linux'" \
-e "const libdir='/nfs/compx/MC_GC/js/src/jit-test/lib/'" \
-e "const scriptdir='/nfs/compx/MC_GC/js/src/jit-test/tests/'" \
--module-load-path /nfs/compx/MC_GC/js/src/jit-test/modules/ \
-f /nfs/compx/MC_GC/js/src/jit-test/tests/backup-point-bug1315634.js
Comment 2•3 years ago
|
||
In PoisonImpl we have some 32-bit specific code in debug builds. You could try doing a plain memset there (like the non-DEBUG code) in your local build to see if that changes anything.
In IsThingPoisoned, we should then get an odd value for *p (JS_ALLOCATED_NURSERY_PATTERN bytes, 0x2D). If you don't get that value, we probably wrote some value there without marking the memory as "defined".
| Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
| Assignee | ||
Comment 3•3 years ago
|
||
This check is pretty sketchy as it depends on specific bit patterns occuring
poisoned GC cells. And it turns out that it's wrong anyway. Since we now use an
PoisonedObjectValue, if we look at the second word of such a value in a 32 bit
build this function will return false instead of true (since the object tag is
even).
Instead, just check the free list, but restrict this to GC marking so it
doesn't have too much impact. I wasn't able to detect and increase in time
taken to run jit tests with GC zeal 10 with this change.
This fixes the valgrind failure. If there is stil a problem with accessing
uninitialized cell contents it will show up somewhere else.
Comment 5•3 years ago
|
||
| bugherder | ||
Description
•