Closed Bug 917050 Opened 11 years ago Closed 11 years ago

valgrind - Invalid read of size 1 in js::jit::CompactBufferReader::readByte()

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox26 --- wontfix
firefox27 --- affected
firefox28 --- affected
firefox29 --- affected

People

(Reporter: bc, Assigned: terrence)

References

Details

(Keywords: crash, sec-critical, valgrind)

Attachments

(5 files)

Attached file valgrind stack
Load mozilla/layout/reftests/bidi/bidiMirroring.svg

Hit this twice:

==16881== 1 errors in context 1 of 2:
==16881== Invalid read of size 1
==16881==    at 0xA5EC7FB: js::jit::CompactBufferReader::readByte() (CompactBuffer.h:57)

    inline CompactBufferReader(const CompactBufferWriter &writer);
    uint8_t readByte() {
        JS_ASSERT(buffer_ < end_);
        return *buffer_++;
    }
Attached file crash stack
David, how bad is this? (Needs a sec rating)
This is probably the same issue as bug 914511. Let's assume sec-critical for now.
Assigning this to Terrence based on dependency on 914511.
Assignee: general → terrence
Depends on: 914511
Any updates, Terrence?
Flags: needinfo?(terrence)
I decided to look at the very similar looking bug 914511 first, since that repros in the shell. It's not really clear what's going on there, unfortunately. I'll try to take a look at this bug specifically today.
Flags: needinfo?(terrence)
Assuming affects trunk. Terrence did you get a chance to look (re: comment 6)?
Flags: needinfo?(terrence)
I pulled a fresh trunk valgrind and built on fedora 19 and retested this particular case and could not reproduce it by itself. I have seen other instances of this message recently though. I updated valgrind on the workers and hopefully will have some indication in a few days whether this is still reproducible. I'll report back here with the news when I have it.
I thought there was a flag to make valgrind ignore over-large-reads-as-optimizations, but I can't seem to find it in the docs or in bugzilla at the moment. Nick, do you know how this is spelled?
Flags: needinfo?(n.nethercote)
--partial-loads-ok=yes ?

But this is a 1-byte read that's being complained about, so I don't know if that's relevant.
Flags: needinfo?(n.nethercote)
Reproduced with dom/workers/test/test_json.html
This is most reliable locally when running the 18th chunk of 20 and occurs at the end of the run so it is not associated with a single test but the entire set. For example, locally from the objdir firefox-debug

chunk=18 && rm -f ./mochitest-plain-$chunk.log && /mozilla/builds/nightly/mozilla/firefox-debug/_virtualenv/bin/python _tests/testing/mochitest/runtests.py --autorun --close-when-done --console-level=INFO --log-file=./mochitest-plain-$chunk.log --file-level=INFO --failure-file=/mozilla/builds/nightly/mozilla/firefox-debug/_tests/testing/mochitest/makefailures.json --testing-modules-dir=/mozilla/builds/nightly/mozilla/firefox-debug/_tests/modules --extra-profile-file=./dist/plugins --symbols-path=./dist/crashreporter-symbols  --debugger='valgrind' --debugger-args='--tool=memcheck' --timeout=86400  --total-chunks=20 --this-chunk=$chunk

gives
==8342== Invalid read of size 4
==8342==    at 0x196C8028: ???
==8342==    by 0xA64E962: js::CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (jscntxtinlines.h:220)

==8342== Invalid read of size 4
==8342==    at 0x1A4FC028: ???
==8342==    by 0xA64E962: js::CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (jscntxtinlines.h:220)

==8342== Invalid read of size 1
==8342==    by 0xA42C2BB: js::jit::CompactBufferReader::readFixedUint32_t() (CompactBuffer.h:60)

The bughunter automation gets the Invalid read of size 1
    at 0xA1A3E5D: js::jit::CompactBufferReader::readByte() (CompactBuffer.h:57) instead of the readFixedUint32...
Great work, Bob! The first two errors look even more like bug 914511. In that bug, Luke and Julian seem to think that the cause may be valgrind improperly caching register values across the asm.js signal handler. Could you re-run that test with that particular valgrind optimization disabled so that we can validate this theory? The valgrind option you need to add is |--vex-iropt-register-updates=allregs-at-mem-access|. Thanks!
Flags: needinfo?(terrence) → needinfo?(bclary)
With --vex-iropt-register-updates=allregs-at-mem-access the test no longer abnormally terminates and I set two Invalid read of size 4 messages around /tests/js/xpconnect/tests/mochitest/test_asmjs.html:

 one with js::CallJSNative
 and another with no information other than the Invalid read message.

Both have set address range perms: large range [..., ...) (noaccess)

The test is still running but is taking a pretty long while. I'll let you know if other messages appear.
Flags: needinfo?(bclary)
Awesome! Thank you for verifying that!

I'd guess the warnings are expected asm.js behavior, but I'm not sure about the remaining error.
> Both have set address range perms: large range [..., ...) (noaccess)

Julian mentioned that these messages "don't per se show anything wrong", as per bug 645641 comment 14.
Bob we discussed this in CRITSMASH and were unsure what is next for this bug. Should we downgrade it?
Flags: needinfo?(bclary)
jandem originally set the sec-critical. I'm ok with downgrading this to a lower rating if jandem doesn't object.
Flags: needinfo?(bclary) → needinfo?(jdemooij)
(In reply to Bob Clary [:bc:] from comment #18)
> jandem originally set the sec-critical. I'm ok with downgrading this to a
> lower rating if jandem doesn't object.

It looks like this (1) only happens with Valgrind and (2) goes away with --vex-iropt-register-updates=allregs-at-mem-access, correct? In that case it's probably not a real bug in your code, not sure.
Flags: needinfo?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #19)
> it's probably not a real bug in your code, not sure.

s/your/our

Downgrading seems fine to me, if comment 19 is correct.
In the one test I ran manually with the setting, I didn't get this original message but did get the Invalid reads for js::CallJSNative  and another with no information other than the Invalid read message.

I set all 4 valgrind workers to use the --vex-iropt-register-updates=allregs-at-mem-access setting and restarted them about 2013-11-25 18:00:00. Since then I've seen two:

Invalid read of size 1
    at 0xA1AC0CF: js::jit::CompactBufferReader::readByte() (CompactBuffer.h:57)
    by 0xA4178F9: js::jit::CompactBufferReader::readFixedUint32_t() (CompactBuffer.h:60)
    by 0xA435BA9: RelocationIterator::RelocationIterator(js::jit::CompactBufferReader&) (Assembler-x64.cpp:219)
    by 0xA435D35: js::jit::Assembler::TraceJumpRelocations(JSTracer*, js::jit::IonCode*, js::jit::CompactBufferReader&) (Assembler-x64.cpp:256)
    by 0xA2F090A: js::jit::IonCode::trace(JSTracer*) (Ion.cpp:663)
    by 0xA15C08B: js::gc::MarkChildren(JSTracer*, js::jit::IonCode*) (Marking.cpp:1194)
    by 0xA15C86B: js::GCMarker::processMarkStackOther(js::SliceBudget&, unsigned long, unsigned long) (Marking.cpp:1365)
    by 0xA15CC01: js::GCMarker::processMarkStackTop(js::SliceBudget&) (Marking.cpp:1432)
    by 0xA15D2F7: js::GCMarker::drainMarkStack(js::SliceBudget&) (Marking.cpp:1533)
    by 0xA4EDA4E: DrainMarkStack(JSRuntime*, js::SliceBudget&, js::gcstats::Phase) (jsgc.cpp:3911)
    by 0xA4EF9EE: IncrementalCollectSlice(JSRuntime*, long, JS::gcreason::Reason, js::JSGCInvocationKind) (jsgc.cpp:4473)
    by 0xA4EFFED: GCCycle(JSRuntime*, bool, long, js::JSGCInvocationKind, JS::gcreason::Reason) (jsgc.cpp:4637)
  Address 0x0 is not stack'd, malloc'd or (recently) free'd

at 2013-11-25 19:42:58:

chrome://mochitests/content/chrome/content/xul/templates/tests/chrome/test_tmpl_treeelementquerysyntaxrecursivetreebuilder.xul

and at 2013-11-25 23:07:30
chrome://mochitests/content/chrome/content/xul/templates/tests/chrome/test_tmpl_treeelementsimplesyntaxrecursivetreebuilder.xul

The datetimes are close to the restart time and the jobs may have already been queued up in the database and run without the new setting. Let me test these two manually and wait for the workers to cycle through all of the tests. I'll report back here with the results.
Bob, Julian also suggested using:

--vex-iropt-register-updates=allregs-at-each-insn

in case --vex-iropt-register-updates=allregs-at-mem-access doesn't work out well, as per bug 913876 comment 23. (note that both didn't fix my issue in that bug)
Gary, they aren't mutually exclusive are they?
(In reply to Bob Clary [:bc:] from comment #23)
> Gary, they aren't mutually exclusive are they?

I'm not sure, I'm assuming they are as they seem to be different modes of the "--vex-iropt-register-updates=" flag.
From |valgrind -h -h|:

> --vex-iropt-register-updates=sp-at-mem-access
>                             |unwindregs-at-mem-access
>                             |allregs-at-mem-access
>                             |allregs-at-each-insn  [unwindregs-at-mem-access]

So they are mutually exclusive.  AIUI, they are listed in increasing order of simulation fidelity (and correspondingly, decreasing order of speed).
> So they are mutually exclusive.  AIUI, they are listed in increasing order
> of simulation fidelity (and correspondingly, decreasing order of speed).

Yes.  IIUC allregs-at-each-insn disables all redundant-Put removal, which
at least means the IR optimiser can't screw up precise exceptions.  The
front ends can still generate PX-unfriendly IR, but we could potentially
find and fix those by writing an IR analysis pass, that runs immediately
after the front end initial translation.  We might have to do that to get
this to work reliably.
During my patch tuesday maintenance window I went with allregs-at-each-insn. I'll follow up next week or so when I've had time to collect results.
(In reply to Bob Clary [:bc:] from comment #27)
> During my patch tuesday maintenance window I went with allregs-at-each-insn.
> I'll follow up next week or so when I've had time to collect results.

FWIW, allregs-at-each-insn has been working well with the Valgrind-on-TBPL runs -- we were getting frequent crashes, but have had none since I turned it on (bug 947671).
I wonder what the additional V perf hit of allregs-at-each-insn is (sigh).
Per my analysis of the problem in bug 913876 comment 29, I reckon we should
be able to get away with allregs-at-mem-access, which is less expensive,
since we don't need to have precise register values at instructions that
don't access memory (IIUC, that is).
NI on Terrence to help with next steps here given we are already into a couple a beta's for Fx27.
Flags: needinfo?(terrence)
Sorry I haven't followed up before now. I have not seen this warning since 2013-11-25 with the new allregs-at-each-insn setting. I'll just mark this WFM.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(terrence)
Resolution: --- → WORKSFORME
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: