Closed Bug 1538580 Opened 1 year ago Closed 3 months ago

Conditional jump or move depends on uninitialized values in [@ mp4parse::read_stsd]

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox68 --- affected

People

(Reporter: tsmith, Assigned: bryce)

References

(Blocks 1 open bug)

Details

(4 keywords)

Attachments

(1 file)

Attached video testcase.mp4

Found with m-c:
BuildID=20190323094805
SourceStamp=59e55930dc0f243357a8730be1a0ca372e6baddb

Conditional jump or move depends on uninitialised value(s)
   at 0x1411C1B7: mp4parse::read_stsd (lib.rs:2078)
   by 0x14111339: mp4parse::read_mdia (lib.rs:976)
   by 0x1410AC87: mp4parse::read_moov (lib.rs:870)
   by 0x14108521: mp4parse_read (lib.rs:729)
   by 0x11D50F83: mozilla::MP4Metadata::Parse() (MP4Metadata.cpp:102)
   by 0x11D4FCD0: mozilla::MP4Demuxer::Init() (MP4Demuxer.cpp:145)
   by 0x11B0FF93: mozilla::detail::ProxyFunctionRunnable<mozilla::MediaFormatReader::DemuxerProxy::Init()::$_13, mozilla::MozPromise<mozilla::MediaResult, mozilla::MediaResult, true> >::Run() (MediaFormatReader.cpp:789)
   by 0xFD45783: mozilla::TaskQueue::Runner::Run() (TaskQueue.cpp:199)
   by 0xFD55E9A: nsThreadPool::Run() (nsThreadPool.cpp:244)
   by 0xFD55FDC: non-virtual thunk to nsThreadPool::Run() (nsThreadPool.cpp:0)
   by 0xFD527A7: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:1180)
   by 0xFD54907: NS_ProcessNextEvent(nsIThread*, bool) (nsThreadUtils.cpp:482)
 Uninitialised value was created by a stack allocation
   at 0x1411548A: mp4parse::read_stsd (lib.rs:2155)
Flags: in-testsuite?

Jean-Yves, can you help me triage this?

Flags: needinfo?(jyavenard)

:bryce could you take a look please?

thanks

Flags: needinfo?(jyavenard) → needinfo?(bvandyk)
Priority: -- → P2
Assignee: nobody → bvandyk
Flags: needinfo?(bvandyk)

IMO this looks like a false positive from Valgrind due to LLVM's x86 backend:

This can be reproduced with a smaller rust program using the following profile in cargo.toml:

[profile.release]
panic = "abort"
codegen-units = 1
debug = true #Not required but makes debugging easier

[dependencies]
mp4parse = "0.11.2"

and the followingmain.rs (where testcase.mp4 is the test file from this bug):

extern crate mp4parse;

use std::fs::File;

use mp4parse::MediaContext;
use mp4parse::read_mp4;

fn main() {
    let mut file = File::open("testcase.mp4").unwrap();
    let mut context = MediaContext::new();
    match read_mp4(&mut file, &mut context) {
        Err(e) => println!("{:?}", e),
        Ok(_) => println!("Okay"),
    }
}

build and run with cargo build --release -vv && valgrind -q --leak-check=no --track-origins=yes target/release/<project_name>

I've been unable to reduce the test case significantly beyond the above.

It looks like the llvm optimization being done on this code results in the report. Based on the llvm and rust reports above, I suspect that what's happening is a reordering of CMP and jumps which upsets valgrind, but which results in code that is still 'correct'. I've been digging at the assembly, and the area where valgrind complains indeed has a number of jumps -- I suspect all the different checks from the linked line. However I think fully verifying the source of the data for the comparisons on that line is doing is beyond my abilities to do in a timely fashion.

NI: kinetik, have you seen anything like this before? I'd appreciate a second set of eyes to sanity check my working.

Flags: needinfo?(kinetik)

Tyson, have you run into this previously? Are there any more steps we could take to ensure this is a false positive as I suspect?

Flags: needinfo?(twsmith)

I have just started fuzzing with Valgrind but this is the second mention of a potential false positive in the last couple days.

Julian, is there a way to determine this is Valgrind reporting a false positive?
If so are there any plans to prevent these false positives?
Should I send the reports from rust code to you (Julian) first as a sanity check?

Flags: needinfo?(twsmith) → needinfo?(jseward)

There are multiple causes for false positives. It may be the &&/|| inversion problem mentioned in comment 3. But it might also be the use of integer equality on partially undefined inputs. The latter has largely been fixed in the Valgrind trunk (or can be). You can (and, for minimal noise level, should) use the trunk, which you can get thusly:

   git clone git://sourceware.org/git/valgrind.git trunk
   cd trunk
   ./autogen.sh
   ./configure --prefix=/wherever
   make -j8 && make -j8 install

But &&/|| inversion is a serious problem and so far there is no obvious route to a fix. I have contemplated this problem at length and continue to do so. See https://archive.fosdem.org/2018/schedule/event/debugging_tools_memcheck/attachments/slides/2655/export/events/attachments/debugging_tools_memcheck/slides/2655/Memcheck.pdf, in particular slide 13, for background.

For &&/|| inversions in C++, as is the case for bug 1538577 comment 3, it's often possible to work around the problem by installing a dummy initialisation. But in Rust that's usually impossible. The ideal solution is obviously to fix Memcheck, but I don't (yet) have a feasible plan for that.

Should I send the reports from rust code to you (Julian) first as a sanity check?

That would be useful, yes, please do. One thing I would like to do here is collect the relevant machine code fragments in the hope of being able to discern some kind of pattern which might lead to a solution.

Flags: needinfo?(jseward)

I should add: my impression is that gcc performs &&/|| inversion less aggressively than clang, so building with gcc might reduce the noise level for C++ code, at least. Also, reducing the optimisation level (eg, -Og instead of -O or -O instead of -O2) helps, although you'll obviously take a potentially large perf hit, particularly with -Og.

(In reply to Bryce Seager van Dyk (:bryce) from comment #3)

NI: kinetik, have you seen anything like this before? I'd appreciate a second set of eyes to sanity check my working.

Thanks for investigating! This is new to me. I can't immediately see the mp4parse-rust code doing anything unsafe here. Like Julian suggested, let's retry with a trunk Valgrind and see if this is still showing up.

Flags: needinfo?(kinetik)
Flags: needinfo?(twsmith)

Confirming that my repro of this was done with a master/trunk build of Valgrind, built from commit 92ecddd13ea52003c0c8fd0a3300411005c8d6b3.

I can confirm this is no longer reproducible with the latest version of Valgrind.

I tested with m-c 20200102-c7082b580eeb and Valgrind commit 2a7d3ae7681.

Status: NEW → RESOLVED
Closed: 3 months ago
Flags: needinfo?(twsmith)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.