Closed
Bug 479713
Opened 15 years ago
Closed 15 years ago
"Conditional jump or move depends on uninitialised value" in nsHTMLDocument::MaybeEditingStateChanged
Categories
(Core :: DOM: Editor, defect)
Core
DOM: Editor
Tracking
()
RESOLVED
FIXED
People
(Reporter: jruderman, Assigned: MatsPalmgren_bugz)
Details
(Keywords: testcase, valgrind)
Attachments
(4 files)
12.02 KB,
text/plain
|
Details | |
1.25 KB,
patch
|
Details | Diff | Splinter Review | |
7.15 KB,
text/plain
|
Details | |
721 bytes,
patch
|
roc
:
review+
roc
:
superreview+
|
Details | Diff | Splinter Review |
1. Load editor/composer/src/crashtests/428844-1.html in Firefox under Valgrind. Result: ==56620== Conditional jump or move depends on uninitialised value(s) ==56620== at 0x1870962: nsHTMLDocument::MaybeEditingStateChanged() ==56620== by 0x1792BD5: nsGenericElement::doInsertChildAt(nsIContent*, unsigned int, int, nsIContent*, nsIDocument*, nsAttrAndChildArray&) ==56620== by 0x177668B: nsDocument::InsertChildAt(nsIContent*, unsigned int, int) ==56620== by 0x188096C: nsXMLContentSink::SetDocElement(int, nsIAtom*, nsIContent*) ==56620== by 0x1880D2F: nsXMLContentSink::HandleStartElement(unsigned short const*, unsigned short const**, unsigned int, int, unsigned int, int) ==56620== by 0x1880E9B: nsXMLContentSink::HandleStartElement(unsigned short const*, unsigned short const**, unsigned int, int, unsigned int) ==56620== by 0x154FA95: nsExpatDriver::HandleStartElement(unsigned short const*, unsigned short const**) ==56620== by 0x1575D59: doContent ==56620== by 0x1576CFD: contentProcessor ==56620== by 0x1574840: doProlog ==56620== by 0x15755A4: prologProcessor ==56620== by 0x156E02D: MOZ_XML_Parse See http://blog.mozilla.com/nnethercote/2009/02/17/valgrind-mac-os-x-update-feb-17-2009/#comment-25 for comments about running Valgrind on Mac.
Flags: in-testsuite+
Reporter | ||
Comment 1•15 years ago
|
||
Assignee | ||
Comment 2•15 years ago
|
||
I can reproduce this on x86_64 Linux. It's nsHTMLDocument::mEditingState it complains about, which is odd since nsHTMLDocument inherits ZEROING_OPERATOR_NEW from nsIDocument. I added a printf in nsHTMLDocument::GetEditingState() and there are several preceding calls for the same document that does not trigger a valgrind error...
Assignee | ||
Comment 3•15 years ago
|
||
... adding an explicit assignment "mEditingState=eOff" just before said printf makes valgrind silent. So, at the time of the CTOR valgrind thinks it's initialized, then for 10 or so calls to GetEditingState() it thinks it's initialized, then all of sudden it thinks it's uninitialized... The only possible explanations I can think of is either a bug in valgrind itself, or we make bogus VALGRIND_* calls to misinform it about the status of this memory. AFAICT, jemalloc is only place we use VALGRIND_* macros (http://mxr.mozilla.org/mozilla-central/search?string=VALGRIND_) and I traced all of those and I don't see any suspicious calls...
Assignee | ||
Comment 4•15 years ago
|
||
... but using a suppressions file is better than messing around in the code, IMO.
Comment 5•15 years ago
|
||
Valgrind (Memcheck) is pretty good at tracking definedness, and is usually only confused by highly optimised hand-written assembly code, such as found in glibc's memcpy() function (and Memcheck replaces such functions with its own, less gnarly versions). However, interpreting these messages can be tricky because undefined values can be created and used in various ways without being complained about. Memcheck will only complain if an undefined value is used in a way that could affect your program's behaviour. The common cases are: - in the condition of a conditional branch (could cause incorrect control flow); - as a parameter to a syscall (could cause incorrect side-effects); - as the address of a load/store (could cause seg fault or similar). Things that it won't complain about include: - copying undefined values; - doing arithmetic on undefined values. So there can be a big gap between the original defect (you forgot to initialise a value) and the error reporting point. One way to help with diagnosing these problems is to use the VALGRIND_CHECK_MEM_IS_DEFINED or VALGRIND_CHECK_VALUE_IS_DEFINED macros. They're much better than printf for working backwards and finding out where the value is defined or undefined. Even better, if you have Valgrind 3.4.0, the --track-origins=yes option gives you information about where the undefined value originated, eg. the malloc'd block you forgot to initialise, or the stack variable you forgot to initialise. It makes Memcheck even slower than normal, but is very helpful in these situations. If you don't have Valgrind 3.4.0 it's worth getting just for this feature. See how these help you. If it's still a problem, I'll try to replicate and investigate it myself.
Assignee | ||
Comment 6•15 years ago
|
||
> Valgrind (Memcheck) is pretty good at tracking definedness I agree, my experience is that valgrind rarely have false positives, in fact this is the first time I have ever suspected so... and I was wrong. Thanks for your help; it allowed me to find the real error eventually. Using --track-origins=yes I got this output. It blames nsDocShell::EnsureEditorData() http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#9154 which allocates a nsDocShellEditorData, I printed the address and extent of nsDocShellEditorData and nsHTMLDocument but there's no overlap or anything like that. So, I looked a bit closer at nsDocShellEditorData and saw that it has a nsIHTMLDocument::EditingState member... which isn't initialized in the CTOR... doing so fixed it, so I'm assuming its "uninitializedness" traveled to nsHTMLDocument::mEditingState through some path... --track-origins is a great feature, but is it possible to force the error to show up earlier somehow? (That is, to have every use of an uninitialized value trigger an error.)
Assignee: nobody → mats.palmgren
Assignee | ||
Comment 7•15 years ago
|
||
Attachment #363811 -
Flags: superreview?(roc)
Attachment #363811 -
Flags: review?(roc)
Assignee | ||
Updated•15 years ago
|
OS: Mac OS X → All
Hardware: x86 → All
Reporter | ||
Comment 8•15 years ago
|
||
http://valgrind.org/docs/manual/mc-manual.html#mc-manual.value describes some of the constraints on detecting "use" of uninitialized memory. I wonder if we could use a macro or compiler hack to help Valgrind detect bugs like this without introducing new false positives.
Attachment #363811 -
Flags: superreview?(roc)
Attachment #363811 -
Flags: superreview+
Attachment #363811 -
Flags: review?(roc)
Attachment #363811 -
Flags: review+
Comment 9•15 years ago
|
||
Mats (re comment #6): no, there's no way to do more eager checking without modifying Valgrind. At the very least, you can't complain about copying undefined values, it happens all the time due to struct padding. It might be possible to warn if an undefined value is used in an arithmetic operation, but it would slow things down a lot as every arithmetic op would require test on all the operands. (And I bet there'd be false positives.) --track-origins is the current best compromise. Jesse (re comment #8): what would these hacks look like?
Reporter | ||
Comment 10•15 years ago
|
||
Add VALGRIND_CHECK_VALUE_IS_DEFINED for every argument to a constructor, using the precise (non-padded) size for structs. Annotate out-params (we already do this) and "optional" params, if needed.
Assignee | ||
Comment 11•15 years ago
|
||
(In reply to comment #9) > ... it happens all the time due to struct padding. I see what you mean and I agree there would be a lot of false positives from "low level C" parts of the application, e.g. libraries like GTK, cairo, sqlite3 etc, and maybe JS too, but other parts of the code like layout/ content/ dom/ etc would have few I think. The ideal is of course if there is a way for valgrind to find the exact layout of structs and know which parts are padding. Can it figure that out if the code is compiled with debug info? In this case though it was an undefined byte value passed as argument to a function... it should be possible to flag this directly without false positives? With eager checking... 1, it's easier to find the origin of the "use" 2, some "uses" will otherwise not be detected until they "might conceivably affect the outcome of your program's computation" and we want to fix the code before that happens --track-origins helps a lot with 1 and maybe 2 isn't that common (assuming that our tests have good code coverage), so I'm starting to think that eager checking is perhaps less valuable than I first thought. A feature to eagerly check integral values when passed as arguments to a function might be worth it though?
Assignee | ||
Comment 12•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/cf8a92034d2a -> FIXED
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 13•15 years ago
|
||
I tried turning on naive eager checking (checking even copies) once. I got hundreds of errors reported for /bin/true. Almost all structs have padding in them. As for debug info... in theory it would help, but it gets complicated pretty quickly. Reading all that extra debug info takes a lot more time and space as well. This is an area we've given a lot of consideration to, trading-off performance and likelihood of false positives. It's probably not going to change any time soon. Lots of people were happy when --track-origins=yes was added :)
You need to log in
before you can comment on or make changes to this bug.
Description
•