Closed
Bug 934542
Opened 10 years ago
Closed 10 years ago
Investigate why "possibly lost" Valgrind errors aren't being ignored even though --show-possibly-lost=no is present
Categories
(Testing :: Mozbase, defect)
Testing
Mozbase
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: froydnj, Assigned: n.nethercote)
References
Details
Attachments
(2 files, 1 obsolete file)
2.58 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
5.10 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
There are a lot of these, as one might expect; they usually look like: ==19788== 96 bytes in 3 blocks are possibly lost in loss record 13,568 of 20,628 ==19788== at 0x4C28A49: malloc (vg_replace_malloc.c:270) ==19788== by 0x9C6430B: js::PropertyTree::insertChild(js::ExclusiveContext*, js::Shape*, js::Shape*) (Utility.h:142) ==19788== by 0x9C65450: js::PropertyTree::getChild(js::ExclusiveContext*, js::Shape*, unsigned int, js::StackShape const&) (jspropertytree.cpp:194) ==19788== by 0x9C52347: JSObject::sealOrFreeze(JSContext*, JS::Handle<JSObject*>, JSObject::ImmutabilityType) (jsobj.cpp:1137) ==19788== by 0x9BC7D57: JS_FreezeObject(JSContext*, JSObject*) (jsobj.h:578) ==19788== by 0x9E7F33B: js::ctypes::InitTypeConstructor(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JSFunctionSpec, JSFunctionSpec const*, JSPropertySpec const*, JSFunctionSpec const*, JSPropertySpec const*, JS::MutableHandle<JSObject*>, JS::MutableHandle<JSObject*>) (CTypes.cpp:1045) ==19788== by 0x9E8F2B6: JS_InitCTypesClass(JSContext*, JSObject*) (CTypes.cpp:1174) ==19788== by 0x8999554: mozilla::dom::workers::DefineChromeWorkerFunctions(JSContext*, JS::Handle<JSObject*>) (ChromeWorkerScope.cpp:57) ==19788== by 0x89C14C7: mozilla::dom::workers::CreateGlobalScope(JSContext*) (WorkerScope.cpp:1484) ==19788== by 0x89B0D60: (anonymous namespace)::CompileScriptRunnable::WorkerRun(JSContext*, mozilla::dom::workers::WorkerPrivate*) (WorkerPrivate.cpp:808) ==19788== by 0x89B3A0B: mozilla::dom::workers::WorkerRunnable::Run() (WorkerPrivate.cpp:1769) ==19788== by 0x89BAC4C: mozilla::dom::workers::WorkerPrivate::DoRunLoop(JSContext*) (WorkerPrivate.cpp:3573) or ==19788== 96 bytes in 3 blocks are possibly lost in loss record 13,572 of 20,628 ==19788== at 0x4C28A49: malloc (vg_replace_malloc.c:270) ==19788== by 0x9C6430B: js::PropertyTree::insertChild(js::ExclusiveContext*, js::Shape*, js::Shape*) (Utility.h:142) ==19788== by 0x9C65450: js::PropertyTree::getChild(js::ExclusiveContext*, js::Shape*, unsigned int, js::StackShape const&) (jspropertytree.cpp:194) ==19788== by 0x9D0CAA5: js::Shape::replaceLastProperty(js::ExclusiveContext*, js::StackBaseShape const&, js::TaggedProto, JS::Handle<js::Shape*>) (Shape.cpp:344) ==19788== by 0x9D0CB8D: js::Shape::setObjectFlag(js::ExclusiveContext*, js::BaseShape::Flag, js::TaggedProto, js::Shape*) (Shape.cpp:1372) ==19788== by 0x9D0CDBE: js::ObjectImpl::setFlag(js::ExclusiveContext*, unsigned int, js::ObjectImpl::GenerateShape) (Shape.cpp:1336) ==19788== by 0x9C29DB2: JSObject::setNewTypeUnknown(JSContext*, js::Class const*, JS::Handle<JSObject*>) (jsinfer.cpp:3689) ==19788== by 0x9BD656E: js_InitArrayClass(JSContext*, JS::Handle<JSObject*>) (jsarray.cpp:3077) ==19788== by 0x9C44316: js_GetClassObject(js::ExclusiveContext*, JSObject*, JSProtoKey, JS::MutableHandle<JSObject*>) (jsobj.cpp:3075) ==19788== by 0x9CA93CE: js::StartOffThreadParseScript(JSContext*, JS::CompileOptions const&, char16_t const*, unsigned long, JS::Handle<JSObject*>, void (*)(void*, void*), void*) (jsworkers.cpp:242) ==19788== by 0x9BCA2EC: JS::CompileOffThread(JSContext*, JS::Handle<JSObject*>, JS::CompileOptions, char16_t const*, unsigned long, void (*)(void*, void*), void*) (jsapi.cpp:4425) ==19788== by 0x8B2C3B9: nsXULPrototypeScript::Compile(char16_t const*, int, nsIURI*, unsigned int, nsIDocument*, nsXULPrototypeDocument*, nsIOffThreadScriptReceiver*) (nsXULElement.cpp:2645) Should this memory get released at some point, or should we just ignore these sorts of stacks in Valgrind?
![]() |
||
Comment 1•10 years ago
|
||
"Possibly lost" errors from Valgrind found in Gecko products should be ignored. We should ignore them, as per: http://hg.mozilla.org/build/tools/file/default/scripts/valgrind/valgrind.sh#l70 However, I don't know why they are still showing. Perhaps njn has ideas... ?
![]() |
Assignee | |
Comment 2•10 years ago
|
||
Yeah, should be ignored. I'll investigate as to why they aren't getting ignored.
Assignee: nobody → n.nethercote
![]() |
Assignee | |
Comment 3•10 years ago
|
||
I think something is wrong with how Valgrind is invoked by the script -- judging from the output in https://tbpl.mozilla.org/php/getParsedLog.php?id=30044729&tree=Mozilla-Central&full=1, --show-possibly-lost=no and --gen-suppressions=all and --num-callers=50 are all being ignored. Line 83 of http://hg.mozilla.org/build/tools/file/default/scripts/valgrind/valgrind.sh is the invocation: make pgo-profile-run EXTRA_TEST_ARGS="--debugger=valgrind --debugger-args='$debugger_args'" || exit 1 The actual invocation, as reported by the shell, is this: + make pgo-profile-run 'EXTRA_TEST_ARGS=--debugger=valgrind --debugger-args='\''--error-exitcode=1 --smc-check=all-non-file --gen-suppressions=all --leak-check=full --num-callers=50 --show-possibly-lost=no --track-origins=yes --suppressions=/builds/slave/m-cen-l64-valgrind-00000000000/objdir/_valgrind/cross-architecture.sup --suppressions=/builds/slave/m-cen-l64-valgrind-00000000000/objdir/_valgrind/x86_64-redhat-linux-gnu.sup'\''' Shell quoting can be confusing, and I wonder if something is wrong here. Or does the definition of EXTRA_TEST_ARGS need to precede |make pgo-profile-run|? Or should there not be a '=' after |--debugger-args|? And yet, when I try running a similar command locally: make -C cd64 mochitest EXTRA_TEST_ARGS="--debugger=valgrind --debugger-args='$debugger_args'" (with $debugger_args set appropriately) I see this invocation: + make -C cd64 mochitest 'EXTRA_TEST_ARGS=--debugger=valgrind --debugger-args='\''--show-possibly-lost=no --num-callers=3 --gen-suppressions=all'\''' which matches the syntax of the try run, and it works fine, i.e. Valgrind's output indicate that it sees the options correctly. So I don't know what's going on.
![]() |
||
Comment 4•10 years ago
|
||
Ted, I'm not sure if you wrote the original Valgrind bootstrap script here, but do you possibly have any ideas/tips?
Flags: needinfo?(ted)
![]() |
||
Updated•10 years ago
|
Summary: Valgrind on tbpl detects leak - stacks containing js::PropertyTree::insertChild → Investigate why "possibly lost" Valgrind errors aren't being ignored even though --show-possibly-lost=no is present
![]() |
Assignee | |
Comment 5•10 years ago
|
||
Ok, I've worked this out. --debugger-args works with |make mochitest|. That target runs testing/mochitest/runtests.py, which uses getDebuggerInfo from automationutils.py. --debugger-args doesn't work with |make pgo-profile-run|; I've confirmed this locally. That target runs build/pgo/profileserver.py. Prior to changeset 134096:8b379faea096 (bug 746244), it used getDebuggerInfo from automationutils.py, which respects --debugger-args. Now it uses cli.debugger_arguments(), which ignores --debugger-args and instead ends up using these default arguments for Valgrind from testing/mozbase/mozrunner/mozrunner/local.py: debuggers = {'gdb': {'interactive': True, 'args': ['-q', '--args'],}, 'valgrind': {'interactive': False, 'args': ['--leak-check=full']} } So I think this has been busted since June 5th or so, when bug 746244 landed.
Assignee: n.nethercote → ted
QA Contact: n.nethercote
![]() |
Assignee | |
Comment 6•10 years ago
|
||
There are two problems relating to debugger_args, which holds the result of any --debugger-args options. - First, it's not passed to debugger_arguments(). - Second, there's some type confusion. As written, debugger_arguments is a list of arguments which gets appended to with multiple --debugger-arguments options, e.g. like this: > EXTRA_TEST_ARGS="--debugger=valgrind --debugger-args=--num-callers=10 --debugger-args=--gen-suppressions=all" But this is different to runtests.py, which can only handle a single --debugger-arguments option, like this: > EXTRA_TEST_ARGS="--debugger=valgrind --debugger-args='--num-callers=10 --gen-suppressions=all'" and stores them as a single string that uses space as a separator. I don't know which is preferred, but runtests.py uses the latter form, as doe valgrind.sh (in the build/tools/ repo), I've changed local.py to work like that. With this patch in place, Valgrind correctly receives its arguments when I run it locally.
Attachment #828597 -
Flags: review?(ted)
![]() |
Assignee | |
Updated•10 years ago
|
Assignee: ted → n.nethercote
Status: NEW → ASSIGNED
![]() |
Assignee | |
Comment 7•10 years ago
|
||
Oh, the previous patch didn't work when --debugger-args was omitted. This one does.
Attachment #828599 -
Flags: review?(ted)
![]() |
Assignee | |
Updated•10 years ago
|
Attachment #828597 -
Attachment is obsolete: true
Attachment #828597 -
Flags: review?(ted)
Comment 8•10 years ago
|
||
Comment on attachment 828599 [details] [diff] [review] Make --debugger-args work in EXTRA_TEST_ARGS with |make pgo-profile-run|. Review of attachment 828599 [details] [diff] [review]: ----------------------------------------------------------------- Yeah, that's pretty obvious after looking at your patch. Sorry that this was broken in mozbase and thus in profileserver.py when I ported it over. This needs to land upstream in github, I can do that for you: https://github.com/mozilla/mozbase/ You can then cherry-pick it to inbound.
Attachment #828599 -
Flags: review?(ted) → review+
Updated•10 years ago
|
Component: JavaScript Engine → Mozbase
Flags: needinfo?(ted)
Product: Core → Testing
QA Contact: hskupin
Comment 9•10 years ago
|
||
Pushed to github: https://github.com/mozilla/mozbase/commit/ac5ad6efe4a8adf35c0a05c7219720bf25f2e43e Feel free to land in inbound to get this fixed faster than the next mozbase sync from github->hg.
Comment 10•10 years ago
|
||
pushed all mozrunner changes to try: https://tbpl.mozilla.org/?tree=Try&rev=8008fa03d45f If this goes well, I can+will bump mozrunner -> 5.27 + mirror -> m-c
![]() |
Assignee | |
Comment 11•10 years ago
|
||
jhammel, am I right in thinking that I don't need to do anything else here, I can just wait for you? If so, thanks!
> https://tbpl.mozilla.org/?tree=Try&rev=8008fa03d45f
That push contains a couple of changes to remote.py and runner.py that weren't in my patch. Is that intentional?
Comment 12•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #11) > jhammel, am I right in thinking that I don't need to do anything else here, > I can just wait for you? If so, thanks! Assuming the try run goes well, yes. > > https://tbpl.mozilla.org/?tree=Try&rev=8008fa03d45f > > That push contains a couple of changes to remote.py and runner.py that > weren't in my patch. Is that intentional? Yes. They are all outstanding changes to mozrunner in github.
Comment 13•10 years ago
|
||
version bumped + released: https://github.com/mozilla/mozbase/commit/ae38a09787dfb2357afb2851dcb1e0840572b9e7 https://pypi.python.org/pypi/mozrunner/5.27
![]() |
Assignee | |
Comment 15•10 years ago
|
||
Ok, so the next Valgrind run (3am on Nov 13) will be the first one to see the changes, and will hopefully work.
![]() |
||
Comment 16•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #15) > Ok, so the next Valgrind run (3am on Nov 13) will be the first one to see > the changes, and will hopefully work. We could also politely ask the buildduty releng folk to do a manual trigger. Just mentioning an option.
![]() |
Assignee | |
Comment 17•10 years ago
|
||
> We could also politely ask the buildduty releng folk to do a manual trigger.
> Just mentioning an option.
I'm about to finish up for the day, so I didn't bother, but if someone else wants to, please do! :)
![]() |
Assignee | |
Comment 18•10 years ago
|
||
> Ok, so the next Valgrind run (3am on Nov 13) will be the first one to see
> the changes, and will hopefully work.
Oh wait, jhammel's patch hasn't landed yet.
Comment 19•10 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #18) > > Ok, so the next Valgrind run (3am on Nov 13) will be the first one to see > > the changes, and will hopefully work. > > Oh wait, jhammel's patch hasn't landed yet. no review yet. Please feel free to review
Updated•10 years ago
|
Attachment #830911 -
Flags: review?(n.nethercote)
![]() |
Assignee | |
Comment 20•10 years ago
|
||
Comment on attachment 830911 [details] [diff] [review] ae38a09787dfb2357afb2851dcb1e0840572b9e7.diff Review of attachment 830911 [details] [diff] [review]: ----------------------------------------------------------------- Er, ok... - The local.py changes are identical to the ones from my patch, so that's fine. - The remote.py change is a trivial change from double-quotes to single-quotes, so that's fine. - The runner.py change is just removal of what I assume is a dead function. I don't know if it really is dead, but if try says it's ok it must be. - The setup.py change (PACKAGE_VERSION) I also don't know about but it seems reasonable. r=me, I guess.
Attachment #830911 -
Flags: review?(n.nethercote) → review+
![]() |
Assignee | |
Updated•10 years ago
|
Attachment #830911 -
Flags: review?(ted)
![]() |
Assignee | |
Comment 21•10 years ago
|
||
I landed this because I want to make progress on the Valgrind-on-TBPL work. Hopefully I did it correctly... https://hg.mozilla.org/mozilla-central/rev/d58ab6f6ca0a
![]() |
Assignee | |
Comment 22•10 years ago
|
||
The first run post-landing is here: https://tbpl.mozilla.org/php/getParsedLog.php?id=30692824&tree=Mozilla-Central The good news is that the right flags are now being passed. Yay. The bad news is that we still get "Output exceeded 52428800 bytes, remaining output has been truncated". Although we are getting many fewer errors due to ignoring "possibly lost" leaks, we are printing more info for each error because we're (a) generating suppressions, and (b) printing stacks up to 50 entries deep. The good news is that it looks like there are a small number of distinct leaks that account for most of the errors, and they show up as many separate errors due to minor variations in the stack traces. I will investigate further and file follow-up bugs.
![]() |
||
Comment 23•10 years ago
|
||
> The bad news is that we still get "Output exceeded 52428800 bytes, remaining > output has been truncated". Although we are getting many fewer errors due > to ignoring "possibly lost" leaks, we are printing more info for each error > because we're (a) generating suppressions, and (b) printing stacks up to 50 > entries deep. And we're using --track-origins=yes, printing both more info and causing it to be slower. However, we get the benefit of not having to re-run any instance to get these suppression stacks/additional information. We print 50 entries deep because of bug 714690 comment 5, likewise for the same reason. We can definitely tweak this number, though..
![]() |
Assignee | |
Comment 24•10 years ago
|
||
Oh, I'm not complaining about the current options. If we fix 5 or 6 leaks the output will be reduced to almost nothing.
![]() |
||
Comment 25•10 years ago
|
||
And which allows us to spot the hard-to-reproduce occasional error. :)
![]() |
Assignee | |
Updated•10 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
![]() |
Assignee | |
Updated•10 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•