Closed
Bug 960520
Opened 12 years ago
Closed 11 years ago
PendingMarkers::mGenID is used without being initialised
Categories
(Core :: Gecko Profiler, defect)
Tracking
()
RESOLVED
FIXED
mozilla33
People
(Reporter: jseward, Assigned: BenWa)
References
Details
Attachments
(1 file, 1 obsolete file)
|
3.05 KB,
patch
|
jseward
:
review+
|
Details | Diff | Splinter Review |
When using SPS on x86_64-linux on Valgrind, I get a lot of the
following errors. I've been seeing them for a couple of months:
Uninitialised byte(s) found during client check request
at 0x6F9E3E2: PendingMarkers::addMarker(ProfilerMarker*) (platform.cpp:173)
by 0x6FA4A2B: PseudoStack::addMarker(char const*, ProfilerMarkerPayload*) (PseudoStack.h:326)
by 0x6FA2080: mozilla_sampler_add_marker(char const*, ProfilerMarkerPayload*) (platform.cpp:917)
by 0x6FA2178: mozilla::ProfilerIOInterposeObserver::Observe(mozilla::IOInterposeObserver::Observation&) (ProfilerIOInterposeObserver.cpp:42)
by 0x6F9BA6D: mozilla::IOInterposer::Report(mozilla::IOInterposeObserver::Observation&) (IOInterposer.cpp:188)
by 0x6F9BDAF: (anonymous namespace)::interposedFileInfo(PRFileDesc*, PRFileInfo*) (NSPRInterposer.cpp:47)
by 0x4C4D92C: PR_GetOpenFileInfo (priometh.c:139)
by 0x4C6915B: _MD_CreateFileMap (unix.c:3553)
by 0x4C4FBF1: PR_CreateFileMap (prmmap.c:32)
by 0x64CFA5D: mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, JSScript**, char**, bool, JS::MutableHandle<JS::Value>) (mozJSComponentLoader.cpp:843)
by 0x64D14B2: mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) (mozJSComponentLoader.cpp:1232)
by 0x64D174C: mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) (mozJSComponentLoader.cpp:1128)
Address 0x487fc14 is 32,804 bytes inside a block of size 32,856 alloc'd
at 0x4808FD4: malloc (vg_replace_malloc.c:292)
by 0x481D451: moz_xmalloc (mozalloc.cpp:52)
by 0x6FA13F4: new_PseudoStack() (mozalloc.h:201)
by 0x6FA30CA: mozilla_sampler_init(void*) (platform.cpp:465)
by 0x70293A7: profiler_init(void*) (GeckoProfilerImpl.h:59)
by 0x702F544: XREMain::XRE_main(int, char**, nsXREAppData const*) (GeckoProfiler.h:183)
by 0x702F920: XRE_main (nsAppRunner.cpp:4331)
by 0x4036F2: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:280)
by 0x403819: main (nsBrowserApp.cpp:648)
STR:
MOZ_PROFILER_INTERVAL=50 MOZ_PROFILER_MODE=native MOZ_PROFILER_NEW=1 \
MOZ_PROFILER_VERBOSE=1 MOZ_PROFILER_STACK_SCAN=0 MOZ_PROFILER_STARTUP= \
vTRUNK --track-origins=no --smc-check=all-non-file --fair-sched=yes \
./ff-opt-linux/dist/bin/firefox-bin -P dev -no-remote
This is with the patch for bug 938157 in place, but I am sure I can repro
without that too.
| Reporter | ||
Comment 1•12 years ago
|
||
(analysis):
PendingMarkers::addMarker(ProfilerMarker*) looks like this:
void
PendingMarkers::addMarker(ProfilerMarker *aMarker) {
mSignalLock = true;
STORE_SEQUENCER();
MOZ_ASSERT(aMarker);
mPendingMarkers.insert(aMarker);
// Clear markers that have been overwritten
VALGRIND_CHECK_VALUE_IS_DEFINED(mGenID); <---- LINE 213; V complains here
while (mStoredMarkers.peek() &&
mStoredMarkers.peek()->HasExpired(mGenID)) { <------ and here
delete mStoredMarkers.popHead();
}
STORE_SEQUENCER();
mSignalLock = false;
}
Line 213 was added by me -- it makes it easy to ascertain that mGenID
is the problem.
Allocation path for this mGenID is
mozilla_sampler_init calls new PseudoStack()
PseudoStack has "PendingMarkers mPendingMarkers"
PendingMarkers has "volatile mozilla::sig_safe_t mGenID"
and it is this mGenID that is used before initialisation.
In particular PendingMarkers::PendingMarkers does not
initialise it.
| Reporter | ||
Comment 2•12 years ago
|
||
Note that it's not ProfilerMarker::mGenID that is the problem, but
rather PendingMarkers::mGenID. In irc chat with BenWa I suggested
that it was ProfilerMarker::mGenID that was the problem, but that
turns out not to be the case. I didn't realise at the time that
mGenID exists in more than one class.
| Reporter | ||
Comment 3•12 years ago
|
||
Trivial fix that stops Valgrind complaining. No idea if it is
right in the "bigger picture". I chose to initialise mGenID to
zero rather than -1 because I am not sure whether mozilla::sig_safe_t
is signed on all platforms.
Attachment #8361063 -
Flags: feedback?(bgirard)
| Assignee | ||
Comment 4•12 years ago
|
||
I think mGenID should always be assigned before being written. A value of zero is incorrect. If Valgrind complains then its finding a real problem.
IMO this should be WONTFIX but I could be convinced.
| Assignee | ||
Updated•12 years ago
|
Attachment #8361063 -
Flags: feedback?(bgirard)
| Reporter | ||
Comment 5•11 years ago
|
||
(In reply to Benoit Girard (:BenWa) (Off until June 30th) from comment #4)
This fix, or something equivalent, needs to land. Without it, debug
desktop-linux builds of SPS+LUL segfault consistently when
ProfilerButton -> Analyze is clicked. With it, when applied
together with the fix for bug 1031284, SPS+LUL on desktop Linux runs
reliably in both debug and opt builds.
The segfaults have various stacks, one of which is shown below. Note
the invalid address 0x110000001a -- I have also seen
0x5a5a5a5a5a5a5a5a for that value. Is that freed-memory-fill?
Although I can't reproduce it now, I have seen
ProfilerMarker::mPayload being a nonsense pointer too. My hunch is
that the non-initialised-ness of PendingMarkers::mGenID causes invalid
(not-filled-in? old?) ProfilerMarkers to be passed to the
serialisation mechanism, resulting in these failures.
Any hints as to a better fix?
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff44e50a2 in EscapeToStream (stream=..., str=str@entry=0x110000001a
<Address 0x110000001a out of bounds>)
at tools/profiler/JSStreamWriter.cpp:26
26 size_t len = strlen(str);
(gdb) p str
$2 = 0x110000001a <Address 0x110000001a out of bounds>
(gdb) where
#0 EscapeToStream (stream=..., str=str@entry=0x110000001a
<Address 0x110000001a out of bounds>)
at tools/profiler/JSStreamWriter.cpp:26
#1 JSStreamWriter::Value (this=0x7ffffffee4d0,
aValue=0x110000001a <Address 0x110000001a out of bounds>)
at tools/profiler/JSStreamWriter.cpp:175
#2 NameValue<char const*> (aValue=0x110000001a
<Address 0x110000001a out of bounds>, aName=0x7ffff54d4fc7 "name",
this=0x7ffffffee4d0)
at tools/profiler/JSStreamWriter.h:32
#3 ProfilerMarker::StreamJSObject (this=0x7fffd5baf7c0, b=...)
at tools/profiler/platform.cpp:151
#4 ThreadProfile::StreamJSObject (this=0x7fffd5c8dc10, b=...)
at tools/profiler/ProfileEntry.cpp:471
#5 TableTicker::StreamJSObject (this=this@entry=0x7fffd78e3ae0, b=...)
at tools/profiler/TableTicker.cpp:282
#6 TableTicker::ToJSObject (this=0x7fffd78e3ae0, aCx=0x7fffe7517170)
at tools/profiler/TableTicker.cpp:174
| Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(bgirard)
| Assignee | ||
Comment 6•11 years ago
|
||
We can't get to using mGenID in PendingMarkers::addMarker unless we had a call to PendingMarkers::addStoredMarker which would of also read mGenId. Wouldn't this complain about mGenID first?
Flags: needinfo?(jseward)
| Reporter | ||
Comment 7•11 years ago
|
||
(In reply to Benoit Girard (:BenWa) from comment #6)
> Wouldn't this complain about mGenID first?
I think the answer is no, because Memcheck doesn't complain about
copying undefined data around. It only complains when it is used to
control a branch or generate a memory address. Analysis:
Suppose we first call PendingMarkers::addStoredMarker with
PendingMarkers::mGenID being undefined. This calls
aStoredMarker->SetGeneration(mGenID);
but that doesn't inspect the value -- it just copies it into
memory
ProfilerMarker::SetGeneration(int aGenID) {
mGenID = aGenID;
}
so Memcheck marks the destination (ProfilerMarker::mGenID) as
undefined, but doesn't say anything.
Then we do
mStoredMarkers.insert(aStoredMarker);
but mStoredMarkers is a ProfilerLinkedList<ProfilerMarker>
and ProfilerLinkedList::insert doesn't look at the list
elements in any way. Just copies them around, so again
Memcheck doesn't complain.
Flags: needinfo?(jseward)
| Assignee | ||
Comment 8•11 years ago
|
||
Ahh ok, with your explanation then we're just adding markers before we set the generation number. I believe this fix fits the evidence that we have.
Assignee: jseward → bgirard
Attachment #8361063 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8451127 -
Flags: review?(jseward)
Flags: needinfo?(bgirard)
| Reporter | ||
Comment 9•11 years ago
|
||
Comment on attachment 8451127 [details] [diff] [review]
Set generation before adding markers
Review of attachment 8451127 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good. I verified that it (1) gets rid of the Valgrind warnings,
and (2) stops SPS+LUL segfaulting in debug builds.
Attachment #8451127 -
Flags: review?(jseward) → review+
| Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(bgirard)
| Assignee | ||
Comment 10•11 years ago
|
||
Flags: needinfo?(bgirard)
Comment 11•11 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in
before you can comment on or make changes to this bug.
Description
•