Closed Bug 960520 Opened 9 years ago Closed 8 years ago

PendingMarkers::mGenID is used without being initialised


(Core :: Gecko Profiler, defect)

Not set





(Reporter: jseward, Assigned: BenWa)




(1 file, 1 obsolete file)

When using SPS on x86_64-linux on Valgrind, I get a lot of the
following errors.  I've been seeing them for a couple of months:

Uninitialised byte(s) found during client check request
   at 0x6F9E3E2: PendingMarkers::addMarker(ProfilerMarker*) (platform.cpp:173)
   by 0x6FA4A2B: PseudoStack::addMarker(char const*, ProfilerMarkerPayload*) (PseudoStack.h:326)
   by 0x6FA2080: mozilla_sampler_add_marker(char const*, ProfilerMarkerPayload*) (platform.cpp:917)
   by 0x6FA2178: mozilla::ProfilerIOInterposeObserver::Observe(mozilla::IOInterposeObserver::Observation&) (ProfilerIOInterposeObserver.cpp:42)
   by 0x6F9BA6D: mozilla::IOInterposer::Report(mozilla::IOInterposeObserver::Observation&) (IOInterposer.cpp:188)
   by 0x6F9BDAF: (anonymous namespace)::interposedFileInfo(PRFileDesc*, PRFileInfo*) (NSPRInterposer.cpp:47)
   by 0x4C4D92C: PR_GetOpenFileInfo (priometh.c:139)
   by 0x4C6915B: _MD_CreateFileMap (unix.c:3553)
   by 0x4C4FBF1: PR_CreateFileMap (prmmap.c:32)
   by 0x64CFA5D: mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, JSScript**, char**, bool, JS::MutableHandle<JS::Value>) (mozJSComponentLoader.cpp:843)
   by 0x64D14B2: mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) (mozJSComponentLoader.cpp:1232)
   by 0x64D174C: mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) (mozJSComponentLoader.cpp:1128)
 Address 0x487fc14 is 32,804 bytes inside a block of size 32,856 alloc'd
   at 0x4808FD4: malloc (vg_replace_malloc.c:292)
   by 0x481D451: moz_xmalloc (mozalloc.cpp:52)
   by 0x6FA13F4: new_PseudoStack() (mozalloc.h:201)
   by 0x6FA30CA: mozilla_sampler_init(void*) (platform.cpp:465)
   by 0x70293A7: profiler_init(void*) (GeckoProfilerImpl.h:59)
   by 0x702F544: XREMain::XRE_main(int, char**, nsXREAppData const*) (GeckoProfiler.h:183)
   by 0x702F920: XRE_main (nsAppRunner.cpp:4331)
   by 0x4036F2: do_main(int, char**, nsIFile*) (nsBrowserApp.cpp:280)
   by 0x403819: main (nsBrowserApp.cpp:648)

   vTRUNK --track-origins=no --smc-check=all-non-file --fair-sched=yes \
   ./ff-opt-linux/dist/bin/firefox-bin -P dev -no-remote

This is with the patch for bug 938157 in place, but I am sure I can repro
without that too.

PendingMarkers::addMarker(ProfilerMarker*) looks like this:

  PendingMarkers::addMarker(ProfilerMarker *aMarker) {
    mSignalLock = true;


    // Clear markers that have been overwritten
    VALGRIND_CHECK_VALUE_IS_DEFINED(mGenID);  <---- LINE 213; V complains here
    while (mStoredMarkers.peek() &&
           mStoredMarkers.peek()->HasExpired(mGenID)) { <------ and here
      delete mStoredMarkers.popHead();
    mSignalLock = false;

Line 213 was added by me -- it makes it easy to ascertain that mGenID
is the problem.

Allocation path for this mGenID is

  mozilla_sampler_init calls new PseudoStack()

  PseudoStack has "PendingMarkers mPendingMarkers"

  PendingMarkers has "volatile mozilla::sig_safe_t mGenID"
  and it is this mGenID that is used before initialisation.
  In particular PendingMarkers::PendingMarkers does not
  initialise it.
Note that it's not ProfilerMarker::mGenID that is the problem, but
rather PendingMarkers::mGenID.  In irc chat with BenWa I suggested
that it was ProfilerMarker::mGenID that was the problem, but that
turns out not to be the case.  I didn't realise at the time that
mGenID exists in more than one class.
Attached patch trivial fix (obsolete) — Splinter Review
Trivial fix that stops Valgrind complaining.  No idea if it is
right in the "bigger picture".  I chose to initialise mGenID to
zero rather than -1 because I am not sure whether mozilla::sig_safe_t
is signed on all platforms.
Attachment #8361063 - Flags: feedback?(bgirard)
I think mGenID should always be assigned before being written. A value of zero is incorrect. If Valgrind complains then its finding a real problem.

IMO this should be WONTFIX but I could be convinced.
Attachment #8361063 - Flags: feedback?(bgirard)
Blocks: 1022583
(In reply to Benoit Girard (:BenWa) (Off until June 30th) from comment #4)

This fix, or something equivalent, needs to land.  Without it, debug
desktop-linux builds of SPS+LUL segfault consistently when
ProfilerButton -> Analyze is clicked.  With it, when applied
together with the fix for bug 1031284, SPS+LUL on desktop Linux runs
reliably in both debug and opt builds.

The segfaults have various stacks, one of which is shown below.  Note
the invalid address 0x110000001a -- I have also seen
0x5a5a5a5a5a5a5a5a for that value.  Is that freed-memory-fill?

Although I can't reproduce it now, I have seen
ProfilerMarker::mPayload being a nonsense pointer too.  My hunch is
that the non-initialised-ness of PendingMarkers::mGenID causes invalid
(not-filled-in? old?) ProfilerMarkers to be passed to the
serialisation mechanism, resulting in these failures.

Any hints as to a better fix?

Program received signal SIGSEGV, Segmentation fault.

0x00007ffff44e50a2 in EscapeToStream (stream=..., str=str@entry=0x110000001a
    <Address 0x110000001a out of bounds>)
    at tools/profiler/JSStreamWriter.cpp:26
26	  size_t len = strlen(str);
(gdb) p str
$2 = 0x110000001a <Address 0x110000001a out of bounds>

(gdb) where
#0  EscapeToStream (stream=..., str=str@entry=0x110000001a 
    <Address 0x110000001a out of bounds>)
    at tools/profiler/JSStreamWriter.cpp:26

#1  JSStreamWriter::Value (this=0x7ffffffee4d0,
    aValue=0x110000001a <Address 0x110000001a out of bounds>)
    at tools/profiler/JSStreamWriter.cpp:175

#2  NameValue<char const*> (aValue=0x110000001a
    <Address 0x110000001a out of bounds>, aName=0x7ffff54d4fc7 "name", 
    at tools/profiler/JSStreamWriter.h:32

#3  ProfilerMarker::StreamJSObject (this=0x7fffd5baf7c0, b=...) 
    at tools/profiler/platform.cpp:151

#4  ThreadProfile::StreamJSObject (this=0x7fffd5c8dc10, b=...)
    at tools/profiler/ProfileEntry.cpp:471

#5  TableTicker::StreamJSObject (this=this@entry=0x7fffd78e3ae0, b=...)
    at tools/profiler/TableTicker.cpp:282

#6  TableTicker::ToJSObject (this=0x7fffd78e3ae0, aCx=0x7fffe7517170)
    at tools/profiler/TableTicker.cpp:174
Flags: needinfo?(bgirard)
We can't get to using mGenID in PendingMarkers::addMarker unless we had a call to PendingMarkers::addStoredMarker which would of also read mGenId. Wouldn't this complain about mGenID first?
Flags: needinfo?(jseward)
(In reply to Benoit Girard (:BenWa) from comment #6)
> Wouldn't this complain about mGenID first?

I think the answer is no, because Memcheck doesn't complain about
copying undefined data around.  It only complains when it is used to
control a branch or generate a memory address.  Analysis:

Suppose we first call PendingMarkers::addStoredMarker with
PendingMarkers::mGenID being undefined.  This calls


but that doesn't inspect the value -- it just copies it into

  ProfilerMarker::SetGeneration(int aGenID) {
    mGenID = aGenID;

so Memcheck marks the destination (ProfilerMarker::mGenID) as 
undefined, but doesn't say anything.

Then we do


but mStoredMarkers is a ProfilerLinkedList<ProfilerMarker>
and ProfilerLinkedList::insert doesn't look at the list
elements in any way.  Just copies them around, so again
Memcheck doesn't complain.
Flags: needinfo?(jseward)
Ahh ok, with your explanation then we're just adding markers before we set the generation number. I believe this fix fits the evidence that we have.
Assignee: jseward → bgirard
Attachment #8361063 - Attachment is obsolete: true
Attachment #8451127 - Flags: review?(jseward)
Flags: needinfo?(bgirard)
Comment on attachment 8451127 [details] [diff] [review]
Set generation before adding markers

Review of attachment 8451127 [details] [diff] [review]:

Looks good.  I verified that it (1) gets rid of the Valgrind warnings, 
and (2) stops SPS+LUL segfaulting in debug builds.
Attachment #8451127 - Flags: review?(jseward) → review+
Flags: needinfo?(bgirard)
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in before you can comment on or make changes to this bug.