If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

With e10s enabled, TargetConfig::orientation is used uninitialised

RESOLVED FIXED in Firefox 38

Status

()

Core
Graphics: Layers
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: jseward, Assigned: kats)

Tracking

unspecified
mozilla38
x86_64
Linux
Points:
---

Firefox Tracking Flags

(e10s-, firefox38 fixed)

Details

Attachments

(4 attachments, 2 obsolete attachments)

(Reporter)

Description

3 years ago
STR: enable e10s, and restart Fx on V thusly:

vTRUNK --smc-check=all-non-file 
  --vex-iropt-register-updates=allregs-at-mem-access
  --read-inline-info=yes --fair-sched=yes --partial-loads-ok=yes
  --show-mismatched-frees=no --fullpath-after=-MC/ --track-origins=yes
  ./ff-O2-linux64/dist/bin/firefox-bin

Moving of the mouse cursor around the default new tab (the one with
the newfangled tiley-things) causes the complaints shown in the next
comment.

I don't see this when e10s is not enabled.

Approximate analysis is as follows:

CompositorParent::AllocPLayerTransactionParent
(gfx/layers/ipc.CompositorParent.cpp:970)

creates a new AsyncCompositionManager:

994:  mCompositionManager = new AsyncCompositionManager(mLayerManager);

AsyncCompositionManager has the following member

  TargetConfig mTargetConfig;

I could not find any constructor for TargetConfig, so I assume it
uses the default constructor, which does not initialise any fields.

Later on though, we look at one of mTargetConfig's fields, in
AsyncCompositionManager::RequiresReorientation:

  bool RequiresReorientation(mozilla::dom::ScreenOrientation aOrientation)
  {
    return mTargetConfig.orientation() != aOrientation;
  }

I assume that mTargetConfig.orientation simply returns
TargetConfig::orientation, as defined at
./gfx/layers/ipc/LayersMessages.ipdlh:49

Adding a bit of temporary instrumentation to
AsyncCompositionManager::RequiresReorientation confirms that it is
mTargetConfig.orientation() that is undefined, not aOrientation:

  bool RequiresReorientation(mozilla::dom::ScreenOrientation aOrientation)
  {
    VALGRIND_CHECK_VALUE_IS_DEFINED(aOrientation);
    mozilla::dom::ScreenOrientation tmp = mTargetConfig.orientation();
    VALGRIND_CHECK_VALUE_IS_DEFINED(tmp);
    return mTargetConfig.orientation() != aOrientation;
  }
(Reporter)

Comment 1

3 years ago
Valgrind complaint:

Thread 22 Compositor:
Conditional jump or move depends on uninitialised value(s)
   at 0x61D8436: mozilla::layers::CompositorParent::ScheduleRotationOnCompositorThread(mozilla::layers::TargetConfig const&, bool) (gfx/layers/ipc/CompositorParent.cpp:785)
   by 0x61E42AC: mozilla::layers::CompositorParent::ShadowLayersUpdated(mozilla::layers::LayerTransactionParent*, unsigned long const&, mozilla::layers::TargetConfig const&, bool, bool, unsigned int, bool) (gfx/layers/ipc/CompositorParent.cpp:805)
   by 0x61DE63E: mozilla::layers::LayerTransactionParent::RecvUpdate(nsTArray<mozilla::layers::Edit> const&, unsigned long const&, mozilla::layers::TargetConfig const&, bool const&, bool const&, unsigned int const&, bool const&, mozilla::TimeStamp const&, nsTArray<mozilla::layers::EditReply>*) (gfx/layers/ipc/LayerTransactionParent.cpp:545)
   by 0x5D1E65C: mozilla::layers::PLayerTransactionParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (ff-O2-linux64/ipc/ipdl/PLayerTransactionParent.cpp:687)
   by 0x5C0FA28: mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (ff-O2-linux64/ipc/ipdl/PCompositorParent.cpp:575)
   by 0x5B3BFE2: mozilla::ipc::MessageChannel::DispatchSyncMessage(IPC::Message const&) (ipc/glue/MessageChannel.cpp:1134)
   by 0x5B4390D: mozilla::ipc::MessageChannel::OnMaybeDequeueOne() (ipc/glue/MessageChannel.cpp:1106)
   by 0x5B1EBB4: RunTask (ipc/chromium/src/base/message_loop.cc:358)
   by 0x5B1EBB4: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (ipc/chromium/src/base/message_loop.cc:366)
   by 0x5B1EE0A: DoWork (ipc/chromium/src/base/message_loop.cc:444)
   by 0x5B1EE0A: MessageLoop::DoWork() (ipc/chromium/src/base/message_loop.cc:423)
   by 0x5B18789: base::MessagePumpDefault::Run(base::MessagePump::Delegate*) (ipc/chromium/src/base/message_pump_default.cc:34)
   by 0x5B18A11: RunInternal (ipc/chromium/src/base/message_loop.cc:230)
   by 0x5B18A11: RunHandler (ipc/chromium/src/base/message_loop.cc:223)
   by 0x5B18A11: MessageLoop::Run() (ipc/chromium/src/base/message_loop.cc:197)
   by 0x5B232A6: base::Thread::ThreadMain() (ipc/chromium/src/base/thread.cc:170)

 Uninitialised value was created by a heap allocation
   at 0x4809064: malloc (/home/sewardj/VgTRUNK/trunk/coregrind/m_replacemalloc/vg_replace_malloc.c:296)
   by 0x4828803: moz_xmalloc (memory/mozalloc/mozalloc.cpp:52)
   by 0x61E534E: operator new (ff-O2-linux64/gfx/layers/../../dist/include/mozilla/mozalloc.h:208)
   by 0x61E534E: mozilla::layers::CompositorParent::AllocPLayerTransactionParent(nsTArray<mozilla::layers::LayersBackend> const&, unsigned long const&, mozilla::layers::TextureFactoryIdentifier*, bool*) (gfx/layers/ipc/CompositorParent.cpp:981)
   by 0x5C10A65: mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (ff-O2-linux64/ipc/ipdl/PCompositorParent.cpp:868)
   by 0x5B3BFE2: mozilla::ipc::MessageChannel::DispatchSyncMessage(IPC::Message const&) (ipc/glue/MessageChannel.cpp:1134)
   by 0x5B4390D: mozilla::ipc::MessageChannel::OnMaybeDequeueOne() (ipc/glue/MessageChannel.cpp:1106)
   by 0x5B1EBB4: RunTask (ipc/chromium/src/base/message_loop.cc:358)
   by 0x5B1EBB4: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (ipc/chromium/src/base/message_loop.cc:366)
   by 0x5B1EE0A: DoWork (ipc/chromium/src/base/message_loop.cc:444)
   by 0x5B1EE0A: MessageLoop::DoWork() (ipc/chromium/src/base/message_loop.cc:423)
   by 0x5B18789: base::MessagePumpDefault::Run(base::MessagePump::Delegate*) (ipc/chromium/src/base/message_pump_default.cc:34)
   by 0x5B18A11: RunInternal (ipc/chromium/src/base/message_loop.cc:230)
   by 0x5B18A11: RunHandler (ipc/chromium/src/base/message_loop.cc:223)
   by 0x5B18A11: MessageLoop::Run() (ipc/chromium/src/base/message_loop.cc:197)
   by 0x5B232A6: base::Thread::ThreadMain() (ipc/chromium/src/base/thread.cc:170)
   by 0x5B07F29: ThreadFunc(void*) (ipc/chromium/src/base/platform_thread_posix.cc:39)
Jeff, if this is real, maybe the timing changed since bug 1064479 so this is now uninitialized?
Julian, there is a constructor for TargetConfig, calls the Init() method, which is empty :)
Flags: needinfo?(jmuizelaar)

Updated

3 years ago
tracking-e10s: --- → ?
tracking-e10s: ? → -
Julian is it possible to get the value of mIsFirstPaint when this happens? mTargetConfig should unitialized before mIsFirstPaint is set and should be initialized after that.
Flags: needinfo?(jmuizelaar)
(Reporter)

Comment 4

3 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)

It looks like mTargetConfig is getting looked at when mFirstPaint == false.
I rewrote RequiresReorientation thusly

  bool RequiresReorientation(mozilla::dom::ScreenOrientation aOrientation)
  {
    //return mTargetConfig.orientation() != aOrientation;
#   define COMPILER_BARRIER __asm__ __volatile__("":::"cc","memory");
    mozilla::dom::ScreenOrientation tmp = mTargetConfig.orientation();
    if (mIsFirstPaint) {
      fprintf(stderr, "mIsFirstPaint = true\n");
      COMPILER_BARRIER;
      VALGRIND_CHECK_VALUE_IS_DEFINED(tmp);
      COMPILER_BARRIER;
    } else {
      fprintf(stderr, "mIsFirstPaint = false\n");
      COMPILER_BARRIER;
      VALGRIND_CHECK_VALUE_IS_DEFINED(tmp); <==== AsyncCompositionManager.h:125
      COMPILER_BARRIER;
    }
    return tmp != aOrientation;
#   undef COMPILER_BARRIER
  }

and got this

mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
mIsFirstPaint = false
Thread 23 Compositor:
Uninitialised byte(s) found during client check request
   at 0x61EDC8D: mozilla::layers::AsyncCompositionManager::RequiresReorientation(unsigned int) (ff-O2-linux64/gfx/layers/../../dist/include/mozilla/layers/AsyncCompositionManager.h:125)
   by 0x61DC1D7: mozilla::layers::CompositorParent::ScheduleRotationOnCompositorThread(mozilla::layers::TargetConfig const&, bool) (gfx/layers/ipc/CompositorParent.cpp:937)
   by 0x61E976C: mozilla::layers::CompositorParent::ShadowLayersUpdated(mozilla::layers::LayerTransactionParent*, unsigned long const&, mozilla::layers::TargetConfig const&, bool, bool, unsigned int, bool) (gfx/layers/ipc/CompositorParent.cpp:955)
   by 0x61E7F16: mozilla::layers::LayerTransactionParent::RecvUpdate(nsTArray<mozilla::layers::Edit> const&, unsigned long const&, mozilla::layers::TargetConfig const&, bool const&, bool const&, unsigned int const&, bool const&, mozilla::TimeStamp const&, nsTArray<mozilla::layers::EditReply>*) (gfx/layers/ipc/LayerTransactionParent.cpp:569)
   by 0x5D1C3EA: mozilla::layers::PLayerTransactionParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (ff-O2-linux64/ipc/ipdl/PLayerTransactionParent.cpp:708)
   by 0x5C0CB08: mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (ff-O2-linux64/ipc/ipdl/PCompositorParent.cpp:607)
   by 0x5B38BFE: mozilla::ipc::MessageChannel::DispatchSyncMessage(IPC::Message const&) (ipc/glue/MessageChannel.cpp:1082)
   by 0x5B3DDA7: mozilla::ipc::MessageChannel::OnMaybeDequeueOne() (ipc/glue/MessageChannel.cpp:1037)
   by 0x5B194A4: RunTask (ipc/chromium/src/base/message_loop.cc:361)
   by 0x5B194A4: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (ipc/chromium/src/base/message_loop.cc:369)
   by 0x5B196FA: DoWork (ipc/chromium/src/base/message_loop.cc:447)
   by 0x5B196FA: MessageLoop::DoWork() (ipc/chromium/src/base/message_loop.cc:426)
   by 0x5B13079: base::MessagePumpDefault::Run(base::MessagePump::Delegate*) (ipc/chromium/src/base/message_pump_default.cc:34)
   by 0x5B13301: RunInternal (ipc/chromium/src/base/message_loop.cc:233)
   by 0x5B13301: RunHandler (ipc/chromium/src/base/message_loop.cc:226)
   by 0x5B13301: MessageLoop::Run() (ipc/chromium/src/base/message_loop.cc:200)
 Address 0x2160060c is on thread 23's stack
 in frame #0, created by mozilla::layers::AsyncCompositionManager::RequiresReorientation(unsigned int) (AsyncCompositionManager.h:112)
(Reporter)

Comment 5

3 years ago
Created attachment 8548138 [details] [diff] [review]
Surely not an acceptable solution

Jeff, do you know how to fix this properly, by doing correct
sequencing, or whatever?  I tried to understand the sequencing logic a
bit, but got lost.  I blunt-instrumented it using the attached patch,
but it strikes me as unlikely to be an acceptable solution :)
Attachment #8548138 - Flags: feedback?(jmuizelaar)
(Reporter)

Comment 6

3 years ago
Jeff, ping?
Flags: needinfo?(jmuizelaar)
(In reply to Julian Seward [:jseward] from comment #6)
> Jeff, ping?

Sorry, I had some sickness the last while and haven't been able to get to this. Maybe today.
Flags: needinfo?(jmuizelaar)
Created attachment 8558537 [details] [diff] [review]
A different approach

I'm actually not sure what the correct solution is here. The conditions under which mTargetConfig is initialized are not clear to me. We also use mTargetConfig in lots of places and it's not clear if it has been initialized before use in all of those cases. I think it's best that someone more knowledgeable solve this.
Assignee: nobody → bugmail.mozilla
Comment on attachment 8548138 [details] [diff] [review]
Surely not an acceptable solution

This is probably not correct
Attachment #8548138 - Flags: feedback?(jmuizelaar)
Flags: needinfo?(bugmail.mozilla)
(Reporter)

Updated

3 years ago
Attachment #8548138 - Attachment is obsolete: true
(Reporter)

Comment 10

3 years ago
Comment on attachment 8558537 [details] [diff] [review]
A different approach

This does stop Valgrind complaining, yes.  f+ from me therefore.
Attachment #8558537 - Flags: feedback+
I know hardly anything about this TargetConfig, but after poking around in the code some, I agree with Jeff that the isFirstPaint flag is relevant, although not in the way he said (IIUC).

What should be happening is that when the first layers transaction happens, the aIsFirstPaint flag should be set to true in the call to ShadowLayersUpdated at [1]. This aIsFirstPaint=true should get passed into ScheduleRotationOnCompositorThread at the top of the function, which will prevent the RequiresReorientation call at [2] from ever happening due to short-circuit evaluation. After ScheduleRotationOnCompositorThread completes, the call at [3] should initialize mTargetConfig in AsyncCompositionManager. Subsequent transactions should then have no problem.

So given the error being seen here I assume that the first layers transaction happens with aIsFirstPaint set to false, which I think is wrong. Tracing backwards through the code to the content side of things, the is-first-paint flag (with e10s on Linux) I think should be originating at [4], which will only get run if APZ is enabled. My knowledge of the startup code is pretty hazy though, so I could be wrong here.

Still, I can provide a patch that attempts to correct this and we can see if it works.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp?rev=74ec30a5de72#1059
[2] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp?rev=74ec30a5de72#1043
[3] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp?rev=74ec30a5de72#1072
[4] http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabChild.cpp?rev=31fde84fdb64#942
Flags: needinfo?(bugmail.mozilla)
Created attachment 8559964 [details] [diff] [review]
A different different approach

Does this work?
Attachment #8559964 - Flags: feedback?(jseward)
(Reporter)

Comment 13

3 years ago
Created attachment 8560026 [details]
Memcheck still complains

Kats, thank you for chasing this around.  I tried that but unfortunately
it doesn't work -- I still get an uninitialised value error here

void
CompositorParent::ScheduleRotationOnCompositorThread(const TargetConfig& aTargetConfig,
                                                     bool aIsFirstPaint)
{
  MOZ_ASSERT(IsInCompositorThread());

  if (!aIsFirstPaint &&
      !mCompositionManager->IsFirstPaint() &&  <--------------------HERE

Memcheck complaint is in the attachment, but I suspect the same as
before.  What further information can I get you to help this?
Is it easy to set this up so I can reproduce locally?
(Reporter)

Comment 15

3 years ago
Yes, very, if you have a linux64 box to hand.  linux32 should also
work but I haven't tried it.

Clone m-c, build with this mozconfig

== BEGIN mozconfig ===================================
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/ff-O-linux64
ac_add_options --enable-tests
ac_add_options --enable-optimize="-g -O"
ac_add_options --enable-debug-symbols
ac_add_options --enable-valgrind
ac_add_options --enable-profiling
ac_add_options --enable-elf-hack
ac_add_options --disable-crashreporter
ac_add_options --disable-jemalloc
mk_add_options MOZ_MAKE_FLAGS="-j4"
== END mozconfig =====================================

Run like this:

G_SLICE=always-malloc /usr/bin/valgrind --fair-sched=yes --smc-check=all-non-file \
  --error-limit=no --trace-children=yes --child-silent-after-fork=yes \
  --trace-children-skip=/usr/bin/hg,/bin/rm,*/bin/certutil,*/bin/pk12util,*/bin/ssltunnel,*/bin/uname,*/bin/which,*/bin/ps,*/bin/grep,*/bin/java \
  --num-transtab-sectors=24 --tool=memcheck --freelist-vol=500000000 \
  --redzone-size=256 --gen-suppressions=no \
  --vex-iropt-register-updates=allregs-at-mem-access \
  --partial-loads-ok=yes --show-mismatched-frees=no \
  --read-inline-info=yes --num-callers=24 \
  ./ff-O-linux64/dist/bin/firefox-bin -P dev -no-remote 2>&1 | tee logfile-01-mc

When Fx starts you get the standard "default new" tab, with 4 or
9 or however many tiles in it.  If you move the mouse pointer into
a new tile, so that the hazy blue border gets drawn around the
blue tile, then V complains at that point (when the blue border
moves).
(Reporter)

Comment 16

3 years ago
Comment on attachment 8559964 [details] [diff] [review]
A different different approach

Review of attachment 8559964 [details] [diff] [review]:
-----------------------------------------------------------------

Doesn't help, alas.
Attachment #8559964 - Flags: feedback?(jseward) → feedback-
Created attachment 8560579 [details]
Stack of tooltip widget creation

Ok, I reproduced this locally and dug into it (woo being able to use rr!). What seems to be happening is as follows:
- The browser starts, creates a nsWindow, nsView, nsViewManager, PresShell, AsyncCompositionManager and stuff, and everything paints just fine.
- When the user mouseovers the tiles on the new tab, it creates a tooltip popup. This popup (an nsMenuPopupFrame) has its own nsView that is separate from the main view, and also creates its own widget, an nsChildWindow (stack attached).
- Now we have two views, each with a separate widget, and they both share a single nsViewManager.
- At some point after that, the code at [1] is run, and it invokes ProcessPendingUpdatesPaint using the new widget. Since the nsViewManager is the same as before, the PresShell is the same as before too, but the nsView* parameter passed in to PresShell::Paint is different.
- The code at [2] tries to get the layer manager, but there is no layer manager on this second widget, so one is created. Creating a layer manager also creates a new compositor and AsyncCompositionManager! (see second stack in attached file).
- The paint transaction then goes to this second layer manager and second AsyncCompositionManager, which doesn't have an initialized mTargetConfig. Also the isFirstPaint flag is not set on the transaction because the PresShell already cleared the flag on an earlier transaction (which went to a different layer manager / AsyncCompositionManager).

To me it seems wrong that the PresShell is initially painted with one view/widget and then later switches to painting with a second view/widget. I'm not sure if that's expected behaviour or not. Even if it is, it seems wrong to then be creating a second AsyncCompositionManager/Compositor stuff for a tooltip popup. Maybe we should just skip doing that if mWindowType == eWindowType_popup. I'm not knowledgeable about these bits of code though so ni? to Markus who might know this better. If creating a new AsyncCompositionManager for this tooltip popup is correct then we can either force the first-paint flag to be true on the compositor side for the first transaction received, or just initialize mTargetConfig better.

[1] http://mxr.mozilla.org/mozilla-central/source/view/nsViewManager.cpp?rev=5c772b35c40d#380
[2] http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp?rev=5c772b35c40d#6253
Flags: needinfo?(mstange)
Created attachment 8560625 [details] [diff] [review]
Patch

After thinking about this some more I suspect all this business with creating a new widget and compositor is actually expected behaviour, inefficient though it may be. Therefore I think the first transaction being received into a compositor should always be treated as a first-paint; this patch does that. Will flag for review once Markus confirms.
Attachment #8559964 - Attachment is obsolete: true
Comment on attachment 8558537 [details] [diff] [review]
A different approach

And the reason I think Jeff's patch is wrong is because mLayersUpdated gets set back to false during the composite process, so any call to RequiresReorientation after that will probably return false when it shouldn't.
Attachment #8558537 - Flags: review-
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> After thinking about this some more I suspect all this business with
> creating a new widget and compositor is actually expected behaviour,

Yes, this is expected. A tooltip needs its own platform window, because it can extend beyond the bounds of its parent window. Each platform window gets a widget, a layer manager and a compositor. And it looks like there's one AsyncCompositionManager per LayerManager, so your fix looks correct to me.
Flags: needinfo?(mstange)
Comment on attachment 8560625 [details] [diff] [review]
Patch

Review of attachment 8560625 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, thanks. Mind reviewing then?
Attachment #8560625 - Flags: review?(mstange)
Attachment #8560625 - Flags: review?(mstange) → review+
https://hg.mozilla.org/integration/b2g-inbound/rev/f9f464a55ac3
https://hg.mozilla.org/mozilla-central/rev/f9f464a55ac3
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox38: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
(Reporter)

Comment 24

3 years ago
Yes, looks Valgrind-clean now.  Kats, thanks!
You need to log in before you can comment on or make changes to this bug.