Bug 1764544 Comment 28 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Hi all,

I've tried quite a few things to see if I can make some progress here, but so far no luck:

1. Introducing sleeps in a few places to see if I can change the timing of things: this didn't reproduce the crash, so it doesn't look like a race condition
2. Getting a stack trace the first time the Main Thread attempts to launch a content process -- Main thread doesn't even attempt to queue up a content process launch before `EnsureWin32kInitialized()` is called on my machine
3. I attempted to implement Nika's advice above and have `PrepareLaunch()` call `EnsureWin32kInitialized()`, which itself needs to call `gfxPlatform::GetPlatform()` before attempting to read WebRender vars.
    
    Nika's comment above is correct: `gfx::gfxVars::Initialize()` just creates the vars with their default initial values, and `gfxPlatform::Init()` is where the values actually get populated (example for `UseWebRender()` [here](https://searchfox.org/mozilla-central/rev/4719c903af605c2caeda3bb1ce4edb9ca58bf68e/gfx/thebes/gfxPlatform.cpp#2615)).

    So here is the patch I used:
    ```diff
   diff --git a/ipc/glue/GeckoChildProcessHost.cpp b/ipc/glue/GeckoChildProcessHost.cpp
   --- a/ipc/glue/GeckoChildProcessHost.cpp
   +++ b/ipc/glue/GeckoChildProcessHost.cpp
   @@ -594,6 +594,7 @@ void GeckoChildProcessHost::PrepareLaunc
    #  if defined(MOZ_SANDBOX)
      // We need to get the pref here as the process is launched off main thread.
      if (mProcessType == GeckoProcessType_Content) {
   +    GetWin32kLockdownState();
        mSandboxLevel = GetEffectiveContentSandboxLevel();
        mEnableSandboxLogging =
            Preferences::GetBool("security.sandbox.logging.enabled");
   diff --git a/toolkit/xre/nsAppRunner.cpp b/toolkit/xre/nsAppRunner.cpp
   --- a/toolkit/xre/nsAppRunner.cpp
   +++ b/toolkit/xre/nsAppRunner.cpp
   @@ -698,7 +698,7 @@ nsIXULRuntime::ContentWin32kLockdownStat
      // HasUserValue The Pref functions can only be called on main thread
      MOZ_ASSERT(NS_IsMainThread());
      mozilla::EnsureWin32kInitialized();
   -  gfx::gfxVars::Initialize();
   +  gfxPlatform::GetPlatform();

      if (gSafeMode) {
        return nsIXULRuntime::ContentWin32kLockdownState::DisabledBySafeMode;
   @@ -791,7 +791,6 @@ void EnsureWin32kInitialized() {
      gWin32kInitialized = true;

    #ifdef XP_WIN
   -
      // Initialize the Win32k experiment, configuring win32k's
      // default, before checking other overrides. This allows opting-out of a
      // Win32k experiment through about:preferences or about:config from a
    ```
    
    Unfortunately, judging from [this build](https://treeherder.mozilla.org/jobs?repo=try&revision=a49b4d78f944838905037011c4eed0e2608ccdb5&selectedTaskRun=FbAA0nNhSsKHVXf2v7xLRQ.0), this didn't work either. There are tons of GFX-related failures, so I'm guessing that this initializes `gfxPlatform` too early and it ends up with some weird values for things.
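
To illustrate the difference between the two initialization steps described in item 3, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in (the `sketch` namespace, `Vars`, `InitializeVars()`, `InitPlatform()`, `EnsurePlatform()`); it is not the real Gecko code, and the "populated" value is simply assumed to be `true`. It only models the ordering issue: a var read after the vars are created, but before the platform has populated them, returns the default.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the real Gecko classes.
namespace sketch {

struct Vars {
  // Default value that exists before the platform has populated anything.
  bool useWebRender = false;
};

static Vars* gVars = nullptr;
static bool gPlatformInitialized = false;

// Models the role of gfxVars::Initialize(): creates the vars with defaults.
void InitializeVars() {
  if (!gVars) {
    gVars = new Vars();
  }
}

// Models the role of gfxPlatform::Init(): populates the real values.
void InitPlatform() {
  InitializeVars();
  gVars->useWebRender = true;  // assumed outcome, for illustration only
  gPlatformInitialized = true;
}

// Models the role of gfxPlatform::GetPlatform(): runs the init on first use.
void EnsurePlatform() {
  if (!gPlatformInitialized) {
    InitPlatform();
  }
}

// Reads the var; only meaningful once the platform has populated it.
bool UseWebRender() {
  assert(gVars && "vars must be created before they are read");
  return gVars->useWebRender;
}

}  // namespace sketch
```

In this sketch, calling `sketch::InitializeVars()` and then `sketch::UseWebRender()` yields the default `false`, while calling `sketch::EnsurePlatform()` first yields the populated value. That is why the patch swaps the vars-only initialization for a full platform initialization before reading the var, although, as the try build suggests, forcing that initialization early can have side effects of its own.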

Note that this crash also seems to be happening on the FF 100 release ([Bug 1769807](https://bugzilla.mozilla.org/show_bug.cgi?id=1769807)), so it seems that it's not unique to Pine. Since I'm fairly certain that FF release is also built with PGO, that part of it seems to line up.

I'm going to try the same patch in Central to see if that works. I don't know if there is something different about Pine compared to Central WRT these tests.
