Bug 1909898 Comment 13 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Mike Hommey [:glandium] from comment #12)
> That said, if you moved the function to the Bootstrap class instead, like the other XRE_* functions defined in Bootstrap.cpp/h, then you wouldn't have to care about exporting it.

My understanding is that this diagnostic code was specifically intended to be run _before_ `XRE_GetBootstrap`, due to the implicit TLS use in the `static bool sBootstrapInitialized = false;` line.

Also, now that I've done a bit of digging, I don't think this _would_ fix it. See below.

----

(In reply to Mike Hommey [:glandium] from comment #10)
> We're doing that on nightly too.

I no longer believe we're doing this at all. I've checked a copy of `xul.dll` from a known-affected build (129.0b6), and `dumpbin.exe` clearly shows `XRE_CheckBlockScopeStaticVarInit` as exported:

```
File Type: DLL

  Section contains the following exports for xul.dll

          [...]
          23 number of functions
          23 number of names

    ordinal hint RVA      name

          1    0 085FA000 ?WerNotifyProc@CrashReporter@@YAKPEAX@Z
          2    1 02EF1140 DumpCompleteHeap
          [...]
         11    A 00030F60 XRE_CheckBlockScopeStaticVarInit
         12    B 00030FD0 XRE_GetBootstrap
          [...]
```
... which sinks my theory from comment 4.

(In reply to Mike Hommey [:glandium] from comment #11)
> The code is behind MOZ_DIAGNOSTIC_ASSERT_ENABLED, why aren't debug builds affected? (MOZ_DIAGNOSTIC_ASSERT_ENABLED is early-beta or central-debug)

Yeah. And the relatively low volume of this crash (even though it _is_ a topcrash!) seems inconsistent with it affecting _all_ early-beta builds.

What I now think is going on in this crash is that `firefox.exe` v129.0b6 is trying to load `xul.dll` v129.0b7, and is failing because the load procedure for the two is no longer identical: the former is looking for a function the latter doesn't have. This is consistent with concerns raised in bug 1816848 about incomplete partial installation. Unlike my previous hypothesis, this would _actually_ explain the crash volume profile described in comment 0 [_emphasis added_]:

> Observed for Firefox early betas (b1 - b6) for version 128 and 129 **with a noticeable spike to the end of the early beta cycle (b5 and b6)** - are there clients which don't update to late beta builds?

The only clients that would be crashing here would be clients upgrading from early-beta to late-beta, which tracks with the above.

I don't know if we have or want any checks comparing the version number of `xul.dll` to `firefox.exe`, but I suspect the answers are "no" and "yes" respectively. More immediately, though, I'm withdrawing any consideration of mucking about with `moz.build` and `SYMBOLS_FILE`, because it won't have any effect.
(In reply to Mike Hommey [:glandium] from comment #12)
> That said, if you moved the function to the Bootstrap class instead, like the other XRE_* functions defined in Bootstrap.cpp/h, then you wouldn't have to care about exporting it.

My understanding is that this diagnostic code was specifically intended to be run _before_ `XRE_GetBootstrap`, due to the implicit TLS use in the `static bool sBootstrapInitialized = false;` line.

Also, now that I've done a bit of digging, I don't think this _would_ fix it. See below.

----

(In reply to Mike Hommey [:glandium] from comment #10)
> We're doing that on nightly too.

I no longer believe we're doing this at all. I've checked a copy of `xul.dll` from a known-affected build (129.0b6), and `dumpbin.exe` clearly shows `XRE_CheckBlockScopeStaticVarInit` as exported:

```
File Type: DLL

  Section contains the following exports for xul.dll

          [...]
          23 number of functions
          23 number of names

    ordinal hint RVA      name

          1    0 085FA000 ?WerNotifyProc@CrashReporter@@YAKPEAX@Z
          2    1 02EF1140 DumpCompleteHeap
          [...]
         11    A 00030F60 XRE_CheckBlockScopeStaticVarInit
         12    B 00030FD0 XRE_GetBootstrap
          [...]
```
... which sinks my theory from comment 4.

(In reply to Mike Hommey [:glandium] from comment #11)
> The code is behind MOZ_DIAGNOSTIC_ASSERT_ENABLED, why aren't debug builds affected? (MOZ_DIAGNOSTIC_ASSERT_ENABLED is early-beta or central-debug)

Yeah. And the relatively low volume of this crash (even though it _is_ a topcrash!) seems inconsistent with it affecting _all_ early-beta builds.

What I now think is going on in this crash is that `firefox.exe` v129.0b6 is trying to load `xul.dll` v129.0b7, and is failing because the load procedure for the two is no longer identical: the former is looking for a function the latter doesn't have. This is consistent with concerns raised in bug 1816848 about incomplete partial installation. Unlike my previous hypothesis, this would _actually_ explain the crash volume profile described in comment 0 (emphasis added):

> Observed for Firefox early betas (b1 - b6) for version 128 and 129 **with a noticeable spike to the end of the early beta cycle (b5 and b6)** - are there clients which don't update to late beta builds?

The only clients that would be crashing here would be clients upgrading from early-beta to late-beta, which tracks with the above.

I don't know if we have or want any checks comparing the version number of `xul.dll` to `firefox.exe`, but I suspect the answers are "no" and "yes" respectively. More immediately, though, I'm withdrawing any consideration of mucking about with `moz.build` and `SYMBOLS_FILE`, because it won't have any effect.

Back to Bug 1909898 Comment 13