Closed Bug 1569285 Opened 6 years ago Closed 5 years ago

Crash in [@ mozilla::ExtensionPolicyService::ExtensionPolicyService]

Categories

(Core :: Gecko Profiler, defect, P3)

69 Branch
Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
88 Branch
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox-esr78 --- wontfix
firefox68 --- unaffected
firefox69 --- wontfix
firefox70 --- wontfix
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- fixed

People

(Reporter: zxspectrum3579, Assigned: mozbugz)

References

Details

(Keywords: crash, regression)

Crash Data

This bug is for crash report bp-840eb0d7-5775-4499-a56a-dedf40190726.

Top 10 frames of crashing thread:

0 xul.dll void mozilla::ExtensionPolicyService::ExtensionPolicyService toolkit/components/extensions/ExtensionPolicyService.cpp:95
1 xul.dll mozilla::ExtensionPolicyService::GetSingleton toolkit/components/extensions/ExtensionPolicyService.cpp:86
2 xul.dll static void locked_profiler_stream_json_for_this_process tools/profiler/core/platform.cpp:2117
3 xul.dll static void locked_profiler_save_profile_to_file tools/profiler/core/platform.cpp:3281
4 xul.dll profiler_shutdown tools/profiler/core/platform.cpp:3076
5 xul.dll int XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:4845
6 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:4852
7 firefox.exe static int NS_internal_main browser/app/nsBrowserApp.cpp:295
8 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:131
9 firefox.exe static int __scrt_common_main_seh f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:288

As it can be seen from the statistics graph this error usually happens very sporadically and at times not even in FireFox but in Thunderbird, but for a few days now some change in FireFox 69 beta has made every link I open from Thunderbird is causing a crash.

Looks as if the crashes started in 69.0b6. All of them have MOZ_RELEASE_ASSERT(mObs), which I think maybe was added in Bug 1368102? Adding a ni on kmag.

Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(kmaglione+bmo)
Keywords: crash, regression

This is happening because the profiler is trying to access the policy service too late in shutdown and causing it to try to reinstantiate itself. This is the sort of case where we're supposed to crash.

Flags: needinfo?(kmaglione+bmo)

Thanks, Kristoff, but can you please advice me how to stop this? (It is unbearable.)

Also, how can we understand what has changed in the code to cause this?

And, indeed, shutdown does not work properly since e.g. YouTube videos have to remember the timestamp in the video playback but FF does recover it, I always have to try to remember it myself.

Thanks in advance.

Flags: needinfo?(kmaglione+bmo)

I believe the profiler just needs to make sure it doesn't do this sort of thing this late in shutdown.

Component: General → Gecko Profiler
Flags: needinfo?(kmaglione+bmo)
Product: Toolkit → Core

(In reply to User Dderss from comment #4)

can you please advice me how to stop this? (It is unbearable.)

This looks like it's happening when the Profiler tries to write a shutdown profile, which should only happen if you have environment variables MOZ_PROFILER_STARTUP=1 and MOZ_PROFILER_SHUTDOWN=<file>.
If that's is not what you're trying to do, then you need to delete these.
In case you're not familiar with this, see https://www.digitalcitizen.life/how-edit-or-delete-environment-variables-windows-7-windows-8

(In reply to Kris Maglione [:kmag] from comment #5)

I believe the profiler just needs to make sure it doesn't do this sort of thing this late in shutdown.

There's currently an RAII object in XRE_main:
https://searchfox.org/mozilla-central/rev/1eb05019f47069172ba81a6c108a584a409a24ea/toolkit/xre/nsAppRunner.cpp#4661
When it is destroyed at the end of the function, it shuts down the profiler and optionally writes a profile, which tries to access lots of things.
I see that ExtensionPolicyService is destroyed by ClearOnShutdown, I could probably work out a way to deal with this...

Flags: needinfo?(zxspectrum3579)

(In reply to Gerald Squelart [:gerald] from comment #6)

(In reply to Kris Maglione [:kmag] from comment #5)

I believe the profiler just needs to make sure it doesn't do this sort of thing this late in shutdown.

There's currently an RAII object in XRE_main:
https://searchfox.org/mozilla-central/rev/1eb05019f47069172ba81a6c108a584a409a24ea/toolkit/xre/nsAppRunner.cpp#4661
When it is destroyed at the end of the function, it shuts down the profiler and optionally writes a profile, which tries to access lots of things.
I see that ExtensionPolicyService is destroyed by ClearOnShutdown, I could probably work out a way to deal with this...

The problem is that by that point, a lot of things have been destroyed and shut down, which means it really isn't a safe time for the profiler to try to collect metadata. The information that it collects is likely to be inconsistent, and there's a risk of it accidentally reinitializing code that's already been shut down. Not to mention the fact that none of its do_GetService calls will work.

We should really probably collect that data earlier in shutdown if we need to dump the profile data that late.

(In reply to Kris Maglione [:kmag] from comment #7)

We should really probably collect that data earlier in shutdown if we need to dump the profile data that late.

Thank you. That's what I was actually thinking about!
I had to do that for a previous bug about thread names; I'm guessing there should really be some kind of similar pre-shutdown data collection that the late profiler shutdown can use.

(My ClearOnShutdown remark was more a note-to-self for a potential quick fix, by not accessing this data when it's too late.)

(In reply to Gerald Squelart [:gerald] from comment #6)

This looks like it's happening when the Profiler tries to write a shutdown profile, which should only happen if you have environment variables MOZ_PROFILER_STARTUP=1 and MOZ_PROFILER_SHUTDOWN=<file>.
If that's is not what you're trying to do, then you need to delete these.
In case you're not familiar with this, see https://www.digitalcitizen.life/how-edit-or-delete-environment-variables-windows-7-windows-8

I have MOZ_PROFILER_STARTUP=0 set (just in case I checked both local/user and global/system variables), so why the core is ignoring the setting?

Can you please check the code because if the crashes happen because of this variable presence rather than its value being equal to 1, then it does not work correctly.

Thank you in advance, Gerald.

Flags: needinfo?(zxspectrum3579) → needinfo?(gsquelart)

(In reply to User Dderss from comment #9)

I have MOZ_PROFILER_STARTUP=0 set (just in case I checked both local/user and global/system variables), so why the core is ignoring the setting?

Can you please check the code because if the crashes happen because of this variable presence rather than its value being equal to 1, then it does not work correctly.

As per the MOZ_PROFILER_HELP documentation:

MOZ_PROFILER_STARTUP
If set to any value, starts the profiler immediately on start-up.
Useful if you want profile code that runs very early.

So you need to make sure this variable is not set at all, not even "0". Please let us know if that helps.

Flags: needinfo?(gsquelart)

Thanks, Gerard, it did help.

But the decision to ignore the value of the setting is inconvenient because if you do not use this feature very frequently you will forget the exact spelling of the variable what would make you wasting time while searching for it instead of just seeing it in the list of variables and changing the value from 0 to 1 when you need the profiler to run again.

Do you remember what the rationale was for that? Can we ask for this mechanics to change for the convenience sake?

Good to know it helps, great!

I think it's just easy to check in code if a variable is set or not, for binary choices; glancing at Firefox code it's used everywhere!
But it should be easy enough to change it to interpret "0" as OFF; please open a separate bug for that. (I want to keep this bug here open for the crash, which needs fixing anyway.)

Priority: -- → P3

Since this is already triaged and set to P3 priority I'm setting it as fix-optional.
That will remove the bug from weekly regression triage.
It also looks like the issue may have been resolved since no crashes are showing up after 69 beta 8.

It was not resolved, it is just I switched off the profiler altogether hence there are no more crashes for now. So please fix it in a way suggested earlier in the comments.

Thanks in advance, Elizabeth.

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME

Actually, I believe I've just fixed that in bug 1693502, so I'll re-label this one FIXED.
That other bug should probably have been marked duplicate of this one here (if I had not forgotten about it), but when it was filed it had useful details that allowed me to find the cause quickly.

Depends on: 1693502
Resolution: WORKSFORME → FIXED
Assignee: nobody → gsquelart
Target Milestone: --- → 88 Branch
You need to log in before you can comment on or make changes to this bug.