Closed Bug 1583770 Opened 2 years ago Closed 2 years ago

Crash in [@ nsNTLMAuthModule::GetNextToken]

Categories

(Core :: Preferences: Backend, defect)

70 Branch
Unspecified
Windows 10
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: wsmwk, Assigned: jorgk-bmo)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

First appears in 70.0a1 build id 20190818074802 bp-eb64a4f7-b9e6-4810-9569-192690190819. Assuming regression.

All crashes are linux https://crash-stats.mozilla.com/signature/?product=Thunderbird&signature=nsNTLMAuthModule%3A%3AGetNextToken&date=%3E%3D2019-03-25T09%3A58%3A00.000Z&date=%3C2019-09-25T09%3A58%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1#reports
No other crash signatures containing nsNTLMAuthModule https://crash-stats.mozilla.com/search/?signature=~nsNTLMAuthModule&date=%3E%3D2019-03-25T10%3A01%3A00.000Z&date=%3C2019-09-25T10%3A01%3A00.000Z&_facets=signature&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

Top 10 frames of crashing thread:

0 xul.dll nsNTLMAuthModule::GetNextToken security/manager/ssl/nsNTLMAuthModule.cpp:978
1 xul.dll nsMsgProtocol::DoNtlmStep2 comm/mailnews/base/util/nsMsgProtocol.cpp:930
2 xul.dll nsImapProtocol::AuthLogin comm/mailnews/imap/src/nsImapProtocol.cpp:5702
3 xul.dll nsImapProtocol::TryToLogon comm/mailnews/imap/src/nsImapProtocol.cpp:8192
4 xul.dll nsImapProtocol::ProcessCurrentURL comm/mailnews/imap/src/nsImapProtocol.cpp:1739
5 xul.dll nsImapProtocol::ImapThreadMainLoop comm/mailnews/imap/src/nsImapProtocol.cpp:1420
6 xul.dll nsImapProtocol::Run comm/mailnews/imap/src/nsImapProtocol.cpp:1103
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
8 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
9 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:303

No crashes (so far) since build ID 20190923003205 bp-ecbf9245-69e6-4347-bf4d-bfedc0190924 2019-09-24 Thunderbird 71.0a1.
0 libxul.so nsNTLMAuthModule::GetNextToken(void const*, unsigned int, void**, unsigned int*) security/manager/ssl/nsNTLMAuthModule.cpp:959
1 libxul.so nsMsgProtocol::DoNtlmStep2(nsTString<char>&, nsTString<char>&) comm/mailnews/base/util/nsMsgProtocol.cpp:931
2 libxul.so nsImapProtocol::AuthLogin(char const*, nsTString<char16_t> const&, unsigned long) comm/mailnews/imap/src/nsImapProtocol.cpp:5702
3 libxul.so nsImapProtocol::TryToLogon() comm/mailnews/imap/src/nsImapProtocol.cpp:8192
4 libxul.so nsImapProtocol::ProcessCurrentURL() comm/mailnews/imap/src/nsImapProtocol.cpp:1739
5 libxul.so nsImapProtocol::ImapThreadMainLoop() comm/mailnews/imap/src/nsImapProtocol.cpp:1420
6 libxul.so nsImapProtocol::Run() comm/mailnews/imap/src/nsImapProtocol.cpp:1103
7 libxul.so non-virtual thunk to nsImapProtocol::Run() comm/mailnews/imap/src/nsImapProtocol.cpp:0
8 libxul.so nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1225

But rare so far, so cannot determine if we are out of the woods yet. What do you think, is this legit ntlm issue, or just another imap threading issue?

Flags: needinfo?(jorgk)

We should all get into the reason to check the MOZ_CRASH Reason (Sanitized) first without looking at anything else. Here we have:

MOZ_DIAGNOSTIC_ASSERT(IsAtomic<bool>::value || NS_IsMainThread()) (Non-atomic static pref 'network.auth.force-generic-ntlm-v1' being accessed on background thread by getter)

MOZ_DIAGNOSTIC_ASSERT() is something that only causes a premeditated crash on Nightly/Daily builds and debug builds. You will never see them in beta or release.

So clearly some IMAP threading issue has gone wrong here: So somehow, the IMAP thread calls into the Msg protocol which in turn makes the NTLMAuth call. Most likely that call was changed to now assert when not called from the main thread. There was some movement in this code in Aug 2019 here:
https://hg.mozilla.org/mozilla-central/rev/89b4edfa8150#l3.33
but it looks more like a reshuffle, so the diagnostic assert was there before.

The proper fix is to proxy the call to the main thread, right, Nicholas? What's the background to the diagnostic assert?

Flags: needinfo?(jorgk) → needinfo?(n.nethercote)

The proper fix is easier than that. Just change the type of this static pref from bool to RelaxedAtomicBool. That will let it be accessed off the main thread.

Flags: needinfo?(n.nethercote)

Thanks, Nicholas, so we just do that in M-C code, right?

Assignee: nobody → jorgk
Attachment #9100323 - Flags: review?(n.nethercote)
Comment on attachment 9100323 [details] [diff] [review]
1583770-network-pref.patch

Review of attachment 9100323 [details] [diff] [review]:
-----------------------------------------------------------------

Yes!
Attachment #9100323 - Flags: review?(n.nethercote) → review+

Dear sheriff, please import with
hg qimport bz:1583770
or
hg qimport https://bugzilla.mozilla.org/attachment.cgi?id=9100323 -n 1583770-network-pref.patch

See bug 1576146 for reasons of not using Lando.

Component: Security → Preferences: Backend
Product: Thunderbird → Core
Version: 70 → 70 Branch

Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/94ab8abd4d3a
Allow access to pref network.auth.force-generic-ntlm-v1 off the main tread. r=njn

Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
You need to log in before you can comment on or make changes to this bug.