No pages are rendered / Shutdownhang crashes with aetpkss1.dll PKCS#11 module in Firefox 55

RESOLVED FIXED

Status


Core
Security: PSM
--
critical
RESOLVED FIXED
15 days ago
7 days ago

People

(Reporter: philipp, Unassigned, NeedInfo)

Tracking

({crash, regression})

55 Branch
All
Windows
crash, regression
Points:
---
Bug Flags:
qe-verify +

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 blocking, fixed, firefox56 fixed, firefox57 fixed)

Details

(crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-34b15b8d-6111-4265-8462-d5e780170808.
=============================================================

[Tracking Requested - why for this release]:
in early crash data coming in after the go-live of the 55.0 release there is an extraordinarily high share of shutdownhang signatures (close to 40% of all browser crashes). looking for similarities between the reports, pt-BR locales are affected disproportionately often and lots of reports show the aetpkss1.dll file in the modules list of the browser.

this seems to be a PKCS#11 crypto library, probably related to https://www.aeteurope.com/our-solutions/safesign-identity-client/

what's worrying is that some user comments there seem to say that they are unable to load pages at all, so this may not be a mere shutdownhang situation but a symptom of something more fundamental going wrong with the browser:
* bp-5ca6b074-c948-49a9-9e0c-27c860170808: I have uploaded SafeSign PKCS#11 library to interact with smartcards. After upload the library successfull I saw my certificate in the list. Afterwards I left the firefox and when I came back and clicked in Security Devices button in Advanced tab, Firefox crashed. This is really important issue for many people.
* bp-34b15b8d-6111-4265-8462-d5e780170808: (translated from German) Since installing 55.0 I can no longer use Firefox, because pages no longer load at all!!!! Please fix the bug! Thanks.
* bp-4d722f0e-69b9-42d8-8de2-857cd0170808: (translated from Portuguese) With the new update, I can't browse or even search.
* bp-d70098e0-79ec-42d3-90d6-d92790170808: (translated from Portuguese) It doesn't open any site.
* bp-e7895c43-8498-4c21-add7-7afe00170808: (translated from Portuguese) The browser has become unusable, as it doesn't open any page and doesn't respond to any command.
...
(Reporter)

Updated

15 days ago
Summary: Crash in shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | GetTickCount64 | GetTickCount64 | mozilla::TimeStamp::Now | mozilla::CondVar::Wait → Shutdownhang crashes with aetpkss1.dll PKCS#11 module in Firefox 55
This is critical, tracking.
We might disable updates or block it.


Aaron, David, do you think we could block this DLL? Would it work?
tracking-firefox55: ? → blocking
Flags: needinfo?(dmajor)
Flags: needinfo?(aklotz)
Looking at the stack traces this looks very similar to bug 1372505.
See Also: → bug 1372505
I defer to Carl regarding the technical aspects of blocking this.

Keep in mind that if we successfully block the software, we'll be breaking some crypto-related activity (banking? corporate vpn?) for these users and it may still result in a bad experience with Firefox. We may want to prepare some support/messaging.

Also it would be good to contact the software vendor in case they can put out an update quickly and/or give us advice on how to minimally work around the issue, preferably without a wholesale block.
Flags: needinfo?(dmajor) → needinfo?(ccorcoran)
(Reporter)

Comment 4

15 days ago
judging from a crash-stats search [1] it looks like the issue for pt-BR users potentially started with 55.0a1 build 20170608030205. the pushlog for the day before [2] had bug 1345368 landing, so maybe it could be related to that?

[1] https://crash-stats.mozilla.com/search/?signature=%5Eshutdownhang&proto_signature=~mozilla%3A%3ASpinEventLoopUntil%3CT%3E%20%7C%20mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown&useragent_locale=pt-br&version=55.0a1&date=%3E%3D2016-11-01T00%3A00%3A00.000Z&date=%3C2017-08-08T15%3A12%3A00.000Z#crash-reports
[2] https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-06-07&tochange=7efda263a842e60cd0cc00b3c4a7058c65590702
Flags: needinfo?(franziskuskiefer)
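
For readers who want to see which filters search [1] applies, here is a minimal sketch that decodes its query string with Python's standard library. The URL is copied from the comment above; the helper name `search_filters` is made up for illustration. The leading `^` and `~` on the values appear to be crash-stats search operators, which this sketch leaves untouched rather than interpreting.

```python
from urllib.parse import urlsplit, parse_qs

# Search URL [1], copied verbatim from the comment above.
SEARCH_URL = (
    "https://crash-stats.mozilla.com/search/"
    "?signature=%5Eshutdownhang"
    "&proto_signature=~mozilla%3A%3ASpinEventLoopUntil%3CT%3E%20%7C%20"
    "mozilla%3A%3Anet%3A%3AnsHttpConnectionMgr%3A%3AShutdown"
    "&useragent_locale=pt-br&version=55.0a1"
    "&date=%3E%3D2016-11-01T00%3A00%3A00.000Z"
    "&date=%3C2017-08-08T15%3A12%3A00.000Z#crash-reports"
)


def search_filters(url):
    """Return the percent-decoded query parameters as a dict of lists."""
    # urlsplit separates the "#crash-reports" fragment from the query.
    return parse_qs(urlsplit(url).query)


filters = search_filters(SEARCH_URL)
for name, values in filters.items():
    print(name, values)
```

The repeated `date` parameter comes back as a two-element list, which is how the lower and upper bounds of the search window are expressed.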
(Reporter)

Updated

15 days ago
Crash Signature: [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | GetTickCount64 | GetTickCount64 | mozilla::TimeStamp::Now | mozilla::CondVar::Wait] [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariabl… → [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | GetTickCount64 | GetTickCount64 | mozilla::TimeStamp::Now | mozilla::CondVar::Wait] [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariabl…
ccorcoran should be your point of contact here.
Flags: needinfo?(aklotz)
(In reply to [:philipp] from comment #4)
...
> [2]
> https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-06-
> 07&tochange=7efda263a842e60cd0cc00b3c4a7058c65590702

That includes https://hg.mozilla.org/mozilla-central/rev/aafc907d2aae which includes a patch that introduced a deadlock that was fixed by bug 1381784 ( https://hg.mozilla.org/projects/nss/rev/58026f3ade78 ). Unfortunately that only landed in NSS 3.32 and as a result didn't make it into the version of NSS used by Firefox 55 (3.31). This is a likely candidate for the behavior people are reporting.

According to telemetry ( https://mozkeeler.github.io/pkcs11-module-telemetry/ - may take a while to load) this may affect on the order of 60k users, although it may be higher than that if other PKCS#11 modules behave similarly.
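
The version reasoning above can be written down as a small check. This is only a sketch under assumptions taken from this bug: the deadlock fix from bug 1381784 is treated as present from NSS 3.32 onward and in the 3.31.1 point release that is cut later in this thread, and absent from 3.31 as shipped in Firefox 55. The function names are hypothetical, and nothing is claimed about point releases on older branches.

```python
def parse_nss_version(text):
    """Turn '3.31' or '3.31.1' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def has_deadlock_fix(version):
    """True if this NSS version is assumed to carry the bug 1381784 fix.

    Assumption (from this thread only): 3.31.1 and everything newer has
    the fix; 3.31 and everything older does not.
    """
    return parse_nss_version(version) >= (3, 31, 1)


for v in ("3.31", "3.31.1", "3.32"):
    print(v, has_deadlock_fix(v))
```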
(Reporter)

Comment 7

15 days ago
yes, when manually looking into a couple of those shutdownhang reports occasionally there are other PKCS#11 modules as well - aetpkss1.dll is just the most frequent one...
Franziskus, David, could you prepare a new version of NSS to fix this? 3.31.1? Seems like a potential driver for a 55.0.1
Thanks
Flags: needinfo?(dkeeler)
Summary: Shutdownhang crashes with aetpkss1.dll PKCS#11 module in Firefox 55 → No pages are rendered / Shutdownhang crashes with aetpkss1.dll PKCS#11 module in Firefox 55
I don't actually have the permissions to cut a new NSS release (much less uplift a patch) (nor do I know how, for that matter). If Franziskus isn't around, maybe Tim is? (although it's end-of-day in their timezone)
Flags: needinfo?(dkeeler) → needinfo?(ttaubert)
Two pieces of information:
* we are going to disable updates to 55 because of this bug
* Kai is on vacation.
(Reporter)

Comment 11

14 days ago
Leonardo, is your issue fixed if you upgrade to the (not yet fully released) 56.0b1 which you can get for your locale from https://archive.mozilla.org/pub/firefox/releases/56.0b1/win32/ ?
Flags: needinfo?(leonardo1.poa)

Comment 12

14 days ago
Yes, version 56.0b1 runs flawlessly with Safesign installed.
Flags: needinfo?(leonardo1.poa)
In an emergency, we could in theory land patches into Firefox's NSS without a formal release, right?
yeah, we should do that if nobody is available
(Reporter)

Comment 15

14 days ago
could you do another test run in order to narrow down the regression range and increase confidence in the planned fix:
* first please try https://archive.mozilla.org/pub/firefox/nightly/2017/07/2017-07-27-16-28-01-mozilla-central/firefox-56.0a1.en-US.win32.installer.exe which was the last nightly build without the NSS 3.32 and should therefore probably still exhibit the same problem as you're seeing on 55 release.
* secondly please try https://archive.mozilla.org/pub/firefox/nightly/2017/07/2017-07-29-10-02-54-mozilla-central/firefox-56.0a1.en-US.win32.installer.exe which is the first nightly with NSS 3.32 and should no longer have the bug.

can you confirm these two assumptions? many thanks!
Flags: needinfo?(leonardo1.poa)
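
The "last bad build / first good build" test requested above is one step of a bisection. As a rough sketch, assuming an ordered list of builds where the first is known buggy and the last is known fixed, the search could look like the following. `first_fixed_build` and `is_buggy` are hypothetical names, and the build labels are placeholders rather than real nightly IDs.

```python
def first_fixed_build(builds, is_buggy):
    """Binary-search an ordered list of builds for the first non-buggy one.

    Assumes builds[0] is buggy, builds[-1] is fixed, every build before
    the fix is buggy and every build after it is not, and that is_buggy()
    answers reliably for a single build.
    """
    lo, hi = 0, len(builds) - 1  # builds[lo] buggy, builds[hi] fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_buggy(builds[mid]):
            lo = mid
        else:
            hi = mid
    return builds[hi]


# Placeholder builds 0..7; pretend the fix landed in build 5.
builds = list(range(8))
print(first_fixed_build(builds, lambda b: b < 5))
```

The reliability assumption matters: an intermittent hang can make a short test run report "good" for a build that is actually buggy, which sends the search in the wrong direction.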
(In reply to Sylvestre Ledru [:sylvestre] from comment #8)
> Franziskus, David, could you prepare a new version of NSS to fix this?
> 3.31.1? Seems like a potential driver for a 55.0.1
> Thanks

On it.
Flags: needinfo?(ttaubert)

Comment 17

14 days ago
Can't confirm.
Both versions showed no problems.
Flags: needinfo?(leonardo1.poa)
(Reporter)

Comment 18

14 days ago
ok, i may have taken a wrong turn. is https://archive.mozilla.org/pub/firefox/nightly/2017/07/2017-07-23-03-02-06-mozilla-central/firefox-56.0a1.en-US.win32.installer.exe still buggy?
Created attachment 8895070 [details]
script to update to NSS 3.31.1

Ok, here's the command. The release is live and the release notes can be found here:

https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/NSS_3.31.1_release_notes

Comment 20

14 days ago
(In reply to [:philipp] from comment #18)
> ok, i may have taken a wrong turn. is
> https://archive.mozilla.org/pub/firefox/nightly/2017/07/2017-07-23-03-02-06-
> mozilla-central/firefox-56.0a1.en-US.win32.installer.exe still buggy?

Still no problems.
Tim, I have the permissions to make an NSS release (and I did it in the past).
Should I just go ahead or will you do it?
Thanks
I am planning to start a build of 55.0.1 today.
Flags: needinfo?(ttaubert)
The release is done https://ftp.mozilla.org/pub/security/nss/releases/NSS_3_31_1_RTM/.
You can update Firefox with `python client.py update_nss NSS_3_31_1_RTM`.
Flags: needinfo?(ttaubert)
Flags: needinfo?(franziskuskiefer)
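
The tag in the command above follows directly from the version number. A minimal sketch of that mapping, assuming the `NSS_<major>_<minor>_<patch>_RTM` pattern seen in this one tag also holds for other releases (the helper names are made up for illustration):

```python
def nss_release_tag(version):
    """Map an NSS version such as '3.31.1' to a tag like 'NSS_3_31_1_RTM'."""
    # Assumption: the naming pattern of the single tag quoted above generalizes.
    return "NSS_" + version.replace(".", "_") + "_RTM"


def update_command(version):
    """Build the argument list for the update command quoted above."""
    return ["python", "client.py", "update_nss", nss_release_tag(version)]


print(" ".join(update_command("3.31.1")))
```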
ok, I need more coffee, thanks
Flags: needinfo?(ccorcoran)

Comment 24

14 days ago
uplift
I used my superpower to land it:
https://hg.mozilla.org/releases/mozilla-release/rev/3adc9de7f0de0f235eeb3c9f047a2076bd49a2f8

We will need to verify that.
status-firefox55: affected → fixed
Flags: qe-verify+
(Reporter)

Comment 25

14 days ago
combing through module correlations data for those crash signatures this morning it looks like aetpkss1.dll is present for the bulk (80-90%) of them, but it's not exclusive and other PKCS#11 modules are implicated as well (eTPKCS11.dll, WDBraz_P11_CCID_v34.dll, PKCS11.dll, asepkcs.dll).
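
For illustration, a module share like the 80-90% figure above can be computed from per-report module lists roughly as follows. This is a sketch with made-up sample data, not the actual crash-stats correlation job; `module_shares` is a hypothetical helper, and module names are lowercased on the assumption that Windows file names compare case-insensitively.

```python
from collections import Counter


def module_shares(reports):
    """Fraction of reports whose module list contains each module name."""
    counts = Counter()
    for modules in reports:
        # Use a set so a module is counted at most once per report.
        counts.update({m.lower() for m in modules})
    total = len(reports)
    return {name: n / total for name, n in counts.items()} if total else {}


# Made-up sample: five reports, each with the PKCS#11 modules it loaded.
sample_reports = [
    ["aetpkss1.dll"],
    ["aetpkss1.dll", "eTPKCS11.dll"],
    ["aetpkss1.dll"],
    ["asepkcs.dll"],
    ["aetpkss1.dll"],
]

shares = module_shares(sample_reports)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.0%}")
```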
Comment on attachment 8895070 [details]
script to update to NSS 3.31.1

Filing the uplift request for posterity

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1273678
[User impact if declined]: Deadlock causing the pages not to render
[Is this code covered by automated tests?]: Yes, seems that we use this line a lot: https://codecov.io/gh/marco-c/gecko-dev/src/master/security/nss/lib/dev/devslot.c#L231
[Has the fix been verified in Nightly?]: Yes, we already have this NSS fix in nightly & beta
[Needs manual test from QE? If yes, steps to reproduce]: Install the extension mentioned above
[List of other uplifts needed for the feature/fix]: No
[Is the change risky?]: Yes and no: this is a race condition, so never trivial, but we have had the fix in nightly and beta for a few days.
[Why is the change risky/not risky?]: Explained just above
[String changes made/needed]: None
Attachment #8895070 - Flags: approval-mozilla-release?
Depends on: 1381784
Is Fennec impacted by this bug?
No, we don't support external PK11 tokens on Android.

Comment 29

14 days ago
(In reply to Leonardo Cervo from comment #20)
> (In reply to [:philipp] from comment #18)
> > ok, i may have taken a wrong turn. is
> > https://archive.mozilla.org/pub/firefox/nightly/2017/07/2017-07-23-03-02-06-
> > mozilla-central/firefox-56.0a1.en-US.win32.installer.exe still buggy?
> 
> Still no problems.

Well, after using it longer, I can confirm the same behavior, pages hanging on load and crashing on exit.
Andrei, could you try to see if someone in your team can reproduce the issue with the extension? Thanks
Flags: needinfo?(andrei.vaida)
(Reporter)

Updated

13 days ago
Crash Signature: [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | GetTickCount64 | GetTickCount64 | mozilla::TimeStamp::Now | mozilla::CondVar::Wait] [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariabl… → [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | GetTickCount64 | GetTickCount64 | mozilla::TimeStamp::Now | mozilla::CondVar::Wait] [@ shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariabl…
Comment on attachment 8895070 [details]
script to update to NSS 3.31.1

I landed it in m-r
Attachment #8895070 - Flags: approval-mozilla-release? → approval-mozilla-release+
Tried to reproduce the initial issue on Windows 7 x64 and Windows 10 x64, but we need some additional details on how to install the PKCS#11 modules. We just managed to confirm that the NSS library version is updated to 3.31.1 in 55.0.1 build1 (20170809080026).
Flags: needinfo?(madperson)
(Reporter)

Comment 35

12 days ago
here's one site that hosts the module: http://www.ciespdigital.com.br/index.php/suporte/downloads
didn't experience any immediate shutdownhangs just with installing this, though aetpkss1.dll was hooking into the browser then
Flags: needinfo?(madperson)
(In reply to Sylvestre Ledru [:sylvestre] from comment #33)
> I landed it in m-r

Should this bug be marked "fixed" then?
Blocks: 1273678
Flags: needinfo?(sledru)
Probably, I just updated the status flag
Status: NEW → RESOLVED
Last Resolved: 8 days ago
Flags: needinfo?(sledru)
Resolution: --- → FIXED
status-firefox56: --- → fixed
status-firefox57: --- → fixed
status-firefox-esr52: --- → unaffected