Closed Bug 1369361 Opened 8 years ago Closed 7 years ago

Startup crash in abort | _acrt_RtlGenRandom (Lenovo OneKey Theater)

Categories

(External Software Affecting Firefox :: Other, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

(relnote-firefox 56+, firefox-esr52 wontfix, firefox55 wontfix, firefox56+ fixed, firefox57 fixed)

RESOLVED FIXED
Tracking Status
relnote-firefox --- 56+
firefox-esr52 --- wontfix
firefox55 --- wontfix
firefox56 + fixed
firefox57 --- fixed

People

(Reporter: philipp, Assigned: marco)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: crash)

Crash Data

Attachments

(2 files, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-7c5d3993-2a37-4751-97a4-9a5701170601. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 ucrtbase.dll abort 1 ucrtbase.dll _acrt_RtlGenRandom 2 ucrtbase.dll rand_s 3 nss3.dll winRandomness db/sqlite3/src/sqlite3.c:43256 4 nss3.dll sqlite3_randomness db/sqlite3/src/sqlite3.c:26694 5 nss3.dll writeJournalHdr db/sqlite3/src/sqlite3.c:48251 6 nss3.dll pager_open_journal db/sqlite3/src/sqlite3.c:52471 7 nss3.dll pager_write db/sqlite3/src/sqlite3.c:52656 8 nss3.dll sqlite3PagerWrite db/sqlite3/src/sqlite3.c:52829 9 nss3.dll newDatabase db/sqlite3/src/sqlite3.c:61930 10 nss3.dll sqlite3BtreeBeginTrans db/sqlite3/src/sqlite3.c:62080 11 nss3.dll sqlite3VdbeExec db/sqlite3/src/sqlite3.c:81184 12 nss3.dll sqlite3_step db/sqlite3/src/sqlite3.c:76477 13 nss3.dll sqlite3_exec db/sqlite3/src/sqlite3.c:110601 14 xul.dll mozilla::storage::Connection::executeSql(sqlite3*, char const*) storage/mozStorageConnection.cpp:1178 15 xul.dll mozilla::storage::Connection::ExecuteSimpleSQL(nsACString_internal const&) storage/mozStorageConnection.cpp:1646 16 xul.dll mozilla::places::Database::InitTempEntities() toolkit/components/places/Database.cpp:1121 17 xul.dll mozilla::places::Database::Init() toolkit/components/places/Database.cpp:488 18 xul.dll mozilla::places::Database::GetSingleton() toolkit/components/places/Database.cpp:320 19 xul.dll mozilla::places::Database::GetDatabase() toolkit/components/places/Database.cpp:429 20 xul.dll nsNavHistory::Init() toolkit/components/places/nsNavHistory.cpp:308 21 xul.dll nsNavHistory::GetSingleton() toolkit/components/places/nsNavHistory.cpp:266 22 xul.dll nsNavHistoryConstructor toolkit/components/places/nsPlacesModule.cpp:18 23 xul.dll nsComponentManagerImpl::CreateInstance(nsID const&, nsISupports*, nsID const&, void**) xpcom/components/nsComponentManager.cpp:1057 24 xul.dll nsComponentManagerImpl::GetService(nsID const&, nsID const&, void**) xpcom/components/nsComponentManager.cpp:1300 25 xul.dll nsJSCID::GetService(JS::Handle<JS::Value>, JSContext*, unsigned char, JS::MutableHandle<JS::Value>) js/xpconnect/src/XPCJSID.cpp:693 26 xul.dll XPTC__InvokebyIndex xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_64.asm:97 27 @0x2d9227 28 xul.dll XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp:1296 29 xul.dll XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp:983 30 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:460 31 xul.dll Interpret js/src/vm/Interpreter.cpp:2989 32 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:406 33 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:478 34 xul.dll js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/CrossCompartmentWrapper.cpp:333 35 xul.dll js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/Proxy.cpp:421 36 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:448 37 xul.dll Interpret js/src/vm/Interpreter.cpp:2989 38 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:406 39 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:478 40 xul.dll js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/CrossCompartmentWrapper.cpp:333 41 xul.dll js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/Proxy.cpp:421 42 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:448 43 xul.dll Interpret js/src/vm/Interpreter.cpp:2989 44 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:406 45 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:478 46 xul.dll JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) js/src/jsapi.cpp:2788 47 xul.dll nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) js/xpconnect/src/XPCWrappedJSClass.cpp:1213 48 xul.dll nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) js/xpconnect/src/XPCWrappedJS.cpp:613 49 xul.dll PrepareAndDispatch xpcom/reflect/xptcall/md/win32/xptcstubs_x86_64.cpp:174 50 xul.dll SharedStub xpcom/reflect/xptcall/md/win32/xptcstubs_asm_x86_64.asm:57 51 xul.dll nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:112 52 xul.dll nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverService.cpp:281 53 xul.dll XREMain::XRE_mainRun() toolkit/xre/nsAppRunner.cpp:4402 54 xul.dll XREMain::XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4654 55 xul.dll XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4745 56 firefox.exe NS_internal_main(int, char**, char**) browser/app/nsBrowserApp.cpp:305 57 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:115 58 firefox.exe __scrt_common_main_seh f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253 59 kernel32.dll BaseThreadInitThunk 60 ntdll.dll RtlUserThreadStart we had a user reporting this startup crash on this thread: https://support.mozilla.org/questions/1162522 looking through a couple of reports with the same signature, all had the unversioned DLLs "WindowsApiHookDll64.dll" & "ActiveDetect64.dll" in the modules list hooking into firefox. these modules belong to Lenovo OneKey Theater. though the overall crash volume for it is currently very low, this is a 64bit-only firefox crash and may get more common once we migrate more users to win64. in the small sample we have available, the crash also seems to be more predominant with some eastern european locales (pl, cs, ru, uk), so maybe it's dependant on particular notebook models that are sold there.
I'm fine with blocklisting if we can get that user to report that the blocklisting works and doesn't cause other problems/crashes. Marco can you take this?
Component: OpenH264 → Other
Flags: needinfo?(mcastelluccio)
Attached patch Patch (obsolete) — Splinter Review
I'll spin a try build so that we can ask users to test.
Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED
Flags: needinfo?(mcastelluccio)
Philipp asked the user to try and reproduce the bug with the custom build. We don't have other options as in the past 14 days nobody affected by this crash has left an email address. There are addresses in crashes older than 14 days, but I think it's not useful to contact them (if they are not crashing anymore, they can't verify the fix).
These rand_s crashes are caused by bad third-party software injecting hooks into advapi32.dll. Other rand_s crashes over the years: bug 723447, bug 1094945, bug 1167248, bug 1240589. We patched those crashes by calling the RtlGenRandom API instead of rand_s, but in this case SQLite is calling rand_s.
There are some email addresses now, I've triggered a new try build and will send an email to those (eight) users.
Marco, is this crash signature spiking? We just migrated Beta users running 32-bit Firefox to 64-bit (in build 56.0b9), so we are seeing other 64-bit specific crash signatures starting to spike. Can we land your patch from comment 3 to block these DLLs in Beta 56? Over the last six months, 100% of these crashes are on Windows 7, so we can limit the block to Windows 7. Curiously, I don't see this crash signature at all on Nightly, even looking back six months.
Flags: needinfo?(mcastelluccio)
(In reply to Chris Peterson [:cpeterson] from comment #7) > Marco, is this crash signature spiking? We just migrated Beta users running > 32-bit Firefox to 64-bit (in build 56.0b9), so we are seeing other 64-bit > specific crash signatures starting to spike. > > Can we land your patch from comment 3 to block these DLLs in Beta 56? Over > the last six months, 100% of these crashes are on Windows 7, so we can limit > the block to Windows 7. No check for this currently but should be easy to add. We already have a win8 or higher flag we can crib from.
(In reply to Chris Peterson [:cpeterson] from comment #7) > Marco, is this crash signature spiking? We just migrated Beta users running > 32-bit Firefox to 64-bit (in build 56.0b9), so we are seeing other 64-bit > specific crash signatures starting to spike. > > Can we land your patch from comment 3 to block these DLLs in Beta 56? Over > the last six months, 100% of these crashes are on Windows 7, so we can limit > the block to Windows 7. > > Curiously, I don't see this crash signature at all on Nightly, even looking > back six months. It looks like it is more common now than before, 161 crashes over the last week (from approximately 27 installations). The only problem is that we don't know if the blocklist works or if it causes problems/crashes. I've sent emails to the users who left their address in the reports, but I haven't heard back from them yet.
Julien, can you rebased Marco's DLL blocklist patch for these Lenovo DLLs? Unfortunately, we don't know anyone who has reproduced the crash or can test whether blocking those Lenovo DLLs fixes the Firefox crash without causing other problems. I can try asking on Lenovo's support forum for any testers.
Crash Signature: [@ abort | _acrt_RtlGenRandom] → [@ abort | _acrt_RtlGenRandom] [@ abort | __acrt_RtlGenRandom] [@ __acrt_RtlGenRandom]
Flags: needinfo?(jcristau)
btw, here is the installer for Lenovo Onekey Theater for Windows 7 on the IdeaPad Z460, Z560. The latest version if from February 2012, so we should not expect Lenovo to provide a fix. https://support.lenovo.com/us/en/downloads/ds012956
Some Firefox users kindly included the email address in their crash reports. I emailed them to see if they can help test the DLL blocklist fix.
(In reply to Chris Peterson [:cpeterson] from comment #12) > Some Firefox users kindly included the email address in their crash reports. > I emailed them to see if they can help test the DLL blocklist fix. I've already done this yesterday :) The updated build I sent them is here: https://queue.taskcluster.net/v1/task/ETKqHlG-Q8K_RyfrjZqVtg/runs/0/artifacts/public/build/install/sea/target.installer.exe.
Flags: needinfo?(mcastelluccio)
Flags: needinfo?(jcristau)
(In reply to Marco Castelluccio [:marco] from comment #13) > > I emailed them to see if they can help test the DLL blocklist fix. > > I've already done this yesterday :) oops! Sorry I missed that detail. <:)
I received a user feedback. Crash report: https://crash-stats.mozilla.com/report/index/0e8258d2-9aaf-4220-b377-b5ef81170911 According to his description, Firefox crashed after the installation and the safe mode can not work as well. So I can ask him try to use EXE in comment 13? If the problem is solved, the normal performance is that firefox can use fine and no longer crash.
(In reply to yxu from comment #15) > I received a user feedback. Crash report: > https://crash-stats.mozilla.com/report/index/0e8258d2-9aaf-4220-b377- > b5ef81170911 > According to his description, Firefox crashed after the installation and the > safe mode can not work as well. > > So I can ask him try to use EXE in comment 13? If the problem is solved, the > normal performance is that firefox can use fine and no longer crash. Yes please! It would help us make a decision here.
Flags: needinfo?(yxu)
Marco, dmajor already blocked these Lenovo OneKey Theatre DLLs in Firefox 36 bug 1123778, but he only blocked UNVERSIONED. The activedetect64.dll and windowsapihookdll64.dll DLLs in yxu's crash report bp-0e8258d2-9aaf-4220-b377-b5ef81170911 are unversioned, so I'm not sure why the UNVERSIONED block is not working. So your new patch can just update the existing entries for activedetect64.dll and windowsapihookdll64.dll to block ALL_VERSIONS. You might as well block ALL_VERSIONS for the 32-bit activedetect32.dll and windowsapihookdll32.dll too, since the 32-bit DLLs were causing problems in the past and the UNVERSIONED block doesn't seem to be working for 64-bit.
Blocks: 1123778
Flags: needinfo?(mcastelluccio)
Attached patch Patch (obsolete) — Splinter Review
Thanks, I hadn't noticed we were already blocking them when unversioned.
Attachment #8873508 - Attachment is obsolete: true
Flags: needinfo?(mcastelluccio)
Attachment #8906805 - Flags: review?(dmajor)
I'm confused. If we're seeing continuing crash reports with unversioned DLLs, which are already on the blocklist, why do we expect that upgrading the blocklist to ALL_VERSIONS will change anything?
Flags: needinfo?(mcastelluccio)
Tracking for 56, since this may affect more users as they migrate to 64-bit.
(In reply to David Major [:dmajor] from comment #19) > I'm confused. If we're seeing continuing crash reports with unversioned > DLLs, which are already on the blocklist, why do we expect that upgrading > the blocklist to ALL_VERSIONS will change anything? The unversioned block used to work, maybe it is still working for a subset of users. So my hope is that the versioned one will block at least a subset of the crashes.
Flags: needinfo?(mcastelluccio)
(In reply to Marco Castelluccio [:marco] from comment #21) > (In reply to David Major [:dmajor] from comment #19) > > I'm confused. If we're seeing continuing crash reports with unversioned > > DLLs, which are already on the blocklist, why do we expect that upgrading > > the blocklist to ALL_VERSIONS will change anything? > > The unversioned block used to work, maybe it is still working for a subset > of users. So my hope is that the versioned one will block at least a subset > of the crashes. My point isn't about whether the unversioned block is effective. What I'm saying is that all the reports I looked at have unversioned binaries, so changing the version condition isn't going to change the set of binaries that we attempt to block (regardless of whether the block works or not). Am I missing something?
David, you are right. Lenovo's support website [1] lists a product version (2.0.1.8), so I figured that there was no harm in expanding the block to all DLL versions, in case some Lenovo DLLs included a version in a later release. But that would not solve this crash because we know there the unversioned DLLs are still crashing and not being blocked. So we need to figure out why these DLLs are not being blocked. Bob says we are such a problem in Quick Heal Antivirus bug 1347867. What are our options for preventing these early DLL loads? I don't expect Lenovo to fix the crash in their DLL because their last update was in 2012. Can we do something such as shipping a dummy DLL in the Firefox directory that will get loaded instead of Lenovo's? [1] https://support.lenovo.com/us/en/downloads/ds012956
I wonder what has changed since Firefox 36. The Lenovo DLLs were successfully blocked when bug 1123778 was resolved and verified, but now they're not. Maybe the difference is that the Firefox 36 crashes were in the 32-bit DLLs and now we are crashing in the 64-bit DLLs?
mak or ddurst, I think we should patch our copy of SQLite to stop calling rand_s to work around these Lenovo DLL crashes in Beta 56. Can you help with this or recommend someone else I can ask? We are crashing in SQLite's call to the Windows API `rand_s`. We (and Chrome) have had recurring problems with antivirus DLL hooks causing crashes in rand_s, e.g. bug 1240589, bug 1167248, bug 723447, and bug 694344. We replaced some rand_s calls with RtlGenRandom (e.g. changeset [1]) and those crashes seemed to go away. There are three calls to rand_s in mozilla-central [2], but the SQLite call is the only one that shows up in crash reports now, probably because it is called very early during browser startup. [1] https://hg.mozilla.org/mozilla-central/rev/795a10bae428 [2] https://searchfox.org/mozilla-central/search?case=true&regexp=true&q=%5Cbrand_s%5Cb
Flags: needinfo?(mak77)
Flags: needinfo?(ddurst)
I reworked my rand_s() patch from bug 1167248 to replace rand_s() with RtlGenRandom() in sqlite3.c. If this works around the Lenovo DLL crashes, we might consider pushing this change to upstream SQLite. A sqlite3.c code comment says rand_s() isn't available with MinGW. I don't know if RtlGenRandom() is available with MinGW or not. Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=488ac239bd6be858eeedf158123d3a73df6cf586
Flags: needinfo?(mak77)
Flags: needinfo?(ddurst)
(In reply to Chris Peterson [:cpeterson] from comment #24) > I wonder what has changed since Firefox 36. The Lenovo DLLs were > successfully blocked when bug 1123778 was resolved and verified, but now > they're not. Maybe the difference is that the Firefox 36 crashes were in the > 32-bit DLLs and now we are crashing in the 64-bit DLLs? FWIW, the block didn't completely work on Chrome either, and that was back in 2014, see https://bugs.chromium.org/p/chromium/issues/detail?id=379218#c15 and later comments. I support your "rewrite rand_s" patch. It's probably the best technical means at our disposal right now. I didn't bring it up because I assumed it would be too much pain/process to patch sqlite. If that's not the case, then great!
Comment on attachment 8906805 [details] [diff] [review] Patch I hadn't noticed the modules were always unversioned.
Attachment #8906805 - Attachment is obsolete: true
Attachment #8906805 - Flags: review?(dmajor)
Attachment #8906805 - Flags: review-
(In reply to Chris Peterson [:cpeterson] from comment #25) > I think we should patch our copy of SQLite to stop calling > rand_s to work around these Lenovo DLL crashes in Beta 56. We (the SQLite developers) will make sure that the next release of SQLite does not call rand_s(). To be clear, rand_s() is currently only used to help seed SQLite's own internal PRNG on Windows. rand_s() should only be called once, when SQLite first initializes. On UNIX we have /dev/random and rand() is never used.
Sorry to reply late, I have been waiting for the user's response. The user installed the above commented version for testing. Currently in use he did not find the problem, the startup crash has been resolved. About Lenovo DLLs virsion, this user used Onekey Theater 2.0.2.7
Flags: needinfo?(yxu)
Comment on attachment 8907319 [details] Bug 1369361 - Backport SQLite fix removing rand_s() in an attempt to avoid Lenovo DLL crashes. https://reviewboard.mozilla.org/r/178994/#review184232 We don't patch Sqlite, and I'd not want to go this direction unless the world is falling upon us. The current setup is really cheap to maintain, anyone can indeed update the library without the need for peers or the module owner to do it (there's no one officially assigned to work on mozStorage, we just contribute our time from other teams). This creates a precedent, and people will start willing to patch this thing and that thing, I dont' really want to lose the precious testing the Sqlite team does on the original code. This crash is caused by old libraries from 2012, I'm sure it's not the only one and we can't just patch all of them. Richard promised a solution upstream in comment 30 and that's likely what we'll take. At a maximum we can patch our sqlite with the upstream patch, without a patches folder and without complicating the upgrade process. So that on upgrade we'll just pick upstream. Did we already figured out why the dll block doesn't work? That sounds like would have a nicer impact on the product.
Attachment #8907319 - Flags: review?(mak77) → review-
If you would like a quick SQLite 3.20.2 release that contains this one patch, let me know.
(In reply to Marco Bonardo [::mak] from comment #32) > We don't patch Sqlite, and I'd not want to go this direction unless the > world is falling upon us. The current setup is really cheap to maintain, > anyone can indeed update the library without the need for peers or the > module owner to do it (there's no one officially assigned to work on > mozStorage, we just contribute our time from other teams). This creates a > precedent, and people will start willing to patch this thing and that thing, That's a good point. Upstream SQLite just removed the rand_s() call [1] so we could just apply their tiny fix to our copy of SQLite 3.20.1 without committing patch files in hg. Whenever the next version of SQLite is available, it will aleady have the rand_s() fix and won't need to be patched. [1] https://www.sqlite.org/src/info/3a2793aa65727cbb > Did we already figured out why the dll block doesn't work? That sounds like > would have a nicer impact on the product. I don't know why the DLL blocklist isn't working, but I am talking to some other people about this. (In reply to D. Richard Hipp from comment #33) > If you would like a quick SQLite 3.20.2 release that contains this one > patch, let me know. I don't think you need to release a SQLite 3.20.2 just for this fix. I can just apply that fix to our copy of 3.20.1, which is the same effect but less work for you. :-)
Comment on attachment 8907319 [details] Bug 1369361 - Backport SQLite fix removing rand_s() in an attempt to avoid Lenovo DLL crashes. https://reviewboard.mozilla.org/r/178994/#review184824 Much better, considered this only affects Windows it won't even be a problem with Linux usage of System Sqlite. Thanks!
Attachment #8907319 - Flags: review?(mak77) → review+
Pushed by cpeterson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/14207baa622e Backport SQLite fix removing rand_s() in an attempt to avoid Lenovo DLL crashes. r=mak
Comment on attachment 8907319 [details] Bug 1369361 - Backport SQLite fix removing rand_s() in an attempt to avoid Lenovo DLL crashes. Approval Request Comment [Feature/Bug causing the regression]: Migrating Beta and Dev Edition users to 64-bit Firefox caused this existing bug in a Lenovo DLL to start crashing more than before. [User impact if declined]: Some 64-bit Firefox 56 users on Windows 7 will see startup crashes. Beta 56 has had 266 Lenovo DLL crashes total, Beta 55 had 66, and Beta 54 had zero. [Is this code covered by automated tests?]: No. [Has the fix been verified in Nightly?]: No because we don't know anyone who can reproduce the crash. We don't know for certain that this change actually fixes the crash or just moves it to a different crash signature. [Needs manual test from QE? If yes, steps to reproduce]: No because we don't know anyone who can reproduce the crash. [List of other uplifts needed for the feature/fix]: None [Is the change risky?]: No [Why is the change risky/not risky?]: This should be a safe fix because it just removes some Windows code that is now also removed from SQLite upstream. SQLite runs its own extensive tests. [String changes made/needed]: None
Attachment #8907319 - Flags: approval-mozilla-beta?
(In reply to Chris Peterson [:cpeterson] from comment #38) > [Has the fix been verified in Nightly?]: No because we don't know anyone who > can reproduce the crash. We don't know for certain that this change actually > fixes the crash or just moves it to a different crash signature. > > [Needs manual test from QE? If yes, steps to reproduce]: No because we don't > know anyone who can reproduce the crash. There's the user who contacted yxu who can reproduce, but he wasn't able to reproduce with my try build (which shouldn't have fixed the crash). Maybe he can't reproduce on Nightly at all. Perhaps he can test once this is in beta.
btw, this should be a very safe fix, but this particular crash's volume is quite low (266 reports from about 30 Beta users). So not taking this uplift in Beta 56 final RC build would not be the end of the world.
Comment on attachment 8907319 [details] Bug 1369361 - Backport SQLite fix removing rand_s() in an attempt to avoid Lenovo DLL crashes. Let's try this for the RC build on Monday. I'd love for us to avoid this startup crash on 56 release.
Attachment #8907319 - Flags: approval-mozilla-release+
Attachment #8907319 - Flags: approval-mozilla-beta?
Attachment #8907319 - Flags: approval-mozilla-beta-
Liz, I think we should relnote this Lenovo crash and resolution for Firefox 56. [Why is this notable]: Windows 7 users running 64-bit Firefox and Lenovo's OneKey Theater software will hit startup crashes. We plan to automatically migrate some 32-bit Firefox users to 64-bit during the 56 release, so some users may hit this crash but not know they are now running 64-bit Firefox. [Affects Firefox for Android]: No [Suggested wording]: Lenovo's "OneKey Theater" software for IdeaPad laptops can cause 64-bit Firefox startup crashes on Windows 7. To fix this crash, please [re-install 32-bit Firefox](https://www.mozilla.org/firefox/all/). [Links (documentation, blog post, etc)]: https://www.mozilla.org/firefox/all/
relnote-firefox: --- → ?
Flags: needinfo?(lhenry)
(In reply to Chris Peterson [:cpeterson] from comment #44) > Lenovo's "OneKey Theater" software for IdeaPad laptops can cause 64-bit > Firefox startup crashes on Windows 7. To fix this crash, please [re-install > 32-bit Firefox](https://www.mozilla.org/firefox/all/). Have we actually confirmed that reinstalling 32-bit Firefox fixes these crashes? I am wondering if they merely show up with different signatures...
Flags: needinfo?(cpeterson)
(In reply to David Major [:dmajor] from comment #45) > (In reply to Chris Peterson [:cpeterson] from comment #44) > > Lenovo's "OneKey Theater" software for IdeaPad laptops can cause 64-bit > > Firefox startup crashes on Windows 7. To fix this crash, please [re-install > > 32-bit Firefox](https://www.mozilla.org/firefox/all/). > > Have we actually confirmed that reinstalling 32-bit Firefox fixes these > crashes? I am wondering if they merely show up with different signatures... I haven't confirmed that downgrading to 32-bit avoids the crashes, but it seems like a safe suggestion. If migrating a user from 32- to 64-bit causes enough crashes that the user seeks out information in the Firefox release notes, then they probably had no or few crashes with 32-bit Firefox. When we've had rand_s crash spikes in the past, removing the call to rand_s seemed to make the crash reports go away. If the crashes just moved to a different signature, we might have seen a new spike. But the new crash signatures are random, we might not see a new spike. I'm not sure how to search for crash reports that these Lenovo DLLs (ActiveDetect64.dll and WindowsApiHookDll64.dll). Socorro's Super Search doesn't appear to allow searching for module names, but it does have a field called "app init dlls" that has some DLL module names (but not ActiveDetect64.dll or WindowsApiHookDll64.dll).
Flags: needinfo?(cpeterson)
Noted, slightly reworded: "Startup crashes with 64-bit Firefox on Windows 7, for users of Lenovo's "OneKey Theater" software for IdeaPad laptops. To fix this crash, please re-install 32-bit Firefox."
Flags: needinfo?(lhenry)
I wonder if we should try to detect these DLLs and prevent migration of those users to 64-bit.
(In reply to Chris Peterson [:cpeterson] from comment #46) > I'm not sure how to > search for crash reports that these Lenovo DLLs (ActiveDetect64.dll and > WindowsApiHookDll64.dll). Socorro's Super Search doesn't appear to allow > searching for module names, but it does have a field called "app init dlls" > that has some DLL module names (but not ActiveDetect64.dll or > WindowsApiHookDll64.dll). The DLLs are not available via SuperSearch, they are only available in the Telemetry Socorro dataset (which is not as easy to use). (In reply to David Major [:dmajor] from comment #48) > I wonder if we should try to detect these DLLs and prevent migration of > those users to 64-bit. This is a good idea and should be pretty trivial to implement. I don't know how we could deploy it to users though (56 is ready, we would need a system addon).
This is a good idea, but since we don't have time to add a DLL check in 56 (or probably 57), we could exclude Windows 7 users from 56's migration. This Lenovo crash only affects Windows 7. Unfortunately, Windows 7 is the most common OS version among Firefox users (46%). Windows 10 is 33% and Windows 8.x is 10%. I think we should still include all Windows versions in our 1% migration of 56. If we see similar crashes in 56 even though we have patched SQLite to avoid rand_s(), we can then decide what to do about these DLLs and Windows 7 users in future migration attempts.
I tried to reproduce the issue with 55.0.3, and I saw the troublesome dlls are blocked. But then I found this comment: bp-ad568e71-46bc-4179-be81-14a490170924: sudden crash after Avast install I installed Avast and then can reproduce that the dlls are not blocked. I'll try to understand why.
(In reply to Ting-Yu Chou [:ting] from comment #51) > I tried to reproduce the issue with 55.0.3, and I saw the troublesome dlls > are blocked. But then I found this comment: > > bp-ad568e71-46bc-4179-be81-14a490170924: sudden crash after Avast install > > I installed Avast and then can reproduce that the dlls are not blocked. I'll > try to understand why. Oh, that's a good point! The crash reports that we're getting could be the intersection of "has OneKey software" and "has another software that breaks out blocklist" -- either of those would be OK in isolation, but together they spell trouble. This makes me wonder if the relnote is maybe too broad; not all OneKey users would be affected by this. Same for the plans re blocking migration. Can anyone check whether these reports are setting BlocklistInitFailed (not sure if that's the exact spelling)? I don't see it on crash-stats but maybe it's a private field. Specifically in the Avast case, I bet they hook LdrLoadDll before we do, and then our disassembler no longer understands it.
(In reply to David Major [:dmajor] from comment #52) > This makes me wonder if the relnote is maybe too broad; not all OneKey users > would be affected by this. Same for the plans re blocking migration. I see what you mean. I don't think the relnote will be a problem. I assume that only Lenovo users actually affected by the startup crashes would actually seek out or act upon the relnote's recommendation to reinstall 32-bit Firefox. And it's not that big a deal if anyone else reinstalls 32-bit Firefox. > Can anyone check whether these reports are setting BlocklistInitFailed (not > sure if that's the exact spelling)? I don't see it on crash-stats but maybe > it's a private field. I spot checked some of these Lenovo crash reports from different Firefox versions. All of these crash reports had "BlocklistInitFailed: 1". You do need to sign into Socorro to view BlocklistInitFailed in the crash report's Metadata tab. > Specifically in the Avast case, I bet they hook LdrLoadDll before we do, and > then our disassembler no longer understands it. Do we have a bug on file for this type of LdrLoadDll problem? I don't see an exact match in any of the bugs blocking the "injecteject" bug 1306406. Should I reach out to Avast?
(In reply to Chris Peterson [:cpeterson] from comment #53) > Do we have a bug on file for this type of LdrLoadDll problem? I don't see an > exact match in any of the bugs blocking the "injecteject" bug 1306406. I'm sure Avast isn't the only one; it's in the nature of AV software to want to filter DLL loads. I recall seeing bugs for specific instances over the years but not one for the problem in general. > Should I reach out to Avast? I'll leave that up to you and Jim. (Maybe it's pointless if we're planning to take a harder stance on external DLLs anyway)
This problem appears massively after 56.0.1 32-bit Firefox is automatically upgraded to 64-bit. The problem now is that when user startup Firefox no longer shows a crash, but instead displays an error pop-up window for the application (see attachment). At which point Firefox can not open any page. After re-install 32-bit Firefox works properly. The application error code is 0x80000003.
Attached image error message
yxu, can you still reproduce this crash? In which Firefox versions? Do you have the Lenovo "OneKey Theater" software installed? Unfortunately, there's not much we can do. We can't block Lenovo's crashing DLL. I landed a speculative fix in Beta 56b13 (comment 43) that seemed to make the crashes "go away". There have only been 17 crash reports in the last week, all from people running Firefox < 56b13, suggesting my speculative fix worked (or moved the crash to a different signature): https://crash-stats.mozilla.com/search/?signature=~acrt_RtlGenRandom&product=Firefox&platform=Windows&date=%3E%3D2017-11-13T14%3A33%3A20.000Z&date=%3C2017-11-20T14%3A33%3A20.000Z&_sort=-date&_facets=signature&_facets=version&_facets=platform_pretty_version&_columns=date&_columns=signature&_columns=version#facet-version
Flags: needinfo?(yxu)
(In reply to Chris Peterson [:cpeterson] from comment #57) > yxu, can you still reproduce this crash? In which Firefox versions? Yes I can reproduce, but unfortunately not on my computer, I try to reach some users, they will appear above I said the kind of situation. > Do you have the Lenovo "OneKey Theater" software installed? This is not all of them have this software installed or said can not find in the control panel, but it seems that all users are Lenovo machines & win7 system. The problem began to appear in large numbers after we upgrade 32-bit Firefox to 64 bits in 56.0.1, for the user, the performance is that they can't use Firefox after upgraded. I think the previous fixes came into effect, but it seemed to turn the problem into another. Because in the past this problem led to the crash at startup, and now Firefox did not crash, but pop up the window in my attachment above. No collapse means we have no way of tracking.
Flags: needinfo?(yxu)
(In reply to yxu from comment #58) > > Do you have the Lenovo "OneKey Theater" software installed? > > This is not all of them have this software installed or said can not find in > the control panel, but it seems that all users are Lenovo machines & win7 > system. Do you know many people hitting this crash? How can we estimate how many people are affected? If this is a very common problem, we could add a Firefox stub installer check for Lenovo software and give those users 32-bit Firefox. > The problem began to appear in large numbers after we upgrade 32-bit Firefox > to 64 bits in 56.0.1, for the user, the performance is that they can't use > Firefox after upgraded. I think the previous fixes came into effect, but it > seemed to turn the problem into another. Because in the past this problem > led to the crash at startup, and now Firefox did not crash, but pop up the > window in my attachment above. No collapse means we have no way of tracking. That is unfortunate. I don't know what we can do for users we have already upgraded to 64-bit. They will keep crashing until they re-install 32-bit Firefox. We can't automatically downgrade them to 32-bit if they can't run Firefox (to receive the new update). btw, I have been watching for _acrt_RtlGenRandom crash reports and sending instructions for re-installing 32-bit Firefox to people whose crash reports included their email address. Unfortunately, it sounds like I am missing those people who crash before Firefox sends a crash report. :(
(In reply to Chris Peterson [:cpeterson] from comment #59) > (In reply to yxu from comment #58) > If this is a very common problem, we could add a Firefox stub installer > check for Lenovo software and give those users 32-bit Firefox. Yes, this is a common problem, but I am not sure whether this issue is only related to Lenovo Onekey. Because there is no crash report, just the application error code(0x80000003). According to this article(http://iknow.lenovo.com/detail/dc_R0039.html), use Lenovo Onekey software needs to be installed corresponding graphics driver, sound card driver, Energy Management, Lenovo SlideNav (some software may not be available) and OneKey Theater at the same time. I think it is necessary to add a Firefox stub installer check for Lenovo software because this problem has become a major problem for Chinese users. Also in Bug 1274659, we may need to suspend this part of the user 32-bit to 64-bit migration. > That is unfortunate. I don't know what we can do for users we have already > upgraded to 64-bit. They will keep crashing until they re-install 32-bit > Firefox. We can't automatically downgrade them to 32-bit if they can't run > Firefox (to receive the new update). I have wirte a FAQ article of this problem in the Chinese user's community. If the user encountered the problem, they should be able to understand how to solve.
(In reply to yxu from comment #60) > Yes, this is a common problem, but I am not sure whether this issue is only > related to Lenovo Onekey. Because there is no crash report, just the > application error code(0x80000003). According to this > article(http://iknow.lenovo.com/detail/dc_R0039.html), use Lenovo Onekey > software needs to be installed corresponding graphics driver, sound card > driver, Energy Management, Lenovo SlideNav (some software may not be > available) and OneKey Theater at the same time. I think it is necessary to > add a Firefox stub installer check for Lenovo software because this problem > has become a major problem for Chinese users. Also in Bug 1274659, we may > need to suspend this part of the user 32-bit to 64-bit migration. Do you know how we can check whether the Lenovo software is installed or enabled? Is there a certain DLL the stub installer or auto updater should look for? > I have wirte a FAQ article of this problem in the Chinese user's community. > If the user encountered the problem, they should be able to understand how > to solve. Thanks! Should we promote your FAQ on Mozilla social media or create a SUMO page? We currently only have a comment in the Firefox 56 release notes: "Startup crashes with 64-bit Firefox on Windows 7, for users of Lenovo's "OneKey Theater" software for IdeaPad laptops. To fix this crash, please re-install 32-bit Firefox" https://www.mozilla.org/en-US/firefox/56.0/releasenotes/
(In reply to yxu from comment #60) > Yes, this is a common problem, but I am not sure whether this issue is only > related to Lenovo Onekey. Because there is no crash report, just the > application error code(0x80000003). Ted, do you know why Breakpad is not able to catch these "unknown software exception 0x80000003" crashes in firefox.exe? Exception code 0x80000003 implies corrupted system files or buggy anti-virus hooks. We used to catch these Lenovo users' crash reports when SQLite called Windows' rand_s() API. Removing the call to rand_s() turned these crashes into exception code 0x80000003 instead of just crashing in other Firefox code. There is a screenshot of the exception dialog in attachment 8929992 [details].
Flags: needinfo?(ted)
Blocks: 1420251
I've opened bug 1420251 to track the new crash, we can continue the discussion there.
Flags: needinfo?(ted)
(In reply to Chris Peterson [:cpeterson] from comment #53) > Should I reach out to Avast? Did you take any action regarding this? Even if we don't contact them, it might be worth getting a bug on file for the InjectEject project.
Flags: needinfo?(cpeterson)
(In reply to Marco Castelluccio [:marco] from comment #64) > (In reply to Chris Peterson [:cpeterson] from comment #53) > > Should I reach out to Avast? > > Did you take any action regarding this? > Even if we don't contact them, it might be worth getting a bug on file for > the InjectEject project. No. I didn't reach out to them. I can file a new bug for InjectEject. Based on comment 51 and 52, it sounds like the issue is that Avast does something that prevents our DLL blocklist from blocking ActiveDetect64.dll and WindowsApiHookDll64.dll.
Flags: needinfo?(cpeterson)
Depends on: 1444030
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: