Closed Bug 1843977 Opened 1 year ago Closed 1 year ago

ESR 115.0.2 crashes with Beijing Qihu Technology modules

Categories

(External Software Affecting Firefox :: Other, defect, P1)

All
Windows 7

Tracking

(firefox-esr115 115+ fixed, firefox115 unaffected, firefox116 unaffected, firefox117 unaffected)

RESOLVED FIXED
Tracking Status
firefox-esr115 115+ fixed
firefox115 --- unaffected
firefox116 --- unaffected
firefox117 --- unaffected

People

(Reporter: mccr8, Assigned: handyman)

Details

(Keywords: crash)

Attachments

(1 obsolete file)

Crash report: https://crash-stats.mozilla.org/report/index/e3d8c503-4c02-4fee-805d-098780230717

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0  xul.dll  mozilla::Vector<mozilla::BufferList<InfallibleAllocPolicy>::Segment, 1, InfallibleAllocPolicy>::empty const  mfbt/Vector.h:559
0  xul.dll  mozilla::BufferList<InfallibleAllocPolicy>::IterImpl::IterImpl  mfbt/BufferList.h:187
0  xul.dll  mozilla::BufferList<InfallibleAllocPolicy>::Iter const  mfbt/BufferList.h:325
0  xul.dll  PickleIterator::PickleIterator  ipc/chromium/src/base/pickle.cc:87
1  xul.dll  IPC::MessageReader::MessageReader  ipc/chromium/src/chrome/common/ipc_message_utils.h:133
1  xul.dll  mozilla::PProfilerParent::OnMessageReceived  ipc/ipdl/PProfilerParent.cpp:695
2  mozglue.dll  Mutex::Unlock  memory/build/Mutex.h:129
2  mozglue.dll  AutoLock<Mutex>::~AutoLock  memory/build/Mutex.h:186
2  mozglue.dll  arena_dalloc  memory/build/mozjemalloc.cpp:3759
2  mozglue.dll  BaseAllocator::free  memory/build/mozjemalloc.cpp:4547

This is an odd-looking crash. It happens almost exclusively on ESR 115, and almost always on Windows 7. There are lots of distinct install times. It is mostly a startup crash. Almost all of the crashes are on the address 0xe5e5e511, which could be jemalloc poison, but it is hard to say.
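
As a rough way to reason about the "could be jemalloc poison" guess, here is a minimal sketch that checks whether a crash address lies near a pointer filled with a poison byte. It assumes the poison byte is 0xe5 and that a window of 0x1000 bytes is a reasonable "near" threshold; the helper name `looks_like_poison` and both parameters are illustrative, not anything from crash-stats or mozjemalloc.

```python
# Assumption: freed memory is filled with the byte 0xE5, so a stale pointer
# read from freed memory would look like 0xE5E5E5E5 on a 32-bit build.
POISON_BYTE = 0xE5


def looks_like_poison(addr: int, pointer_bytes: int = 4, slack: int = 0x1000) -> bool:
    """Return True if addr is within `slack` bytes of an all-poison pointer."""
    poisoned_ptr = int.from_bytes(bytes([POISON_BYTE]) * pointer_bytes, "little")
    return abs(addr - poisoned_ptr) <= slack


crash_address = 0xE5E5E511
print(looks_like_poison(crash_address))        # True
print(hex(0xE5E5E5E5 - crash_address))         # 0xd4
```

The address sits 0xd4 bytes below 0xe5e5e5e5, so the heuristic matches, but a match like this is only a hint and does not prove a use-after-free.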

They all have PProfilerParent::OnMessageReceived in the stack. Are that many people on ESR really using the profiler?

Summary: Crash in [@ mozilla::Vector<T>::empty] → Crash in [@ mozilla::Vector<T>::empty] in PProfilerParent::OnMessageReceived

It looks like this is specifically for the Stop message in the profiler.

The spike in 115.0.2 specifically makes me strongly suspect something related to third-party modules. Full changelog:
https://hg.mozilla.org/releases/mozilla-esr115/pushloghtml?changeset=872b5aae170ffa755f89d0c6a9ebc380af677e9a

65% of the crashes have the locale zh-cn.

I opened a couple of crashes, and they all had libzdtp.pdb, SafeWrapper32.pdb, and chromesafe.pdb from Beijing Qihu Technology Co., Ltd. in the modules. I'll just go ahead and unhide this then. Maybe an addon is generating bogus IPC messages because it wasn't updated?

Group: dom-core-security
Component: IPC → Other
Product: Core → External Software Affecting Firefox
See Also: → 1841751
Flags: needinfo?(yjuglaret)

There are other top crashes on ESR 115.0.2. I'll just add them in here, unless it would be useful to file separate bugs. All of the ones I looked at had a similar set of Qihu modules.

Crash Signature: [@ mozilla::Vector<T>::empty] → [@ mozilla::Vector<T>::empty] [@ nsDocShell::MoveLoadingToActiveEntry ]
Summary: Crash in [@ mozilla::Vector<T>::empty] in PProfilerParent::OnMessageReceived → ESR 115.0.2 crashes with Beijing Qihu Technology modules
Severity: -- → S1
Priority: -- → P1

This level of crashiness is a dot-release driver and puts our Release->ESR migration at risk.

The [@ hwndForDOMWindow ] crashes have libzdtp64.pdb, uniconft64.pdb, chromesafe64.pdb.

Crash Signature: [@ mozilla::Vector<T>::empty] [@ nsDocShell::MoveLoadingToActiveEntry ] → [@ mozilla::Vector<T>::empty] [@ nsDocShell::MoveLoadingToActiveEntry ] [@ mozilla::AppWindow::Initialize ] [@ nsChromeTreeOwner::nsChromeTreeOwner ] [@ nsCOMPtr<T>::~nsCOMPtr | mozilla::AppWindow::Initialize ] [@ IPC::Message::type ] [@ hwndForDOMWindow ]

Just looking at the top 10 crashes on ESR 115.0.2, all of the top 10 signatures, except possibly OOM | small, seem to have these Qihu modules in the crash reports, at least when I looked at 4 or 5 crashes for each. That's at least 70% of all crashes.

Only the ESR builds are affected.

Both x86 and amd64 Windows 7 are affected. (Possibly other Windows OS, but the percentage of non-Win7 is extremely small.)

Hardware: x86 → All
See Also: → 1842882
See Also: → 1706031

The faulty DLLs seem to be libzdtp64.dll 1.0.0.1300 for x64 builds and libzdtp.dll 1.0.0.1270 for x86 builds. The international product 360 Total Security does not provide these versions yet as far as I can tell (I had libzdtp64.dll 1.0.0.1190 and libzdtp.dll 1.0.0.1180). Maybe the Chinese version of the product is required. Could this incident be caused by Qihoo currently serving an update for their DLL?

We could take the patch from bug 1842088 which finishes fixing the blocklist tests on Windows 7, but for the moment it's unclear how this incident would relate to what's fixed there.

Flags: needinfo?(yjuglaret)

Adding some folks from 360.cn.

Would any of you know who could help us track down this issue?

During the channel meeting, [:Aryx] suggested that Qihoo 360 could be applying the same modifications to 115.0.2 release and 115.0.2 ESR binaries, if it was only relying on version numbers. And indeed, in the crash reports, we can find evidence that the code around the crashing instruction in 115.0.2esr has been altered to set up hooks, at offsets that only make sense in 115.0.2 release binaries of xul.dll.

Below is the legit code around the crashing instruction in 115.0.2esr:

0:090> u xul+0x1288351
xul!mozilla::AppWindow::Initialize+0x265 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671] [inlined in xul!nsAppShellService::JustCreateTopWindow+0x751 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671]]:
00007fff`0a868351 89542430        mov     dword ptr [rsp+30h],edx
00007fff`0a868355 89442428        mov     dword ptr [rsp+28h],eax
00007fff`0a868359 c744242000000000 mov     dword ptr [rsp+20h],0
00007fff`0a868361 31d2            xor     edx,edx
00007fff`0a868363 4531c9          xor     r9d,r9d
00007fff`0a868366 e8555ae1fe      call    xul!nsDocShell::InitWindow (00007fff`0967ddc0)
00007fff`0a86836b 85c0            test    eax,eax
00007fff`0a86836d 782e            js      xul!nsAppShellService::JustCreateTopWindow+0x79d (00007fff`0a86839d)

Below is the altered code found in a crash report:

0:000> u xul+0x1288351
xul!mozilla::AppWindow::Initialize+0x265 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671] [inlined in xul!nsAppShellService::JustCreateTopWindow+0x751 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671]]:
000007fe`e64a8351 89542430        mov     dword ptr [rsp+30h],edx
000007fe`e64a8355 89442428        mov     dword ptr [rsp+28h],eax
000007fe`e64a8359 c7442420000000e9 mov     dword ptr [rsp+20h],0E9000000h
000007fe`e64a8361 93              xchg    eax,ebx
000007fe`e64a8362 7ff6            jg      xul!nsAppShellService::JustCreateTopWindow+0x75a (000007fe`e64a835a)
000007fe`e64a8364 d6              ???
000007fe`e64a8365 c9              leave
000007fe`e64a8366 e8555ae1fe      call    xul!nsDocShell::InitWindow (000007fe`e52bddc0)
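
To make the difference between the two listings explicit, the following sketch compares the instruction bytes copied from them, starting at xul+0x1288359 (the `mov dword ptr [rsp+20h]` instruction). The helper name `changed_offsets` is illustrative; the byte strings are transcribed from the two disassemblies above.

```python
BASE_OFFSET = 0x1288359  # offset in xul.dll of the first byte compared

# Bytes from the legit 115.0.2esr listing: mov, xor, xor, call.
legit = bytes.fromhex("c744242000000000" "31d2" "4531c9" "e8555ae1fe")
# Bytes from the crash report listing over the same range.
altered = bytes.fromhex("c7442420000000e9" "93" "7ff6" "d6" "c9" "e8555ae1fe")


def changed_offsets(a: bytes, b: bytes, base: int) -> list[int]:
    """Return the offsets (relative to the module) where the two buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must cover the same range")
    return [base + i for i, (x, y) in enumerate(zip(a, b)) if x != y]


diff = changed_offsets(legit, altered, BASE_OFFSET)
print([hex(o) for o in diff])
# ['0x1288360', '0x1288361', '0x1288362', '0x1288363', '0x1288364']
```

Exactly five consecutive bytes differ, starting at offset 0x1288360, which is the size of a near jump with a 32-bit displacement.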

xul.dll was altered with the intention to add a hook at offset 0x1288360. This is what the same bytes as above would yield if given the proper start of instruction:

0:000> u xul+0x1288360
xul!mozilla::AppWindow::Initialize+0x274 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671] [inlined in xul!nsAppShellService::JustCreateTopWindow+0x760 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671]]:
000007fe`e64a8360 e9937ff6d6      jmp     000007fe`bd4102f8:

But in xul.dll for 115.0.2esr, this offset of 0x1288360 doesn't make sense. It falls in the middle of mozilla::AppWindow::Initialize, where it is not aligned on an instruction start, so we crash shortly after reaching the altered code.
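
The two observations above can be checked by hand with a small sketch: decode the five altered bytes as an x86-64 `jmp rel32` (opcode 0xE9 followed by a little-endian signed displacement, relative to the end of the instruction), and confirm that the hook offset lies inside the 8-byte `mov` that starts at xul+0x1288359 in the ESR build. The function name `decode_jmp_rel32` is illustrative.

```python
import struct


def decode_jmp_rel32(code: bytes, address: int) -> int:
    """Decode `E9 rel32` located at `address` and return the jump target."""
    if len(code) < 5 or code[0] != 0xE9:
        raise ValueError("not a jmp rel32 instruction")
    (rel32,) = struct.unpack_from("<i", code, 1)  # signed 32-bit displacement
    return (address + 5 + rel32) & 0xFFFFFFFFFFFFFFFF


# The five altered bytes, at the address shown in the crash report listing.
patched = bytes.fromhex("e9937ff6d6")
target = decode_jmp_rel32(patched, 0x000007FEE64A8360)
print(hex(target))  # 0x7febd4102f8

# In 115.0.2esr the mov at 0x1288359 is 8 bytes long, so the hook offset
# 0x1288360 is its last byte rather than the start of an instruction.
mov_start, mov_len, hook_offset = 0x1288359, 8, 0x1288360
inside_mov = mov_start < hook_offset < mov_start + mov_len
print(inside_mov)  # True
```

The decoded target matches the `jmp 000007fe`bd4102f8` shown by the debugger.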

However, this offset makes a lot of sense in 115.0.2 release, where it is the beginning of a function (mozilla::AppWindow::WindowActivated):

0:087> u xul+0x1288360
xul!mozilla::AppWindow::WindowActivated [/builds/worker/checkouts/gecko/xpfe/appshell/AppWindow.cpp @ 3090]:
00007ffe`f1b48360 4156            push    r14
00007ffe`f1b48362 56              push    rsi
00007ffe`f1b48363 57              push    rdi
00007ffe`f1b48364 53              push    rbx
00007ffe`f1b48365 4883ec28        sub     rsp,28h
00007ffe`f1b48369 4889cf          mov     rdi,rcx
00007ffe`f1b4836c 488d7110        lea     rsi,[rcx+10h]
00007ffe`f1b48370 488b4110        mov     rax,qword ptr [rcx+10h]

So, this evidence strongly supports [:Aryx]'s suggestion.
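
To illustrate why keying a hook on the version number alone is fragile, here is a purely hypothetical sketch of a guard that compares the bytes at the hook offset against the expected function prologue before patching. Nothing here reflects how the Qihoo modules actually work; `safe_to_hook` and `EXPECTED_PROLOGUE` are made-up names, and the byte strings are transcribed from the listings above.

```python
# First five bytes of mozilla::AppWindow::WindowActivated in 115.0.2 release:
# push r14; push rsi; push rdi; push rbx
EXPECTED_PROLOGUE = bytes.fromhex("4156565753")


def safe_to_hook(bytes_at_offset: bytes, expected: bytes) -> bool:
    """Hypothetical guard: only hook if the target bytes match what we expect."""
    return bytes_at_offset[: len(expected)] == expected


# Bytes found at xul+0x1288360 in each build, per the listings.
release_bytes = bytes.fromhex("41565657534883ec28")
esr_bytes = bytes.fromhex("0031d24531c9e8555ae1fe")

print(safe_to_hook(release_bytes, EXPECTED_PROLOGUE))  # True
print(safe_to_hook(esr_bytes, EXPECTED_PROLOGUE))      # False
```

With a check like this, the same offset would be accepted for the 115.0.2 release binary and rejected for 115.0.2esr, where the bytes belong to the middle of a different function.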

There are no crashes with 115.0.3esr, a version-bump-only release. Closing this out as fixed.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Duplicate of this bug: 1842882

Copying crash signatures from duplicate bugs.

Crash Signature: ] → ] [@ mozilla::dom::ContentParent::ChildID] [@ mozilla::dom::ContentProcessManager::AddContentProcess] [@ new]
See Also: → 1844461
Assignee: nobody → davidp99
Attachment #9344737 - Attachment is obsolete: true

Copying crash signatures from duplicate bugs.

Crash Signature: ] [@ mozilla::dom::ContentParent::ChildID] [@ mozilla::dom::ContentProcessManager::AddContentProcess] [@ new] → ] [@ mozilla::dom::ContentParent::ChildID] [@ mozilla::dom::ContentProcessManager::AddContentProcess] [@ new] [@ _tailMerge_hid.dll | mozilla::dom::NodeInfo::~NodeInfo] [@ mozilla::dom::Element::SetAttr] [@ mozilla::dom::Event::~Event] [@ mozilla::dom::N…
Crash Signature: ] [@ mozilla::dom::ContentParent::ChildID] [@ mozilla::dom::ContentProcessManager::AddContentProcess] [@ new] [@ _tailMerge_hid.dll | mozilla::dom::NodeInfo::~NodeInfo] [@ mozilla::dom::Element::SetAttr] [@ mozilla::dom::Event::~Event] [@ mozilla::dom::N… → ] [@ mozilla::dom::ContentParent::ChildID] [@ mozilla::dom::ContentProcessManager::AddContentProcess] [@ new] [@ _tailMerge_hid.dll | mozilla::dom::NodeInfo::~NodeInfo] [@ mozilla::dom::Element::SetAttr] [@ mozilla::dom::Event::~Event] [@ mozilla:…
Crash Signature: mozilla::dom::NodeInfo::~NodeInfo] [@ nsAppShellService::JustCreateTopWindow] → mozilla::dom::NodeInfo::~NodeInfo] [@ nsAppShellService::JustCreateTopWindow] [@ mozilla::PProfilerParent::OnMessageReceived]
See Also: → 1872242