Closed
Bug 1394483
Opened 7 years ago
Closed 7 years ago
FireFox/Win7 misaligned access in xul.dll with high cpu usage
Categories
(Core :: General, defect)
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: mynova, Unassigned)
Details
Attachments
(4 files)
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170814072924
Steps to reproduce:
This issue has been like that for years on my Windows7 systems, finally decided to file a bz to see if anyone can help. Since searching online found very similar reports:
http://forums.mozillazine.org/viewtopic.php?f=38&t=2580941
without followup all the way through..
I've tried the refresh Firefox, and that did not help, after a few hours or day of usage, with maybe 3 windows and 60 tabs, or sometimes less, I find FF in 100% cpu loop, on a 4 cpu system, that appears as 25% cpu.
Actual results:
FF locks up on 1 cpu and there are 2 threads one called: Firefox.exe+0x62ed
and another xul.dll+0x680db6.
the firefox thread has this stack:
0, ntoskrnl.exe!_misaligned_access+0x1833
1, ntoskrnl.exe!KeSetEvent+0x3f2
2, dxgmms1.sys!VidMmInterface+0x292b0
3, dxgmms1.sys!VidMmInterface+0x2bfe9
4, dxgkrnl.sys!TdrResetFromTimeout+0x139f0
5, dxgkrnl.sys!TdrResetFromTimeout+0x129ff
6, win32k.sys!EngUnlockDirectDrawSurface+0x2c62
7, ntoskrnl.exe!KeSynchronizeExecution+0x3a23
8, wow64win.dll+0x416aa
9, wow64win.dll+0x27c9d
10, wow64.dll!Wow64SystemServiceEx+0xd7
11, wow64cpu.dll!TurboDispatchJumpAddressEnd+0x2d
12, wow64.dll!Wow64SystemServiceEx+0x1ce
13, wow64.dll!Wow64KiUserCallbackDispatcher+0x204
14, wow64win.dll+0x32caa
15, ntdll.dll!KiUserCallbackDispatcher+0x1f
16, wow64win.dll+0x4019a
17, wow64win.dll+0x32877
18, wow64win.dll+0x1c137
19, wow64.dll!Wow64SystemServiceEx+0xd7
20, wow64cpu.dll!TurboDispatchJumpAddressEnd+0x2d
21, wow64.dll!Wow64SystemServiceEx+0x1ce
22, wow64.dll!Wow64LdrpInitialize+0x42a
23, ntdll.dll!RtlUniform+0x6e6
24, ntdll.dll!EtwEventSetInformation+0x1da09
25, ntdll.dll!LdrInitializeThunk+0xe
26, gdi32.dll!D3DKMTRender+0x15 (No unwind info)
27, igd10iumd32.dll!OpenAdapter10_2+0x4d18f (No unwind info)
28, igd10iumd32.dll!OpenAdapter10_2+0x4d5e4 (No unwind info)
29, igd10iumd32.dll!OpenAdapter10_2+0x131074 (No unwind info)
30, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x2e910 (No unwind info)
31, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x55538 (No unwind info)
32, xul.dll!PrintJSStack+0x13ed35 (No unwind info)
33, xul.dll!PrintJSStack+0x13ea56 (No unwind info)
34, xul.dll!PrintJSStack+0x13d8bf (No unwind info)
35, xul.dll!workerlz4_decompress+0x8a3a7 (No unwind info)
36, xul.dll!workerlz4_decompress+0x239c9c (No unwind info)
37, xul.dll+0x3618b2 (No unwind info)
38, xul.dll+0x3610b5 (No unwind info)
39, xul.dll+0x361000 (No unwind info)
40, xul.dll+0x360fdd (No unwind info)
41, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
42, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
43, user32.dll!GetThreadDesktop+0x185 (No unwind info)
44, user32.dll!GetThreadDesktop+0x1e1 (No unwind info)
45, ntdll.dll!KiUserCallbackDispatcher+0x2e (No unwind info)
46, user32.dll!DispatchMessageW+0xf (No unwind info)
47, xul.dll+0x35f406 (No unwind info)
48, xul.dll+0x35f125 (No unwind info)
49, xul.dll+0x1a0e3c (No unwind info)
50, xul.dll+0x30c233 (No unwind info)
51, xul.dll+0x427abf (No unwind info)
52, xul.dll+0x427a8e (No unwind info)
53, xul.dll+0x50257e (No unwind info)
54, xul.dll+0x50230d (No unwind info)
55, xul.dll+0x5022c2 (No unwind info)
56, xul.dll+0x502044 (No unwind info)
57, xul.dll+0x5d16e6 (No unwind info)
58, xul.dll+0x697f14 (No unwind info)
59, xul.dll+0x697ed6 (No unwind info)
60, firefox.exe+0x1bb2 (No unwind info)
61, firefox.exe+0x6277 (No unwind info)
62, kernel32.dll!BaseThreadInitThunk+0x12 (No unwind info)
63, ntdll.dll!RtlInitializeExceptionChain+0x63 (No unwind info)
64, ntdll.dll!RtlInitializeExceptionChain+0x36 (No unwind info)
and the xul.dll has this stack:
0, ntoskrnl.exe!_misaligned_access+0x1833
1, ntoskrnl.exe!KeWaitForMultipleObjects+0xf5d
2, ntoskrnl.exe!KeWaitForMutexObject+0x19f
3, ntoskrnl.exe!_misaligned_access+0xbd4
4, ntoskrnl.exe!_misaligned_access+0x186d
5, ntoskrnl.exe!IoFreeErrorLogEntry+0x287
6, ntdll.dll!RtlAllocateHeap+0x76
7, 0x50bb16b140
8, 0x774be17213223ad0
9, 0x3800000000
10, 0x774be17213223ad0
11, 0x3800000000
12, 0x11457e67c
13, 0x2ce50b18774be40c
14, 0x6a028c00000035
15, 0x230001006a0000
16, 0x1457edbc0008904e
17, 0x6a000000000000
18, 0x226ecf8c8
19, 0x5800000000
20, 0x1457e65000c6156b
21, 0x6a0000604fa9e7
22, 0x5000000000
23, 0x13223ad0bb16b140
24, 0x1457e65c0000000b
25, 0x50604449ea
26, 0x604449831457e680
27, 0x1457e6f41457e6a0
28, 0x1457e70800000000
29, 0x13223ad01457e6e4
30, 0x1457e6d0bb16b140
31, 0x1457e6a06043bb6a
32, 0xbb79cdd01457e6f4
33, 0xbb79cdd0bb546120
34, 0x605180c01457e6e8
35, 0x1457e6b000000000
36, 0x80b0175c1457e738
37, 0x1457e72000000000
38, 0x80b0175c26ecfc88
39, 0x1457e6e800000001
40, 0x1457e7401457e6e4
41, 0xbb546120603d82f4
42, 0x1457e740bb79cdd0
43, 0x603d82f41457e768
44, 0x80b0175c1457e708
45, 0x1f
46, ntdll.dll!RtlAllocateHeap+0x76 (No unwind info)
47, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x2b2f97 (No unwind info)
48, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x1fcf9a (No unwind info)
49, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x1fcf33 (No unwind info)
50, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x1f411a (No unwind info)
51, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x1908a4 (No unwind info)
52, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x19017c (No unwind info)
53, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x77508 (No unwind info)
54, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x66803 (No unwind info)
55, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x6644d (No unwind info)
56, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x663c2 (No unwind info)
57, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x66140 (No unwind info)
58, igdusc32.dll!USC::CShaderInstruction::ResetPrecision+0x2f83d (No unwind info)
59, igdusc32.dll!USC::CShaderInstruction::SupportsPredicate+0x6951 (No unwind info)
60, igdusc32.dll!USC::CShaderInstruction::SupportsPredicate+0x90c2 (No unwind info)
61, igdusc32.dll!USC::CShaderInstruction::SupportsPredicate+0xa340 (No unwind info)
62, igd10iumd32.dll!OpenAdapter10_2+0x128739 (No unwind info)
63, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x4a7e8 (No unwind info)
64, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x1d2f2 (No unwind info)
65, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x19d02 (No unwind info)
66, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x10859 (No unwind info)
67, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x929 (No unwind info)
68, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x2bac1 (No unwind info)
69, d3d11.dll!D3D11CreateDeviceAndSwapChain+0xa9c (No unwind info)
70, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x3b39e (No unwind info)
71, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x3afde (No unwind info)
72, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x3a8b7 (No unwind info)
73, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x1811f (No unwind info)
74, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x7829 (No unwind info)
75, d3d11.dll!D3D11CreateDeviceAndSwapChain+0x7758 (No unwind info)
76, xul.dll!PrintJSStack+0xcd79d (No unwind info)
77, xul.dll!PrintJSStack+0xcd768 (No unwind info)
78, xul.dll!PrintJSStack+0xcd307 (No unwind info)
79, xul.dll!PrintJSStack+0xcd2db (No unwind info)
80, xul.dll!PrintJSStack+0x13c56a (No unwind info)
81, xul.dll!PrintJSStack+0x143899 (No unwind info)
82, xul.dll+0x23ddfe (No unwind info)
83, xul.dll+0x30d3bb (No unwind info)
84, xul.dll+0x23ce62 (No unwind info)
85, xul.dll+0x683447 (No unwind info)
86, xul.dll+0x427abf (No unwind info)
87, xul.dll+0x427a8e (No unwind info)
88, xul.dll+0x680e8c (No unwind info)
89, xul.dll+0x680dc1 (No unwind info)
90, kernel32.dll!BaseThreadInitThunk+0x12 (No unwind info)
91, mozglue.dll!mozilla::BaseTimeDurationPlatformUtils::TicksFromMilliseconds+0xd3 (No unwind info)
92, ntdll.dll!RtlInitializeExceptionChain+0x63 (No unwind info)
93, ntdll.dll!RtlInitializeExceptionChain+0x36 (No unwind info)
Expected results:
FF should not lock up on such a high cpu/misasligned access.
Above stacks are with FF 55.0.3 (32bit), running on 64bit Windows 7 system.
Comment 4•7 years ago
|
||
Hi Nova,
Can you please retest this issue using a new Firefox profile to eliminate custom settings as a possible cause (https://goo.gl/L92rLJ)?
If the issue is still reproducible, can you try collecting a performance profile to see if anything stands out there? The link below has directions for how to collect and submit a profile.
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem
Flags: needinfo?(mynova)
Ok, created new profile, this takes a few days to reproduce..
(lost all my user/passwords, as they were stored in that other profile :) will report back when/if it reproduces.
btw, after new profile, added 7 LEGACY addons, without which I might as well switch browser all together :)
1) Adobe Acrobad DC
2) Classic Theme Restorer
3) Google Translator for FF
4) NoScript
5) Right Links
6) Session Manager
7) Tab Mix Plus
given that all these will be desupported soon, I doubt they are the cause, but if u see no value, i.e. one must debug with no addons, then let me know, and we can close this bug, as i said.. not possible to use the browser w/o these adds and if the price is to restart every few days, so be it :)
Nova
ok, it reproduced, but then went to the performance report url, and it says one must run the nightly developer build?? oh dear.. I suggest you change the reply to state that up front, i.e. that one must run the dev build.. i'll try that now.. each time it formats env and must reinstall all things again...
Comment 8•7 years ago
|
||
Thanks for testing!
You didn't lost your data, are still in the other profile. In order to return to the old profile you can use Profile Manager. Here are more information about this: https://goo.gl/VTpi82.
So you reproduce the issue with all the 7 legacy add-ons installed?
I forget to mention that you have to use the Nightly build in order to collect the performance profile, sorry about that. You can download it from here: https://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/
You don't have to install it, you can just extract the zip file. Also you can open it with the created profile where you reproduce the issue.
Please let me know if you need more information.
no worries, yes, I need these (now legacy) addons to get things done.. issue reproduce relatively quick.
u said no need to install it from which file to extract the exe? found this:
https://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/firefox-57.0a1.en-US.win32.installer.exe but i guess u mean some .zip to extract just the binary and replace in the existing tree?
Comment 10•7 years ago
|
||
Here is the file: https://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/firefox-57.0a1.en-US.win32.zip
After you extract the file, there is a "firefox" folder where you can find the "firefox.exe" file. For opening it with a profile you should follow the steps from the provided link in comment 8.
Reporter | ||
Comment 11•7 years ago
|
||
Renamed firefox.exe to firefox-original.exe, and attempted to put in place the firefox.zip from above zip file, however it does not execute, says: Error: Platform version '55.0.3' is not compatible with minVersion >= 57.0a1 MaxVersion < 57.0a1, so it sounds like to debug i would need to fully install/use version 57.. this is very strange, isn't there some 'tracing' or instrumentation or way to split the binary into threads and figure out what is spinning on cpu, this is a tight loop left untouched it runs for hours or days, and cpu becomes overheated.. on Chrome there is a thread for each tab , so it is relatively easy to spot mis-behaving tabs.. With all these years and efforts, there is no such methods on Firefox? I really want to help, but running development version of firefox, it sounds like all the addons will not work? Sorry for all this noise, I posted what I thought should be more than enough to isolate the cpu loop on the first posting of the bug, and as found others ran into the same issue long ago: http://forums.mozillazine.org/viewtopic.php?f=38&t=2580941, and i've been having it for over a YEAR, until had enough then decided to file this bug.. but it may be more than a normal/avg user can handle.
My suggestion is to add an instrumentation/tracing method that user , any user, with any version enables, then runs their test, and uploads the test/traces...
Nova
Reporter | ||
Comment 12•7 years ago
|
||
Went back to default firefox.exe binary, and within 2hrs it is now spinning 100% (of 1 cpu), so that is 25% solid. Ran process monitor (sysinternals), and it has Stack Summary, so uploading few screenshot of that. then in that trace it has many attempts to access a volume with serial EACB-A7C3 (which happens to be my Win7 OS disk C:), except it calls it as follows:
Date & Time: 09/02/2017 12:16:17 AM
Event Class: File System
Operation: QueryInformationVolume
Result: BUFFER OVERFLOW
Path: C:\Program Files (x86)\Mozilla Firefox\firefox.exe
TID: 5508
Duration: 0.0000064
VolumeCreationTime: 11/05/2012 1:57:40 AM
VolumeSerialNumber: EACB-A7C3
SupportsObjects: True
VolumeLabel: WinX
This volume label is no longer used for over 2 years, since I've moved to Windows 7 from WinXp.
There is a save-Stack for that exact call, and I'll upload that as well. that buffer overflow looks bad i would think, and it's in the cpu loop, repeatedly called upon...
another line shows this call failing with similar result:
Date & Time: 09/02/2017 12:16:17 AM
Event Class: File System
Operation: QueryAllInformationFile
Result: BUFFER OVERFLOW
Path: C:\Program Files (x86)\Mozilla Firefox\firefox.exe
TID: 5508
Duration: 0.0000069
CreationTime: 09/01/2017 11:40:06 PM
LastAccessTime: 08/28/2017 7:37:01 PM
LastWriteTime: 08/28/2017 7:37:01 PM
ChangeTime: 09/01/2017 11:46:55 PM
FileAttributes: A
AllocationSize: 532,480
EndOfFile: 532,432
NumberOfLinks: 1
DeletePending: False
Directory: False
IndexNumber: 0x6500000003c878
EaSize: 0
Access: Read Attributes, Synchronize
Position: 0
Mode: Synchronous IO Non-Alert
AlignmentRequirement: Word
stack for that is called: Stack-for-Bufferoverflow2.CSV
will upload these files in ffcpu.zip see if that helps anything.
Flags: needinfo?(mynova)
Reporter | ||
Comment 13•7 years ago
|
||
output from Stack Summary, as well as proc mon, showing buffer overflow, not sure if that is the issue though.
Reporter | ||
Comment 14•7 years ago
|
||
when wrote: "volume label is no longer used " the volume name CHANGED from WinX to Win7, I'm surprised that FF still uses the OLD name, can't find it anywhere, on dev manager/storage the new name Win7 appears as the LABEL, but the same Volume ID number is there. I have removed/reinstalled FF many times after moving from WinXP to Win7, so not sure how it still has this WinX volume name. Searching registry for WinX or the volume ID yielded no hits.
Reporter | ||
Comment 15•7 years ago
|
||
Current volume info:
Volume in drive C is Win7
Volume Serial Number is EACB-A7C3
Reporter | ||
Comment 16•7 years ago
|
||
I've pressed SUSPEND on that firefox.exe in case u guys want to debug this spinner.. (new profile and freshly reset), i suspect it might be the OS disk having been upgraded (3 years ago) from winxp to win7, however, no other app/program is having any issues, only FF spinning on cpu.. and printing the old disk volume label, which i have no clue how it has access to...
Comment 17•7 years ago
|
||
From comment 9 I understand that the issue is only reproducible with the legacy add-ons installed, right? So probably there is an add-on that caused this. Can you please disable one by one to see which add-on caused this?
Also, did you managed to collect a performance profile as I suggested in comment 4?
Flags: needinfo?(mynova)
Reporter | ||
Comment 18•7 years ago
|
||
1) I've attempted to collect, however, see comment 11, it seems that i must reinstall completely firefox 57... just dropping in the binary did not do it.
2) It seems the firefox framework requires a specific binary to debug, and the current one, that's spinning 100% cannot be diagnosed, even though procmon clearly shows a tight loop as above screenshot.. debugging such issues should be the foundation of any design/implementation, esp. one that supports addons.. More I use Chrome, it seems there a TAB is a PROCESS, and such issues are much easier to handle..
3) I did not try without the addons, b/c it would be useless funtionality-wise, however, for DEBUG purposes did that just now, started in SAFEMODE, and all addons show as disabled.
I'll report back in few day.
Reporter | ||
Comment 19•7 years ago
|
||
in safe-mode, with all addons disabled, it did not reproduce in few days, now enabled 1/2 and rest disabled.
Comment 20•7 years ago
|
||
Moving to Core:XUL for more attention.
Component: Untriaged → XUL
Product: Firefox → Core
Updated•7 years ago
|
Component: XUL → General
Comment 21•7 years ago
|
||
(In reply to Nova from comment #19)
> in safe-mode, with all addons disabled, it did not reproduce in few days,
> now enabled 1/2 and rest disabled.
How are things performing now?
Reporter | ||
Comment 22•7 years ago
|
||
I'm not a developer, so having a hard time tracking this enable/disabled on 4 PCs. If there was a simple way to debug this I would do it. Either drop in a debug firefox.exe or somehow "create dump file" (Process Hacker has this option), when the issue reproduces. It might be add-on related, in which case I would argue it is useless to chase this in this debugging methods, since all these addons will be obsoleted by newer firefox soon enough anyways.
If you do have a simple method to debugging, let me know, I cannot uninstall firefox and reinstall a full nightly build, I'm simple an end user, not a developer and naively thought that the original installation allows for debugging of cpu loops.
Flags: needinfo?(mynova)
Comment 23•7 years ago
|
||
(In reply to Nova from comment #22)
> I'm not a developer, so having a hard time tracking this enable/disabled on
> 4 PCs. If there was a simple way to debug this I would do it. Either drop in
> a debug firefox.exe or somehow "create dump file" (Process Hacker has this
> option), when the issue reproduces. It might be add-on related, in which
> case I would argue it is useless to chase this in this debugging methods,
> since all these addons will be obsoleted by newer firefox soon enough
> anyways.
>
> If you do have a simple method to debugging, let me know, I cannot uninstall
> firefox and reinstall a full nightly build, I'm simple an end user, not a
> developer and naively thought that the original installation allows for
> debugging of cpu loops.
Thanks for trying to figure it out :)
A goal for us at this point was identifying the addon that might be the culprit, so that we could identify the addon author.
I don't have a simple debugging to offer, other than enabling addons one at a time until it becomes obvious which addon is contributing or causing the issue.
Totally understand if you don't have time to test this out given the timeframe required to get the issue to manifest. I'm going to close this INCOMPLETE. Feel free to re-open if you do find time to test it.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Reporter | ||
Comment 24•7 years ago
|
||
Thanks. Hopefully you come up with some "fencing" framework where by such debugging can be all programatic, and misbehaving addons can simply be immediately flagged, as they consume cpu/IO/memory, then terminated. Left to the users (authors of addons and users of them) this is a neverending debug in my view, since it's not just the addon, it can be a "combination of addons" coupled with the exact pages that are browsed.. in short, possibilities unlimited :)
You need to log in
before you can comment on or make changes to this bug.
Description
•