Closed
Bug 526038
Opened 15 years ago
Closed 14 years ago
win/nsIconChannel.cpp miscompiled by -O2 optimization
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: swsnyder, Unassigned)
Details
Attachments
(5 files, 1 obsolete file)
98.85 KB,
image/jpeg
|
Details | |
1.53 KB,
patch
|
joe
:
review-
|
Details | Diff | Splinter Review |
11.41 KB,
text/plain
|
Details | |
11.38 KB,
text/plain
|
Details | |
994 bytes,
patch
|
Details | Diff | Splinter Review |
With the source files from the Firefox 3.6b1 tarball, compiling ~/modules/libpr0n/decoders/icon/win/nsIconChannel.cpp with the -O2 optimization level (VS2005/SP1) causes a runtime protection fault when a null pointer is dereferenced. The fault occurs here, on the assignment: http://mxr.mozilla.org/mozilla1.9.2/source/modules/libpr0n/decoders/icon/win/nsIconChannel.cpp#644 Example: --- mozilla/modules/libpr0n/decoders/icon/win/Makefile.in.original 2009-10-29 18:11:06.000000000 -0400 +++ mozilla/modules/libpr0n/decoders/icon/win/Makefile.in 2009-11-02 18:07:24.000000000 -0500 @@ -57,6 +57,7 @@ # we don't want the shared lib, but we want to force the creation of a static lib. FORCE_STATIC_LIB = 1 include $(topsrcdir)/config/rules.mk +OS_COMPILE_CXXFLAGS += -O2
Comment 1•15 years ago
|
||
Does the same problem exist in VS2008?
Reporter | ||
Comment 2•15 years ago
|
||
(In reply to comment #1) > Does the same problem exist in VS2008? Unknown; I only have VS2005/SP1 (Win7SDK).
I don't understand -- is the variable actually null? If so, that line will indeed cause a crash.
Reporter | ||
Comment 4•15 years ago
|
||
The variable _retval is defined as "nsIInputStream**", a pointer to a pointer. _retval itself is not null, but the pointer that it points to is null. It is on this 2nd dereferencing that the fault occurs. It'll take me a couple of hours to come up with the exact x86 asm code, but it will look a lot like this: mov eax, _retval ; load valid pointer mov ecx, [eax] ; load null pointer mov [ecx], inStream ; <--- protection fault occurs here, on deref
Where that should be mov [eax], inStream ? I can believe a compiler bug, but it's just that this sequence appears all over the place.. (the NS_ADDREF(...)) stuff. Could be something earlier tripping it up.
Reporter | ||
Comment 6•15 years ago
|
||
This graphic is a screen shot of the VS2005 debugger at the time of the fault. It probably conveys the information better than my description. Speaking of inadequate descriptions, I neglected to describe how to trigger the fault. Select Tools>>Options... from the Firefox 3.6b1 menu. The fault occurs in the course of displaying the 1st tab in the Options dialog box. No problem seen when win/nsIconChannel.cpp is compiled with -O1 optimization.
Comment 7•14 years ago
|
||
see Windows Event Viewer, The fault occurs at C Runtime(mozcrt19.dll, msvcr80.dll, msvcr90.dll). same problem ? http://forums.mozillazine.org/viewtopic.php?f=42&t=1708215 (In reply to comment #1) > Does the same problem exist in VS2008? I tried ... good: VS2005(jemalloc, no PGO) bad : VS2008(jemalloc, no PGO) bad : VS2005 express(no jemalloc, no PGO) bad : VS2008 express(no jemalloc, no PGO)
Reporter | ||
Comment 8•14 years ago
|
||
(In reply to comment #7) > Created an attachment (id=434854) [details] > use memmove(), no memcpy() I don't see from the code that the source and destination buffers overlap, so how does memmove() work better than memcpy()?
Comment 9•14 years ago
|
||
(In reply to comment #8) > I don't see from the code that the source and destination buffers overlap, Me too. But I have not verified it yet. I think ... 1. This problem is caused by a compiler bug. 2. When buffers overlapping, memcpy() is not stable. 3. So, can not predict what will happen. 4. Then, at least, memmove() should be used for safety.
Updated•14 years ago
|
Attachment #434854 -
Flags: review?(joe)
Comment 10•14 years ago
|
||
Comment on attachment 434854 [details] [diff] [review] use memmove(), no memcpy() These arrays aren't even close; one is on the heap, the other the stack. I'm going to need some convincing to switch from memcpy to memmove. (Other than "this works.")
Attachment #434854 -
Flags: review?(joe) → review-
Comment 11•14 years ago
|
||
I think I'm seeing this with SeaMonkey trunk as well (self-compiled build): <http://crash-stats.mozilla.com/report/index/26aad785-e7b8-41e8-9f65-b3ec32100604> Found this bug through: <http://forums.mozillazine.org/viewtopic.php?p=8510915#p8510915> Note: 1. SeaMonkey doesn't build with libxul (yet). 2. The SeaMonkey Download Manager shows an icon matching the download's file type and a trash can icon for each download. 3. If there is no entry in the Download Manager: no crash. If a new download is added or the Download Manager is opened and contains entries from previous runs (e.g. profile previously used with official nightlies): crash. 100% reproducible. I'm using --enable-optimize="-O2 -GALFT -GS- -Gs -Zc:wchar_t- -fp:fast -arch:SSE".
Reporter | ||
Comment 12•14 years ago
|
||
(In reply to comment #11) > I'm using --enable-optimize="-O2 -GALFT -GS- -Gs -Zc:wchar_t- -fp:fast > -arch:SSE". This "fixes" the reported problem by forcing -O1 optimization for the icon code only, overriding the -O2 switch in your global opt flags: diff -Nru mozilla.original/modules/libpr0n/decoders/icon/win/Makefile.in mozilla/modules/libpr0n/decoders/icon/win/Makefile.in --- mozilla.original/modules/libpr0n/decoders/icon/win/Makefile.in 2010-01-05 22:35:27.000000000 -0500 +++ mozilla/modules/libpr0n/decoders/icon/win/Makefile.in 2010-01-07 10:03:20.000000000 -0500 @@ -59,4 +59,5 @@ FORCE_STATIC_LIB = 1 include $(topsrcdir)/config/rules.mk +OS_COMPILE_CXXFLAGS += -O1
Comment 13•14 years ago
|
||
(In reply to comment #12) > (In reply to comment #11) > > I'm using --enable-optimize="-O2 -GALFT -GS- -Gs -Zc:wchar_t- -fp:fast > > -arch:SSE". > > This "fixes" the reported problem by forcing -O1 optimization for the icon code > only, overriding the -O2 switch in your global opt flags: Confirmed, thanks! I first tried removing the L from GALFT but that only changed the stack trace (so that it ended in msvcr80.dll/memcpy instead of imgicon.dll/imgicon.dll@0x2d95).
Comment 14•14 years ago
|
||
Comment on attachment 409839 [details]
JPEG showing state of code at fault
This screenshot shows _retval is 0x00000000, which is bad. But you'd need to be able to walk the stack to find out why it is zero.
Comment 15•14 years ago
|
||
On VS2005, the source code of memcpy and memmove in C-runtime library is the same when BLD_ASM is set to the value 1. Although the compiler may replace the function call of memcpy with inline instructions, if it will result in better performance.
Comment 16•14 years ago
|
||
(In reply to comment #15) > On VS2005, the source code of memcpy and memmove in C-runtime library is the > same when BLD_ASM is set to the value 1. Sorry, original C-runtime libraries (CRTs) have been built with BLD_ASM=0. My private builds have always used CRTs which are built with BLD_ASM=1. Even if using such CRTs, the builds often crashed when I opened "Options" dialog and clicked "Applications" button. The crash has not occurred yet since I disabled the optimization of nsIconChannel::MakeInputStream() about 3 months ago in the following way. #ifdef _MSC_VER #pragma optimize("", off) #endif nsresult nsIconChannel::MakeInputStream(nsIInputStream** _retval, PRBool nonBlocking) { (snip) } #ifdef _MSC_VER #pragma optimize("", on) #endif
Comment 17•14 years ago
|
||
(In reply to comment #15) I had the wrong idea about BLD_ASM. BLD_ASM is used to specify whether to rebuild the assembler sources.
Comment 18•14 years ago
|
||
inline memcpy causes a crash. memmove is not inline instruction. and /O1 not contains /Oi, /O2 contains /Oi.
Attachment #459836 -
Flags: review?(joe)
Comment 19•14 years ago
|
||
Comment on attachment 459836 [details] [diff] [review] use #pragma function(memcpy) I'm okay with this (with some added comments to explain them), but first I'd like to see precisely where things go wrong - a comparison, for example, of the -O1 and the -O2 generated asm, and where it goes wrong. Once I have that I'll be more than happy to grant this r+. (Probably we should put these #pragmas in #ifdefs for the precise version of MSVC that is broken, too.)
Attachment #459836 -
Flags: review?(joe)
Comment 20•14 years ago
|
||
Comment 21•14 years ago
|
||
Comment 22•14 years ago
|
||
VS2005: crash VS2008: crash VS2010: icon disappeared asm code is created by VS2005. but I still don't understand asm code. sorry.
Attachment #459836 -
Attachment is obsolete: true
Updated•14 years ago
|
Attachment #461212 -
Flags: review?(joe)
Comment 23•14 years ago
|
||
Comment on attachment 461212 [details] [diff] [review] use #pragma function(memcpy) This is going to require some pretty in-depth ASM debugging, which I don't have the time to do right now :( (The reason I don't just r+ this and let it in is because I worry that we've got a latent bug that just happens to be triggered on various MSVC versions. Until we can prove that's not the case, I don't want to be papering over things like this.)
Attachment #461212 -
Flags: review?(joe)
Comment 24•14 years ago
|
||
I fixed a memory corruption issue with this code in bug 590116. Could someone confirm whether this still happens?
Comment 25•14 years ago
|
||
The problem I reported in comment 11 (SM trunk) is gone, so this is WFM for me, but Steve reported the bug for 1.9.2 and needs to verify his case (FF 3.6 branch) from comment 6.
Reporter | ||
Comment 26•14 years ago
|
||
(In reply to comment #25) > The problem I reported in comment 11 (SM trunk) is gone, so this is WFM for me, > but Steve reported the bug for 1.9.2 and needs to verify his case (FF 3.6 > branch) from comment 6. I can no longer reproduce the problem I reported, so I can't test your fix. This is with the v3.6.13 Build #3 tarball. Beyond using the contemporary Firefox code, I am also using a different Windows SDK. For the original report I was using the WinSDK v6.0a that came with VS2005. I'm still using the same compiler, but now with WinSDK v7.0. These 2 changes are all I can think of in the past 13 months to explain why I can no longer miscompile nsIconChannel::MakeInputStream() at -O2 optimization.
Comment 27•14 years ago
|
||
In that case it's very likely bug 590116 fixed it.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
Reporter | ||
Comment 28•14 years ago
|
||
(In reply to comment #27) > In that case it's very likely bug 590116 fixed it. Oh, OK. I didn't realize that the fix was already checked in for v1.9.2.x. I was trying to miscompile the code as a baseline to testing what I thought was a proposed fix.
You need to log in
before you can comment on or make changes to this bug.
Description
•