Closed Bug 141834 Opened 23 years ago Closed 23 years ago

Source file path not being reported by Talkback for MozillaTrunk builds

Categories

(SeaMonkey :: Build Config, defect, P1)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.1beta

People

(Reporter: jay, Assigned: netscape)

References

Details

Attachments

(2 files, 6 obsolete files)

I'm not sure when this started happening, but I'm guessing within the last couple of weeks. Lately, a lot of Talkback reports for MozillaTrunk crashes on Windows are missing the file paths in the stack. We used to get stacks like this for almost ALL crashes on Windows: nsImageBoxListener::OnStopDecode [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsImageBoxFrame.cpp line 877] imgRequestProxy::FrameChanged [d:\builds\seamonkey\mozilla\modules\libpr0n\src\imgRequestProxy.cpp line 295] imgRequest::FrameChanged [d:\builds\seamonkey\mozilla\modules\libpr0n\src\imgRequest.cpp line 337] imgContainer::Notify [d:\builds\seamonkey\mozilla\modules\libpr0n\src\imgContainer.cpp line 459] nsTimerImpl::Process [d:\builds\seamonkey\mozilla\xpcom\threads\nsTimerImpl.cpp line 342] handleMyEvent [d:\builds\seamonkey\mozilla\xpcom\threads\nsTimerImpl.cpp line 381] PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c line 591] PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c line 524] _md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c line 1072] nsAppShellService::Run [d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsAppShellService.cpp line 309] main1 [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp line 1305] main [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp line 1640] WinMain [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp line 1658] WinMainCRTStartup() KERNEL32.DLL + 0x17d08 (0x77e97d08) But now they look like this (not all of them, but A LOT): nsImageBoxListener::OnStopDecode [nsImageBoxFrame.cpp line 877] imgRequestProxy::FrameChanged [imgRequestProxy.cpp line 294] imgRequest::FrameChanged [imgRequest.cpp line 336] imgContainer::Notify [imgContainer.cpp line 459] nsTimerImpl::Fire [nsTimerImpl.cpp line 357] nsTimerManager::FireNextIdleTimer [nsTimerImpl.cpp line 591] nsAppShell::Run [nsAppShell.cpp line 134] nsAppShellService::Run [nsAppShellService.cpp line 451] main1 [nsAppRunner.cpp line 1447] main [nsAppRunner.cpp line 1783] WinMain [nsAppRunner.cpp line 1801] WinMainCRTStartup() kernel32.dll + 0x1eb69 (0x77e7eb69) Did something change in the build automation or are one of the machines configured differently so that we don't get the full path for the files in the stack? We have also ALWAYS had this problem with MacOS crashes (only the file name is captured, no path)...and We have never seen any file names or their paths for Linux crashes (which might be a Talkback issue). If anyone on the build team has insight into this problem, and a possible fix...that would be great! The path for the filenames is important for our topcrash analysis.
Cc'ing Leaf and Loan...not sure if Chris is still in charge of build config or not.
Um, I'm still here. I don't know exactly how talkback works so I have no idea why source paths would stop working. I suspect the gmake switch was to blame. Help?
Keywords: helpwanted
Leaf: Can you help here ?
my first guess is that the switch to jebus builds (with gmake) caused this... we can compare the forge builds of last week with the jebus builds on the same day and see if there's a difference.
Leaf, Shiva and I compared talkback reports and confirmed that the file paths disappeared following the 4/25 switch to gmake. Linux crash data has been devoid of file paths for quite a while (I believe they have been built with gmake for some time). The bonus to fixing this bug may be that we can get file paths for Linux crash data also.
I should have mentioned this before: this bug needs to be moved to bugscape. It's definitely not a mozilla bug as talkback lives in the netscape tree.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → INVALID
This is a direct impact of move from nmake to gmake. Talkback is extensively used by Mozilla community. We need to fix this problem. I am re-opening this bug.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
cc'ing chofmann and brendan.
Shiva, fair enough. I was anticipating that the problems were in the ns tree since that's where the talkback makefiles live. However, there's a good chance that I'm wrong. So, little known fact (to me, anyway) about win32 talkback builds. They're profile builds; they contain debug info. Linux nightly builds are not profile builds which explains why their stacks are missing the good stuff. This patch should cause the win32 gmake builds to override the optimize cflags & ldflags like the nmake builds do. So there will be no adding custom optimize flags to MOZ_PROFILE, MOZ_COVERAGE or NS_TRACE_MALLOC builds.
FWIW this is a problem a problem for the VC++ debugger, too. And for normal debug builds. I moved to gmake but had to move back to nmake because of the missing path information in the debug symbols. It makes it nearly impossible to debug code. I thought this stemmed more from the fact that the dll/exe and pdb files were generated in the same directory as the source, thus VC++ doesn't put the path information in. The nmake system used win32_[d|o].obj directory for the output. I am not sure this true. I looked but couldn't find any cl/link option that would force path info on.
Comment on attachment 82753 [details] [diff] [review] Override flags instead of appending for profile, coverage & trace-malloc builds r=leaf, bake on the trunk!
Attachment #82753 - Flags: review+
Jim: Can you sr= ?
afaik sr's are listed here: http://www.mozilla.org/hacking/reviewers.html and build config changes don't really need sr's (if they get a blessing from cls...)
The patch has been checked into the trunk.
Status: REOPENED → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.0
Chris: That patch didn't help. We are still not getting the source path information.
Coredump: Empirical evidence seems to substantiate the claim that msvc doesn't include path info into the .pdb file if the object file is in the same directory as the source directory. I couldn't find this documented anywhere either. So after peterl set me straight on win32 debugging expectations, I was able to partially reproduce the debugging problem in my profile build. Where peterl wasn't having any sourcefiles automatically popup when double-clicking on a stack function for any function, they were popping up for select libraries in my build (js & view to be specific). After a few days of testing, I believe that this is because I (as always) build using a separate objdir. When I built in the srcdir, I was able to reproduce the problem he showed me. I spent sometime making sure that select libraries were being built in the same flags in the exact same order as the nmake build but it didn't seem to make a difference. I'm suspecting that msvc is having problems handling sourcefiles with relative paths that contain .. . I'm hoping that if I can build using an absolute path, then there will be no problem finding the correct files. However, as I pointed out in bug 135589, doing an absolute path build on win32 is painful due to the differences between dos drive paths and cygwin's alternate / mount point. I managed to get the Makefiles generated with a drive-colon srcdir & top_srcdir set but cygwin make still attempts to pass a /cygdrive/ path to msvc which thinks it's another flag. As a stopgap, mingw make seems to handle that situation properly but now xpidl is dying with the following error: Parse of c:/root/mozilla/xpcom/base/nsIErrorService.idl failed: Invalid argument. I'm not sure if it's failing on the parsing of the idl file or the -Ic:/root/mozilla/xpcom/base . The unfortunate alternative to that mess is to resurrect the .OBJ feature for win32 builds which I am loathe to do.
Using absolute file paths for the sourcefiles fixed the debugging problem. Now to figure out how to work around this particular cygwin make deficiency. I really do _not_ want to add another toolkit dependency. Worst case, I'll have to bug the cygwin team to incorporate the mingw changes and have us require a specific version of cygwin make which is a bit less annoying.
Depends on: 148731
This patch sync's our setting of PDBFILE to match the nmake build. Having 2 /PDB: is apparently not a bug. It also adds WIN32_EXE_LDFLAGS which is used for setting link flags to be used only for executables. It also fixes the problem of intermediate libraries being listed twice when building dlls. The additional list is only needed for static builds (where it actually only shows up once).
I wasn't able to find any docs on how to make cygwin make properly handle dos path names. There is a --win32 flag but that just invokes cmd.exe instead of sh which tends to break on some of our commands and still doesn't properly resolve the path issue. After hitting several dead-ends, I decided to see how much of a hit a wrapper script would be. The wrapper is used to call all native win32 progs, including xpidl & nsinstall. It's a simple shell/perl script that does a /cygdrdive/x/ -> x:/ translation on all args. The script would have been all perl except that activestate perl doesn't grok /cygdrive/ either so calling the script with a full path would fail. If anyone knows how to do the equivalent of `perl -pe 's/\/cygdrive\/(.)\//\1:\//g;'` in awk or sed, let me know. That should speed things up signficantly. Right now, we're looking at about a 14% hit. :-( My profile build times went from 1:28 to 1:41. With the patch, objdir builds use full absolute paths and the debugging info shows up properly. Scrdir builds are still out of luck since make notes that the file is in the current dir and automatically strips the extra path info from the automatic variables ($< $@ etc) that we use in the build rules.
I figured out the proper magic for using sed. I also removed a couple of unnecessary wrapper uses but it only seemed to drop the build times by a minute. Wrapping every cl, nsinstall & xpidl use seems to be the big timesink. The cl usage might be easy to remove by using $(srcdir)/%.cpp instead of $< though that's going to break us in the one place that we actually use multiple dirs in VPATH. bryner: The non-cygdrive condition that we discussed last night is still going to be a problem. configure bombs out when creating autoconf.mk if it was called using a dos-path. (Internal configure macros use : as a delimiter between the name of the generated file and the name of the template file for cases where the template isn't <generated-file>.in .)
Attachment #86218 - Attachment is obsolete: true
This patch fixes the debugging problem for objdir builds and srcdir builds. I'm seeing between a zero (after reboot) & 6 min difference between pre-patch & post-patch build times. Things to note: * We have to call configure using the full path so that $(srcdir) contains the full path in the Makefiles. The rules have been changed for win32 so that it always use $(srcdir)/$*.{c,cpp} as the sourcefile name even when the file is in the cwd. This works around both the /cygdrive issue and the msvc's path info optimization. * We still use the wrapper when calling rc.exe & nsinstall. nsinstall frequently takes multiple args and is called all over the tree so changing each calling site is going to be expensive. (I'll have to check the logs again but I think the rc wrapping can go.) * nspr & ldap still use the wrappers for all native win32 progs. Neither nspr nor ldap uses the acoutput-fast.pl script to speed up the substitution of @srcdir@ into their Makefiles so, makefile substitution will break if we use dos-paths. * In the handful of directories where we copy srcfiles from another directory in lieu of using VPATHs, we have to now copy those files to $(srcdir) so that the default rules can find them.
Attachment #82753 - Attachment is obsolete: true
Attachment #86020 - Attachment is obsolete: true
Attachment #86409 - Attachment is obsolete: true
This patch didn't work for me ... all I did was install the patch and do make -f client.mk. When I debugged I started hitting F11 all over the place (starting in content/html/content/src). It seems to find the xpcom header files all right, but when it needed to go to nsHTMLAttributes.cpp, it asked me for the filename because it couldn't find it. I will try a clobber build now.
I made another go at it, on a hunch, but this time it didn't find anything automatically at all. It asked for xpcom and it even asked me to find content/html/content/src/nsFormSubmission.cpp, even though the file I had started with and put a breakpoint in was in that directory (nsHTMLFormElement). Now back to clobber.
Oog, my system, sorry: - WinXP - gmake build - running Mozilla from the bash shell, starting MSVC++ 6.0 and attaching to the process, then opening a relevant file and setting a breakpoint - I am using as cygwin tools wherever possible--zip and cvs and such--so perhaps this factors into whatever is happening.
Building with the patch, and doing clobber build, earns me "CYGWIN_WRAPPER@: command not found" in perl -I. ./aboutime.pl ...
After applying the patch, you need to run autoconf in mozilla/,mozilla/nsprpub & mozilla/directory/c-sdk to recreate the configure files. I found another bug while watching the game though. Cygwin's perl misinterprets -Ic:/root/mozilla/config as 2 paths so I'll have to check for cygwin perl and wrap those calls to do unix paths instead. I also tested the case where you build under the cygwin / mountpoint so you have no dos drive letters. If the objdir is under /, it works fine. If the srcdir is under /, then we have problems.
Target Milestone: mozilla1.0 → mozilla1.0.1
This just keeps getting uglier and uglier...
mcs: can you take a look at the ldap portion of the patch? wtc: can you take a look at the nspr portion of the patch?
Comment on attachment 87173 [details] [diff] [review] make cygwin perl happy Not pretty, but the proposed changes under mozilla/directory/c-sdk/ look OK to me. dmose did the autoconf work there, so he should take a look too if he hasn't already.
You may want to look at 105271 for a similar issue about source file paths not being included in object files on Windows and OS/2 debug builds. I came up with a fix for the gmake build . Unfortunately it requires the latest gmake 3.79.1 that supports $(if condition,then,else). That version of gmake is available on Windows, but not on OS/2 (only 3.76.1 has been ported to OS/2). I'm not aware of any cygwin for OS/2, so your proposed fix here would probably only fix the Windows case.
Comment on attachment 87173 [details] [diff] [review] make cygwin perl happy r=leaf
Attachment #87173 - Flags: review+
ldap portion looks ok to me too.
Comment on attachment 87173 [details] [diff] [review] make cygwin perl happy r=wtc on the NSPR part of the patch. Chris, NSPR can also be built with the MKS Toolkit. Could you set CYGWIN_WRAPPER only when we are using Cygwin tools?
Comment on attachment 87173 [details] [diff] [review] make cygwin perl happy A nit: in nsprpub/configure.in, you have >+ topsrcdir=$srcdir >+ CYGWIN_WRAPPER='$(topsrcdir)/build/cygwin-wrapper' Can we just use $srcdir? CYGWIN_WRAPPER='$(srcdir)/build/cygwin-wrapper'
Actually, I was pondering fixing it so that we didn't have to wrap all of NSPR & LDAP at all. First, we'd have to use the acoutput-fast.pl script that we use on the main tree so that srcdir is set to the proper value in the Makefiles, which would also speed up Makefile generation noticably. Then we'd have to do a similar hack in rules.mk to use $(srcdir) instead of $< to work around the cygwin make deficiency. Not sure if those steps are better or worse than just wrapping everything. The 'topsrcdir=$srcdir' is leftover from another test. It's not needed or used in the current patch as that CYGWIN_WRAPPER setting doesn't get evaluated until buildtime.
Attachment #86612 - Attachment is obsolete: true
Attachment #87173 - Attachment is obsolete: true
Comment on attachment 87380 [details] [diff] [review] Use acoutput-fast.pl instead of cygwin-wrapper for nspr & ldap Chris, I must say that this patch is kind of ugly and I like the CYGWIN_WRAPPER solution better. Does the CYGWIN_WRAPPER solution significantly slow the build down?
As I said before, wrapping every win32 native program for the entire build had about a 15% hit. I don't have exact figures for nspr or ldap. I'd have to clean my tree and test again.
Here's the times: current nspr build - 1:27 w/ cygwin wrapper - 1:34 (+8%) w/ acoutput-fast - 1:15 (-14%) That run was using a modified nsprpub/gmakefile.win so it included the time of the configure run which was why acoutput-fast.pl had such a big win. Not counting the time it takes to run configure & generate the makefiles, it looks like: current - 1:09 wrapped - 1:16 (+10%) acoutput- 1:10 (+1%) Each time was the best of 3 runs.
Attachment #87380 - Attachment is obsolete: true
Comment on attachment 87173 [details] [diff] [review] make cygwin perl happy This change is going to require a distclean so it will be landed in a carpool on Monday.
Attachment #87173 - Attachment is obsolete: false
The carpool landed this morning. Talkback reports made with yesterday afternoon's win32 build contain the full srcpath & line number info, http://climate.netscape.com/reports/incidenttemplate.cfm?bbid=7455164 .
Status: ASSIGNED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Target Milestone: mozilla1.0.1 → mozilla1.1beta
Other than / vs \. Everything else seems to be ok. Marking the bug as verified with respect to talkback.
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: