Closed
Bug 24312
Opened 25 years ago
Closed 24 years ago
some component has bad registration behavior (optimized mac RegXPCom crashes)
Categories
(Core :: XPCOM, defect, P3)
Tracking
()
Future
People
(Reporter: jj.enser, Assigned: sfraser_bugs)
References
Details
(Keywords: crash, helpwanted)
Attachments
(2 files)
"PowerPC unmapped memory exception at NQDGetPort+00048"
Bumping up the memory partition from 10 to 12 MB puts the app back on its feet.
| Reporter | ||
Updated•25 years ago
|
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Target Milestone: M13
| Reporter | ||
Comment 1•25 years ago
|
||
fixed. minimum/preferred memory size bumped up to 16MB in
mozilla/xpcom/tools/registry/macbuild/RegXPCOM.mcp
| Reporter | ||
Comment 2•25 years ago
|
||
Reopening since RegXPCOM crashed again in today's verification.
I will bump the executable memory up to 18 MB as a temporary fix, but we need a
better long-term solution.
Status: RESOLVED → REOPENED
| Reporter | ||
Updated•25 years ago
|
Resolution: FIXED → ---
Comment 4•25 years ago
|
||
This is purely a mac binary executable issue. I have no clue. We should do what
we do to navigator in this regard (i.e., if we increase nav, we should increase
regxpcom too).
Scc/Simon Fraser, can you help?
Assignee: dp → scc
Target Milestone: M13 → M14
| Reporter | ||
Comment 5•25 years ago
|
||
adding sfraser to the cc list.
This is becoming critical, as we now still crash with a 20MB partition.
Comment 6•25 years ago
|
||
Assigning to Simon Fraser. Simon, scc is not in town until tomorrow. If you can
help jj that would be super.
Assignee: scc → sfraser
it might be useful to turn off all optimization, just to make sure we aren't
getting killed by the compiler trying to be too clever.
| Assignee | ||
Comment 8•25 years ago
|
||
Well, this is heinous.
I found a serious bug in nsLocalFileMac.cpp, where a full path handle was assumed
to be null terminated, when it was not. This could have caused all kinds of
random behaviour. Here's a diff for that:
Index: nsLocalFileMac.cpp
===================================================================
RCS file: /cvsroot/mozilla/xpcom/io/nsLocalFileMac.cpp,v
retrieving revision 1.3
diff -r1.3 nsLocalFileMac.cpp
940d939
< OSErr err;
942,944c941,942
< err = FSpGetFullPath(&mResolvedSpec, &fullPathLen, &fullPathHandle);
< *_retval = (char*) nsAllocator::Clone(*fullPathHandle, fullPathLen+1);
< DisposeHandle(fullPathHandle);
---
> (void)::FSpGetFullPath(&mResolvedSpec, &fullPathLen, &fullPathHandle);
> if (!fullPathHandle) return NS_ERROR_OUT_OF_MEMORY;
946c944,945
< ((*_retval)+fullPathLen)[0] = 0;
---
> char* fullPath = (char *)nsAllocator::Alloc(fullPathLen + 1);
> if (!fullPath) return NS_ERROR_OUT_OF_MEMORY;
947a947,952
> ::HLock(fullPathHandle);
> nsCRT::memcpy(fullPath, *fullPathHandle, fullPathLen);
> fullPath[fullPathLen] = '\0';
>
> *_retval = fullPath;
> ::DisposeHandle(fullPathHandle);
But even with this fix, RegXPCOM is still crashing in non-debug builds.
Status: NEW → ASSIGNED
Comment 9•25 years ago
|
||
cc-ing sdagley. Steve, are there other cases where strings are assumed to be
terminated? Steve, can you work with simon to fix these problems?
| Assignee | ||
Comment 10•25 years ago
|
||
dougt: the deal here is that FSpGetFullPath() returns a Mac handle, which is not
null-terminated. I used lxr to look for other calls to this function, and this is
the only one that looks bad.
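For readers following along, here is a minimal, portable sketch of the pattern the patch in comment 8 applies: copy exactly `len` bytes out of a buffer that carries no terminator, then write the terminator into the extra byte of the copy. `CloneWithTerminator` is a hypothetical name, and `std::malloc` stands in for `nsAllocator::Alloc`; nothing here is the actual nsLocalFileMac.cpp code. As the diff appears to show, the old code cloned `fullPathLen+1` bytes from a handle holding only `fullPathLen` bytes, which reads one byte past the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of Mozilla): copy `len` bytes from a buffer
// that is NOT null-terminated into a fresh, null-terminated C string.
// Returns nullptr on allocation failure; the caller frees with std::free().
char* CloneWithTerminator(const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);

    // Allocate len + 1 bytes so the terminator has a slot of its own.
    char* out = static_cast<char*>(std::malloc(len + 1));
    if (!out) return nullptr;

    // Read exactly len bytes from the source, never len + 1.
    if (len > 0) std::memcpy(out, data, len);

    // The terminator is written into the copy, not read from the source.
    out[len] = '\0';
    return out;
}
```

The point of the design is that the source length and the destination size are kept distinct: the source is read for `len` bytes, the destination is sized `len + 1`.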
| Assignee | ||
Comment 11•25 years ago
|
||
That patch above has been checked in to nsLocalFileMac.cpp. The quit crash
remains, however.
Comment 12•25 years ago
|
||
smfr, do you have a stack crawl?
| Assignee | ||
Comment 13•25 years ago
|
||
It crashes in a call to free() coming out of JS garbage collection.
| Assignee | ||
Comment 14•25 years ago
|
||
ok, we got this now. sdagley was writing one off the end of a buffer in
nsLocalFileMac.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Comment 16•25 years ago
|
||
Jan, is this reopened??
| Reporter | ||
Comment 17•25 years ago
|
||
I don't think so. I haven't seen RegXPCom crashing on the release build mac for a
while.
Jan just updated the keyword
Comment 18•25 years ago
|
||
actually, this did just happen again this morning. we'll have to wait and see
if it continues to be a problem before revisiting this.
Comment 19•25 years ago
|
||
Reopening. This happened again this morning. We have no mac verification
builds.
Severity: critical → blocker
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: M14 → M16
Comment 21•25 years ago
|
||
do you have a stack crawl?
Comment 22•25 years ago
|
||
what kind of stack crawl? output from stdlog?
Comment 23•25 years ago
|
||
And this shouldn't stop delivery of builds. If you run the browser, it should
generate the component.reg that you can bundle.
Comment 24•25 years ago
|
||
that would be fine. just attached the stdlog to this bug.
Comment 25•25 years ago
|
||
the mac goes into macsbug, you es out of it, various things time out, then you
try to do anything at all and it crashes. repeat as necessary. unless you
catch the regxpcom crash almost immediately, it is impossible to get a build.
I'll get a stdlog the next time it crashes.
| Assignee | ||
Comment 26•25 years ago
|
||
This really isn't my bug. Reassigning to xpcom owner.
Assignee: sfraser → dp
Status: REOPENED → NEW
Comment 30•25 years ago
|
||
not going to hold the tree for this, removing smoketest keyword.
Keywords: smoketest
| Reporter | ||
Comment 31•25 years ago
|
||
RegXPCOM didn't crash today during verif. builds... Scott, is it magic or did you
do something about it? (don't see any checkin though)
Comment 32•25 years ago
|
||
yes it did. there's a stdlog from this morning on the desktop. i just caught
it in time so we didn't have to respin and all the builds finished.
| Reporter | ||
Comment 33•25 years ago
|
||
ok, here's the current status:
- regXPCOM keeps crashing every morning when run against the mozilla build (2nd
attempt against ns build succeeds though)
- no stdlog possible as Macsbug sez "File system busy"
- unless we "es" from Macsbug quickly enough (within 5 minutes after the
crash), the entire build process times out and no builds are delivered to QA
Given all this, I removed the call to RegXPCOM from the verification build script
until it gets _really_ fixed.
This will allow the morning verification build to be delivered in a timely
manner, although it will significantly increase the startup time of the app.
Whiteboard: [dogfood+] → [dogfood+] regXPCOM removed from Mac build
Comment 34•25 years ago
|
||
Scott, can you provide status on this? It has been 3 days since this got
assigned to you and "blockers" get priority over everything else.
Comment 35•25 years ago
|
||
This shouldn't really be marked `blocker' since we have a work-around in place.
I think I agree with the `dogfood+' label. I'm kind of at a loss, so far.
Everyone who has looked at this bug has come up empty. I will continue to work
on it, but all easy avenues have been explored.
| Reporter | ||
Comment 36•25 years ago
|
||
Scott, the workaround is costly (startup time) and the fact that all "easy
avenues" have been explored is the very reason why we now need to dig further.
I just want to remind all that this has been bugging us for over 4 months now...
If dp agrees that we can live without running RegXPCOM in the Mac release build,
then we might as well mark this bug invalid and forget about it. Otherwise, let's
nail this baby!
We discussed trying to run a debug build of RegXPCOM against one of the
morning verification builds (optimized) where it usually crashes. Would that help
make some progress?
Comment 37•25 years ago
|
||
I would like to keep this a blocker. The goal here was that this step helps
improve installation and startup time immensely. Autoreg, a costly step, won't
happen on the customer's machine at all. Even if it did, it won't take as much
time, as no dlls would have changed.
Scott, I think we should try fixing it. You said this was optimized only. Do
you have a stack trace? I haven't seen any explanation from you of why this is failing.
Comment 38•25 years ago
|
||
mass re-assigning to my new bugzilla account
Assignee: scc → scc
Status: ASSIGNED → NEW
Updated•25 years ago
|
Status: NEW → ASSIGNED
| Assignee | ||
Comment 39•25 years ago
|
||
Why can't we just generate the components registry by just running the app once?
| Assignee | ||
Comment 40•25 years ago
|
||
(the app == mozilla or Netscape)
Comment 41•25 years ago
|
||
This is not a dogfood bug. According to dveditz@netscape.com, the pregenerated
reg file is being removed from the build. According to him, we should (or will
soon be in the state of) not be running RegXPCom anymore. Yes, whatever module
is causing the bustage should be found and fixed, but this is definitely _not_
dogfood. Taking RegXPCom out of the build process is the real fix to this
blocking situation, according to dveditz.
This bug needs to transform into a low priority bug to find and fix whatever
component it is that's screwing us up at the moment.
Please remove the [dogfood+] status.
The remaining low priority task can probably be assigned to rayw as the owner of
components ... once he knows which component is failing (dp suggests using a
binary search of the component space :-) he can reassign the bug to the owner of
the bad piece.
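The "binary search of the component space" dp suggests could be sketched as below. This is only an illustration under loud assumptions: `FindBadComponent` and the `crashes` callback are hypothetical, the callback is assumed to register only the given subset (for example in a scratch copy of the components folder) and report whether the run crashed, exactly one component is assumed to be at fault, and the crash is assumed to be reproducible on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Components = std::vector<std::string>;

// Hypothetical bisection helper. `crashes` is an assumed callback that tries
// to register only the given components and returns true if that run crashed.
// Assumes exactly one bad component and a reproducible failure.
std::string FindBadComponent(
    const Components& all,
    const std::function<bool(const Components&)>& crashes) {
    assert(!all.empty());

    std::size_t lo = 0;
    std::size_t hi = all.size();  // invariant: the culprit lies in [lo, hi)

    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;

        // Try only the first half of the remaining candidates.
        Components firstHalf(all.begin() + static_cast<std::ptrdiff_t>(lo),
                             all.begin() + static_cast<std::ptrdiff_t>(mid));

        if (crashes(firstHalf)) {
            hi = mid;  // culprit is in the first half
        } else {
            lo = mid;  // culprit must be in the second half
        }
    }
    return all[lo];
}
```

With N components this needs about log2(N) registration runs. It would not help if the crash depends on an interaction between components or on machine state rather than on one component.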
Comment 42•25 years ago
|
||
So... based on conversation and email exchange, I'm removing [dogfood+] status
from this bug, lowering the priority, changing the summary and target milestone,
re-assigning it to rayw, and cc'ing sgehani@netscape.com (samir). This is the
correct action, right? dp? dveditz? samir?
Assignee: scc → rayw
Severity: blocker → normal
Status: ASSIGNED → NEW
Summary: mac RegXPCom crashes during verification build → some component has bad registration behavior (optimized mac RegXPCom crashes)
Whiteboard: [dogfood+] regXPCOM removed from Mac build
Target Milestone: M16 → M17
Comment 43•25 years ago
|
||
Not so sure this is low priority. As dp points out, the pre-generated component
registry is shipped because the initial Mac browser startup time was rather
sluggish (due to autoreg).
A couple of reasons not to ship the pre-generated component registry:
1> So we pick up third party components (but, this happens anyways since we
check timestamps during autoreg every time we startup and hence pick up new
components -- needs clarification as to why it is a problem).
2> A larger registry means a larger footprint since we hold multiple copies of
the registry in memory as I recall based on a discussion with Dan (dveditz).
The argument here is that if we only install navigator (say without mail etc.)
then we only have the navigator components registered (smaller registry size ==
lower memory footprint). However, I don't know how relevant this really is: if
it is the case that the majority of the components are installed when we install
navigator only, the registry size savings may not be significant if we autoreg
at install time.
As of now, dp's argument is making a lot of sense and I believe we should be
shipping a pre-generated component registry. But, there's probably information
I'm missing that Dan/dp may fill in.
Comment 44•25 years ago
|
||
Renominating for dogfood. If it is important that we ship the Mac with a
component.reg then the build process must not regularly crash while creating
it.
I'm concerned about blaming a component for the crash -- if that were so why
wouldn't the product itself crash? Or perhaps it means the Component Registry
is being created in the *build* directory with all the test crap rather than a
deliver/staging area, which will bloat the registry horribly.
Keywords: dogfood
Comment 45•25 years ago
|
||
Putting on [dogfood-] radar. Was the beta1 performance contingent upon this?
We must be equal to beta1 performance.
Whiteboard: [dogfood-]
Comment 46•25 years ago
|
||
Unless we're planning on doing away with autoreg, we'll have to ship a
components.reg file. The whole point of regxpcom is to improve startup time.
startup time on the mac is atrocious.
we can take this out of the build process, it already is out of the build
process. as soon as we do the next set of performance tests though, people are
going to scream about the mac startup time and we'll be right back here again.
| Assignee | ||
Comment 47•25 years ago
|
||
No-one answered my earlier question. Why can't we just run Mozilla/Netscape to
generate the component registry?
Comment 48•25 years ago
|
||
We've had long discussions over email. The outcome was that we should keep
RegXPCOM'ing at build time and ship a minimal Components Registry on the Mac.
JJ is aware of this. Rayw was cc'ed on the discussion. If we don't ship a
Components Registry we will regress in performance compared to beta1. Please
consult dp if you have further questions. Nominating for nsbeta2 since it is
arguably not dogfood.
JJ,
Is it possible for us to execute on Simon's suggestion in your build system? If
so, please remove the nsbeta2 nomination and swap in running RegXPCOM with
running the app. Thanks.
Keywords: nsbeta2
| Reporter | ||
Comment 49•25 years ago
|
||
I'm a bit skeptical about running Netscape for the following reasons:
1) it doesn't quit automatically, like RegXPCOM did. How do I know when to
continue with the packaging? Then, how do I quit Netscape (does it support the
'quit' appleevent yet?)
2) Running netscape has proven to be risky in the past. I don't want to hose the
packaging process because of an obsolete profile, prefs, or registry. On the
other hand, if Netscape is DOA, I'll be the first to know, and there's no real
need to package it!
3) If I can 'safely' run RegXPCOM against a _reduced_ set of components like we
discussed, why not go this way ? We can try this solution "manually" to see if it
works. All I need is the minimum set of components that must be pre-registered.
| Reporter | ||
Comment 50•25 years ago
|
||
Jonathan: <startup time on the mac is atrocious> is it really that bad? do we
have precise data on this and did we measure the kind of improvement obtained
with an existing Registry versus without?
Sorry to re-open the debate!
Comment 51•25 years ago
|
||
you'd have to ask leger as I haven't seen startup times in a while and wasn't
able to find them on the QA page. But as I recall the difference in startup
between having a registry and not was significant. Easy way to test it would be
to start it up with a registry, then remove it and restart it, and compare the
times.
Comment 52•25 years ago
|
||
If I understand the comments on this correctly:
1. Dogfood should be removed as a keyword.
2. The problem to be solved is a poorly-behaved component that crashes on Mac.
Lacking the hardware, I currently cannot run a binary search to find the
misbehaving component.
3. Is there still a debate on whether a registry should be precreated? Would a
registry creator that makes a smaller footprint than Netscape be of any use
here?
Status: NEW → ASSIGNED
Comment 53•25 years ago
|
||
No. There is no debate.
1> We would like RegXPCOM to work because jj has found running the mozilla
binary to be unreliable for his automated release build process.
2> This should not be dogfood but certainly nsbeta2 (based on plenty of mail
with plenty of qualified folks).
Comment 54•25 years ago
|
||
Putting on nsbeta2- radar. We will be focusing on performance in beta 3.
Reassigning to sfraser. Ray has no mac.
Comment 55•25 years ago
|
||
there are workarounds for this issue and it is specific to making builds; this
will not be encountered by the end user. marking as later
Status: NEW → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → LATER
Comment 56•25 years ago
|
||
ok. just remember this bug when we start doing startup performance testing again
and the mac takes a minute or longer to start, and that's on a fast system...
| Assignee | ||
Comment 57•25 years ago
|
||
If you can get me a reproducible case for this bug, where I can run RegXPCOM by
hand and consistently see a crash, then I'd be glad to look at it. As it stands,
the bug is too hard to track down.
And I still don't understand why the workarounds I suggested above (i.e. just run
Mozilla/Netscape to generate the components reg) were rejected.
Comment 58•25 years ago
|
||
Reopening bug. Just because it's a build issue doesn't mean it's not a problem.
Status: RESOLVED → REOPENED
Resolution: LATER → ---
Comment 59•25 years ago
|
||
Marking milestone "Future" in the spirit of the "LATER" resolution.
Simon, the reason we can't just run Mozilla/Netscape 6 to generate this is that
it cannot be automated as part of the build process: we could start Mozilla but it
wouldn't shut down and the build would not continue. regxpcom shouldn't be
doing anything differently than AutoReg in Mozilla, so the crash is quite odd.
Target Milestone: M17 → Future
| Assignee | ||
Comment 60•25 years ago
|
||
If we can't run then quit mozilla using AppleEvents then that's a bug that we
should certainly fix.
Comment 61•25 years ago
|
||
Well *that's* an interesting thought. Couldn't really do that on win or
linux so I didn't think about it, but I guess it could be a workaround for the
Mac build process.
Doesn't obviate the need to make sure regxpcom works, though, as we should be
shipping that so developers can register drop-in components. AutoReg does not
run in optimized builds to detect new components. AutoReg will run if people
use XPInstall to add the components, but not everyone is going to want to do
that.
| Reporter | ||
Comment 62•25 years ago
|
||
a little more input:
- bug reproducibility: it crashes on first run after build completion (_every_
build). subsequent runs of RegXPCOM are successful. I noticed that even after a
crash I get a Component Registry file, about 20K shorter than if the process
completes. Maybe analyzing that incomplete registry would give us a better idea
of where the crash occurs.
Simon, you can always poke around with the build machine after the verification
build is delivered and see the bug in action.
- Mozilla/Netscape scriptability: I haven't tried to see if it actually supports the
quit AppleEvent, but assuming that it does, how would the script know when the
startup process is done before quitting the app and continuing the packaging?
| Assignee | ||
Comment 63•25 years ago
|
||
Mozilla won't quit until it has created the components registry, because it
doesn't get to the main event loop until after that. So a simple script to run
then quit should just work. See, however, bug 43163.
| Assignee | ||
Comment 64•25 years ago
|
||
jj: please send a copy of the incomplete registry to dveditz, so he can analyze
where the failure occurred.
| Assignee | ||
Comment 65•25 years ago
|
||
After spending some hours debugging this on the verification machine, I have a
better understanding of what is going on.
First, the bug is reproducible only the first time you run RegXPCOM after
rebooting the machine. The crash occurs after all component DLLs have been
registered, and while we are registering JS components. We actually crash
while reading in default pref files, which happens because the JSLoader
code loads the scriptsecuritymanager service, which in turn loads the prefs
service. I'll attach the call stack just before we crash.
What's odd is that we don't crash loading the first default prefs file, but
the 2nd one after sorting them (mailnews.js).
| Assignee | ||
Comment 66•25 years ago
|
||
| Assignee | ||
Comment 67•25 years ago
|
||
Why the crash occurs I have no idea. Perhaps there are other NSPR threads
running, and we yield in the async read call to another thread? Or maybe there is
some stack corruption going on here. It's too early to tell.
Comment 68•25 years ago
|
||
CC'ing mstoltz in case the scriptsecuritymanager service involvement was more
than coincidental.
Comment 69•25 years ago
|
||
clean up keywords, adding help wanted
Comment 72•25 years ago
|
||
Chris, did you mean bug 46000 instead of 43000?
Comment 73•25 years ago
|
||
Yes, I did. Thanks, dveditz.
| Assignee | ||
Comment 74•25 years ago
|
||
OK, here's how to make it not crash -- 2 options.
1. Don't have RegXPCOM register JS components.
2. Move the default prefs folder out of the way before running RegXPCOM so that
it doesn't try to load the default prefs files.
| Reporter | ||
Comment 75•25 years ago
|
||
(2) is in my reach. Why does regxpcom need to load default prefs files? Isn't it
supposed to focus on XPCOM components?
| Reporter | ||
Comment 76•25 years ago
|
||
Temporary fix:
- rename viewer/Defaults to <anything> to avoid hitting Defaults/Pref
- run RegXPCOM -> no crash, Component Registry created
- rename "Defaults" back.
After testing this manually with success, I updated the release build automation
and will watch it over the weekend.
Even though this is an ugly patch, it will get us moving forward and include a
component.reg with the mac build, hence reducing initial startup time.
Feel free to mark this bug fixed if this solution is good enough in the long run.
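The rename / run / rename-back steps above can be sketched as a small scope guard, so that the folder is restored even if the registration step fails partway. This is a modern C++17 illustration only, not what the release build automation actually used: `ScopedRename`, `RunWithDefaultsHidden`, the `.hidden` suffix, and the `registerComponents` callback (a stand-in for launching RegXPCOM) are all hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical RAII guard: renames a directory on construction and renames
// it back on destruction, including when an exception unwinds the scope.
class ScopedRename {
public:
    ScopedRename(fs::path original, fs::path temporary)
        : original_(std::move(original)), temporary_(std::move(temporary)) {
        fs::rename(original_, temporary_);  // throws fs::filesystem_error on failure
    }
    ~ScopedRename() {
        std::error_code ec;  // non-throwing overload: never throw from a destructor
        fs::rename(temporary_, original_, ec);
    }
    ScopedRename(const ScopedRename&) = delete;
    ScopedRename& operator=(const ScopedRename&) = delete;

private:
    fs::path original_;
    fs::path temporary_;
};

// Runs `registerComponents` (an assumed stand-in for launching RegXPCOM)
// while `defaultsDir` is renamed out of the way, then restores it.
bool RunWithDefaultsHidden(const fs::path& defaultsDir,
                           const std::function<bool()>& registerComponents) {
    assert(fs::is_directory(defaultsDir));

    fs::path hidden = defaultsDir;
    hidden += ".hidden";  // e.g. "Defaults" becomes "Defaults.hidden"

    ScopedRename guard(defaultsDir, hidden);
    return registerComponents();
}
```

The guard restores the folder on normal return and on exceptions; it cannot help if the whole process is killed while the folder is hidden, so a build script would still want to check for a leftover hidden folder on startup.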
Comment 77•25 years ago
|
||
So now we just need to add the component registry back into the package list
(browser.xpi section) and we're all set.
Ideally this would be a component reg for the browser components only to cut
down on unnecessary footprint for mail and aim components for users who don't
install those options, but I'll take what we can get at this point.
| Reporter | ||
Comment 78•25 years ago
|
||
| Reporter | ||
Comment 79•25 years ago
|
||
Bad news: the patch in place (renaming the 'Defaults' folder before running
RegXPCOM) worked fine for just a week... but we're dealing with a tough guy here.
RegXPCOM is crashing again every now and then (mostly now :-)
However, this time the crash doesn't seem as deep as it used to, cuz I can
log a stack trace. I attach one from today's crash:
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=13327
Comment 80•25 years ago
|
||
Is this bug the reason there is no regxpcom installed with mac moz/n6? If so,
this blocks xpcom plugin uninstallers (plugins can be removed but since the
plugin reg entries still exist, moz/n6 thinks the plugin still exists). If
not, sorry for the spam.
| Reporter | ||
Comment 81•25 years ago
|
||
I don't think regxpcom was ever designed to be shipped with either mozilla or
ns6. it's an internal tool whose only purpose is to generate a component registry
without having to launch the app.
no installer / uninstaller should refer to regxpcom.
Simon, you can mark this as "worksforme" if you think it's ok now that #46000 is
fixed (running the app instead of regxpcom)
| Assignee | ||
Comment 82•25 years ago
|
||
Right; RegXPCOM was never meant to be external.
jj: I'd still like to understand why this happens. If it crashes RegXPCOM, it'll
probably crash someone's embedding app at some point.
Comment 83•25 years ago
|
||
thanks for setting me straight.
Does anyone have any ideas how to unregister an xpcom component?
Is deleting the component registry the preferred way of unregistering xpcom
components?
Comment 84•25 years ago
|
||
dp is no longer @netscape.com. reassigning qa contact to default for this component
QA Contact: dp → rayw
| Assignee | ||
Comment 85•24 years ago
|
||
For the skinny on this, see bug 64978.
| Assignee | ||
Comment 86•24 years ago
|
||
Dupping this to bug 64978, which contains much better data.
*** This bug has been marked as a duplicate of 64978 ***
Status: ASSIGNED → RESOLVED
Closed: 25 years ago → 24 years ago
Resolution: --- → DUPLICATE