Closed
Bug 29992
Opened 25 years ago
Closed 24 years ago
Continual debug assertions
Categories
(Core :: DOM: Navigation, defect, P3)
Tracking
()
VERIFIED
WONTFIX
People
(Reporter: morse, Assigned: Bienvenu)
Details
Attachments
(2 files)
With a tree that I just pulled, I can't do anything without getting continual debug assertions. I had to comment out the assert statement in order to be able to run the browser. Here's the stack trace at the point of assertion. (Bugzilla insists that I select a component but I have absolutely no idea what component this is. So assigning to browser general although I suspect that this is wrong.)

NTDLL! 77f76274()
nsDebug::Assertion(const char * 0x028ecb28, const char * 0x028eca34, const char * 0x028eca0c, int 78) line 189 + 13 bytes
mork_assertion_signal(const char * 0x028ecb28) line 78 + 31 bytes
morkEnv::NewError(const char * 0x028ecf70) line 369 + 19 bytes
morkStore::AddAlias(morkEnv * 0x024ebbc0, const morkMid & {...}, unsigned long 0) line 975
morkBuilder::OnAlias(morkEnv * 0x024ebbc0, const morkSpan & {...}, const morkMid & {...}) line 635
morkParser::ReadAlias(morkEnv * 0x024ebbc0) line 956
morkParser::ReadDict(morkEnv * 0x024ebbc0) line 1230
morkParser::ReadContent(morkEnv * 0x024ebbc0, unsigned char 1) line 1322
morkParser::ReadGroup(morkEnv * 0x024ebbc0) line 1116
morkParser::ReadAt(morkEnv * 0x024ebbc0, unsigned char 0) line 1148
morkParser::ReadContent(morkEnv * 0x024ebbc0, unsigned char 0) line 1325 + 16 bytes
morkParser::OnPortState(morkEnv * 0x024ebbc0) line 1359 + 14 bytes
morkParser::ParseLoop(morkEnv * 0x024ebbc0) line 1415 + 12 bytes
morkParser::ParseMore(morkEnv * 0x024ebbc0, long * 0x0012f740, unsigned char * 0x024ebef8, unsigned char * 0x024ebef9) line 1454
morkThumb::DoMore_OpenFileStore(morkEnv * 0x024ebbc0) line 433
morkThumb::DoMore(morkEnv * 0x024ebbc0, unsigned long * 0x0012f868, unsigned long * 0x0012f888, unsigned char * 0x0012f86c, unsigned char * 0x0012f864) line 353 + 12 bytes
orkinThumb::DoMore(nsIMdbEnv * 0x024ebfd8, unsigned long * 0x0012f868, unsigned long * 0x0012f888, unsigned char * 0x0012f86c, unsigned char * 0x0012f864) line 230
nsGlobalHistory::OpenDB() line 1688 + 37 bytes
nsGlobalHistory::Init() line 1610 + 8 bytes
NS_NewGlobalHistory(nsISupports * 0x00000000, const nsID & {...}, void * * 0x0012f99c) line 533 + 8 bytes
nsGenericFactory::CreateInstance(nsGenericFactory * const 0x025069f0, nsISupports * 0x00000000, const nsID & {...}, void * * 0x0012f99c) line 46
nsComponentManagerImpl::CreateInstance(nsComponentManagerImpl * const 0x00c94520, const nsID & {...}, nsISupports * 0x00000000, const nsID & {...}, void * * 0x0012f99c) line 1253 + 24 bytes
nsComponentManager::CreateInstance(const nsID & {...}, nsISupports * 0x00000000, const nsID & {...}, void * * 0x0012f99c) line 82
nsServiceManagerImpl::GetService(nsServiceManagerImpl * const 0x00c94890, const nsID & {...}, const nsID & {...}, nsISupports * * 0x0012fa58, nsIShutdownListener * 0x00000000) line 293 + 19 bytes
nsServiceManagerImpl::GetService(nsServiceManagerImpl * const 0x00c94890, const char * 0x003998f4, const nsID & {...}, nsISupports * * 0x0012fa58, nsIShutdownListener * 0x00000000) line 432
nsServiceManager::GetService(const char * 0x003998f4, const nsID & {...}, nsISupports * * 0x0012fa58, nsIShutdownListener * 0x00000000) line 545
nsGetServiceByProgID::operator()(const nsID & {...}, void * * 0x0012fa58) line 63 + 22 bytes
nsCOMPtr<nsIGlobalHistory>::assign_from_helper(const nsCOMPtr_helper & {...}, const nsID & {...}) line 795 + 18 bytes
nsCOMPtr<nsIGlobalHistory>::nsCOMPtr<nsIGlobalHistory>(const nsCOMPtr_helper & {...}) line 498
nsDocShell::SetTitle(nsDocShell * const 0x02332e30, const unsigned short * 0x10083548 gCommonEmptyBuffer) line 1410 + 28 bytes
nsWebShell::SetTitle(nsWebShell * const 0x02332e30, const unsigned short * 0x10083548 gCommonEmptyBuffer) line 3566
nsHTMLDocument::SetTitle(nsHTMLDocument * const 0x024a3d9c, const nsString & {""}) line 784
HTMLContentSink::DidBuildModel(HTMLContentSink * const 0x024a3150, int 0) line 2274 + 38 bytes
CNavDTD::DidBuildModel(CNavDTD * const 0x024a2b50, unsigned int 0, int 1, nsIParser * 0x024a3490, nsIContentSink * 0x024a3150) line 631 + 14 bytes
nsParser::DidBuildModel(unsigned int 0) line 721 + 55 bytes
nsParser::ResumeParse(nsIDTD * 0x00000000, int 1) line 1170
nsParser::OnStopRequest(nsParser * const 0x024a3494, nsIChannel * 0x0233dba0, nsISupports * 0x00000000, unsigned int 0, const unsigned short * 0x00000000) line 1560 + 19 bytes
nsDocumentOpenInfo::OnStopRequest(nsDocumentOpenInfo * const 0x0233d920, nsIChannel * 0x0233dba0, nsISupports * 0x00000000, unsigned int 0, const unsigned short * 0x00000000) line 277
nsInputStreamChannel::OnStopRequest(nsInputStreamChannel * const 0x0233dba4, nsIChannel * 0x0233d7b0, nsISupports * 0x00000000, unsigned int 0, const unsigned short * 0x00000000) line 358 + 45 bytes
nsOnStopRequestEvent::HandleEvent(nsOnStopRequestEvent * const 0x0233ee30) line 292
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x0233e360) line 97 + 12 bytes
PL_HandleEvent(PLEvent * 0x0233e360) line 526 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x01027370) line 487 + 9 bytes
_md_EventReceiverProc(HWND__ * 0xa00004e2, unsigned int 49361, unsigned int 0, long 16937840) line 975 + 9 bytes
USER32! 77e71268()
I suppose I should add another mode to Mork assertions, so that for folks not developing Mork code, the first one will cause a fatal error for the Mork db, which signals a soft error to the db users letting them know the db is no good. Actually something like this should already be happening, so maybe I have a loop somewhere in parsing where I don't check the outstanding error status.
Has this ever happened again after the first observation? I looked at the code, but the parser always seems to stop after the first error, because the loops check for ev->Good(), which stays false after an error. So maybe a continual sequence of assertions might be once for every mork file, for some reason that's a bit hard to imagine.
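To illustrate the point about loops checking ev->Good(), here is a minimal C++ sketch of a sticky error state. It is not the actual Mork code: Env, ParseLoop, and the convention that '!' marks a malformed token are hypothetical stand-ins for morkEnv and morkParser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for morkEnv: once an error is recorded,
// Good() stays false for the rest of the parse.
struct Env {
  int errorCount = 0;
  bool Good() const { return errorCount == 0; }
  void NewError(const std::string& /*msg*/) { ++errorCount; }
};

// Hypothetical parser loop: '!' plays the role of a malformed token.
// The loop condition re-checks env.Good() on every pass, so parsing
// stops right after the first error. Returns the characters consumed.
std::size_t ParseLoop(Env& env, const std::string& input) {
  std::size_t pos = 0;
  while (env.Good() && pos < input.size()) {
    if (input[pos] == '!') {
      env.NewError("malformed token");
    }
    ++pos;
  }
  return pos;
}

// Example: three bad tokens in the input, but only the first is reported.
inline void SelfCheck() {
  Env env;
  ParseLoop(env, "ab!cd!!");
  assert(env.errorCount == 1);
}
```

With a loop shaped like this, one bad file should yield one assertion per open attempt, so repeated assertions would point at repeated opens rather than at the parser continuing past an error.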
When I study the stack crawl more, I see this is happening when opening a history db in response to a shut down listener notification. Is that right? Why would that happen? Anyway, that seems to rule out the multiple mork file idea which might apply when opening a bunch of mail/news summaries. So then a new theory might be -- what if opening the history db fails, but this is not accepted by the caller? What if they try to open the history db again? If the history db seems to become corrupt, do we throw it away and start fresh?
Reporter
Comment 5•25 years ago
It's happening to me with a fresh tree that I just pulled and built this morning. It happens on startup -- before the first screen ever comes up. Things were so bad that I had to comment out the Break in nsDebug.cpp in order to get any useful work done.
Assignee
Comment 6•25 years ago
Over lunch, someone suggested that this might have got corrupted because you ran two builds (e.g., release and debug) against the same profile at the same time. If so, that could corrupt history files. You could just rename your history file to avoid the asserts (but don't delete it, in case it could help David recreate the problem).
Here's something for Chris Waterson: since it's possible for Mork db's to become corrupt, is it data loss to throw away an old db? (By entropic theory and also empirical observation, all db's eventually get corrupt, so software needs a plan to cope when it happens.) If history db's hold info important to preserve, then you might want to keep a backup copy of the second to last db which looked good. So if the most recent db becomes corrupt, you might fall back on the secondary copy. I should go look at the history db code now and see whether it makes a new db when the old one seems un-openable.
Reporter
Comment 8•25 years ago
An excellent theory. I renamed the history.dat file and it no longer asserts. I'll attach my old history.dat file so you can do a post-mortem on it.
Here's another question for Chris Waterson: got plans for a different db? If not, we might consider a server version of MDB interfaces in a few months. I'm working on a language with a small runtime engine, which might make it easy to write a Mork implementation which handles an arbitrary number of clients. It's just a home hobby thing, but eventually it will make some things possible at work which otherwise must be rejected for lack of time and resources.
Reporter
Comment 10•25 years ago
Comment 11•25 years ago
davidmc, (re: "precious") That probably depends on who you ask :-) It'd be interesting to try to figure out *why* the history DB got corrupted. If it's because two processes were writing to it, I'd say that this is a problem that we WONTFIX. I think the right thing to do here is to detect that the history database is corrupted, bail, and create a new, empty db file in its place. This may require extending the APIs a bit (e.g., maybe other Mork users would rather try to untangle a corrupted DB).
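A rough sketch of the "detect, bail, and create a new, empty db file" idea, in C++. The function name OpenOrStartFresh, the looksValid callback, and the ".corrupt" suffix are all assumptions for illustration; none of them come from the history or Mork code. The bad file is renamed rather than deleted so it stays available for a post-mortem.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <functional>
#include <string>

// Returns true if the file at `path` was usable as-is, and false if it was
// set aside and replaced by a new, empty file.
// `looksValid` is a hypothetical caller-supplied check, for example
// "did the db parse without raising an error?".
bool OpenOrStartFresh(
    const std::string& path,
    const std::function<bool(const std::string&)>& looksValid) {
  assert(!path.empty());
  if (looksValid(path)) {
    return true;
  }
  // Keep the bad file around instead of deleting it.
  const std::string quarantined = path + ".corrupt";
  std::remove(quarantined.c_str());  // drop any older quarantined copy
  // The rename fails if `path` does not exist; this sketch ignores that,
  // since a fresh file is created either way.
  std::rename(path.c_str(), quarantined.c_str());
  std::ofstream fresh(path, std::ios::trunc);  // new, empty db file
  return false;
}
```

Leaving the validity check to the caller matches the suggestion that the policy belongs to the Mork client: one client might discard a corrupt db, while another might prefer to try untangling it.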
Assignee
Comment 12•25 years ago
I think this is up to the mork client - the mailnews code does this today (well, maybe not, but we do throw databases away that are out of date - in 4.x, we throw away .snm files when they get corrupted)
Comment 13•25 years ago
re: server. I have no immediate plans to use a different DB for history. I don't think that we need to solve the multiple-accessors problem for the browser's global history (or mail folders). That said, if you did it, and it was smaller, faster, stronger, etc., it might make sense to switch implementations behind the scenes.
Comment 14•25 years ago
Maybe on some platform we are failing to open with exclusive access, so both instances could try appending a commit transaction, which would almost certainly corrupt a file, when the longer transaction goes first so it extends past the end of the shorter transaction. (But I don't see obvious signs of two writers in the posted file causing assertions.) Otherwise if we have exclusive access, I don't see how two app instances can manage to corrupt the file, even though this has been correlated in the past with observed file corruptions. Note that when db software is rather stable, most reported file corruptions have little apparent explanation. Anecdotal evidence suggests memory whackage was underway prior to a crash, which managed to propagate to memory buffers destined for disk i/o. (This is the reason why no database is safe from corruption, if any bugs exist in software running in a process that can access disk i/o buffers.) I only say all that to explain why the cause of corruption seems more often a mystery than not. Repeatable corruption is usually a findable bug, and rare transient corruption is usually written off to bad timing of memory chaos.
Comment 15•25 years ago
No wait, I found what looks like two writers appending the same transaction number of 171 to the end of the file:

@$${171{@
<(43CE=951942485280000)(43CA=951942483007000)(43FE=951942559286000)
(45AA=951946071386000)(45AC=951946072007000)(45AE=951946074731000)
(45A2=951946006223000)(44D5=951942622818000)(45A4=951946021475000)
(43CC=951942483167000)(44D6=951942622908000)(45A8=951946063655000)
(45AB=951946071416000)(45AD=951946072027000)(45AF=951946074741000)
(45A3=951946006303000)(44D4=951942622537000)(450E=http://abcnews.com/)
(450F=951945907010000)(4512=A$00B$00C$00N$00E$00W$00S$00.$00c$00o$00m$00)
(4510=http://abcnews.go.com/)(459D=951945991211000)>
{1:^80 {/*r=0*/ (k^81:c)(s=9u)}
[-4E /*r=1*/ (^82^450E)(^84^450F)(^85^4512)]
[-4F /*r=1*/ (^82^4510)(^84^459D)(^85^4512)]}
[1:^80 /*r=2*/ (^84^43CE)]
[2:^80 /*r=2*/ (^84^43CA)]
[3:^80 /*r=2*/ (^84^43FE)]
[5:^80 /*r=2*/ (^84^45AA)]
[6:^80 /*r=2*/ (^84^45AC)]
[7:^80 /*r=2*/ (^84^45AE)]
[8:^80 /*r=2*/ (^84^45A2)]
[A:^80 /*r=2*/ (^84^44D5)]
[C:^80 /*r=2*/ (^84^45A4)]
[2F:^80 /*r=1*/ (^84^43CC)]
[31:^80 /*r=1*/ (^84^44D6)]
[34:^80 /*r=1*/ (^84^45A8)]
[40:^80 /*r=1*/ (^84^45AB)]
[41:^80 /*r=1*/ (^84^45AD)]
[42:^80 /*r=1*/ (^84^45AF)]
[43:^80 /*r=1*/ (^84^45A3)]
[46:^80 /*r=1*/ (^84^44D4)]
@$$}171}@
@$${171{@
<(43CE=951942301666000)>[1:^80 /*r=2*/ (^84^43CE)]
<(43CA=951942298551000)>[2:^80 /*r=2*/ (^84^43CA)]
<(43D0=951942303188000)>[3:^80 /*r=2*/ (^84^43D0)]
<(43CC=951942298872000)>[2F:^80 /*r=1*/ (^84^43CC)]
<(4469=951942338419000)>[47:^80 /*r=1*/ (^84^4469)]
@$$}171}@

I might have expected an error report on non-ascending transaction numbers; but actually I was only concerned that start and end numbers match. I could add a better warning to the code when a transaction number repeats, to tell users they might have multiple writers.
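The repeated-transaction warning could look something like the following C++ sketch. It rests on assumptions: it treats whatever sits between "@$${" and "{@" as an opaque id, it ignores any escaping rules inside values, and RepeatedGroupIds is a made-up name rather than a Mork API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Scans `text` for group-start markers of the form "@$${ID{@" and returns
// every id that starts more than one group, in the order the repeats are
// first seen. Each repeated id is listed once.
std::vector<std::string> RepeatedGroupIds(const std::string& text) {
  static const std::string kOpen = "@$${";
  static const std::string kClose = "{@";
  std::set<std::string> seen;
  std::set<std::string> reported;
  std::vector<std::string> repeats;
  std::size_t pos = 0;
  while ((pos = text.find(kOpen, pos)) != std::string::npos) {
    const std::size_t idStart = pos + kOpen.size();
    const std::size_t idEnd = text.find(kClose, idStart);
    if (idEnd == std::string::npos) {
      break;  // truncated marker: stop scanning
    }
    assert(idEnd >= idStart);
    const std::string id = text.substr(idStart, idEnd - idStart);
    // insert().second is false when the id was already present.
    if (!seen.insert(id).second && reported.insert(id).second) {
      repeats.push_back(id);
    }
    pos = idEnd + kClose.size();
  }
  return repeats;
}
```

Run over the tail of the file quoted above, a scan like this would report 171 once, which is the hint that two writers may have committed against the same starting state.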
Comment 16•25 years ago
Is this really a browser-general bug? If not, can someone move it to the correct Product/Component?
Comment 17•25 years ago
changing component to 'history', though it's an interaction with MDB of course.
Component: Browser-General → History
Reporter
Comment 19•24 years ago
I had forgotten about this bug (I filed it so long ago) so I'm glad that bienvenu just posted to it and I got the bugzilla announcement. I just got hit by this again two days ago and forgot what the fix was. So I had to zero in on a bad history.dat file by a process of elimination. I still have that bad history.dat so I'll attach it here in case it gives any more clues over the last attachment that I made.
Reporter
Comment 20•24 years ago
Assignee
Comment 21•24 years ago
moving to m20 - probably won't fix for reasons described above.
Target Milestone: --- → M20
Comment 24•24 years ago
nav triage team: WONTFIX
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → WONTFIX
Comment 25•22 years ago
mass-verifying WontFix bugs which haven't changed since 2001-12-31. use the search string "BoletusEdulis" if you want to filter out this msg.
Status: RESOLVED → VERIFIED
Component: History: Session → Document Navigation
QA Contact: claudius → docshell