29992 - Continual debug assertions

Reporter

Description

•

25 years ago

With a tree that I just pulled, I can't do anything without getting continual 
debug assertions.  I had to comment out the assert statement in order to be able 
to run the browser.  Here's the stack trace at the point of assertion.

(Bugzilla insists that I select a component but I have absolutely no idea what 
component this is.  So assigning to browser general although I suspect that this 
is wrong.)

NTDLL! 77f76274()
nsDebug::Assertion(const char * 0x028ecb28, const char * 0x028eca34,
const char * 0x028eca0c, int 78) line 189 + 13 bytes
mork_assertion_signal(const char * 0x028ecb28) line 78 + 31 bytes
morkEnv::NewError(const char * 0x028ecf70) line 369 + 19 bytes
morkStore::AddAlias(morkEnv * 0x024ebbc0, const morkMid & {...},
unsigned long 0) line 975
morkBuilder::OnAlias(morkEnv * 0x024ebbc0, const morkSpan & {...}, const
morkMid & {...}) line 635
morkParser::ReadAlias(morkEnv * 0x024ebbc0) line 956
morkParser::ReadDict(morkEnv * 0x024ebbc0) line 1230
morkParser::ReadContent(morkEnv * 0x024ebbc0, unsigned char 1) line 1322
morkParser::ReadGroup(morkEnv * 0x024ebbc0) line 1116
morkParser::ReadAt(morkEnv * 0x024ebbc0, unsigned char 0) line 1148
morkParser::ReadContent(morkEnv * 0x024ebbc0, unsigned char 0) line 1325
+ 16 bytes
morkParser::OnPortState(morkEnv * 0x024ebbc0) line 1359 + 14 bytes
morkParser::ParseLoop(morkEnv * 0x024ebbc0) line 1415 + 12 bytes
morkParser::ParseMore(morkEnv * 0x024ebbc0, long * 0x0012f740, unsigned
char * 0x024ebef8, unsigned char * 0x024ebef9) line 1454
morkThumb::DoMore_OpenFileStore(morkEnv * 0x024ebbc0) line 433
morkThumb::DoMore(morkEnv * 0x024ebbc0, unsigned long * 0x0012f868,
unsigned long * 0x0012f888, unsigned char * 0x0012f86c, unsigned char *
0x0012f864) line 353 + 12 bytes
orkinThumb::DoMore(nsIMdbEnv * 0x024ebfd8, unsigned long * 0x0012f868,
unsigned long * 0x0012f888, unsigned char * 0x0012f86c, unsigned char *
0x0012f864) line 230
nsGlobalHistory::OpenDB() line 1688 + 37 bytes
nsGlobalHistory::Init() line 1610 + 8 bytes
NS_NewGlobalHistory(nsISupports * 0x00000000, const nsID & {...}, void *
* 0x0012f99c) line 533 + 8 bytes
nsGenericFactory::CreateInstance(nsGenericFactory * const 0x025069f0,
nsISupports * 0x00000000, const nsID & {...}, void * * 0x0012f99c) line
46
nsComponentManagerImpl::CreateInstance(nsComponentManagerImpl * const
0x00c94520, const nsID & {...}, nsISupports * 0x00000000, const nsID &
{...}, void * * 0x0012f99c) line 1253 + 24 bytes
nsComponentManager::CreateInstance(const nsID & {...}, nsISupports *
0x00000000, const nsID & {...}, void * * 0x0012f99c) line 82
nsServiceManagerImpl::GetService(nsServiceManagerImpl * const
0x00c94890, const nsID & {...}, const nsID & {...}, nsISupports * *
0x0012fa58, nsIShutdownListener * 0x00000000) line 293 + 19 bytes
nsServiceManagerImpl::GetService(nsServiceManagerImpl * const
0x00c94890, const char * 0x003998f4, const nsID & {...}, nsISupports * *
0x0012fa58, nsIShutdownListener * 0x00000000) line 432
nsServiceManager::GetService(const char * 0x003998f4, const nsID &
{...}, nsISupports * * 0x0012fa58, nsIShutdownListener * 0x00000000)
line 545
nsGetServiceByProgID::operator()(const nsID & {...}, void * *
0x0012fa58) line 63 + 22 bytes
nsCOMPtr<nsIGlobalHistory>::assign_from_helper(const nsCOMPtr_helper &
{...}, const nsID & {...}) line 795 + 18 bytes
nsCOMPtr<nsIGlobalHistory>::nsCOMPtr<nsIGlobalHistory>(const
nsCOMPtr_helper & {...}) line 498
nsDocShell::SetTitle(nsDocShell * const 0x02332e30, const unsigned short
* 0x10083548 gCommonEmptyBuffer) line 1410 + 28 bytes
nsWebShell::SetTitle(nsWebShell * const 0x02332e30, const unsigned short
* 0x10083548 gCommonEmptyBuffer) line 3566
nsHTMLDocument::SetTitle(nsHTMLDocument * const 0x024a3d9c, const
nsString & {""}) line 784
HTMLContentSink::DidBuildModel(HTMLContentSink * const 0x024a3150, int
0) line 2274 + 38 bytes
CNavDTD::DidBuildModel(CNavDTD * const 0x024a2b50, unsigned int 0, int
1, nsIParser * 0x024a3490, nsIContentSink * 0x024a3150) line 631 + 14
bytes
nsParser::DidBuildModel(unsigned int 0) line 721 + 55 bytes
nsParser::ResumeParse(nsIDTD * 0x00000000, int 1) line 1170
nsParser::OnStopRequest(nsParser * const 0x024a3494, nsIChannel *
0x0233dba0, nsISupports * 0x00000000, unsigned int 0, const unsigned
short * 0x00000000) line 1560 + 19 bytes
nsDocumentOpenInfo::OnStopRequest(nsDocumentOpenInfo * const 0x0233d920,
nsIChannel * 0x0233dba0, nsISupports * 0x00000000, unsigned int 0, const
unsigned short * 0x00000000) line 277
nsInputStreamChannel::OnStopRequest(nsInputStreamChannel * const
0x0233dba4, nsIChannel * 0x0233d7b0, nsISupports * 0x00000000, unsigned
int 0, const unsigned short * 0x00000000) line 358 + 45 bytes
nsOnStopRequestEvent::HandleEvent(nsOnStopRequestEvent * const
0x0233ee30) line 292
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x0233e360) line 97 + 12
bytes
PL_HandleEvent(PLEvent * 0x0233e360) line 526 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x01027370) line 487 + 9 bytes
_md_EventReceiverProc(HWND__ * 0xa00004e2, unsigned int 49361, unsigned
int 0, long 16937840) line 975 + 9 bytes
USER32! 77e71268()

David :Bienvenu

Assignee

Comment 1

•

25 years ago

reassigning.

Assignee: bienvenu → davidmc

davidmc

Comment 2

•

25 years ago

I suppose I should add another mode to Mork assertions, so that for folks not 
developing Mork code, the first one will cause a fatal error for the Mork db, 
which signals a soft error to the db users letting them know the db is no good.
Actually something like this should already be happening, so maybe I have a loop 
somewhere in parsing where I don't check the outstanding error status.

davidmc

Comment 3

•

25 years ago

Has this ever happened again after the first observation?

I looked at the code, but the parser always seems to stop after the first error, 
because the loops check for ev->Good(), which stays false after an error.  So 
maybe a continual sequence of assertions means one assertion for every Mork file, for
some reason that's a bit hard to imagine.
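
The "sticky error" behaviour described above can be illustrated with a small sketch. The `Env` class and both loop functions here are hypothetical stand-ins, not the actual morkEnv or morkParser code; they only show why a loop that checks `Good()` stops after the first error, while a loop that skips the check (the possibility raised in Comment 2) keeps reporting errors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for morkEnv: once an error is recorded,
// Good() stays false for the lifetime of the environment.
class Env {
public:
    bool Good() const { return errors_.empty(); }
    void NewError(const std::string& msg) { errors_.push_back(msg); }
    std::size_t ErrorCount() const { return errors_.size(); }
private:
    std::vector<std::string> errors_;
};

// Checked loop: stops consuming tokens as soon as the environment goes bad.
// A token equal to "bad" simulates a malformed entry in the file.
// Returns the number of tokens consumed.
std::size_t ParseLoopChecked(Env& ev, const std::vector<std::string>& tokens) {
    std::size_t consumed = 0;
    for (std::size_t i = 0; i < tokens.size() && ev.Good(); ++i) {
        if (tokens[i] == "bad") {
            ev.NewError("malformed token");
        }
        ++consumed;
    }
    return consumed;
}

// Unchecked loop: never consults Good(), so every malformed token
// raises another error (and, in a debug build, another assertion).
std::size_t ParseLoopUnchecked(Env& ev, const std::vector<std::string>& tokens) {
    std::size_t consumed = 0;
    for (const std::string& token : tokens) {
        if (token == "bad") {
            ev.NewError("malformed token");
        }
        ++consumed;
    }
    return consumed;
}
```

With the input `{"a", "bad", "b", "bad"}`, the checked loop records one error and stops, whereas the unchecked loop records two.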

davidmc

Comment 4

•

25 years ago

When I study the stack crawl more, I see this is happening when opening a history 
db in response to a shut down listener notification.  Is that right?  Why would 
that happen?  Anyway, that seems to rule out the multiple mork file idea which 
might apply when opening a bunch of mail/news summaries.

So then a new theory might be -- what if opening the history db fails, but this 
is not accepted by the caller?  What if they try to open the history db again?  
If the history db seems to become corrupt, do we throw it away and start fresh?

Stephen P. Morse

Reporter

Comment 5

•

25 years ago

It's happening to me with a fresh tree that I just pulled and built this 
morning.  It happens on startup -- before the first screen ever comes up.  
Things were so bad that I had to comment out the Break in nsDebug.cpp in order 
to get any useful work done.

David :Bienvenu

Assignee

Comment 6

•

25 years ago

Over lunch, someone suggested that this might have got corrupted because you ran
two builds (e.g., release and debug) against the same profile at the same time.
If so, that could corrupt history files. You could just rename your history file
to avoid the asserts (but don't delete it, in case it could help david
recreate the problem).

davidmc

Comment 7

•

25 years ago

Here's something for Chris Waterson: since it's possible for Mork db's to become 
corrupt, is it data loss to throw away an old db?  (By entropic theory and also 
empirical observation, all db's eventually get corrupt, so software needs a plan 
to cope when it happens.)

If history db's hold info important to preserve, then you might want to keep a 
backup copy of the second to last db which looked good.  So if the most recent db 
becomes corrupt, you might fall back on the secondary copy.

I should go look at the history db code now and see whether it makes a new db 
when the old one seems un-openable.
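
One way the backup idea above might look, as a rough sketch only: the `.bak` suffix, the function names, and the `looksGood` validator callback are all assumptions made for illustration and are not part of the history or Mork code. The sketch uses C++17 `std::filesystem`, which postdates this code base.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical validator: returns true when the file at the path opens cleanly.
using Validator = std::function<bool(const fs::path&)>;

// Hypothetical naming scheme: the backup lives next to the db as "<name>.bak".
fs::path BackupPathFor(const fs::path& db) {
    fs::path backup = db;
    backup += ".bak";
    return backup;
}

// Call after the db has opened successfully: keep a copy of the file
// that was just seen to be good, replacing any older backup.
void SnapshotGoodDb(const fs::path& db) {
    fs::copy_file(db, BackupPathFor(db), fs::copy_options::overwrite_existing);
}

// Prefer the primary db; fall back to the backup if the primary looks
// corrupt. Returns no value when neither file is usable.
std::optional<fs::path> PickUsableDb(const fs::path& db, const Validator& looksGood) {
    if (fs::exists(db) && looksGood(db)) {
        return db;
    }
    const fs::path backup = BackupPathFor(db);
    if (fs::exists(backup) && looksGood(backup)) {
        return backup;
    }
    return std::nullopt;
}
```

The caller would still need a policy for the case where neither copy validates, such as starting with an empty db.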

Stephen P. Morse

Reporter

Comment 8

•

25 years ago

An excellent theory.  I renamed the history.dat file and it no longer asserts.  
I'll attach my old history.dat file so you can do a post-mortem on it.

davidmc

Comment 9

•

25 years ago

Here's another question for Chris Waterson: got plans for a different db?  If 
not, we might consider a server version of MDB interfaces in a few months.  I'm 
working on a language with a small runtime engine, which might make it easy to 
write a Mork implementation which handles an arbitrary number of clients.  It's 
just a home hobby thing, but eventually it will make some things possible at work 
which otherwise must be rejected for lack of time and resources.

Stephen P. Morse

Reporter

Comment 10

•

25 years ago

Attached file: My old history.dat file

Chris Waterson

Comment 11

•

25 years ago

davidmc, (re: "precious") That probably depends on who you ask :-)

It'd be interesting to try to figure out *why* the history DB got corrupted. If 
it's because two processes were writing to it, I'd say that this is a problem 
that we WONTFIX.

I think the right thing to do here is to detect that the history database is 
corrupted, bail, and create a new, empty db file in its place. This may require 
extending the APIs a bit (e.g., maybe other Mork users would rather try to 
untangle a corrupted DB).
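
A minimal sketch of the "detect, bail, and create a new empty db" policy, under stated assumptions: `tryOpen` is a hypothetical callback standing in for the real open-and-parse step, and the `.corrupt` suffix is invented here. The bad file is moved aside rather than deleted so it stays available for a post-mortem, as suggested in Comment 6. It relies on C++17 `std::filesystem`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>

namespace fs = std::filesystem;

enum class OpenResult { Opened, Recreated };

// `tryOpen` is a hypothetical callback: it returns false when the
// existing file cannot be parsed.
OpenResult OpenOrRecreate(const fs::path& db,
                          const std::function<bool(const fs::path&)>& tryOpen) {
    if (fs::exists(db)) {
        if (tryOpen(db)) {
            return OpenResult::Opened;
        }
        // Keep the bad file for later inspection instead of deleting it.
        fs::path aside = db;
        aside += ".corrupt";
        fs::rename(db, aside);
    }
    // Create a new, empty db file in place of the old one.
    std::ofstream fresh(db);
    assert(fresh.is_open());
    return OpenResult::Recreated;
}
```

Other Mork clients that would rather try to untangle a corrupted file could pass a `tryOpen` that attempts repair before reporting failure.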

David :Bienvenu

Assignee

Comment 12

•

25 years ago

I think this is up to the mork client - the mailnews code does this today (well,
maybe not, but we do throw databases away that are out of date - in 4.x, we
throw away .snm files when they get corrupted)

Chris Waterson

Comment 13

•

25 years ago

re: server. I have no immediate plans to use a different DB for history. I 
don't think that we need to solve the multiple-accessors problem for the 
browser's global history (or mail folders).

That said, if you did it, and it was smaller, faster, stronger, etc., it might 
make sense to switch implementations behind the scenes.

davidmc

Comment 14

•

25 years ago

Maybe on some platform we are failing to open with exclusive access, so both 
instances could try appending a commit transaction, which would almost certainly 
corrupt a file, when the longer transaction goes first so it extends past the end 
of the shorter transaction.  (But I don't see obvious signs of two writers in the 
posted file causing assertions.)

Otherwise if we have exclusive access, I don't see how two app instances can 
manage to corrupt the file, even though this has been correlated in the past with 
observed file corruptions.
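
For illustration only, here is one common way to keep a second writer out when the platform's exclusive-open mode cannot be relied on: an advisory lock file. This is a swapped-in technique, not what Mork or the history code does, and the `DbLock` class and `.lock` suffix are hypothetical. It uses the C11 `"wx"` mode of `fopen`, which fails if the file already exists.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical advisory lock: successfully creating "<db>.lock" marks the
// db as in use. Only cooperating processes that follow the same convention
// are kept out.
class DbLock {
public:
    explicit DbLock(const std::string& dbPath) : lockPath_(dbPath + ".lock") {
        // "wx" creates the file and fails if it already exists.
        std::FILE* f = std::fopen(lockPath_.c_str(), "wx");
        held_ = (f != nullptr);
        if (f != nullptr) {
            std::fclose(f);
        }
    }
    ~DbLock() {
        // Only the holder removes the lock file.
        if (held_) {
            std::remove(lockPath_.c_str());
        }
    }
    DbLock(const DbLock&) = delete;
    DbLock& operator=(const DbLock&) = delete;

    bool Held() const { return held_; }

private:
    std::string lockPath_;
    bool held_ = false;
};
```

A second instance that finds `Held()` false could open the db read-only or refuse to commit. The known weakness of this approach is that a crash leaves a stale lock file behind, so a real implementation would need a way to detect and clear it.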

Note that when db software is rather stable, most reported file corruptions have 
little apparent explanation.  Anecdotal evidence suggests memory whackage was 
underway prior to a crash, which managed to propagate to memory buffers destined 
for disk i/o.  (This is the reason why no database is safe from corruption, if 
any bugs exist in software running in a process that can access disk i/o 
buffers.)

I only say all that to explain why corruption cause seems more often a mystery 
than not.  Repeatable corruption is usually a findable bug, and rare transient 
corruption is usually written off to bad timing of memory chaos.

davidmc

Comment 15

•

25 years ago

No wait, I found what looks like two writers appending the same transaction 
number of 171 to the end of the file:

@$${171{@

<(43CE=951942485280000)(43CA=951942483007000)(43FE=951942559286000)
  (45AA=951946071386000)(45AC=951946072007000)(45AE=951946074731000)
  (45A2=951946006223000)(44D5=951942622818000)(45A4=951946021475000)
  (43CC=951942483167000)(44D6=951942622908000)(45A8=951946063655000)
  (45AB=951946071416000)(45AD=951946072027000)(45AF=951946074741000)
  (45A3=951946006303000)(44D4=951942622537000)(450E=http://abcnews.com/)
  (450F=951945907010000)(4512=A$00B$00C$00N$00E$00W$00S$00.$00c$00o$00m$00)
  (4510=http://abcnews.go.com/)(459D=951945991211000)>
{1:^80 {/*r=0*/ (k^81:c)(s=9u)} 
  [-4E /*r=1*/ (^82^450E)(^84^450F)(^85^4512)]
  [-4F /*r=1*/ (^82^4510)(^84^459D)(^85^4512)]}
[1:^80 /*r=2*/ (^84^43CE)]
[2:^80 /*r=2*/ (^84^43CA)]
[3:^80 /*r=2*/ (^84^43FE)]
[5:^80 /*r=2*/ (^84^45AA)]
[6:^80 /*r=2*/ (^84^45AC)]
[7:^80 /*r=2*/ (^84^45AE)]
[8:^80 /*r=2*/ (^84^45A2)]
[A:^80 /*r=2*/ (^84^44D5)]
[C:^80 /*r=2*/ (^84^45A4)]
[2F:^80 /*r=1*/ (^84^43CC)]
[31:^80 /*r=1*/ (^84^44D6)]
[34:^80 /*r=1*/ (^84^45A8)]
[40:^80 /*r=1*/ (^84^45AB)]
[41:^80 /*r=1*/ (^84^45AD)]
[42:^80 /*r=1*/ (^84^45AF)]
[43:^80 /*r=1*/ (^84^45A3)]
[46:^80 /*r=1*/ (^84^44D4)]
@$$}171}@

@$${171{@
<(43CE=951942301666000)>[1:^80 /*r=2*/ (^84^43CE)]
<(43CA=951942298551000)>[2:^80 /*r=2*/ (^84^43CA)]
<(43D0=951942303188000)>[3:^80 /*r=2*/ (^84^43D0)]
<(43CC=951942298872000)>[2F:^80 /*r=1*/ (^84^43CC)]
<(4469=951942338419000)>[47:^80 /*r=1*/ (^84^4469)]
@$$}171}@

I might have expected an error report on non-ascending transaction numbers; but 
actually I was only concerned that start and end numbers match.  I could add a 
better warning to the code when a transaction number repeats, to tell users they 
might have multiple writers.
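
The repeated-transaction warning could be sketched roughly as follows. This is a standalone text scan written for illustration, not the morkParser implementation; the function name is hypothetical. It assumes only what the excerpt above shows, namely that a transaction group opens with a marker of the form `@$${ID{@`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Scans Mork text for group-start markers ("@$${ID{@") and returns every
// ID that opens more than one group, in the order the repeats are found.
std::vector<std::string> FindRepeatedGroupIds(const std::string& text) {
    const std::string open = "@$${";
    const std::string close = "{@";
    std::set<std::string> seen;
    std::vector<std::string> repeated;
    std::size_t pos = 0;
    while ((pos = text.find(open, pos)) != std::string::npos) {
        const std::size_t idStart = pos + open.size();
        const std::size_t idEnd = text.find(close, idStart);
        if (idEnd == std::string::npos) {
            break;  // truncated marker at end of file
        }
        const std::string id = text.substr(idStart, idEnd - idStart);
        if (!seen.insert(id).second) {
            repeated.push_back(id);  // this ID already opened a group
        }
        pos = idEnd + close.size();
    }
    return repeated;
}
```

Run over a file shaped like the excerpt above, with two groups both numbered 171, the function would return a single entry, `"171"`, which a caller could turn into a "possible multiple writers" warning.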

Asa Dotzler [:asa]

Comment 16

•

25 years ago

Is this really a browser-general bug?  If not, can someone move it to the
correct Product/Component?

davidmc

Comment 17

•

25 years ago

changing component to 'history', though it's an interaction with MDB of course.

Component: Browser-General → History

David :Bienvenu

Assignee

Comment 18

•

24 years ago

these are mine now, I guess.

Assignee: davidmc → bienvenu

Stephen P. Morse

Reporter

Comment 19

•

24 years ago

I had forgotten about this bug (I filed it so long ago) so I'm glad that 
bienvenu just posted to it and I got the bugzilla announcement.

I just got hit by this again two days ago and forgot what the fix was.  So I had 
to zero in on a bad history.dat file by a process of elimination.  I still have 
that bad history.dat so I'll attach it here in case it gives any more clues over 
the last attachment that I made.

Stephen P. Morse

Reporter

Comment 20

•

24 years ago

Attached file: Another bad history.dat file

David :Bienvenu

Assignee

Comment 21

•

24 years ago

moving to m20 - probably won't fix for reasons described above.

Target Milestone: --- → M20

David :Bienvenu

Assignee

Comment 22

•

24 years ago

accepting

Status: NEW → ASSIGNED

Asa Dotzler [:asa]

Comment 23

•

24 years ago

updating qa contact

QA Contact: asa → claudius

Viswanath Ramachandran

Comment 24

•

24 years ago

nav triage team: WONTFIX

Status: ASSIGNED → RESOLVED

Closed: 24 years ago

Resolution: --- → WONTFIX

sairuh (rarely reading bugmail)

Comment 25

•

22 years ago

mass-verifying WontFix bugs which haven't changed since 2001-12-31.

use the search string "BoletusEdulis" if you want to filter out this msg.

Status: RESOLVED → VERIFIED

timeless

Updated

•

16 years ago

Component: History: Session → Document Navigation

QA Contact: claudius → docshell

My old history.dat file (65.33 KB, text/plain), attached 25 years ago by Stephen P. Morse
Another bad history.dat file (22.90 KB, text/plain), attached 24 years ago by Stephen P. Morse