Closed Bug 220566 Opened 21 years ago Closed 19 years ago

Randomly hangs in PL_HashTableRawLookup at start of page load

Categories

(Core :: Networking: HTTP, defect, P1)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla1.8beta2

People

(Reporter: chpe, Assigned: darin.moz)

References

Details

(Keywords: hang)

Attachments

(2 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.5b) Gecko/20030912

[I've chosen the Component based on the trace, apologies if I guessed wrong.]

Note: This happens in Epiphany [a gtkmozembed client], but the trace indicates
that it is mozilla code which causes the hang.

Sometimes Epiphany hangs when starting a page load. This happens about once or
twice a day for me, non-reproducibly, with different pages. After killing the
program and restarting, the same page loads without problems.
I've observed this hang using different mozilla versions during the last 6
months; the trace is always similar.

Reproducible: Sometimes

Steps to Reproduce:
None known.
Actual Results:  
Hang.

Expected Results:  
No hang.

Trace:

Thread 6 (Thread 81925 (LWP 4082)):
#0  0x40b1c536 in nanosleep () from /lib/libpthread.so.0

Thread 5 (Thread 49156 (LWP 4078)):
#0  0x40b1c536 in nanosleep () from /lib/libpthread.so.0
#1  0x00000001 in ?? ()
#2  0xbf3ff7ac in ?? ()

Thread 4 (Thread 32771 (LWP 4077)):
#0  0x40b18b88 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1  0x40b18999 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0x40b15aa8 in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so.0
#3  0x401d2175 in PR_WaitCondVar (cvar=0x81e18e8, timeout=4294967295) at
ptsynch.c:389
#4  0x40148de6 in nsThreadPool::GetRequest(nsIThread*) (this=0x8214e98,
currentThread=0x8178788) at nsThread.cpp:662
#5  0x40149989 in nsThreadPoolRunnable::Run() (this=0x8178770) at nsThread.cpp:904
#6  0x40147878 in nsThread::Main(void*) (arg=0x8178788) at nsThread.cpp:118
#7  0x401d7f7f in _pt_root (arg=0x8178808) at ptthread.c:214
#8  0x40b168be in pthread_start_thread () from /lib/libpthread.so.0
#9  0x40d81217 in clone () from /lib/libc.so.6

Thread 3 (Thread 16386 (LWP 4076)):
#0  0x40d79bb4 in poll () from /lib/libc.so.6
#1  0x401d6a9c in _pr_poll_with_poll (pds=0x81e7cbc, npds=1, timeout=4294967295)
at ptio.c:3887
#2  0x414ba7ef in nsSocketTransportService::Poll() (this=0xfffffffc) at
nsSocketTransportService2.cpp:324
#3  0x414baee5 in nsSocketTransportService::Run() (this=0x81e7968) at
nsSocketTransportService2.cpp:528
#4  0x40147878 in nsThread::Main(void*) (arg=0x81e1ad8) at nsThread.cpp:118
#5  0x401d7f7f in _pt_root (arg=0x821dc30) at ptthread.c:214
#6  0x40b168be in pthread_start_thread () from /lib/libpthread.so.0
#7  0x40d81217 in clone () from /lib/libc.so.6

Thread 2 (Thread 32769 (LWP 4075)):
#0  0x40d79bb4 in poll () from /lib/libc.so.6
#1  0x40b1661b in __pthread_manager () from /lib/libpthread.so.0
#2  0x40d81217 in clone () from /lib/libc.so.6

Thread 1 (Thread 16384 (LWP 4068)):
#0  0x401a574e in PL_HashTableRawLookup (ht=0x8475c00, keyHash=108352116,
key=0x41f577de) at plhash.c:206
#1  0x401a5c1a in PL_HashTableLookup (ht=0x8475c00, key=0x41f577de) at plhash.c:388
#2  0x41519fc0 in nsHttp::ResolveAtom(char const*) (str=0x41f577de "Accept") at
nsHttp.cpp:140
#3  0x4153c285 in nsHttp::ResolveAtom(nsACString const&) (s=@0xbfffe7f0) at
nsHttp.h:125
#4  0x4153863c in nsHttpChannel::SetRequestHeader(nsACString const&, nsACString
const&, int) (this=0x8a78138, header=@0xbfffe7f0, value=@0xbfffe7e0, merge=0)
    at nsHttpChannel.cpp:2920
#5  0x41f3ebcb in NewImageChannel (aResult=0xbfffe9b0, aURI=0x86edaa0,
aInitialDocumentURI=0x86b3860, aReferringURI=0x86b3860, aLoadGroup=0x86cd3d0,
aLoadFlags=0)
    at imgLoader.cpp:199
#6  0x41f3f25d in imgLoader::LoadImage(nsIURI*, nsIURI*, nsIURI*, nsILoadGroup*,
imgIDecoderObserver*, nsISupports*, unsigned, nsISupports*, imgIRequest*,
imgIRequest**)
    (this=0x8215008, aURI=0x86edaa0, aInitialDocumentURI=0x86b3860,
aReferrerURI=0x86b3860, aLoadGroup=0x86cd3d0, aObserver=0x86ed960,
aCX=0x8abe060, aLoadFlags=0, 
    cacheKey=0x0, aRequest=0x0, _retval=0x86ed964) at imgLoader.cpp:396
#7  0x41b3ba8d in nsImageLoadingContent::ImageURIChanged(nsACString const&)
(this=0x86ed960, aNewURI=@0xbfffecf0) at nsImageLoadingContent.cpp:439
#8  0x41b3b731 in nsImageLoadingContent::ImageURIChanged(nsAString const&)
(this=0x86ed960, aNewURI=@0xbfffed80) at nsImageLoadingContent.cpp:374
#9  0x41ba01a0 in nsHTMLImageElement::SetParent(nsIContent*) (this=0x86ed940,
aParent=0xbfffed80) at nsHTMLImageElement.cpp:673
#10 0x41b67a9d in nsGenericHTMLContainerElement::AppendChildTo(nsIContent*, int,
int) (this=0x86ed868, aKid=0x86ed940, aNotify=0, aDeepSetDocument=0)
    at nsGenericHTMLElement.cpp:3777
#11 0x41bf037a in SinkContext::AddLeaf(nsIHTMLContent*) (this=0x8752380,
aContent=0x86ed940) at nsHTMLContentSink.cpp:1933
#12 0x41bf015f in SinkContext::AddLeaf(nsIParserNode const&) (this=0x8752380,
aNode=@0x87c64b8) at nsHTMLContentSink.cpp:1861
#13 0x41bf47ad in HTMLContentSink::AddLeaf(nsIParserNode const&)
(this=0x825bba8, aNode=@0x87c64b8) at nsHTMLContentSink.cpp:3604
#14 0x4203fac3 in CNavDTD::AddLeaf(nsIParserNode const*) (this=0x8591ea8,
aNode=0x87c64b8) at CNavDTD.cpp:3773
#15 0x4203c5b2 in CNavDTD::HandleDefaultStartToken(CToken*, nsHTMLTag,
nsCParserNode*) (this=0x8591ea8, aToken=0x8974b50, aChildTag=eHTMLTag_img,
aNode=0x87c64b8)
    at CNavDTD.cpp:1457
#16 0x4203d06a in CNavDTD::HandleStartToken(CToken*) (this=0x8591ea8,
aToken=0x8974b50) at CNavDTD.cpp:1832
#17 0x4203bae4 in CNavDTD::HandleToken(CToken*, nsIParser*) (this=0x8591ea8,
aToken=0x8974b50, aParser=0x8b125f8) at CNavDTD.cpp:1016
#18 0x4203ae28 in CNavDTD::BuildModel(nsIParser*, nsITokenizer*,
nsITokenObserver*, nsIContentSink*) (this=0x8591ea8, aParser=0x8b125f8,
aTokenizer=0x86df2b8, 
    anObserver=0x0, aSink=0x825bba8) at CNavDTD.cpp:514
#19 0x4204fcaa in nsParser::BuildModel() (this=0x8b125f8) at nsParser.cpp:1905
#20 0x4204f9d4 in nsParser::ResumeParse(int, int, int) (this=0x8b125f8,
allowIteration=1, aIsFinalChunk=0, aCanInterrupt=1) at nsParser.cpp:1772
#21 0x4205106b in nsParser::OnDataAvailable(nsIRequest*, nsISupports*,
nsIInputStream*, unsigned, unsigned) (this=0x8b125f8, request=0x86f9858,
aContext=0x0, 
    pIStream=0x8abe668, sourceOffset=2521, aLength=11814) at nsParser.cpp:2437
#22 0x42204ed2 in nsDocumentOpenInfo::OnDataAvailable(nsIRequest*, nsISupports*,
nsIInputStream*, unsigned, unsigned) (this=0x8a21288, request=0x86f9858, aCtxt=0x0, 
    inStr=0x8abe668, sourceOffset=2521, count=11814) at nsURILoader.cpp:242
#23 0x414d4341 in nsHTTPCompressConv::do_OnDataAvailable(nsIRequest*,
nsISupports*, unsigned, char*, unsigned) (this=0x8219970, request=0x86f9858,
aContext=0x0, 
    aSourceOffset=2521, 
    buffer=0x8a5f5b0 "b, nav links -->\n<table cellpadding=\"2\"
cellspacing=\"0\" border=\"0\" width=\"100%\" 
align=\"center\">\n<tr>\n\t<td><img src=\"images/vb_bullet.gif\" border=\"0\"
align=\"middle\" alt=\""..., aCount=11814) at nsHTTPCompressConv.cpp:368
#24 0x414d417a in nsHTTPCompressConv::OnDataAvailable(nsIRequest*, nsISupports*,
nsIInputStream*, unsigned, unsigned) (this=0x8219970, request=0x86f9858,
aContext=0x0, 
    iStr=0x82199f8, aSourceOffset=2521, aCount=141524800) at
nsHTTPCompressConv.cpp:304
#25 0x414bb9c5 in nsStreamListenerTee::OnDataAvailable(nsIRequest*,
nsISupports*, nsIInputStream*, unsigned, unsigned) (this=0x41569b08,
request=0x86f9858, context=0x0, 
    input=0xbffff480, offset=2521, count=3938) at nsStreamListenerTee.cpp:97
#26 0x4153987b in nsHttpChannel::OnDataAvailable(nsIRequest*, nsISupports*,
nsIInputStream*, unsigned, unsigned) (this=0x86f9858, request=0x8b0d468, ctxt=0x0, 
    input=0x86bbbd4, offset=2521, count=3938) at nsHttpChannel.cpp:3380
#27 0x41497d91 in nsInputStreamPump::OnStateTransfer() (this=0x8af85e0) at
nsInputStreamPump.cpp:418
#28 0x41497a87 in nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*)
(this=0x8af85e0, stream=0x86bbbd4) at nsInputStreamPump.cpp:321
#29 0x40126849 in nsInputStreamReadyEvent::EventHandler(PLEvent*)
(plevent=0x6755274) at nsStreamUtils.cpp:116
#30 0x40143374 in PL_HandleEvent (self=0x8ab4b5c) at plevent.c:671
#31 0x4014324d in PL_ProcessPendingEvents (self=0x81e1dd8) at plevent.c:606
#32 0x401450e4 in nsEventQueueImpl::ProcessPendingEvents() (this=0x8217190) at
nsEventQueue.cpp:391
#33 0x41f911de in event_processor_callback (source=0x81eed08, condition=G_IO_IN,
data=0x86f7f40) at nsAppShell.cpp:67
#34 0x40ba8409 in g_io_unix_dispatch (source=0x81eed08, callback=0x41f911c8
<event_processor_callback>, user_data=0x8217190) at giounix.c:159
#35 0x40b88726 in g_main_dispatch (context=0x813a4a0) at gmain.c:1752
#36 0x40b897c9 in g_main_context_dispatch (context=0x813a4a0) at gmain.c:2300
#37 0x40b89b09 in g_main_context_iterate (context=0x813a4a0, block=1,
dispatch=1, self=0x8114a08) at gmain.c:2381
#38 0x40b8a1c1 in g_main_loop_run (loop=0x8240ce0) at gmain.c:2601
#39 0x4045177f in bonobo_main () at bonobo-main.c:294
#40 0x0807748a in main (argc=1, argv=0xbffff854) at ephy-main.c:154
Keywords: hang
Depends on: 210125
hmm.. can you repro this hang with a nightly mozilla build?  what is the CPU
utilization when the process "hangs"?  0% or 100%... i.e., are we in some
infinite loop?  unfortunately, the stack traces don't indicate any obvious
deadlocks.
I haven't been able to reproduce this with a nightly build, i.e. using
mozilla-the-browser.
The mozillas I have seen this with under epiphany have been mozilla-snapshot
packages from debian [at least three different versions dating from 03/2003 to
06/2003], the debian mozilla 1.4 packages, and now with self-compiled mozillas
from cvs [using jhbuild].

When the hang happens, CPU utilisation is at max, about 99%.
I can confirm this in the latest Debian mozilla from unstable (as of writing
mozilla-1.5-3).

CPU usage is high, probably entered some loop and is unable to get out.
This still occurs infrequently for me. Now using
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a) Gecko/20031227 Epiphany/1.1.2

Here's a partial full trace:
#0  0x401bc74e in PL_HashTableRawLookup (ht=0x856e988, keyHash=108352116, 

    key=0x420d383e) at plhash.c:206

	he = (PLHashEntry *) 0x85deee0

	hep = (PLHashEntry **) 0x85deee0

	hep0 = (PLHashEntry **) 0x85de5a8

	h = 108352116

#1  0x401bcc1a in PL_HashTableLookup (ht=0x856e988, key=0x420d383e)

    at plhash.c:388

	keyHash = 108352116

	he = (PLHashEntry *) 0x6755274

	hep = (PLHashEntry **) 0x6755274

#2  0x415fd048 in nsHttp::ResolveAtom(char const*) (str=0x420d383e "Accept")

    at nsHttp.cpp:140

	atom = {_val = 0x0}

#3  0x4161e991 in nsHttp::ResolveAtom(nsACString const&) (s=@0xbfffccd0)

    at nsHttp.h:125

No locals.

#4  0x4161b096 in nsHttpChannel::SetRequestHeader(nsACString const&, nsACString
const&, int) (this=0x88e1c38, header=@0xbfffccd0, value=@0xbfffccc0, merge=0)

    at nsHttpChannel.cpp:2990

	atom = {_val = 0x420d3a6e ""}

#5  0x420bae81 in NewImageChannel (aResult=0xbfffcde0, aURI=0x89405f8, 

    aInitialDocumentURI=0x0, aReferringURI=0x8813d70, aLoadGroup=0x86efce0, 

    aLoadFlags=1) at imgLoader.cpp:199

	httpChannelInternal = {mRawPtr = 0x420d6548}

	rv = 3221212368

	newChannel = {mRawPtr = 0x0}

	newHttpChannel = {mRawPtr = 0x88e1c38}

	callbacks = {mRawPtr = 0x878ee10}

#6  0x420bb844 in imgLoader::LoadImage(nsIURI*, nsIURI*, nsIURI*, nsILoadGroup*,
imgIDecoderObserver*, nsISupports*, unsigned, nsISupports*, imgIRequest*,
imgIRequest**) (this=0x83d2480, aURI=0x89405f8, aInitialDocumentURI=0x0, 

    aReferrerURI=0x8813d70, aLoadGroup=0x86efce0, aObserver=0x876e198, 

    aCX=0x862fe98, aLoadFlags=1, cacheKey=0x0, aRequest=0x0, _retval=0x876e1ac)

    at imgLoader.cpp:452

	LOG_SCOPE_TMP_VAR__LINE__ = {mLog = 0x83cc6f8, mFrom = 0x83d2480, 

  mFunc = {<nsCString> = {<nsAFlatCString> = {<nsASingleFragmentCString> =
{<nsACString> = {

            _vptr.nsACString = 0x8119108}, <No data fields>}, <No data fields>},
<nsStr> = {{mStr = 0xbfffce08 "imgLoader::LoadImage |cache miss|", 

          mUStr = 0xbfffce08}, mLength = 33, 

        mCapacityAndFlags = 63}, <No data fields>}, 

    mBuffer = "imgLoader::LoadImage |cache
miss|\000\000\000øe\e@\020b5\b\000\000\000\000øÎÿ¿Âõ\024@\002\000\000\000pÏÿ¿"}}

	pl = (ProxyListener *) 0xbfffcdd0

	newChannel = {mRawPtr = 0x88e1c38}

	openRes = 3221212592

	spec = {<nsCString> = {<nsAFlatCString> = {<nsASingleFragmentCString> = 

          _vptr.nsACString = 0x8119108}, <No data fields>}, <No data fields>},
<nsStr> = {{

        mStr = 0xbfffcfd0
"http://www.gnomedesktop.org/themes/default/images/platnum.png", mUStr =
0xbfffcfd0}, mLength = 61, 

      mCapacityAndFlags = 63}, <No data fields>}, 

  mBuffer = "http://www.gnomedesktop.org/themes/default/images/platnum.png\000¥A"}

	LOG_SCOPE_TMP_VAR__LINE__ = {mLog = 0x83cc6f8, mFrom = 0x83d2480, 

  mFunc = {<nsCString> = {<nsAFlatCString> = {<nsASingleFragmentCString> =
{<nsACString> = {

            _vptr.nsACString = 0x8119108}, <No data fields>}, <No data fields>},
<nsStr> = {{mStr = 0xbfffcf78 "imgLoader::LoadImage", mUStr = 0xbfffcf78}, 

        mLength = 20, mCapacityAndFlags = 63}, <No data fields>}, 

    mBuffer =
"imgLoader::LoadImage\000ª\vB\204$=\b\003\000\000\00048\rBøÕ\030@ð\2244\b\000Ðÿ¿\200$=\bPÊøA\000Ðÿ¿ÐÐÿ¿"}}

	request = (class imgRequest *) 0x0

	rv = 0

	requestFlags = 1

	entry = {mRawPtr = 0x0}

	bCanCacheRequest = 1

	bHasExpired = 0

	bValidateRequest = 0

	addToLoadGroup = 1

	eventQService = {mRawPtr = 0x83c0228}

	activeQ = {mRawPtr = 0x83c03e0}

	cacheId = (void *) 0x83c03e0

	proxy = (class imgRequestProxy *) 0xbfffcdd0

[........]

Here's the function where it hangs:
[http://lxr.mozilla.org/nspr/source/nsprpub/lib/ds/plhash.c#185]

185 PL_HashTableRawLookup(PLHashTable *ht, PLHashNumber keyHash, const void *key)
186 {
187     PLHashEntry *he, **hep, **hep0;
188     PLHashNumber h;
189 
190 #ifdef HASHMETER
191     ht->nlookups++;
192 #endif
193     h = keyHash * GOLDEN_RATIO;
194     h >>= ht->shift;
195     hep = hep0 = &ht->buckets[h];
196     while ((he = *hep) != 0) {
197         if (he->keyHash == keyHash && (*ht->keyCompare)(key, he->key)) {
198             /* Move to front of chain if not already there */
199             if (hep != hep0) {
200                 *hep = he->next;
201                 he->next = *hep0;
202                 *hep0 = he;
203             }
204             return hep0;
205         }
206         hep = &he->next;
207 #ifdef HASHMETER
208         ht->nsteps++;
209 #endif
210     }
211     return hep;
212 }

Let's look at some vars:
(gdb) p he

$25 = (PLHashEntry *) 0x85deee0

(gdb) p *hep

$26 = (PLHashEntry *) 0x85deee0

(gdb) p &he->next

$29 = (PLHashEntry **) 0x85deee0


It appears it's stuck in the while() loop.

Also it may be noteworthy that every time I've seen this hang [and gone into gdb
to look at it], the key it was looking for was the same, it's always "Accept",
and it's always coming from an image load.
Christian: have you seen this hang more recently?  for example, does the problem
still exist in Mozilla 1.6?  thanks!
The last time I've seen this hang was nearly 2 months ago, when I posted comment
#4, with a build from 2003-12-27 (cvs trunk).
Just observed this again, for the first time in months, with 1.7:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040622 Epiphany/1.2.6.90

Trace from the hang is still as in comment 0.
I believe this is a dupe of bug #218949

It happened to me again today, using Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.7.3) Gecko/20040913

Alex
Yes, I think this is the same as bug 218949; but since this one has better info
(stack traces with debug symbols in comment 0, and comment 4) I suggest duping
bug 218949 against this bug.

I've made one additional observation which I didn't notice before, but I'm not
sure if it'll be helpful: almost every time I've observed this was when loading
pages from vBulletin[www.vbulletin.com]-driven discussion forums.
I can confirm Christian Persch's observation that the bug occurs most frequently
when viewing vBulletin pages. Afai remember it only ever happened to me while
reading www.mersenneforum.org, but I haven't found out yet which steps are
neccessary to reproduce it.
*** Bug 218949 has been marked as a duplicate of this bug. ***
gHttpAtomTable is not enumerated, so there's no obvious reason to use
PL_HashTableLookupConst, but it sure seems as though the hash chain reordering
done by PL_HashTableLookup is messing with some other code's mutations to the
table, resulting in a cycle.

BTW, this looks lika a dense small-entry table, so it probably should use
pldhash -- see pldhash.h, the big comment.

/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
Yeah, perhaps the easy solution to this bug is to just switch over to pldhash. 
I've been wanting to do that for years ;-)
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.8beta2
Severity: normal → major
Priority: -- → P1
Attached patch v1 patch (obsolete) — Splinter Review
OK, this patch makes ResolveAtom use pldhash.  I've included some unit tests,
but feel free to ignore those when reviewing this patch.
Attachment #179087 - Flags: review?(brendan)
Comment on attachment 179087 [details] [diff] [review]
v1 patch

test/Makefile.in
-		  unit/test_protocolproxyservice.js
+		  unit/test_protocolproxyservice.js \
+		  unit/test_http_headers.js
 libs:: $(_UNIT_FILES)

might be a good idea to end the _UNIT_FILES list with $(NULL) on its own line,
so that further additions don't affect blame for existing files
Comment on attachment 179087 [details] [diff] [review]
v1 patch

>-static struct PLHashTable  *gHttpAtomTable = nsnull;
>+static struct PLDHashTable *gHttpAtomTable = nsnull;

You could make this a PLDHashTable, not a pointer to one, and save a malloc.

> // Hash string ignore case, based on PL_HashString
>-static PLHashNumber
>-StringHash(const PRUint8 *key)
>+PR_STATIC_CALLBACK(PLDHashNumber)
>+StringHash(PLDHashTable *table, const void *key)
> {
>-    PLHashNumber h;
>-    const PRUint8 *s;
>-
>-    h = 0;
>-    for (s = key; *s; s++)
>-        h = (h >> 28) ^ (h << 4) ^ nsCRT::ToLower((char)*s);
>+    PLDHashNumber h = 0;
>+    for (const char *s = NS_REINTERPRET_CAST(const char*, key); *s; ++s)
>+        h = (h >> 28) ^ (h << 4) ^ nsCRT::ToLower(*s);
>     return h;
> }

Use nsCRT::HashCode instead of essentially duplicating it?  Hrm, maybe not --
it has bogus null-defense and an unnecessary out param to get the string's
length.

>-static PRIntn
>-StringCompare(const char *a, const char *b)
>-{
>-    return PL_strcasecmp(a, b) == 0;
>-}
>-    gHttpAtomTable = PL_NewHashTable(128, (PLHashFunction) StringHash,
>-                                          (PLHashComparator) StringCompare,
>-                                          (PLHashComparator) 0, 0, 0);
>+    // The capacity for this table should be chosen equal to or larger
>+    // than the number of HTTP_ATOM entries in nsHttpAtomList.h
>+    gHttpAtomTable =
>+            PL_NewDHashTable(&ops, nsnull, sizeof(PLDHashEntryStub), 100);

Instead of wiring in 100, you could #include "nsHttpAtomList.h" inside an enum,
with HTTP_ATOM defined appropriately, and add a final enumerator that counts
the number of atoms in that list.

>+    PLDHashEntryStub *stub = NS_REINTERPRET_CAST(PLDHashEntryStub *,
>+            PL_DHashTableOperate(gHttpAtomTable, str, PL_DHASH_LOOKUP));
>+    if (stub && PL_DHASH_ENTRY_IS_BUSY(&stub->hdr)) {

Don't null-test stub after PL_DHASH_LOOKOUP, it can be null only after
PL_DHASH_ADD.

[snip ...]
>+    // now insert the heap atom into the atom table
>+    PL_DHashTableOperate(gHttpAtomTable, heapAtom->value, PL_DHASH_ADD);

This does a double-lookup on miss -- better to call PL_DHASH_ADD up front,
detecting after it returns whether it found an existing entry, or added a new
one.  If you weren't using an initEntry hook, you could tell by null-testing
key -- if null, the ADD created this entry, otherwise it found a pre-existing
one.

What's more, if you consolidate this ADD with the earlier LOOKUP by making one
ADD operation up front, then your initEntry hook in the miss case will store
the temporary "test" key uselessly in the entry, and you'll have to store again
the heapAtom->value persistent key.

So I don't think you want to use an initEntry hook here. 

Try the combined-lookup-with-conditional-create single ADD operation, with no
initEntry hook, and therefore with an explicit store of the newly-created
entry's key in the miss-case you discern by testing for a null key.  You should
find it's less code all around.

In any event, null-test PL_DHASH_ADD operation return values to detect OOM.

Wonder what the underlying bug was with PLHashTable -- I hope this won't just
move the crash around.

/be
Attachment #179087 - Flags: review?(brendan) → review-
Thanks for the review feedback.  I knew I queried the right person for this ;-)


> Use nsCRT::HashCode instead of essentially duplicating it?

Well, I'm computing a hash of the lowercased value of the string. 
nsCRT::HashCode is case-sensitive, so I can't use it without first generating a
lowercased copy of my input string.


I've used the ADD first approach in other code.  I should have remembered to use
it here.  Will do.
Attached patch v1.1 patch (obsolete) — Splinter Review
revised per suggestions.  i also added a few more enhancements.  overall, i'm
much happier with this solution.

i'm not so interested in the reason for the hang.  i'll chalk it up to some
oddity of plhash or perhaps an error in the way i was using it.
Attachment #179087 - Attachment is obsolete: true
Attachment #179143 - Flags: review?(brendan)
Comment on attachment 179143 [details] [diff] [review]
v1.1 patch

> static nsresult
> CreateAtomTable()
> {
>-    LOG(("CreateAtomTable\n"));
>-
>-    if (gHttpAtomTable)
>-        return NS_OK;
> 
>-    gHttpAtomTable = PL_NewHashTable(128, (PLHashFunction) StringHash,
>-                                          (PLHashComparator) StringCompare,
>-                                          (PLHashComparator) 0, 0, 0);
>-    if (!gHttpAtomTable)
>+    // The capacity for this table is initialized to a value greater than the
>+    // number of known atoms (NUM_HTTP_ATOMS) because we expect to encounter a
>+    // few random headers right off the bat.
>+    if (!PL_DHashTableInit(&sAtomTable, &ops, nsnull,
>+                           sizeof(PLDHashEntryStub),
>+                           NUM_HTTP_ATOMS + 10))
>         return NS_ERROR_OUT_OF_MEMORY;

Set sAtomTable.ops = nsnull before this OOM return.  May as well assert ops is
null on entry to CreateAtomTable too.

Good to see the string allocation fused with the nsHttpAtom!  Tight, yo. ;-)

/be
Attachment #179143 - Flags: review?(brendan) → review+
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Backed out to hopefully resolve tbox orange.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
So, I found two problems with the previous patch:

1) With the switch to remove initEntry, I forgot to set the key after the ADDs
in CreateAtomTable.

2) ResolveAtom can be called from the socket transport thread.  This means that
we need a mutex around the PL_DHashTableOperate calls.  This probably explains
the original bug.

New patch coming up.
Status: REOPENED → ASSIGNED
Attached patch v1.2 patchSplinter Review
Attachment #179143 - Attachment is obsolete: true
Attachment #179179 - Flags: review?(brendan)
Brendan, please see this interdiff for easy reviewing.
have you considered using one of the templatized hashtable classes we have?
Comment on attachment 179179 [details] [diff] [review]
v1.2 patch

Heh, I knew the real underlying bug would shake loose -- glad you found it
quickly.

/be
Attachment #179179 - Flags: review?(brendan) → review+
> have you considered using one of the templatized hashtable classes we have?

Yes, and they don't fit my needs (i.e., the value stored in the hash table is
the key).
I think I should also call CreateAtomTable from nsHttpHandler::Init because the
first call to ResolveAtom might appear on a any thread.
fixed-on-trunk

I moved the CreateAtomTable call into nsHttpHandler::Init
Status: ASSIGNED → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: