Closed
Bug 192207
Opened 22 years ago
Closed 22 years ago
M130B Trunk crash [@ 0x00000000 - PL_DHashTableOperate] (really in nsSocketTransportService::RememberHost)
Categories
(Core :: Networking, defect, P1)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla1.3final
People
(Reporter: darin.moz, Assigned: darin.moz)
References
Details
(4 keywords, Whiteboard: fixed1.3)
Crash Data
Attachments
(1 file, 3 obsolete files)
18.12 KB,
patch
|
brendan
:
review+
dougt
:
superreview+
dbaron
:
approval1.3+
|
Details | Diff | Splinter Review |
topcrash @nsSocketTransportService::RememberHost
0x00000000
PL_DHashTableOperate [xpcom/ds/pldhash.c line 480]
nsSocketTransportService::RememberHost
[netwerk/base/src/nsSocketTransportService2.cpp line 294]
nsSocketTransport::OnSocketConnected [netwerk/base/src/nsSocketTransport2.cpp
line 1114]
nsSocketTransport::OnSocketReady [netwerk/base/src/nsSocketTransport2.cpp line 1262]
nsSocketTransportService::Run [netwerk/base/src/nsSocketTransportService2.cpp
line 516]
nsThread::Main [xpcom/threads/nsThread.cpp line 134]
_PR_NativeRunThread [pruthr.c line 455]
msvcrt.dll + 0x27fb8 (0x77c07fb8)
kernel32.dll + 0x1d33b (0x77e5d33b)
Assignee | ||
Comment 1•22 years ago
|
||
also, this stack trace:
0x00000000
PL_DHashTableOperate [xpcom/ds/pldhash.c, line 480]
nsSocketTransportService::LookupHost
[netwerk/base/src/nsSocketTransportService2.cpp, line 275]
nsSocketTransport::ResolveHost [netwerk/base/src/nsSocketTransport2.cpp, line 747]
nsSocketTransport::OnSocketEvent [netwerk/base/src/nsSocketTransport2.cpp, line
1208]
nsSocketTransportService::Run [netwerk/base/src/nsSocketTransportService2.cpp,
line 548]
nsThread::Main [xpcom/threads/nsThread.cpp, line 134]
_PR_NativeRunThread [pruthr.c, line 455]
msvcrt.dll + 0x27fb8 (0x77c07fb8)
kernel32.dll + 0x1d33b (0x77e5d33b)
Status: NEW → ASSIGNED
Flags: blocking1.3?
Priority: -- → P1
Target Milestone: --- → mozilla1.3beta
Updated•22 years ago
|
Assignee | ||
Updated•22 years ago
|
Target Milestone: mozilla1.3beta → mozilla1.3final
Comment 3•22 years ago
|
||
Darin, any hope for 1.3? We're supposed to be done with it (or at least have it
on a branch) by now.
/be
Assignee | ||
Comment 4•22 years ago
|
||
brendan: nope, this bug is a real mystery to me. i don't understand how the
PLDHashTable could not be initialized. any thoughts?
Comment 5•22 years ago
|
||
Updating summary for tracking. Here are a few sets of crashes from Talkback data:
Count Offset Real Signature
[ 46 0x00000000 59282756 - PL_DHashTableOperate ]
Crash date range: 2003-02-08 to 2003-02-16
Min/Max Seconds since last crash: 34 - 296449
Min/Max Runtime: 34 - 449139
Count Platform List
44 Windows NT 5.0 build 2195
2 Windows NT 4.0 build 1381
Count Build Id List
40 2003021008
3 2003021108
1 2003020904
1 2003020808
1 2003020708
No of Unique Users 19
Stack trace(Frame)
0x00000000
PL_DHashTableOperate [c:/builds/seamonkey/mozilla/xpcom/ds/pldhash.c line 480]
nsSocketTransportService::RememberHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 357]
(17253177) URL: www.gbcentrum.com
(17250720) Comments: The system was idle and the LAN was not connected to
the Internet at the time the failure occurred. It was probably the mail client
trying to reach the mail server.
(17216963) URL: http://www.cnn.com/WORLD
(17206814) URL: yahoo.de
(17164310) Comments: Opened one to many windows? It was working so well
then BLAM! when I clicked a link.
(17134954) URL: ftp://priede.bf.lu.lv
(17134954) Comments: Get files with Leech plugin.
(17132263) URL: ftp://priede.bf.lu.lv
(17132263) Comments: Trying to download files with leech plugin from
ftp://pride.bf.lu.lv
(17088991) Comments: moz running over night. hello. it's a new day - it's
a new bug. :)
====================================================================================================
Count Offset Real Signature
[ 17 0x00000000 6094ca9a - PL_DHashTableOperate ]
[ 7 0x00000000 12f0253a - PL_DHashTableOperate ]
Crash date range: 2003-02-08 to 2003-02-16
Min/Max Seconds since last crash: 20 - 232263
Min/Max Runtime: 68 - 232263
Count Platform List
24 Windows NT 5.1 build 2600
Count Build Id List
9 2003021104
5 2003020908
4 2003021408
3 2003021008
1 2003021508
1 2003021504
1 2003020808
No of Unique Users 5
Stack trace(Frame)
0x00000000
PL_DHashTableOperate [c:/builds/seamonkey/mozilla/xpcom/ds/pldhash.c line 480]
nsSocketTransportService::LookupHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 334]
nsSocketTransport::ResolveHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 747]
nsSocketTransport::OnSocketEvent
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 1211]
nsSocketTransportService::ServiceEventQ
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 277]
nsSocketTransportService::Run
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 613]
nsThread::Main [c:/builds/seamonkey/mozilla/xpcom/threads/nsThread.cpp line 134]
_PR_NativeRunThread [pruthr.c line 455]
msvcrt.dll + 0x27fb8 (0x77c07fb8)
kernel32.dll + 0x1d33b (0x77e5d33b)
(17253235) URL: naral.org
(17253235) Comments: Trying to read email from Naral.org
(17131755) URL: zapo.net
(17131755) Comments: Logging into web based email..... the 1.3b release
seems crash prone
====================================================================================================
Count Offset Real Signature
[ 16 0x00000000 0c76354a - PL_DHashTableOperate ]
[ 12 0x00000000 7e12daea - PL_DHashTableOperate ]
Crash date range: 2003-02-09 to 2003-02-16
Min/Max Seconds since last crash: 102 - 137451
Min/Max Runtime: 716 - 224712
Count Platform List
27 Windows NT 5.1 build 2600
1 Windows NT 5.0 build 2195
Count Build Id List
13 2003021008
7 2003021104
3 2003021408
2 2003021508
2 2003020908
1 2003021504
No of Unique Users 10
Stack trace(Frame)
0x00000000
PL_DHashTableOperate [c:/builds/seamonkey/mozilla/xpcom/ds/pldhash.c line 480]
nsSocketTransportService::RememberHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 353]
nsSocketTransport::OnSocketConnected
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 1117]
nsSocketTransport::OnSocketReady
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 1269]
nsSocketTransportService::Run
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 584]
nsThread::Main [c:/builds/seamonkey/mozilla/xpcom/threads/nsThread.cpp line 134]
_PR_NativeRunThread [pruthr.c line 455]
msvcrt.dll + 0x27fb8 (0x77c37fb8)
kernel32.dll + 0x1d33b (0x77e7d33b)
(17253138) URL: naral.org
(17253138) Comments: Trying to access their Donate To Naral link
(17232094) Comments: Trying to access open a bookmark in a new tab
(17207040) Comments: another groupmark crash see bug 192744
(17166422) Comments: opening tabs in the background
(17146778) Comments: turning off junk mail controls didn't help . . .
(17144443) Comments: Another crash while automatically checking email.
Might try disabling the junk controls see if that has any effect.
(17143454) Comments: Nothing. I think it was automatically checking for
emails at the time.
(17128908) Comments: opening groupmark
(17106677) Comments: opening groupmark
(17098969) Comments: opening group of tabs
(17098646) URL: http://www.acme.com/heartmaker/hearts.cgi
====================================================================================================
Count Offset Real Signature
[ 8 0x00000000 09ce90c5 - PL_DHashTableOperate ]
Crash date range: 2003-02-11 to 2003-02-16
Min/Max Seconds since last crash: 43 - 209555
Min/Max Runtime: 171 - 351058
Count Platform List
8 Windows NT 5.0 build 2195
Count Build Id List
7 2003021008
1 2003021404
No of Unique Users 7
Stack trace(Frame)
0x00000000
PL_DHashTableOperate [c:/builds/seamonkey/mozilla/xpcom/ds/pldhash.c line 480]
nsSocketTransportService::LookupHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransportService2.cpp
line 338]
nsSocketTransport::ResolveHost
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 747]
nsSocketTransport::OnSocketEvent
[c:/builds/seamonkey/mozilla/netwerk/base/src/nsSocketTransport2.cpp line 1208]
(17221006) Comments: Had just opened browser - one window open to my mail
web gateway. Opened Freenet gateway - clicked on a link there - started to load
- program crashed. Fix it! ;-)
(17179078) URL: pop-up window at www.lacasadelcdvirgen.com.ar
Added qawanted to see if we can get this reproduced.
Keywords: qawanted
Summary: topcrash @nsSocketTransportService::RememberHost → M130B Trunk crash [@ 0x00000000 - PL_DHashTableOperate] (really in nsSocketTransportService::RememberHost)
Comment 6•22 years ago
|
||
*** Bug 192744 has been marked as a duplicate of this bug. ***
I couldn't duplicate this bug based on "user comments" from talkback report. I
tried naral.org, zapo.net, opening groupmarks, marking mail msgs as junk etc.
:(
Assignee | ||
Comment 8•22 years ago
|
||
suresh: right, me either. there's something really screwy going on here.
Comment 9•22 years ago
|
||
For the record, I've been trying to reproduce this as well...with no luck. I'll
keep an eye out for more user comments/urls to try.
Comment 10•22 years ago
|
||
Hello,
I was the reporter for bug 192744 which was marked as a duplicate of this bug. I
could reproduce this bug :
opened groupmarks with usual account -> crash
Cleaned profile (deleted *.rdf files, xul.mfl, chrome folder, Cache folder,
*.dat files), repeated operation : another crash
Created new profile, imported bookmarks, clicked on the same groupmark -> no crash
added user.js customizations from old profile into new profile -> no crash
directly copied prefs.js from old profile to new profile -> CRASH
The bug seems to be somewhere into the preferences set in my prefs file.
TB17278040Q
Comment 11•22 years ago
|
||
If a programmer wants my prefs.js I can send it to him by email but I'd rather
not have it attached to a bugzilla file report since it countains information
about my email accounts and some of them do not receive spam yet :-)
Comment 12•22 years ago
|
||
Ok, it looks like i spoke too soon. Now none of my profiles crashes with this
groupmarks...
This groupmark had the following sites :
http://www.osnews.com/
http://solutions.journaldunet.com/
http://www.journaldunet.com/
http://fr.news.yahoo.com/101/
http://www.mozillazine.org/
http://news.zdnet.fr/
http://www.webstandards.org/
http://fr.news.yahoo.com/32/
Since most of these sites are news sites, perhaps the problem is not the site
but a certain type of banner add that would make mozilla crash when it is
displayed...
Assignee | ||
Comment 13•22 years ago
|
||
spoke with brendan about this. no chance that pldhash is doing something wrong
here. in fact, it never nulls out the |ops| member var. chances are something
is corrupting memory. after inspecting the socket transport service a bit, it
appears that some of the array bounds are not completely protected. i'm going
to put together a safety patch to eliminate any possibility of memory corruption
when too many sockets are in use.
Assignee | ||
Comment 14•22 years ago
|
||
Assignee | ||
Comment 15•22 years ago
|
||
Comment on attachment 114861 [details] [diff] [review]
patch: stronger bounds checking
summary of changes:
1- AttachSocket: utilize AddToIdleList so that we are only incrementing
mIdleCount in one place.
2- add runtime checks to enforce array bounds.
3- make MoveToPollList and MoveToIdleList handle errors.
4- eliminate unnecessary mServicingEventQ flag.
Attachment #114861 -
Flags: superreview?(bzbarsky)
Attachment #114861 -
Flags: review?(brendan)
Assignee | ||
Comment 16•22 years ago
|
||
also, it's likely that socket transport leaks (e.g., bug 191835) could explain
the ABR problem.
Comment 17•22 years ago
|
||
Comment on attachment 114861 [details] [diff] [review]
patch: stronger bounds checking
>- // find out what list this is on...
>- PRInt32 index = sock - mActiveList;
>- if (index > 0 && index <= NS_SOCKET_MAX_COUNT)
>+ // find out what list this is on.
>+ PRUint32 index = sock - mActiveList;
>+ if (index <= NS_SOCKET_MAX_COUNT)
> RemoveFromPollList(sock);
> else
> RemoveFromIdleList(sock);
mActiveList[0] seems to be unused, given the ++mActiveCount below:
> nsresult
> nsSocketTransportService::AddToPollList(SocketContext *sock)
> {
> LOG(("nsSocketTransportService::AddToPollList [handler=%x]\n", sock->mHandler));
>
>- NS_ASSERTION(mActiveCount < NS_SOCKET_MAX_COUNT, "too many active sockets");
>+ if (mActiveCount == NS_SOCKET_MAX_COUNT) {
>+ NS_ERROR("too many active sockets");
>+ return NS_ERROR_UNEXPECTED;
>+ }
>
> memcpy(&mActiveList[++mActiveCount], sock, sizeof(SocketContext));
Do you really need mActiveList and mPollList to be NS_SOCKET_MAX_COUNT+1 in
length, while mIdleList is NS_SOCKET_MAX_COUNT? My suggestion over irc about
PRUint32 based single-ended range checks allows an mActiveList index of 0,
where the old assertions did not. The for loop from NS_SOCKET_MAX_COUNT down
to 1 would have to change, of course....
/be
Assignee | ||
Comment 18•22 years ago
|
||
brendan:
yeah, ultimately i want to change mActiveList to be based at index 0 instead of
index 1. that was not a good design decision on my part to base it at index 1.
i was thinking of making that change post 1.3 final since it seems more risky,
and i don't fear my checks allowing access to mActiveList[0] since at least that
wouldn't be corrupting memory :-/
that said, i can certainly put together a patch to change mActiveList to be
based at index 0 instead. let me know if that's what you would prefer. maybe
it won't turn out to be that much more complex. hmm...
Assignee | ||
Comment 19•22 years ago
|
||
i have a better patch in mind...
Assignee | ||
Comment 20•22 years ago
|
||
Attachment #114861 -
Attachment is obsolete: true
Assignee | ||
Comment 21•22 years ago
|
||
Comment on attachment 115108 [details] [diff] [review]
v1.1 patch: revised per brendan's comments
ok, i feel a lot better about this patch. mActiveList[i] => mPollList[i+1].
this actually ended up simplifying a bunch of the array manipulations since
mActiveList is now based at index 0 :)
Attachment #115108 -
Flags: review?(brendan)
Comment 22•22 years ago
|
||
Comment on attachment 115108 [details] [diff] [review]
v1.1 patch: revised per brendan's comments
Only thought is to use struct assignment and not memcpy, as SocketContext is a
two-word struct. Gcc may inline equivalently; wonder what MSVC does.
Nit: "take care to only check idle sockets that were idle to begin with ;-)" --
transpose "only" and "check".
Where's the .h file patch?
/be
Attachment #115108 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 23•22 years ago
|
||
*** Bug 194402 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 25•22 years ago
|
||
Comment on attachment 115167 [details] [diff] [review]
v1.2 patch: yeah, structure assignment does make a lot more sense
carrying forward r=brendan, requesting sr= from bz.
Attachment #115167 -
Flags: superreview?(bzbarsky)
Attachment #115167 -
Flags: review+
Comment 26•22 years ago
|
||
Comment on attachment 115167 [details] [diff] [review]
v1.2 patch: yeah, structure assignment does make a lot more sense
Do you need an assignment operator? If struct SocketContext is plain old data,
I think you should let the compiler select the best instructions (64-bit if
possible on new architectures).
/be
Attachment #115167 -
Flags: review+
Assignee | ||
Comment 27•22 years ago
|
||
brendan: whoa.. good point. new patch coming up.
Assignee | ||
Comment 28•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #115167 -
Attachment is obsolete: true
Assignee | ||
Updated•22 years ago
|
Attachment #115168 -
Flags: superreview?(bzbarsky)
Attachment #115168 -
Flags: review?(brendan)
Assignee | ||
Updated•22 years ago
|
Attachment #114861 -
Flags: superreview?(bzbarsky)
Attachment #114861 -
Flags: review?(brendan)
Assignee | ||
Updated•22 years ago
|
Attachment #115167 -
Flags: superreview?(bzbarsky)
Comment 29•22 years ago
|
||
Comment on attachment 115168 [details] [diff] [review]
v1.3 patch
r=me, thanks.
/be
Attachment #115168 -
Flags: review?(brendan) → review+
Comment 30•22 years ago
|
||
I won't be able to super-review this until at least a week from now....
Assignee | ||
Updated•22 years ago
|
Attachment #115168 -
Flags: superreview?(bzbarsky) → superreview?(dougt)
Comment 31•22 years ago
|
||
Comment on attachment 115168 [details] [diff] [review]
v1.3 patch
+ nsresult rv = AddToIdleList(&sock);
+ if (NS_SUCCEEDED(rv))
+ NS_ADDREF(handler);
Does it make sense to hide the addref within AddToIdleList?
the patch doesn't apply cleanly to the tree.
Attachment #115168 -
Flags: superreview?(dougt) → superreview+
Assignee | ||
Comment 32•22 years ago
|
||
doug: AddToIdleList is also called from MoveToIdleList. in that case the STS
already owns a reference to the handler, and it just needs to move the reference
from the active list to the idle list. so, i think it makes sense for
AttachSocket to be responsible for the ADDREF. NOTE: the release happens in
DetachSocket. thx for the review!
Assignee | ||
Comment 33•22 years ago
|
||
Comment on attachment 115168 [details] [diff] [review]
v1.3 patch
requesting drivers approval for the 1.3 branch. this fixes a topcrash in the
socket code that is most easily hit when publishing a big document via FTP (see
bug 194402). reviewed by dougt and brendan. thx!
Attachment #115168 -
Flags: approval1.3?
Attachment #115168 -
Flags: approval1.3? → approval1.3+
Assignee | ||
Comment 34•22 years ago
|
||
fixed-on-trunk
Assignee | ||
Comment 35•22 years ago
|
||
fixed1.3
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Updated•22 years ago
|
Whiteboard: fixed1.3
Updated•13 years ago
|
Crash Signature: [@ 0x00000000 - PL_DHashTableOperate]
You need to log in
before you can comment on or make changes to this bug.
Description
•