Closed Bug 290190 Opened 16 years ago Closed 14 years ago

crash when network connection drops and reconnects [@ msvcrt.dll - nsDNSRecord::GetNextAddr ]

Categories

(Core :: Networking, defect, P3)

RESOLVED DUPLICATE of bug 337418
mozilla1.9alpha1

People

(Reporter: darin.moz, Unassigned)

Details

(Keywords: crash)

Attachments

(1 file)

Crash when the network connection drops and reconnects while using Gmail.

see:
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=5046300
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla1.8beta2
This has happened to our product too. I couldn't figure out what to do with it.
Summary: crash when network connection drops and reconnects → crash when network connection drops and reconnects [@ msvcrt.dll - nsDNSRecord::GetNextAddr ]
*** Bug 293111 has been marked as a duplicate of this bug. ***
Also happens on Mac OS X...

Stack Signature	 0xffff89c0 0d35ece4
Product ID	Firefox10
Build ID	2005071117
Trigger Time	2005-07-17 19:59:30.0
Platform	MacOSX
Operating System	Darwin 8.2.0
Module	
URL visited	
User Comments	
Since Last Crash	298633 sec
Total Uptime	298633 sec
Trigger Reason	SIGBUS: Bus Error: (signal 10)
Source File, Line No.	N/A
Stack Trace	
0xffff89c0
nsDNSRecord::GetNextAddr()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/netwerk/dns/src/nsDNSService2.cpp, line 122]
nsSocketTransport::OnSocketDetached()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp, line 1442]
nsSocketTransportService::DetachSocket(nsSocketTransportService::SocketContext*)()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/netwerk/base/src/nsSocketTransportService2.cpp, line 187]
nsSocketTransportService::Run()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/netwerk/base/src/nsSocketTransportService2.cpp, line 548]
nsThread::Main()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/xpcom/threads/nsThread.cpp, line 607]
_pt_root()  [/builds/tinderbox/Fx-Aviary1.0.1/Darwin_7.9.0_Depend/mozilla/nsprpub/pr/src/pthreads/ptthread.c, line 217]
libSystem.B.dylib.88.0.0 + 0x2c3d4 (0x9002c3d4)
OS: Windows XP → All
Hardware: PC → All
Target Milestone: mozilla1.8beta2 → mozilla1.8beta5
Flags: blocking1.8rc1?
This looks high-profile. If we can get a low-risk patch, we should take it.
Flags: blocking1.8rc1? → blocking1.8rc1+
It seems like we are crashing on the memcpy in nsDNSRecord::GetNextAddr, which
implies that the nsHostRecord's addr_info field is null.  In that case, we
expect the addr field to be non-null.  The addr_info field is null and the addr
field is non-null when the hostname was an IP address literal that could simply
be parsed.  I doubt this patch will actually help.  I suspect it is more likely
that the nsHostRecord data structure is getting corrupted, but I figure that
this patch is worth a shot since it might help.  If it does, then at least we
know that addr is somehow ending up null, and we can go from there.  Otherwise,
we're no worse off than we are now.
Attachment #199372 - Flags: review?(cbiesinger)
Attachment #199372 - Flags: superreview?(bzbarsky)
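For readers following the reasoning above, here is a minimal sketch of the kind of guard that comment describes. It is not the contents of attachment 199372. The types NetAddr, HostRecord, and DnsRecord are hypothetical stand-ins for PRNetAddr, nsHostRecord, and nsDNSRecord, and the port is stored without byte-order conversion to keep the example portable. The only point illustrated is that the literal-address path checks addr before the memcpy instead of assuming it is non-null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for PRNetAddr: raw address bytes plus a port.
struct NetAddr {
    unsigned char raw[16];
    unsigned short port;
};

// Hypothetical stand-in for nsHostRecord. Exactly one of the two fields is
// expected to be set: addr_info after a real lookup, addr for an IP literal.
struct HostRecord {
    const std::vector<NetAddr>* addr_info = nullptr;
    const NetAddr* addr = nullptr;
};

enum class Result { Ok, NotAvailable };

// Hypothetical stand-in for nsDNSRecord.
class DnsRecord {
public:
    explicit DnsRecord(const HostRecord& rec) : rec_(rec) {}

    Result GetNextAddr(unsigned short port, NetAddr* out) {
        if (done_) return Result::NotAvailable;

        if (rec_.addr_info) {
            // Resolved hostname: walk the list of addresses.
            if (index_ >= rec_.addr_info->size()) {
                done_ = true;
                return Result::NotAvailable;
            }
            *out = (*rec_.addr_info)[index_++];
        } else {
            // IP literal path. The guard: bail out instead of copying
            // from a null pointer when addr is unexpectedly missing.
            if (!rec_.addr) {
                done_ = true;
                return Result::NotAvailable;
            }
            std::memcpy(out, rec_.addr, sizeof(NetAddr));
            done_ = true;  // a literal yields exactly one address
        }
        out->port = port;
        return Result::Ok;
    }

private:
    HostRecord rec_;
    std::size_t index_ = 0;
    bool done_ = false;
};
```

A guard like this only turns one specific null dereference into an error return. If the record is being corrupted or freed underneath the caller, as suspected above, the pointer can be non-null and still invalid, and the check will not help.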
is there some way to find out from the talkback data whether that variable was
null or not?
That information is not exposed on http://talkback-reports.mozilla.org/ :-(
Comment on attachment 199372 [details] [diff] [review]
v1 patch - bandaid

sr=bzbarsky if we come up with nothing better...
Attachment #199372 - Flags: superreview?(bzbarsky) → superreview+
Jay, can you get darin local var data and such for the relevant talkbacks?
Comment on attachment 199372 [details] [diff] [review]
v1 patch - bandaid

r=biesi, I guess. But understanding why this is not a valid pointer would be
good...
Attachment #199372 - Flags: review?(cbiesinger) → review+
Attachment #199372 - Flags: approval1.8rc1?
Has anyone been able to reproduce this with a recent 1.5 branch build?  Is this
a problem on Win32 and Linux, or just MacOSX?

I tried looking for more detailed info on these crashes, couldn't find any.  But
if anyone is able to reproduce this with a recent build, post your incident id
so I can try to dig up any details that we might be collecting.  I'm pretty sure
our best bet for local var info is with Win32 crashes...if I get a chance to
look it up within 1-2 days of submission (Talkback db problems force us to
delete a lot of detailed info every couple of days).
How would you guys like to proceed with this bug? Wait another day to see if we
can get local var data from someone who crashes on this in the next day or so?
Or do you want approval on the patch before that?
If this is a low risk patch, as it seems to be... we should take this and I can
keep an eye on Talkback data going forward.  If anyone is able to reproduce this
consistently with older builds, we can have them retest with the latest builds
with this fix to verify whether or not it actually fixed the problem.
Comment on attachment 199372 [details] [diff] [review]
v1 patch - bandaid

we'll take this and then watch talkback.
Attachment #199372 - Flags: approval1.8rc1? → approval1.8rc1+
fixed-on-trunk, fixed1.8
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Keywords: fixed1.8
Resolution: --- → FIXED
I think the band-aid was not good enough.  Reopening this bug.

L. David Baron wrote:
> This is a crash I saw while Thunderbird 1.5 was idle.  (I use it as a
> feedreader, so it was probably fetching feeds.)
>
> I've also noticed that sometimes it just stops fetching new feeds and I
> have to restart it -- that behavior *may* be associated with switching
> networks between home and office, but I'm not sure.  I've seen that
> quite a number of times, but I've only seen this crash once.  (I haven't
> seen any similar problems with Firefox trunk, though, so it seems more
> likely to be mail-specific.)
>
> I was wondering if you thought this was related to
> https://bugzilla.mozilla.org/show_bug.cgi?id=290190 or if there was
> anything else (e.g., PR_LOGging) I should investigate.
>
> The address in libc.so.6 is a non-instruction-aligned address in the
> middle of strstr.
>
> -David
>
> Incident ID: 14252378
> Stack Signature libc.so.6 + 0x6a58c (0x00b5058c) 2215a3c4
> Product ID      Thunderbird15
> Build ID        2005120113
> Trigger Time    2006-01-21 15:50:15.0
> Platform        LinuxIntel
> Operating System        Linux 2.6.14-1.1656_FC4
> Module  libc.so.6 + (0006a58c)
> URL visited     
> User Comments   unknown
> Since Last Crash        193 sec
> Total Uptime    193 sec
> Trigger Reason  SIGSEGV: Segmentation Fault: (signal 11)
> Source File, Line No.   N/A
> Stack Trace     
> libc.so.6 + 0x6a58c (0x00b5058c)
> PR_EnumerateAddrInfo()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/nsprpub/pr/src/misc/prnetdb.c, line 2149]
> nsDNSRecord::GetNextAddr()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/netwerk/dns/src/nsDNSService2.cpp, line 136]
> nsSocketTransport::RecoverFromError()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp, line 1223]
> nsSocketTransport::OnSocketDetached()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/netwerk/base/src/nsSocketTransport2.cpp, line 1534]
> nsSocketTransportService::DetachSocket(nsSocketTransportService::SocketContext*)()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/netwerk/base/src/nsSocketTransportService2.cpp, line 196]
> nsSocketTransportService::Run()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/netwerk/base/src/nsSocketTransportService2.cpp, line 605]
> nsThread::Main()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/xpcom/threads/nsThread.cpp, line 713]
> _pt_root()  [/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/nsprpub/pr/src/pthreads/ptthread.c, line 223]
> libpthread.so.0 + 0x5341 (0x00d2e341)
Status: RESOLVED → REOPENED
Priority: P2 → --
Resolution: FIXED → ---
Target Milestone: mozilla1.8beta5 → mozilla1.9alpha
Status: REOPENED → ASSIGNED
Priority: -- → P1
Using Talkback's new feature for searching the top 5 stack frames, I found 3 FF1.5 crashes and 1 TB1.5 crash involving nsDNSRecord::GetNextAddr.  By comparison, there are roughly 160 FF1.0/TB1.0/MZ1.7 crashes.
Priority: P1 → P3
This looks very much like bug 337418.
Clearing the fixed flag. I'm fairly certain that it isn't fixed on branch either.

And I don't think the structure was null.

The comparison is useless: you want the percentage of total crashes for each release, because the 1.7 series has been deployed much longer.
Keywords: fixed1.8
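The normalization suggested above can be sketched as a one-line helper. The function name CrashSharePercent is made up for this illustration, and the per-release crash totals it needs are not given anywhere in this bug, so none are assumed here.

```cpp
#include <cassert>

// Percentage of a release's total crashes that carry a given signature.
// Returns 0 when no total is available, to avoid dividing by zero.
double CrashSharePercent(double signatureCrashes, double totalCrashes) {
    if (totalCrashes <= 0.0) return 0.0;
    return 100.0 * signatureCrashes / totalCrashes;
}
```

Comparing CrashSharePercent(4, total15) against CrashSharePercent(160, total10), where total15 and total10 are the total Talkback crash counts for the 1.5 and 1.0/1.7 series, would say whether the band-aid reduced the crash rate or whether the raw counts differ only because the older series has been deployed longer.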
-> reassign to default owner
Assignee: darin.moz → nobody
Status: ASSIGNED → NEW
QA Contact: benc → networking
i agree with colin
Status: NEW → RESOLVED
Closed: 16 years ago → 14 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 337418
Depends on: 426060
Crash Signature: [@ msvcrt.dll - nsDNSRecord::GetNextAddr ]