Closed Bug 739023 Opened 13 years ago Closed 13 years ago

crash in nsHttpChannel::OnDataAvailable

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla14
Tracking Status
firefox13 --- unaffected
firefox14 + fixed

People

(Reporter: scoobidiver, Assigned: mcmanus)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [native-crash])

Crash Data

Attachments

(1 file, 1 obsolete file)

It first appeared in 14.0a1/20120324 and is currently #2 top crasher in 14.0a1 over the last day. The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ab2ff3b5611f&tochange=df1f94b2bdee Signature nsHttpChannel::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) More Reports Search UUID 9f2a61bb-5e01-45e9-874a-3f2962120325 Date Processed 2012-03-25 06:31:45 Uptime 1401 Last Crash 27.1 minutes before submission Install Age 15.6 hours since version was first installed. Install Time 2012-03-24 14:52:24 Product Firefox Version 14.0a1 Build ID 20120324031100 Release Channel nightly OS Windows NT OS Version 6.1.7601 Service Pack 1 Build Architecture x86 Build Architecture Info GenuineIntel family 6 model 37 stepping 2 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x1c App Notes AdapterVendorID: 0x1002, AdapterDeviceID: 0x68c1, AdapterSubsysID: 395117aa, AdapterDriverVersion: 8.772.0.0 Has dual GPUs. GPU #2: AdapterVendorID2: 0x8086, AdapterDeviceID2: 0x0046, AdapterSubsysID2: 395117aa, AdapterDriverVersion2: 8.772.0.0D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ EMCheckCompatibility True Total Virtual Memory 4294836224 Available Virtual Memory 3541532672 System Memory Use Percentage 28 Available Page File 3622756352 Available Physical Memory 2904408064 Frame Module Signature [Expand] Source 0 xul.dll nsHttpChannel::OnDataAvailable netwerk/protocol/http/nsHttpChannel.cpp:4596 1 xul.dll nsInputStreamPump::OnStateTransfer netwerk/base/src/nsInputStreamPump.cpp:514 2 xul.dll nsInputStreamPump::OnInputStreamReady netwerk/base/src/nsInputStreamPump.cpp:402 3 xul.dll nsInputStreamReadyEvent::Run xpcom/io/nsStreamUtils.cpp:114 4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:657 5 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110 6 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:201 7 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:175 8 xul.dll nsBaseAppShell::Run widget/xpwidgets/nsBaseAppShell.cpp:189 9 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:267 10 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:295 11 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3703 ... More reports at: https://crash-stats.mozilla.com/report/list?signature=nsHttpChannel%3A%3AOnDataAvailable%28nsIRequest*%2C+nsISupports*%2C+nsIInputStream*%2C+unsigned+int%2C+unsigned+int%29
Whiteboard: [native-crash]
I just wanted to note in the bug that this is at the top of my coding list, I'm just out of the office this week and it is not clear if I'll have time to tackle it until next monday.
Assignee: nobody → mcmanus
I get this crash on some pages only if i set network.http.pipelining.aggressive true.
(In reply to SpeciesX from comment #2) > I get this crash on some pages only if i set > network.http.pipelining.aggressive true. if you have a url (or a set) where it seems to show up that would be helpful.
Keywords: needURLs
Crashes happens randomly and in my environment, firefox crashes even if network.http.pipelining.aggressive is set off. As a tendency, once a crash happened with a site, it repeats 2 or 3 times with same site. But once opened correctly, no more crashes...
(In reply to Patrick McManus [:mcmanus] from comment #3) > if you have a url (or a set) where it seems to show up that would be helpful. These crash with 100% reliability over here on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120327 Firefox/14.0a1 ID:20120327031149 Anything *.google.com https://support.xobni.com/entries/20151838 To add insult to injury, break pad is not catching these reliably. After crashing all morning today eventually bug 739720 was captured, not sure if it is related.
Blocks: 739762
mresponsehead in nshttpchannel::ondataavailable() is pretty clearly null from those reports. Presumably from TakeResponseHead(). (I can't reproduce, but there are lots of reports :)) that of course should be impossible.
I didn't check the report more deeply but couldn't mResponseHead rather be invalid pointer? Do you think this is related to the restart logic? Maybe we could try to land a small patch to disable some of these functionality bits to identify the crash. It is quit frequent, so we should figure out quickly ;) I can take care of it if you agree, but I also have a busy week...
(In reply to Honza Bambas (:mayhemer) from comment #7) > I didn't check the report more deeply but couldn't mResponseHead rather be > invalid pointer? all the crashes are on deref of 0x1c or 0x20 (dep on 32 or 64 bit build), so It seems very likely to be a member off the null object. Some of the other reports look like invalid ptrs :)
(by other reports I meant differnet bugzilla entries - that was unclear)
honza, given that its hard to really look into this this week - would you considering r? and pushing this for me? We could get data on the crashrate.. diff --git a/netwerk/protocol/http/nsHttpTransaction.cpp b/netwerk/protocol/http /nsHttpTransaction.cpp --- a/netwerk/protocol/http/nsHttpTransaction.cpp +++ b/netwerk/protocol/http/nsHttpTransaction.cpp @@ -835,16 +835,18 @@ nsHttpTransaction::PipelinePosition() // nsHttpTransaction <private> //----------------------------------------------------------------------------- nsresult nsHttpTransaction::RestartInProgress() { NS_ASSERTION(PR_GetCurrentThread() == gSocketThread, "wrong thread"); + return NS_ERROR_FAILURE; + // Lock RestartInProgress() and TakeResponseHead() against main thread MutexAutoLock lock(*nsHttp::GetLock()); // don't try and restart 0.9 if (mHaveAllHeaders && !mRestartInProgressVerifier.IsSetup()) return NS_ERROR_NET_RESET; LOG(("Will restart transaction %p and skip first %lld bytes, "
Don't know if this helps - my FF14 crashes when pipelining is enabled and Avira Webguard is running, which AFAIK acts as a proxy. Disabling pipelining or Webguard makes the browser stable again. (When I first had the problem, I found no difference between running with or withoug Webguard, during the last days this has changed to what I describe above)
(In reply to Bernd Wolff from comment #11) > Don't know if this helps - my FF14 crashes when pipelining is enabled and > Avira Webguard is running, which AFAIK acts as a proxy. Disabling pipelining > or Webguard makes the browser stable again. > > (When I first had the problem, I found no difference between running with or > withoug Webguard, during the last days this has changed to what I describe > above) Bernd, that's a helpful lead - thanks! I won't be able to track it down in earnest until next week but until then let me ask you is that Avira "antivirus" or Avira "Internet Security"? I don't see anything named webguard on their website (anymore?)
I have Avira Antivirus Premium 2012, Webguard is one of the components (the others being Mailguard and Realtime protection). IIRC "Internet Security" adds a firewall.
How to disable pipelining? All boolean values containing "pipelining" in about:config are false here and the crashes persist on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120328 Firefox/14.0a1 ID:20120328031211
http.network.pipelining was true in my config, after setting it to false the crashes stopped. My current config values found when filtering "http.pipe" are: network.http.pipelining.read-timeout;10000 network.http.pipelining.maxsize;300000 network.http.pipelining.max-optimistic-requests;4 network.http.pipelining.maxrequests;50 network.http.pipelining;false network.http.pipelining.aggressive;false network.http.pipelining.ssl;false Broadening the filter to use just "pipe", I find network.http.proxy.pipelining;true maxrequests is marked as "user set", but I cannot remember when or why I should have changed it.
(In reply to alex_mayorga from comment #14) > How to disable pipelining? > > All boolean values containing "pipelining" in about:config are false here > and the crashes persist on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) > Gecko/20120328 Firefox/14.0a1 ID:20120328031211 Then you probably have it off - its not clear that you need to have it on for this to happen. (though that is true for some users) are you running any firewall software like avira webguard? (comment 13) can you paste in any non default preferences you have set (about:support will tell you this)
(In reply to Patrick McManus [:mcmanus] from comment #16) > are you running any firewall software like avira webguard? (comment 13) No but I'm behind a corporate proxy (Websense IIRC) > can you paste in any non default preferences you have set (about:support > will tell you this) accessibility.typeaheadfind.flashBar 0 browser.cache.disk.capacity 1048576 browser.cache.disk.smart_size.first_run false browser.cache.disk.smart_size_cached_value 1048576 browser.places.smartBookmarksVersion 3 browser.startup.homepage_override.buildID 20120328031211 browser.startup.homepage_override.mstone 14.0a1 extensions.lastAppVersion 14.0a1 gfx.direct3d.prefer_10_1 true network.cookie.prefsMigrated true network.negotiate-auth.delegation-uris .work.net,.work.com network.negotiate-auth.trusted-uris .work.net,.work.com places.history.expiration.transient_current_max_pages 104254 privacy.sanitize.migrateFx3Prefs true security.warn_viewing_mixed false
Everything for "pipe" in about:config has "default" values: network.http.pipelining;false network.http.pipelining.aggressive;false network.http.pipelining.max-optimistic-requests;4 network.http.pipelining.maxrequests;32 network.http.pipelining.maxsize;300000 network.http.pipelining.read-timeout;10000 network.http.pipelining.ssl;false network.http.proxy.pipelining;false
The crashes caused by https://support.xobni.com/entries/20151838 are now being caught by breakpad on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:14.0) Gecko/20120329 Firefox/14.0a1 ID:20120329031156 \o/ bp-417dc8c5-3ef9-401f-83d3-701c02120329 bp-96bfabbc-e0a4-4ffa-b06a-89a192120329 Seems like it's bug 739024.
(In reply to Patrick McManus [:mcmanus] from comment #10) > honza, > > given that its hard to really look into this this week - would you > considering r? and pushing this for me? We could get data on the crashrate.. Landed investigation patch: https://hg.mozilla.org/integration/mozilla-inbound/rev/4df2d9a68d6e Please don't close the bug.
Whiteboard: [native-crash] → [native-crash] [DON'T CLOSE]
Some URLs as requested, these are for nsHttpChannel::OnDataAvailable: 107 \N 79 about:blank 10 http://www.repubblica.it/ 10 http://www.boerse.bz/ 10 http://0day.kiev.ua/ 8 http://www.harveynorman.com.au/ 7 http://www.t-online.de/ 6 http://www.112brabantnieuws.nl/ 5 http://www.facebook.com/ 5 https://www.facebook.com/ 5 http://forums.mozillazine.org/viewtopic.php?f=23&t=2448905&p=11857895 5 http://boards.4chan.org/b/ 5 about:newaddon?id=uriloader@pdf.js 4 http://www.tagesschau.de/ 4 http://www.penny-arcade.com/ 4 http://www.momanda.de/videos 4 http://www.facebook.com/ 4 http://de.mg40.mail.yahoo.com/ 4 http://boerse.bz/ 4 about:newtab 3 http://www.wetteronline.de/ 3 http://www.toyota.de/cars/coming_soon/index.tmex 3 http://www.repubblica.it/index.html?refresh_ce 3 http://www.getprice.com.au/D-Link-DHP-501AV-Powerline-AV-500-Network-Adapter-Starter-Kit-Gpnc_53--59619858.htm 3 http://www.geekstogo.com/forum/topic/297804-my-antivirus-doesnt-work/page__st__45 3 http://www.ebay.de/itm/Sennheiser-IE-8-Ear-Canal-Phones-Earphone-IE8-New-/290629761152?pt=Kopfh%C3%B6rer&hash=item43aae23880 3 http://www.divxtotal.com/ 3 http://www.computerbild.de/ 3 http://dimonvideo.ru/articles.html 3 http://boards.4chan.org/g/ 3 http://bay152.mail.live.com/ 2 http://zhan.renren.com/eblaster?checked=true 2 http://www.youtube.com/show/minecraft?s=1 2 http://www.youtube.com/ 2 http://www.unitymedia.de/entertainment/index.html?salesid_nrw=43180&salesid_hessen=43180&cid=UM-04019 2 http://www.sportschau.de/sp/ 2 http://www.sportmediaset.mediaset.it/ 2 http://www.pixmania.com/televisori/itit3_8_pm.html 2 http://www.nero.com/rus/store.html?NeroSID=78c09c7881514307b41870fb43a5e922 2 http://www.mobafire.com/league-of-legends/build/dutchbrothers-volibear-build-155239 2 http://www.inoprosport.ru/football/ 2 http://www.huyandex.com/ 2 http://www.gamekeyshop.net/ 2 http://www.footytube.com/index.php 2 http://www.flickr.com/photos/toyota-europe/sets/ 2 http://www.ekonja-verlag.com/ 2 http://www.dobreprogramy.pl/ 2 http://www.bvb.de/?Z%1B%E7%F4%9D 2 http://www.bom.gov.au/nsw/ 2 http://www.avclub.com/tvclub/tvshow/30-rock,23/ 2 http://www.abnehmen-aktuell.de/ 2 http://www5.407etr.com/tolls/rate-chart-2012.html 2 http://www.123video.nl/members/home.asp 2 http://well.ca/
Keywords: needURLs
Whiteboard: [native-crash] [DON'T CLOSE] → [native-crash] [leave open]
Target Milestone: --- → mozilla14
I'm getting this sporadically in SeaMonkey while loading my homepage (200+ tabs). It takes some time to happen, and doesn't happen every time: I've had it twice so far (Sm 2.11a1 [Gecko 14.0a1] on linux-x86_64). bp-ab98c71f-cc59-47f0-9b27-e54942120326 build ID: 20120326003030 bp-37bf9bc1-7bfe-4d4c-b7c9-2d34f2120331 build ID: 20120330003035
Version: 14 Branch → Trunk
I can repro this with avira webguard, pipelining on, and a build from 3/26 https://crash-stats.mozilla.com/report/index/bp-115d7e92-c6e4-457e-9b49-8aa722120403 https://crash-stats.mozilla.com/report/index/bp-a1084763-00c7-4430-b6c1-18fc42120403 https://crash-stats.mozilla.com/report/index/bp-4f3fa72a-1174-4fb8-952b-a45c12120403 disable pipelines and the crashes go away. proxy and ssl pipeline settings are not relevant. url is not consistent but redhat.com and ibm.com reproduce after a few seconds of clicking. crash stats at https://crash-stats.mozilla.com/report/list?signature=nsHttpChannel%3A%3AOnDataAvailable%28nsIRequest*%2C+nsISupports*%2C+nsIInputStream*%2C+unsigned+int%2C+unsigned+int%29 show this signature basically going away starting with the 3/31 nightly. The previous versions crashed 169, 135, 36, and 134 times - but 3/31 and 4/1 have just 1 crash a piece (and there are occasional instances of this crash from before the pipeline code). The patch in comment 20 that disabled restart-in-progress first appreared in the 3/31 nightly, so we can feel pretty confident the issue is in RIP (and that patch was worth doing for the data - thanks honza!) I've further confirmed by trying to repro with the 3/30 nightly (positive) and the 3/31 nightly (negative). All of this is on a windows 7 VM with avira webguard.
Attached patch patch 0 (obsolete) — Splinter Review
This turns out to be pretty simple. TakeResponseHeaders() was returning null if mHaveAllHeaders was false. In the case where we had done a restart mHaveAllHeaders is also reset until the restarted transaction has its headers arrive, even though if it is true at the time of reset the existing headers are set aside. Therefore in TakeResponseHeaders() any set-aside headers can always be returned without fear of them being incomplete. My testing here didn't show that race condition (i.e. I verified the setaside headers being returned from TakeResponseHeaders() but they mHaveAllHeaders happened to have migrated back to true at the time) The patch is just fyi for the moment while try verifies it and gives me real windows builds I can use to verify the fix. (I was able to simulate it and get the right stack trace on linux by injecting errors). [hmm.. it seems try is closed for the night :(] this patch also re-enables restartinprogress()
Attached patch patch 1Splinter Review
Update this to only enable restart-in-progress for 200/GET transactions. That's clearly its intended use case and I'm not 100% comfortable with other permutations right now. (the main fix in this patch is necessary for the race condition with 200/GET, so that limitation is not meant as a workaround - just a further refinement.)
Attachment #612070 - Attachment is obsolete: true
Whiteboard: [native-crash] [leave open] → [native-crash]
Attachment #612194 - Flags: review?(honzab.moz)
Worth noting that even though this was an implementation bug, this use case is a real success story for the overall patch set. Avira is terminating the pipelines during some large responses and the piepline code is adding a blacklist (so the problem does not recur), is rescheduling aborted parts of the pipeline, and fixing in place the image that was the large response. (well its fixing it in place with this patch applied :)). The avira "proxy" is really an lsp so this is not explicitly detectable otherwise.
Comment on attachment 612194 [details] [diff] [review] patch 1 Review of attachment 612194 [details] [diff] [review]: ----------------------------------------------------------------- r=honzab ::: netwerk/protocol/http/nsHttpTransaction.cpp @@ +373,5 @@ > + if (mForTakeResponseHead) { > + head = mForTakeResponseHead; > + mForTakeResponseHead = nsnull; > + return head; > + } Agrhh... yes, now I can see it. Hell, I had to catch this during the review! Good work finding it. @@ +1621,5 @@ > if (mSetup) > return; > > + // If mSetup does not transition to true then this verifier cannot > + // be used for restartInProgress Maybe: If mSetup does not transition to true restart in progress is later forbidden.
Attachment #612194 - Flags: review?(honzab.moz) → review+
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: