Closed Bug 960902 Opened 11 years ago Closed 11 years ago

HTTP cache v2: crash in mozilla::net::nsHttpChannel::ProcessResponse(), mAuthProvider null, on opening multiple MozillaWiki pages

Categories

(Core :: Networking: HTTP, defect)

23 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: mayankleoboy1, Assigned: mayhemer)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(4 files)

Attached file about support.txt
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 (Beta/Release) Build ID: 20140116030250 Steps to reproduce: Enable the new HTTP cache i opened 3 Mozilla wiki pages in 3 different tabs at once, and on each page, i opened a few links in further new tabs Actual results: crash with signature like : https://crash-stats.mozilla.com/report/index/129817d2-0a8d-4b62-bd8f-8aef82140117 and similar like this. This has happened many times over a few months. Expected results: no crash Cant reproduce this all the time, so this is an intermittent crash.
Component: General → Networking: Cache
It's a null deref on this line: mAuthProvider->CheckForSuperfluousAuth() in nsHttpChannel::ProcessResponse. Is it reasonable for mAuthProvider to be null there?
Component: Networking: Cache → Networking: HTTP
Flags: needinfo?(honzab.moz)
Thanks for the report. This means that the nsHttpChannel object is processing a secondary network request. That is not supposed to happen, nsHttpChannel is constructed to process only one request during it's lifetime, so this is a bug. I will try to reproduce.
Assignee: nobody → honzab.moz
Blocks: cache2enable
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(honzab.moz)
Crash Signature: [@ mozilla::net::nsHttpChannel::ProcessResponse()]
Summary: crashing on opening multiple MozillaWiki pages → crash in mozilla::net::nsHttpChannel::ProcessResponse(), mAuthProvider null, on opening multiple MozillaWiki pages
Summary: crash in mozilla::net::nsHttpChannel::ProcessResponse(), mAuthProvider null, on opening multiple MozillaWiki pages → HTTP cache v2: crash in mozilla::net::nsHttpChannel::ProcessResponse(), mAuthProvider null, on opening multiple MozillaWiki pages
just hit this too. not reducible for me.
For me, if you bookmark these 4 links in one folder, and then right click->"open all in tabs" , i get this crash. ( Maybe try to reproduce with a shitty, ~120KB/s net connection ? ) https://wiki.mozilla.org/Firefox/Meeting https://wiki.mozilla.org/WeeklyUpdates https://wiki.mozilla.org/Firefox/Planning https://wiki.mozilla.org/Platform
This is currently the top volume crasher on Nightly (Fx30), It's on all platforms and occurring with x86 and amd64
OS: Windows 7 → All
Hardware: x86_64 → All
Keywords: crash, topcrash
The crash volume on here is very high. Can we either disable cachev2 or fix this crash today for tomorrow's nightly?
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(honzab.moz)
Benjamin, yes, see bellow: :jduell, cache2 disable (end trial) patch is waiting for your review - since yesterday. (bug 975829) I was hoping to land it today.
Flags: needinfo?(honzab.moz)
Yup, reviewed it (and noted that it didn't really need review IMO for future reference).
Flags: needinfo?(jduell.mcbugs)
mayankleoboy1, can you reproduce regularly? If so, can I ask you to produce a NSPR log? The way to do it is just about setting some environment variables: NSPR_LOG_MODULES=nsHttp:4,cache2:5 NSPR_LOG_FILE=path/log-file ; path must exist, log-file is a name of your choice GECKO_SEPARATE_NSPR_LOGS=1 Let you know that if you have updated yesterday your Nighty, it may already use the old cache now. Please make sure that "browser.cache.use_new_backend" preference in about:config is at 1. (No longer on Nightly, the cache2 trial has ended, removing tracking flags)
Flags: needinfo?(mayankleoboy1)
Maybe make the modules like this: NSPR_LOG_MODULES=sync,nsHttp:4,cache2:5
(In reply to Honza Bambas (:mayhemer) from comment #10) > mayankleoboy1, can you reproduce regularly? If so, can I ask you to produce > a NSPR log? Cant reproduce this regularly. However, once the crashes start happening, they are reproducible at every crash-restart-STR-crash cycle. I am more than happy to give all support possible. > > > Please make sure that "browser.cache.use_new_backend" preference in > about:config is at 1. My nightly is built from https://hg.mozilla.org/mozilla-central/rev/e3daaa4c73dd and the new cache is at 1 > The way to do it is just about setting some environment variables: > NSPR_LOG_MODULES=nsHttp:4,cache2:5 > NSPR_LOG_FILE=path/log-file ; path must exist, log-file is a name of your > choice > GECKO_SEPARATE_NSPR_LOGS=1 I have no idea how to do whatever you want me to do. You will have to give detailed instructions.
Flags: needinfo?(mayankleoboy1)
mayankleoboy1, here are the detailed instructions, I presume you are on Windows. * close Firefox, make sure it's not running * press Windows-R, "Run" prompt will appear * type cmd and press enter * the black command prompt window will appear with C:\>_ * type following commands into it (just copy/paste), presuming you have installed Nightly at c:\Program Files (x86)\Nightly: set NSPR_LOG_MODULES=sync,nsHttp:4,cache2:5 set NSPR_LOG_FILE=nspr.log set GECKO_SEPARATE_NSPR_LOGS=1 "c:\Program Files (x86)\Nightly\firefox.exe" This will run Firefox as normal and also produce "nspr.log" file in C:\Users\<your-username>\ directory with log info from the current Firefox session - the thing that will tell us what's wrong. Make sure the file is there after Firefox has started so that we are sure everything is correct. *** IMPORTANT ***: when you experience the crash, *don't let Firefox restart!!* It would overwrite the log file and the information would be lost. Thanks for your help!
(In reply to Honza Bambas (:mayhemer) from comment #13) > mayankleoboy1, here are the detailed instructions, I presume you are on > Windows. > > * close Firefox, make sure it's not running > * press Windows-R, "Run" prompt will appear > * type cmd and press enter > * the black command prompt window will appear with C:\>_ > * type following commands into it (just copy/paste), presuming you have > installed Nightly at c:\Program Files (x86)\Nightly: > > set NSPR_LOG_MODULES=sync,nsHttp:4,cache2:5 > set NSPR_LOG_FILE=nspr.log > set GECKO_SEPARATE_NSPR_LOGS=1 > "c:\Program Files (x86)\Nightly\firefox.exe" done the exact same. > > This will run Firefox as normal and also produce "nspr.log" file in > C:\Users\<your-username>\ directory with log info from the current Firefox > session - the thing that will tell us what's wrong. Make sure the file is > there after Firefox has started so that we are sure everything is correct. A "nspr.log" file is being formed, but it is in the root directory , that is "C:\nspr.log ". Is that fine ?
(In reply to mayankleoboy1 from comment #14) > A "nspr.log" file is being formed, but it is in the root directory , that is > "C:\nspr.log ". Is that fine ? Yes, that is fine. If you leave firefox running for a longer time, the file will grow. And it may also slow firefox a bit down, just to let you know.
Attached file nspr.7z
After much trying, i finally got Nightly to crash. Attaching the nspr.log , in a 7z ppmd compressed file.
mayankleoboy1, thanks so much for the log! I can see the problem - one of those you hit your forehead and say "how could I miss this?". The thing is the following: channel 1 - loads a resource (in the log it's a css from the mozilla wikipedia) chennel 2 - wants to load the same resource - is put to queue (doesn't get its onCacheEntryAvail callback) channel 1 - gets onCacheEntryAvail callback with a new empty entry - starts the full network load and gets http headers, fills the entry's metadata channel 2 - now gets onCacheEntryAvail - finds out that the entry needs to be revalidated - deploys a conditional request - gets 304 - starts concurrent read from the cache (while data are still being downloaded from the network) channel 1 - got several OnDataAvailable calls, filled some of the entry's data - gets canceled with NS_BINDING_ABORTED (probably a reload due to charset detection) the entry is now only partial channel 2 - finishes read of the entry data that channel 1 has stored - finds out the entry is partial ( -> concurrent write has been interrupted) - the response is resumable, yay! - the channel deploys a range request for the rest of the entry content => does a second request => crash since nsHttpChannel is not designed to do more then one request (except authentication rounds) during its lifetime Possible solutions (more then one): 1. when a channel decides to revalidate its entry simply don't allow concurrent read (simple change, performance may suffer a bit, but it won't be worse than with the old cache) 2. teach nsHttpChannel to do more then one request during its lifetime (a larger change) 3. do an internal redirect (always) to load the rest of the data (even larger change) In this bug I'd rather do 1. And have a new bug for 2/3 we can deal with later. It first needs to be decided whether to allow more then one request in the channel is a good idea at all. We don't do it (otherwise we have this crash!), we usually internally redirect to the same URI to work this around and its probably better way to deal with it from more not just obvious reasons.
Attached patch v1Splinter Review
Here is a roughly tested patch. Try on gum: https://tbpl.mozilla.org/?tree=Gum&rev=2a0bb843cdd9 over some other patches (probably not yet 100% stable)
Attachment #8383159 - Flags: review?(michal.novotny)
Here is an ignore-whitespace patch for easier review.
Attachment #8383159 - Flags: review?(michal.novotny) → review+
Rather checking for cache v1 before push: https://tbpl.mozilla.org/?tree=Try&rev=d5ab8c494151
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: