Closed
Bug 1154124
Opened 10 years ago
Closed 9 years ago
Stack overflow by a sync call of net::CacheEntry's callbacks in a chain
Categories
(Core :: Networking: Cache, defect)
Core
Networking: Cache
Tracking
()
RESOLVED
FIXED
mozilla46
People
(Reporter: ws.bugzilla, Assigned: mayhemer)
References
Details
(Keywords: crash)
Crash Data
Attachments
(3 files, 1 obsolete file)
2.37 MB,
application/x-7z-compressed
|
Details | |
4.11 KB,
patch
|
michal
:
review+
|
Details | Diff | Splinter Review |
2.40 KB,
patch
|
Details | Diff | Splinter Review |
Steps to reproduce:
1. Go to this URL: http://www.geeksofdoom.com/2015/01/13/rockstar-delays-grand-theft-auto-v-pc-release-minimum-specs-4k-screenshots-shared
2. Middle click three of the screenshots at the bottom of this post.
3. Wait. First the browser becomes unresponsive, showing the dialog. "Stop Script" is either unclickable or ineffective. Eventually the browser crashes.
Here are two crash signatures I got for this crash:
https://crash-stats.mozilla.com/report/index/56a058a3-d1aa-4c50-adc6-564392150414
https://crash-stats.mozilla.com/report/index/4f42870b-d66a-452d-bff9-fcf9c2150414
(the latter is a Windows XP VirtualBox VM with Nightly, following the same STR)
Comment 1•10 years ago
|
||
Stack overflow:
mozilla::net::CacheEntryHandle::Release()
nsRefPtr<nsIAtom>::~nsRefPtr<nsIAtom>()
mozilla::net::CacheEntry::InvokeAvailableCallback(mozilla::net::CacheEntry::Callback const&)
mozilla::net::CacheEntry::InvokeCallback(mozilla::net::CacheEntry::Callback&)
mozilla::net::CacheEntry::InvokeCallbacks(bool)
mozilla::net::CacheEntry::InvokeCallbacks()
mozilla::net::CacheEntry::OnHandleClosed(mozilla::net::CacheEntryHandle const*)
mozilla::net::CacheEntryHandle::~CacheEntryHandle()
mozilla::net::CacheEntryHandle::`scalar deleting destructor'(unsigned int)
mozilla::net::CacheEntryHandle::Release()
...
Severity: major → critical
Component: General → Networking: Cache
Keywords: crash,
crashreportid
Product: Firefox → Core
Assignee | ||
Comment 2•10 years ago
|
||
Roman, are you using a fresh profile? If not, can you retry with it? I was trying to reproduce this bug with Nightly (e10s on and off) and Fx 37 (all fresh profile) but no luck.
Assignee: nobody → honzab.moz
Flags: needinfo?(ws.bugzilla)
Assignee | ||
Comment 3•10 years ago
|
||
nsHttpChannel::OnCacheEntryCheck apparently returns ENTRY_NEEDS_REVALIDATION, hence we go to nsHttpChannel::OnCacheEntryAvailable. But that doesn't take the entry handle ownership (and returns OK) and also somehow restarts wait for the same entry (ending up added as a callback to the list) so on releasing the handle inside CacheEntry::InvokeAvailableCallback (line 832, not 802) we do immediately another loop.
How this happens state happens, is absolutely unclear to me.
Anyway, we should be able resilient. Have to think on a fix.
@Roman, can you please produce an NSPR log while reproducing the bug? set two environment vars (best in a cmd.exe window):
NSPR_LOG_MODULES=cache2:5,nsHttp:4,timestamp,sync
NSPR_LOG_FILE=\path\to\the\file
and then run firefox with that env. If you could not reproduce, remove the 'sync' module from the list and retry.
Thanks!
The crash reproduces on today's nightly with a clean profile (previously the profile was almost-clean).
Add one step to the STR: close the browser. That seems to be what finally triggers the crash reporter. What I observe when I open the screenshots in new tabs is that the browser becomes very unresponsive, but the crash reporter seems to only be triggered when the browser is closed.
NSPR log (sync module included): attached. This was done on a WinXP VM, because that's where my nightly lives after a couple of incidents with accidentally running the nightly on my main profile. Let me know if you still can't reproduce and require further logs from Win7.
Flags: needinfo?(ws.bugzilla)
Assignee | ||
Comment 5•10 years ago
|
||
Thanks. Now I know what's going on. A script schedules an enormous amount of requests to a single URL. It means there are 100+ consumer waiting for a single entry. nsHttpChannel::OnCacheEntryCheck returns REVAL_NEEDED but nsHttpChannel::OnCacheEntryAvailable later fails since the channel has been canceled in the mean time. This leads to processing (no redispatch) of the next callback in chain => stack overflow.
I am in favor of leaving the code as much as possible w/o redispatching (i.e. breaking the stack). But in this case we need to make an exception, tho there may be more places.
This will not fix the unresponsive script. e10s turned on would, but still, this is somewhat malicious behavior.
Status: NEW → ASSIGNED
Perfect!
As for e10s, my nightly was pretty unresponsive with e10s enabled - a bit better than v37, but only slightly. Is it supposed to remain perfectly responsive? If so, maybe I should file another bug related to how these pages affect UI responsiveness with e10s enabled?
Updated•9 years ago
|
Crash Signature: [@ choose_arena ]
[@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ]
Keywords: crashreportid
Summary: Reproduceable crash in 37 through nightly, preceded by unresponsive script warning → Reproducible crash preceded by Unresponsive Script warning. high cache hit results in stack overflow
Updated•9 years ago
|
Crash Signature: [@ choose_arena ]
[@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ] → [@ choose_arena ]
[@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ]
[@ nsACString_internal::Replace ]
Assignee | ||
Updated•9 years ago
|
Summary: Reproducible crash preceded by Unresponsive Script warning. high cache hit results in stack overflow → Stack overflow by a sync call of net::CacheEntry's callbacks in a chain
Assignee | ||
Comment 9•9 years ago
|
||
The fix is actually that easy...
Attachment #8706917 -
Flags: review?(michal.novotny)
Assignee | ||
Comment 10•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=edc9797d8e2d
(I've also added a comment to the patch)
Assignee | ||
Comment 11•9 years ago
|
||
Comment on attachment 8706917 [details] [diff] [review]
v1
I think we should also prevent the race with closing the output stream that btw invokes (by a dispatch!) CacheEntry::OnOutputClosed() with invokes the callbacks again.
I will update the patch accordingly.
Attachment #8706917 -
Flags: review?(michal.novotny)
Assignee | ||
Comment 12•9 years ago
|
||
- the phantom output stream can be easily closed and released under the lock
- mOutputStream->Close() invokes callbacks (asynchronously), so no need to do invoke callbacks twice
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b43e1905e73
-w patch follows
Attachment #8706917 -
Attachment is obsolete: true
Attachment #8708032 -
Flags: review?(michal.novotny)
Assignee | ||
Comment 13•9 years ago
|
||
Updated•9 years ago
|
Attachment #8708032 -
Flags: review?(michal.novotny) → review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 14•9 years ago
|
||
Keywords: checkin-needed
Comment 15•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox46:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
You need to log in
before you can comment on or make changes to this bug.
Description
•