Closed Bug 1154124 Opened 10 years ago Closed 9 years ago

Stack overflow by a sync call of net::CacheEntry's callbacks in a chain

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla46

Tracking Flags:

Tracking

Status

firefox40

---

affected

firefox46

---

fixed

People

(Reporter: ws.bugzilla, Assigned: mayhemer)

References

Details

(Keywords: crash)

Crash Data

Attachments

(3 files, 1 obsolete file)

nspr.7z 10 years ago Roman 2.37 MB, application/x-7z-compressed		Details
v1 9 years ago Honza Bambas (:mayhemer) 990 bytes, patch		Details \| Diff \| Splinter Review
v2 9 years ago Honza Bambas (:mayhemer) 4.11 KB, patch	michal : review+	Details \| Diff \| Splinter Review
v2 -w 9 years ago Honza Bambas (:mayhemer) 2.40 KB, patch		Details \| Diff \| Splinter Review

Roman

Reporter

Description

•

10 years ago

Steps to reproduce: 1. Go to this URL: http://www.geeksofdoom.com/2015/01/13/rockstar-delays-grand-theft-auto-v-pc-release-minimum-specs-4k-screenshots-shared 2. Middle click three of the screenshots at the bottom of this post. 3. Wait. First the browser becomes unresponsive, showing the dialog. "Stop Script" is either unclickable or ineffective. Eventually the browser crashes. Here are two crash signatures I got for this crash: https://crash-stats.mozilla.com/report/index/56a058a3-d1aa-4c50-adc6-564392150414 https://crash-stats.mozilla.com/report/index/4f42870b-d66a-452d-bff9-fcf9c2150414 (the latter is a Windows XP VirtualBox VM with Nightly, following the same STR)

Roman

Reporter

Updated

•

10 years ago

OS: Windows 7 → All

Hardware: x86_64 → All

Mats Palmgren (inactive)

Comment 1

•

10 years ago

Stack overflow: mozilla::net::CacheEntryHandle::Release() nsRefPtr<nsIAtom>::~nsRefPtr<nsIAtom>() mozilla::net::CacheEntry::InvokeAvailableCallback(mozilla::net::CacheEntry::Callback const&) mozilla::net::CacheEntry::InvokeCallback(mozilla::net::CacheEntry::Callback&) mozilla::net::CacheEntry::InvokeCallbacks(bool) mozilla::net::CacheEntry::InvokeCallbacks() mozilla::net::CacheEntry::OnHandleClosed(mozilla::net::CacheEntryHandle const*) mozilla::net::CacheEntryHandle::~CacheEntryHandle() mozilla::net::CacheEntryHandle::`scalar deleting destructor'(unsigned int) mozilla::net::CacheEntryHandle::Release() ...

Severity: major → critical

Component: General → Networking: Cache

Keywords: crash, crashreportid

Product: Firefox → Core

Honza Bambas (:mayhemer)

Assignee

Comment 2

•

10 years ago

Roman, are you using a fresh profile? If not, can you retry with it? I was trying to reproduce this bug with Nightly (e10s on and off) and Fx 37 (all fresh profile) but no luck.

Assignee: nobody → honzab.moz

Flags: needinfo?(ws.bugzilla)

Honza Bambas (:mayhemer)

Assignee

Comment 3

•

10 years ago

nsHttpChannel::OnCacheEntryCheck apparently returns ENTRY_NEEDS_REVALIDATION, hence we go to nsHttpChannel::OnCacheEntryAvailable. But that doesn't take the entry handle ownership (and returns OK) and also somehow restarts wait for the same entry (ending up added as a callback to the list) so on releasing the handle inside CacheEntry::InvokeAvailableCallback (line 832, not 802) we do immediately another loop. How this happens state happens, is absolutely unclear to me. Anyway, we should be able resilient. Have to think on a fix. @Roman, can you please produce an NSPR log while reproducing the bug? set two environment vars (best in a cmd.exe window): NSPR_LOG_MODULES=cache2:5,nsHttp:4,timestamp,sync NSPR_LOG_FILE=\path\to\the\file and then run firefox with that env. If you could not reproduce, remove the 'sync' module from the list and retry. Thanks!

Roman

Reporter

Comment 4

•

10 years ago

Attached file nspr.7z — Details

The crash reproduces on today's nightly with a clean profile (previously the profile was almost-clean). Add one step to the STR: close the browser. That seems to be what finally triggers the crash reporter. What I observe when I open the screenshots in new tabs is that the browser becomes very unresponsive, but the crash reporter seems to only be triggered when the browser is closed. NSPR log (sync module included): attached. This was done on a WinXP VM, because that's where my nightly lives after a couple of incidents with accidentally running the nightly on my main profile. Let me know if you still can't reproduce and require further logs from Win7.

Flags: needinfo?(ws.bugzilla)

Honza Bambas (:mayhemer)

Assignee

Comment 5

•

10 years ago

Thanks. Now I know what's going on. A script schedules an enormous amount of requests to a single URL. It means there are 100+ consumer waiting for a single entry. nsHttpChannel::OnCacheEntryCheck returns REVAL_NEEDED but nsHttpChannel::OnCacheEntryAvailable later fails since the channel has been canceled in the mean time. This leads to processing (no redispatch) of the next callback in chain => stack overflow. I am in favor of leaving the code as much as possible w/o redispatching (i.e. breaking the stack). But in this case we need to make an exception, tho there may be more places. This will not fix the unresponsive script. e10s turned on would, but still, this is somewhat malicious behavior.

Status: NEW → ASSIGNED

Roman

Reporter

Comment 6

•

10 years ago

Perfect! As for e10s, my nightly was pretty unresponsive with e10s enabled - a bit better than v37, but only slightly. Is it supposed to remain perfectly responsive? If so, maybe I should file another bug related to how these pages affect UI responsiveness with e10s enabled?

Wayne Mery (:wsmwk)

Updated

•

9 years ago

Crash Signature: [@ choose_arena ] [@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ]

Keywords: crashreportid

Summary: Reproduceable crash in 37 through nightly, preceded by unresponsive script warning → Reproducible crash preceded by Unresponsive Script warning. high cache hit results in stack overflow

BMO Automation

Updated

•

9 years ago

Crash Signature: [@ choose_arena ] [@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ] → [@ choose_arena ] [@ nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int) ] [@ nsACString_internal::Replace ]

Honza Bambas (:mayhemer)

Assignee

Updated

•

9 years ago

Summary: Reproducible crash preceded by Unresponsive Script warning. high cache hit results in stack overflow → Stack overflow by a sync call of net::CacheEntry's callbacks in a chain

Honza Bambas (:mayhemer)

Assignee

Comment 9

•

9 years ago

Attached patch v1 (obsolete) — Details — Splinter Review

The fix is actually that easy...

Attachment #8706917 - Flags: review?(michal.novotny)

Honza Bambas (:mayhemer)

Assignee

Comment 10

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=edc9797d8e2d (I've also added a comment to the patch)

Honza Bambas (:mayhemer)

Assignee

Comment 11

•

9 years ago

Comment on attachment 8706917 [details] [diff] [review] v1 I think we should also prevent the race with closing the output stream that btw invokes (by a dispatch!) CacheEntry::OnOutputClosed() with invokes the callbacks again. I will update the patch accordingly.

Attachment #8706917 - Flags: review?(michal.novotny)

Honza Bambas (:mayhemer)

Assignee

Comment 12

•

9 years ago

Attached patch v2 — Details — Splinter Review

- the phantom output stream can be easily closed and released under the lock - mOutputStream->Close() invokes callbacks (asynchronously), so no need to do invoke callbacks twice https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b43e1905e73 -w patch follows

Attachment #8706917 - Attachment is obsolete: true

Attachment #8708032 - Flags: review?(michal.novotny)

Honza Bambas (:mayhemer)

Assignee

Comment 13

•

9 years ago

Attached patch v2 -w — Details — Splinter Review

Michal Novotny [:michal]

Updated

•

9 years ago

Attachment #8708032 - Flags: review?(michal.novotny) → review+

Honza Bambas (:mayhemer)

Assignee

Updated

•

9 years ago

Keywords: checkin-needed

Pulsebot

Comment 14

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/85bc8c6ea180

Keywords: checkin-needed

Carsten Book [:Tomcat]

Comment 15

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/85bc8c6ea180

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox46: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla46

You need to log in before you can comment on or make changes to this bug.