725993 - Remove ability to tell cache to STORE_ON_DISK_AS_FILE

Reporter

Description

•

13 years ago

Attached file firefox-debug_10f4_2012-02-09_17-39-23-408.log — Details

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) Gecko/20100101 Firefox/10.0
Build ID: 20120129021758

Steps to reproduce:
Watch an embedded YouTube video like the one on

Actual results:
24-26% CPU of my Intel Xeon 2.93 GHz, making Firefox unresponsive, but the video kept playing. Process Explorer 12.04 says the offending thread is firefox.exe!wmainCRTStartup with a varying stack, the most interesting being:

mozutils.dll!malloc_usable_size+0x6
mozsqlite3.dll!sqlite3_step+0xc2
xul.dll!??1gfxFont@@UAE@XZ+0x564a
xul.dll!?CanDraw2D@gfx3DMatrix@@QBE_NPAUgfxMatrix@@@Z+0xbe4c
xul.dll!XRE_LockProfileDirectory+0x43882
xul.dll!XRE_LockProfileDirectory+0x4e5a2
xul.dll!XRE_LockProfileDirectory+0x4e742
xul.dll!?AsShadowLayer@Layer@layers@mozilla@@UAEPAVShadowLayer@23@XZ+0x543f
xul.dll!NS_CycleCollectorSuspect2_P+0x4186
mozjs.dll!JS_GetTypeInferenceMemoryStats+0x6d7b
xul.dll!NS_CycleCollectorForget2_P+0x66e1
mozjs.dll!JS_GetOptions+0xc
mozjs.dll!JS_InitReflect+0x7710
xul.dll!NS_CycleCollectorSuspect2_P+0x2592

I'm currently running Firefox via 32-bit WinDbg on Windows 7 Enterprise with Nvidia Quadro FX 1800. Physical usage 4.4/6.0 GB RAM.

Application: Firefox 10.0 (20120129021758)
Operating System: WINNT (x86-msvc)

- BarTab Lite 1.2
- Download Statusbar 0.9.10 (Disabled)
- Extension List Dumper 1.15.2
- Firebug 1.9.1
- Firefogg 2.9.19 (Disabled)
- FireFTP 2.0.1
- Flashblock 1.5.15.1
- FromWhereToWhere 0.25.0 (Disabled)
- Memory Fox 7.4 (Disabled)
- Mozilla Archive Format 2.0.4
- New Tab JumpStart 0.5a5.4.3
- NoScript 2.2.9
- Places Maintenance 1.3 (Disabled)
- Session History Tree 1.0 (Disabled)
- Session Manager 0.7.8.1
- Showcase 0.9.5.8 (Disabled)
- Tab History Menu 2.1.1 (Disabled)

I use this for web development, so installing a nightly etc. doesn't sound appealing. Attached is the running WinDbg log. I downloaded the symbols with a 64-bit WinDbg but didn't get any notice about that, so I assume that's no problem. Possibly related info in bug 465648.

Expected results:
Stay responsive until you can blame it on the host OS swap.

Cees T.

Reporter

Updated

•

13 years ago

URL: http://www.security.nl/artikel/32339/...

Keywords: perf

Thomas Ahlblom

•

13 years ago

The flooding of the error console definitely causes the CPU spikes. Probably useless, but here is a Process Explorer stack taken while it was flooding:

xul.dll!?GetFirstChild@ContainerLayer@layers@mozilla@@UAEPAVLayer@23@XZ+0x1188
xul.dll!??0gfxSize@@QAE@XZ+0xcbfe
xul.dll!?TransformBounds@gfxMatrix@@QBE?AUgfxRect@@ABU2@@Z+0x4a26
xul.dll!NS_InvokeByIndex_P+0x17d
xul.dll!NS_InvokeByIndex_P+0x27
xul.dll!NS_StackWalk+0x35e
xul.dll!??0gfxSize@@QAE@XZ+0x1009e
xul.dll!?CanDraw2D@gfx3DMatrix@@QBE_NPAUgfxMatrix@@@Z+0xbf6e
xul.dll!?CreatePlatformFontList@gfxWindowsPlatform@@UAEPAVgfxPlatformFontList@@XZ+0x199c

Chris Pearce [:cpearce (Not reading bugmail)]

Updated

•

13 years ago

Summary: Random up to a minute-long unresponsive GUI → Random up to a minute-long unresponsive GUI watching HTML5 YouTube

Cees T.

Reporter

Comment 5

•

13 years ago

The random freeze is the flooding described in comment 4, and causes font entries in the stack. The on-demand terrible performance happens while playing the video at the URL, and when not interrupted by the error console flood, steadily has this stack in the 23+% CPU firefox.exe+0x1c59 thread according to Process Explorer v15.12:

mozjs.dll!js_RemoveRoot+0x23c2
mozjs.dll!js_RemoveRoot+0x25f3
mozjs.dll!js_RemoveRoot+0xa53
mozjs.dll!js_RemoveRoot+0xbbb
mozjs.dll!JS_GC+0x17
xul.dll!?GetCMSRGBTransform@gfxPlatform@@SAPAU_qcms_transform@@XZ+0x950

Application: Firefox 10.0.1 (20120208060813)
Operating System: WINNT (x86-msvc)

- BarTab Lite 1.2 (Disabled)
- Download Statusbar 0.9.10 (Disabled)
- Extension List Dumper 1.15.2
- Firebug 1.9.1
- Firefogg 2.9.19 (Disabled)
- FireFTP 2.0.1
- Flashblock 1.5.15.1
- FromWhereToWhere 0.25.0 (Disabled)
- Memory Fox 7.4 (Disabled)
- Mozilla Archive Format 2.0.4
- New Tab JumpStart 0.5a5.4.3
- NoScript 2.3
- Places Maintenance 1.3 (Disabled)
- Session History Tree 1.0 (Disabled)
- Session Manager 0.7.8.1
- Showcase 0.9.5.8 (Disabled)
- Tab History Menu 2.1.1 (Disabled)

According to plugin check (because I can't copy from the add-on screen), these are my active plugins:

Shockwave Flash 11.1 r102, 11.1.102.55, Up to Date
Google Update, Unknown plugin

Chris Pearce [:cpearce (Not reading bugmail)]

•

13 years ago

I've updated to Firebug 1.10.0a3 and the Flash CPU problem doesn't occur anymore. If I still see the onBeforeItemRemoved flood (which disabled both ways of middle-mouse-button scrolling but not the scrollbar grip scrolling), I'll find or create a different bug for that.

Status: UNCONFIRMED → RESOLVED

Closed: 13 years ago

Resolution: --- → INVALID

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 12

•

13 years ago

I can reproduce longish UI freezes (multi second, sometimes tens of seconds) semi-reliably when my system is doing heavy disk IO (for example unzipping a zip of my object dir, or `winrm -rf $objdir`). I caught a callstack of a pause in a locally built opt build, and I get the following stack:

ntdll.dll!_NtWriteFile@36() + 0x15 bytes
ntdll.dll!_NtWriteFile@36() + 0x15 bytes
kernel32.dll!_WriteFileImplementation@20() + 0x4a bytes
nspr4.dll!FileWrite(PRFileDesc * fd, const void * buf, int amount) Line 109 + 0x8 bytes C
nspr4.dll!PR_Write(PRFileDesc * fd, const void * buf, int amount) Line 146 + 0x13 bytes C
xul.dll!nsDiskCacheStreamIO::FlushBufferToFile() Line 791 + 0xf bytes C++
xul.dll!nsDiskCacheStreamIO::Write(const char * buffer, unsigned int count, unsigned int * bytesWritten) Line 639 C++
xul.dll!nsDiskCacheOutputStream::Write(const char * buf, unsigned int count, unsigned int * bytesWritten) Line 294 C++
xul.dll!nsCacheEntryDescriptor::nsOutputStreamWrapper::Write(const char * buf, unsigned int count, unsigned int * result) Line 845 + 0x12 bytes C++
xul.dll!nsInputStreamTee::TeeSegment(const char * buf, unsigned int count) Line 200 C++
xul.dll!nsInputStreamTee::WriteSegmentFun(nsIInputStream * in, void * closure, const char * fromSegment, unsigned int offset, unsigned int count, unsigned int * writeCount) Line 230 + 0xb bytes C++
xul.dll!nsPipeInputStream::ReadSegments(unsigned int (nsIInputStream *, void *, const char *, unsigned int, unsigned int, unsigned int *)* writer, void * closure, unsigned int count, unsigned int * readCount) Line 799 + 0x13 bytes C++
xul.dll!nsInputStreamTee::ReadSegments(unsigned int (nsIInputStream *, void *, const char *, unsigned int, unsigned int, unsigned int *)* writer, void * closure, unsigned int count, unsigned int * bytesRead) Line 277 C++
xul.dll!mozilla::ChannelMediaResource::OnDataAvailable(nsIRequest * aRequest, nsIInputStream * aStream, unsigned int aCount) Line 409 C++
xul.dll!mozilla::ChannelMediaResource::Listener::OnDataAvailable(nsIRequest * aRequest, nsISupports * aContext, nsIInputStream * aStream, unsigned int aOffset, unsigned int aCount) Line 128 C++
xul.dll!nsHTMLMediaElement::MediaLoadListener::OnDataAvailable(nsIRequest * aRequest, nsISupports * aContext, nsIInputStream * aStream, unsigned int aOffset, unsigned int aCount) Line 388 C++
xul.dll!nsStreamListenerTee::OnDataAvailable(nsIRequest * request, nsISupports * context, nsIInputStream * input, unsigned int offset, unsigned int count) Line 122 + 0x18 bytes C++
xul.dll!nsHttpChannel::OnDataAvailable(nsIRequest * request, nsISupports * ctxt, nsIInputStream * input, unsigned int offset, unsigned int count) Line 4478 + 0x36 bytes C++
xul.dll!nsInputStreamPump::OnStateTransfer() Line 514 + 0x18 bytes C++
xul.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream * stream) Line 403 C++
xul.dll!nsOutputStreamReadyEvent::Run() Line 115 C++
xul.dll!nsThread::ProcessNextEvent(bool mayWait, bool * result) Line 657 + 0x9 bytes C++
xul.dll!NS_ProcessNextEvent_P(nsIThread * thread, bool mayWait) Line 245 + 0xd bytes C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 110 + 0xa bytes C++
xul.dll!MessageLoop::RunHandler() Line 202 C++
xul.dll!MessageLoop::Run() Line 176 C++

So am I reading that stack right: we're writing data that's been received to a disk cache in nsDiskCacheStreamIO, and that's flushing the cache to disk, which is blocking, presumably waiting on a disk seek?

I wonder if users are seeing these freezes because of disk activity on their systems from virus scanners (some people even run multiple scanners!) or the OS's search indexing service (SearchIndexer.exe) misbehaving.

Josh/jduell: I understand there are no current plans to enable off-main-thread nsHttpChannel::OnDataAvailable() listeners?

Is there a way we can specify that we don't want the nsMediaCache's nsHttpChannels cached by the network cache, since we're caching them ourselves in the nsMediaCache? Or some other way to work around this?

On a side note, the video's audio still keeps playing, which requires reading the uncompressed data from disk in the nsMediaCache, so I'd have expected it to block too. Maybe the audio's required segments in the media cache's file are being cached in memory by the OS.
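For readers following along, here is a rough, standalone sketch of the general idea behind keeping a blocking write off the thread that delivers the data: the delivering thread only queues a chunk, and a worker thread does the slow write. This is not how Necko or the disk cache is implemented; `BackgroundWriter`, `Enqueue`, `Finish`, and the sink callback are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper, not part of Necko: queues chunks on the calling
// thread and performs the (possibly blocking) write on a worker thread.
class BackgroundWriter {
public:
  using Sink = std::function<void(const std::string&)>;

  explicit BackgroundWriter(Sink sink)
      : mSink(std::move(sink)), mWorker([this] { Run(); }) {}

  ~BackgroundWriter() { Finish(); }

  // Called from the delivering thread; takes a short lock and never does I/O.
  void Enqueue(std::string chunk) {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mQueue.push_back(std::move(chunk));
    }
    mCond.notify_one();
  }

  // Lets the worker drain the queue, then joins it. Safe to call twice.
  void Finish() {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mDone = true;
    }
    mCond.notify_one();
    if (mWorker.joinable()) mWorker.join();
  }

private:
  void Run() {
    for (;;) {
      std::string chunk;
      {
        std::unique_lock<std::mutex> lock(mMutex);
        mCond.wait(lock, [this] { return mDone || !mQueue.empty(); });
        if (mQueue.empty()) return;  // mDone is set and nothing is left
        chunk = std::move(mQueue.front());
        mQueue.pop_front();
      }
      mSink(chunk);  // the slow write happens outside the lock
    }
  }

  Sink mSink;
  std::mutex mMutex;
  std::condition_variable mCond;
  std::deque<std::string> mQueue;
  bool mDone = false;
  std::thread mWorker;  // declared last so the other members exist first
};
```

With this shape, a slow disk only delays the worker; the thread calling `Enqueue` returns as soon as the chunk is queued. The trade-off is that queued data sits in memory until the worker catches up.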

Status: RESOLVED → REOPENED

Ever confirmed: true

Resolution: INVALID → ---

Jason Duell

Comment 13

•

13 years ago

Michal/Nick, So looking at the stack trace in comment 12, it seems that we're trying to read the inputstream during OnDataAvailable, but the inputstreamTee is somehow blocking that read while it writes stuff to disk. That seems like it should never happen--all the data in the OnDataAvailable stream should be readable w/o blocking for the client. On that basis, I'm classifying this for now as a disk cache bug. Possible workaround in the meantime may be to bypass disk caching for loads that get stored in the media cache anyway. (I'm not sure how best to make that happen--the INHIBIT_CACHING flag would work, but we'd need to know the media type at channel load time, right? That may be doable. Anyway, that belongs in a separate bug).

Assignee: nobody → michal.novotny

Component: Untriaged → Networking: Cache

Product: Firefox → Core

QA Contact: untriaged → networking.cache

Michal Novotny [:michal]

Assignee

Comment 14

•

13 years ago

I can't reproduce it even when I'm doing heavy disk IO. Anyway, if it is really caused by doing IO on the main thread, then the following builds shouldn't suffer from this issue. Could you please try them and check whether the problem is fixed? http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mnovotny@mozilla.com-fc1d2e1954b7/

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 15

•

13 years ago

I extracted your patch from your try push and built that locally. I can confirm the stack trace in comment 12 no longer is the main cause of pauses on the main thread when the disk is doing heavy IO. There are a number of other IO operations on the main thread which are causing pauses, but the one in comment 12 seems fixed by that patch.

Michal Novotny [:michal]

Assignee

•

13 years ago

(In reply to Brian Smith (:bsmith) from comment #20)
> My understanding is that your patch doesn't actually reduce functionality
> (e.g. XHR and NPAPI will work as before).

Right.

> If so, it seems like a good idea. But, why is all this code using cacheAsFile
> in the first place?

A long time ago, only cacheAsFile was used (see bug #209560). The current code is just a result of a decision in comment #10.

> Is there any negative impact at all?

The main difference is that cacheAsFile forces a file of any size to be cached. So removing this function means that e.g. a 100MB HTML5 video won't be cached in the disk cache anymore. I personally consider this a positive impact, but others might have a different opinion.
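To make the behavioral difference concrete, here is a tiny sketch of the decision described above. It is only a model: `DiskCachePolicy`, `WouldStoreOnDisk`, and the per-entry limit are hypothetical, and the real disk cache's limits and API are not shown in this bug.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical policy type; the real disk cache's limits and API differ.
struct DiskCachePolicy {
  std::int64_t maxEntrySizeBytes;  // assumed per-entry size limit
};

// Returns true if an entry of `sizeBytes` would be written to the disk cache.
// `forceAsFile` models the old cacheAsFile behavior, which bypassed the limit.
inline bool WouldStoreOnDisk(const DiskCachePolicy& policy,
                             std::int64_t sizeBytes, bool forceAsFile) {
  if (forceAsFile) return true;
  return sizeBytes <= policy.maxEntrySizeBytes;
}
```

Under this model, with an assumed 50MB limit, a 100MB video is stored only when `forceAsFile` is true. Once the forcing option is gone, the size limit alone decides.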

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 22

•

13 years ago

(In reply to Michal Novotny (:michal) from comment #21)
> (In reply to Brian Smith (:bsmith) from comment #20)
> > Is there any negative impact at all?
>
> The main difference is that cacheAsFile forces a file of any size to be
> cached. So removing this function means that e.g. a 100MB HTML5 video won't
> be cached in the disk cache anymore. I personally consider this a positive
> impact, but others might have a different opinion.

<video>'s HTTP channels are currently created with the nsICachingChannel::LOAD_BYPASS_LOCAL_CACHE_IF_BUSY flag; would this change affect them?

Currently the media cache isn't smart enough to persistently store data longer than the lifetime of all media elements playing that resource. We rely on the Necko cache to ensure that resources that have been loaded before, but which aren't currently stored in a live media element, are loaded quickly. This is important for low latency when creating media elements.

Michal Novotny [:michal]

Assignee

•

12 years ago

Attached patch patch v2 - plugin part — Details — Splinter Review

Attachment #609523 - Attachment is obsolete: true

Attachment #678771 - Flags: review?(joshmoz)

Michal Novotny [:michal]

Assignee

Comment 66

•

12 years ago

Attached patch patch v2 - media cache — Details — Splinter Review

Attachment #678772 - Flags: review?(roc)

Michal Novotny [:michal]

Assignee

Comment 67

•

12 years ago

Attached patch patch v2 - XHR — Details — Splinter Review

Attachment #678773 - Flags: review?(jonas)

Michal Novotny [:michal]

Assignee

Comment 68

•

12 years ago

Attached patch patch v2 - necko part (obsolete) — Details — Splinter Review

Attachment #678777 - Flags: review?(bsmith)

Robert O'Callahan (:roc) (email my personal email if necessary)

Updated

•

12 years ago

Attachment #678772 - Flags: review?(roc) → review+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

12 years ago

Attachment #678773 - Flags: review?(jonas) → review+

Josh Aas

Comment 69

•

12 years ago

Comment on attachment 678771
patch v2 - plugin part

Review of attachment 678771:
-----------------------------------------------------------------

Looks fine to me but I'd like to have bsmedberg check it over as well.

Attachment #678771 - Flags: review?(joshmoz)

Attachment #678771 - Flags: review?(benjamin)

Attachment #678771 - Flags: review+

Benjamin Smedberg

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/4c02ec84a2c1

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 74

•

12 years ago

Comment on attachment 678777
patch v2 - necko part

Dropping r? since Michal said he's going to update the patch. Also, please answer my question: "Is it really OK to just remove them completely, or should we just be updating them or replacing them with new tests?" I didn't look at the tests carefully so I'm not sure either way.

Attachment #678777 - Flags: review?(bsmith)

Michal Novotny [:michal]

Assignee

Comment 75

•

12 years ago

Attached patch patch v3 - necko part (obsolete) — Details — Splinter Review

(In reply to Brian Smith (:bsmith) from comment #70)
> I don't quite understand what these tests are testing. Is it really OK to
> just remove them completely, or should we just be updating them or replacing
> them with new tests?

I've updated the tests and I also verified that they can still detect the issues fixed by bugs #651100 and #654926.

> Do not change the value of the STORE_OFFLINE flag. Instead, leave a comment
> that says that value 3 is now unused, so we don't reuse it for something in
> the future.

Fixed.
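As an illustration of that review request, the following sketch shows what leaving a retired value unused might look like. The only facts taken from this bug are that STORE_OFFLINE keeps its value and that 3 (formerly STORE_ON_DISK_AS_FILE) is left unused. The other constant names and all numeric values are assumptions to check against nsICache.idl, and `IsKnownStoragePolicy` is a hypothetical helper.

```cpp
#include <cassert>

// Illustrative only: apart from "3 is unused", the names and numeric values
// here are assumptions and should be checked against nsICache.idl.
enum StoragePolicy : int {
  STORE_ANYWHERE = 0,
  STORE_IN_MEMORY = 1,
  STORE_ON_DISK = 2,
  // Value 3 was STORE_ON_DISK_AS_FILE. It is now unused; do not reuse it.
  STORE_OFFLINE = 4,
};

// Hypothetical validator: rejects the retired value along with unknown ones.
inline bool IsKnownStoragePolicy(int value) {
  switch (value) {
    case STORE_ANYWHERE:
    case STORE_IN_MEMORY:
    case STORE_ON_DISK:
    case STORE_OFFLINE:
      return true;
    default:
      return false;
  }
}
```

Keeping the gap means any caller or stored data that still carries the old value fails validation instead of being silently reinterpreted as a different policy.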

Attachment #678777 - Attachment is obsolete: true

Attachment #691957 - Flags: review?(bsmith)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Updated

•

12 years ago

OS: Windows 7 → All

Hardware: x86_64 → All

Version: 10 Branch → Trunk

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 76

•

12 years ago

Comment on attachment 691957
patch v3 - necko part

Review of attachment 691957:
-----------------------------------------------------------------

Looks good to me. I admit I did not read the test changes too carefully :(.

::: netwerk/base/src/nsDownloader.cpp
@@ +55,5 @@
> +
> +    rv = mLocation->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600);
> +    if (NS_FAILED(rv)) return rv;
> +
> +    mLocationIsTemp = true;

The existing code (before your change) was confusing to the point of being dangerous. In particular, when an error occurs that causes us to return early, mLocation will be set to an incorrect path.

Now that you've made this change, mLocationIsTemp will always be true in the success case. I suggest that you remove the need for the mLocationIsTemp variable by writing the code like this:

    nsCOMPtr<nsIFile> location;
    rv = NS_GetSpecialDirectory(NS_OS_TEMP_DIR, getter_AddRefs(location));

    char buf[13];
    NS_MakeRandomString(buf, 8);
    memcpy(buf+8, ".tmp", 5);
    rv = location->AppendNative(nsDependentCString(buf, 12));
    if (NS_FAILED(rv)) return rv;

    rv = location->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600);
    if (NS_FAILED(rv)) return rv;

    location.forget(mLocation);

Then, we would know that whenever mLocation is set, it is set to a valid path, and that we've created that file.
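The pattern being suggested (build the path in a local and assign the member only after every step has succeeded) can be sketched outside Gecko with the standard library. This is not the nsDownloader code: `TempFileOwner`, `CreateTempLocation`, and `RandomName` are hypothetical, and opening a `std::ofstream` does not give the exclusive-create guarantee that `CreateUnique` implies.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <random>
#include <string>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical stand-in for the downloader; only the "commit on success"
// shape is the point here.
class TempFileOwner {
public:
  // Returns false and leaves mLocation untouched if any step fails.
  bool CreateTempLocation(const fs::path& dir) {
    std::error_code ec;
    if (!fs::is_directory(dir, ec)) return false;

    fs::path location = dir / RandomName();  // stays local until success
    std::ofstream file(location, std::ios::out | std::ios::binary);
    if (!file) return false;  // note: not an exclusive create

    mLocation = std::move(location);  // commit only after the file exists
    return true;
  }

  const fs::path& Location() const { return mLocation; }

private:
  // Eight random characters plus ".tmp", mirroring the shape of the
  // suggestion above.
  static std::string RandomName() {
    static const char kChars[] = "abcdefghijklmnopqrstuvwxyz0123456789";
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<int> pick(0, sizeof(kChars) - 2);
    std::string name;
    for (int i = 0; i < 8; ++i) name += kChars[pick(gen)];
    return name + ".tmp";
  }

  fs::path mLocation;
};
```

The benefit is the invariant stated in the review: if `Location()` is non-empty, the path was valid and the file was created at that point.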

Attachment #691957 - Flags: review?(bsmith) → review+

Michal Novotny [:michal]

Assignee

Comment 77

•

12 years ago

(In reply to Brian Smith (:bsmith) from comment #76)
> Now that you've made this change, mLocationIsTemp will always be true in the
> success case. I suggest that you remove the need for the mLocationIsTemp
> variable by writing the code like this:

mLocationIsTemp will always be true when we set mLocation in nsDownloader::OnStartRequest(), but it will be false if mLocation is set in nsDownloader::Init(). mLocationIsTemp is there to differentiate between these situations so we know whether we should keep or remove the file in the destructor. Or am I missing something?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 78

•

12 years ago

(In reply to Michal Novotny (:michal) from comment #77)
> (In reply to Brian Smith (:bsmith) from comment #76)
> > Now that you've made this change, mLocationIsTemp will always be true in the
> > success case. I suggest that you remove the need for the mLocationIsTemp
> > variable by writing the code like this:
>
> mLocationIsTemp will always be true when we set mLocation in
> nsDownloader::OnStartRequest(), but it will be false if mLocation is set in
> nsDownloader::Init(). mLocationIsTemp is there to differentiate between
> these situations so we know whether we should keep or remove the file in
> the destructor. Or am I missing something?

OK, thanks for pointing that out. Never mind the suggestion to remove mLocationIsTemp. It still seems like a good idea to avoid setting mLocation in the case of an error, though.
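The role of the flag in this exchange can be sketched as follows: a caller-supplied location is kept, while a location the object created for itself is removed on destruction. This is a standalone model and not nsDownloader; `DownloadTarget` and its `Write` method are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical model of "remove the file in the destructor only if we
// created it ourselves".
class DownloadTarget {
public:
  // locationIsTemp == false: caller-supplied file, kept after destruction.
  // locationIsTemp == true: file we created, removed on destruction.
  DownloadTarget(fs::path location, bool locationIsTemp)
      : mLocation(std::move(location)), mLocationIsTemp(locationIsTemp) {}

  DownloadTarget(const DownloadTarget&) = delete;
  DownloadTarget& operator=(const DownloadTarget&) = delete;

  ~DownloadTarget() {
    if (mLocationIsTemp && !mLocation.empty()) {
      std::error_code ec;
      fs::remove(mLocation, ec);  // best effort; errors are ignored
    }
  }

  // Appends data to the target file, creating it if needed.
  bool Write(const std::string& data) {
    std::ofstream out(mLocation, std::ios::binary | std::ios::app);
    out << data;
    return static_cast<bool>(out);
  }

private:
  fs::path mLocation;
  bool mLocationIsTemp;
};
```

This is why the flag cannot simply be dropped: the destructor needs to know who owns the file, and the path alone does not say.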

Alfred Kayser

Updated

•

12 years ago

Blocks: 405407

Alfred Kayser

Updated

•

12 years ago

Blocks: 814010

Michal Novotny [:michal]

Assignee

Comment 79

•

12 years ago

Attached patch patch v4 - necko part — Details — Splinter Review

(In reply to Brian Smith (:bsmith) from comment #78)
> Still seems like a good idea to avoid setting mLocation in the case of an
> error though.

OK, it is fixed in the new patch. Carrying forward r+.

Attachment #691957 - Attachment is obsolete: true

Attachment #692995 - Flags: review+

Michal Novotny [:michal]

Assignee

Comment 80

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/cddc8be15e62
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c02f2a0bd9b
https://hg.mozilla.org/integration/mozilla-inbound/rev/3553adfc7a6c

Michal Novotny [:michal]

Assignee

Updated

•

12 years ago

Whiteboard: [leave open]

Ed Morley [:emorley]

Comment 81

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/cddc8be15e62
https://hg.mozilla.org/mozilla-central/rev/9c02f2a0bd9b
https://hg.mozilla.org/mozilla-central/rev/3553adfc7a6c

Status: NEW → RESOLVED

Closed: 13 years ago → 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla20

Masatoshi Kimura [:emk]

Updated

•

12 years ago

Blocks: 758296

No longer depends on: 758296

Attachment | Added | Author | Size, type | Flags
firefox-debug_10f4_2012-02-09_17-39-23-408.log | 13 years ago | Cees T. | 132.94 KB, text/plain |
patch v1 - removes cacheAsFile functionality | 13 years ago | Michal Novotny [:michal] | 48.29 KB, patch | jduell.mcbugs: review-
patch v2 - plugin part | 12 years ago | Michal Novotny [:michal] | 3.92 KB, patch | jaas: review+, benjamin: review+
patch v2 - media cache | 12 years ago | Michal Novotny [:michal] | 1015 bytes, patch | roc: review+
patch v2 - XHR | 12 years ago | Michal Novotny [:michal] | 8.49 KB, patch | sicking: review+
patch v2 - necko part | 12 years ago | Michal Novotny [:michal] | 33.45 KB, patch |
patch v3 - necko part | 12 years ago | Michal Novotny [:michal] | 29.70 KB, patch | briansmith: review+
patch v4 - necko part | 12 years ago | Michal Novotny [:michal] | 29.78 KB, patch | michal: review+