1253883 - Intermittent ASAN test_browserElement_oop_getWebManifest.html AddressSanitizer: heap-use-after-free in cache2 code

Reporter

Description

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=23100626&repo=mozilla-inbound 14:03:20 INFO - 1865 INFO TEST-START | dom/browser-element/mochitest/test_browserElement_oop_getWebManifest.html 14:03:23 INFO - [Child 7242] WARNING: g_path_get_basename: assertion `file_name != NULL' failed: 'glib warning', file /builds/slave/m-in-l64-asan-0000000000000000/build/src/toolkit/xre/nsSigHandlers.cpp, line 142 14:03:23 INFO - (process:7242): GLib-CRITICAL **: g_path_get_basename: assertion `file_name != NULL' failed 14:03:25 INFO - ----------------------------------------------------- 14:03:25 INFO - Suppressions used: 14:03:25 INFO - count bytes template 14:03:25 INFO - 820 26216 nsComponentManagerImpl 14:03:25 INFO - 6 816 mozJSComponentLoader::LoadModule 14:03:25 INFO - 1 384 pixman_implementation_lookup_composite 14:03:25 INFO - 360 15936 libfontconfig.so 14:03:25 INFO - 8 1816 MessageLoop::MessageLoop 14:03:25 INFO - 1 24 base::WaitableEvent::TimedWait 14:03:25 INFO - 6 192 libcairo.so 14:03:25 INFO - 1 32 libdl.so 14:03:25 INFO - 26 6492 libglib-2.0.so 14:03:25 INFO - ----------------------------------------------------- 14:03:26 INFO - ###################################### BrowserElementCopyPaste.js loaded 14:03:26 INFO - ############################### browserElementPanningAPZDisabled.js loaded 14:03:26 INFO - ############################### browserElementPanning.js loaded 14:03:26 INFO - ######################## BrowserElementChildPreload.js loaded 14:03:26 INFO - ######################## extensions.js loaded 14:03:26 INFO - ###################################### BrowserElementCopyPaste.js loaded 14:03:26 INFO - ############################### browserElementPanningAPZDisabled.js loaded 14:03:26 INFO - ############################### browserElementPanning.js loaded 14:03:26 INFO - ######################## BrowserElementChildPreload.js loaded 14:03:26 INFO - ######################## extensions.js loaded 14:03:27 INFO - ###################################### BrowserElementCopyPaste.js loaded 14:03:27 INFO - ############################### browserElementPanningAPZDisabled.js loaded 14:03:27 INFO - ############################### browserElementPanning.js loaded 14:03:27 INFO - ######################## BrowserElementChildPreload.js loaded 14:03:27 INFO - ######################## extensions.js loaded 14:03:27 INFO - ************************* 14:03:27 INFO - A coding exception was thrown and uncaught in a Task. 14:03:27 INFO - Full message: TypeError: NetworkError when attempting to fetch resource. 14:03:27 INFO - Full stack: 14:03:27 INFO - ************************* 14:03:27 INFO - ************************* 14:03:27 INFO - A coding exception was thrown and uncaught in a Task. 14:03:27 INFO - Full message: TypeError: NetworkError when attempting to fetch resource. 14:03:27 INFO - Full stack: 14:03:27 INFO - ************************* 14:03:27 INFO - MEMORY STAT | vsize 20973037MB | residentFast 652MB 14:03:27 INFO - 1866 INFO TEST-OK | dom/browser-element/mochitest/test_browserElement_oop_getWebManifest.html | took 6796ms 14:03:30 INFO - ================================================================= 14:03:30 INFO - ==5209==ERROR: AddressSanitizer: heap-use-after-free on address 0x627000230100 at pc 0x440ba3 bp 0x7fcb3807a750 sp 0x7fcb3807a730 14:03:30 INFO - READ of size 13480 at 0x627000230100 thread T15 (Cache2 I/O) 14:03:31 INFO - #0 0x440ba2 in write /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:325 14:03:31 INFO - #1 0x7fcb40dd52ef in pt_Write /builds/slave/m-in-l64-asan-0000000000000000/build/src/nsprpub/pr/src/pthreads/ptio.c:1315 14:03:31 INFO - ----------------------------------------------------- 14:03:31 INFO - Suppressions used: 14:03:31 INFO - count bytes template 14:03:31 INFO - 820 26216 nsComponentManagerImpl 14:03:31 INFO - 6 816 mozJSComponentLoader::LoadModule 14:03:31 INFO - 1 384 pixman_implementation_lookup_composite 14:03:31 INFO - 360 15936 libfontconfig.so 14:03:31 INFO - 8 1816 MessageLoop::MessageLoop 14:03:31 INFO - 1 24 base::WaitableEvent::TimedWait 14:03:31 INFO - 6 192 libcairo.so 14:03:31 INFO - 1 32 libdl.so 14:03:31 INFO - 26 6492 libglib-2.0.so 14:03:31 INFO - ----------------------------------------------------- 14:03:33 INFO - #2 0x7fcb26133d1f in (anonymous namespace)::interposedWrite(PRFileDesc*, void const*, int) /builds/slave/m-in-l64-asan-0000000000000000/build/src/xpcom/build/NSPRInterposer.cpp:72 14:03:33 INFO - #3 0x7fcb2651943f in mozilla::net::CacheFileIOManager::WriteInternal(mozilla::net::CacheFileHandle*, long, char const*, int, bool, bool) /builds/slave/m-in-l64-asan-0000000000000000/build/src/netwerk/cache2/CacheFileIOManager.cpp:2024 14:03:33 INFO - #4 0x7fcb265696f6 in mozilla::net::WriteEvent::Run() /builds/slave/m-in-l64-asan-0000000000000000/build/src/netwerk/cache2/CacheFileIOManager.cpp:725 14:03:33 INFO - #5 0x7fcb2653f739 in mozilla::net::CacheIOThread::LoopOneLevel(unsigned int) /builds/slave/m-in-l64-asan-0000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:294 14:03:33 INFO - #6 0x7fcb2653ef3c in mozilla::net::CacheIOThread::ThreadFunc() /builds/slave/m-in-l64-asan-0000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:224 14:03:33 INFO - #7 0x7fcb2653e983 in mozilla::net::CacheIOThread::ThreadFunc(void*) /builds/slave/m-in-l64-asan-0000000000000000/build/src/netwerk/cache2/CacheIOThread.cpp:173 14:03:33 INFO - #8 0x7fcb40ddd3cf in _pt_root /builds/slave/m-in-l64-asan-0000000000000000/build/src/nsprpub/pr/src/pthreads/ptthread.c:216 14:03:33 INFO - #9 0x7fcb453afe99 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7e99) 14:03:33 INFO - #10 0x7fcb444bf38c (/lib/x86_64-linux-gnu/libc.so.6+0xf338c) 14:03:33 INFO - ASAN:SIGSEGV 14:03:33 INFO - ==5209==AddressSanitizer: while reporting a bug found another one.Ignoring. 14:03:33 INFO - TEST-INFO | Main app process: exit 1 14:03:33 INFO - 1867 INFO TEST-START | Shutdown 14:03:33 INFO - 1868 INFO Passed: 1517 14:03:33 INFO - 1869 INFO Failed: 0 14:03:33 INFO - 1870 INFO Todo: 12 14:03:33 INFO - 1871 INFO Slowest: 12181ms - /tests/dom/browser-element/mochitest/test_browserElement_oop_AudioChannel.html 14:03:33 INFO - 1872 INFO SimpleTest FINISHED 14:03:33 INFO - 1873 INFO TEST-INFO | Ran 1 Loops 14:03:33 INFO - 1874 INFO SimpleTest FINISHED 14:03:33 WARNING - TEST-UNEXPECTED-FAIL | dom/browser-element/mochitest/test_browserElement_oop_getWebManifest.html | application terminated with exit code 1

Michal Novotny [:michal]

Assignee

Updated

•

9 years ago

Assignee: nobody → michal.novotny

Daniel Veditz [:dveditz]

Updated

•

9 years ago

Group: core-security → network-core-security

Matt Wobensmith [:mwobensmith][:matt:]

Updated

•

9 years ago

Keywords: sec-high

Patrick McManus [:mcmanus]

Updated

•

9 years ago

Whiteboard: [necko-active]

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 1

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=23469675&repo=mozilla-inbound

Comment 2

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=23629177&repo=mozilla-inbound

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 3

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=2144120&repo=mozilla-aurora

status-firefox48: --- → affected

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 4

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=3496203&repo=mozilla-central

Michal Novotny [:michal]

Assignee

Comment 5

•

9 years ago

Attached patch fix (obsolete) — Details — Splinter Review

I was not able to find exact cause of this but I guess that the fix in bug 1248958 doesn't work in all cases. I think this write event must be from CacheIndex because both CacheFileMetadata and CacheFileChunk release their write buffer in the callback or in destructor.

Attachment #8740947 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 6

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1596b3c0a528

Honza Bambas (:mayhemer)

Comment 7

•

9 years ago

Comment on attachment 8740947 [details] [diff] [review] fix Review of attachment 8740947 [details] [diff] [review]: ----------------------------------------------------------------- I really don't know what to think about this patch. There is a problem (serious) in this code that we don't fully understand. This flag is something I wouldn't trust, because it's built with assumption of otherwise functional code, which we don't have. And in the past it showed up having a patch built on a hunch was not a good thing to do, specially in case of a security bug. Could we rather change the buffering ownership somehow, or even better - find out the true cause? I think that is the fragile thing here at the first place.

Attachment #8740947 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 8

•

9 years ago

Comment on attachment 8740947 [details] [diff] [review] fix (In reply to Honza Bambas (:mayhemer) from comment #7) > I really don't know what to think about this patch. There is a problem > (serious) in this code that we don't fully understand. This flag is > something I wouldn't trust, because it's built with assumption of otherwise > functional code, which we don't have. And in the past it showed up having a > patch built on a hunch was not a good thing to do, specially in case of a > security bug. This is simply a proper fix of the problem discovered in bug 1248958. The buffer handling in CacheIndex is incorrect because the buffer can be freed when read/write event is pending. After many many tries I was able to reproduce this crash and the patch in bug 1248958 doesn't work because it relies on assumption that no read/write event can be posted once CacheIndex::mShuttingDown is set to true, but it can. The sequence is as follows: 1) CacheIndex::WriteIndexToDisk 2) CacheIndex::ChangeState READY -> WRITING 3) CacheFileIOManager::OpenFile index.tmp 4) CacheFileIOManager::Shutdown 5) CacheIndex::PreShutdown PreShutdownInternal posted on WRITE level 6) CacheIndex::OnFileOpenedInternal doesn't return early because mState != READY 7) CacheIndex::WriteRecords 8) CacheFileIOManager::Write write event pointing to mRWBuf is posted 9) CacheIndex::PreShutdownInternal 10) CacheIndex::FinishWrite mRWBuf is freed This patch will solve the crash. I also need to review mShuttingDown check because I'm not sure why the condition includes mState == READY. Anyway, it's much safer to take this patch for now than changing mShuttingDown check. There is also a benefit from this patch that we don't need to post PreShutdownInternal on WRITE level so we'll shut the index down faster.

Attachment #8740947 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 9

•

9 years ago

Is this really sec-high? It can happen only during shutdown and the crash needs exact timing which IMO cannot be influenced by any web page, add-on or plug-in.

Flags: needinfo?(mwobensmith)

Honza Bambas (:mayhemer)

Comment 10

•

9 years ago

(In reply to Michal Novotny (:michal) from comment #8) > This is simply a proper fix of the problem discovered in bug 1248958. The The true reason I don't much like the patch: having a flag for buffer management where there can potentially be more then one pending operation the flag relates to - there is nothing that ensures just one write op is happening, I also mean when we later change something - is simply a very bad coding architecture. It's fragile, and will probably break with future changes, even just not directly related, like what we've done to reveal this very bug. Or at least might make finding a cause of any issues around this code hard, simply because it's not reliable from its nature. I would prefer to simply change the buffer referencing here (have a referenced wrapper for your buffers), which is IMO the only true correct and safe fix here, which would remove any potential for a hard-to-find crash like this once and for all. I don't remember I would ever suggest changing mShuttingDown logic, or did I? Anyway, it's your code, your decision. I will try to get to this patch review soon if you don't change your mind, but patches like these are extremely hard to check on also because the cache index code is pretty much complex already and as shown with bug 1248958 pretty much sensitive to changes of the above layers.

Matt Wobensmith [:mwobensmith][:matt:]

Comment 11

•

9 years ago

The rating was determined by a consensus at our weekly triage. What do you feel would be a better rating?

Flags: needinfo?(mwobensmith) → needinfo?(michal.novotny)

Michal Novotny [:michal]

Assignee

Comment 12

•

9 years ago

(In reply to Matt Wobensmith [:mwobensmith][:matt] from comment #11) > The rating was determined by a consensus at our weekly triage. What do you > feel would be a better rating? This bug is IMO not exploitable, so sec-moderate or sec-low would be probably more appropriate.

Flags: needinfo?(michal.novotny)

Matt Wobensmith [:mwobensmith][:matt:]

Comment 13

•

9 years ago

Andrew, do you recall our logic in rating this a sec-high? Or, based on the information above, do you think a lower rating is more appropriate?

Flags: needinfo?(continuation)

Michal Novotny [:michal]

Assignee

Comment 14

•

9 years ago

(In reply to Honza Bambas (:mayhemer) from comment #10) > The true reason I don't much like the patch: having a flag for buffer > management where there can potentially be more then one pending operation > the flag relates to - there is nothing that ensures just one write op is > happening This is nonsense and it shows that you're commenting without understanding the patch and the code. Depending on the state in which CacheIndex is, there can be reading or writing in progress, but not both at the same time. In both states there can be only one read or write operation in progress. All we need to do is to ensure that the buffer is not freed before the read/write operation finishes. This is currently broken only in case we want to cancel reading/writing of the index which we do in case of shutdown and from CacheIndex::RemoveAll (this code is currently unused). > I don't remember I would ever suggest changing mShuttingDown logic, or did I? I don't remember I said you suggested it, or did I? I mentioned it because changing the mShuttingDown check would probably make the patch in bug 1248958 working, but doing this would be IMO a bad decision just like in bug 1248958 (comment #2).

Andrew McCreight [:mccr8]

Comment 15

•

9 years ago

(In reply to Matt Wobensmith [:mwobensmith][:matt] from comment #13) > Andrew, do you recall our logic in rating this a sec-high? Or, based on the > information above, do you think a lower rating is more appropriate? We usually rate use-after-frees sec-critical, so one that is hard to trigger gets sec-high. But it sounds like this is very removed from content, and timing-dependent, and at shutdown, so it probably doesn't need to be rated like that. I'll just mark it sec-other.

Flags: needinfo?(continuation)

Keywords: sec-high → sec-other

Honza Bambas (:mayhemer)

Comment 16

•

9 years ago

Comment on attachment 8740947 [details] [diff] [review] fix Review of attachment 8740947 [details] [diff] [review]: ----------------------------------------------------------------- I'm leaving r? as few answers are needed first and also two small but important changes are definitely needed. As I understand, today cache index cannot be writing anything before reading is done, right? The only way to start index write is through WriteIndexToDiskIfNeeded(), which happens on any updated coming from the platform side and when the index is READY. So write scheduling is bypassed also when WRITING, is that so? ::: netwerk/cache2/CacheIndex.cpp @@ +1564,5 @@ > > bool > CacheIndex::WriteIndexToDiskIfNeeded() > { > + if (mState != READY || mShuttingDown || mRWPending) { do we need to reschedule this? as I understand, this method is called whenever we do an update to the index data from the platform side. could it happen we end up with some unwritten data because of this? is it worth complicating the code? also, can it actually ever happen that mState == READY && mRWPending? because when I'm looking into the code, it seems it can only be READING or WRITING while mRWPending == true, or with different state somehow during shutdown (but that is beyond my willingness of putting even more cycles to this review to find out exactly) based on that, this condition will never reach a point when mState == READY && !mShuttingDown && mRWPending. so, adding || mRWPending is probably a dead code. if my assertion here is true, then your change in CacheIndex::OnDataWritten (the handle check) is probably not correct. or am I wrong? @@ +3182,5 @@ > > void > CacheIndex::ReleaseBuffer() > { > + if (!mRWBuf || mRWPending) { ReleaseBuffer() may be called on the main thread: CacheIndex::~CacheIndex() CacheIndex::Shutdown() -> index->FinishWrite(false) CacheIndex::RemoveAll() -> index->FinishWrite(false) CacheIndex::Shutdown() -> index->FinishRead(false) CacheIndex::RemoveAll() -> index->FinishRead(false) CacheIndex::Init -> CacheIndex::InitInternal -> CacheIndex::ReadIndexFromDisk() -> FinishRead(false) according how this code and the flag work, it is probably OK to make the mRWPending flag simply Atomic (at least rel-acq) @@ +3412,5 @@ > return NS_ERROR_NOT_AVAILABLE; > } > > + MOZ_ASSERT(mRWPending); > + mRWPending = false; why not before if (!IsIndexUsable()) check? I think we should drop this flag regardless the index state, no? because otherwise ReleaseBuffers() called from dtor will not release the buffer and we will leak. @@ +3421,5 @@ > > switch (mState) { > case WRITING: > + // mRWPending prevents dispatching another write event, so the handle must > + // be mIndexHandle. hm.. so what write operation was that before? because the flag prevents (IIUC) only WriteRecords, which is the only one working with mRWBuf. Do you want to say it was possible that two write operations where sharing the mRWBuf reference? @@ +3445,5 @@ > default: > // Writing was canceled. > LOG(("CacheIndex::OnDataWritten() - ignoring notification since the " > "operation was previously canceled [state=%d]", mState)); > + ReleaseBuffer(); There are possible code paths that may lead to not releasing the buffer at all (until dtor), like when here we case for WRITING, then do the file rename, ending up at CacheIndex::OnFileRenamed, that may not get to call FinishWrite (which would otherwise release the buffer). This may not be a big deal as we release the buffer in dtor of CacheIndex (when this patch is fixed, as above), but still it may be a waste to keep the buffer around when not needed. @@ +3464,5 @@ > return NS_ERROR_NOT_AVAILABLE; > } > > + MOZ_ASSERT(mRWPending); > + mRWPending = false; here as well, why not before if (!IsIndexUsable()) check? ::: netwerk/cache2/CacheIndex.h @@ +990,5 @@ > uint32_t mRWBufSize; > uint32_t mRWBufPos; > RefPtr<CacheHash> mRWHash; > > + // True if read or write operation is pending. It is used to ensure that probably worth mentioning that it applies only to write operations that work with mRWBuf member.

Michal Novotny [:michal]

Assignee

Comment 17

•

9 years ago

Thanks for the feedback. (In reply to Honza Bambas (:mayhemer) from comment #16) > As I understand, today cache index cannot be writing anything before reading > is done, right? The only way to start index write is through > WriteIndexToDiskIfNeeded(), which happens on any updated coming from the > platform side and when the index is READY. So write scheduling is bypassed > also when WRITING, is that so? Right. > > bool > > CacheIndex::WriteIndexToDiskIfNeeded() > > { > > + if (mState != READY || mShuttingDown || mRWPending) { > > do we need to reschedule this? as I understand, this method is called > whenever we do an update to the index data from the platform side. could it > happen we end up with some unwritten data because of this? is it worth > complicating the code? Right, we call it from methods that could have changed the data in index. There is definitely no need to reschedule it, because it will be called on the next change. If there is no other change then all unwritten changes will be written to the journal during shutdown. If the process crashes, then those unwritten changes will be restored during update process on the next start. > also, can it actually ever happen that mState == READY && mRWPending? > because when I'm looking into the code, it seems it can only be READING or > WRITING while mRWPending == true, or with different state somehow during > shutdown (but that is beyond my willingness of putting even more cycles to > this review to find out exactly) > > based on that, this condition will never reach a point when mState == READY > && !mShuttingDown && mRWPending. so, adding || mRWPending is probably a > dead code. > > if my assertion here is true, then your change in CacheIndex::OnDataWritten > (the handle check) is probably not correct. > > or am I wrong? This happens when read/write operation is pending and FinishRead/FinishWrite is called to cancel the read/write process. In current code this can happen only during shutdown and from CacheIndex::RemoveAll. Given that RemoveAll() is and probably will stay unused, I could remove this method and then the current check would be sufficient. > > @@ +3182,5 @@ > > > > void > > CacheIndex::ReleaseBuffer() > > { > > + if (!mRWBuf || mRWPending) { > > ReleaseBuffer() may be called on the main thread: > > CacheIndex::~CacheIndex() > CacheIndex::Shutdown() -> index->FinishWrite(false) > CacheIndex::RemoveAll() -> index->FinishWrite(false) > CacheIndex::Shutdown() -> index->FinishRead(false) > CacheIndex::RemoveAll() -> index->FinishRead(false) > CacheIndex::Init -> CacheIndex::InitInternal -> > CacheIndex::ReadIndexFromDisk() -> FinishRead(false) > > according how this code and the flag work, it is probably OK to make the > mRWPending flag simply Atomic (at least rel-acq) All calls to ReleaseBuffer are called under sLock (also all other mRWPending accesses) except the call from destructor. So is it necessary to make it atomic? What could go wrong? Anyway, I know you don't like locks, but since it is already there it would be probably cleaner to grab the static lock in the destructor and put sLock.AssertCurrentThreadOwns() into ReleaseBuffer(). > > + MOZ_ASSERT(mRWPending); > > + mRWPending = false; > > why not before if (!IsIndexUsable()) check? I think we should drop this > flag regardless the index state, no? because otherwise ReleaseBuffers() > called from dtor will not release the buffer and we will leak. Good question. In fact IsIndexUsable must return true in these methods. It is not usable when the state is INITIAL or SHUTDOWN. In case the state is INITIAL nobody called CacheFileIOManager::Read or CacheFileIOManager::Write so we cannot be here. When the state is SHUTDOWN IO thread no longer exists so these callbacks cannot be called. I'll change it to assertion. > > switch (mState) { > > case WRITING: > > + // mRWPending prevents dispatching another write event, so the handle must > > + // be mIndexHandle. > > hm.. so what write operation was that before? because the flag prevents > (IIUC) only WriteRecords, which is the only one working with mRWBuf. Do you > want to say it was possible that two write operations where sharing the > mRWBuf reference? This code was simply wrong and it couldn't happen that mIndexHandle != aHandle. Not even back in the days when method CacheIndex::RemoveAll was used. Simply because the index was empty and to start writing the index we would need to make 300 changes in the index and those changes would be queued after the pending write event that was canceled. > @@ +3445,5 @@ > > default: > > // Writing was canceled. > > LOG(("CacheIndex::OnDataWritten() - ignoring notification since the " > > "operation was previously canceled [state=%d]", mState)); > > + ReleaseBuffer(); > > There are possible code paths that may lead to not releasing the buffer at > all (until dtor), like when here we case for WRITING, then do the file > rename, ending up at CacheIndex::OnFileRenamed, that may not get to call > FinishWrite (which would otherwise release the buffer). I don't think there is such code path. The state is changed in FinishRead/FinishWrite so if don't call it we will stay in READ/WRITE state forever. > ::: netwerk/cache2/CacheIndex.h > @@ +990,5 @@ > > uint32_t mRWBufSize; > > uint32_t mRWBufPos; > > RefPtr<CacheHash> mRWHash; > > > > + // True if read or write operation is pending. It is used to ensure that > > probably worth mentioning that it applies only to write operations that work > with mRWBuf member. I'm not sure I understand this. All read or write operations that call OnDataRead and OnDataWritten back use mRWBuf. CacheFileIOManager::Write that marks the index dirty (in CacheIndex::ParseRecords) doesn't call OnDataWritten because it is called with aCallback==nullptr.

Michal Novotny [:michal]

Assignee

Comment 18

•

9 years ago

Comment on attachment 8740947 [details] [diff] [review] fix I will attach a new patch soon.

Attachment #8740947 - Attachment is obsolete: true

Attachment #8740947 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 19

•

9 years ago

(In reply to Michal Novotny (:michal) from comment #17) > it would be probably cleaner to grab the static lock in > the destructor and put sLock.AssertCurrentThreadOwns() into ReleaseBuffer(). AFAICS, destructor is already called under the lock because CacheIndex::Shutdown should release the last reference and it does it under the lock.

Honza Bambas (:mayhemer)

Comment 20

•

9 years ago

(In reply to Michal Novotny (:michal) from comment #17) > > also, can it actually ever happen that mState == READY && mRWPending? > > because when I'm looking into the code, it seems it can only be READING or > > WRITING while mRWPending == true, or with different state somehow during > > shutdown (but that is beyond my willingness of putting even more cycles to > > this review to find out exactly) > > > > based on that, this condition will never reach a point when mState == READY > > && !mShuttingDown && mRWPending. so, adding || mRWPending is probably a > > dead code. > > > > if my assertion here is true, then your change in CacheIndex::OnDataWritten > > (the handle check) is probably not correct. > > > > or am I wrong? > > This happens when read/write operation is pending and FinishRead/FinishWrite > is called to cancel the read/write process. In current code this can happen > only during shutdown and from CacheIndex::RemoveAll. Given that RemoveAll() > is and probably will stay unused, I could remove this method and then the > current check would be sufficient. Based on your comment (1) below, this is moot. > > according how this code and the flag work, it is probably OK to make the > > mRWPending flag simply Atomic (at least rel-acq) > > All calls to ReleaseBuffer are called under sLock (also all other mRWPending > accesses) except the call from destructor. So is it necessary to make it > atomic? What could go wrong? Anyway, I know you don't like locks, but since > it is already there it would be probably cleaner to grab the static lock in > the destructor and put sLock.AssertCurrentThreadOwns() into ReleaseBuffer(). Please definitely assert the lock in ReleaseBuffer() and make sure that each method itself that is using this new flag does so as well. > > > switch (mState) { > > > case WRITING: > > > + // mRWPending prevents dispatching another write event, so the handle must > > > + // be mIndexHandle. > > > > hm.. so what write operation was that before? because the flag prevents > > (IIUC) only WriteRecords, which is the only one working with mRWBuf. Do you > > want to say it was possible that two write operations where sharing the > > mRWBuf reference? > (1) > This code was simply wrong and it couldn't happen that mIndexHandle != > aHandle. Not even back in the days when method CacheIndex::RemoveAll was > used. Simply because the index was empty and to start writing the index we > would need to make 300 changes in the index and those changes would be > queued after the pending write event that was canceled. Good! So there will be no if and no assertion in the next patch? > > > @@ +3445,5 @@ > > > default: > > > // Writing was canceled. > > > LOG(("CacheIndex::OnDataWritten() - ignoring notification since the " > > > "operation was previously canceled [state=%d]", mState)); > > > + ReleaseBuffer(); > > > > There are possible code paths that may lead to not releasing the buffer at > > all (until dtor), like when here we case for WRITING, then do the file > > rename, ending up at CacheIndex::OnFileRenamed, that may not get to call > > FinishWrite (which would otherwise release the buffer). > > I don't think there is such code path. The state is changed in > FinishRead/FinishWrite so if don't call it we will stay in READ/WRITE state > forever. I still think there are potential paths. When the index state changes between (e.g.) file rename, we may not get to the point OnFileRenamed calls FinishWrite. There is a potential for that, and when there is a potential, there is usually a problem. > > > ::: netwerk/cache2/CacheIndex.h > > @@ +990,5 @@ > > > uint32_t mRWBufSize; > > > uint32_t mRWBufPos; > > > RefPtr<CacheHash> mRWHash; > > > > > > + // True if read or write operation is pending. It is used to ensure that > > > > probably worth mentioning that it applies only to write operations that work > > with mRWBuf member. > > I'm not sure I understand this. All read or write operations that call > OnDataRead and OnDataWritten back use mRWBuf. CacheFileIOManager::Write that > marks the index dirty (in CacheIndex::ParseRecords) doesn't call > OnDataWritten because it is called with aCallback==nullptr. Ups, you are right, not sure now what was this comment exactly about. Ignore me at this point. (In reply to Michal Novotny (:michal) from comment #19) > (In reply to Michal Novotny (:michal) from comment #17) > > it would be probably cleaner to grab the static lock in > > the destructor and put sLock.AssertCurrentThreadOwns() into ReleaseBuffer(). > > AFAICS, destructor is already called under the lock because > CacheIndex::Shutdown should release the last reference and it does it under > the lock. Even better. Thanks.

Michal Novotny [:michal]

Assignee

Comment 21

•

9 years ago

Attached patch patch v2 — Details — Splinter Review

changes in this patch: - releasing of mRWBuf is now logged - sLock.AssertCurrentThreadOwns() is in every method using mRWPending - more assertions (with comment) in CacheIndex::FinishWrite and CacheIndex::FinishRead - added assertion to check correct thread in CacheFileIOListener methods - IsIndexUsable() check changed to assertion in CacheFileIOListener methods (In reply to Honza Bambas (:mayhemer) from comment #20) > > This happens when read/write operation is pending and FinishRead/FinishWrite > > is called to cancel the read/write process. In current code this can happen > > only during shutdown and from CacheIndex::RemoveAll. Given that RemoveAll() > > is and probably will stay unused, I could remove this method and then the > > current check would be sufficient. In the end I didn't remove the method because I would need to remove CacheFileIOManager::EvictAll(), CacheFileIOManager::EvictAllInternal() and all references to CacheIndex::mRemovingAll too. This would be quite a large change. > > This code was simply wrong and it couldn't happen that mIndexHandle != > > aHandle. Not even back in the days when method CacheIndex::RemoveAll was > > used. Simply because the index was empty and to start writing the index we > > would need to make 300 changes in the index and those changes would be > > queued after the pending write event that was canceled. > > Good! So there will be no if and no assertion in the next patch? I left the assertion just to be consistent with OnDataRead which contains the assertion too. But I removed the comment which was a bit confusing. > I still think there are potential paths. When the index state changes > between (e.g.) file rename, we may not get to the point OnFileRenamed calls > FinishWrite. There is a potential for that, and when there is a potential, > there is usually a problem. In every On... method we either continue with next step of the read/write process or we call FinishRead/Write to fail it which then switches index to READY state. I agree that this isn't the nicest code but it seems to work. OTOH it is true that if we get stuck in READ or WRITE state we probably would not notice it. I'll file a follow up bug to clean this up.

Attachment #8746530 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 22

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0e22c6e7dfd

Honza Bambas (:mayhemer)

Updated

•

9 years ago

Attachment #8746530 - Flags: review?(honzab.moz) → review+

Carsten Book [:Tomcat]

Comment 23

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/398b93819760

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox49: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla49

Daniel Veditz [:dveditz]

Updated

•

9 years ago

Group: network-core-security → core-security-release

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 24

•

9 years ago

Probably too late for Beta at this point, but can we please uplift this to Aurora at least?

Flags: needinfo?(michal.novotny)

Michal Novotny [:michal]

Assignee

Comment 25

•

9 years ago

Comment on attachment 8746530 [details] [diff] [review] patch v2 Approval Request Comment [Feature/regressing bug #]: probably 923016 [User impact if declined]: possibly using a freed memory during shutdown [Describe test coverage new/current, TreeHerder]: the cache index is written to disk when running mochitests which use the cache, but there is no test of canceled writing of the index during shutdown [Risks and why]: probably low, the code has been in the tree for a while already and there is no crash or use-after-free report [String/UUID change made/needed]: none

Flags: needinfo?(michal.novotny)

Attachment #8746530 - Flags: approval-mozilla-aurora?

Sylvestre Ledru [:Sylvestre]

Updated

•

9 years ago

status-firefox47: affected → wontfix

Sylvestre Ledru [:Sylvestre]

Comment 26

•

9 years ago

Comment on attachment 8746530 [details] [diff] [review] patch v2 Fix an intermittent, taking it

Attachment #8746530 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Carsten Book [:Tomcat]

Comment 27

•

9 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/d5c83c0c2aff

status-firefox48: affected → fixed

Matt Wobensmith [:mwobensmith][:matt:]

Updated

•

9 years ago

Whiteboard: [necko-active] → [necko-active][post-critsmash-triage]

Al Billings [:abillings - ex-MoCo]

Updated

•

9 years ago

Whiteboard: [necko-active][post-critsmash-triage] → [necko-active][post-critsmash-triage][adv-main48-]

Daniel Veditz [:dveditz]

Updated

•

9 years ago

Group: core-security-release

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Keywords: csectype-uaf

Sylvestre Ledru [:Sylvestre]

Updated

•

5 years ago

Blocks: asan-maintenance

fix 9 years ago Michal Novotny [:michal] 11.21 KB, patch		Details \| Diff \| Splinter Review
patch v2 9 years ago Michal Novotny [:michal] 15.08 KB, patch	mayhemer : review+ Sylvestre : approval-mozilla-aurora+	Details \| Diff \| Splinter Review