923016 - HTTP cache v2: design and impl a persistent index

- removed unused code - removed assertions that are no longer valid - fixed possible wrong usage of CacheIndexEntryAutoManage There is still a strange assertion on OS X 10.8 Debug that needs to be fixed: https://tbpl.mozilla.org/php/getParsedLog.php?id=32378794&tree=Try#error0

Attachment #8351394 - Attachment is obsolete: true

Michal Novotny [:michal]

Assignee

Comment 4

•

11 years ago

Attached patch WIP patch - updating, building v3 (obsolete) — Details — Splinter Review

- fixed assertion on OS X 10.8 Debug - added checks of correct usage of CacheIndexEntryAutoManage

Attachment #8351433 - Attachment is obsolete: true

Michal Novotny [:michal]

Assignee

Comment 5

•

11 years ago

Comment on attachment 8351527 [details] [diff] [review] WIP patch - updating, building v3 Landed on gum https://hg.mozilla.org/projects/gum/rev/38de00102cb8

Michal Novotny [:michal]

Assignee

Comment 6

•

11 years ago

Comment on attachment 8340511 [details] [diff] [review] WIP patch Landed on gum https://hg.mozilla.org/projects/gum/rev/7a67eb536bb4

Michal Novotny [:michal]

Assignee

Comment 7

•

11 years ago

Attached patch WIP patch - various fixes (obsolete) — Details — Splinter Review

- delayed update/build to minimize unnecessary IO during startup - CacheFile::Init() uses index - better time limit in CacheIndex::BuildIndex()/UpdateIndex() - converted mLastDumpTime to TimeStamp - PR_htonl/PR_ntohl replaced with NetworkEndian functions

Michal Novotny [:michal]

Assignee

Comment 8

•

11 years ago

Comment on attachment 8351644 [details] [diff] [review] WIP patch - various fixes Landed on gum https://hg.mozilla.org/projects/gum/rev/084327a51eb8

Michal Novotny [:michal]

Assignee

Comment 9

•

11 years ago

Attached patch WIP patch - another fixes (obsolete) — Details — Splinter Review

pushed to gum https://hg.mozilla.org/projects/gum/rev/e7d4f3803b2b

Michal Novotny [:michal]

Assignee

Comment 10

•

11 years ago

Attached patch WIP patch - fixed delayed updating/building (obsolete) — Details — Splinter Review

Michal Novotny [:michal]

Assignee

Comment 11

•

11 years ago

Comment on attachment 8355088 [details] [diff] [review] WIP patch - fixed delayed updating/building https://hg.mozilla.org/projects/gum/rev/1eb52df806df

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Blocks: 949499

No longer depends on: 949499

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Blocks: 948566

No longer depends on: 948566

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Blocks: 948570

No longer depends on: 948570

Michal Novotny [:michal]

Assignee

Comment 14

•

11 years ago

Attached patch single WIP patch (obsolete) — Details — Splinter Review

This is a merged patch of all patches from this bug and bugs #948570 and #949499. Also contains following patches that don't belong to any bug: https://hg.mozilla.org/projects/gum/rev/7f39c19bd254 https://hg.mozilla.org/projects/gum/rev/696603413510 https://hg.mozilla.org/projects/gum/rev/31aa5203add6 https://hg.mozilla.org/projects/gum/rev/c99208c44aff

Attachment #8340511 - Attachment is obsolete: true

Attachment #8351527 - Attachment is obsolete: true

Attachment #8351644 - Attachment is obsolete: true

Attachment #8354927 - Attachment is obsolete: true

Attachment #8355088 - Attachment is obsolete: true

Michal Novotny [:michal]

Assignee

Comment 15

•

11 years ago

Attached patch single WIP patch v2 (obsolete) — Details — Splinter Review

Yet another small change. I think it is ready for review. Try server push https://tbpl.mozilla.org/?tree=Try&rev=d2023ceaa748

Attachment #8355557 - Attachment is obsolete: true

Attachment #8355597 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 16

•

11 years ago

Attached patch single WIP patch v1->v2 interdiff (obsolete) — Details — Splinter Review

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

No longer blocks: 948566

Michal Novotny [:michal]

Assignee

Comment 17

•

11 years ago

Attached patch single WIP patch v3 (obsolete) — Details — Splinter Review

- removed unused methods InsertEntryToArrays(), RemoveEntryFromArrays() - fixed heap-use-after-free reported in bug #958311

Michal Novotny [:michal]

Assignee

Updated

•

11 years ago

Attachment #8363219 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Comment 18

•

11 years ago

Attached patch single WIP patch v3 (as to land after bugs 945945 and 945250) (obsolete) — Details — Splinter Review

Identical to WIP v3, only on top of the two patches. Only locally net xpcshell tests checked. I will review this patch, as it seems we have a consensus to land first the crash patches.

Attachment #8363313 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Comment 19

•

11 years ago

Comment on attachment 8363219 [details] [diff] [review] single WIP patch v3 This patch can no longer be applied on the current gum (m-c). The remerged version is the current version to deal with.

Attachment #8363219 - Attachment is obsolete: true

Attachment #8363219 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Comment 20

•

11 years ago

Comment on attachment 8363313 [details] [diff] [review] single WIP patch v3 (as to land after bugs 945945 and 945250) Needs a little more merging.

Attachment #8363313 - Attachment is obsolete: true

Attachment #8363313 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Comment 21

•

11 years ago

Attached patch merged to the current gum WIP v3 [by Michal] (obsolete) — Details — Splinter Review

Attachment #8363808 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Status: NEW → ASSIGNED

Honza Bambas (:mayhemer)

Reporter

Comment 22

•

11 years ago

Comment on attachment 8355597 [details] [diff] [review] single WIP patch v2 Review of attachment 8355597 [details] [diff] [review]: ----------------------------------------------------------------- r- to refer the comment this is just a draft of some thoughts on the patch, most of them are either nits or stuff to discuss CacheIndex.* is left to go through for me ::: netwerk/cache2/CacheFile.cpp @@ +449,4 @@ > flags = CacheFileIOManager::CREATE; > > + if (!mKeyIsHash) { > + // Have a look into index and change to CREATE_NEW when we are sure into the index? @@ +467,5 @@ > + // operation can change the result. > + nsRefPtr<NotifyCacheFileListenerEvent> ev; > + ev = new NotifyCacheFileListenerEvent(aCallback, NS_OK, true); > + rv = NS_DispatchToCurrentThread(ev); > + NS_ENSURE_SUCCESS(rv, rv); question, not a requirement if not easy: could we just call directly? the code in cacheentry is prepared for it. ::: netwerk/cache2/CacheFileIOManager.cpp @@ +77,5 @@ > , mFD(nullptr) > { > + LOG(("CacheFileHandle::CacheFileHandle() [this=%p, hash=%08x%08x%08x%08x%08x]" > + , this, LOGSHA1(aHash))); > + MOZ_COUNT_CTOR(CacheFileHandle); we should remove these duplicate counters when there is the default impl for addref/release. I have some too... (no need to do it in this bug) @@ +137,5 @@ > + MOZ_ASSERT(mFileSize != -1); > + uint64_t size64 = mFileSize; > + > + if (size64 & 0x3FF) > + size64 += 0x400; won't just += 3ff do the same job? @@ +533,5 @@ > if (mTarget) { > mRV = NS_OK; > > + if (mFlags & CacheFileIOManager::SPECIAL_FILE) { > + } at least comment there is nothing to do for it or so... this can be easily overlooked @@ +715,5 @@ > { > MOZ_COUNT_DTOR(WriteEvent); > + > + if (!mCallback && mBuf) { > + free(const_cast<char *>(mBuf)); isn't there auto-free ptr? @@ +1059,5 @@ > +protected: > + nsRefPtr<CacheFileHandle> mHandle; > + bool mHasFrecency; > + bool mHasExpirationTime; > + bool mHasSize; bit fields please as the last member if not needed for mt access @@ +1381,5 @@ > + } > + else { > + handle->mFileSize = 0; > + > + CacheIndex::AddEntry(aHash); is it possible the entry is already there for the hash? if so, what then happens? @@ +1400,5 @@ > + > + nsresult rv; > + > + if (mShuttingDown) > + return NS_ERROR_NOT_INITIALIZED; hmm.. what if we want to process this after mShuttingDown = true? can that happen? why should we disallow that? @@ +1415,5 @@ > + nsRefPtr<CacheFileHandle> handle; > + for (uint32_t i = 0 ; i < mSpecialHandles.Length() ; i++) { > + if (!mSpecialHandles[i]->IsDoomed() && mSpecialHandles[i]->Key() == aKey) { > + handle = mSpecialHandles[i]; > + break; might backward search be faster? since as I understand doomed handles for the same key has lower indexes then the one we search for here. @@ +1528,5 @@ > aHandle->mFile->Remove(false); > } > > + if (!aHandle->IsSpecialFile() && !aHandle->mIsDoomed && > + (aHandle->mInvalid || !aHandle->mFileExists)) { not sure, but isn't aHandle->mFileExists always true? regardless that, some comments on this complicated condition would be good. @@ +1667,5 @@ > > + if (bytesWritten != -1 && aHandle->mFileSize < aOffset+bytesWritten) { > + aHandle->mFileSize = aOffset+bytesWritten; > + > + if (!aHandle->IsDoomed() && !aHandle->IsSpecialFile()) { Could we rename SpecialFile to SimpleFile or BasicFile? Still, I would be so pleased when this were virtualized (CacheFileHandle : public SimpleFileHandle), but that is too much work now.. @@ +1991,5 @@ > + > + if (aHandle->IsDoomed()) > + return NS_ERROR_NOT_AVAILABLE; > + > + // Doom old handle if it exists and is not doomed Please add a comment what the loop under this comment does exactly. @@ +2367,5 @@ > rv = aHandle->mFile->OpenNSPRFileDesc(PR_RDWR, 0600, &aHandle->mFD); > if (NS_ERROR_FILE_NOT_FOUND == rv) { > LOG((" file doesn't exists")); > aHandle->mFileExists = false; > + return DoomFileInternal(aHandle); why this change? ::: netwerk/cache2/CacheFileIOManager.h @@ +38,5 @@ > NS_DECL_THREADSAFE_ISUPPORTS > > CacheFileHandle(const SHA1Sum::Hash *aHash, bool aPriority); > + CacheFileHandle(const nsACString &aKey, bool aPriority); > + void Log(); #ifdef'ed? @@ +47,4 @@ > bool IsPriority() { return mPriority; } > bool FileExists() { return mFileExists; } > bool IsClosed() { return mClosed; } > + bool IsSpecialFile() { return !mHash; } please make this (and probably all the getter when applicable) const also, I think we can make pointer mHash const, so that it emphasizes the state "is special" cannot change during the lifetime. ::: netwerk/cache2/CacheFileMetadata.cpp @@ +126,5 @@ > +{ > + LOG(("CacheFileMetadata::CacheFileMetadata() [this=%p]", this)); > + > + MOZ_COUNT_CTOR(CacheFileMetadata); > + memset(&mMetaHdr, 0, sizeof(CacheFileMetadataHeader)); sizeof(mMetaHdr) rather? @@ +314,5 @@ > + MOZ_ASSERT(!mWriteBuf); > + MOZ_ASSERT(mKey.IsEmpty()); > + > + nsresult rv; > + Please assert here you are on the IO thread. @@ +915,5 @@ > +CacheFileMetadata::ParseKey(const nsACString &aKey) > +{ > + if (aKey.Length() < 4) { > + return NS_ERROR_FAILURE; > + } Would be fun to check the first letter is really '-' :) @@ +929,5 @@ > + } > + > + if (aKey[2] != ':') { > + return NS_ERROR_FAILURE; > + } Don't know why I've added this colon there :(( Now we have to bear it... @@ +950,5 @@ > + > + if (appIdEndIdx == 3) { > + mAppId = nsILoadContextInfo::NO_APP_ID; > + } > + else { I heard '} else {' on one line is preferred @@ +956,5 @@ > + nsresult rv; > + int64_t appId64 = appIdStr.ToInteger64(&rv); > + if (NS_FAILED(rv) || appId64 > PR_UINT32_MAX) > + return NS_ERROR_FAILURE; > + mAppId = appId64; appid's are 'unsigned long', so the > PR_UINT32_MAX should do the 32-bit parser, no? Otherwise please cast explicitly 64->32 at this assignment ::: netwerk/cache2/CacheHashUtils.cpp @@ +100,5 @@ > +{ > + switch (mPos) { > + case 0: > + mA += aVal; > + mPos ++; ++mPos (more effective) or mPos = 1; Up to you. @@ +117,5 @@ > + } > + else { > + mC += aVal << 8; > + } > + } I'll definitely need some instant explanation to this :) ::: netwerk/cache2/CacheIOThread.h @@ +37,5 @@ > OPEN_TRUNCATE, > WRITE, > CLOSE, > EVICT, > + BUILD_OR_UPDATE_INDEX, I would not be against to keep this name as simple as just INDEX. ::: netwerk/cache2/CacheIndex.cpp @@ +99,5 @@ > +} > + > + > +nsresult > +CacheIndex::Init(nsIFile *aCacheDirectory) if this is static then please add // static comment over it. this applies to every static method definition @@ +249,5 @@ > + index->ChangeState(SHUTDOWN); > + > + if (oldState != READY) { > + LOG(("CacheIndex::Shutdown() - Unexpected state. Did posting of " > + "PreShutdownInternal() fail?")); should be said this is dependent of the shutdown code in IOMan that calls this on the main thread after the IOThread is joined, i.e. that PreShutdownInternal is synchronized. @@ +262,5 @@ > + if (NS_FAILED(index->WriteLogToDisk())) > + index->RemoveIndexFromDisk(); > + } > + else > + index->RemoveIndexFromDisk(); } else { ... } ::: netwerk/cache2/CacheIndex.h @@ +682,5 @@ > + bool mDoNotSearchInIndex; > + bool mDoNotSearchInUpdates; > +}; > + > +class CacheIndexAutoLock { should be a guard object, but this just internal class on the other hand { on a new line

Attachment #8355597 - Flags: review?(honzab.moz) → review-

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Attachment #8355597 - Attachment is obsolete: true

Honza Bambas (:mayhemer)

Reporter

Comment 23

•

11 years ago

Comment on attachment 8355599 [details] [diff] [review] single WIP patch v1->v2 interdiff I don't think I need this to do the review anymore.

Attachment #8355599 - Attachment is obsolete: true

Honza Bambas (:mayhemer)

Reporter

Comment 24

•

11 years ago

And one more thing we might want to do: the CacheIndex.h and .cpp files are quite large, containing impls for some 3 classes. That should be split IMO, unless there is a strong reason not to.

Michal Novotny [:michal]

Assignee

Comment 25

•

11 years ago

Attached patch WIP v4 - added comments explaining how CacheIndex works (obsolete) — Details — Splinter Review

Attachment #8365423 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Comment 26

•

11 years ago

Comment on attachment 8363808 [details] [diff] [review] merged to the current gum WIP v3 [by Michal] There is a newer version, draft comments copied.

Attachment #8363808 - Attachment is obsolete: true

Attachment #8363808 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Reporter

Updated

•

11 years ago

Depends on: 965899

Honza Bambas (:mayhemer)

Reporter

Comment 27

•

11 years ago

Comment on attachment 8365423 [details] [diff] [review] WIP v4 - added comments explaining how CacheIndex works Review of attachment 8365423 [details] [diff] [review]: ----------------------------------------------------------------- comment 22 + this r- for IO under lock and lack of comments Main concerns: - critical r comments marked with (!) - still (big) lack of comments, please comment on all the state flags like mIndexNeedsUpdate et al with rich comments, why assertions in the code are put as are, write about the 3 files you use, what's their lifetime and format, how they are renamed, deleted etc, comment on all properties and flags of entries, how they change etc... - buffering work, cleanup in a followup bug, I think should be done before we let this among people if not found too complicated (bug 966024 tries to introduce some nice APIs) IO under the lock must be fixed, we could also have a followup for it and land this patch just with more comments (those are really needed, the code is pretty complicated to read because all logic (reading/writing/building/updating/logging) is all in one giant class with a tun of state variables very hard to track. Split would be nice, but it would delay the work :( I'm also missing some tests, I understand that cpp test might be needed. Followup? ::: netwerk/cache2/CacheFileIOManager.cpp @@ +2157,5 @@ > + return NS_ERROR_UNEXPECTED; > + > + nsRefPtr<InitIndexEntryEvent> ev = > + new InitIndexEntryEvent(aHandle, aAppId, aAnonymous, aInBrowser); > + rv = ioMan->mIOThread->Dispatch(ev, CacheIOThread::WRITE); for curiosity, why WRITE level? I think simple dispatch might be better. writes can be delayed and when someone else then the caller to this method wants to use the index entry info (like if it exists) it might stupidly lay in the thread queue. ::: netwerk/cache2/CacheHashUtils.h @@ +50,5 @@ > + uint8_t mPos; > + uint32_t mBuf; > + uint8_t mBufPos; > + uint32_t mLength; > + bool mFinalized; See bug 966114 ::: netwerk/cache2/CacheIndex.cpp @@ +37,5 @@ > +/** > + * This helper class is responsible for keeping CacheIndex::mIndexStats, > + * CacheIndex::mFrecencyArray and CacheIndex::mExpirationArray up to date. > + */ > +class CacheIndexEntryAutoManage as I understand, this needs to exist when you modify something on the entry (a state, flags). could CacheIndexEntry be just const and only this class could provide API to modify an entry to make it more safe that you accidentally don't forget to wrap it? @@ +72,5 @@ > + mIndex->InsertRecordToExpirationArray(entry->mRec); > + } > + else if (!entry && mOldRecord) { > + mIndex->RemoveRecordFromFrecencyArray(mOldRecord); > + mIndex->RemoveRecordFromExpirationArray(mOldRecord); (!) record should remove itself from these arrays in its dtor for safety reasons. you may have CacheIndexRecordStruct that will only keep the properties, used for buffers and arrays and then CacheIndexRecord that will derive from it and have the dtor added. @@ +78,5 @@ > + else if (entry && mOldRecord) { > + bool replaceFrecency = false; > + bool replaceExpiration = false; > + > + if (entry->mRec != mOldRecord) { (!) since mOldRecord can be an invalid pointer here [1] you really cannot do this compare safely. but when looking at the code, this is just an optimization. so it should either be removed (maybe the mOldRecord at all) or you should add a good comment why it's ok to compare to address of a bad ptr - still, I don't think it's a good idea. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=958311#attach_8363051 @@ +265,5 @@ > +CacheIndex::Unlock() > +{ > + MOZ_ASSERT(!mIndexStats.StateLogged()); > + > + mLock.Unlock(); maybe assertownslock before here @@ +350,5 @@ > + > + nsCOMPtr<nsIEventTarget> ioTarget = CacheFileIOManager::IOTarget(); > + MOZ_ASSERT(ioTarget); > + > + rv = ioTarget->Dispatch(event, nsIEventTarget::DISPATCH_NORMAL); maybe it's correct, but are you aware this will not land on the INDEX level of yours but will be executed sooner? maybe add a comment this is expected. @@ +369,5 @@ > + CacheIndexAutoLock lock(this); > + > + LOG(("CacheIndex::PreShutdownInternal() - [state=%d, indexOnDiskIsValid=%d, " > + "dontMarkIndexClean=%d]", mState, mIndexOnDiskIsValid, > + mDontMarkIndexClean)); double logging? @@ +427,5 @@ > + > + if (oldState != READY) { > + LOG(("CacheIndex::Shutdown() - Unexpected state. Did posting of " > + "PreShutdownInternal() fail?")); > + } (!) one way you log this as something unexpected, but bellow you handle it, I'm confused @@ +458,5 @@ > + return NS_OK; > +} > + > +nsresult > +CacheIndex::AddEntry(const SHA1Sum::Hash *aHash) if this is static, then add a // static comment @@ +524,5 @@ > + if (!entry) { > + entry = index->mIndex.PutEntry(*aHash); > + } > + } > + else { // WRITING, READING maybe use switch @@ +1024,5 @@ > + return NS_OK; > +} > + > +bool > +CacheIndex::CheckCollision(CacheIndexEntry *aEntry, according the result, should be rather "HasCollision" ? @@ +1049,5 @@ > +bool > +CacheIndex::EntryChanged(CacheIndexEntry *aEntry, > + const uint32_t *aFrecency, > + const uint32_t *aExpirationTime, > + const uint32_t *aSize) IsEntryChanged ? @@ +1063,5 @@ > + return false; > +} > + > +void > +CacheIndex::ProcessPendingOperations() MergePendingUpdated? @@ +1077,5 @@ > + EnsureCorrectStats(); > +} > + > +PLDHashOperator > +CacheIndex::UpdateEntryInIndex(CacheIndexEntry *aEntry, void* aClosure) As well here.. MergePendingEntryToIndexEnum ? @@ +1098,5 @@ > + if (entry->IsRemoved()) { > + MOZ_ASSERT(entry->IsFresh()); > + MOZ_ASSERT(entry->IsDirty()); > + } > + else if (!entry->IsRemoved() && !entry->IsDirty() && entry->IsEmpty()) { !entry->IsRemoved() is alwasy true here @@ +1214,5 @@ > + LOG(("CacheIndex::WriteRecords()")); > + > + nsresult rv; > + > + AssertOwnsLock(); (!) assert also you are in the WRITING state @@ +1255,5 @@ > + // We've processed all records > + if (mRWBufPos + sizeof(CacheHash::Hash32_t) > mRWBufSize) { > + // realloc buffer to spare another write cycle > + mRWBufSize = mRWBufPos + sizeof(CacheHash::Hash32_t); > + mRWBuf = static_cast<char *>(moz_xrealloc(mRWBuf, mRWBufSize)); use moz_realloc and check result, if null, fail the write. however, I think we should handle this via bug 966024 integration. @@ +1383,5 @@ > + NS_ENSURE_SUCCESS(rv, rv); > + > + bool exists; > + rv = file->Exists(&exists); > + NS_ENSURE_SUCCESS(rv, rv); (!) MOZ_ASSERT(!NS_IsMainThread()) please... @@ +1516,5 @@ > + nsresult rv; > + > + MOZ_ASSERT(mPendingUpdates.Count() == 0); > + > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); why is the temp file removed here? I'd expect this is never deleted, only the 'real' index ever is, and this is renamed to the real index. I'm confused @@ +1530,5 @@ > + mIndexStats.Log(); > + > + PRFileDesc *fd = nullptr; > + rv = logFile->OpenNSPRFileDesc(PR_RDWR | PR_CREATE_FILE | PR_TRUNCATE, > + 0600, &fd); (!) assert not on main thread @@ +1652,5 @@ > + > + int64_t entriesSize = mHandle->FileSize() - sizeof(CacheIndexHeader) - > + sizeof(CacheHash::Hash32_t); > + > + if (entriesSize < 0 || entriesSize % sizeof(CacheIndexRecord)) { maybe entriesSize <= 0 ? for case the file is zero length @@ +1687,5 @@ > + uint32_t pos = 0; > + > + if (!mSkipEntries) { > + CacheIndexHeader *hdr = reinterpret_cast<CacheIndexHeader *>( > + moz_xmalloc(sizeof(CacheIndexHeader))); Another place for the buffer probably (or some autofree class, can also be part of 966024). @@ +1701,5 @@ > + > + if (NetworkEndian::readUint32(&hdr->mIsDirty)) { > + if (mHandle2) { > + CacheFileIOManager::DoomFile(mHandle2, nullptr); > + mHandle2 = nullptr; why? @@ +1710,5 @@ > + NetworkEndian::writeUint32(&hdr->mIsDirty, 1); > + > + // Mark index dirty > + rv = CacheFileIOManager::Write(mHandle, 0, reinterpret_cast<char *>(hdr), > + sizeof(CacheIndexHeader), true, nullptr); comment the buffer is freed later by this method @@ +1750,5 @@ > + > + mHash->Update(mRWBuf + hashOffset, pos - hashOffset); > + > + if (pos != mRWBufPos) { > + memmove(mRWBuf, mRWBuf + pos, mRWBufPos - pos); hmm.. I'd rather prevent getting here in case mRWBufPos < pos @@ +1774,5 @@ > + > + if (mHandle2) > + StartReadingJournal(); > + else > + FinishRead(false); (!) should be (true) ? @@ +1807,5 @@ > + > + MOZ_ASSERT(mHandle2); > + MOZ_ASSERT(mIndexOnDiskIsValid); > + MOZ_ASSERT(mTmpJournal.Count() == 0); > + MOZ_ASSERT(mHandle2->FileSize() >= 0); some comments why these asserts are assumed would be nice (apply generally everywhere where not obvious) @@ +1811,5 @@ > + MOZ_ASSERT(mHandle2->FileSize() >= 0); > + > + int64_t entriesSize = mHandle2->FileSize() - sizeof(CacheHash::Hash32_t); > + > + if (entriesSize < 0 || entriesSize % sizeof(CacheIndexRecord)) { <= 0 ? @@ +1820,5 @@ > + > + mSkipEntries = 0; > + mHash = new CacheHash(); > + > + mRWBufPos = std::min(mRWBufSize, static_cast<uint32_t>(mHandle2->FileSize())); is it ensured the buffer is allocated? is mRWBufPos variable the best to store this value? the Pos suffix indices a read or write position, but here it stands for something else... really buffer encapsulation will make this so much more readable... @@ +1948,5 @@ > + return PL_DHASH_REMOVE; > +} > + > +void > +CacheIndex::EnsureNoFreshEntry() I'd personally rename this to Assert...() @@ +1997,5 @@ > + (aSucceeded && mIndexOnDiskIsValid && mJournalReadSuccessfully)); > + > + if (mState == SHUTDOWN) { > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); > + RemoveFile(NS_LITERAL_CSTRING(kJournalName)); (!) assert not on main thread @@ +2021,5 @@ > + ProcessPendingOperations(); > + // Remove all entries that we haven't seen during this session > + mIndex.EnumerateEntries(&CacheIndex::RemoveNonFreshEntries, this); > + StartBuildingIndex(); > + return; please why and what is here actually happening can mIndexOnDiskIsValid == false when aSucceeded == true? as I understand, reading is the first operation we do with the index, only way mIndexOnDiskIsValid can become true is that we successfully wrote the index to disk, but that probably didn't have a chance to happen @@ +2075,5 @@ > + LOG(("CacheIndex::DelayedBuildUpdate() - Can't dispatch event" )); > + if (index->mState == BUILDING) > + index->FinishBuild(false); > + else > + index->FinishUpdate(false); switch? @@ +2129,5 @@ > + NS_WARNING("CacheIndex::SetupDirectoryEnumerator() - Entries directory " > + "doesn't exist!"); > + LOG(("CacheIndex::SetupDirectoryEnumerator() - Entries directory doesn't " > + "exist!" )); > + return NS_ERROR_UNEXPECTED; (!) should this really NS_WARNING? what if ran right after startup, it can be OK. @@ +2143,5 @@ > + return NS_OK; > +} > + > +void > +CacheIndex::BuildUpdateInitEntry(CacheIndexEntry *aEntry, InitEntryFromDiskFileMetadata? @@ +2164,5 @@ > + aMetaData->GetFrecency(&frecency); > + aEntry->SetFrecency(frecency); > + > + int64_t size64 = 0; > + rv = aFile->GetFileSize(&size64); (!) called under the lock, doesn't produce IO? according some mxr research it probably does.. @@ +2177,5 @@ > + > + size64 >>= 10; > + > + uint32_t size; > + if (size64 >> 32) { seems redundant, could you check against the ~mask here on the 64bit value? also simple std::min would do IMO @@ +2244,5 @@ > + > + nsresult rv; > + > + if (!mDirEnumerator) { > + rv = SetupDirectoryEnumerator(); (!) can produce IO? this is called under the lock... @@ +2266,5 @@ > + } > + } > + > + nsCOMPtr<nsIFile> file; > + rv = mDirEnumerator->GetNextFile(getter_AddRefs(file)); (!) isn't necessary (and better) to call HasMore (or so) ? also, IO under lock. @@ +2325,5 @@ > + CacheIndexAutoUnlock unlock(this); > + rv = meta->SyncReadMetadata(file); > + } > + > + // Nobody should add the entry while the lock was released this is like "dare anyone do something with the index!" or is this really somehow ensured? by state I presume? if so, how is the state ensured not to change? richer comment might answer all of this. @@ +2370,5 @@ > + LOG(("CacheIndex::FinishBuild() - posting of PreShutdownInternal failed? " > + "Cannot safely release mDirEnumerator, leaking it!")); > + // This can happen only in case dispatching event to IO thread failed in > + // CacheIndex::PreShutdown(). > + mDirEnumerator.forget(); // Leak it! NS_WARNING to show this in try logs @@ +2475,5 @@ > + if (start.IsNull()) { > + start = TimeStamp::Now(); > + } else { > + TimeDuration elapsed = TimeStamp::Now() - start; > + if (elapsed.ToMilliseconds() >= kUpdateIndexLoopLimit) { more optimal way is: static TimeDuration const kLimit = TimeDuration::FromMilliseconds(kUpdateIndexLoopLimit); if (elapsed >= kLimit) { ... } ToMilliseconds is expensive, also I think you can freely use NowLoRes here, win won't switch the threads probably anyway sooner then 15.6xN millis. @@ +2521,5 @@ > +#ifdef DEBUG > + nsRefPtr<CacheFileHandle> handle; > + CacheFileIOManager::gInstance->mHandles.GetHandle(&hash, false, > + getter_AddRefs(handle)); > +#endif a ton of duplicate code, (!) same IO under lock issues @@ +2546,5 @@ > + } > + else { > + if (mIndexTimeStamp > (lastModifiedTime / PR_MSEC_PER_SEC)) { > + LOG(("CacheIndex::UpdateIndex() - Skipping file because of last " > + "modified time. [name=%s, indexTimeStamp=%u, " because of last modified time is older then time of the index" @@ +2612,5 @@ > + > + if (mDirEnumerator) { > + if (NS_IsMainThread()) { > + LOG(("CacheIndex::FinishUpdate() - posting of PreShutdownInternal failed?" > + " Cannot safely release mDirEnumerator, leaking it!")); (!) turn this to NS_WARNING so that it appears in try logs by default and in console @@ +2631,5 @@ > + if (mState == UPDATING) { > + if (aSucceeded) { > + // Delete all non-fresh entries only if we iterated all entries > + // successfully. > + mIndex.EnumerateEntries(&CacheIndex::RemoveNonFreshEntries, this); a bit richer comment would be nice @@ +2660,5 @@ > + > + return PL_DHASH_REMOVE; > +} > + > +#ifdef MOZ_LOGGING (!) - disable-logging builds may fail because of this PR_LOG rather? @@ +2685,5 @@ > +{ > +#ifdef MOZ_LOGGING > + LOG(("CacheIndex::ChangeState() changing state %s -> %s", StateString(mState), > + StateString(aNewState))); > +#endif why under MOZ_LOGGING? If StateString were under PR_LOG you didn't need this define wrap. @@ +2692,5 @@ > + MOZ_ASSERT(mPendingUpdates.Count() == 0); > + > + // PreShutdownInternal() should change the state to READY from every state. It > + // may go through different states, but once we are in READY state the only > + // possible transition is to SHUTDOWN state. really? and what if you find out some inconsistency between the index and files under entries/ ? isn't there then indexneedsupdate or so? that can go from ready to updating. at least we go to writing. please check this comment is correct @@ +2696,5 @@ > + // possible transition is to SHUTDOWN state. > + MOZ_ASSERT(!mShuttingDown || mState != READY || aNewState == SHUTDOWN); > + > + // Start updating process when switching to READY state if needed > + if (aNewState == READY && StartUpdatingIndexIfNeeded(true)) why not to switch to the ready state first and then call Start..()? you wouldn't need the argument then. @@ +2713,5 @@ > + mRWBufSize = kMaxBufSize; > + } > + else { > + mRWBufSize = kMaxBufSize; > + } switch would be better + (!) default: assert(false) @@ +2714,5 @@ > + } > + else { > + mRWBufSize = kMaxBufSize; > + } > + mRWBuf = static_cast<char *>(moz_xmalloc(mRWBufSize)); this is definitely not a critical allocation, use fallible alloc and check non-null. If it fails, just don't write/read the index (fail the write/read). 966024 @@ +2842,5 @@ > + return NS_OK; > + > + switch (mState) { > + case WRITING: > + WriteIndexHeader(aHandle, aResult); this also starts writing entries, can we give it a different name? @@ +2858,5 @@ > + return NS_OK; > + } > + > + switch (mReadOpenCount) { > + case 2: // kIndexName (!) - semi critical... are we sure the order will always be preserved? this seems fragile to me I think comparing aHandle->Key() and branch based on that is safer @@ +2891,5 @@ > + if (mHandle2) { > + // Rename journal to make sure we update index on next start in case > + // firefox crashes > + rv = CacheFileIOManager::RenameFile( > + mHandle2, NS_LITERAL_CSTRING(kTempIndexName), this); this operation is a bit unclear to me... why is it needed to do this rename? @@ +3022,5 @@ > + return NS_OK; > + > + switch (mState) { > + case WRITING: > + FinishWrite(NS_FAILED(aResult) ? false : true); I think NS_SUCCEEDED() alone is sufficient ::: netwerk/cache2/CacheIndex.h @@ +34,5 @@ > +typedef struct { > + uint32_t mVersion; > + uint32_t mTimeStamp; // for quick entry file validation > + uint32_t mIsDirty; > +} CacheIndexHeader; comments please, and detailed ones @@ +84,5 @@ > + { > + MOZ_COUNT_DTOR(CacheIndexEntry); > + LOG(("CacheIndexEntry::~CacheIndexEntry() - Deleting record [rec=%p]", > + mRec)); > + delete mRec; use nsAutoPtr @@ +127,5 @@ > + { > + mRec->mFrecency = 0; > + mRec->mExpirationTime = nsICacheEntry::NO_EXPIRATION_TIME; > + mRec->mAppId = nsILoadContextInfo::NO_APP_ID; > + mRec->mFlags = 0; should the hash be inited somehow too? @@ +138,5 @@ > + mRec->mFlags |= eAnonymousMask; > + if (aInBrowser) > + mRec->mFlags |= eInBrowserMask; > + > + mRec->mFlags |= eInitializedMask; (!) - stability concern any risk mFlags might be non-0 when entering this method? I'd rather see the code as mFlags = eInitializedMask, if (aAnon) mFlags |= aAnonMask; ... Also add assert mflags == 0 @@ +143,5 @@ > + } > + > + const SHA1Sum::Hash * Hash() { return &mRec->mHash; } > + > + bool IsInitialized() { return !!(mRec->mFlags & eInitializedMask); } I think you don't need the !! casting here @@ +166,5 @@ > + void SetExpirationTime(uint32_t aExpirationTime) > + { > + mRec->mExpirationTime = aExpirationTime; > + } > + uint32_t GetExpirationTime() { return mRec->mExpirationTime; } all gethers that can be made please make const. in other words - correct constness of this class @@ +172,5 @@ > + void SetFileSize(uint32_t aFileSize) > + { > + MOZ_ASSERT((aFileSize & ~eFileSizeMask) == 0); > + mRec->mFlags &= ~eFileSizeMask; > + mRec->mFlags |= aFileSize; (!) - semi critical this could be exploitable by servers playing with content-length to alter the other flags, min(eFileSizeMask, aFileSize) ? @@ +175,5 @@ > + mRec->mFlags &= ~eFileSizeMask; > + mRec->mFlags |= aFileSize; > + } > + uint32_t GetFileSize() { return mRec->mFlags & eFileSizeMask; } > + bool IsEmpty() { return GetFileSize() == 0; } Should be IsZeroFileSize or so, IsEmpty doesn't say anything. @@ +184,5 @@ > + memcpy(aBuf, mRec, sizeof(CacheIndexRecord)); > + // clear dirty flag > + dst->mFlags &= ~eDirtyMask; > + // clear fresh flag > + dst->mFlags &= ~eFreshMask; it should be explained why the flags are not dropped from the source. better comments needed. @@ +191,5 @@ > + NetworkEndian::writeUint32(&dst->mFrecency, dst->mFrecency); > + NetworkEndian::writeUint32(&dst->mExpirationTime, dst->mExpirationTime); > + NetworkEndian::writeUint32(&dst->mAppId, dst->mAppId); > + NetworkEndian::writeUint32(&dst->mFlags, dst->mFlags); > +#endif just comment that when !IS_LITTLE_ENDIAN we transfer the values via the memcpy above @@ +225,5 @@ > + eAnonymousMask = 0x40000000, > + eInBrowserMask = 0x20000000, > + eRemovedMask = 0x10000000, > + eDirtyMask = 0x08000000, > + eFreshMask = 0x04000000, suggestion: IsFaithful or IsConfidential @@ +227,5 @@ > + eRemovedMask = 0x10000000, > + eDirtyMask = 0x08000000, > + eFreshMask = 0x04000000, > + eReservedMask = 0x03000000, > + eFileSizeMask = 0x00FFFFFF comments please. These should be uint32_t const kXXX, ~ on enum values could screw the result, or use typed enum. But I think to use enum for defining bits is odd. @@ +273,5 @@ > + } > + > + bool StateLogged() { > + return mStateLogged; > + } This getter should be only in DEBUG @@ +311,5 @@ > + } > + > + MOZ_ASSERT(!mStateLogged, "CacheIndexStats::BeforeChange() - state " > + "logged!"); > + mStateLogged = !mStateLogged; hmm.. so when somebody screw up, the result here will be just random flag. up to you what real value this flag has for you, whether to set it directly to |true| here, or leave the dangerous = ! assignment. @@ +314,5 @@ > + "logged!"); > + mStateLogged = !mStateLogged; > + if (aEntry) { > + MOZ_ASSERT(mCount); > + mCount--; generally putting -- or ++ before the var is more efficient @@ +328,5 @@ > + MOZ_ASSERT(mRemoved); > + mRemoved--; > + } > + else { > + if (!aEntry->IsInitialized()) { I was told to prefer.. } else if () { ..on a one line, I have a lot of places to fix this my self... some general cleanup might be good, up to you to fix it whenever you want, but we should start doing it (this is a new file) also, early returns may work here as well to make the code simpler to read and follow as well as to make it more optimal @@ +339,5 @@ > + mEmpty--; > + } > + else { > + MOZ_ASSERT(mSize); > + mSize -= aEntry->GetFileSize(); we won't get here when entry is eRemoved, is that ok? how do you ensure that an entry cannot be altered outside before/afterchange? in the automanage class? @@ +359,5 @@ > + else { > + if (!aEntry->IsInitialized()) mNotInitialized++; > + else { > + if (aEntry->IsEmpty()) mEmpty++; > + else mSize += aEntry->GetFileSize(); but here it's better like: if () do; else do2; all statements should have a new line actually @@ +373,5 @@ > + } > + } > + > +private: > + bool mStateLogged; as I understand, this is false all the time. only when an entry is being changed which is between BeforeChange and AfterChange, this state is set to true. During that time some getters (that should be made const if possible) must not be called. Can you please add a comment for this member? This also has to be DEBUG only. @@ +374,5 @@ > + } > + > +private: > + bool mStateLogged; > + bool mDisableLogging; I'd somehow try to do this PR_LOG only built @@ +407,5 @@ > + // Initialize the entry. It MUST be present in index. > + static nsresult InitEntry(const SHA1Sum::Hash *aHash, > + uint32_t aAppId, > + bool aAnonymous, > + bool aInBrowser); Please explain very well and in detail why we have beside Add also Init and EnsureExists methods. What is the state expected before and after the method is called of both the entry and the index. @@ +427,5 @@ > + }; > + > + static nsresult HasEntry(const nsACString &aKey, EntryStatus *_retval); > + > + NS_IMETHOD OnFileOpened(CacheFileHandle *aHandle, nsresult aResult); should be private: @@ +444,5 @@ > + virtual ~CacheIndex(); > + > + void Lock(); > + void Unlock(); > + void AssertOwnsLock(); use inline, that may optimize the code a bit @@ +447,5 @@ > + void Unlock(); > + void AssertOwnsLock(); > + > + nsresult InitInternal(nsIFile *aCacheDirectory); > + void PreShutdownInternal(); what PreShutdownInternal does? @@ +449,5 @@ > + > + nsresult InitInternal(nsIFile *aCacheDirectory); > + void PreShutdownInternal(); > + > + // This method fails when index is not initialized or is shut down. and when is it "shut down" ? @@ +450,5 @@ > + nsresult InitInternal(nsIFile *aCacheDirectory); > + void PreShutdownInternal(); > + > + // This method fails when index is not initialized or is shut down. > + nsresult EnsureIndexUsable(); Ensure sounds like it does something. but this just checks. please rename the method to say "CheckIndexUsable". return bool would be OK too, up to you. @@ +523,5 @@ > + // to make sure mIndexStats contains correct information. > + void EnsureCorrectStats(); > + static PLDHashOperator SumIndexStats(CacheIndexEntry *aEntry, void* aClosure); > + // Finalizes reading process. > + void FinishRead(bool aSucceeded); IMO better would be OnReadFinished(bool); but up to you. @@ +532,5 @@ > + // Following methods perform updating and building of the index. > + // Timer callback that starts update or build process. > + static void DelayedBuildUpdate(nsITimer *aTimer, void *aClosure); > + // Posts timer event that start update or build process. > + nsresult PostBuildUpdateTimer(uint32_t aDelay); because of frequent use of Pre*/Post* in method names by means of timing, maybe rename this to ScheduleBuild..() @@ +633,5 @@ > + EState mState; > + // Timestamp of time when the index was initialized. We use it to delay > + // initial update or build of index. > + TimeStamp mStartTime; > + bool mShuttingDown; This one is the most interesting and no comment.. @@ +654,5 @@ > + // Timer of delayed update/build. > + nsCOMPtr<nsITimer> mTimer; > + > + // Helper members used when reading/writing index from/to disk. > + // Contains number of entries that should be skipped: and why skipped? @@ +657,5 @@ > + // Helper members used when reading/writing index from/to disk. > + // Contains number of entries that should be skipped: > + // - in hashtable when writing index > + // - in index file when reading index > + uint32_t mSkipEntries; The name mSkipEntries is confusing, mAlreadyProcessedEntries? @@ +661,5 @@ > + uint32_t mSkipEntries; > + // Number of entries that should be written to disk. This is number of entries > + // in hashtable that are initialized and are not marked as removed when writing > + // begins. > + uint32_t mProcessEntries; the var should have a different name according the comment, like mEntriesToWrite or so. @@ +665,5 @@ > + uint32_t mProcessEntries; > + char *mRWBuf; > + uint32_t mRWBufSize; > + uint32_t mRWBufPos; > + nsRefPtr<CacheHash> mHash; mRWHash or so? This is too general for something that specific as writing. Causes confusion. For all these members it would be good wrap then to some nested class like WriteState or so (actually namespace them). It would then be less confusing what these are. @@ +677,5 @@ > + // Reading of journal succeeded if true. > + bool mJournalReadSuccessfully; > + > + // Handle used for writing and reading index file. > + nsRefPtr<CacheFileHandle> mHandle; (!) for safety I think it would be good to have separate handles for read and for write. @@ +679,5 @@ > + > + // Handle used for writing and reading index file. > + nsRefPtr<CacheFileHandle> mHandle; > + // Handle used for reading journal file. > + nsRefPtr<CacheFileHandle> mHandle2; mJournalHandle please @@ +696,5 @@ > + CacheIndexStats mIndexStats; > + > + // When reading journal, we must first parse the whole file and apply the > + // changes iff the journal was read successfully. mTmpJournal is used to store > + // entries during parsing. can you please be a bit more detailed here? during parsing of what? how do we handle the entries and when? also some general comment on what the journal,tmp,index files are would be good

Attachment #8365423 - Flags: review?(honzab.moz) → review-

Michal Novotny [:michal]

Assignee

Updated

•

11 years ago

Blocks: 968101

Michal Novotny [:michal]

Assignee

Comment 28

•

11 years ago

Attached patch v4 -> v5 diff (obsolete) — Details — Splinter Review

I've ignored some less important comments for now. The patch contains fix for the IO under lock when updating/building index, lot of other fixes and new comments. This is just a diff against the old patch. I'll attach a new version of the patch once it successfully passes try server and my local testing. (In reply to Honza Bambas (:mayhemer) from comment #27) > > + rv = ioMan->mIOThread->Dispatch(ev, CacheIOThread::WRITE); > > for curiosity, why WRITE level? I think simple dispatch might be better. > writes can be delayed and when someone else then the caller to this method > wants to use the index entry info (like if it exists) it might stupidly lay > in the thread queue. I don't see a reason why writing of index should have a higher priority than opening and entry etc. The index is usable while we're writing it to disk... > ::: netwerk/cache2/CacheHashUtils.h > @@ +50,5 @@ > > + uint8_t mPos; > > + uint32_t mBuf; > > + uint8_t mBufPos; > > + uint32_t mLength; > > + bool mFinalized; > > See bug 966114 This hash function was chosen because it is pretty strong while being still fast (see bug #290032). Are there any tests how good mfbt/HashFunction is? > @@ +265,5 @@ > > +CacheIndex::Unlock() > > +{ > > + MOZ_ASSERT(!mIndexStats.StateLogged()); > > + > > + mLock.Unlock(); > > maybe assertownslock before here AFAICS this assert is already included in the underlying code. > @@ +350,5 @@ > > + > > + nsCOMPtr<nsIEventTarget> ioTarget = CacheFileIOManager::IOTarget(); > > + MOZ_ASSERT(ioTarget); > > + > > + rv = ioTarget->Dispatch(event, nsIEventTarget::DISPATCH_NORMAL); > > maybe it's correct, but are you aware this will not land on the INDEX level > of yours but will be executed sooner? maybe add a comment this is expected. That's OK, comment added. > @@ +427,5 @@ > > + > > + if (oldState != READY) { > > + LOG(("CacheIndex::Shutdown() - Unexpected state. Did posting of " > > + "PreShutdownInternal() fail?")); > > + } > > (!) > > one way you log this as something unexpected, but bellow you handle it, I'm > confused IMO this is unexpected and is worth of logging it. But the code is written so that it can still handle situation when PreShutdownInternal is not executed. The only problem is that we cannot release the directory enumerator, since it is not thread safe and was created on IO thread. > @@ +1383,5 @@ > > + NS_ENSURE_SUCCESS(rv, rv); > > + > > + bool exists; > > + rv = file->Exists(&exists); > > + NS_ENSURE_SUCCESS(rv, rv); > > (!) > > MOZ_ASSERT(!NS_IsMainThread()) please... No, it can be called from WriteLogToDisk() which is executed on the main thread. See below. > @@ +1516,5 @@ > > + nsresult rv; > > + > > + MOZ_ASSERT(mPendingUpdates.Count() == 0); > > + > > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); > > why is the temp file removed here? I'd expect this is never deleted, only > the 'real' index ever is, and this is renamed to the real index. I'm > confused Because the only clean state at startup (when we start neither update nor build process) is when tmpfile doesn't exist and index and log file exist and are successfully parsed. Existing tmpfile means that some operation didn't finish. > @@ +1530,5 @@ > > + mIndexStats.Log(); > > + > > + PRFileDesc *fd = nullptr; > > + rv = logFile->OpenNSPRFileDesc(PR_RDWR | PR_CREATE_FILE | PR_TRUNCATE, > > + 0600, &fd); > > (!) > > assert not on main thread This HAPPENS on the main thread. We write journal during shutdown because the index is used until its shutdown and we cannot easily start writing journal earlier. It is small enough and no fsync is used. Btw, this has been discussed and found to be OK. > @@ +1701,5 @@ > > + > > + if (NetworkEndian::readUint32(&hdr->mIsDirty)) { > > + if (mHandle2) { > > + CacheFileIOManager::DoomFile(mHandle2, nullptr); > > + mHandle2 = nullptr; > > why? Because there is no reason to trust the data in the journal, when the index is dirty. We'll simply ignore the journal and start the update. > @@ +1774,5 @@ > > + > > + if (mHandle2) > > + StartReadingJournal(); > > + else > > + FinishRead(false); > > (!) > > should be (true) ? No, true means that we've read the whole index (including journal) successfully, which isn't apparently this case. > @@ +1811,5 @@ > > + MOZ_ASSERT(mHandle2->FileSize() >= 0); > > + > > + int64_t entriesSize = mHandle2->FileSize() - sizeof(CacheHash::Hash32_t); > > + > > + if (entriesSize < 0 || entriesSize % sizeof(CacheIndexRecord)) { > > <= 0 ? Journal containing only the hash and no entries is a valid journal. > @@ +1820,5 @@ > > + > > + mSkipEntries = 0; > > + mHash = new CacheHash(); > > + > > + mRWBufPos = std::min(mRWBufSize, static_cast<uint32_t>(mHandle2->FileSize())); > > is it ensured the buffer is allocated? Yes, the buffer is allocated in StartReadingIndex() and this always precedes reading journal. > is mRWBufPos variable the best to store this value? the Pos suffix indices > a read or write position, but here it stands for something else... really > buffer encapsulation will make this so much more readable... Why not, but definitely as a followup. > @@ +1997,5 @@ > > + (aSucceeded && mIndexOnDiskIsValid && mJournalReadSuccessfully)); > > + > > + if (mState == SHUTDOWN) { > > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); > > + RemoveFile(NS_LITERAL_CSTRING(kJournalName)); > > (!) > > assert not on main thread Again, this actually happens on main thread and that's the reason why we must remove the file synchronously and not via DoomFile. > @@ +2021,5 @@ > > + ProcessPendingOperations(); > > + // Remove all entries that we haven't seen during this session > > + mIndex.EnumerateEntries(&CacheIndex::RemoveNonFreshEntries, this); > > + StartBuildingIndex(); > > + return; > > please why and what is here actually happening > can mIndexOnDiskIsValid == false when aSucceeded == true? No, I think the assert at the beginning of this method is self explaining. See also the new comment explaining reading of the index. > as I understand, reading is the first operation we do with the index, only > way mIndexOnDiskIsValid can become true is that we successfully wrote the > index to disk, but that probably didn't have a chance to happen You've missed that this variable is also set in CacheIndex::ParseRecords(). > @@ +2129,5 @@ > > + NS_WARNING("CacheIndex::SetupDirectoryEnumerator() - Entries directory " > > + "doesn't exist!"); > > + LOG(("CacheIndex::SetupDirectoryEnumerator() - Entries directory doesn't " > > + "exist!" )); > > + return NS_ERROR_UNEXPECTED; > > (!) > > should this really NS_WARNING? what if ran right after startup, it can be > OK. We always first try to open index file and both CacheFileIOManager::OpenFileInternal() and CacheFileIOManager::OpenSpecialFileInternal() create the cache tree if it doesn't exist. So this is really unexpected error. > @@ +2266,5 @@ > > + } > > + } > > + > > + nsCOMPtr<nsIFile> file; > > + rv = mDirEnumerator->GetNextFile(getter_AddRefs(file)); > > isn't necessary (and better) to call HasMore (or so) ? Why? It does exactly what we need. It returns first/next file or null when there is no next file. > @@ +2692,5 @@ > > + MOZ_ASSERT(mPendingUpdates.Count() == 0); > > + > > + // PreShutdownInternal() should change the state to READY from every state. It > > + // may go through different states, but once we are in READY state the only > > + // possible transition is to SHUTDOWN state. > > really? and what if you find out some inconsistency between the index and > files under entries/ ? isn't there then indexneedsupdate or so? that can > go from ready to updating. at least we go to writing. please check this > comment is correct It is correct. PreShutdownInternal() interrupts any process (update/build/write/read) and we end up in ready state. And we test whether mShuttingDown is set to true everywhere we start these processes, e.g. WriteIndexToDiskIfNeeded(), DelayedBuildUpdate(), StartUpdatingIndexIfNeeded(), etc... > @@ +2858,5 @@ > > + return NS_OK; > > + } > > + > > + switch (mReadOpenCount) { > > + case 2: // kIndexName > > (!) - semi critical... > > are we sure the order will always be preserved? this seems fragile to me > I think comparing aHandle->Key() and branch based on that is safer Yes, nothing could IMO reorder the callback notifications. We cannot compare the name if we get an error and we expect the error while opening the tmpfile. And there is already an assertion that checks the file name when the file was opened successfully. > @@ +2891,5 @@ > > + if (mHandle2) { > > + // Rename journal to make sure we update index on next start in case > > + // firefox crashes > > + rv = CacheFileIOManager::RenameFile( > > + mHandle2, NS_LITERAL_CSTRING(kTempIndexName), this); > > this operation is a bit unclear to me... why is it needed to do this rename? I think the comment is clear. If the cache is used while we read the index from the disk, entries are normally created/used/rewritten. When FF crashes while we read the index and some entries were altered then these changes won't be in index on next startup because we'll find valid index and valid journal as well (both are valid but outdated). It is not critical to run with outdated index, but there is no reason to do it when not necessary. Renaming journal to tmpfile minimizes this time window, since we start update process if journal is not found on disk. > @@ +127,5 @@ > > + { > > + mRec->mFrecency = 0; > > + mRec->mExpirationTime = nsICacheEntry::NO_EXPIRATION_TIME; > > + mRec->mAppId = nsILoadContextInfo::NO_APP_ID; > > + mRec->mFlags = 0; > > should the hash be inited somehow too? No, CacheIndexEntry can never change its hash and it is set up in constructor. > @@ +339,5 @@ > > + mEmpty--; > > + } > > + else { > > + MOZ_ASSERT(mSize); > > + mSize -= aEntry->GetFileSize(); > > we won't get here when entry is eRemoved, is that ok? Yes, it is OK, since removed entries are not counted to the total count. > how do you ensure that an entry cannot be altered outside > before/afterchange? in the automanage class? Yes, all places where we alter, create or remove entry are wrapped with automanage class. > @@ +447,5 @@ > > + void Unlock(); > > + void AssertOwnsLock(); > > + > > + nsresult InitInternal(nsIFile *aCacheDirectory); > > + void PreShutdownInternal(); > > what PreShutdownInternal does? Interrupts any pending process (read, write, update, build) and makes sure the index stays in READY state until Shutdown() is called. > @@ +449,5 @@ > > + > > + nsresult InitInternal(nsIFile *aCacheDirectory); > > + void PreShutdownInternal(); > > + > > + // This method fails when index is not initialized or is shut down. > > and when is it "shut down" ? Between calls to PreShutdownInternal() and Shutdown(). > @@ +523,5 @@ > > + // to make sure mIndexStats contains correct information. > > + void EnsureCorrectStats(); > > + static PLDHashOperator SumIndexStats(CacheIndexEntry *aEntry, void* aClosure); > > + // Finalizes reading process. > > + void FinishRead(bool aSucceeded); > > IMO better would be OnReadFinished(bool); but up to you. No, it actually finishes the reading. One can call this method anytime with false argument to stop the reading process. > @@ +677,5 @@ > > + // Reading of journal succeeded if true. > > + bool mJournalReadSuccessfully; > > + > > + // Handle used for writing and reading index file. > > + nsRefPtr<CacheFileHandle> mHandle; > > (!) > > for safety I think it would be good to have separate handles for read and > for write. We can never read and write at the same time. We always call FinishRead() before we leave READING state and we null out the handles in FinishRead().

Attachment #8374056 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 29

•

11 years ago

Attached patch v5 -> v6 diff (obsolete) — Details — Splinter Review

few additional fixes try server push https://tbpl.mozilla.org/?tree=Try&rev=608d3070beed

Attachment #8374696 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 30

•

11 years ago

Attached patch v6 -> v7 diff (obsolete) — Details — Splinter Review

Yet another small fix of a bug that caused removing index file during shutdown.

Attachment #8374706 - Flags: review?(honzab.moz)

Michal Novotny [:michal]

Assignee

Comment 31

•

11 years ago

Attached patch patch v7 (obsolete) — Details — Splinter Review

Attachment #8365423 - Attachment is obsolete: true

Honza Bambas (:mayhemer)

Reporter

Comment 32

•

11 years ago

(In reply to Michal Novotny (:michal) from comment #28) > Created attachment 8374056 [details] [diff] [review] > v4 -> v5 diff > > I've ignored some less important comments for now. The patch contains fix > for the IO under lock when updating/building index, lot of other fixes and > new comments. This is just a diff against the old patch. I'll attach a new > version of the patch once it successfully passes try server and my local > testing. > > > (In reply to Honza Bambas (:mayhemer) from comment #27) > > > + rv = ioMan->mIOThread->Dispatch(ev, CacheIOThread::WRITE); > > > > for curiosity, why WRITE level? I think simple dispatch might be better. > > writes can be delayed and when someone else then the caller to this method > > wants to use the index entry info (like if it exists) it might stupidly lay > > in the thread queue. > > I don't see a reason why writing of index should have a higher priority than > opening and entry etc. The index is usable while we're writing it to disk... It's clear to me now. > > > > ::: netwerk/cache2/CacheHashUtils.h > > @@ +50,5 @@ > > > + uint8_t mPos; > > > + uint32_t mBuf; > > > + uint8_t mBufPos; > > > + uint32_t mLength; > > > + bool mFinalized; > > > > See bug 966114 > > This hash function was chosen because it is pretty strong while being still > fast (see bug #290032). Are there any tests how good mfbt/HashFunction is? Depends on what you want to test, maybe somebody did the evaluation already. If then we (you?) should do that evaluation in bug 966114. This a footprint optimization, if not really necessary we shouldn't duplicate code. No necessary for enabling cache2 tho. > > @@ +1383,5 @@ > > > + NS_ENSURE_SUCCESS(rv, rv); > > > + > > > + bool exists; > > > + rv = file->Exists(&exists); > > > + NS_ENSURE_SUCCESS(rv, rv); > > > > (!) > > > > MOZ_ASSERT(!NS_IsMainThread()) please... > > No, it can be called from WriteLogToDisk() which is executed on the main > thread. See below. If this can happen just during shutdown, then the check should be: if (!mShuttingDown) MOZ_ASSERT(!NS_IsMainThread()); > > > > @@ +1516,5 @@ > > > + nsresult rv; > > > + > > > + MOZ_ASSERT(mPendingUpdates.Count() == 0); > > > + > > > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); > > > > why is the temp file removed here? I'd expect this is never deleted, only > > the 'real' index ever is, and this is renamed to the real index. I'm > > confused > > Because the only clean state at startup (when we start neither update nor > build process) is when tmpfile doesn't exist and index and log file exist > and are successfully parsed. Existing tmpfile means that some operation > didn't finish. OK, comments would help. (didn't check the new patch yet) > > > @@ +1530,5 @@ > > > + mIndexStats.Log(); > > > + > > > + PRFileDesc *fd = nullptr; > > > + rv = logFile->OpenNSPRFileDesc(PR_RDWR | PR_CREATE_FILE | PR_TRUNCATE, > > > + 0600, &fd); > > > > (!) > > > > assert not on main thread > > This HAPPENS on the main thread. We write journal during shutdown because > the index is used until its shutdown and we cannot easily start writing > journal earlier. It is small enough and no fsync is used. Btw, this has been > discussed and found to be OK. Aha! This needs comments, and probably similar check (if the conditions are met:) if (!mShuttingDown) MOZ_ASSERT(!NS_IsMainThread()); > > is mRWBufPos variable the best to store this value? the Pos suffix indices > > a read or write position, but here it stands for something else... really > > buffer encapsulation will make this so much more readable... > > Why not, but definitely as a followup. Sure, that would be too much changes. > > > > @@ +1997,5 @@ > > > + (aSucceeded && mIndexOnDiskIsValid && mJournalReadSuccessfully)); > > > + > > > + if (mState == SHUTDOWN) { > > > + RemoveFile(NS_LITERAL_CSTRING(kTempIndexName)); > > > + RemoveFile(NS_LITERAL_CSTRING(kJournalName)); > > > > (!) > > > > assert not on main thread > > Again, this actually happens on main thread and that's the reason why we > must remove the file synchronously and not via DoomFile. Ah, I overlooked the state check. OK. > > @@ +2891,5 @@ > > > + if (mHandle2) { > > > + // Rename journal to make sure we update index on next start in case > > > + // firefox crashes > > > + rv = CacheFileIOManager::RenameFile( > > > + mHandle2, NS_LITERAL_CSTRING(kTempIndexName), this); > > > > this operation is a bit unclear to me... why is it needed to do this rename? > > I think the comment is clear. If the cache is used while we read the index > from the disk, entries are normally created/used/rewritten. When FF crashes > while we read the index and some entries were altered then these changes > won't be in index on next startup because we'll find valid index and valid > journal as well (both are valid but outdated). It is not critical to run > with outdated index, but there is no reason to do it when not necessary. > Renaming journal to tmpfile minimizes this time window, since we start > update process if journal is not found on disk. OK, still not 100% clear, hope I get it from the new comments. > > @@ +339,5 @@ > > > + mEmpty--; > > > + } > > > + else { > > > + MOZ_ASSERT(mSize); > > > + mSize -= aEntry->GetFileSize(); > > > > we won't get here when entry is eRemoved, is that ok? > > Yes, it is OK, since removed entries are not counted to the total count. I wanted point out some possibility to optimize the code I think. > > how do you ensure that an entry cannot be altered outside > > before/afterchange? in the automanage class? > > Yes, all places where we alter, create or remove entry are wrapped with > automanage class. That is not the answer. How do you ensure it is always wrapped? :) But that is probably not that critical since noone from outside this code can access the entries, so access to them is fully under our control. > > @@ +677,5 @@ > > > + // Reading of journal succeeded if true. > > > + bool mJournalReadSuccessfully; > > > + > > > + // Handle used for writing and reading index file. > > > + nsRefPtr<CacheFileHandle> mHandle; > > > > (!) > > > > for safety I think it would be good to have separate handles for read and > > for write. > > We can never read and write at the same time. We always call FinishRead() > before we leave READING state and we null out the handles in FinishRead(). You might miss the point - I want to say that it would be safer to have two handles so that we avoid accidental writes to a handle open for read and vise versa. But leave it.

Honza Bambas (:mayhemer)

Reporter

Comment 33

•

11 years ago

(In reply to Michal Novotny (:michal) from comment #28) > Created attachment 8374056 [details] [diff] [review] > v4 -> v5 diff > > @@ +2858,5 @@ > > > + return NS_OK; > > > + } > > > + > > > + switch (mReadOpenCount) { > > > + case 2: // kIndexName > > > > (!) - semi critical... > > > > are we sure the order will always be preserved? this seems fragile to me > > I think comparing aHandle->Key() and branch based on that is safer > > Yes, nothing could IMO reorder the callback notifications. We cannot compare > the name if we get an error and we expect the error while opening the > tmpfile. And there is already an assertion that checks the file name when > the file was opened successfully. Forgot about this one: I still don't think this is alright also regarding we really completely lack testing for this. There is another full-proof way to do it: have separate listeners for each file to OpenFile(). Followup is sufficient, but this definitely should be made more resilient.

Honza Bambas (:mayhemer)

Reporter

Comment 34

•

11 years ago

Comment on attachment 8374056 [details] [diff] [review] v4 -> v5 diff Review of attachment 8374056 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the changes. Please wait with landing this (feel free to put in gum tho) until we do the test (if we even do it...) r=hoznab ::: netwerk/cache2/CacheIndex.cpp @@ +617,1 @@ > (entryRemoved && !entry->IsFresh())) { nit: indent @@ +725,3 @@ > index->mIndexNeedsUpdate = true; // TODO Does this really help in case of collision? > + } else { > + if (entry->IsInitialized()) { } else if (..) { } would do as well, but leave it. @@ +816,1 @@ > (entryRemoved && !entry->IsFresh())) { indent @@ +873,5 @@ > nsresult > CacheIndex::UpdateEntry(const SHA1Sum::Hash *aHash, > const uint32_t *aFrecency, > const uint32_t *aExpirationTime, > const uint32_t *aSize) I think the size is unused anywhere. @@ -932,5 @@ > - LOG(("FileSize is too big, truncating to %u", > - CacheIndexEntry::eFileSizeMask)); > - size = CacheIndexEntry::eFileSizeMask; > - } > - entry->SetFileSize(size); Ah, so it was here. But definitely it's better to have it in SetFileSize directly. @@ +1084,2 @@ > if (aSize && > + (*aSize & CacheIndexEntry::kFileSizeMask) != aEntry->GetFileSize()) { Based on the suggestion (bellow) to expose the sizes on entry in bytes, best have a method on entry like FileSizeEquals(int64_t aFileSize) that will do the correct check for you. @@ +1218,5 @@ > uint32_t mProcessed; > #ifdef DEBUG > bool mHasMore; > #endif > }; should this be in an anonymous namespace? @@ +1278,5 @@ > MOZ_ASSERT(data.mHasMore); > } > > + rv = CacheFileIOManager::Write(mIndexHandle, fileOffset, mRWBuf, mRWBufPos, > + mSkipEntries == mProcessEntries ? false : true, seems fixed in followups @@ +2200,5 @@ > aEntry->SetFrecency(frecency); > > + // round up to whole kB > + if (aFileSize & 0x3FF) { > + aFileSize += 0x400; += 0x3ff will do, right? @@ +2206,5 @@ > + aFileSize >>= 10; > + > + aEntry->SetFileSize(static_cast<uint32_t>( > + std::max(static_cast<int64_t>(PR_UINT32_MAX), > + aFileSize))); I think the uint32_t sizeK = (size64 + 0x3ff) >> 10 game should be put into the SetFileSize function. As well GetFileSize should return the byte size. Also because the method names are not expressing that the size is in kB. Generally it would be safer and prevent any conversion mistakes as well save reviewer's time to check you covert on all places correctly. But up to you, at least please rename the methods correctly and make sure you compare to the right numbers all the time. @@ +2944,4 @@ > > switch (mState) { > case WRITING: > + if (NS_SUCCEEDED(aResult)) { also fixed in a followup @@ +3053,5 @@ > case WRITING: > if (NS_FAILED(aResult)) { > FinishWrite(false); > + } else { > + if (mSkipEntries == mProcessEntries) { If I'm right this means we have processed all entries here? This is a big change, how could this even work before? @@ +3125,4 @@ > } > > nsresult > CacheIndex::OnFileRenamed(CacheFileHandle *aHandle, nsresult aResult) can you please add comments somewhere to this method where do we trigger the rename and what processes that is part of? ::: netwerk/cache2/CacheIndex.h @@ +32,5 @@ > class CacheFileMetadata; > > typedef struct { > + // Version of the index. The index must be ignored and deleted when the file > + // on disk was written with a newer version. with a different version actually (which is absolutely correct!). @@ +162,5 @@ > } > > const SHA1Sum::Hash * Hash() { return &mRec->mHash; } > > + bool IsInitialized() { return !!(mRec->mFlags & kInitializedMask); } nit: you don't need !! (bool converts) and the getters should all be const (but I think this was mentioned during the first review as a non-critical nit) @@ +209,4 @@ > memcpy(aBuf, mRec, sizeof(CacheIndexRecord)); > + > + // Dirty and fresh flags should never go to disk, since they make sense only > + // current session. during current session? @@ +481,4 @@ > static nsresult EnsureEntryExists(const SHA1Sum::Hash *aHash); > > + // Initialize the entry. It MUST be present in index. Call to AddEntry() or > + // EnsureEntryExists() must precede to call to this method. precede THE call to ? @@ +491,5 @@ > static nsresult RemoveEntry(const SHA1Sum::Hash *aHash); > > + // Update some information in entry. The entry MUST be present in index and > + // MUST be initialized. Call to AddEntry() or EnsureEntryExists() and to > + // InitEntry() must precede to call to this method. same as just above @@ +579,5 @@ > void* aClosure); > > + // Following methods perform writing of the journal during shutdown. All these > + // methods must be called only during shutdown since they write/delete files > + // directly instead of using CacheFileIOManager. Journal contains only entries directly ON THE MAIN THREAD instead of using CacheFileIOManager THAT DOES IT ASYNCHRONOUSLY ON IO THREAD ? @@ +582,5 @@ > + // methods must be called only during shutdown since they write/delete files > + // directly instead of using CacheFileIOManager. Journal contains only entries > + // that are dirty, i.e. changes that are not present in the index file on the > + // disk. When the log is written successfully, the dirty flag in index file is > + // cleared. Maybe append: "i.e. changes that have not been written to the disk index yet" ? @@ +618,5 @@ > + // E - file exists > + // M - file is missing > + // I - data is invalid (parsing failed or hash didn't match) > + // D - dirty (data in index file is correct, but dirty flag is set) > + // C - clean (index file is clean) not sure what this means, like empty or correct or not dirty or what? confusing with V btw, using a word (explained bellow) instead of a letter in the table might be better. @@ +626,5 @@ > + // a whole (i.e. C,V,M state). > + // > + // We rename the journal file to the temporary file as soon as possible after > + // initial test to ensure that we start update process on the next startup if > + // FF crashes during parsing of the index. Thanks for this large comment! @@ +657,5 @@ > // Following methods perform updating and building of the index. > // Timer callback that starts update or build process. > static void DelayedBuildUpdate(nsITimer *aTimer, void *aClosure); > // Posts timer event that start update or build process. > + nsresult FireBuildUpdateTimer(uint32_t aDelay); "Fire" means the action is to be invoked right now. This is "Schedule" - e.g. that the action is about to be fired later.

Attachment #8374056 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Reporter

Comment 35

•

11 years ago

Comment on attachment 8374696 [details] [diff] [review] v5 -> v6 diff Review of attachment 8374696 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache2/CacheFileIOManager.cpp @@ +934,5 @@ > > CacheIndex::InitEntry(mHandle->Hash(), mAppId, mAnonymous, mInBrowser); > + > + uint32_t sizeInK = mHandle->FileSizeInK(); > + CacheIndex::UpdateEntry(mHandle->Hash(), nullptr, nullptr, &sizeInK); some small comment why would be good ::: netwerk/cache2/CacheIndex.cpp @@ +772,5 @@ > + // There is a collision and we are going to rewrite this entry. Initialize > + // it as a new entry. > + entry->InitNew(); > + entry->MarkFresh(); > + entry->MarkDirty(); nit: MarkDirty called twice or Init() rewrites it? @@ +2216,5 @@ > } > aFileSize >>= 10; > > aEntry->SetFileSize(static_cast<uint32_t>( > + std::min(static_cast<int64_t>(PR_UINT32_MAX), favorite mistake of my own :) ::: netwerk/cache2/CacheIndex.h @@ +148,5 @@ > { > MOZ_ASSERT(mRec->mFrecency == 0); > MOZ_ASSERT(mRec->mExpirationTime == nsICacheEntry::NO_EXPIRATION_TIME); > MOZ_ASSERT(mRec->mAppId == nsILoadContextInfo::NO_APP_ID); > + // When we init the entry it must be fresh and could be dirty s/could/may/

Attachment #8374696 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Reporter

Comment 36

•

11 years ago

Comment on attachment 8374706 [details] [diff] [review] v6 -> v7 diff Review of attachment 8374706 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache2/CacheIndex.cpp @@ +1289,5 @@ > MOZ_ASSERT(data.mHasMore); > } > > rv = CacheFileIOManager::Write(mIndexHandle, fileOffset, mRWBuf, mRWBufPos, > + mSkipEntries == mProcessEntries, this); I was wondering :)

Attachment #8374706 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Reporter

Comment 37

•

11 years ago

(In reply to Honza Bambas (:mayhemer) from comment #34) > gum tho) until we do the test (if we even do it...) Means bug 967693.

Michal Novotny [:michal]

Assignee

Comment 38

•

11 years ago

Attached patch patch v8 - few minor changes according to reviewer's comments (obsolete) — Details — Splinter Review

Attachment #8374056 - Attachment is obsolete: true

Attachment #8374696 - Attachment is obsolete: true

Attachment #8374706 - Attachment is obsolete: true

Attachment #8374708 - Attachment is obsolete: true

Attachment #8377530 - Flags: review+

Michal Novotny [:michal]

Assignee

Comment 39

•

11 years ago

Attached patch v7 -> v8 diff (obsolete) — Details — Splinter Review

Michal Novotny [:michal]

Assignee

Comment 40

•

11 years ago

> @@ +873,5 @@ > > nsresult > > CacheIndex::UpdateEntry(const SHA1Sum::Hash *aHash, > > const uint32_t *aFrecency, > > const uint32_t *aExpirationTime, > > const uint32_t *aSize) > > I think the size is unused anywhere. It is used in: InitIndexEntryEvent::Run() CacheFileIOManager::WriteInternal() Maybe you mean CacheFileIOManager::UpdateIndexEntry() which has the same arguments but we never set size with it. We used to use it to set the file size but it has cnaged. I removed the argument from this method. > @@ +2206,5 @@ > > + aFileSize >>= 10; > > + > > + aEntry->SetFileSize(static_cast<uint32_t>( > > + std::max(static_cast<int64_t>(PR_UINT32_MAX), > > + aFileSize))); > > I think the uint32_t sizeK = (size64 + 0x3ff) >> 10 game should be put into > the SetFileSize function. As well GetFileSize should return the byte size. > Also because the method names are not expressing that the size is in kB. > Generally it would be safer and prevent any conversion mistakes as well save > reviewer's time to check you covert on all places correctly. > > But up to you, at least please rename the methods correctly and make sure > you compare to the right numbers all the time. First, I don't think that the bytes -> kilobytes conversion should be done in SetFileSize() since it would be IMO much more confusing that calling SetFileSize(6) results in GetFileSize() == 1024. I've added comments to the methods that sizes are in kB. All sizes in cache index are in kB and renaming to Set/GetFileSizeInK could somebody confuse that we also have some other sizes in bytes. > @@ +3053,5 @@ > > case WRITING: > > if (NS_FAILED(aResult)) { > > FinishWrite(false); > > + } else { > > + if (mSkipEntries == mProcessEntries) { > > If I'm right this means we have processed all entries here? This is a big > change, how could this even work before? This has changed when I merged WriteIndexHeader() into WriteIndexToDisk(). Before this change mSkipEntries was set to 0 when we've written all entries. > ::: netwerk/cache2/CacheIndex.h > @@ +32,5 @@ > > class CacheFileMetadata; > > > > typedef struct { > > + // Version of the index. The index must be ignored and deleted when the file > > + // on disk was written with a newer version. > > with a different version actually (which is absolutely correct!). When the index file on disk in older format, we can read it (e.g. it could be just a format change, not more information about entries) or ignore it. But if it is a newer format we must ignore it. I think my formulation is more correct. > @@ +162,5 @@ > > } > > > > const SHA1Sum::Hash * Hash() { return &mRec->mHash; } > > > > + bool IsInitialized() { return !!(mRec->mFlags & kInitializedMask); } > > nit: you don't need !! (bool converts) and the getters should all be const > (but I think this was mentioned during the first review as a non-critical > nit) And it was ignored as a non-critical nit :) > @@ +582,5 @@ > > + // methods must be called only during shutdown since they write/delete files > > + // directly instead of using CacheFileIOManager. Journal contains only entries > > + // that are dirty, i.e. changes that are not present in the index file on the > > + // disk. When the log is written successfully, the dirty flag in index file is > > + // cleared. > > Maybe append: > > "i.e. changes that have not been written to the disk index yet" Append? Where? Or do you mean to replace this test: "i.e. changes that are not present in the index file on the disk"? > @@ +618,5 @@ > > + // E - file exists > > + // M - file is missing > > + // I - data is invalid (parsing failed or hash didn't match) > > + // D - dirty (data in index file is correct, but dirty flag is set) > > + // C - clean (index file is clean) > > not sure what this means, like empty or correct or not dirty or what? > confusing with V btw, using a word (explained bellow) instead of a letter > in the table might be better. Journal file doesn't have any header and the states can be only M, I, V. Index file has also a header with dirty flag, so there are actually 2 valid states: valid but dirty = D, valid and clean = C. From the comment it should be clear that states D and C are only used for index file and state V only for journal file.

Honza Bambas (:mayhemer)

Reporter

Comment 41

•

11 years ago

(In reply to Michal Novotny (:michal) from comment #40) > > ::: netwerk/cache2/CacheIndex.h > > @@ +32,5 @@ > > > class CacheFileMetadata; > > > > > > typedef struct { > > > + // Version of the index. The index must be ignored and deleted when the file > > > + // on disk was written with a newer version. > > > > with a different version actually (which is absolutely correct!). > > When the index file on disk in older format, we can read it (e.g. it could > be just a format change, not more information about entries) or ignore it. > But if it is a newer format we must ignore it. I think my formulation is > more correct. But in the code I see: if (NetworkEndian::readUint32(&hdr->mVersion) != kIndexVersion) { free(hdr); FinishRead(false); return; } > > @@ +582,5 @@ > > > + // methods must be called only during shutdown since they write/delete files > > > + // directly instead of using CacheFileIOManager. Journal contains only entries > > > + // that are dirty, i.e. changes that are not present in the index file on the > > > + // disk. When the log is written successfully, the dirty flag in index file is > > > + // cleared. > > > > Maybe append: > > > > "i.e. changes that have not been written to the disk index yet" > > Append? Where? Or do you mean to replace this test: "i.e. changes that are > not present in the index file on the disk"? So that it looks: the dirty flag in index file is cleared, i.e. changes that have not been written to the disk index yet. > > > > @@ +618,5 @@ > > > + // E - file exists > > > + // M - file is missing > > > + // I - data is invalid (parsing failed or hash didn't match) > > > + // D - dirty (data in index file is correct, but dirty flag is set) > > > + // C - clean (index file is clean) > > > > not sure what this means, like empty or correct or not dirty or what? > > confusing with V btw, using a word (explained bellow) instead of a letter > > in the table might be better. > > Journal file doesn't have any header and the states can be only M, I, V. > Index file has also a header with dirty flag, so there are actually 2 valid > states: valid but dirty = D, valid and clean = C. From the comment it should > be clear that states D and C are only used for index file and state V only > for journal file. Still, when you have to explain it, the comment is probably not readable as you would like to ;)

Michal Novotny [:michal]

Assignee

Comment 42

•

11 years ago

(In reply to Honza Bambas (:mayhemer) from comment #41) > > When the index file on disk in older format, we can read it (e.g. it could > > be just a format change, not more information about entries) or ignore it. > > But if it is a newer format we must ignore it. I think my formulation is > > more correct. > > But in the code I see: > > if (NetworkEndian::readUint32(&hdr->mVersion) != kIndexVersion) { > free(hdr); > FinishRead(false); > return; > } The comment just says what we should do in general. Note that it doesn't say anything about what to do when the index file version is older. Btw, we now have just one version, so I don't think I can even write some code that would make you happy. It seems you have to wait until we have version 2 ;) > > > @@ +582,5 @@ > > > > + // methods must be called only during shutdown since they write/delete files > > > > + // directly instead of using CacheFileIOManager. Journal contains only entries > > > > + // that are dirty, i.e. changes that are not present in the index file on the > > > > + // disk. When the log is written successfully, the dirty flag in index file is > > > > + // cleared. > > > > > > Maybe append: > > > > > > "i.e. changes that have not been written to the disk index yet" > > > > Append? Where? Or do you mean to replace this test: "i.e. changes that are > > not present in the index file on the disk"? > > So that it looks: > > the dirty flag in index file is cleared, i.e. changes that have not been > written to the disk index yet. That makes no sense. You've just misread the last sentence. Btw, the comment is IMO really clear, see CacheIndexHeader structure.

Michal Novotny [:michal]

Assignee

Comment 43

•

11 years ago

pushed to gum http://hg.mozilla.org/projects/gum/rev/7ee97651629a

Michal Novotny [:michal]

Assignee

Comment 44

•

11 years ago

Attached patch patch v9 - final patch with fixed logging in CacheFileIOManager::UpdateIndexEntry() — Details — Splinter Review

Attachment #8377530 - Attachment is obsolete: true

Attachment #8377534 - Attachment is obsolete: true

Attachment #8378249 - Flags: review+

Honza Bambas (:mayhemer)

Reporter

Comment 45

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=2275f8be60c9

Honza Bambas (:mayhemer)

Reporter

Comment 46

•

11 years ago

This is mostly only a performance feature and precondition to other functionality patched in different bugs.

Flags: in-testsuite-

Honza Bambas (:mayhemer)

Reporter

Comment 47

•

11 years ago

Causes MOZ_Assert: Assertion failure: mState == READING, at ../../../gecko/netwerk/cache2/CacheIndex.cpp:1707 https://tbpl.mozilla.org/php/getParsedLog.php?id=35666040&tree=Mozilla-Inbound#error5

Michal Novotny [:michal]

Assignee

Comment 48

•

11 years ago

(In reply to Honza Bambas (:mayhemer) from comment #47) > Causes MOZ_Assert: Assertion failure: mState == READING, at > ../../../gecko/netwerk/cache2/CacheIndex.cpp:1707 > > https://tbpl.mozilla.org/php/getParsedLog.php?id=35666040&tree=Mozilla- > Inbound#error5 This code has been changed in bug #968106 and the problem should be fixed now. Tryserver push including patch from bug #968106: https://tbpl.mozilla.org/?tree=Try&rev=b98b78714c31

Michal Novotny [:michal]

Assignee

Comment 49

•

11 years ago

Another push to test just a gaia unit tests on B2G Desktop Linux x64 Opt which was broken on the previous revision. https://tbpl.mozilla.org/?tree=Try&rev=31bb9c11ecbc

Michal Novotny [:michal]

Assignee

Comment 50

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/583d84aeb6c6

Ryan VanderMeulen [:RyanVM]

Comment 51

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/583d84aeb6c6

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla30

neil@parkwaycc.co.uk

Comment 52

•

11 years ago

Comment on attachment 8378249 [details] [diff] [review] patch v9 - final patch with fixed logging in CacheFileIOManager::UpdateIndexEntry() >+class WriteLogHelper >+{ >+public: >+ WriteLogHelper(PRFileDesc *aFD) ... >+ { >+ mHash = new CacheHash(); ... >+ nsCOMPtr<CacheHash> mHash; CacheHash is not an interface, so an nsCOMPtr is incorrect here. Slightly better would be an nsRefPtr. Better still would be an nsAutoPtr, since you own the CacheHash instance. If you convert all of the uses you can then drop the now unused refcount. You'll also need to make the destructor public. (Why is it virtual anyway?) Even slightly better still in this particular case would be a member variable, rather than a pointer, since the CacheHash instance's lifetype matches that of the WriteLogHelper. (For the other uses, a Maybe<CacheHash> might be a better solution than an nsAutoPtr.)

WIP patch 12 years ago Michal Novotny [:michal] 136.97 KB, patch		Details \| Diff \| Splinter Review
WIP patch - updating, building 11 years ago Michal Novotny [:michal] 105.63 KB, patch		Details \| Diff \| Splinter Review
WIP patch - updating, building v2 11 years ago Michal Novotny [:michal] 106.23 KB, patch		Details \| Diff \| Splinter Review
WIP patch - updating, building v3 11 years ago Michal Novotny [:michal] 114.18 KB, patch		Details \| Diff \| Splinter Review
WIP patch - various fixes 11 years ago Michal Novotny [:michal] 28.43 KB, patch		Details \| Diff \| Splinter Review
WIP patch - another fixes 11 years ago Michal Novotny [:michal] 5.51 KB, patch		Details \| Diff \| Splinter Review
WIP patch - fixed delayed updating/building 11 years ago Michal Novotny [:michal] 5.01 KB, patch		Details \| Diff \| Splinter Review
single WIP patch 11 years ago Michal Novotny [:michal] 186.84 KB, patch		Details \| Diff \| Splinter Review
single WIP patch v2 11 years ago Michal Novotny [:michal] 186.71 KB, patch	mayhemer : review-	Details \| Diff \| Splinter Review
single WIP patch v1->v2 interdiff 11 years ago Michal Novotny [:michal] 481 bytes, patch		Details \| Diff \| Splinter Review
single WIP patch v3 11 years ago Michal Novotny [:michal] 186.12 KB, patch		Details \| Diff \| Splinter Review
single WIP patch v3 (as to land after bugs 945945 and 945250) 11 years ago Honza Bambas (:mayhemer) 185.07 KB, patch		Details \| Diff \| Splinter Review
merged to the current gum WIP v3 [by Michal] 11 years ago Honza Bambas (:mayhemer) 185.10 KB, patch		Details \| Diff \| Splinter Review
WIP v4 - added comments explaining how CacheIndex works 11 years ago Michal Novotny [:michal] 192.78 KB, patch	mayhemer : review-	Details \| Diff \| Splinter Review
v4 -> v5 diff 11 years ago Michal Novotny [:michal] 109.89 KB, patch	mayhemer : review+	Details \| Diff \| Splinter Review
v5 -> v6 diff 11 years ago Michal Novotny [:michal] 6.25 KB, patch	mayhemer : review+	Details \| Diff \| Splinter Review
v6 -> v7 diff 11 years ago Michal Novotny [:michal] 917 bytes, patch	mayhemer : review+	Details \| Diff \| Splinter Review
patch v7 11 years ago Michal Novotny [:michal] 203.97 KB, patch		Details \| Diff \| Splinter Review
patch v8 - few minor changes according to reviewer's comments 11 years ago Michal Novotny [:michal] 204.36 KB, patch	michal : review+	Details \| Diff \| Splinter Review
v7 -> v8 diff 11 years ago Michal Novotny [:michal] 23.10 KB, patch		Details \| Diff \| Splinter Review
patch v9 - final patch with fixed logging in CacheFileIOManager::UpdateIndexEntry() 11 years ago Michal Novotny [:michal] 204.50 KB, patch	michal : review+	Details \| Diff \| Splinter Review