723582 - Don't make nsCacheService::IsStorageEnabledForPolicy{,_Locked} block on the entire cache service

Current state of the patch for this. This seems to be causing some locking issues on try, so I'm still poking at it, but wanted to get some feedback on the general approach at least.

Attachment #595251 - Flags: feedback?(michal.novotny)

Michal Novotny [:michal]

Assignee

Comment 3

•

13 years ago

Comment on attachment 595251 [details] [diff] [review] broken patch It isn't entirely clear to me what exactly should be protect by mDeviceLock. I would assume that it should protect all accesses to devices and related members like mEnableDiskDevice etc. But methods of the devices are sometimes called under the mLock (nsCacheService::EvictEntriesForClient(), nsCacheService::VisitEntries() and others) and sometimes under the mDeviceLock (nsCacheService::OnProfileShutdown(), nsCacheService::EvictEntriesForClient() ...). Maybe it would be good (just for a better readability) to group all members that need to be protected by mDeviceLock into some subclass within the class nsCacheService. Something like this: class Devices { public: bool mEnableDiskDevice; nsDiskCacheDevice * mDiskDevice; ... ... } mDevices; > NS_IMETHODIMP > nsCacheEntryDescriptor::SetStoragePolicy(nsCacheStoragePolicy policy) > { > - nsCacheServiceAutoLock lock; > if (!mCacheEntry) return NS_ERROR_NOT_AVAILABLE; > // XXX validate policy against session? It is definitely not safe to call nsCacheEntry::SetStoragePolicy() and nsCacheEntry::MarkEntryDirty() without any lock.

Attachment #595251 - Flags: feedback?(michal.novotny) → feedback-

u408661

Reporter

Comment 4

•

13 years ago

(In reply to Michal Novotny (:michal) from comment #3) > Comment on attachment 595251 [details] [diff] [review] > broken patch > > It isn't entirely clear to me what exactly should be protect by mDeviceLock. Yeah, I didn't make that too clear, my fault. The idea for now is to protect access to the m<Whatever>Enabled booleans, since those are the only things touched by IsStorageEnabledForPolicy (which is what I'm trying to fix with this bug). I likely messed up some, and did some things with the device entries themselves under this lock as opposed to the global lock, which (in retrospect) is probably the source of some of my problems on try. > > NS_IMETHODIMP > > nsCacheEntryDescriptor::SetStoragePolicy(nsCacheStoragePolicy policy) > > { > > - nsCacheServiceAutoLock lock; > > if (!mCacheEntry) return NS_ERROR_NOT_AVAILABLE; > > // XXX validate policy against session? > > It is definitely not safe to call nsCacheEntry::SetStoragePolicy() and > nsCacheEntry::MarkEntryDirty() without any lock. These are also probably sources of some of my issues on try. Not sure why I removed the locks here...

u408661

Reporter

Updated

•

13 years ago

No longer blocks: 723577

u408661

Reporter

Updated

•

13 years ago

Depends on: 723577

u408661

Reporter

Comment 5

•

13 years ago

Rename for clarity

Summary: Fix nsCacheService::IsStorageEnabledForPolicy{,_Locked} → Don't make nsCacheService::IsStorageEnabledForPolicy{,_Locked} block on the entire cache service

u408661

Reporter

Comment 6

•

13 years ago

Attached patch patch (obsolete) — Details — Splinter Review

OK, here's an updated version of the patch that removes all the stupid I had going on in the first iteration (note: this does not guarantee a complete lack of stupid in this version). The whole idea here is that we're (possibly) blocking the main thread on I/O happening on the cache thread when we call IsStorageEnabledForPolicy, and we don't need to do that. All we need is to know if a particular device is enabled (the m*DeviceEnabled members of nsCacheService). So, this patch makes one tiny bit of finer-grained locking for those members, nsCacheService::mDeviceStatusLock. This lock is designed to never be held at the same time as nsCacheService::mLock (the Big Cache Lock), so there were a few hoops to jump through in order to have consistent views of m*DeviceEnabled in some cache service methods, but nothing too crazy. Try results at https://tbpl.mozilla.org/?tree=Try&rev=0faccc05fc8c

Attachment #595251 - Attachment is obsolete: true

Attachment #626220 - Flags: review?(michal.novotny)

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 7

•

13 years ago

1. Please document the lock interdependencies, e.g. acquisition ordering considerations between this new lock and the cache service lock. I think in this case, there are no interdependencies, so the cache device lock may be acquired independently of the cache service lock; it would be great to have that stated explicitly for occasional readers of the code like me. 2. Since we're breaking up the lock into multiple locks, the _Locked suffix of function names is ambiguous. I suggest that we change the prototypes of new functions that require different locks to be held to use the following witness pattern: bool nsCacheService::IsStorageEnabledForPolicy_Locked( nsCacheStoragePolicy policy, const nsCacheDeviceStatusAutoLock & /*proofOfLock*/); The advantages of using this pattern are: 1. It *guarantees* that you updated *every* caller to use the new lock. 2. It documents exactly which lock is required to be held, in a *compiler-checked* way. 3. When you have a chain of function calls where the top-level function acquires a lock used by the lower-level functions, it helps document each function's requirement to have the lock held so that you are much less likely to wrongly re-acquire the lock in the lower functions. I have used this pattern a few times in PSM (search for "proofOfLock"). Maybe Honza has an opinion about it.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 8

•

13 years ago

(In reply to Brian Smith (:bsmith) from comment #7) > 1. Please document the lock interdependencies, e.g. acquisition ordering > considerations between this new lock and the cache service lock. I think in > this case, there are no interdependencies, so the cache device lock may be > acquired independently of the cache service lock; it would be great to have > that stated explicitly for occasional readers of the code like me. I think I might be wrong above. Perhaps the lock acquisition order must be: [nsCacheServiceAutoLock] -> nsCacheDeviceStatusAutoLock. That is, if you are going to hold both locks at the same time, you must grab the cache service lock first. We should document this by adding an assertion to nsCacheService::Lock(): MOZ_ASSERT(mDeviceStatusLock->AssertNotCurrentThreadOwns());

u408661

Reporter

Comment 9

•

13 years ago

Attached patch patch — Details — Splinter Review

This addresses Brian's comments. A couple things to note: 1. <somelock>.AssertNotCurrentThreadOwns doesn't actually do anything. This is documented in the comment where it's used. 2. I removed IsStorageEnabledForPolicy_Locked instead of using the (crufty, to me) pattern Brian suggested. 3. Because of (2), I should probably fix the commit message to not reference that function before committing this. Try run is at https://tbpl.mozilla.org/?tree=Try&rev=fea1f1c6aaa6

Attachment #626220 - Attachment is obsolete: true

Attachment #626220 - Flags: review?(michal.novotny)

Attachment #627989 - Flags: review?(michal.novotny)

u408661

Reporter

Comment 10

•

13 years ago

Note the try run is still going, will report back once it's done.

u408661

Reporter

Comment 11

•

13 years ago

Try looks good

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 12

•

13 years ago

In my patch for bug 759928, I move the call to SetStoragePolicy in nsHttpChannel to the cache service thread. AFAICT, within Gecko/Firefox, that is the only place where IsStorageEnabled[ForPolicy[_Locked]] is called on the main thread. If that's the case, then will we really need to do this once bug 759928 is fixed? It seems like a better fix would be to simply have IsStorageEnabled[ForPolicy[_Locked]] fail if they are not called on the cache IO thread.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 13

•

13 years ago

The other place SetStoragePolicy is used is SetCacheAsFile, but we're planning to remove that functionality (bug 725993) and/or change the current callers to not use it (other bugs).

u408661

Reporter

Comment 14

•

13 years ago

Brian, if we can get all the calls to IsStorageEnabled<blahblahblah> get moved off the main thread, then yes, that should invalidate this bug. I'm totally fine with that course, too. Michal, do you have a preference?

Michal Novotny [:michal]

Assignee

Comment 15

•

13 years ago

I tend to prefer what Brian wrote. But I don't think that we can fail in IsStorageEnabled[ForPolicy[_Locked]] in case it is not called on the cache IO thread. We can do this e.g. in OpenCacheEntry() since there is a async alternative, but we don't have an async version of SetStoragePolicy() so no add-on would be able to set the policy.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 16

•

13 years ago

(In reply to Michal Novotny (:michal) from comment #15) > I tend to prefer what Brian wrote. But I don't think that we can fail in > IsStorageEnabled[ForPolicy[_Locked]] in case it is not called on the cache > IO thread. We can do this e.g. in OpenCacheEntry() since there is a async > alternative, but we don't have an async version of SetStoragePolicy() so no > add-on would be able to set the policy. My understanding is that we're supposed to either provide an async API for addons for anything that does (disk) I/O or we're eventually supposed to make it so addons don't have to do it. Otherwise, even if we remove all the jank in Firefox, addons will just add it back. See bug 721336, especially bug 721336 comment 6, for example. I am not sure there's been an official policy decision either way, or if this is just wishful thinking.

Henri Sivonen (:hsivonen)

Comment 17

•

13 years ago

(In reply to Brian Smith (:bsmith) from comment #16) > See bug 721336, especially > bug 721336 comment 6, for example. I am not sure there's been an official > policy decision either way, or if this is just wishful thinking. That Bugzilla comment and does not represent an official policy decision, to my knowledge, but I wishfully think it should become a policy. :-)

Josh Aas

Updated

•

13 years ago

Assignee: hurley → michal.novotny

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 18

•

13 years ago

There are two places where this function is called on the main thread (AFAICT) in typical use. One is inside the call to mCacheEntry->SetStoragePolicy() in nsHttpChannel::InitCacheEntry. The other is in nsHttpChannel::SetCacheAsFile. The code in InitCacheEntry looks like this: if (mLoadFlags & INHIBIT_PERSISTENT_CACHING) { rv = mCacheEntry->SetStoragePolicy(nsICache::STORE_IN_MEMORY); if (NS_FAILED(rv)) return rv; } // Set the expiration time for this cache entry rv = UpdateExpirationTime(); // Takes cache service lock if (NS_FAILED(rv)) return rv; // This takes cache service lock many times. rv = AddCacheEntryHeaders(mCacheEntry); if (NS_FAILED(rv)) return rv; and the code for nsCacheEntryDescriptor::SetStoragePolicy looks like this after Nick's proposed patch: ... bool storageEnabled; nsCacheService::IsStorageEnabledForPolicy(policy, &storageEnabled); if (!storageEnabled) return NS_ERROR_FAILURE; nsCacheServiceAutoLock lock; ... So, even if we avoid taking the cache service lock inside IsStorageEnabledForPolicy, we will take the lock many times immediately after it. This is why I think that this patch makes sense only in combination with the patch in bug 723577, which would remove these other main-thread lock acquisitions. The main question we need to answer is whether we should continue down this line of work (this bug and bug 723577) OR do bug 759928 OR do bug 763555. I think fixing bug 763555 is the best solution in the long term, but because that is a ways away, we might also consider doing one of the other two options in the interim.

Michal Novotny [:michal]

Assignee

Updated

•

13 years ago

Attachment #627989 - Flags: review?(michal.novotny)

Patrick McManus [:mcmanus]

Comment 19

•

9 years ago

is this relevant still?

Flags: needinfo?(michal.novotny)

Whiteboard: [necko-backlog]

Michal Novotny [:michal]

Assignee

Comment 20

•

9 years ago

no

Status: NEW → RESOLVED

Closed: 9 years ago

Flags: needinfo?(michal.novotny)

Resolution: --- → WONTFIX

Whiteboard: [necko-backlog]

broken patch 13 years ago u408661 24.51 KB, patch	michal : feedback-	Details \| Diff \| Splinter Review
patch 13 years ago u408661 28.83 KB, patch		Details \| Diff \| Splinter Review
patch 13 years ago u408661 32.48 KB, patch		Details \| Diff \| Splinter Review