Open Bug 1183245 Opened 9 years ago Updated 1 year ago

Service worker registration should be wiped when origin storage is wiped

Categories

(Core :: DOM: Service Workers, defect, P3)

defect

Tracking

()

People

(Reporter: bkelly, Unassigned)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

(Keywords: dev-doc-needed)

Attachments

(5 files, 7 obsolete files)

Consider the following steps:

  1) Page registers SW and saves all its resources in Cache during install event.
  2) Some time later browser wipes origin storage due to low disk space.
  3) Page tries to load, triggering SW fetch event.
  4) Page SW fails because it can't find any of its resources in Cache

Basically, quota wiping effectively breaks the guarantee we only fire a fetch event after install and activate events are executed.

Probably the correct solution is to store the registration in the quota dir so it gets removed as well.  Alternatively, we could try to detect this condition in another way and act like it got re-installed.
Andrea, what do you think?
Flags: needinfo?(amarchesini)
It makes sense. I'll speak with janv tomorrow about how to implement this correctly. I leave the NI just to do not forget.
Yeah, if I understand it correctly, we need a new quota client.
Ideally SW registrations would be split into origin directories under <profile>/storage/default.
Client::OnOriginClearCompleted can be used to update in-memory structures and the current directory scan for initialization of temporary storage could also load all SW registrations.
I believe that this approach is better from the "survive crashes" point of view.

Loading of SW registrations from one file is faster for sure, but on the other hand, when one entry is modified/removed entire file needs to be flushed.

I'm going to implement a stub for new quota client and baku will finish the rest.
janv, can you upload the stub here and NI me? Thanks!
Flags: needinfo?(amarchesini)
Jan, could quota manager fire the clear-origin-data observer topic as an alternative.  See bug 1191647.

I guess this would wipe cookies for the site when the storage is revoked as well.  Would that be bad?  Seems removing all state would be nice to avoid inconsistent site behavior.
Flags: needinfo?(Jan.Varga)
(In reply to Ben Kelly [:bkelly] from comment #5)
> Jan, could quota manager fire the clear-origin-data observer topic as an
> alternative.  See bug 1191647.
> 
> I guess this would wipe cookies for the site when the storage is revoked as
> well.  Would that be bad?  Seems removing all state would be nice to avoid
> inconsistent site behavior.

Yes I think it would be bad, since we would try to delete local storage data that is in use or recently used. Also, quota manager observes the clear-origin-data notification itself, so it would try to delete the origin again.
Flags: needinfo?(Jan.Varga)
Andrea, I think this is going to become an issue sooner rather than later after we ship.  Seems SW adoption is going to be relatively quick and we could see quota wipes on mobile semi-frequently.

Another idea I had for fixing this was to simply remove the registration if CacheStorage::Has() resolves false for the specified uuid.  What do you think of this?

I guess one complication would be when to run the CacheStorage::Has() again... in theory we would need to detect the condition where the registration is in memory, but then the quota is wiped out from under it.

I can help with the quota manager client after my vacation as well.

It would be nice to get this in 45 if possible.
Flags: needinfo?(amarchesini)
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Attached patch patch 1 - GetStoragePath() (obsolete) — Splinter Review
As Janv and me discussed, this is the first step of 4.
In this step ServiceWorkerRegistrar takes the storage path from QuotaManager.

Next step is to use read ServiceWorkers.txt files from the subdirs of the storage path.

step 3: is to write in the correct directory these files.

step 4: flush SWR memory when QuotaManager cleans an origin.
Attachment #8689766 - Flags: review?(Jan.Varga)
Attached patch patch 1 - GetStoragePath() (obsolete) — Splinter Review
Instead returning the 'default' storagePath, with this patch we can decide which path we want. SWRegistrar wants the persistent storage path.
Attachment #8689766 - Attachment is obsolete: true
Attachment #8689766 - Flags: review?(Jan.Varga)
Attachment #8689955 - Flags: review?(Jan.Varga)
Wait, why persistent?  Cache API uses default.  I think we need the same or we get the same issue with the reg staying while caches deleted.
Flags: needinfo?(amarchesini)
(In reply to Ben Kelly [:bkelly] from comment #10)
> Wait, why persistent?  Cache API uses default.  I think we need the same or
> we get the same issue with the reg staying while caches deleted.

Yeah, however, I don't know about SW that much, is this approach correct (remove SW registrations when default/temporary storage data for an origin is deleted) ?
Attached patch patch 1 - GetStoragePath() (obsolete) — Splinter Review
Right. I use the 'default' now.
Attachment #8689955 - Attachment is obsolete: true
Attachment #8689955 - Flags: review?(Jan.Varga)
Flags: needinfo?(amarchesini)
Attachment #8690043 - Flags: review?(Jan.Varga)
Attached patch patch 2 - Read Data (obsolete) — Splinter Review
This is the second step where we read the file in each subdir of the storage path.
Attachment #8690044 - Flags: review?(Jan.Varga)
(In reply to Jan Varga [:janv] from comment #11)
> (In reply to Ben Kelly [:bkelly] from comment #10)
> > Wait, why persistent?  Cache API uses default.  I think we need the same or
> > we get the same issue with the reg staying while caches deleted.
> 
> Yeah, however, I don't know about SW that much, is this approach correct
> (remove SW registrations when default/temporary storage data for an origin
> is deleted) ?

I think for now it is the best approach.  Consider a site that sets itself up in its install event, but that involves saving 1GB of data.  Clearly that 1GB of data can be removed, but then the service worker will have the wrong state in its following events.  This could cause unexpected behavior and breakage in production for sites.
This is needed because using QuotaManager we cannot just block the main-thread with a loop if the loading is slow. We use the event loop and for this reason the GetRegistrations() cannot be blocked.
Attachment #8690168 - Flags: review?(josh)
patch rebased on top of part 0.
Attachment #8690043 - Attachment is obsolete: true
Attachment #8690043 - Flags: review?(Jan.Varga)
Attachment #8690172 - Flags: review?(Jan.Varga)
Patch rebased and updated.
Attachment #8690044 - Attachment is obsolete: true
Attachment #8690044 - Flags: review?(Jan.Varga)
Attachment #8690191 - Flags: review?(Jan.Varga)
Attached patch part 3 - WriteData (obsolete) — Splinter Review
Here part 3: Write data into the QuotaManager directory.
Attachment #8690205 - Flags: review?(Jan.Varga)
GetRegistrations was broken.
Attachment #8690168 - Attachment is obsolete: true
Attachment #8690168 - Flags: review?(josh)
Attachment #8690215 - Flags: review?(josh)
I had to add serviceworker.txt in the list of 'accepted' file in the actorParent. I guess you will not be super happy about it.
Attachment #8690205 - Attachment is obsolete: true
Attachment #8690205 - Flags: review?(Jan.Varga)
Attachment #8690216 - Flags: review?(Jan.Varga)
I added a pre-check for mAppURI in nsJARChannel to speed up the FF boot time.
Attachment #8690247 - Flags: review?(josh)
Attachment #8690215 - Attachment is obsolete: true
Attachment #8690215 - Flags: review?(josh)
Comment on attachment 8690247 [details] [diff] [review]
part 0 - nsIServiceWorkerLoadingCallback

Review of attachment 8690247 [details] [diff] [review]:
-----------------------------------------------------------------

Ehsan helped me realized that we can do better than this. Specifically, the fact that we don't need to care in the child process makes it easier to use the existing async capabilities of nsHttpChannel to our advantage. I propose the following:
* revert all the changes that split AsyncOpen
* make nsJARChannel crash if we ever hit the case where we try to call ShouldIntercept before ServiceWorkerRegistrar::IsDataReady() is true (Seriously. This code is destined to be ripped out and will never ship, so we shouldn't be making it any more complicated than it needs to be).
* move the check for ShouldIntercept in AsyncOpen into OpenCacheEntry at the very top of the method, and make the SW callback call OpenCacheEntry
* extend mCacheEntriesToWaitFor by one bit and add a new WAIT_FOR flag for this case, and set the bit if we need to return from OpenCacheEntry because the SW registrations aren't available yet
* add a boolean flag indicating whether mInterceptCache contains a known valid state (ie. we've run ShouldIntercept) and add assertions to any code that uses it to verify that we're not missing anything.

Does that make sense?

::: dom/ipc/TabParent.cpp
@@ +781,5 @@
>  TabParent::LoadURL(nsIURI* aURI)
>  {
>      MOZ_ASSERT(aURI);
>  
> +    // Let's postpoone this loading until SWR is ready.

postpone
Attachment #8690247 - Flags: review?(josh) → review-
Comment on attachment 8690172 [details] [diff] [review]
part 1 - GetStoragePath()

Review of attachment 8690172 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerRegistrar.cpp
@@ +37,5 @@
>  
>  namespace mozilla {
>  namespace dom {
>  
> +using namespace quota;

couple with the "using" above ?

@@ +666,5 @@
>    if (NS_WARN_IF(NS_FAILED(rv))) {
>      return;
>    }
>  
> +  if (NS_WARN_IF(!BackgroundChild::GetOrCreateForCurrentThread(this))) {

You could also check if the actor already exists, see bug 1228932.

@@ +693,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  MOZ_ASSERT(aActor);
> +
> +  // Not we can send a message via IPC to our self in order to start the

Not -> Now ?

@@ +743,5 @@
> +      self->QuotaManagerReady();
> +    }
> +  );
> +
> +  QuotaManager::GetOrCreate(runnable);

Again, you should call QuotaManager::Get() first, if QM already exists then you don't have to create the runnable.

@@ +748,5 @@
> +}
> +
> +void
> +ServiceWorkerRegistrar::QuotaManagerReady()
> +{

Assert the thread here.

@@ +750,5 @@
> +void
> +ServiceWorkerRegistrar::QuotaManagerReady()
> +{
> +  QuotaManager* qm = QuotaManager::Get();
> +  MOZ_ASSERT(qm);

This assertion is not correct and actually the name of the method either.
GetOrCreate() takes a callback which is called if quota manager already exists, creation was successful or creation failed. So you should check |qm| here for null.
This is by design to avoid creating another callback interface.

@@ +763,5 @@
> +  nsCOMPtr<nsIRunnable> runnable =
> +    NS_NewRunnableMethod(this, &ServiceWorkerRegistrar::LoadData);
> +  nsresult rv = target->Dispatch(runnable, NS_DISPATCH_NORMAL);
> +  if (NS_FAILED(rv)) {
> +    NS_WARNING("Failed to dispatch the LoadDataRunnable.");

Just NS_WARN_IF ?
Attachment #8690172 - Flags: review?(Jan.Varga)
Blocks: 1185716
Comment on attachment 8690172 [details] [diff] [review]
part 1 - GetStoragePath()

Review of attachment 8690172 [details] [diff] [review]:
-----------------------------------------------------------------

Something must have changed in the service worker code, I hit the |MOZ_ASSERT(mRegistrationLoaded)| in |ServiceWorkerManager::MaybeStartControlling()|
MaybeStartControlling is called before quota manager is ready.
Comment on attachment 8690172 [details] [diff] [review]
part 1 - GetStoragePath()

Review of attachment 8690172 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerRegistrar.cpp
@@ +756,5 @@
> +  // Note: this will never land... it's 1 patch of many.
> +  printf("WOW: %s\n", NS_ConvertUTF16toUTF8(qm->GetStoragePath(PERSISTENCE_TYPE_DEFAULT)).get());
> +
> +  nsCOMPtr<nsIEventTarget> target =
> +    do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);

I think we can/should use quota manager IO thread now for registration loading. The IO thread is guaranteed to exist while quota manager is alive.

@@ +761,5 @@
> +  MOZ_ASSERT(target, "Must have stream transport service");
> +
> +  nsCOMPtr<nsIRunnable> runnable =
> +    NS_NewRunnableMethod(this, &ServiceWorkerRegistrar::LoadData);
> +  nsresult rv = target->Dispatch(runnable, NS_DISPATCH_NORMAL);

We should protect registration loading by a directory lock. There's now an infrastructure for it. See |Maintenance::OpenDirectory()| in dom/indexedDB/ActorsParent.cpp
Once you have a directory lock for <profile>/storage/default/*/sw, you are guaranteed that origin directories are not deleted while you are working with them.
Comment on attachment 8690191 [details] [diff] [review]
part 2 - ReadData

Review of attachment 8690191 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerRegistrar.cpp
@@ +293,5 @@
>    MOZ_ASSERT(!NS_IsMainThread());
>    MOZ_ASSERT(!mDataLoaded);
>  
> +  LoadDataFromQuotaManager();
> +  LoadDataFromProfile();

Is there a reason to have the code in these separate methods ?
It seems they are only called from LoadData().

@@ +330,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return;
> +  }
> +
> +  bool value;

Nit: s/value/exists

@@ +340,5 @@
> +  if (!value) {
> +    return;
> +  }
> +
> +  rv = storageDir->IsDirectory(&value);

Nit: s/value/isDirectory ? One more boolean on the stack doesn't hurt IMO

@@ +352,5 @@
> +
> +  // For each sub directory of the storage path, we call ReaData.
> +  nsCOMPtr<nsISimpleEnumerator> entries;
> +  rv = storageDir->GetDirectoryEntries(getter_AddRefs(entries));
> +  if (rv != NS_ERROR_FILE_NOT_FOUND && NS_WARN_IF(NS_FAILED(rv))) {

Why do you need to check NS_ERROR_FILE_NOT_FOUND here ?
Did you want to silent the warning for  this specific error.
Is that error ever returned from GetDirectoryEntries() ?

@@ +356,5 @@
> +  if (rv != NS_ERROR_FILE_NOT_FOUND && NS_WARN_IF(NS_FAILED(rv))) {
> +    return;
> +  }
> +
> +  if (rv != NS_ERROR_FILE_NOT_FOUND) {

and here, why is this needed ?

@@ +363,5 @@
> +      nsCOMPtr<nsISupports> entry;
> +      rv = entries->GetNext(getter_AddRefs(entry));
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        return;
> +      }

could be:
MOZ_ALWAYS_SUCCEEDS(entries->GetNext(getter_AddRefs(entry)));

@@ +365,5 @@
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        return;
> +      }
> +
> +      nsCOMPtr<nsIFile> file = do_QueryInterface(entry);

s/file/dir

@@ +372,5 @@
> +      bool isDir;
> +      rv = file->IsDirectory(&isDir);
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        return;
> +      }

MOZ_ALWAYS_SUCCEEDS(file->IsDirectory(&isDir));

@@ +376,5 @@
> +      }
> +
> +      if (!isDir) {
> +        continue;
> +      }

I believe we should create a new subdir for service workers, for example "sw". The loading should be protected by a directory lock as I mentioned previously and that was designed to work with "subdir per client".

@@ +379,5 @@
> +        continue;
> +      }
> +
> +      rv = ReadData(file);
> +      if (rv != NS_ERROR_FILE_NOT_FOUND && NS_WARN_IF(NS_FAILED(rv))) {

If you check NS_ERROR_FILE_NOT_FOUND you also need to check NS_ERROR_FILE_TARGET_DOES_NOT_EXIST which is used on windows I think.

@@ +389,5 @@
> +
> +void
> +ServiceWorkerRegistrar::LoadDataFromProfile()
> +{
> +  // If we are running a old profile version.

s/a/an

@@ +600,5 @@
>  
> +  // TODO for each sub dir...
> +
> +  // If we still have a profile directory serviceworker file, now it's time to
> +  // delete it. The data will be fully written in the correct Quota directories.

I don't like all these LoadDataFromQuotaManager, Quota directories.
Quota Manager manages storage directories and also quota. "Quota directories" sounds strange. I would rather use something like "origin directories"

@@ +745,2 @@
>  
> +  // We keep mProfileDir to be back compatible with previous versions of the SW

backwards ?

@@ +770,5 @@
> +    return;
> +  }
> +
> +  if (exists) {
> +    // We keep this pointer just back-compatibility, but the file in this

"just for backward-compatibility" ?

@@ +843,5 @@
>  
>  void
>  ServiceWorkerRegistrar::QuotaManagerReady()
>  {
> +  // Dispatch a runnable to do the main work on a IO thread.

once you convert it:
"on the QuotaManager IO thread"

::: dom/workers/ServiceWorkerRegistrar.h
@@ +100,5 @@
>  
>    mozilla::Mutex mMutex;
>  
>  protected:
> +  // mData is protected by mMonitor.

s/mMonitor/mMutex

@@ +105,4 @@
>    nsTArray<ServiceWorkerRegistrationData> mData;
> +
> +private:
> +  // mDataLoaded is protected by mMonitor.

s/mMonitor/mMutex
Attachment #8690191 - Flags: review?(jvarga)
Comment on attachment 8690216 [details] [diff] [review]
part 3 - WriteData

Review of attachment 8690216 [details] [diff] [review]:
-----------------------------------------------------------------

I will have to check these changes one more time, but one thing I'm worried about is ..
What happens if we crash in the middle of WriteData. Registrations for some origins may have completed but not all.
And some registration files were not deleted, etc.

I've been thinking about how to do it safely, but I need more time for testing.

::: dom/quota/ActorsParent.cpp
@@ +1508,5 @@
>          }
>  
>          if (leafName.EqualsLiteral(METADATA_FILE_NAME) ||
> +            leafName.EqualsLiteral(DSSTORE_FILE_NAME) ||
> +            leafName.EqualsLiteral(SERVICEWORKERREGISTRAR_FILE)) {

We don't need this once we move the file to a new subdir. The same applies for the rest of the file.

::: dom/workers/ServiceWorkerRegistrar.cpp
@@ +576,5 @@
> +      }
> +
> +      nsCOMPtr<nsIEventTarget> target =
> +        do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);
> +      MOZ_ASSERT(target, "Must have stream transport service");

Again, just like loading, saving has to be protected by a directory lock too, otherwise the saving operation may clash with other quota manager operations.

@@ +578,5 @@
> +      nsCOMPtr<nsIEventTarget> target =
> +        do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);
> +      MOZ_ASSERT(target, "Must have stream transport service");
> +
> +      rv = target->Dispatch(this, NS_DISPATCH_NORMAL);

IO thread

@@ +666,5 @@
> +  nsTArray<nsCOMPtr<nsIFile>> processedDirectories;
> +
> +  // All the keys of the hashtable must be exist as paths!
> +  for (auto iter = aData.ConstIter(); !iter.Done(); iter.Next()) {
> +    nsCOMPtr<nsIFile> dir = SaveDataInQuotaManager(iter.Key(),

SaveDataInQuotaManager() doesn't sound right.
Attachment #8690216 - Flags: review?(jvarga)
Comment on attachment 8690257 [details] [diff] [review]
part 4 - Notification when the origin is cleared

Review of attachment 8690257 [details] [diff] [review]:
-----------------------------------------------------------------

This should be all doable with existing Client:OnOriginClearCompleted()

I spoke with baku, I'll do the directory locking and OnOriginClearCompleted()
Actually I implemented something already.
Attachment #8690257 - Flags: feedback?(jvarga)
I'm also considering to optimize this a little bit more by loading service worker registrations along with the directory scan we do during default/temporary initialization.
I think we can use simpledb for this.
Depends on: 1361330
Depends on: 1407621
No longer depends on: 1361330
Blocks: 1304382
Priority: -- → P2
Assignee: amarchesini → jvarga
Note bug 1450874 trigger rate suggests we are hitting this condition quite a lot.
See Also: → 1450874
Assignee: jvarga → amarchesini
I'm not planning to work on this. janv, do you want to take it?
Assignee: amarchesini → nobody
Flags: needinfo?(jvarga)
Yes, I'm taking it.
Assignee: nobody → jvarga
Flags: needinfo?(jvarga)
We will work on this in Q1 2019.
Assignee: jvarga → nobody
Severity: normal → major
Assignee: nobody → ytausky
Assignee: ytausky → nobody
QA Whiteboard: qa-not-actionable
No longer blocks: 1413615
See Also: → 1413615

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --

IIUC this is still a thing?

Severity: -- → S3
Flags: needinfo?(bugmail)
Priority: P2 → P3

This is still a thing but in general https://searchfox.org/mozilla-central/source/toolkit/components/cleardata/ServiceWorkerCleanUp.jsm as used by Sanitizer.jsm's single-point-of-clearing means that we should always already be doing the right thing, it's just an appropriate refactoring for us to perform to reduce the potential for weird edge-cases and as a means of improving the handling of registration removal in general (especially as it relates to other specs that hang things off of registrations).

Flags: needinfo?(bugmail)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: