Closed Bug 1910627 Opened 7 months ago Closed 6 months ago

delete previous version

Categories

(Core :: Machine Learning, enhancement, P1)

Firefox 130
enhancement

Tracking

()

RESOLVED FIXED
131 Branch
Tracking Status
firefox131 --- fixed

People

(Reporter: tarek, Assigned: atossou)

References

Details

(Whiteboard: [genai])

Attachments

(2 files)

We should delete a previous version of a model when downloading a new version

Priority: -- → P1
Blocks: 1910847
Type: defect → enhancement
Version: unspecified → Firefox 130
Assignee: nobody → atossou
Depends on: 1909508

The patch can't be applied in phab because it's been applied on the latest main and not against the changes from the patch from https://bugzilla.mozilla.org/show_bug.cgi?id=1909508

The solution for this train of patchs to work is to provide a "stack" of patches by commiting both patchs in the same branch and then using moz phab to push them.

see the documentation here https://firefox-source-docs.mozilla.org/contributing/stack_quickref.html

since we only have a few hours left to wrap up 130, let's focus on this patch without the other one that is quite large and risky to land.

For 130 I would like to suggest a less risky approach. Instead of distributing the role of deleting files amongst classes, we can make it a internal implementation detail of the ModelHub class :

e.g. :

diff --git a/toolkit/components/ml/content/ModelHub.sys.mjs b/toolkit/components/ml/content/ModelHub.sys.mjs
index fb47f8ccbed84..2e2e866879588 100644
--- a/toolkit/components/ml/content/ModelHub.sys.mjs
+++ b/toolkit/components/ml/content/ModelHub.sys.mjs
@@ -1107,6 +1107,7 @@ export class ModelHub {
    * @param {string} config.model - The model name (organization/name).
    * @param {string} config.revision - The model revision.
    * @param {string} config.file - The file name.
+   * @param {bool} config.deletePrevious - If true, delete a previous revision if it exists.
    * @param {?function(ProgressAndStatusCallbackParams):void} config.progressCallback A function to call to indicate progress status.
    * @returns {Promise<[ArrayBuffer, headers]>} The file content
    */
@@ -1115,6 +1116,7 @@ export class ModelHub {
     model,
     revision,
     file,
+    deletePrevious,
     progressCallback,
   }) {
     // Make sure inputs are clean. We don't sanitize them but throw an exception
@@ -1129,6 +1131,11 @@ export class ModelHub {

     let useCached;

+    if (deletePrevious) {
+      // TODO  here we can check if we already have this file at a previous revision, and if it's the case,
+      // delete it on the spot.
+    }
+
     // If the revision is `main` we want to check the ETag in the hub
     if (revision === "main") {
       // this can be null if no ETag was found or there were a network error

This approach will avoid introducing file storage maintenance logic outside ModelHub

In that case, I suggest we just push the changes that allow the ability to delete without deleting the files. So the new functions written in modelHub

I didn't do the deletion in modelHub because it doesn't have a global view of the files to delete/update. So it is risky to delete in modelhub

What happens if we delete previous files (say for the models) and then, for some reasons, we cannot download new files (say for the tokenizer).
We come into a situation where the update to new model failed but then we have already invalidate the old models.
That's why I was deleting previous versions, only when we are sure that everything went smoothly with the update.

(In reply to Aristide Tossou from comment #4)

What happens if we delete previous files (say for the models) and then, for some reasons, we cannot download new files (say for the tokenizer).
We come into a situation where the update to new model failed but then we have already invalidate the old models.
That's why I was deleting previous versions, only when we are sure that everything went smoothly with the update.

I understand. For 130, you can do the deletion right after the new file was successfully downloaded then. So at the end of that function. In any case, the pdfjs UI has a delete button that will trigger a full deletion, so we are safe. Here it’s mostly to spare space when we update the model.

Deleting right after a new file was successfully downloaded is still not safe. Because that is just 1 file and there are remaining files.

Basically here is the scenario: We need to download 6 files for the update to go through.

  1. We successfully downloaded update to file 1 and we deleted all previous versions of file 1

  2. Update of file 2 failed..... Now we are in a situation where the update failed. And by this point the existing installation is also broken and can't work.

We need to wait until at least until we are sure the update of all files is successful before deleting any file of the previous versions.

If the UI has a button for a full deletion, that is another argument for just having the possibility; the functionality to delete.
So the UI could make sure that it is able to use the update before starting the deletion.
So the UI decides when to delete and not us.

If we have to do it, let's actually do it after the first request is successful (So I won't need the stacked commits anymore as they will be independent)

Regarding the stacked commits, I did what you suggested and it submitted two phabs. Phabricator is correctly showing the changes unique to each commit. But the remote build job in phabricator is not try to rebase the second commit on top of the first. It is still trying to rebase on top of mozilla-central. Is there a way to tell it to do that on top of the first commit?

If we only need the functionality to delete, then also I don't need stacked commits

(In reply to Aristide Tossou from comment #6)

Deleting right after a new file was successfully downloaded is still not safe. Because that is just 1 file and there are remaining files.

Basically here is the scenario: We need to download 6 files for the update to go through.

  1. We successfully downloaded update to file 1 and we deleted all previous versions of file 1

  2. Update of file 2 failed..... Now we are in a situation where the update failed. And by this point the existing installation is also broken and can't work.

We need to wait until at least until we are sure the update of all files is successful before deleting any file of the previous versions.

But if we moved the version to the new version in Remote settings, it will not use the previous version ever. We could even delete everything prior to the run if we know the model needs an update .

If the UI has a button for a full deletion, that is another argument for just having the possibility; the functionality to delete.
So the UI could make sure that it is able to use the update before starting the deletion.
So the UI decides when to delete and not us.

If we have to do it, let's actually do it after the first request is successful (So I won't need the stacked commits anymore as they will be independent)

Regarding the stacked commits, I did what you suggested and it submitted two phabs. Phabricator is correctly showing the changes unique to each commit. But the remote build job in phabricator is not try to rebase the second commit on top of the first. It is still trying to rebase on top of mozilla-central. Is there a way to tell it to do that on top of the first commit?

This might mean the two commits are not sequenced in the same branch. Let’s have a lot later today

I understand that we will not use previous version ever once the update is successful. But the update has to be successful first.

Basically, it is safer to make sure we have downloaded all the files needed to do the update before removing any previous version. Once we have fully completed the update, then we can delete.

If the version is updated in remote settings, we won’t use the existing version and keep on trying to download the new files. So if one file fails to download, the user will not be able to use the old version anyways, unless the front end explicitly asks for it. Right now, deleting the files upfront has the exact same effect — even if the download fails.

Attachment #9417594 - Attachment description: Bug 1910627 - delete previous version - r?tarek,calixte → Bug 1910627 - delete previous version - r?tarek,calixte
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → 131 Branch
Whiteboard: [genai]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: