Closed Bug 963038 Opened 8 years ago Closed 7 years ago

Datastore: Transactional semantics

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla31

People

(Reporter: baku, Assigned: baku)

References

Details

Attachments

(1 file)

In the current implementation of DataStore API nothing protects from data lost between a get() and a put(). Think about this piece of code:

store.get(42).then(function(obj) {
  obj.newValue = 'hello world';
  store.puts(obj, 42);
});

Between the get() and the put(), another app could have changed the obj value.
The proposal for this issue is here:

interface DataStore : EventTarget {
  ...

  // Get returns a single DataStoreRecord object or an array of them.
  // Promise<any>
  Promise get(DataStoreKey... id);

  // Promise<void>
  Promise put(any obj, DataStoreKey id, optional DOMString revisionId);

  // Promise<DataStoreKey>
  Promise add(any obj, optional DataStoreKey id,
              optional DOMString revisionId);

  // Promise<boolean>
  Promise remove(DataStoreKey id, optional DOMString revisionId);

  ...
};

// This dictionary is the return value of the get() method.
dictionary DataStoreRecord {
  DOMString revisionId;
  DataStoreKey id;
  any value;
};

If revisionId is passed in the add()/put()/remove() the operation fails if the last revisionId of the object and the passed revisionId do not match.
Haven't we basically recreated IndexedDB at this point?  I thought the purpose of DataStore was to replace LocalStorage?
No, the purpose of DataStore is to share data between apps. It still much easier to use than idb and this issue is just about avoid a race condition/data lost.

But before starting the implementation, I would like to receive feedbacks.
Ben, Gene, I'd love to have your thoughts on this.
Flags: needinfo?(gene.lian)
Flags: needinfo?(bent.mozilla)
Yeap, the intention makes sense to me. Whenever an app wants to modify the record in the data store, it has to make sure it already had that record up-to-date, so that it's allowed to modify that based on the latest context to avoid data-loss. However, I'd have some concerns:

1. Does it mean any app can still cheat to input the latest revision ID to change data, even if the app's revision ID is not really up-to-date? Is that harmful?

2. clear() might also need this. Otherwise, an app would mistakenly clear some newly-added records that are not really sync'ed in the app's local DB.

3. I don't understand why we need to return the following for the get():

  // This dictionary is the return value of the get() method.
  dictionary DataStoreRecord {
    DOMString revisionId;
    DataStoreKey id;
    any value;
  };

What we only need is to input revisionId in the add()/put()/remove(). If the app's local revision is not up-to-date, then the promise will be rejected and the app will have to use sync() or listen to onchange to update its revision. Then, try add()/put()/remove() again until succeed.
Flags: needinfo?(gene.lian)
In general I don't think making an add/put/remove operation fail *sometimes* is wise. Error handling is hard and I don't think this proposal is likely to make web authors write less racy code.

I'd rope sicking in here, he loves thinking about this kind of stuff...
Flags: needinfo?(bent.mozilla)
> 1. Does it mean any app can still cheat to input the latest revision ID to change data, even if the app's > revision ID is not really up-to-date? Is that harmful?

That mirrors what my feedback was on the mailing list, I expect everyone to get this wrong while its optional.

> 
> 3. I don't understand why we need to return the following for the get():
> 
>   // This dictionary is the return value of the get() method.
>   dictionary DataStoreRecord {
>     DOMString revisionId;
>     DataStoreKey id;
>     any value;
>   };
> 
> What we only need is to input revisionId in the add()/put()/remove(). If the
> app's local revision is not up-to-date, then the promise will be rejected
> and the app will have to use sync() or listen to onchange to update its
> revision. Then, try add()/put()/remove() again until succeed.

I dont understand what you mean, are you talking about putting the store global revisionId? as a general process an application reads an object, updates it and attempts to save, the revision is passed through those operations
> 3. I don't understand why we need to return the following for the get():

The idea is that if record A is changed by a 3rd app, my app can still change record B.
But yes, maybe a global revisionID is enough.
(In reply to Dale Harvey (:daleharvey) from comment #6)
> > 1. Does it mean any app can still cheat to input the latest revision ID to change data, even if the app's > revision ID is not really up-to-date? Is that harmful?
> 
> That mirrors what my feedback was on the mailing list, I expect everyone to
> get this wrong while its optional.

So are you suggesting that we should require this argument for any operation which modifies data?


(In reply to Andrea Marchesini (:baku) from comment #7)
> > 3. I don't understand why we need to return the following for the get():
> 
> The idea is that if record A is changed by a 3rd app, my app can still
> change record B.
> But yes, maybe a global revisionID is enough.

Well, that global revisionID is already exposed, and it can be used if the app wants to know if any other app has modified anything else in the DataStore in the mean time.
> So are you suggesting that we should require this argument for any operation which modifies data?

Yes, although see [1]

> Well, that global revisionID is already exposed, and it can be used if the app wants to know if any other app has modified anything else in the DataStore in the mean time.

I am currently doing this manually using the global revisionId and its causing quite a lot of 'transaction' retries

[1] Theres an alternative, if we are ever expecting Datastore to have multiple (possible concurrent) writers then it will need to deal with the ability to handle conflicts, if it needs to have the ability to handle conflicts then it would likely be better to not actually reject conflicting writes, but to resolve them on read or asynchronously instead
I have a fake transaction manager in the place that writes places to prevent me from overwriting data, however it absolutely assumes that the callback from a successful datastore operation will happen before and subsequent datastore operations happen

https://gist.github.com/daleharvey/9268112

That isnt true, I have modified it to do 

    editPlace: function(url, fun) {
      var self = this;
      var rev = this.dataStore.revisionId;
      return new Promise(function(resolve) {
        self.dataStore.get(url).then(function(place) {
          fun(place, function(newPlace) {
            if (self.writeInProgress || self.dataStore.revisionId !== rev) {
              return self.editPlace(url, fun);
            }
            self.writeInProgress = true;
            self.dataStore.put(newPlace, url).then(function() {
              self.writeInProgress = false;
              resolve();
            });
          });
        });
      });
    },

instead
I'll work on this once DataStore in C++ is fully landed.
Attachment #8402631 - Flags: review?(ehsan)
https://tbpl.mozilla.org/?tree=Try&rev=730ad535c932

built on top of DataStoreService in C++.
Comment on attachment 8402631 [details] [diff] [review]
transaction.patch

Review of attachment 8402631 [details] [diff] [review]:
-----------------------------------------------------------------

Can you please update the wiki page with the transactional semantics?  Thanks!
Attachment #8402631 - Flags: review?(ehsan) → review+
https://hg.mozilla.org/mozilla-central/rev/0f97729f261e
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.