Closed Bug 732152 Opened 12 years ago Closed 12 years ago

Implement unified storage service client

Categories

(Cloud Services :: Firefox: Common, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla16

People

(Reporter: gps, Assigned: gps)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 3 obsolete files)

Currently in the Sync code base the code for interacting with the storage service is scattered all over the place (service.js, record.js, rest.js, policies.js). We are using two separate HTTP request libraries (Resource/AsyncResource and the better RESTRequest). A lot of the interacting is spinning the event loop via the synchronous Resource. In short, it is a giant mess.

I'd like to restore sanity to this mess.

The attached patch is my first stab at it.

Things are implemented in two new files, storageservice.js and client.js. storageservice.js contains a generic implementation of a client for the storage service. There is nothing Sync-specific about this client, so it could be used by others against a non-Sync deployment of the storage service. client.js implements Sync-specific extensions to the generic library and provides basic integration with the existing Sync code base (calling Service, Status, SyncScheduler, etc).

Most of my work so far has been in the generic client. But, that doesn't mean it is perfect. Far from it. There are still a number of corner cases and API warts that need addressed. So far all of my work has been on the simple GET APIs. I haven't even tacked POST/PUT or streaming responses yet. But, I thought things were far enough along that I could ask for some feedback before I commit too much and refactoring becomes more painful.

There is plenty of documentation to help with the feedback. However, some of the design decisions were not captured. I'll gladly answer those in this bug or IRC. But, I will explicitly state the goals of what I'm trying to accomplish.

* P1 Aggregate all network interaction with Sync server in one logical place so it is easier to find, comprehend, debug, and maintain.
* P1 Provide asynchronous APIs for all network requests.
* P2 Facilitate cleaner server interaction code in the core Sync APIs so it is easier to read and comprehend.
* P2 Write code that can easily fit in with the future asynchronous rewrite
* P3 Differentiate between the storage service and Sync's use of it so others could use the storage service client and so we clearly separate the Sync logic from the storage service logic.

Please keep in mind I'm asking for feedback only. I'm well aware of many problems and areas for improvement. These include:

* Formalizing and cleaning the API on StorageServiceRequest. I think it's a mess and could look a lot prettier.
* Capturing the integer error code returned on some server errors.
* Ensuring server errors are all handled properly.
* Support streaming of responses.
* Adding useful APIs to the client like getInProgressRequests(), cancelAllRequests(). Perhaps also include simple metrics which we can drop in logs or integrate with Syncorro (request times, unexpected status codes, etc).
* Implement remaining APIs.
* MOAR TESTS
Attachment #602074 - Flags: feedback?(rnewman)
Attachment #602074 - Flags: feedback?(philipp)
I made many misc improvements.
Attachment #602074 - Attachment is obsolete: true
Attachment #603525 - Flags: feedback?(rnewman)
Attachment #603525 - Flags: feedback?(philipp)
Attachment #602074 - Flags: feedback?(rnewman)
Attachment #602074 - Flags: feedback?(philipp)
(In reply to Gregory Szorc [:gps] from comment #1)

> I made many misc improvements.

Then it's a good job I didn't get around to reviewing it yet! :D
Assignee: nobody → gps
Component: Firefox Sync: Backend → Firefox: Common
QA Contact: sync-backend → firefox-common
Blocks: 744320
Depends on: 744323
Blocks: 744325
Comment on attachment 603525 [details] [diff] [review]
Start of standalone client implementation

Review of attachment 603525 [details] [diff] [review]:
-----------------------------------------------------------------

I don't have much time to review a big patch like this atm, but if you want my feedback, let's schedule a meeting and walk through the code together.
Attachment #603525 - Flags: feedback?(philipp)
Comment on attachment 603525 [details] [diff] [review]
Start of standalone client implementation

This code is old and bitrotted.
Attachment #603525 - Flags: feedback?(rnewman)
Hah, I was going to ping you about that this morning, see if you had an updated patch. Sorry for the 90-day wait! :D
Attached patch Getting closer (obsolete) — Splinter Review
I /think/ I've implemented all the HTTP APIs now. I still need to perform one more pass on the code to tidy things up. I still think the onComplete() API could be improved w.r.t. error checking. It's kinda confusing, if you ask me.

This patch was hackily lifted from Git. The r? patch will be completely proper.

I'm asking for feedback on the general API. No need to do a normal rnewman f? (which is effectively a r). Maybe just look at the test code and inline examples and comment on the API.
Attachment #603525 - Attachment is obsolete: true
Attachment #630640 - Flags: feedback?(rnewman)
Comment on attachment 630640 [details] [diff] [review]
Getting closer

Review of attachment 630640 [details] [diff] [review]:
-----------------------------------------------------------------

This is the sexiest patch ever. I feel like I was swimming in a luxurious bath of documentation and tests.

::: services/common/storageservice.js
@@ +69,5 @@
> + * @param id
> + *        (string) ID of BSO. Can be null.
> + *        (string) Collection BSO belongs to. Can be null;
> + */
> +function BasicStorageObject(id=null, collection=null) {

Ooh, fancy.

@@ +81,5 @@
> +  data: null,
> +
> +  // At the time this was written, this was not adopted by Harmony. It could
> +  // break in the future. We have an xpcshell test that will break if
> +  // SpiderMonkey breaks, just in case.

+1

@@ +114,5 @@
> +   * Sets the modified time of the BSO in milliseconds since Unix epoch.
> +   *
> +   * Please note that if this value is sent to the server it will be ignored.
> +   * The server will use its time at the time of the operation when storing the
> +   * BSO.

Man, you're earning major brownie points.

@@ +123,5 @@
> +
> +  get sortindex() {
> +    if (this.data.sortindex) {
> +      return this.data.sortindex;
> +    }

You probably want

  if (this.data) {
    return this.data.sortindex || 0;
  }

  return 0;

@@ +315,5 @@
> + *
> + *   // Install an onComplete handler to be executed when response is ready:
> + *   request.onComplete = function onComplete(error, request) {
> + *     // Do something.
> + *   };

Alternative API design:

  let handler = {
    onComplete: function onComplete(…) {
      …
    }
  };
  request.handler = handler;

The advantage of this? You can reuse the handler. It can be a real object holding state to encourage this, too. Give it a shot; rewrite a couple of your tests to instantiate a test handler class with some parameter:

  request.handler = new BSOCollectorHandler();
  request.dispatch();
  console.log(request.handler.bsos);

Even better if `request.dispatch` returns your handler, because you can chain:

  // Ensure the server has three records.
  do_check_eq(3, request.dispatch(new BSOCollectorHandler()).length);

@@ +329,5 @@
> + *   });
> + *
> + * In both of the above example, the two `request` variables are identical. The
> + * original `StorageServiceRequest` is passed into the callback so callers
> + * don't need to rely on closures.

+1

@@ +1167,5 @@
> +    if (!listener) {
> +      throw new Error("listener argument must be an object.");
> +    }
> +
> +    if (!this._listeners.some(function(i) { return i == listener; })) {

Y'know, you probably still want to use indexOf here. You're incurring N function calls just to run ==.

@@ +1168,5 @@
> +      throw new Error("listener argument must be an object.");
> +    }
> +
> +    if (!this._listeners.some(function(i) { return i == listener; })) {
> +      this._listeners.push(listener);

Isn't this a perfect opportunity to use a Set?

@@ +1196,5 @@
> +        if (name in listener) {
> +          listener[name].apply(listener, args);
> +        }
> +      } catch (ex) {
> +        this._log.warn("Listener threw an exception during " + name + ":" + ex);

Space after colon.

@@ +1389,5 @@
> +    if (!id) {
> +      throw new Error("id argument must be defined.");
> +    }
> +
> +    let uri = this._baseURI + "storage/" + collection + "/" + id;

Escaping?

@@ +1441,5 @@
> +   *   });
> +   *
> +   * TODO [gps] This API feels a bit clunky. BSOs can hold collection and ID.
> +   * Why the redundancy? Should we just require the BSO has collection and ID
> +   * set and make this setBSO(bso)?

I prefer that, yes. It parallels the direction in which we've been moving for the test JS Sync server. If you really want, provide APIs that store (bso) and (coll, id, contents).

::: services/common/tests/unit/test_storageservice_client.js
@@ +311,5 @@
> +
> +  let request = client.getCollectionCounts();
> +  request.locallyModifiedVersion = now - 10;
> +  request.dispatch(function onComplete(error, req) {
> +    do_check_null(error);

Makes you want blocks, doesn't it?

@@ +420,5 @@
> +  });
> +
> +  let request = client.getCollection("testcoll");
> +  request.full = true;
> +  // TODO I think the server is wrong. Spec says strictly newer.

Resolve this with rfkelly, please.

@@ +773,5 @@
> +    server.stop(run_next_test);
> +  });
> +});
> +
> +/* TODO need support in JS server.

File bug.
Attachment #630640 - Flags: feedback?(rnewman) → feedback+
(In reply to Richard Newman [:rnewman] from comment #7)

> Alternative API design:
> ...

I'll give it a shot!

> @@ +1167,5 @@
> > +    if (!listener) {
> > +      throw new Error("listener argument must be an object.");
> > +    }
> > +
> > +    if (!this._listeners.some(function(i) { return i == listener; })) {
> 
> Y'know, you probably still want to use indexOf here. You're incurring N
> function calls just to run ==.

... yeah. This is me being silly.

> @@ +1168,5 @@
> > +      throw new Error("listener argument must be an object.");
> > +    }
> > +
> > +    if (!this._listeners.some(function(i) { return i == listener; })) {
> > +      this._listeners.push(listener);
> 
> Isn't this a perfect opportunity to use a Set?

In theory. Unfortunately, Set is not currently iterable! I've been pestering the JS team about this since the FX work week. It's a deal-breaker for using Set and Map in many real-world scenarios.

> @@ +1389,5 @@
> > +    if (!id) {
> > +      throw new Error("id argument must be defined.");
> > +    }
> > +
> > +    let uri = this._baseURI + "storage/" + collection + "/" + id;
> 
> Escaping?

Good question! If I were dealing with filesystem access, I definitely would. But, there shouldn't be a security issue here.

Since `collection` and `id` should be URI safe strings (that's what they are defined to be - correct me if I'm wrong), I think the "garbage in, garbage out" rule applies here.

If you insist, I can validate these strings on every call.

Alternatively, I could provide some kind of higher-level API that exposes collection interaction. If I did that, I'd just create an instance from a collection name and do the validation once. Not sure if that is worth it, though.
 
> @@ +1441,5 @@
> > +   *   });
> > +   *
> > +   * TODO [gps] This API feels a bit clunky. BSOs can hold collection and ID.
> > +   * Why the redundancy? Should we just require the BSO has collection and ID
> > +   * set and make this setBSO(bso)?
> 
> I prefer that, yes. It parallels the direction in which we've been moving
> for the test JS Sync server. If you really want, provide APIs that store
> (bso) and (coll, id, contents).

Works for me! It's a shame JS doesn't support named arguments. setBSO(bso=bso) or setBSO(collection="coll", id="foo", payload="payload"). We could even use the BSO properties when available and override selective fields from the arguments: setBSO(bso=bso, sortindex=15). One can dream.
(In reply to Richard Newman [:rnewman] from comment #7)
> @@ +315,5 @@
> > + *
> > + *   // Install an onComplete handler to be executed when response is ready:
> > + *   request.onComplete = function onComplete(error, request) {
> > + *     // Do something.
> > + *   };
> 
> Alternative API design:
> 
>   let handler = {
>     onComplete: function onComplete(…) {
>       …
>     }
>   };
>   request.handler = handler;
> 
> The advantage of this? You can reuse the handler. It can be a real object
> holding state to encourage this, too.

So far so good. I like this alternative. Specifically, I like it because it doesn't force closures on people. Consumers could even set their `this` as the handler (if it implements the proper interface).

> Give it a shot; rewrite a couple of
> your tests to instantiate a test handler class with some parameter:
> 
>   request.handler = new BSOCollectorHandler();
>   request.dispatch();
>   console.log(request.handler.bsos);
>
> Even better if `request.dispatch` returns your handler, because you can
> chain:
> 
>   // Ensure the server has three records.
>   do_check_eq(3, request.dispatch(new BSOCollectorHandler()).length);

Umm, so what about the async bits? There needs to be a callback *somewhere*. Surely you aren't advocating spinning the event loop!
(In reply to Gregory Szorc [:gps] from comment #8)

> > Isn't this a perfect opportunity to use a Set?
> 
> In theory. Unfortunately, Set is not currently iterable! I've been pestering
> the JS team about this since the FX work week. It's a deal-breaker for using
> Set and Map in many real-world scenarios.

Bleh.

> Since `collection` and `id` should be URI safe strings (that's what they are
> defined to be - correct me if I'm wrong), I think the "garbage in, garbage
> out" rule applies here.

Fine by me; just make sure it's documented.

> Alternatively, I could provide some kind of higher-level API that exposes
> collection interaction. If I did that, I'd just create an instance from a
> collection name and do the validation once. Not sure if that is worth it,
> though.

We'll see what drops out in use. It might be worth doing something like we do in the test code -- grabbing a collection proxy by name. No rush.

(In reply to Gregory Szorc [:gps] from comment #9)

> Umm, so what about the async bits? There needs to be a callback *somewhere*.
> Surely you aren't advocating spinning the event loop!

Was just doodling :D More likely you'd have something like we have for Android Sync:

  request.dispatch(new EnsureFetchHandler(expected, run_next_test));

with the fetch handler doing all of the collecting, comparing, and continuing for you.
OK, so I rewrote things to support handlers. I like the advanced user case (common case?). I don't like the verbosity in some code (like the unit tests). One extra object just to install a callback?!

I think I'm going to implement things so `handler` can be either a Function or an Object. If it is a Function, it will be treated as an `onComplete` function. If not a Function, it will be treated as a full-fledged handler object. Internally, I'll just convert the Function to an Object with a single `onComplete` property at handler install. This makes the internal code sane while making the external API simpler. Win win, I think.
(In reply to Gregory Szorc [:gps] from comment #11)
> OK, so I rewrote things to support handlers. I like the advanced user case
> (common case?). I don't like the verbosity in some code (like the unit
> tests). One extra object just to install a callback?!

Yeah, it's only a win if you can reuse.

> I think I'm going to implement things so `handler` can be either a Function
> or an Object. If it is a Function, it will be treated as an `onComplete`
> function. If not a Function, it will be treated as a full-fledged handler
> object. Internally, I'll just convert the Function to an Object with a
> single `onComplete` property at handler install. This makes the internal
> code sane while making the external API simpler. Win win, I think.

I'd be inclined to do this:

  r.dispatch();    // Uses r.handler.
  r.dispatch(cb);  // Uses cb as `onComplete`.

with the fallback callbacks for the latter being inherited from `r.handler`.

In other words: we already have two different ways to invoke with callback logic; let's use them.
(In reply to Richard Newman [:rnewman] from comment #12)
> I'd be inclined to do this:
> 
>   r.dispatch();    // Uses r.handler.
>   r.dispatch(cb);  // Uses cb as `onComplete`.
> 
> with the fallback callbacks for the latter being inherited from `r.handler`.

This is essentially what I have done. The difference is that dispatch(cb) will overwrite any existing handler and that dispatch(arg) can install a handler itself. The latter is easy enough to fix: validate the argument. The former is slightly more difficult.

If you really want it, I can give it to you. But, your solution will be slightly more complicated, as it requires tracking multiple variables internally. I'd rather KISS until we need this. The API can be preserved easily enough. Besides, the global callbacks on the client provide a mechanism for common event listeners.
(In reply to Gregory Szorc [:gps] from comment #13)

> This is essentially what I have done. The difference is that dispatch(cb)
> will overwrite any existing handler and that dispatch(arg) can install a
> handler itself. The latter is easy enough to fix: validate the argument. The
> former is slightly more difficult.

The main thing I want to avoid is the function/object heuristic. I'd rather see callers explicitly set `.handler`, preserving the argument for callback function usage, or even have:

  function dispatch(cb, handler)

and stick some nulls in there if you're calling with a handler. I'm trying to avoid the obvious future bug here.

I'm OK with a callback argument overwriting the current handler, which I think keeps things simple internally.
Unleash the hounds.
Attachment #630640 - Attachment is obsolete: true
Attachment #633196 - Flags: review?(rnewman)
Comment on attachment 633196 [details] [diff] [review]
Implement standalone client, v1

Review of attachment 633196 [details] [diff] [review]:
-----------------------------------------------------------------

I am as confident as I can be (knackered and relying on interdiff!) that this is good enough to land.

We can thrash out any remaining bugs later.

::: services/common/storageservice.js
@@ +244,5 @@
> + *   conflict - True if the server reported that a resource being operated on
> + *     was in conflict. If this occurs, the client should typically wait a
> + *     little and try the request again.
> + *
> + *   requestToolarge - True if the request was too large for the server. If

Argh. Can we have `requestTooLarge`, please?

@@ +770,5 @@
> +
> +      try {
> +        this._handler.onComplete(this._error, this);
> +      } catch (ex) {
> +        this._log.warn("Exception when invoking handler's onComplete: " + ex);

exceptionStr?

@@ +930,5 @@
> + *
> + * By default, requests are issued in "streaming" mode. As the client receives
> + * data from the server, it will invoke the caller-supplied onBSORecord
> + * callback for each record as it is ready. When all records have been received,
> + * it will invoke onComplete like normal. To change this behavior, modify the

s/like normal/as normal.
Attachment #633196 - Flags: review?(rnewman) → review+
https://hg.mozilla.org/services/services-central/rev/dbb9abec1fea
Status: NEW → ASSIGNED
Whiteboard: [fixed in services]
Target Milestone: --- → mozilla16
Blocks: 769864
https://hg.mozilla.org/mozilla-central/rev/dbb9abec1fea
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in services]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: