Closed Bug 569295 Opened 15 years ago Closed 9 years ago

More considered handling of per-record and total upload sizes (413: client issue: request body too large)

Categories

(Firefox :: Sync, defect, P1)


Tracking


RESOLVED FIXED
mozilla46
Tracking Status
firefox46 --- fixed

People

(Reporter: mconnor, Assigned: markh)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sync:scale])

Attachments

(1 file, 2 obsolete files)

Right now we have MAX_UPLOAD_RECORDS to cap the record count, but since the API + newprod actually support up to 256k per record, an upload can get... large. We should probably add a client cap on the maximum upload size in bytes, in addition to the 100-record cap. These limits should probably also differ based on context (EDGE vs. gigabit ethernet, for example).
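For a rough sense of scale (my arithmetic, not numbers from this bug's data): with only the existing 100-record cap, a worst-case POST body could approach 100 * 256 KB ≈ 25 MB. A byte cap alongside the record cap bounds that, conceptually something like the sketch below, where the name and value of the byte-limit constant are illustrative assumptions, not existing prefs or code.

  // Illustration only - MAX_TOTAL_UPLOAD_BYTES is a hypothetical name/value.
  const MAX_UPLOAD_RECORDS = 100;               // existing per-POST record cap
  const MAX_RECORD_BYTES = 256 * 1024;          // per-record size the API supports
  // Worst case today: 100 * 256 KB ≈ 25 MB in one request body.
  const MAX_TOTAL_UPLOAD_BYTES = 1024 * 1024;   // proposed additional per-POST byte cap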
Please note the operational comments in bug 631414, which need to be taken into consideration when setting the max upload size. I strongly advise that we define a max packet size in API 1.1, so that we can expect compliance from non-Mozilla clients (e.g. Dolphin). Otherwise we're going to end up with API-correct third-party clients that cannot sync.
Blocks: 697352
Note that this would also allow us to log the distribution of record sizes in each upload, and draw this out when we get a 413.
merging this into
(In reply to Richard Soderberg [:atoll] from comment #4)
> merging this into

mistaken comment, sorry
FWIW, Android Sync does upload record batching with both record count and total byte size thresholds, and so won't fall victim to this.
Blocks: 816509, 702710
Priority: -- → P2
Summary: be smarter about payload and total upload sizes → More considered handling of per-record and total upload sizes (413: client issue: request body too large)
Target Milestone: Future → ---
Priority: P2 → --
Whiteboard: [sync:scale]
This patch introduces a RecordUploader object that tracks both the number of records and the number of bytes queued, and uploads the batch whenever either limit is reached. It keeps its own array of already-stringified records (JSON strings) so it can count the bytes itself, skipping the automatic JSON handling done by the Record object. This still needs more tests for the byte limit and for the attempt at elegantly handling a single record that exceeds the byte limit, but I thought I'd get feedback before doing that. All existing tests pass. Richard, what do you think?
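To make the shape of this concrete, here is a minimal sketch of the dual-threshold batching described above - not the attached patch; the names, limits, and the postBatch helper are assumptions:

  // Sketch only. `postBatch` stands in for whatever actually POSTs the batch.
  function RecordUploader(maxRecords, maxBytes, postBatch) {
    this.maxRecords = maxRecords;
    this.maxBytes = maxBytes;
    this.postBatch = postBatch;
    this.queued = [];       // already-stringified records
    this.queuedBytes = 0;   // running byte total of this.queued
  }

  RecordUploader.prototype = {
    enqueue(record) {
      // Stringify once ourselves so we can count bytes, bypassing the
      // automatic JSON handling the Record object would otherwise do.
      let bytes = JSON.stringify(record);
      if (this.queued.length >= this.maxRecords ||
          this.queuedBytes + bytes.length > this.maxBytes) {
        this.flush();
      }
      // Note: a single record larger than maxBytes still gets queued (and
      // posted on its own); better handling of that case is discussed below.
      this.queued.push(bytes);
      this.queuedBytes += bytes.length;
    },

    flush() {
      if (!this.queued.length) {
        return;
      }
      this.postBatch(this.queued);
      this.queued = [];
      this.queuedBytes = 0;
    },
  };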
Attachment #8703027 - Flags: feedback?(rnewman)
Comment on attachment 8703027 [details] [diff] [review]
0001-Bug-569295-limit-the-number-of-bytes-we-attempt-to-u.patch

Review of attachment 8703027 [details] [diff] [review]:
-----------------------------------------------------------------

I like this approach. Presumably you have read
https://dxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/sync/repositories/Server11RepositorySession.java#347 ?

There are dangers here, of course:

* We can throw in situations where we wouldn't have thrown before.
* We might inadvertently change how we process errors (e.g., skipping oversize records) and cause problems.
* Be careful of memory consumption. If you can, discard the record when it's queued, and don't ever concatenate the uploaded string in memory.

This needs thorough tests for the various cases, I think. And it would be worth making sure we unify how records are turned into bytes on the wire, so we don't accidentally have two code paths.

::: services/sync/modules/engines.js
@@ +33,5 @@
> +  this.collection = collection;
> +  // An array of already JSON.stringified records.
> +  this.queued = [];
> +  // The number of bytes in this.queued, plus the RECORD_OVERHEAD. Does *not*
> +  // include the UPLOAD_OVERHEAD.)

s/)//

@@ +42,5 @@
> +}
> +
> +RecordUploader.prototype = {
> +  enqueue(record) {
> +    let bytes = JSON.stringify(record);

I'd like this to be a method on `record` -- after all, a Record is a complex beast with payload, cleartext, and various other fields, and who knows what might result.

@@ +66,5 @@
> +      // nothing queued.
> +      return;
> +    }
> +    // hand-crafted artisinal JSON!
> +    let data = "[" + this.queued.join(",") + "]";

Unfortunately this is the most expensive way to do this -- we'll transiently have at least double the string size, perhaps triple because we don't throw away the queued records until after the upload, and I bet some copying occurs when streaming the bytes.

It might be necessary to use a different stream implementation here, and compute Content-Length in advance.

@@ +1535,5 @@
>     } catch (ex if !Async.isShutdownException(ex)) {
>       this._log.warn("Error creating record: " + Utils.exceptionStr(ex));
>     }
> +    if (ok) {
> +      if (!uploader.enqueue(out)) {

This can throw, if I'm reading it correctly.

@@ +1536,5 @@
>       this._log.warn("Error creating record: " + Utils.exceptionStr(ex));
>     }
> +    if (ok) {
> +      if (!uploader.enqueue(out)) {
> +        this._log.warn("Record is too large to post - ignoring", id);

For some engines this is a serious error. For others it's less serious. For others it's fine to just skip the record. That probably means we should surface this to the engine and let it choose.

For example: a bookmarks folder with too many children will be rejected by the Sync server, and the client _must_ refuse to sync to avoid data loss in the presence of another client -- all of those children will end up in Unsorted Bookmarks, and then that _other_ client won't be able to upload Unsorted Bookmarks for the same reason! Skipping the folder is the worst option.
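To illustrate the "method on `record`" point above, a hypothetical sketch - the method name toWireString and its placement on WBORecord are my assumptions, not something in the tree:

  // Hypothetical: the record serializes itself, so byte-counting callers don't
  // need to know about payload/cleartext internals, and there is exactly one
  // record-to-bytes-on-the-wire code path.
  WBORecord.prototype.toWireString = function () {
    return JSON.stringify(this);
  };

  // The uploader would then do something like:
  //   let bytes = record.toWireString();
  //   this.bytesAlreadyQueued += bytes.length;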
Attachment #8703027 - Flags: feedback?(rnewman) → feedback+
Assignee: nobody → markh
Status: NEW → ASSIGNED
Thanks for the quick review!

(In reply to Richard Newman [:rnewman] from comment #9)
> * We can throw in situations where we wouldn't have thrown before.
> This can throw, if I'm reading it correctly.

It could, but so could the old version in a very similar way IIUC.

* I'm assuming the catch above is *really* about failure to encrypt (ie, that .pushData() failing is not being considered, given it's an array.push()). Aside - that entire catch+ignore block is a little suspect IMO as it could still hit the "Parent folder failing" scenario you mention if encryption failure was data-driven.

* In the original version, the actual posting of the data isn't protected by an exception handler - so this version isn't either - both the .enqueue() and .flush() methods should fail in the same way.

I did make an effort to keep the foreseeable exception semantics the same (and I know we can argue "anything throws"), but it would be great if you can elaborate on your concerns here?

> * We might inadvertently change how we process errors (e.g., skipping
>   oversize records) and cause problems.

Yeah, that's fair - but also difficult :) The sane thing to do here IMO is to simply attempt that single-record upload and let it (probably) fail, then open a new "make engines deal with massive single records upload failures" bug.

> @@ +66,5 @@
> > +      // nothing queued.
> > +      return;
> > +    }
> > +    // hand-crafted artisinal JSON!
> > +    let data = "[" + this.queued.join(",") + "]";
>
> Unfortunately this is the most expensive way to do this -- we'll transiently
> have at least double the string size, perhaps triple because we don't throw
> away the queued records until after the upload, and I bet some copying
> occurs when streaming the bytes.

Yeah - the .join() was a bit cheeky, but I think the only improvement I can make here is to keep the body as a concatenated string and avoid the join. Note also that we don't hold a reference to the queued records, just the stringified version. I think a version avoiding the .join() should be quite a bit better.

Currently we:

* keep all queued record objects in memory.
* Once we have 100 we JSON.stringify() the entire array into a single string.
* We create an nsIStringInputStream from that string, do the post, then drop the references.

(and as bug 1235474 comment 2 mentions, this body is sometimes greater than 80MB)

Whereas now we would have:

* Each record is stringified as it is processed and the reference dropped.
* A single string is concatenated as each record is processed, up to a max length.
* We still create the nsIStringInputStream from that string, do the post, and drop the string.

So we've dropped the references to the records and capped the length of the string. We've probably thrashed the GC a little more, but given that byte limit it still seems like a net win, and I can't see an obvious stream implementation that would help improve that sanely.

Thoughts on the above?
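As a simplified sketch of the two flows being compared (the actual POST and the nsIStringInputStream are elided; MAX_POST_BYTES and the records array are placeholders of my own, not code from the patch):

  const MAX_POST_BYTES = 1024 * 1024;   // placeholder value
  let records = [];                     // the engine's outgoing records

  // Old flow: keep every record object alive, stringify the whole batch at once.
  let queuedRecords = records.slice(0, 100);     // up to 100 objects held in memory
  let oldBody = JSON.stringify(queuedRecords);   // one potentially huge string (80MB+ observed)
  // ... create an nsIStringInputStream from oldBody, POST, then drop references.

  // New flow: stringify each record as it arrives, append, and cap total length.
  let body = "[";
  for (let record of records) {
    let bytes = JSON.stringify(record);          // record reference can be dropped here
    if (body.length > 1 && body.length + bytes.length > MAX_POST_BYTES) {
      // POST body + "]" here, then start a fresh body.
      body = "[";
    }
    body += (body.length > 1 ? "," : "") + bytes;
  }
  // POST the final body + "]".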
Flags: needinfo?(rnewman)
This is what I had in mind. It still has no tests, but has the following changes:

* Keeps a single string with the current JSON representation of the post.
* No longer skips single records that are too large - we just post them anyway.
* Renames the object to a PostQueue(), which is obtained from |collection.newPostQueue()|.
* |collection.post()| now throws, telling you to use a PostQueue instead.
* Removes |collection.pushData()| and |collection.clearRecords()|.

Obviously still a WIP, but early feedback is the best kind :) Thanks!
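A sketch of how an engine would drive this shape of API, per the list above - the callback arguments and the surrounding variables are assumptions, not code from the patch:

  // `collection` is the engine's Collection object; `log` is its logger.
  let outgoingRecords = [];   // records the engine has built for upload
  let postQueue = collection.newPostQueue(log, response => {
    // Invoked once per POST actually made - possibly several times per sync.
    if (!response.success) {
      log.warn("Batch upload failed with status " + response.status);
    }
  });

  for (let record of outgoingRecords) {
    postQueue.enqueue(record);   // may trigger a POST if a limit is reached
  }
  postQueue.flush();             // POST whatever is still queued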
Attachment #8703027 - Attachment is obsolete: true
Attachment #8704915 - Flags: feedback?(rnewman)
Flags: firefox-backlog+
Priority: -- → P1
(In reply to Mark Hammond [:markh] from comment #10)
> Aside - that entire catch+ignore block is a little suspect
> IMO as it could still hit the "Parent folder failing" scenario you mention
> if encryption failure was data-driven.

Yup, and that still worries me. It's just not safe to put _most_ of a user's bookmark records on the server; we should fail hard when we can.

> I did make an effort to keep the foreseeable exception semantics the same
> (and I know we can argue "anything throws"), but it would be great if you
> can elaborate on your concerns here?

I think I was looking at how

  up.pushData(out);

was inside a try..catch, and now

  postQueue.enqueue(out);

is outside it. But of course, pushData was safe.

> Yeah, that's fair - but also difficult :) The sane thing to do here IMO is
> to simply attempt that single-record upload and let it (probably) fail, then
> open a new "make engines deal with massive single records upload failures"
> bug.

Yup. I don't think we should solve single large records in this bug.

> I think a version avoiding the .join() should be quite a bit better.
> * Each record is stringified as it is processed and the reference dropped.
> * A single string is concatenated as each record is processed, up to a max
>   length.

That makes sense to me. When I talk about doing this without concatenation, I'm thinking of ropes: https://en.wikipedia.org/wiki/Rope_%28data_structure%29

It looks like SpiderMonkey uses ropes internally in some cases: http://mxr.mozilla.org/mozilla-central/source/js/src/vm/String.cpp#628 so I might be over-thinking this, and with your change to just maintain the concatenated string this is as good as it can get.
This patch has tests and (I believe) addresses your previous outstanding comments. Note that I did *not* change the semantics around the possibility of the .encrypt failing and thus leaving us in the "orphaned bookmarks" case - so the semantics should be identical in that case. I'll open another bug for handling this scenario.
Attachment #8704915 - Attachment is obsolete: true
Attachment #8704915 - Flags: feedback?(rnewman)
Flags: needinfo?(rnewman)
Attachment #8706226 - Flags: review?(rnewman)
Comment on attachment 8706226 [details] [diff] [review]
0001-Bug-569295-limit-the-number-of-bytes-we-attempt-to-u.patch

Review of attachment 8706226 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/record.js
@@ +641,5 @@
> +function PostQueue(poster, log, postCallback) {
> +  // The "post" function we should use when it comes time to do the post.
> +  this.poster = poster;
> +  this.log = log;
> +  // The callback we make with the response when we do get around to making the

Could we get some newlines before each comment? Kinda hard for me to read.

@@ +643,5 @@
> +  this.poster = poster;
> +  this.log = log;
> +  // The callback we make with the response when we do get around to making the
> +  // post (which could be during any of the enqueue() calls or the final flush())
> +  this.postCallback = postCallback;

Note that postCallback can be called multiple times, and that the callback should not synchronously add more items to the queue!

@@ +678,5 @@
> +      this.flush();
> +    }
> +    if (this.queued) {
> +      // there's already at least 1 record queued, so need a trailing ","
> +      this.queued += "," + bytes;

Append both times with no concatenations first. And use 'numQueued' to decide whether we're the first record. And thus no need for a full-fledged 'if':

  // Either a ',' or a '[' depending on whether this is the first record.
  this.queued += this.numQueued ? "," : "[";
  this.queued += bytes;
  this.numQueued++;

@@ +687,5 @@
> +    this.numQueued += 1;
> +  },
> +
> +  flush() {
> +    if (this.queued.length == 0) {

  if (!this.queued) {

@@ +697,5 @@
> +      let response = this.poster(this.queued + "]");
> +      this.postCallback(response);
> +    } finally {
> +      this.queued = "";
> +      this.numQueued = 0;

I think you should capture these and clear them before running the callback:

  let queued = this.queued + "]";
  this.queued = "";
  this.numQueued = 0;
  this.postCallback(this.poster(queued));
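Pulling those suggestions together, the enqueue/flush pair would end up roughly like this - a sketch of the reviewed direction rather than the exact landed code, with MAX_UPLOAD_BYTES assumed as the name of the byte limit:

  enqueue(record) {
    let bytes = JSON.stringify(record);
    // Flush first if adding this record would exceed either limit.
    if (this.numQueued >= MAX_UPLOAD_RECORDS ||
        this.queued.length + bytes.length >= MAX_UPLOAD_BYTES) {
      this.flush();
    }
    // Either a ',' or a '[' depending on whether this is the first record.
    this.queued += this.numQueued ? "," : "[";
    this.queued += bytes;
    this.numQueued++;
  },

  flush() {
    if (!this.queued) {
      return;
    }
    // Capture and reset state *before* running the callback, so the callback
    // never observes a half-cleared queue.
    let queued = this.queued + "]";
    this.queued = "";
    this.numQueued = 0;
    this.postCallback(this.poster(queued));
  },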
Attachment #8706226 - Flags: review?(rnewman) → review+
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox