Bug 1910613 Comment 0 Edit History

Handling upload API requests _can_ be extremely costly in time.

We currently use rate limiting, but this is a bad solution and causes excess 429s. We really want to use connection limiting which ...

STUB -- fill out later.
Handling upload API requests _can_ be extremely costly in time. Because of this, a single instance can get swamped with upload requests that take a long time to process. That causes the instance to _look_ unhealthy, and new connections from nginx to gunicorn result in HTTP 5xx errors.

To account for that, when Tecken was set up, rate limiting was added in nginx to restrict the number of uploads per time period for a given instance. Some uploads are small and get handled very quickly. Some uploads are large and take a long time to process. The heavy-handed approach of rate limiting doesn't take this difference into account at all, so we end up with instances that reject upload requests even though they aren't really doing anything.

Sven tested out switching from rate limiting to connection limiting, where nginx restricts the number of simultaneous upload requests. If we set the limit to `(gunicorn workers - 2)` or something like that, we think that'll improve upload handling in prod and vastly improve upload handling in stage, where we're running system tests.
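
For illustration, connection limiting in nginx could look roughly like the sketch below. This is not the actual Tecken configuration: the zone name, server name, upload path, upstream address, and worker count are all assumptions made up for the example. It uses the standard `limit_conn_zone` and `limit_conn` directives, keyed on `$server_name` so the limit applies to the whole instance rather than to each client.

```nginx
# Hypothetical sketch. Names, paths, addresses, and numbers are assumptions.
http {
    # Track connections under a single key per server, so the limit covers
    # all upload requests handled by this instance.
    limit_conn_zone $server_name zone=uploads:1m;

    server {
        listen 80;
        # Must be non-empty: requests with an empty key are not counted.
        server_name tecken.example;

        location /upload/ {
            # Assuming 5 gunicorn workers: 5 - 2 = 3 simultaneous uploads.
            limit_conn uploads 3;
            # Reply with 429 rather than the default 503 when at the limit.
            limit_conn_status 429;

            proxy_pass http://127.0.0.1:8000;
        }
    }
}
```

With something like this, a burst of small uploads is never rejected as long as fewer than the configured number are in flight, and a few slow uploads still leave workers free for other requests.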

This bug covers making the changes in our AWS and GCP nginx configuration.
