Formalize retry and backoff for Bagheera

RESOLVED WONTFIX

Status

Mozilla Metrics
Metrics Data Ping
5 years ago
3 years ago

People

(Reporter: gps, Unassigned)

Tracking

unspecified
Moved to JIRA

Details

(Whiteboard: [JIRA METRICS-1182])

(Reporter)

Description

5 years ago
Bagheera currently lacks a formal backoff and retry mechanism. In order to curtail the thundering herd, we need to document expected behavior of clients when they encounter a slow, non-responsive, or offline server.

I submit the Sync server specification as an example: http://docs.services.mozilla.com/storage/apis-2.0.html#response-headers. You'll see it defines HTTP response headers that tell the client how to react if the server is under load. Bagheera will need something similar.

Since Bagheera is simple (unlike Sync), I think use of Retry-After should be sufficient (I doubt we need an actual backoff header).

It's possible the generation of these headers resides not with Bagheera itself but in load balancers in front of it. We will almost certainly need Operations to modify load balancers to emit specific headers in the case where all Bagheera origin servers are down. That work will be tracked in another bug. CC'ing Ravi so he knows it's coming.
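As an illustration of the client side of this proposal, here is a minimal sketch of how a client might interpret a Retry-After value. HTTP allows the header to carry either a number of seconds or an HTTP date. The function name `retry_after_seconds` and the choice to return None for a missing or unparseable value are assumptions made for this example, not part of Bagheera or of any existing client.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Return the delay in seconds requested by a Retry-After value.

    Hypothetical helper: returns None when the header is absent or
    cannot be parsed, so the caller can fall back to its own schedule.
    """
    if value is None:
        return None
    value = value.strip()

    # Form 1: delta-seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return int(value)

    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        when = when.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry immediately".
    return max(0, int((when - now).total_seconds()))
```

A caller would pass the raw header value from a 503 response and sleep (or reschedule) for the returned number of seconds, using its normal interval when the result is None.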

Comment 1

N.B., Retry-After is only of use for 503s (and 3xx, if we use those).

Consider whether we want to be able to say "OK, but please take longer next time".
(Reporter)

Comment 2

5 years ago
(In reply to Richard Newman [:rnewman] from comment #1)
> N.B., Retry-After is only of use for 503s (and 3xx, if we use those).
> 
> Consider whether we want to be able to say "OK, but please take longer next
> time".

We are only supposed to be sending 1 req/day for the FHR. I doubt the backoff will be longer than a day, so as long as the server emits 503 when under load, we should be fine with just Retry-After.

That being said, there might be other users of Bagheera that issue requests more frequently and they could make use of a "backoff" header.
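To make the distinction concrete, here is a rough sketch of how a client might choose its next request time under both mechanisms. The header name `X-Backoff`, the function `next_attempt_delay`, and the fallback to the normal interval when a 503 carries no Retry-After are all assumptions for illustration; nothing here is defined by Bagheera. For brevity it handles only the seconds form of each header and expects `headers` to be a plain dict with exactly these key spellings.

```python
DEFAULT_INTERVAL = 24 * 60 * 60  # one request per day, as described for FHR


def next_attempt_delay(status, headers, default_interval=DEFAULT_INTERVAL):
    """Return how many seconds to wait before the next request."""

    def seconds(name):
        # Parse a header holding a non-negative integer; None if absent/invalid.
        raw = headers.get(name, "").strip()
        return int(raw) if raw.isascii() and raw.isdigit() else None

    if status == 503:
        # Server is overloaded or down: honour Retry-After if present.
        retry_after = seconds("Retry-After")
        return retry_after if retry_after is not None else default_interval

    # Successful response: a hypothetical "OK, but please take longer next
    # time" header can only lengthen the wait, never shorten it.
    backoff = seconds("X-Backoff")
    if backoff is not None:
        return max(default_interval, backoff)
    return default_interval
```

For a once-a-day client, `next_attempt_delay(200, {"X-Backoff": "60"})` still returns the daily interval, which matches the point above that such a header mostly matters to clients with a much shorter `default_interval`.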

Updated

5 years ago
Whiteboard: [JIRA METRICS-1182]
Target Milestone: Unreviewed → Moved to JIRA
Depends on: 804266
Annie: a lot of the bugs that Greg is filing are dependencies for the FHR feature in Firefox (of varying priorities), so we need a lot of transparency into state.

Could you folks work with one of the following three options?

1. Provide a public link to the JIRA issue tracker, or
2. Copy all status updates from JIRA to this bug in near-realtime, or
3. Use this bug instead of JIRA?

I'm keen to avoid us having "RESOLVED bombs", where we don't know what's happening with a high-priority feature until it's done.

Thanks!
Nobody's updating Bagheera at this point; we've slated it for end-of-life in 2016.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WONTFIX