Closed Bug 803662 Opened 8 years ago Closed 5 years ago

Formalize retry and backoff for Bagheera

Categories

(Mozilla Metrics :: Metrics Data Ping, defect)

Tracking

(Not tracked)

RESOLVED WONTFIX
Moved to JIRA

People

(Reporter: gps, Unassigned)

Details

(Whiteboard: [JIRA METRICS-1182])

Bagheera currently lacks a formal backoff and retry mechanism. To curb thundering-herd behavior, we need to document how clients are expected to behave when they encounter a slow, unresponsive, or offline server.

I submit the Sync server specification as an example: http://docs.services.mozilla.com/storage/apis-2.0.html#response-headers. You'll see it defines HTTP response headers that tell the client how to react if the server is under load. Bagheera will need something similar.

Since Bagheera is simple (unlike Sync), I think use of Retry-After should be sufficient (I doubt we need an actual backoff header).
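
To illustrate what honoring Retry-After would involve on the client side, here is a minimal sketch in Python. It is not Bagheera or FHR client code: `parse_retry_after` is a hypothetical helper, and the choice to clamp past dates to zero and to ignore malformed values is an assumption. HTTP allows the header to carry either a number of seconds or an HTTP-date, so the sketch handles both.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the number of seconds to wait, or None if the value is unusable.

    Hypothetical helper: accepts the two forms HTTP permits for Retry-After,
    a non-negative integer of seconds or an HTTP-date.
    """
    if value is None:
        return None
    value = value.strip()

    # Form 1: delta-seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return int(value)

    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Malformed header: report "no usable value" (an assumption; the
        # caller then falls back to its own default schedule).
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", so clamp at zero.
    return max(0, int((when - now).total_seconds()))
```

A client would call this with the header value from a 503 response and sleep for the returned number of seconds, falling back to its normal schedule when the result is None.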

It's possible the generation of these headers resides not with Bagheera itself but in load balancers in front of it. We will almost certainly need Operations to modify load balancers to emit specific headers in the case where all Bagheera origin servers are down. That work will be tracked in another bug. CC'ing Ravi so he knows it's coming.
N.B., Retry-After is only of use for 503s (and 3xx, if we use those).

Consider whether we want to be able to say "OK, but please take longer next time".
(In reply to Richard Newman [:rnewman] from comment #1)
> N.B., Retry-After is only of use for 503s (and 3xx, if we use those).
> 
> Consider whether we want to be able to say "OK, but please take longer next
> time".

We are only supposed to be sending 1 req/day for the FHR. I doubt the backoff will be longer than a day, so as long as the server emits 503 when under load, we should be fine with just Retry-After.

That being said, there might be other users of Bagheera that issue requests more frequently and they could make use of a "backoff" header.
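
A rough sketch of how a client could combine the two signals, in Python. Everything here is an assumption for illustration: the `next_delay` function, the `X-Backoff` header name (modeled on the Sync headers linked above, not something Bagheera defines), falling back to the normal interval when a 503 carries no Retry-After, and the 10% random jitter added so that clients told to wait the same amount do not all return at the same instant. Only the seconds form of each header is handled, and `headers` is assumed to be a plain dict keyed by the exact header names.

```python
import random

DAILY_INTERVAL = 24 * 60 * 60  # FHR's nominal one request per day, in seconds


def _seconds(headers, name):
    """Read a header as a non-negative integer of seconds, or 0 if absent/invalid."""
    raw = headers.get(name, "").strip()
    return int(raw) if raw.isascii() and raw.isdigit() else 0


def next_delay(status, headers, interval=DAILY_INTERVAL, jitter=0.1, rng=random.random):
    """Return how many seconds to wait before the next request (hypothetical policy)."""
    if status == 503:
        # Server is overloaded or down: wait as long as it asks,
        # or a full interval if it gave no hint (assumption).
        base = _seconds(headers, "Retry-After") or interval
    else:
        # "OK, but please take longer next time": a backoff hint on a
        # successful response can only lengthen the normal interval.
        base = max(interval, _seconds(headers, "X-Backoff"))
    # Spread clients out by up to `jitter` (10% by default) of the base delay.
    return base * (1 + jitter * rng())
```

For a once-a-day client such as FHR, the second branch only matters if the server asks for more than a day, which matches the expectation above that Retry-After alone should cover that case.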
Whiteboard: [JIRA METRICS-1182]
Target Milestone: Unreviewed → Moved to JIRA
Annie: a lot of the bugs that Greg is filing are dependencies for the FHR feature in Firefox (of varying priorities), so we need a lot of transparency into their state.

Could you folks work with one of the following three options?

1. Provide a public link to the JIRA issue tracker, or
2. Copy all status updates from JIRA to this bug in near-realtime, or
3. Use this bug instead of JIRA?

I'm keen to avoid us having "RESOLVED bombs", where we don't know what's happening with a high-priority feature until it's done.

Thanks!
Nobody's updating Bagheera at this point; we've slated it for end-of-life in 2016.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX