Closed Bug 877050 Opened 12 years ago Closed 11 years ago

Cleanly drop a percentage of sync traffic with a 503, rather than allowing weird behavior under load

Tracking

(Not tracked)

Status:

VERIFIED FIXED

People

(Reporter: rfkelly, Assigned: rfkelly)

References

Details

(Whiteboard: [qa+])

Ryan Kelly [:rfkelly]

Assignee

Description

•

12 years ago

Users are currently seeing error bars and weird behavior like 400 errors, javascript tracebacks, etc. See e.g. Bug 685941 and Bug 749315. As best we understand it, these are caused by timeouts and other load-induced strangeness truncating request/response bodies.

Richard suggested on IRC that it may be better to purposely throttle our traffic with 503s, rather than allow the problems to show up as user-visible error bars. For example, randomly fail 50% of requests to /info/collections with a 503, Retry-After 2 hours.

Advantages:

* clients are better behaved with a clean 503, and produce less error-bar noise
* failing things out at the initial handshake will produce less DB load than allowing them to fail in the middle of a sync

Thoughts?
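For illustration only, the proposal above could be sketched roughly as follows. This is not the code that shipped for Sync 1.5 or anything from the Sync 1.1 server. The function name `maybe_shed`, the `drop_fraction` and `rng` parameters, and the path-suffix check are all hypothetical; only the 50% figure, the /info/collections target, the 503 status, and the two-hour Retry-After come from the description.

```python
import random

# Two hours, as suggested in the description; expressed in seconds
# because Retry-After accepts a delay in seconds.
RETRY_AFTER_SECONDS = 2 * 60 * 60


def maybe_shed(path, drop_fraction=0.5, rng=random.random):
    """Decide whether to reject a request before it does any real work.

    Returns a (status, headers) tuple when the request should be shed,
    or None when it should be handled normally.

    Hypothetical helper: the name, parameters, and the suffix match on
    the path are assumptions made for this sketch.
    """
    # Only shed at the initial handshake, so a sync is never cut off
    # in the middle.
    if not path.endswith("/info/collections"):
        return None
    # rng() returns a float in [0.0, 1.0), so this is true for roughly
    # drop_fraction of the calls.
    if rng() < drop_fraction:
        return 503, {"Retry-After": str(RETRY_AFTER_SECONDS)}
    return None
```

A front-end handler would call this first and, if it returns a tuple, send that status and header back without touching the database. Passing `rng` in explicitly is just a convenience so the behaviour can be checked deterministically.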

James Bonacci [:jbonacci]

Updated

•

12 years ago

Whiteboard: [qa+]

Ryan Kelly [:rfkelly]

Assignee

Updated

•

11 years ago

Blocks: 907479

James Bonacci [:jbonacci]

Comment 1

•

11 years ago

Something we still want for Sync 1.5?

Priority: -- → P2

Ryan Kelly [:rfkelly]

Assignee

Comment 2

•

11 years ago

Related to 975305

Depends on: 975305

Ryan Kelly [:rfkelly]

Assignee

Comment 3

•

11 years ago

Fixed for Sync1.5 in Bug 975306.

Bob do you want any action on this bug for sync1.1 or should we just close it out?

Assignee: nobody → rfkelly

Status: NEW → ASSIGNED

Flags: needinfo?(bobm)

Bob Micheletto [:bobm]

Comment 4

•

11 years ago

(In reply to Ryan Kelly [:rfkelly] from comment #3)
> Fixed for Sync1.5 in Bug 975306.
>
> Bob do you want any action on this bug for sync1.1 or should we just close
> it out?

No, we can put nodes into back-off at the various points of redirection in our Sync 1.1 production, which should be fine.

Flags: needinfo?(bobm)

Ryan Kelly [:rfkelly]

Assignee

Comment 5

•

11 years ago

Great, closing it out then.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

James Bonacci [:jbonacci]

Comment 6

•

11 years ago

Done.

Status: RESOLVED → VERIFIED

Categories

(Cloud Services :: Operations: Miscellaneous, task, P2)
