Mysql error "Query execution was interrupted" should lead to 503+Retry-After

Status

--
enhancement
VERIFIED FIXED
8 years ago
5 years ago

People

(Reporter: Atoll, Assigned: rfkelly)

Details

(Whiteboard: [qa?], URL)

Attachments

(1 attachment)

(Reporter)

Description

8 years ago
If the server code sees the mysql error "Query execution was interrupted", please send 503+Retry-After to the client rather than any other result.  That error will only occur when we are killing long-running queries, either by hand or in an automated fashion.

This bug is for Python only, y'all may clone for PHP if so desired.
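
For illustration, the requested behaviour could be sketched as a small WSGI wrapper. This is only a sketch under assumptions: the `BackendError` class, the `retry_after_middleware` name, and the 30-second default are hypothetical and not taken from the server code.

```python
import sys


class BackendError(Exception):
    """Hypothetical marker for a transient backend failure (e.g. a killed query)."""


def retry_after_middleware(app, retry_after=30):
    """Wrap a WSGI app so that BackendError becomes a 503 with Retry-After.

    The retry_after default is an arbitrary value chosen for this sketch.
    """
    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except BackendError:
            body = b"Service temporarily unavailable, please retry later.\n"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", str(retry_after)),
            ]
            # exc_info lets the server replace headers the app may have set.
            start_response("503 Service Unavailable", headers, sys.exc_info())
            return [body]
    return wrapped
```

Any code path that detects the interrupted query would then only need to raise `BackendError`; the client always sees the same 503 and back-off hint.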
Did you run into this problem? If so, is there a log or a way to reproduce it?
(Reporter)

Comment 2

8 years ago
The only way I know to reproduce this problem is to send a really long SELECT to the server and then to go and administratively kill the query.  I can work with you further on this to repro from a test PHP script if that would help?
If you're able to sniff what happens at the TCP layer when mysql kills the query, I can even write a proxy-based test that will simulate this.

We can then run it over the sync server to check that we get 503s everywhere.
(Reporter)

Comment 4

8 years ago
MySQL server sends a specific response to the client when "killed on server" occurs:

ERROR 1317 (70100): Query execution was interrupted

If you need to create some TCP samples, execute the query "SELECT;" on the server; it will return:

ERROR 1064 (42000): You have an error in your SQL syntax; check ...

And then adjust the sample response to fit the correct 1317 (70100) error.

Note that you cannot simply respond with a prerecorded TCP packet, but must actually maintain state for each mysql session, adjusting various bits within the reply to match.

I strongly advise taking the existing source code for a full-featured mysql proxy server, whether python or Perl or other, and altering it to return the desired 1317 (70100) error as needed.

There is evidence on the Internet that the following query will take several seconds to complete.  If you were to run it with a Python script and then go to the server and run "SHOW PROCESSLIST; kill /* thread_id */ 123456;", that should create the desired response to the python code:

SELECT BENCHMARK(1000000000, 1+1);

The above query takes ~10 seconds on adm1.mtv1.stage.  If that is not enough time, you can do:

SELECT BENCHMARK(...), BENCHMARK(...), BENCHMARK(...);

until you have enough of a delay to reproduce this.
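
The reproduction steps above could be scripted roughly as follows. This is a sketch, not a tested procedure: `run_until_killed` is a hypothetical helper, `conn` is assumed to be a DB-API connection (for example one obtained from a MySQL driver), and it assumes the driver puts the numeric MySQL error code in `exc.args[0]`.

```python
def run_until_killed(conn, iterations=1000000000):
    """Run a slow BENCHMARK query and report the MySQL error code, if any.

    Returns None when the query completes normally, otherwise the first
    element of the exception's args (expected to be 1317 when the query
    was killed on the server while it was running).
    """
    cursor = conn.cursor()
    try:
        cursor.execute("SELECT BENCHMARK(%d, 1+1)" % iterations)
        cursor.fetchall()
        return None
    except Exception as exc:
        # Broad catch on purpose: this sketch avoids importing a specific driver.
        return exc.args[0] if exc.args else None
    finally:
        cursor.close()
```

While the function is blocked in `execute()`, kill the query from a second session as described above; a return value of 1317 confirms the client saw the interrupted-query error.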
Are you still having those errors?
Assignee: tarek → nobody
Severity: normal → enhancement
(Reporter)

Comment 6

7 years ago
Whatever error the backend produces, we'll paper over it in the Zeus to turn it into a coherent error for clients.  So if you would prefer to defer this error handling to the Operations Zeus cluster, that's fine.
URL: http://
Depends on: 622889
Whiteboard: [needs /dev/sde swapped]
(Reporter)

Comment 7

7 years ago
dear bugzilla, why did you add a whiteboard to this bug on my behalf?
Whiteboard: [needs /dev/sde swapped]
Assignee: nobody → rkelly
Whiteboard: [qa?]
(Assignee)

Updated

6 years ago
Blocks: 784598
(Assignee)

Comment 8

6 years ago
Created attachment 670569 [details] [diff] [review]
patch to convert "query interrupted" errors into BackendError

This appears to be quite straightforward with the new error-code-checking logic we have added in server-core.  We just include instances of error 1317 in the _is_operational_db_error() function, which will cause them to produce a 503 response.
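
For readers without the attachment, the idea can be sketched as below. This is not the actual patch: the function here is a hypothetical stand-in for `_is_operational_db_error()`, the codes other than 1317 are assumptions added for illustration, and it assumes the driver exposes the MySQL error code as `exc.args[0]`.

```python
# Hypothetical sketch of an error-code check; not the server-core implementation.
ER_QUERY_INTERRUPTED = 1317   # "Query execution was interrupted"
CR_SERVER_GONE_ERROR = 2006   # assumed for illustration: "MySQL server has gone away"
CR_SERVER_LOST = 2013         # assumed for illustration: "Lost connection ... during query"

OPERATIONAL_ERROR_CODES = frozenset([
    ER_QUERY_INTERRUPTED,
    CR_SERVER_GONE_ERROR,
    CR_SERVER_LOST,
])


def is_operational_db_error(exc):
    """Return True if exc carries an error code treated as a transient failure."""
    args = getattr(exc, "args", ())
    return bool(args) and args[0] in OPERATIONAL_ERROR_CODES
```

A caller would raise the backend error that maps to a 503 whenever this check returns True, and re-raise the original exception otherwise.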
Attachment #670569 - Flags: review?(telliott)
Attachment #670569 - Flags: review?(telliott) → review+
(Assignee)

Comment 9

6 years ago
http://hg.mozilla.org/services/server-core/rev/1ad512dbe1f7
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Verified in code.
Status: RESOLVED → VERIFIED