Closed
Bug 573117
Opened 15 years ago
Closed 15 years ago
Getting a lot of empty responses from the server when requesting pages on drumbeat.org
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: paulosman, Assigned: dmoore)
References
()
Details
I've been getting lot of empty responses from drumbeat.org today.
Seems to be more common when requesting paths other than '/' (i.e. /people, /project-index). Is anything being reported in the Apache error log? The output would be consistent with PHP seg faulting or running out of memory / reaching max execution time / etc.
$ telnet www.drumbeat.org 80
Trying 63.245.217.39...
Connected to drumbeat.org.
Escape character is '^]'.
GET /people HTTP/1.1
Host: www.drumbeat.org
Connection closed by foreign host.
$ curl -D - http://www.drumbeat.org/project-index
curl: (52) Empty reply from server
Good response:
$ curl -D - http://www.drumbeat.org/
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: pp-drumbeat01
Vary: Accept-Encoding
Cache-Control: must-revalidate
Content-Type: text/html; charset=utf-8
Date: Fri, 18 Jun 2010 20:26:43 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Transfer-Encoding: chunked
ETag: "5992eb239caafe36ef5c244f946a25c2"
Set-Cookie: SESS10c66e53b380515677fb7bd5d6939b2a=j85l3pabu40jau7nv11tmij9e7; expires=Mon, 12-Jul-2010 00:00:03 GMT; path=/; domain=.drum
beat.org
Last-Modified: Fri, 18 Jun 2010 20:24:00 GMT
X-Powered-By: PHP/5.2.9
X-Cache-Info: not cacheable; response has already expired
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
...
Comment 1•15 years ago
|
||
Not much in the error logs (http://oremj.pastebin.mozilla.org/737556), but I think drupal does a bunch of error suppression.
Reporter | ||
Comment 2•15 years ago
|
||
Hmm. So helpful of Drupal :)
Okay, could I get a MySQL dump of mozcrm_drupal and mozcrm_civicrm dropped into /root on drumbeat.stage.mozilla.com then?
I can't repro the problem on staging or a local development instance and I'd like to see if it's related to any drupal changes made in the last little while (doesn't seem likely, but seems like a place to start debugging).
Comment 3•15 years ago
|
||
Done:
drumbeat_org_civicrm.dump drumbeat_org_drupal.dump
Reporter | ||
Comment 4•15 years ago
|
||
Thanks. Unfortunately, like I suspected, it's not related to any Drupal settings (or any code changes).
Comment 5•15 years ago
|
||
You aren't able to reproduce at all on staging?
Comment 6•15 years ago
|
||
Turns out it is blocking on a slow query:
SELECT COUNT(*) FROM (SELECT DISTINCT(node.nid) AS nid_1,
node_data_field_p_image.field_p_image_fid AS node_data_field_p_image_field_p_image_fid,
node_data_field_p_image.field_p_image_list AS node_data_field_p_image_field_p_image_list,
node_data_field_p_image.field_p_image_data AS node_data_field_p_image_field_p_image_data,
node.type AS node_type,
node.nid AS nid,
node.vid AS node_vid,
node.title AS node_title,
og.og_description AS og_og_description,
(SELECT COUNT(*) FROM og_uid ou INNER JOIN users u ON ou.uid = u.uid WHERE ou.nid = og.nid AND u.status > 0 AND ou.is_active >= 1 AND ou.is_admin >= 0 ) AS member_count,
votingapi_cache_node_points_vote_count.value AS votingapi_cache_node_points_vote_count_value,
DATE_FORMAT((FROM_UNIXTIME(node.created) + INTERVAL -25200 SECOND), '%Y%m%d%H%i') AS node_created_minute
FROM node node
LEFT JOIN og_uid og_uid ON node.nid = og_uid.nid
LEFT JOIN node node_og_uid ON og_uid.nid = node_og_uid.nid
LEFT JOIN votingapi_cache votingapi_cache_node_points_vote_count ON node.nid = votingapi_cache_node_points_vote_count.content_id AND (votingapi_cache_node_points_vote_count.content_type = 'node' AND votingapi_cache_node_points_vote_count.value_type = 'points' AND votingapi_cache_node_points_vote_count.tag = 'vote' AND votingapi_cache_node_points_vote_count.function = 'count')
LEFT JOIN content_type_project node_data_field_p_image ON node.vid = node_data_field_p_image.vid
LEFT JOIN og og ON node.nid = og.nid
WHERE (node.status <> 0 OR (node.uid = 3568 AND 3568 <> 0) OR 1 = 1) AND (node.type in ('project'))
GROUP BY nid_1
ORDER BY node_created_minute DESC
) count_alias;
Comment 7•15 years ago
|
||
It looks like /people isn't running one long query, but running about 2000 queries and eventually timing out.
Reporter | ||
Comment 8•15 years ago
|
||
Yeah, I've seen pages that run ~ 2000 queries (mind boggling) but in the past they've just been slow, not timing out. That query above is troubling and surprises me even for this site, which uses a lot of similar but not-as-slow queries. Clearly that one will have to be at least optimized if not removed.
I guess the main question I have is why it's stopped working now? The same slow queries work on staging, and take forever, but PHP and Apache still come back with something. It's only on the production server that the web server gives up and fails to return anything. Could it be a MySQL or PHP configuration issue?
Comment 9•15 years ago
|
||
I bumped the connect timeout on the load balancer. I think that will "fix" the issue. Real fix will be to fix those queries.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 10•15 years ago
|
||
Awesome, thanks Jeremy. I got rid of the inner most SELECT COUNT(*) (member_count) which brought the query time from ~6 seconds to 0.16 seconds on my development machine, which should help. I also cut the number of queries on the people page down to about one half.
Still a lot of performance optimization to do, but this helps a lot. Thanks again.
Reporter | ||
Comment 11•15 years ago
|
||
Getting empty responses regularly now, but only when logged in as an admin user. Load times are significantly slower. We're running a nearly identical database on staging and not experiencing the problems there. Check out load time differences:
Anonymous request:
$ time curl https://www.drumbeat.org
...
real 0m1.938s
Admin request:
$ time curl -b SESS...=session_id https://www.drumbeat.org
...
real 0m6.060s
And many requests just time out. I.e.
$ time curl -b SESS...=session_id https://www.drumbeat.org/admin/build/views/enable/project_mailing_list?destination=admin%2Fbuild%2Fviews%2Flist
curl: (52) Empty reply from server
real 0m32.560s
user 0m0.010s
sys 0m0.020s
As I mentioned, can't replicate on staging. If you drop another database dump of 'mozcrm_drupal' and 'mozcrm_civicrm' in /root on drumbeat.stage.mozilla.com I can try again with the identical database config.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 12•15 years ago
|
||
Compare with staging:
$ time curl -k https://drumbeat.stage.mozilla.com
...
real 0m1.165s
$ time curl -k -b SESS...=session_id https://drumbeat.stage.mozilla.com
...
real 0m1.689s
As I mentioned in comment # 11, I can redo the comparison with an identical setup if you drop another database dump on staging.
Reporter | ||
Comment 13•15 years ago
|
||
I've got Paul Booker (community member) looking at ways of optimizing the SQL being generated on the site, but any explanation(s) for the performance discrepancy between production and staging?
Comment 14•15 years ago
|
||
I haven't had much time to look at this. Do you want to jump on and debug?
Derek, can you give him network access to pp-drumbeat01 and then reassign back to me?
Assignee: jeremy.orem+bugs → dmoore
Reporter | ||
Comment 15•15 years ago
|
||
Thanks Jeremy. If I can get access, I can debug and see if I can figure anything out. Derek, let me know if you need anything from me (i.e. keys, etc).
Comment 16•15 years ago
|
||
Don't have a good way to get you remote access to this box. Should be fixed when we get the other KVM boxes up.
I can hold this open until then or close it knowing we'll move stuff over to those boxes. Your call.
Reporter | ||
Comment 17•15 years ago
|
||
I'll close it out then. I'll investigate once we move over to the new boxes and I have access. That's all being tracked in Bug # 569251 anyway I assume. Thanks mrz.
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → INCOMPLETE
Updated•10 years ago
|
Product: mozilla.org → mozilla.org Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•