Closed Bug 614012 Opened 14 years ago Closed 11 years ago

Website: Service unavailable errors when submitting personas

Categories

(Websites Graveyard :: getpersonas.com, defect, P1)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: crimius, Unassigned)

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows NT 6.1; rv:2.0b7) Gecko/20100101 Firefox/4.0b7
Build Identifier: 

An approver just sent me an email about how they received a Service Unavailable error when they submitted a persona.  If I'm not mistaken, we're pretty sure it's the mail function timing out that causes these errors.  Most of the time we get them when reviewing personas, but if it's happening when users are submitting personas, it's probably the main reason we get as many duplicate submissions as we do.

Reproducible: Always
SMTP info could be switched to use socketlabs instead, if it's just hanging on outbound.
Deb - is there an error code described in the email?
Assuming 503, we'd be looking at:
http://svn.mozilla.org/projects/getpersonas.com/trunk/server/lib/review.php

I don't know why it's timing out randomly, but here are some ideas:
- php.ini on some nodes is not set properly
- hosts file does not have 'localhost' as a valid host, and mail() is using that by default

Possible solutions:
- set up .ini config to use socketlabs explicitly
- add configurations to personas and write wrappers for mail() that don't depend on server configs
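As a rough illustration of the second option, here is a minimal sketch of a mail wrapper that carries its own relay settings and timeout instead of relying on server configuration. It is written in Python rather than the site's PHP, purely to show the shape of the idea. `MailConfig`, `send_mail`, the `smtp_factory` parameter, and the `example.invalid` host and sender are all hypothetical; the real relay details are not given in this bug.

```python
import smtplib
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass(frozen=True)
class MailConfig:
    # Placeholder values (assumptions); substitute the real relay settings.
    host: str = "smtp.example.invalid"
    port: int = 25
    timeout_seconds: float = 5.0
    sender: str = "noreply@example.invalid"


def send_mail(config, to_addr, subject, body, smtp_factory=smtplib.SMTP):
    """Send one message through an explicitly configured relay.

    Returns True on success and False on any SMTP or socket error, so a slow
    or unreachable mail server does not surface as an error page.
    Note: the timeout applies to each blocking socket operation, not to the
    whole exchange.
    """
    msg = EmailMessage()
    msg["From"] = config.sender
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    try:
        with smtp_factory(config.host, config.port,
                          timeout=config.timeout_seconds) as conn:
            conn.send_message(msg)
        return True
    except (smtplib.SMTPException, OSError):
        return False
```

The caller can then decide what to do when `send_mail` returns False (log it, retry later) without the review or submission request itself failing.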
> An approver just sent me an email about how they received a Service Unavailable
> error when they submitted a persona.  If I'm not mistaken, we're pretty sure
> it's the mail function timing out that causes these errors.  

What led you to believe it was mail()?
I've verified the php.ini is the same on all.
(In reply to comment #4)
> > An approver just sent me an email about how they received a Service Unavailable
> > error when they submitted a persona.  If I'm not mistaken, we're pretty sure
> > it's the mail function timing out that causes these errors.  
> 
> What led you to believe it was mail()?

These errors have been going on for a long time. When I first started seeing them while approving personas, I brought it up with Ryan, and he said it was probably mail(), though I'm not sure why.
I marked the original bug report (bug 598372) filed by rdhoherty as a dupe since this bug has more info.  That is where Ryan mentioned that PHP's mail() function may be involved.

(I'm also merging the priority & milestone fields)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P1
Target Milestone: --- → 2.9
I don't think there are any mail() calls in a loop here.  Jeremy, can you tell us the load on these boxes and the SMTP traffic? It doesn't seem like there would be enough mail to slow things down.

Also, can you configure it to use socketlabs to see if that will help?

Do the mail logs say anything about getting backed up or having trouble looking up domain names that would slow things down?
Assignee: nobody → server-ops
Component: getpersonas.com → Server Operations
Product: Websites → mozilla.org
QA Contact: getpersonas-com → mrz
Target Milestone: 2.9 → ---
Version: unspecified → other
Assignee: server-ops → jeremy.orem+bugs
Please attach the persona the submitter tried to upload when they received a timeout.
Assignee: jeremy.orem+bugs → crimius
The submitter isn't sure which it was anymore, as they submitted several that day.  I'm getting Service Unavailable right now while approving personas, on every one I try to perform an action on; after several tries they finally went through.  Two personas that gave this are <https://www.getpersonas.com/en-US/persona/333538> and <https://www.getpersonas.com/en-US/persona/333543>.
Another user reported this happening with this persona <http://www.getpersonas.com/en-US/persona/334208>
This is still happening, and it's not just when submitting Personas, but also when trying to review/approve stuff already in the queue.  Is there any way we can get the priority on this bumped up?  It's a major issue for the site and for our contributors.
Assignee: crimius → server-ops
Assignee: server-ops → jeremy.orem+bugs
While reviewing personas a few minutes ago, I received the 'Service Unavailable' error page. When it happens, the request takes over 30 seconds to load, and other requests around that time tend to get the same error.

I received four errors, all for POST requests to https://www.getpersonas.com/en-US/admin/pending.php?category=Abstract

The response code was: HTTP/1.1 500 Internal Server Error
with the following headers:
Connection:close
Content-Type:text/html
Date:Tue, 11 Jan 2011 06:23:04 GMT

Perhaps there is something useful in the log files at this time? Hopefully the 30 second delay and the lack of response headers give clues as to where the problem is occurring?
How often is this happening? Does submitting the form always take a long time?
I'm not reviewing personas every day, so I don't know how often it happens.  The form doesn't always take a long time: I just tested it and it took around a second to submit.
It really seems to be intermittent.  I did a ton of reviews recently and had no problems at all.  I'll create a few new Personas in the next day or two and see what happens.
(In reply to comment #17)
> It really seems to be intermittent.  I did a ton of reviews recently and had no
> problems at all.  I'll create a few new Personas in the next day or two and see
> what happens.

Have you run across this lately?
This could have been due to storage load. We've reduced the load on that device and it sounds like it isn't happening very often. I'm closing this out.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
This is happening regularly again[1] and I suspect it's because of the high load due to the Firefox 4 release.

What can be done to prevent this problem? 

[1] https://forums.mozilla.org/addons/viewtopic.php?f=30&t=2675#p7647
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
What is displaying this error page?  Is it something on the web head or is it the load balancer?  It doesn't seem to be a getpersonas.com error message.
(In reply to comment #21)
> What is displaying this error page?  Is it something on the web head or is it
> the load balancer?  It doesn't seem to be a getpersonas.com error message.

It's the load balancer when the backend is unavailable (like the backend ran out of memory and didn't return a response or something).
Please attach the POST data and full headers for a failed request.
SMTP is still an issue here. If the submitter's SMTP server is slow, or if the SMTP server of the user whose persona is under review is slow, the request will time out. Personas needs to stop using the mail() function directly.
Assignee: jeremy.orem+bugs → nobody
Component: Server Operations → getpersonas.com
Product: mozilla.org → Websites
QA Contact: mrz → getpersonas-com
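One way to stop calling mail() directly in the request path is to queue the notification and deliver it in the background, so a slow remote SMTP server cannot hold up a submission or review. The sketch below shows that decoupling in Python with an in-process queue; `MailQueue`, `enqueue`, and the `deliver` callable are hypothetical names. In a PHP deployment the queue would more likely be a database table or job queue drained by a separate process, which is an assumption and not something this bug specifies.

```python
import queue
import threading


class MailQueue:
    """Hands notifications to a background thread so requests never wait on SMTP."""

    def __init__(self, deliver, max_pending=1000):
        # deliver is a hypothetical callable(to_addr, subject, body) doing the real send.
        self._deliver = deliver
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, to_addr, subject, body):
        """Return immediately. False means the queue was full and the mail was dropped."""
        try:
            self._pending.put_nowait((to_addr, subject, body))
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            item = self._pending.get()
            try:
                self._deliver(*item)
            except Exception:
                # A real system would log and retry; omitted in this sketch.
                pass
            finally:
                self._pending.task_done()

    def wait_until_empty(self):
        """Block until every queued message has been attempted (handy in tests)."""
        self._pending.join()
```

With this shape, the request handler only pays the cost of `enqueue`, and delivery failures affect the notification rather than the page response.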
(In reply to comment #24)
> SMTP is still an issue here. If the submitter's SMTP server is slow, or if the
> SMTP server of the user whose persona is under review is slow, the request will
> time out. Personas needs to stop using the mail() function directly.

Thanks for troubleshooting this.

(In reply to comment #3)
> Possible solutions:
> - set up .ini config to use socketlabs explicitly
> - add configurations to personas and write wrappers for mail() that don't
> depend on server configs

Wil or Mike: Which of these two solutions is preferred and who can this be assigned to?
Jeremy, can you just tell mail on that box to go through socketlabs?
Those servers are using a smart-host now. Let's see if that helps.
(In reply to comment #27)
> Those servers are using a smart-host now. Let's see if that helps.

It doesn't seem to have helped for me or other reviewers[1].  Here are some headers of responses with the error message.

[1] https://forums.mozilla.org/addons/viewtopic.php?f=30&t=2599&start=75#p8356
This one is a GET request
getpersonas.com has been retired.  More information at https://blog.mozilla.org/addons/2013/04/11/background-themes-have-moved-to-amo/
Status: REOPENED → RESOLVED
Closed: 13 years ago → 11 years ago
Resolution: --- → WONTFIX
Product: Websites → Websites Graveyard