Closed
Bug 866467
Opened 11 years ago
Closed 11 years ago
pulse.mozilla.org HTTP Errors
Categories
(Developer Services :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: afernandez, Assigned: fox2mike)
References
Details
Attachments
(1 file)
911 bytes, patch | dustin: review+
Received the following alerts:

< nagios-phx1> | Sat 16:17:15 PDT [125] pulse-app1.dmz.phx1.mozilla.com:Pulse - http string is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - string Pulse not found on http://pulse.mozilla.org:80/ - 248 bytes in 0.091 second response time
< nagios-phx1> | Sat 16:25:14 PDT [126] pulse-app1.dmz.phx1.mozilla.com:http - pulse.m.o is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 248 bytes in 0.078 second response time

After logging in to pulse-app1.dmz.phx1, I ran:

sudo /etc/init.d/httpd graceful
apachectl: Configuration syntax error, will not run "graceful":
Syntax error on line 4 of /etc/httpd/mozilla/generic.conf:
Invalid command 'WSGISocketPrefix', perhaps misspelled or defined by a module not included in the server configuration

I then edited the WSGI config (vi /etc/httpd/conf.d/wsgi.conf) and uncommented:

LoadModule wsgi_module modules/mod_wsgi.so

That allowed Apache to restart; however, the error still persists.

I then checked the server history, and it seems dustin was making changes. I attempted to ping him on IRC and paged him as well, but have had no reply yet.

I have acked the alerts for now. However, I'm not sure whether this service is in production, as https://mana.mozilla.org/wiki/display/websites/pulse.mozilla.org appears to be outdated.
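For readers unfamiliar with the two alerts, here is a minimal sketch of what an "http string" style check decides. It is a simplified model and not the Nagios plugin's real logic; the function name `check_http_string` and the sample bodies are made up for illustration.

```python
def check_http_string(status_code, body, expected=None):
    """Return a Nagios-style state for one HTTP response.

    Simplified model (an assumption, not the real plugin logic):
    any 5xx status, or a missing expected string, is CRITICAL.
    """
    if status_code >= 500:
        return "CRITICAL"
    if expected is not None and expected not in body:
        return "CRITICAL"
    return "OK"


# A 500 response with some error body, as in the alerts above.
print(check_http_string(500, "error page", expected="Pulse"))
# A healthy page that mentions the expected string.
print(check_http_string(200, "<title>Pulse</title>", expected="Pulse"))
```

Both alerts come from the same 500 response: one check looks only at the status, the other also looks for the string "Pulse" in the body.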
Comment 1•11 years ago
I think this is OK until Monday, as long as messages are still flowing. The HTTP site is basically just docs. This was handed off quite a while ago, but yes.. the mana docs still point to me. I'll fix that on Monday too :)
Comment 2•11 years ago
OK, I updated the docs - this is a Dev-Services-managed system, with jgriffin from a-team being the dev contact.

The error, with Debug = True, is

----
ViewDoesNotExist at /
Could not import django.views.generic.simple.direct_to_template. Parent module django.views.generic.simple does not exist.

Request Method: GET
Request URL: http://pulse.mozilla.org/
Django Version: 1.5.1
Exception Type: ViewDoesNotExist
Exception Value: Could not import django.views.generic.simple.direct_to_template. Parent module django.views.generic.simple does not exist.
Exception Location: /usr/lib/python2.6/site-packages/django/core/urlresolvers.py in get_callable, line 104
Python Executable: /usr/bin/python
Python Version: 2.6.6
Python Path: ['/data/www/pulse', '/usr/lib64/python26.zip', '/usr/lib64/python2.6', '/usr/lib64/python2.6/plat-linux2', '/usr/lib64/python2.6/lib-tk', '/usr/lib64/python2.6/lib-old', '/usr/lib64/python2.6/lib-dynload', '/usr/lib64/python2.6/site-packages', '/usr/lib64/python2.6/site-packages/gtk-2.0', '/usr/lib/python2.6/site-packages', '/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg-info']
----

It started returning 500's at 27/Apr/2013:16:13:10. I don't see any puppet runs around that time, nor do I see any app deploys in the git history.
Assignee: server-ops → server-ops-devservices
Component: Server Operations → Server Operations: Developer Services
Comment 3•11 years ago
The system was updated on the 27th, so some package was updated which borked things.

That django error appears to be because the function based views were deprecated in 1.4 and completely dropped in 1.5:
https://docs.djangoproject.com/en/1.4/topics/class-based-views/

I'm no python wiz, and even less familiar with django, but it looks like the fix may be as simple as:

#diff urls.py urls.py-tmp
13,14c13
< 'django.views.generic.simple.direct_to_template',
< {'template': 'index.html'},
---
> django.views.generic.TemplateView.as_view(template_name="index.html")
17,18c16
< 'django.views.generic.simple.direct_to_template',
< {'template': 'live_messages.html'},
---
> django.views.generic.TemplateView.as_view(template_name="live_messages.html")
24,25c22
< 'django.views.generic.simple.direct_to_template',
< {'template': 'gantt_messages.html'},
---
> django.views.generic.TemplateView.as_view(template_name="gantt_messages.html")

Thoughts?
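Since a removed view can be referenced in more than one file, a quick scan for the old dotted paths may help before redeploying. The following is only a sketch using the standard library; `find_removed_views` and the sample URLconf text are hypothetical and modelled on the diff above, not taken from the real pulsewebsite source.

```python
import re

# Dotted-path prefix of the function-based generic views module that the
# ViewDoesNotExist error says is missing under Django 1.5.
REMOVED_PREFIX = "django.views.generic.simple."


def find_removed_views(source):
    """Return (line_number, view_path) pairs for quoted references to removed views."""
    pattern = re.compile(r"""['"](%s\w+)['"]""" % re.escape(REMOVED_PREFIX))
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical URLconf excerpt: one old-style entry, one already converted.
SAMPLE_URLS = """\
urlpatterns = patterns('',
    (r'^$',
        'django.views.generic.simple.direct_to_template',
        {'template': 'index.html'}),
    (r'^live/$',
        django.views.generic.TemplateView.as_view(template_name="live_messages.html")),
)
"""

for number, path in find_removed_views(SAMPLE_URLS):
    print("line %d: %s" % (number, path))
```

The scan only catches views referenced as quoted strings, which is how the old entries in the diff are written; imports of the removed module would need a separate search.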
Assignee | ||
Comment 4•11 years ago
Oh, this makes perfect sense! This was requested in Bug 853675. And that broke the app I guess :|
Comment 5•11 years ago
I fixed the site; it was indeed a Django-version incompatibility, as mentioned in comment #3 (and in one other place, it turned out). I fixed this in bug 875399. Could we get this change deployed to pulse.mozilla.org?
Updated•11 years ago
Flags: needinfo?(shyam)
Assignee | ||
Comment 6•11 years ago
(In reply to Mark Côté ( :mcote ) from comment #5)
> I fixed the site; it was indeed a Django-version incompatibility, as
> mentioned in comment #3 (and in one other place, it turned out).
>
> I fixed this in bug 875399. Could we get this change deployed to
> pulse.mozilla.org?

https://mana.mozilla.org/wiki/display/websites/pulse.mozilla.org#pulse.mozilla.org-Update/Pushprocedure doesn't say how this is deployed :|

I'm going to ask Dustin coz I don't know.
Flags: needinfo?(shyam) → needinfo?(dustin)
Comment 7•11 years ago
It's just a regular webapp.

Last login: Wed Sep 19 09:09:28 2012 from 10.8.74.5
This is the admin node for the pulse cluster

when using issue-multi-command.py use the following:
pulse (prod)
pulse-dev (dev)
pulse-stage (stage)

deploy scripts are:
/data/pulse/deploy (prod)
/data/pulse-dev/deploy (dev)
/data/pulse-stage/deploy (stage)

These deploy scripts are managed by puppet. Do not edit directly.
[root@pulse-app1.dmz.phx1 ~]#
Flags: needinfo?(dustin)
Assignee | ||
Comment 8•11 years ago
Mark, this doesn't seem to have fixed the app:

[shyam@pulse-app1.dmz.phx1 src]$ ./update
Updating pulsewebsite...
not trusting file /data/pulse/src/pulse/pulsewebsite/.hg/hgrc from untrusted user root, group root
not trusting file /data/pulse/src/pulse/pulsewebsite/.hg/hgrc from untrusted user root, group root
abort: repository default not found!
[shyam@pulse-app1.dmz.phx1 src]$ sudo ./update
Updating pulsewebsite...
pulling from http://hg.mozilla.org/automation/pulsewebsite
searching for changes
no changes found
default
Updating pulseshims...
pulling from http://hg.mozilla.org/automation/pulseshims
searching for changes
no changes found
default
Updating mozillapulse...
pulling from http://hg.mozilla.org/automation/mozillapulse
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 19 changes to 19 files
19 files updated, 0 files merged, 0 files removed, 0 files unresolved
default
[2013-06-04 07:27:11] Running rsync_project
[2013-06-04 07:27:11] [localhost] running: /usr/bin/rsync -aq --include '.gitkeep' --exclude '.git*' --exclude '.hg*' --exclude '.svn*' --exclude 'CVS' --exclude '.bzr*' --delete /data/pulse/src/pulse/ /data/pulse/www/pulse/
[2013-06-04 07:27:11] [localhost] finished: /usr/bin/rsync -aq --include '.gitkeep' --exclude '.git*' --exclude '.hg*' --exclude '.svn*' --exclude 'CVS' --exclude '.bzr*' --delete /data/pulse/src/pulse/ /data/pulse/www/pulse/ (0.050s)
[2013-06-04 07:27:11] Finished rsync_project (0.051s)
[2013-06-04 07:27:11] Running commit_www
[2013-06-04 07:27:11] [localhost] running: cd /data/pulse/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['pulse']'
[2013-06-04 07:27:12] [localhost] finished: cd /data/pulse/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['pulse']' (0.043s)
[localhost] out: [master bf0f5e2] deploy [pulse]
[localhost] out: Committer: root <root@pulse-app1.dmz.phx1.mozilla.com>
[localhost] out: Your name and email address were configured automatically based
[localhost] out: on your username and hostname. Please check that they are accurate.
[localhost] out: You can suppress this message by setting them explicitly:
[localhost] out:
[localhost] out: git config --global user.name "Your Name"
[localhost] out: git config --global user.email you@example.com
[localhost] out:
[localhost] out: After doing this, you may fix the identity used for this commit with:
[localhost] out:
[localhost] out: git commit --amend --reset-author
[localhost] out:
[localhost] out: 18 files changed, 368 insertions(+), 20 deletions(-)
[localhost] out: create mode 100644 pulse/mozillapulse/test/README.md
[localhost] out: create mode 100644 pulse/mozillapulse/test/Vagrantfile
[localhost] out: create mode 100644 pulse/mozillapulse/test/puppet/manifests/classes/init.pp
[localhost] out: create mode 100644 pulse/mozillapulse/test/puppet/manifests/classes/rabbitmq.pp
[localhost] out: create mode 100644 pulse/mozillapulse/test/puppet/manifests/vagrant.pp
[localhost] out: create mode 100644 pulse/mozillapulse/test/runtests.py
[2013-06-04 07:27:12] Finished commit_www (0.044s)

Did that and kicked apache.
Assignee: server-ops-devservices → shyam
Flags: needinfo?(mcote)
Assignee | ||
Updated•11 years ago
Group: infra
Comment 9•11 years ago
Sorry, I committed the fix to the old source tree in clegnitto's user repo. I applied the patch to the proper location at http://hg.mozilla.org/automation/pulsewebsite. Please update again. I am clearing the dependencies from this bug, since, as I mentioned elsewhere, the website and the RabbitMQ service are separate things, as evidenced by the website having been broken for far longer than pulse.
Comment 10•11 years ago
fox2mike deployed the patch and restarted Apache, but I'm still getting 500 errors. fox2mike says there is nothing obvious in Apache's error logs. Dustin, do you know what's going on here (or where errors are logged)?
Flags: needinfo?(dustin)
Comment 11•11 years ago
<h1>Server Error (500)</h1> isn't an Apache error. I see 10.8.74.211 - - [05/Jun/2013:10:21:16 -0700] "GET / HTTP/1.1" 500 27 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:21.0) Gecko/20100101 Firefox/21.0" in the access logs, so it's not from Zeus. Which leaves Django. I set DEBUG=True and restarted Apache and.. it just worked. I set DEBUG=False and restarted Apache and .. it still works. So, solved?
Flags: needinfo?(dustin)
Comment 12•11 years ago
Works for me as well. Something cached maybe that got cleared out by the multiple restarts? Anyway this can be closed IMO. dkl
Comment 13•11 years ago
Huh neat. :) Yeah I was pretty sure it was a Django error, but I wasn't sure where those got logged, so I told fox2mike to try the Apache error logs. Anyway that's all very strange, but thanks. :)
Comment 14•11 years ago
I couldn't find actual logs either. They're not in the Apache error logs.
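One possible explanation for the empty logs, offered as an assumption rather than a diagnosis of this host: with DEBUG=False, Django's default logging of this era emails unhandled request errors to ADMINS rather than writing them to stderr, so nothing reaches the Apache error log. A settings-style sketch that routes those errors to stderr might look like this; the handler name "stderr" is arbitrary.

```python
import logging
import logging.config

# Settings-style logging config: send Django's request errors to stderr.
# Under mod_wsgi, stderr normally ends up in the Apache error log.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "stderr": {
            "class": "logging.StreamHandler",  # writes to sys.stderr by default
            "level": "ERROR",
        },
    },
    "loggers": {
        # 'django.request' is the logger Django uses for failed requests.
        "django.request": {
            "handlers": ["stderr"],
            "level": "ERROR",
            "propagate": False,
        },
    },
}

# Outside Django, the same dict can be applied directly to confirm it is valid.
logging.config.dictConfig(LOGGING)
logging.getLogger("django.request").error("example 500 for %s", "/")
```

In a Django project the dict would go in the settings module and Django would apply it at startup; the last two lines are only there so the sketch can be run on its own.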
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 15•11 years ago
pulse is throwing 500s again. I can't seem to find anything in the logs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 16•11 years ago
Can we pull some webops people in? I'm not a Django expert by any stretch.
Comment 17•11 years ago
It's the exact same error as before; it looks like Mark's fix somehow got unapplied.
Comment 18•11 years ago
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py", line 92, in get_response
    response = middleware_method(request)
  File "/usr/lib/python2.6/site-packages/django/middleware/common.py", line 57, in process_request
    host = request.get_host()
  File "/usr/lib/python2.6/site-packages/django/http/request.py", line 72, in get_host
    "Invalid HTTP_HOST header (you may need to set ALLOWED_HOSTS): %s" % host)
SuspiciousOperation: Invalid HTTP_HOST header (you may need to set ALLOWED_HOSTS): pulse.mozilla.org
Comment 19•11 years ago
Ah, as dustin said in comment #7, this machine is puppet managed, and I think the upgrade was applied directly to the machine. dustin, do you think you could update pulsewebsite yourself (and maybe, hopefully, document this somewhere)? :) I apologize for the one-offedness of this site, but it was handed over to us as-is a while ago, and legneato has since left, so we're now on our own with it.
Comment 20•11 years ago
Regarding comment #18, I don't know where this is coming from, but it seems unrelated to the main failure as posted in comment #2. Certainly the "Invalid HTTP_HOST header" went away when the fix for the removed django view function was applied.
Comment 21•11 years ago
The error in comment #18 is the exact same error I was getting spammed with (courtesy of nagios) when this problem first was reported.
Comment 22•11 years ago
Puppet doesn't do webapp updates, but they do occur on a crontask. Still, :fox2mike did the deployment correctly: the /data/pulse/src/pulse/pulsewebsite tree is at http://hg.mozilla.org/automation/pulsewebsite/rev/8ca045b9651e which is the current tip and seems to be appropriate. That patch appears to be applied at /data/www/pulse/pulsewebsite. And that's the directory that's in use here: WSGIScriptAlias / /data/www/pulse/pulsewebsite/django.wsgi
Comment 23•11 years ago
Oh, and restarting Apache doesn't help, and there's really nothing new to push:

[root@pulse-app1.dmz.phx1 pulsewebsite]# /data/pulse/deploy pulse
[2013-06-10 08:37:56] Running rsync_project
[2013-06-10 08:37:56] [localhost] running: /usr/bin/rsync -aq --include '.gitkeep' --exclude '.git*' --exclude '.hg*' --exclude '.svn*' --exclude 'CVS' --exclude '.bzr*' --delete /data/pulse/src/pulse/ /data/pulse/www/pulse/
[2013-06-10 08:37:56] [localhost] finished: /usr/bin/rsync -aq --include '.gitkeep' --exclude '.git*' --exclude '.hg*' --exclude '.svn*' --exclude 'CVS' --exclude '.bzr*' --delete /data/pulse/src/pulse/ /data/pulse/www/pulse/ (0.048s)
[2013-06-10 08:37:56] Finished rsync_project (0.049s)
[2013-06-10 08:37:56] Running commit_www
[2013-06-10 08:37:56] [localhost] running: cd /data/pulse/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['pulse']'
[2013-06-10 08:37:56] [localhost] failed: cd /data/pulse/www && /usr/bin/git add .; /usr/bin/git commit -a -m 'deploy ['pulse']' (0.008s)
[localhost] out: # On branch master
[localhost] out: nothing to commit (working directory clean)
Comment 24•11 years ago
Well that's weird. I have no idea how it could have broken in the meantime. Could you try switching it to debug again to get a backtrace? I don't know how else to go about debugging this. Alternatively, since most of this "app" is actually templates with no dynamic components (i.e. essentially static files), and the few parts that are dynamic are probably either broken or not used, we could just switch back to static pages and host a dynamic app elsewhere if and when we need/fix it.
Comment 25•11 years ago
So, I went and asked someone who knows (from webops):

12:36 <@solarce> dustin: your problem in https://bugzilla.mozilla.org/show_bug.cgi?id=866467 is that django 1.5.1 requires the new ALLOWED_HOSTS setting
12:38 <@solarce> dustin: https://docs.djangoproject.com/en/1.5/ref/settings/#std:setting-ALLOWED_HOSTS

c.f. bug 856061.

So, Mark, do you want to make that change and I'll push again?
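For context, here is a rough sketch of what such a setting could look like, plus a simplified model of how the host patterns are matched. The attached patch is not reproduced in this bug, so the host list below is an assumption, and `host_is_allowed` is an illustration of the documented matching rules rather than Django's own code.

```python
# Hypothetical production settings fragment; the host list is an assumption.
ALLOWED_HOSTS = ["pulse.mozilla.org"]


def host_is_allowed(host, allowed_hosts):
    """Simplified model of the documented ALLOWED_HOSTS matching rules.

    This is an illustration, not Django's own implementation.
    """
    host = host.lower()
    # Drop a trailing ":port" if present (IPv6 literals are not handled here).
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*":
            return True  # wildcard: accept any Host header
        if pattern.startswith("."):
            # ".example.com" matches example.com and any subdomain
            if host == pattern[1:] or host.endswith(pattern):
                return True
        elif host == pattern:
            return True
    return False


print(host_is_allowed("pulse.mozilla.org", ALLOWED_HOSTS))  # the site's own host
print(host_is_allowed("pulse.mozilla.org", []))             # no hosts configured
```

As far as the Django 1.5 documentation describes it, this validation is skipped when DEBUG is True, which would fit the debug page in comment 2 showing the view import error while the production site raised the HTTP_HOST error from comment 18.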
Comment 26•11 years ago
I presume this will do it? Any other hostnames that should be in there, do you know?
Attachment #760515 -
Flags: review?(dustin)
Comment 27•11 years ago
Comment on attachment 760515 [details] [diff] [review] Add required ALLOWED_HOSTS setting to prod config I'm not the pro, but sure, lgtm.
Attachment #760515 -
Flags: review?(dustin) → review+
Comment 28•11 years ago
Heh don't think there are any pros here. :) http://hg.mozilla.org/automation/pulsewebsite/rev/6d9f2b0da639 You said cron should pick this up?
Comment 29•11 years ago
Well, webops are pros enough.. I pushed it - I'm not sure if the crons are set up. The site is back. Success?!
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
Component: Server Operations: Developer Services → General
Product: mozilla.org → Developer Services