(Reporter: robhudson, Unassigned)

3 years ago
This bug tracks problems encountered when pushing the django-1.8 branch to stage.

Comment 1

3 years ago
Copied from an email I sent to mdn-dev:

Yesterday I attempted to push the django-1.8 branch to stage for
testing and ran into some trouble.

tl;dr We will attempt to fix stage tomorrow morning. Read on for the
ugly details of what we've discovered thus far.

The first problem was a stale .pyc file from django-constance that
prevented the deploy script from completing. C removed the leftover
models.pyc file, which allowed the deploy script to finish.

Devs: Should we have *.pyc removal as part of our deployment process?
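A minimal sketch of what such a cleanup step could look like (the deploy
root below is a placeholder, not the actual Chief deploy layout):

```python
# Hedged sketch: a deploy step that removes stale .pyc files under the
# deploy root before activating new code. The path handling is generic;
# the real deploy layout is not assumed here.
import pathlib

def remove_stale_pyc(root):
    """Delete every .pyc under root; return the number removed."""
    removed = 0
    for pyc in pathlib.Path(root).rglob("*.pyc"):
        pyc.unlink()
        removed += 1
    return removed
```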

The next (and current) problem appears to be an ugly Python
packaging issue. We import RequestFactory in our locale-prefixing
code in MDN. This loads RequestFactory from the django/test/
package, which in turn loads mock, which attempts to self-discover
its version, it appears, via a hipster Python package called "pbr".
But since we use vendored Python installs, pbr can't discover the
version from the normal Python packaging metadata, so it falls back
to git. The deploy script, however, also strips .git folders, so the
git tag information pbr is looking for isn't there either, and it
gives up.
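The failure chain can be sketched like this; it's an illustrative
approximation of pbr's lookup order (metadata first, then git), not pbr's
actual code:

```python
# Illustrative sketch of pbr-style version discovery -- NOT pbr's real
# implementation. Shows why a vendored install with .git stripped fails
# at both steps.
import importlib.metadata
import subprocess

def discover_version(package_name, package_dir):
    # 1. Installed-package metadata (what an sdist or virtualenv install
    #    provides). A vendored checkout ships none, so this fails on stage.
    try:
        return importlib.metadata.version(package_name)
    except importlib.metadata.PackageNotFoundError:
        pass
    # 2. Fall back to git tags -- but the deploy script strips .git,
    #    so this fails too, and we give up just like pbr does.
    try:
        out = subprocess.check_output(
            ["git", "describe", "--tags"],
            cwd=package_dir, stderr=subprocess.STDOUT)
        return out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        raise RuntimeError(
            "Versioning for this project requires either an sdist "
            "tarball, or access to an upstream git repository.")
```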

Comment 2

3 years ago
C and I did some work on this and have some info.

New Relic is a proximate cause, but not a root one.

As comment 1 indicates, how "mock" is installed affects this. If it's in vendor, we get an error as Rob describes:

Exception: Versioning for this project requires either an sdist tarball, or access to an upstream git repository. Are you sure that git is installed?

If it's not in vendor but instead is installed in the virtualenv, this error does not occur.

Similarly, if we remove New Relic by commenting it out in the wsgi file, this error also does not occur. NR is altering the environment enough that pbr can no longer figure out what version of mock is installed. Weird.

However, in either case, after getting past that (putting mock in the virtualenv, or commenting out New Relic), we get a new error... it can't find the HTML templates. Full traceback is below [1].

My recommendation is to install mock in the virtualenv and remove it from vendor. For that matter, we can probably do the same with everything in vendor. I don't know whether your virtualenv is managed by your Chief deploy scripts; if not, this might be problematic for you to maintain on your own. I know that SUMO has worked to build that into their deploy scripts, so it's doable (just a bit further down the rabbit hole).

http_app_kuma: 2015-12-18 13:41:04,042 django.request:ERROR Internal Server Error: /en-US/: /data/www/
Traceback (most recent call last):
  File "/data/www/", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/data/www/", line 31, in home
    return render(request, 'landing/homepage.html', context)
  File "/data/www/", line 67, in render
    template_name, context, request=request, using=using)
  File "/data/www/", line 99, in render_to_string
    return template.render(context, request)
  File "/data/www/", line 86, in render
    return self.template.render(context)
  File "/data/www/", line 78, in render
    return super(Template, self).render(new_context)
  File "/data/www/", line 969, in render
    return self.environment.handle_exception(exc_info, True)
  File "/data/www/", line 742, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/data/www/", line 1, in top-level template code
    {% extends "base.html" %}
  File "/data/www/", line 178, in get_source
    raise TemplateNotFound(template)
TemplateNotFound: base.html
[client] mod_wsgi (pid=26399): Exception occurred processing WSGI script '/data/www/'.
[client] Traceback (most recent call last):
[client]   File "/data/www/", line 46, in application
[client]     return django_app(env, start_response)
[client]   File "/data/www/", line 119, in __call__
[client]     return self.application(environ, start_response)
[client]   File "/data/www/", line 189, in __call__
[client]     response = self.get_response(request)
[client]   File "/data/www/", line 218, in get_response
[client]     response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
[client]   File "/data/www/", line 268, in handle_uncaught_exception
[client]     return callback(request, **param_dict)
[client]   File "/data/www/", line 10, in <lambda>
[client]     handler500 = lambda request: _error_page(request, 500)
[client]   File "/data/www/", line 6, in _error_page
[client]     return render(request, '%d.html' % status, status=status)
[client]   File "/data/www/", line 67, in render
[client]     template_name, context, request=request, using=using)
[client]   File "/data/www/", line 98, in render_to_string
[client]     template = get_template(template_name, using=using)
[client]   File "/data/www/", line 46, in get_template
[client]     raise TemplateDoesNotExist(template_name)
[client] TemplateDoesNotExist: 500.html

Comment 3

3 years ago
FYI: NR wrapping of celery is disabled in -stage.  (Changes committed to puppet.)  Swapping in 'run-python' for 'run-program' did not make any difference.

Comment 4

3 years ago
It's odd that this works in the django shell but not via apache/wsgi...

[chudson@developer1.stage.webapp.scl3 kuma]$ ../venv/bin/python shell
Python 2.7.9 (default, Dec 12 2014, 10:25:20)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from django.template.loader import get_template
>>> get_template('base.html')
<django_jinja.backend.Template object at 0x7f15c1143410>
>>> get_template('500.html')
<django_jinja.backend.Template object at 0x7f15c11436d0>

Comment 5

3 years ago
:robhudson And fetching the version for mock in vendor *did* also work in the Django shell, as I mentioned on IRC. I'm absolutely certain New Relic is the culprit here, in combination with pbr and our vendor pattern. I still think there is some odd permission issue going on here.

:jakem I'm surprised you're offering to install everything in the venv, tbh, since that was declined last time on the grounds that peep would be required (it was broken at the time). I'm also not sure that getting rid of vendor now is good timing, since debugging in an environment we can only access through a bottleneck (webops) makes this a high-risk change.

Since we'll be migrating to AWS in Q1, I don't see this as a requirement either. I'd rather focus all dev time on getting ready for the AWS move than on fiddling more with the current infrastructure. We made some great strides in that direction at Mozlando, so maybe we should postpone 1.8 until AWS. There is no super-urgent need (we can backport sec fixes to 1.7) other than the usual rebasing issues.

Comment 6

3 years ago
For reference:

  * Added the following to the settings for stage:

      # Added for testing fixes for BZ 1233836 
      TEMPLATES[0]['DIRS'] = [path('jinja2')]
      TEMPLATES[1]['DIRS'] = [path('templates')]

  * Pushed out to -stage, commented out the NR lines in kuma.wsgi, and restarted Apache. This seems to restore functionality in -stage.
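For context, here is a hedged sketch of the TEMPLATES shape that change
slots into. The backend order (django-jinja first, then Django templates)
and the path() helper are assumptions based on the tracebacks above, not
Kuma's exact settings:

```python
# Hedged sketch of the stage TEMPLATES setting the fix above targets.
# Backend order and the path() helper are assumptions, not Kuma's
# actual settings module.
import os

ROOT = "/data/www/kuma"  # placeholder deploy root

def path(*parts):
    return os.path.join(ROOT, *parts)

TEMPLATES = [
    {
        # Jinja2 engine provided by django-jinja; serves e.g. base.html
        "BACKEND": "django_jinja.backend.Jinja2",
        "DIRS": [path("jinja2")],
        "APP_DIRS": True,
        "OPTIONS": {},
    },
    {
        # Stock Django template engine; serves e.g. 500.html
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [path("templates")],
        "APP_DIRS": True,
        "OPTIONS": {},
    },
]
```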

Comment 7

3 years ago
We removed the vendor/ folder for Python dependencies and switched to pip/peep. Everything was happy and we deployed to production.

Closing bug as fixed.

Thanks all!
Last Resolved: 3 years ago
Resolution: --- → FIXED