Closed Bug 1461350 Opened 7 years ago Closed 7 years ago

Re-implement locale-prefixed URLs

Categories

(developer.mozilla.org Graveyard :: Localization, enhancement, P1)

All
Other
enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jwhitlock, Assigned: jwhitlock)

References

Details

(Keywords: in-triage, Whiteboard: [specification][type:change][points=6+])

What feature should be changed? Please provide the URL of the feature if possible. ================================================================================== Locale-prefixed URLs are handled by the LocaleURLMiddleware [1], based on zamboni.amo.middleware, copied from kitsune, and modified for the Mindtouch migration and MDN-specific issues. It should be re-implemented using modern Django patterns. [1] What problems would this solve? =============================== The current localization framework uses a prefixer class, which was removed in Django 1.10. Radical changes would be needed to update it to be compatible with Django 1.10. A better solution would be to re-impleent using the patterns of Django 1.8, and then make any small adjustments needed to update to Django 1.10 and beyond. Who would use this? =================== All MDN visitors interact with locale-prefixed URLs. There are some non-locale URLs, for assets and for non-content URLs like robots.txt. Staff contributors to MDN interact regularly with the locale infrastructure in testa and production code. What would users see? ===================== Visitors will see no current locale-prefixed URLs. The default with the current infrastructure is 'no locale' URLs, which redirect to locale-prefixed URLs. With Django's locale infrastructure, the default is prefixed URLs in the current locale, so users may see less non-prefixed URLs. Staff will need less code to use prefixed locales in tests and production code, and it will be harder to inject non-prefixed URLs in code. What would users do? What would happen as a result? =================================================== Visitors should see no change. Staff will be able to simplify code and update to Django 1.10 and beyond. Is there anything else we should know? ====================================== There will be changes in the way Kuma parses the Accept-Language header to redirect to locale-prefixed URLs. Some corner cases will change behavior, added in bug 647796. A user requesting 'fr-FR, de' currently gets 'de', but will get 'fr' in the future.
I missed the reference for [1] in the initial bug. Current LocaleURLMiddleware is at: https://github.com/mozilla/kuma/blob/e531cbe97a1f5ef689536dcc7a3f97d2bec9337e/kuma/core/middleware.py#L21-L86 Locale-prefixed URLs require a custom reverse function, along with some helper classes like Prefixer, defined in urlresolvers.py: https://github.com/mozilla/kuma/blob/master/kuma/core/urlresolvers.py#L36-L72 Fixing this bug will require customizing Django's locale framework, and removing the existing custom code.
Assignee: nobody → jwhitlock
Blocks: 1429933
Status: NEW → ASSIGNED
See Also: → 647796
Keywords: in-triage
Priority: -- → P1
Whiteboard: [specification][type:change] → [specification][type:change][points=6+]
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/9296733d92f6c72a31f680c2112f1b5b11d883b7 bug 1461350: Refactor middleware, resolver tests Convert LocaleURLMiddlware and get_best_language tests to pytest style, and share the get_best_language test cases with the LocaleURLMiddlware tests, since get_best_language won't be needed after the Django 1.11 transition. https://github.com/mozilla/kuma/commit/f01c6a405c5fe0b8d45c8cc78548eb9fa4ea2c0f bug 1461350: Add tests for lang fixes, legacy 404s Add tests for MDN-specific locale fixes, such as cn → zh-CN, and for legacy URLs that should 404, such as index.php. These were previously tested only in headless tests, but are part of the URL contract. https://github.com/mozilla/kuma/commit/c7935b5695724156c84f4e50d4d47be0efdcf03d bug 1461350: Add more ?lang test scenarios Test ?lang redirects for the root URL, the English homepage, and when the parameter is the same as the current locale. https://github.com/mozilla/kuma/commit/6eb165266ae3cbecf96c21636801f7fe3f6534d1 bug 1461350: Convert middleware tests to pytest Convert the remaining middleware tests (RemoveSlashMiddleware, SetRemoteAddrFromForwardedFor) to pytest. https://github.com/mozilla/kuma/commit/df5bc3b1bcc2addc54915b8362e99c86fa50166f Merge pull request #4778 from jwhitlock/middleware-tests-1461350 bug 1461350: Update Middleware tests
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/5431a8e263f1a7ac61df218d3317037bcda538fa bug 1461350: Update RemoveSlashMiddlware test URLs Locale-prefixed URLs with trailing slashes are handled by the catch-all view mindtouch_to_kuma_redirect, so the RemoveSlashMiddleware never updates the response. Update the tests to use non-prefixed URLs for tests, which are handled by the middleware, and currently xfail. https://github.com/mozilla/kuma/commit/265e510ef75ae25156c909660aa4e282cf8ee8f7 bug 1461350: Improve is_valid_path for 404s The fallback view mindtouch_to_kuma_redirect matches everything, so using the resolver to detect invalid paths doesn't work. Refactor so we can detect if mindtouch_to_kuma_redirect would return a 404, and avoid the redirect. This also means that RemoveSlashMiddleware works for non-locale URLs. Coverage is improved, but (as noted) not all code paths are exercised by tests, and some appear to be impossible. https://github.com/mozilla/kuma/commit/fb2e328de7abf892678c0cbd894bb159d23bc241 Merge pull request #4780 from jwhitlock/is-valid-path-1461350 bug 1461350: Improve is_valid_path
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/81455294f4c950c30a9a8f89386c7d41140b6582 bug 1461350: Make humans.txt a no-locale URL https://github.com/mozilla/kuma/commit/a13554e4c6531b367bae33d8b06ae29cf9869a3f Merge pull request #4786 from jwhitlock/humans-no-locale-1461350 bug 1461350: Make humans.txt a no-locale URL
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/57c4d1b13f9f6989d5b875e32243ca9370d35551 bug 1461350: Avoid runtime cache in tests Avoid interacting with the runtime cache in tests by using the local memory cache for django cacheback, and by changing the cache key prefix in tests. https://github.com/mozilla/kuma/commit/cdfce6d69f6740d2061c8795cbfc7b12ba05b92d Merge pull request #4789 from jwhitlock/cache-in-test-1461350 bug 1461350: Avoid runtime cache in tests
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/eb7231c2ba70328aef2320792228d8ea21b4e74f bug 1461350: Duplicate Django locale code Copy code from Django 1.8.19 which needs to be modified to work with Kuma language codes, legacy locale support, and 404s: * from django.conf.urls: - i18n_patterns * from django,core.urlresolvers: - LocaleRegexURLResolver (extended) * from django.middleware.locale: - LocaleMiddleware * from django.utils.translation.trans_real: - get_langugage - get_supported_language_variant - get_language_from_path - get_language_from_request No modifications yet, to make it easier to review changes in future Django versions. https://github.com/mozilla/kuma/commit/db047b5b750bbad4005bdb810b7dd2d0092d47c1 bug 1461350: Update Django locale code for Kuma Modify the Django 1.8.19 locale code to work with Kuma: * Return Kuma language codes, with overrides from settings.LOCALE_ALIASES, from get_supported_language_variant. * Don't check the session for a language override. * Don't add header Vary: Accept-Language * Handle alternate code paths unused in Kuma, by converting to asserts, or commenting and skipping code coverage. https://github.com/mozilla/kuma/commit/b434a70e8f29295cd2a93efb792188c78dac507f bug 1461350: Add more language middlewares Add middlewares for Kuma-specific locale features, provided in to LocaleURLMiddleware or elsewhere, to keep LocaleMiddleware close to Django's version. LangSelectorMiddleware handles locale redirects based on the ?lang query parameter. LocaleStandardizerMiddleware redirects to standardize local prefix cases, legacy locales, and overly specific locales, rather than return 404s. https://github.com/mozilla/kuma/commit/9f7ba92e7a1ab1972db854842fe4cff9c260b2ef bug 1461350: Use modified Django locale framework Replace LocaleURLMiddleware with LangSelectorMiddleware, LocaleStandardizerMiddleware, and Django's modified LocaleMiddleware. Simplify Kuma's reverse function, and drop code that supports the old locale framework. One large change is that urlpatterns are split into those with a language prefix (lang_urlpatterns), and those without (urlpatterns). The customized LocaleRegexURLResolver and i18n_patterns ensure that Kuma's language codes (such as en-US) are used instead of Django's language codes (such as en-us). Kuma's previous locale tools (reverse, Jinja's wiki_url) defaulted to no locale prefix. With the new framework, the default is the currently active locale, and it is retained in request.path_info. These changes require some adjustments, such as in the DocumentZone middleware and tests. Overly-specific locales in the Accept-Language header are processed differenty. Previously, this header: Accept-Language: fr-FR, de;q=0.5 would results in 'de' being selected as the first exact match. The code now uses Django's behaviour, where 'fr' is chosen as the generic locale match for the first parameter. https://github.com/mozilla/kuma/commit/e7e4d24293088fb92d5bfcc893201fe87e31ba67 bug 1461350: Update docstrings for customizations Update the docstrings for locale functions taken from Django to more accurately describe the behaviour and where it deviates from Django's behaviour. https://github.com/mozilla/kuma/commit/23e5547e0aba2af3b498e50b5e23c4bffe24592a bug 1461350: Drop revision_based_on This was removed in PR #4769, and accidently re-added when copying from an earlier version of this pull request. https://github.com/mozilla/kuma/commit/9c0c36831aa289f11c6f98e7e92964d2d4787d29 bug 1461350: Skip non-locale paths earlier In the DocumentZoneMiddleware, skip non-locale paths (such as health checks) as early as possible, before the string manipulation for zoned URL manipulations. https://github.com/mozilla/kuma/commit/b63ec6bb0a9ada4b9a4573a9fddd6766e79fe5b7 bug 1461350: Convert duplicate check to assertion https://github.com/mozilla/kuma/commit/056769089c56abd358bbd2050e74ce511baeef55 bug 1461350: Use LANGUAGES order for prefix picks Use settings.LANGUAGES order to pick region-specific locale prefixes, mirroring Django's algorithm. The first region-specific locale prefix is chosen for the generic language, such as zh-CN for zh, and en-US for en. This is identical to the previous order with the notable exception of pt-PT, not pt-BR, being used for pt (Portuguese). This affects other settings: * ACCEPTED_LOCALES is in resolution order, and includes any candidate locales. * ENABLED_LOCALES is also in resolution order * FIRST_LOCALE is removed * LOCALE_ALIASES drops locales that can be derived from resolution order. Only the Chinese customizations (cn, zh-Hans) remain. * SORTED_LANGUAGES (new) contains the locales in preferred order for display and for forms (en-US, then the others in alphabetical order) Consistency checks for locale settings are in the kuma.core.tests.test_settings, avoiding runtime assertions. New functions kuma.core.i18n.get_django_languages and .get_kuma_languages, modeled after Django's get_languages(), return cached OrderedDicts of LANGUAGES, either with Django (en-us) or Kuma (en-US) language codes. These are efficient for both set (lang_code in get_kuma_languages()) and list (for lang_code in get_kuma_languages()) operations, and are now used in a few places. https://github.com/mozilla/kuma/commit/66779ca2068a1218ceae98c343b192baf3acd395 bug 1461350: Test that language cookie wins Update test_locale_middleware_language_cookie to include an Accept-Language header, which is overridden by the cookie. https://github.com/mozilla/kuma/commit/ebf080ca01ae6bc4868024ff15da82a9f031266b bug 1461350: XFail tests against /(fr,es)/Firefox Previously, a no-locale prefix would get a redirect, whether or not the next link was a 404. Now, the LocaleMiddleware will 404 if the destination with a locale prefix is not valid. This affects the tests for /fr/Firefox and /es/Firefox, which are not yet in the sample database. https://github.com/mozilla/kuma/commit/ccb2bdce7a4f491cbea2c98c9a74f1bdb0804109 Merge pull request #4790 from jwhitlock/locale-middleware-1461350 bug 1461350: Reimplement locale-prefixed URLs to mirror Django
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/c3ff9777234f22107cdd1b00b247564f92884f16 bug 1461350: Remove extra reverse parameters Remove extra parameters from Kuma's reverse call: * prefix - No one used this * force_locale - Forced locale prefix in URL, now default * unprefixed - Forced no locale prefix in URL, now default for non-prefixed URLs. https://github.com/mozilla/kuma/commit/e09e3368582dea30428406763aecbba44e1574bf Merge pull request #4794 from jwhitlock/reverse-legacy-params-1461350 bug 1461350: Remove extra reverse parameters
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/ae9dd8a8dbeab38bbdfcc0f55083c5c90c9f43ce bug 1461350: Remove LocalizingClient, Mixin With the Django-based LocaleMiddleware, all URLs get a locale prefix by default, so the LocalizingClient and the LocalizingMixin no longer needed to ensure locale-prefixed URLs are used in tests. https://github.com/mozilla/kuma/commit/c4de4b153e6dd843452074ea4c97aa8ab71fd878 Merge pull request #4793 from jwhitlock/test-client-1461350 bug 1461350: Simplify test client
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/cf6a694729e7591a51fba9725dce77866d05afd8 bug 1461350: fix headless redirect test * fixes test_legacy_urls test for /cn/docs/Talk:Kakurady https://github.com/mozilla/kuma/commit/44f505a0131e60e8da9b34698edf81429b9db934 Merge pull request #4795 from escattone/fix-headless-redirect-test-1461350 bug 1461350: fix headless redirect test
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/3b6adc6de8b6bf9e38f1c839f8899f9d8c64e0e1 bug 1461350: Default to locale=en-US in tests When using Kuma's reverse, don't pass locale="en-US" (common in tests to avoid a redirect) or locale=request.LANGUAGE_CODE (used in app code to avoid a redirect), but instead rely on the default behavior of including the active language. Continue passing the locale parameter for non-default locales, or if the test makes a request that sets the active language (usually by making a request to a non-default locale). The active language is reset to en-US at the start of each test with an always-run fixture set_default_language. https://github.com/mozilla/kuma/commit/c2858aa3c5bc66af48c90e22488772c39679a21f Merge pull request #4797 from jwhitlock/reverse-default-en-US-1461350 bug 1461350: Default to locale=en-US in tests
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/bca9f0eebe6ec36ec79bd1ec421c51d21d4d0c47 bug 1461350: Add slashes in SlashMiddleware Convert RemoveSlashMiddleware to look at cases where adding a slash would also make a valid URL, such as /en-US or /admin. The CommonMiddleware would do this work, except that is_valid_url always returns True due to the catch-all view mindtouch_to_kuma_redirect. It is easier to customize CommonMiddleware in Django 1.9 and later, and eliminating mindtouch_to_kuma_redirect is also an option for future work. https://github.com/mozilla/kuma/commit/6d895d985f327537d617692e19029da20ffa5e2b Merge pull request #4801 from jwhitlock/slash-middleware-1461350 bug 1461350: Add slashes in SlashMiddleware
Commits pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/ac3c6e5526d95ac382f9b480c25b74e2c50ffcf6 bug 1461350: Update URLs for attachment host Update the URLs for the attachment host to ensure that code samples and file attachments are processed with locale prefixes. This adds a test that confirms that a code sample can be viewed with ENABLED_RESTRICTIONS_BY_HOST. This test fails before this change. Other tests are split up so they test a single configuration. The urls cache is set when the RestrictedEndpointsMiddleware overrides the urlconf. Django's TestCase resets this cache, but pytest doesn't unless mark.urls is used. mark.urls is set to the default urlconf, the test sets it to the restricted urlconf, and pytest-django cleans it up. https://github.com/mozilla/kuma/commit/e87cd6edf6b7ca49c386343eff40fb65585aafe1 Merge pull request #4808 from jwhitlock/attachment-host-1461350 bug 1461350: Update URLs for attachment host
Depends on: 1463428
This has been in staging and production for about a week. I investigated additional trace-backs over the past week, and they do not appear to be related to this change. Calling it fixed.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/kuma https://github.com/mozilla/kuma/commit/6e3dd7bc13af777e2b4bbf773ed393c62ade8373 bug 1461350: Update path for 404 link The 404 message in debug mode now includes the locale.
See Also: → 1214431
Product: developer.mozilla.org → developer.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.