Non-standard percent-encoding in URLs, produced by the "Add a Keyword for this Search..." context menu button

UNCONFIRMED
Unassigned

Status

UNCONFIRMED
3 years ago
9 months ago

People

(Reporter: alvarbanov, Unassigned)

Tracking

38 Branch
Points:
---

Firefox Tracking Flags

(Not tracked)

Description

3 years ago
The "Add a Keyword for this Search..." function builds a URL template with a query string representing an HTML form.

A problem arises when the relevant markup contains characters beyond U+007F: instead of being percent-encoded in the format defined in RFC 3986 (http://tools.ietf.org/html/rfc3986), they are encoded in a non-standard format, using "%XX" and "%uXXXX" sequences (where each X is a hex digit).

Specifically, characters from U+00A0 to U+00FF are encoded as "%XX" (where XX are the last two hex digits of the code point). Characters beyond that are encoded as UTF-16 code units, each unit written as four hex digits prefixed by "%u".
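
To make the behavior described above easier to reason about, here is a minimal Python sketch that models it. It is not Firefox's actual code: the function name nonstandard_escape is hypothetical, ASCII characters are passed through unchanged for simplicity, and the sketch assumes that U+0080 to U+009F are treated the same way as U+00A0 to U+00FF, which the report does not state.

```python
def nonstandard_escape(text: str) -> str:
    """Model the "%XX" / "%uXXXX" encoding described in this report.

    Hypothetical helper for illustration only, not the browser's code.
    """
    out = []
    # Walk the string as UTF-16 code units, so characters beyond U+FFFF
    # show up as two units (a surrogate pair).
    data = text.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        if unit < 0x80:
            # Simplification: ASCII is copied through untouched here.
            out.append(chr(unit))
        elif unit <= 0xFF:
            # Assumption: U+0080..U+009F behave like U+00A0..U+00FF.
            out.append(f"%{unit:02X}")
        else:
            # Everything else becomes "%u" plus four hex digits per unit.
            out.append(f"%u{unit:04X}")
    return "".join(out)


if __name__ == "__main__":
    for sample in ["\u00e9", "\u2713", "\U0001F600"]:
        print(repr(sample), "->", nonstandard_escape(sample))
```

Under these assumptions, a character outside the Basic Multilingual Plane yields two "%u" sequences, one per surrogate, which is what "each unit prefixed by %u" implies.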

---

For an example, see the search bar on http://jisho.org/. The form has a hidden input element, named "utf8", with value "✓" (U+2713, ✓, CHECK MARK), which gets encoded as "%u2713", instead of the correct "%E2%9C%93".
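
The expected value in the example above can be checked with Python's standard library, which percent-encodes the UTF-8 bytes of a character. This is only an illustration of the expected output; the variable names are made up for the sketch.

```python
from urllib.parse import quote

check_mark = "\u2713"  # CHECK MARK, the value of the hidden "utf8" input

# The standard form percent-encodes each byte of the UTF-8 representation.
utf8_bytes = check_mark.encode("utf-8")
encoded = quote(check_mark)

# Query-string fragment for the hidden input in the example.
fragment = f"utf8={encoded}"

if __name__ == "__main__":
    print(utf8_bytes.hex(" "))
    print(fragment)
```

The three bytes of the UTF-8 form (E2, 9C, 93) map one-to-one onto the three "%XX" groups in "%E2%9C%93".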

Updated

3 years ago
Severity: minor → normal
Component: Bookmarks & History → Search

Comment 1

9 months ago
This is also breaking "Add a Keyword for this Search..." on GitHub.

The ✓ utf8-forcing URL convention is being used by lots of websites due to its adoption in Ruby on Rails.