Closed
Bug 783786
Opened 12 years ago
Closed 12 years ago
PostgreSQL databases can be created with the wrong encoding
Categories
(Bugzilla :: Installation & Upgrading, defect)
Tracking
()
RESOLVED
FIXED
Bugzilla 4.2
People
(Reporter: LpSolit, Assigned: LpSolit)
Details
Attachments
(1 file)
825 bytes,
patch
|
dkl
:
review+
|
Details | Diff | Splinter Review |
While reviewing bug 187753, I wondered why PostgreSQL was crashing when inserting what was supposed to be short-enough strings. To make sure the patch wasn't the culprit, I tried to create a new product with a 64 characters long name, which is the max allowed for product names. PostgreSQL failed with: DBD::Pg::db do failed: ERROR: value too long for type character varying(64) [for Statement "INSERT INTO products(...)] When looking closer, I realized that all my DBs have the wrong encoding: List of Databases DB name | Owner | Encoding | Collation | Char Type | Access -----------+--------+-----------+--- -------+-----------+----------- bugs_36 | bugs | SQL_ASCII | C | C | bugs_cvs | bugs | SQL_ASCII | C | C | Per http://www.postgresql.org/docs/9.1/static/app-initdb.html, about the encoding: "The default is derived from the locale, or SQL_ASCII if that does not work." So for some reason, the locale couldn't be determine, and Pg falled back to SQL_ASCII. But http://www.postgresql.org/docs/9.1/static/sql-createdatabase.html says: "CREATE DATABASE will allow superusers to specify SQL_ASCII encoding regardless of the locale settings, but this choice is deprecated and may result in misbehavior of character-string functions" This is exactly what happened to me: bugs_cvs=> select quipid, length(quip), quip from quips where length(quip) < 50; quipid | length | quip --------+--------+---------------------- 5 | 16 | moi je suis mini 6 | 21 | moi aussi, très mini 8 | 6 | ààà Here, only the first length is correct, the other two are totally wrong due to letters outside the US-ASCII range. To fix this problem, databases must be created with ENCODING = 'UTF8': List of Databases DB name | Owner | Encoding | Collation | Char Type | Access -----------+--------+-----------+--- -------+-----------+----------- bugs_36 | bugs | SQL_ASCII | C | C | bugs_40 | bugs | UTF8 | C | C | bugs_cvs | bugs | SQL_ASCII | C | C | And in that case, string functions are working correctly again: bugs_40=> select length(name), char_length(name), name from products; length | char_length | name --------+-------------+------------------------------------------------------------------ 11 | 11 | TestProduct 64 | 64 | àààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààà 16 | 16 | télèphöne!@#¼¼½€ http://www.postgresql.org/docs/9.1/static/multibyte.html says that SQL_ASCII stores 1 byte per character, which is why it sees e.g. "à" as being 2 characters, as it needs two bytes to store it. So we shouldn't let checksetup.pl assume that the default encoding is UTF8 as this causes a lot of trouble.
Flags: blocking4.4+
Assignee | ||
Comment 1•12 years ago
|
||
Till now, Pg was using the default "CREATE DATABASE $name" with no arguments.
Attachment #653077 -
Flags: review?(dkl)
Assignee | ||
Updated•12 years ago
|
Comment 2•12 years ago
|
||
Comment on attachment 653077 [details] [diff] [review] patch, v1 Review of attachment 653077 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. Do not have PostgreSQL installed myself, so if you have tested it, r=dkl
Attachment #653077 -
Flags: review?(dkl) → review+
Assignee | ||
Updated•12 years ago
|
Flags: approval4.2+
Flags: approval+
Assignee | ||
Comment 3•12 years ago
|
||
Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/trunk/ modified Bugzilla/DB/Schema/Pg.pm Committed revision 8355. Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/4.2/ modified Bugzilla/DB/Schema/Pg.pm Committed revision 8121.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 4•12 years ago
|
||
I forgot to add it to the release notes for 4.2.4. This is now done: Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/4.2/ modified template/en/default/pages/release-notes.html.tmpl Committed revision 8163.
Keywords: relnote
Assignee | ||
Comment 5•12 years ago
|
||
Actually, it was already in the relnotes for 4.2.3. I simply forgot to remove the keyword. I reverted this last checkin.
You need to log in
before you can comment on or make changes to this bug.
Description
•