Closed
Bug 948344
Opened 11 years ago
Closed 8 years ago
Non-UTF8 database error: Unicode escape for code points higher than U+007F not permitted in non-UTF8 encoding
Categories
(Socorro :: General, task)
Socorro
General
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: pancallo, Unassigned)
Details
User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0 (Beta/Release)
Build ID: 20131112160018
Steps to reproduce:
I have setup socorro like this :
http://planeshift.top-ix.org/pswiki/index.php?title=Socorro_Installation
Actual results:
2013-12-10 09:50:01,510 INFO - MainThread - LUCADEBUG About to call update_tcbs_build [datetime.date(2013, 12, 7)]
2013-12-10 09:50:01,609 DEBUG - MainThread - error when running <class 'socorro.cron.jobs.matviews.TCBSBuildCronApp'> on None
Traceback (most recent call last):
File "/data/socorro/application/socorro/cron/crontabber.py", line 703, in _run_one
for last_success in self._run_job(job_class, config, info):
File "/data/socorro/application/socorro/cron/base.py", line 174, in main
function(when)
File "/data/socorro/application/socorro/cron/base.py", line 213, in _run_proxy
self.run(connection, date)
File "/data/socorro/application/socorro/cron/jobs/matviews.py", line 52, in run
self.run_proc(connection, [target_date])
File "/data/socorro/application/socorro/cron/jobs/matviews.py", line 22, in run_proc
cursor.callproc(self.get_proc_name(), signature)
DataError: invalid input syntax for type json
DETAIL: Unicode escape for code points higher than U+007F not permitted in non-UTF8 encoding
CONTEXT: JSON data, line 1: ..."20131117000000", "Version": "0.6.0", "Renderer":...
SQL statement "INSERT INTO tcbs_build (
signature_id, build_date,
report_date, product_version_id,
process_type, release_channel,
report_count, win_count, mac_count, lin_count, hang_count,
startup_count
, is_gc_count
)
WITH raw_crash_filtered AS (
SELECT
uuid
, json_object_field_text(r.raw_crash, 'IsGarbageCollecting') as is_garbage_collecting
FROM
raw_crashes r
WHERE
date_processed::date = updateday
)
SELECT signature_id, build_date(build),
updateday, product_version_id,
process_type, release_channel,
count(*),
sum(case when os_name = 'Windows' THEN 1 else 0 END),
sum(case when os_name = 'Mac OS X' THEN 1 else 0 END),
sum(case when os_name = 'Linux' THEN 1 else 0 END),
count(hang_id),
sum(case when uptime < INTERVAL '1 minute' THEN 1 else 0 END)
, sum(CASE WHEN r.is_garbage_collecting = '1' THEN 1 ELSE 0 END) as gc_count
FROM reports_clean
JOIN product_versions USING (product_version_id)
JOIN products USING ( product_name )
LEFT JOIN raw_crash_filtered r ON r.uuid::text = reports_clean.uuid
WHERE utc_day_is(date_processed, updateday)
AND tstz_between(date_processed, build_date, sunset_date)
-- 7 days of builds only
--AND updateday <= ( build_date(build) + 6 )
-- only nightly, aurora, and rapid beta
AND reports_clean.release_channel IN ( 'nightly','aurora','release' )
AND version_matches_channel(version_string, release_channel)
GROUP BY signature_id, build_date(build), product_version_id,
process_type, release_channel"
PL/pgSQL function update_tcbs_build(date,boolean,interval) line 32 at SQL statement
Expected results:
No error! :)
This is what I did to convert the database to UTF8. It worked so far.
You can see the encoding with this command:
> psql -l
If you see SQL_ASCII then it's wrong.
You then have to convert your 'breakpad' database to UTF-8 (others can stay ascii):
> /etc/init.d/supervisor stop
> su - postgres
> pg_dump breakpad -U planeshift > db.sql
> psql
# ALTER DATABASE breakpad RENAME TO breakpad_ascii
# CREATE DATABASE breakpad WITH OWNER = breakpad_rw TEMPLATE = template0 ENCODING = 'UTF-8';
# GRANT ALL ON DATABASE breakpad to socorro;
# GRANT ALL ON DATABASE breakpad to breakpad_rw; (unsure about this one, but I did it anyway)
# GRANT ALL ON DATABASE breakpad to planeshift; (unsure about this one, but I did it anyway)
> iconv --from-code ISO8859-1 --to-code UTF8 db.sql > db_utf8.sql
Edit db_utf8.sql and change this line to UTF-8:
SET client_encoding = 'UTF-8';
> psql -U planeshift -d breakpad
# \i db_utf8.sql
> /etc/init.d/supervisor start
Comment 2•11 years ago
|
||
(In reply to pancallo from comment #1)
> This is what I did to convert the database to UTF8. It worked so far.
Thanks so much for the detailed fix.
Sorry about the issue with UTF8. I will update our docs to specify "-E UTF8" for all database creation.
Assignee: nobody → sdeckelmann
Summary: Unicode escape for code points higher than U+007F not permitted in non-UTF8 encoding → Non-UTF8 database error: Unicode escape for code points higher than U+007F not permitted in non-UTF8 encoding
Updated•11 years ago
|
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Updated•8 years ago
|
Assignee: sdeckelmann → nobody
Status: ASSIGNED → NEW
Comment 3•8 years ago
|
||
We're no longer doing Top Crashers with SQL.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•