Closed Bug 1173692 Opened 9 years ago Closed 7 years ago

postgres 9.4 blows up on \u0000 in JSON

Categories

(Socorro :: Backend, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: rhelmer, Unassigned)

Details

I've noticed that occasionally the AWS RDS Postgres instance is raising errors when it encounters certain unicode:

DataError: unsupported Unicode escape sequence
DETAIL:  \u0000 cannot be converted to text.

This is a relatively new behavior in Postgres, they accept \u0000 into JSON but any JSON-related DB function will blow up if it's present :/

The quick workaround to find+remove these is:

DELETE FROM raw_crashes WHERE raw_crash::text like '%\u0000%'::text

However, this requires manual intervention potentially delaying reports - we should make the processors filter these out.
Hah, Rob, that's exactly the Postgres error I said I ran into with trying a JSON query against raw_crashes (which I don't do in my scripts as it used to blow up with unicode errors in PHX aas well, but not the \u0000 one).
the bad character can be in a variety of fields: UserComments, Email, JavaStackTrace, ...

here's the list of crash_ids from this week that have the faulty character:

 cacbbd6f-d6eb-462f-939f-c4bcc2150608
 ce105a94-9221-4ba8-b021-319992150608
 e388ba5b-2eec-4c1e-b7c7-923102150608
 5e5390a7-4b98-4c8c-92a8-612c52150608
 c92b9734-b387-4dc3-a5ff-8fadc2150608
 d866ff33-27bd-436e-be63-efa0e2150608
 2c833f7d-6e47-4558-ad81-51bf32150608
 5abc533e-b1c9-4959-b05c-444832150608
 9bf44d8f-c19e-4b7b-b366-9d36a2150608
 e4d4c4bb-96f1-4889-b8bf-d2fdb2150608
 a5b76c3a-1223-468c-aca7-375e32150609
 ce3bbd2d-a86a-4b2c-983e-4f3a82150609
 8ee21ef2-377d-43ef-b63f-9822b2150610
 eec5dc4f-f0d4-42aa-989a-947792150610
 12263c89-2ad0-491c-bac9-eb9822150610
 063da7b0-5c45-4097-a59d-85b8a2150610
Not blocking AWS move, but let's make sure to fix it since it requires manual intervention to fix right now.
No longer blocks: 1118288
(In reply to Robert Helmer [:rhelmer] from comment #3)
> Not blocking AWS move, but let's make sure to fix it since it requires
> manual intervention to fix right now.

I have to manually fix this every few days to keep reports rolling. I started a branch which I think has the right fix, but the getting the tests right is surprisingly tricky.
Over to lars.
Assignee: nobody → lars
Assignee: lars → nobody
on a different version of postgres
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.