stop using cgi.FieldStorage
Categories
(Socorro :: Antenna, defect, P2)
Tracking
(Not tracked)
People
(Reporter: willkg, Assigned: willkg)
Details
Attachments
(3 files)
Antenna currently uses cgi.FieldStorage
for parsing multipart/form-data payloads. I used this because it's what the previous collector did and I didn't want to change that aspect of crash report collection without understanding it way better.
However, the cgi module may get removed in a future release of Python:
https://www.python.org/dev/peps/pep-0594/
Further, the FieldStorage code is interesting. We've had a couple of issues with some of the things it's doing the most recent being that it's not just parsing the HTTP payload, but also looking at querystring params.
We literally just need to parse the HTTP payload. The only thing we need to take into account is the content-length.
This bug covers looking at alternatives.
Assignee | ||
Comment 1•5 years ago
|
||
Here's a braindump of things:
-
Falcon currently suggests to use
cgi.FieldStorage
, but says they're going to add this at some point and there's been some work along those lines: -
Python's stdlib also has an email module which we could use. That might be better/safer. It's here:
https://docs.python.org/3/library/email.parser.html#email.parser.BytesParser
-
Flask has built-in support for file uploading, however it uses a tempdir and I'm unenthused to do that (plus we'd need to switch frameworks).
-
There's a "multipart" Python library, but it needs some maintenance tlc.
I think we have time to figure this out. I'm inclined to go with whatever Falcon does.
Assignee | ||
Comment 2•4 years ago
|
||
Falcon will have a multipart/form-data handler in Falcon 3.0.0. When that comes out, we can see whether it works for our situation. Possibly this year. https://github.com/falconry/falcon/milestone/33
Assignee | ||
Comment 3•4 years ago
|
||
Falcon 3.0.0 is out and has multipart/form-data handling: https://falcon.readthedocs.io/en/stable/api/multipart.html
Grabbing this to work on soon.
Assignee | ||
Comment 4•3 years ago
|
||
Bumping bugs off my queue because I'm not going to get to them any time soon.
Assignee | ||
Comment 5•3 years ago
|
||
Python 3.11 is deprecating the cgi module:
https://python.github.io/peps/pep-0594/#cgi
Grabbing this now to do. We've got a decent amount of test coverage for payloads, so I think this should be pretty straightforward.
Assignee | ||
Comment 6•3 years ago
|
||
Assignee | ||
Comment 7•3 years ago
|
||
willkg merged PR #792: "bug 1562641: rework payload handling" in a0fac7f.
After this deploys to stage, we should compare raw crashes between stage and prod to verify that this is working correctly.
Assignee | ||
Comment 8•3 years ago
|
||
Assignee | ||
Comment 9•3 years ago
|
||
We had an error on stage where it exceeded the maximum number of parts.
Falcon's multipart form handling has a default of 64 parts. There are 165 annotations defined in CrashAnnotations.yaml. I'm not sure offhand what the maximum number of parts we might see is. I'm inclined to just set it to 1,000 and never worry about it. There are other limitations in play, so I'm not concerned about this one.
Assignee | ||
Comment 10•3 years ago
•
|
||
I was comparing crash data when I bumped into a difference between prod and stage where collector_notes
sometimes has a value of "[]"
and sometimes has a value of []
depending on whether the "annotations" were in a JSON-encoded extra value or not.
I think that's a bug in the submitter. I wrote up bug #1757612 to fix that.
Once that's fixed and deployed, I can recheck the data between stage and prod for issues.
Assignee | ||
Comment 11•3 years ago
|
||
Assignee | ||
Comment 12•3 years ago
|
||
willkg merged PR #799: "bug 1562641: fix multipart handling" in c181181.
I need to recheck the data between stage and prod for issues, but I'm pretty sure it's good now.
Assignee | ||
Comment 13•3 years ago
|
||
I deployed this to prod in bug #1757737. Marking as FIXED.
Description
•