Closed Bug 72184 Opened 23 years ago Closed 22 years ago

Check for SQL size limit before inserting comment into database

Categories

Bugzilla :: Creating/Changing Bugs, defect, P1
Version: 2.11

Tracking

Status: RESOLVED FIXED
Target Milestone: Bugzilla 2.16

People

Reporter: roland.mainz
Assignee: myk

Attachments

2 files, 1 obsolete file

I suspect this was an accidental early posting of what became bug 72185.

*** This bug has been marked as a duplicate of 72185 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE
Yes, this was an accident - I quoted the full response from BugZilla (which was
too large, too - it seems that form submission in general has problems with large
incoming forms; it's not only a problem of the attachment CGI). Maybe we should
rename this bug to "BugZilla creates Bug entries for _failed_ submissions"... !?
What the heck happened here?  How did a mark dupe become the first comment?  And
how did its notification happen to be marked as "New"?
When this bug was submitted, it didn't have a comment.  See bug 71659.
> What the heck happened here?  How did a mark dupe become the first comment?
Just guessing: submission of the main body failed (too large, see bug 72185), so
the variable wasn't "filled" and therefore the "first" comment became the "dup"
entry... =:-)
OK, maybe it isn't the other side of bug #71659.  We need to find out for sure
and fix whatever caused this.
There's also a performance problem with
lounge.mozilla.org (= bugzilla.mozilla.org) - it took ~20 minutes until the error
message was displayed (I filed bug 72187 to make Mozilla5 more user-friendly in
those cases :-)...
Is this caused by this bug (BugZilla code), by the "lounge" config, by the MySQL
database, or by something else?

BTW/OT: Which OS does "lounge" run? Who's the administrator of this machine?
The ~20 min problem is unavoidable.  Your browser must upload the entire form to
the web server before the script starts processing it.
Are you sure that a form upload (HTTP POST) is 40 times slower than an HTTP GET??
I cannot believe this... ;-(
Roland, I'm not sure I understand: how did you get a really long body?  Did you
somehow attach something to your first comment, or did you just do a really long
paste?
I (accidentally) tried to submit the full BugZilla response (which I saved to
disk; the *.txt is HTML-->plaintext using lynx -dump *.html):
-- snip --
% ls -l createattachment_cgi_failed_to_send_postscript.*
-rw-r-----   1 gisburn  staff    1196590 Mar 16 14:31
createattachment_cgi_failed_to_send_postscript.html
-rw-r-----   1 gisburn  staff    1308989 Mar 16 14:36
createattachment_cgi_failed_to_send_postscript.txt
% cat createattachment_cgi_failed_to_send_postscript.txt | wc -l
   32382
-- snip --
i.e. the form body contained 32382 lines... <language type="comic">BUM</language>
=:-)
Reopening and morphing this bug as Roland suggested.

I think what happened here is that process_bug.cgi died because of the
``die "$str: " . $::db->errstr;'' in sub SendSQL (globals.pl line 174), so no
initial comment was attached and no email was sent out.

But since probably not many people try to attach such huge comments, I suggest
not holding 2.12 for this.
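
(For illustration, a minimal sketch of the failure mode described above. Only
the quoted die line comes from globals.pl; the rest of the SendSQL body, the
variable names, and the INSERT itself are approximations based on the longdescs
columns mentioned later in this bug.)

-- snip --
# Rough approximation of sub SendSQL in globals.pl: it dies on any
# MySQL error instead of returning a failure code to the caller.
sub SendSQL {
    my ($str) = @_;
    $::currentquery = $::db->query($str)
        || die "$str: " . $::db->errstr;
}

# If the comment is larger than the server's packet limit, an INSERT
# like this one dies, so the bug record created just before it is left
# with no initial comment and no mail gets sent.
SendSQL("INSERT INTO longdescs (bug_id, who, bug_when, thetext) " .
        "VALUES ($bugid, $whoid, now(), " . SqlQuote($comment) . ")");
-- snip --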
Status: RESOLVED → REOPENED
OS: Solaris → All
Hardware: Sun → All
Resolution: DUPLICATE → ---
Summary: BugZilla fails to create large attachments... → Check for SQL size limit before inserting comment into database
Personally, I don't think we should release any version with any known issues of
database corruption.
Target Milestone: --- → Bugzilla 2.16
See also bug #42165.
Priority: -- → P1
-> New Bugzilla Product
Assignee: tara → myk
Status: REOPENED → NEW
Component: Bugzilla → Creating/Changing Bugs
Product: Webtools → Bugzilla
Version: Bugzilla 2.11 → 2.11
Reading the comments on this bug, I'm still confused by what the heck is going on
here...  anyone have a clue?  Do we just need to make sure the comment isn't as
big as the max packet size for MySQL?
potential database corruption, upgrading to blocker for 2.16
Severity: normal → blocker
According to bug 42165, Apache's default internal limit is 8K for a URL. Adding
in a check is fine (you'd need to do it in both process_bug and attachment.cgi).

8K seems really small, though. Adding the limit is simple - would 100K be a
reasonable default? Maybe that state storage should be using POST instead....

myk, what is the maximum size of a longdesc in bmo? What is the largest size of a
reasonable comment in bmo?
We submit comments using POST, rather than GET, so the 8k URL limit doesn't come
into play. Isn't that right?

Gerv
That's right.  POSTs can be of an arbitrary size as far as a web browser and
server are concerned.  MySQL, however, only accepts field values smaller than
MAX_PACKET_SIZE.

The longdescs.thetext column is of type MEDIUMTEXT, which accepts values up to
16MB in size.  This is clearly unreasonably long.  We should change this to a
column of type TEXT and reject comments longer than 64K, that column's maximum
size, with a message instructing the user to post the comment as an attachment
instead.
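
(As an aside - the ceiling referred to above is the MySQL server's
max_allowed_packet setting; assuming a MySQL of this era, it can be inspected
from the same mysql client used for the query below:)

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';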

FYI, these are the largest 30 comments on b.m.o.:

mysql> select bug_id, length(thetext) from longdescs order by length(thetext)
desc limit 30;
+--------+-----------------+
| bug_id | length(thetext) |
+--------+-----------------+
|  31810 |          351011 |
|  64624 |          302118 |
|  86232 |          278083 |
| 121383 |          256004 |
|  98007 |          197748 |
|  57272 |          161698 |
|  17997 |          132062 |
|  63501 |          125324 |
|  77677 |          122521 |
|  76735 |           97994 |
| 100532 |           91165 |
|  25669 |           90200 |
|  90494 |           81992 |
| 103358 |           81170 |
|  93818 |           71592 |
|  64806 |           67732 |
|  61812 |           67442 |
|  64375 |           63649 |
|  64376 |           63649 |
|  59108 |           62720 |
| 124983 |           61880 |
|  88766 |           60978 |
|  93246 |           60294 |
|  97930 |           57585 |
|  56902 |           56150 |
| 126000 |           56080 |
| 114703 |           55737 |
|  84947 |           55208 |
|  85773 |           54558 |
|  16402 |           54111 |
+--------+-----------------+
30 rows in set (28.87 sec)
You'd have to manually poke the db to split up those other comments first, since
I imagine that the alter table would fail. Or we could just make it a param, set
to 64K. Most of those comments are too long anyway - you could make them a
text/plain attachment, and replace the comment with some boilerplate.

Do we want that for 2.16? Or do we just want the error, with the intention of
changing it later?
*** Bug 94548 has been marked as a duplicate of this bug. ***
This patch is the low-risk fix.  It just throws an error when a user tries to
enter a comment greater than 64K in length.
This higher-risk patch converts large comments to attachments and then changes
the longdescs.thetext field from MEDIUMTEXT to TEXT type.
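
(This is not the code in attachment 72600 - just a sketch of the two steps it
describes. SendSQL and SqlQuote are the globals.pl helpers already mentioned in
this bug; MoreSQLData/FetchSQLData are assumed to be their usual companions, and
the boilerplate text, the attachment-creation step, and the NOT NULL clause are
assumptions.)

-- snip --
# Step 1: move each oversized comment into an attachment, then replace
# the comment text with a short boilerplate note.
SendSQL("SELECT bug_id, who, bug_when FROM longdescs " .
        "WHERE LENGTH(thetext) > 65535");
my @oversized;
push(@oversized, [FetchSQLData()]) while MoreSQLData();

foreach my $row (@oversized) {
    my ($bug_id, $who, $when) = @$row;
    # ... create a text/plain attachment holding the old text here ...
    SendSQL("UPDATE longdescs SET thetext = " .
            SqlQuote("(Comment too long; converted to an attachment.)") .
            " WHERE bug_id = $bug_id AND who = $who" .
            " AND bug_when = " . SqlQuote($when));
}

# Step 2: once no stored value exceeds 64K, shrink the column.
SendSQL("ALTER TABLE longdescs CHANGE thetext thetext TEXT NOT NULL");
-- snip --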
Comment on attachment 72598 [details] [diff] [review]
patch Av1: prevents users from entering too-large comments

you need to test attachment.cgi too.
Attachment #72598 - Flags: review-
Comment on attachment 72600 [details] [diff] [review]
patch Bv1: converts the field and large comments to attachments

Don't you need error messages, and the checks from the first patch? Also, what
if two comments were made at the same time? You'll replace both of them.
Attachment #72600 - Flags: review-
This patch moves the comment validation code to CGI.pl and calls it from
attachment.cgi (for both attachment creation and attachment modification) along
with post_bug.cgi and process_bug.cgi.
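
(A sketch of what a call site looks like; ValidateComment and its single
argument come from the patch quoted further down, but the form field name is an
assumption - attachment 72650 is the real thing.)

-- snip --
# In post_bug.cgi / process_bug.cgi / attachment.cgi, before any SQL is
# sent: bail out early if the comment is too large.
ValidateComment($::FORM{'comment'});
-- snip --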
Attachment #72598 - Attachment is obsolete: true
>Don't you need error messages, and the checks from the first patch?

Yes, but my goal here is to get the first patch checked in for 2.16, then drop
this bug from the list of release blockers for 2.16 and get the higher risk
stuff in during the 2.18 timeframe.

>Also, what
>if two comments were made at the same time? You'll replace both of them.

The bug-ID, user, and date columns combined form the primary key for the
longdescs table, so no comment will ever have the same combination of those
fields (all of which I match when updating records in that table).

OK. I'll look at this later. BTW, what about SqlQuote? That doesn't change the
size when it's in the db, does it?

Also, they are not a primary key - longdescs doesn't have any unique indexes,
let alone a primary key. Maybe match on length, too? Or check that you only get
one result returned (ie do another select)?
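
(A sketch of that double-check, assuming the same three-column match described
above; FetchOneColumn is assumed to be the usual globals.pl helper, and the
variable names are illustrative.)

-- snip --
# Only rewrite the comment if exactly one longdescs row matches the
# (bug_id, who, bug_when) triple; otherwise leave it alone.
SendSQL("SELECT COUNT(*) FROM longdescs WHERE bug_id = $bug_id" .
        " AND who = $who AND bug_when = " . SqlQuote($when));
if (FetchOneColumn() == 1) {
    # safe to convert this comment to an attachment
}
-- snip --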
>OK. I'll look at this later. BTW, what about SqlQuote? That doesn't change the
>size when it's in the db, does it?

No, SqlQuote (sic) merely escapes meta-characters; it doesn't change the value
of the string itself.

>Also, they are not a primary key - longdescs doesn't have any unique indexes,
>let alone a primary key. Maybe match on length, too? Or check that you only get
>one result returned (ie do another select)?

Mathematically, the likelihood of this happening at all is really low, but I'm
not sure matching on length will significantly reduce it, since duplicates are
likely to have duplicate content as well.  We could match on content, but that
would take a lot of memory.

On b.m.o. only 17 of over 1.2 million comments are over 64K, and there are only
about 1500 duplicates.

Notes for the future: We probably need to check for and delete duplicate entries
in the longdescs table and then give the table a primary key so the problem
doesn't recur.  Something like the following query should work:

SELECT DISTINCT bug_id, who, bug_when FROM longdescs GROUP BY bug_id, who,
bug_when, thetext HAVING COUNT(*) > 1;
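
(For the second half of that note - assuming the duplicates have already been
removed, the schema fix would be something along these lines; the exact key
definition is an assumption, not something decided in this bug.)

-- snip --
# Give longdescs a real primary key so duplicate comment rows cannot be
# inserted again.
SendSQL("ALTER TABLE longdescs ADD PRIMARY KEY (bug_id, who, bug_when)");
-- snip --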

This is, nevertheless, not a blocker for 2.16.  The only blocker for 2.16 is the
error/data corruption that occurs because we don't enforce a maximum comment
size.  Setting the patch and review keywords.  For now, please review patch Av2
(attachment 72650 [details] [diff] [review]) so we can get it in for 2.16.

Keywords: patch, review
No, because then you'll renumber the comments. Since this should cause a midair
(with the exception of attachment comments) it's probably not worth it.

I'll look at the patch now.
Comment on attachment 72650 [details] [diff] [review]
patch Av2: also checks comments submitted via attachment.cgi

r=bbaetz. I think that's everywhere we allow comments.
Attachment #72650 - Flags: review+
Comment on attachment 72650 [details] [diff] [review]
patch Av2: also checks comments submitted via attachment.cgi

>+sub ValidateComment {
>+    # Make sure a comment is not too large (greater than 64K).
>+    
>+    my ($comment) = @_;
>+    
>+    if (defined($comment) && length($comment) > 65535) {
>+        DisplayError("Comments cannot be longer than 65,535 characters.");
>+        exit;

Should we not just say "64k" here? Also, do we want to check against 65535, or
some slightly lower figure, for reasons of safety? Are we totally certain the
comment doesn't get further munging which might change it?

Other than that, r=gerv.

Gerv
Attachment #72650 - Flags: review+
>Should we not just say "64k" here?

Maybe, but considering that only 17 of more than 1.2 million comments on b.m.o.
are bigger than this, I'm not going to worry too much about the phrasing here.

>Also, do we want to check against 65535, or some slightly lower figure, for
>reasons of safety? Are we totally certain the comment doesn't get further
>munging which might change it?

There are few, if any, things in life I am totally certain about, and given the
state of the MySQL documentation this is certainly not one of them. :-) 
Nevertheless, the documentation doesn't mention any munging, and no other
database I have ever worked with munges stored values, so I'm confident we're
safe.

Note also that the patch does not change the field type from MEDIUMTEXT to TEXT,
so the MySQL allowance of 16 million characters for this field remains.  It is
only in Bugzilla that we are restricting the number of characters we will accept.

Also, take a look at bug 130011 for an idea of where we should be going with this.
Checking in CGI.pl;
/cvsroot/mozilla/webtools/bugzilla/CGI.pl,v  <--  CGI.pl
new revision: 1.135; previous revision: 1.134
done
Checking in post_bug.cgi;
/cvsroot/mozilla/webtools/bugzilla/post_bug.cgi,v  <--  post_bug.cgi
new revision: 1.43; previous revision: 1.42
done
Checking in process_bug.cgi;
/cvsroot/mozilla/webtools/bugzilla/process_bug.cgi,v  <--  process_bug.cgi
new revision: 1.117; previous revision: 1.116
done
Checking in attachment.cgi;
/cvsroot/mozilla/webtools/bugzilla/attachment.cgi,v  <--  attachment.cgi
new revision: 1.13; previous revision: 1.12
done
Status: NEW → RESOLVED
Closed: 23 years ago → 22 years ago
Resolution: --- → FIXED
QA Contact: matty_is_a_geek → default-qa