Closed Bug 153129 Opened 22 years ago Closed 17 years ago

Bugzilla uses "mediumtext" as a DB data type when it's not necessary

Categories

(Bugzilla :: Database, defect)

2.17
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 3.2

People

(Reporter: bbaetz, Assigned: xiaoou.wu)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 8 obsolete files)

mediumtext may be up to 2^24-1 characters, or just under 16 megs (justdaves comment in bug 96431 about this is incorrect) The following fields may be up to 16 megs in length: - The description of an attachment - The mimetype for an attachment - The filename for an attachment (These are all longer than the size for the attachment, see below) - The description for an attachment status - the short_desc for a bug (bug 96431) - the status whiteboard for a bug - the keyword cache for a bug - text for a comment (this is probably ok) - description for a component. - description for a product - disabledtext for a user - email flags for a user (covered under bug 73665) - stored named query - description for fielddefs - description for a keyword - command for a shadowlog (ignore this) - An attachment is longblob, or 4 Gigs. The mysql client/server protocol limits packet sizes to 16 megs, and the default is 1 meg. This makes longblob totally useless, and mediumtext semi-useless for most things (the actual attachment/comments being an exception, since those have separate limits) We need to have limits checked in the code for all this stuff, and then change the schema.
What are the recommended MySQL types for text strings of various lengths? Gerv
VARCHAR(n) where n is the maximum size of the string
Also, Sybase won't let you ORDER BY its equivalent to mediumtext, because you can't index blobs (and its closest equivalent to mediumtext is a blob type), which means I had to make schema changes to get ordering to work properly on group descriptions.
Blocks: bz-sybase
This is what I'm thinking as of right now for new field lenghts using varchar(): # Table Field New Max Length my @fields = ( ['attachments', 'description', 128], ['attachments', 'mimetype', 64], ['bugs', 'short_desc', 255], ['bugs', 'status_whiteboard', 255], ['bugs', 'keywords', 255], ['components', 'descritpion', 255], ['products', 'description', 255], ['profiles', 'disabledtext', 512], ['namedqueries', 'query', 255], ['fielddefs', 'description', 32], ['keyworddefs', 'description', 255] ); Does that seem sensible to everyone?
Assignee: justdave → jake
what about groups.description?
I was just using Bradley's list which is limited to those specifically using mediumtext. Should we fix all 'text' types?
mysql won't let you do ORDER BY either, really - it just pretends with the first 1024 bytes (changable via a server config option) Why have you used 255, but then 512? We should check the fencpost for an extra byte on teh various dbs, and then use that.
Attached patch Patch v1 (obsolete) — Splinter Review
For that matter, why use "255" instead of "250"? Seems like the logical requirements of the fields should dictate their size, not the technical capabilities of the database.
Blocks: 7233
reassigning to default owner/QA... Jake's Army Reserve unit has been deployed.
Assignee: jake → justdave
Reassigning bugs that I'm not actively working on to the default component owner in order to try to make some sanity out of my personal buglist. This doesn't mean the bug isn't being dealt with, just that I'm not the one doing it. If you are dealing with this bug, please assign it to yourself.
Assignee: justdave → general
QA Contact: mattyt-bugzilla → default-qa
OS: Linux → All
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.22
FWIW, I'd like to split this up into multiple bugs, one for each field that needs to be changed. Or at least, one for each table containing fields that need to be changed.
Blocks: 310718
Target Milestone: Bugzilla 2.22 → Bugzilla 2.24
This bug is retargetted to Bugzilla 3.2 for one of the following reasons: - it has no assignee (except the default one) - we don't expect someone to fix it in the next two weeks (i.e. before we freeze the trunk to prepare Bugzilla 3.0 RC1) - it's not a blocker If you are working on this bug and you think you will be able to submit a patch in the next two weeks, retarget this bug to 3.0. If this bug is something you would like to see implemented in 3.0 but you are not a developer or you don't think you will be able to fix this bug yourself in the next two weeks, please *do not* retarget this bug. If you think this bug should absolutely be fixed before we release 3.0, either ask on IRC or use the "blocking3.0 flag".
Target Milestone: Bugzilla 3.0 → Bugzilla 3.2
Is it possible for us to break the MEDIUMTEXT into two types, one's length > 4000, and the other's length < 4000 ? So we can use varchar(4000) instead of LOBs(which need a special way to do with in Oracle) in some case when the fields not really need a max length of 65535. Oracle doens't support: 1. Using a CLOB column in a SELECT... DISTINCT or SELECT... UNIQUE statement. For example: line #446 in Bugzilla::User " $query = 'SELECT DISTINCT id, name, description FROM groups';" 2. Comparing CLOB with String. For example: line #647 in Bugzilla::Field $dbh->selectrow_array('SELECT id, name FROM fielddefs WHERE description = ?', undef, $field_description); 3 inserting stuffs as normal varchar.
(In reply to comment #14) > Is it possible for us to break the MEDIUMTEXT into two types, one's length > > 4000, and the other's length < 4000 ? Yeah, I think this is a good idea. I think we should have only two text types, one called TEXT and the other called LONGTEXT. TEXT can be 4000 characters, and LONGTEXT can be many MB. Instead of TINYTEXT, we should probably just use varchar(255) directly. We don't gain much in Bugzilla from using TINYTEXT. We'll have to fix the schema generally, for this. Also, for converting fields to the "new" TEXT types, we should file one bug at a time, for each field. (Unless there are several that it makes sense to group together.) Also, we should up the SCHEMA_VERSION to 2.1 after we do this, because we'll have to convert old schemas when we load them. (That's done by deserialize_schema.)
Attached patch Bugzilla::DB::Schema patch (obsolete) — Splinter Review
Change type TINYTEXT to varchar(255), MEDIUMTEXT to TEXT ( or LONGTEXT for those require a length>4000).
Attachment #287478 - Flags: review?(mkanat)
Attachment #107319 - Attachment is obsolete: true
Assignee: general → xiaoou.wu
Component: Bugzilla-General → Database
Summary: mediumtext abuse → Bugzilla uses "mediumtext" as a DB data type when it's not necessary
Comment on attachment 287478 [details] [diff] [review] Bugzilla::DB::Schema patch I think MEDIUMTEXT is actually an OK name for the 4000-character string. So we should keep that. But anything longer than 4000 characters should be converted to LONGTEXT. This means that you have to add the LONGTEXT type to the Pg and MySQL drivers. Every change will require a modification to Bugzilla::Install::DB. Before changing the type you'll have to check that nothing in the DB currently exceeds the length, for the fields that you actually change the length of. (For example, if you change a MEDIUMTEXT to a varchar(255).) If you make a change in the underlying data types for MySQL or Pg, then you may have to add some code to bz_setup_database in the Pg and MySQL drivers, as required. >@@ -372,8 +372,8 @@ > PRIMARYKEY => 1}, > bug_id => {TYPE => 'INT3', NOTNULL => 1}, > creation_ts => {TYPE => 'DATETIME', NOTNULL => 1}, >- description => {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, >- mimetype => {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, >+ description => {TYPE => 'TEXT', NOTNULL => 1}, This should just be varchar(255). There's no reason to have an attachment description longer than 255 characters. >+ mimetype => {TYPE => 'TEXT', NOTNULL => 1}, This should be varchar(128). I can't imagine a mimetype longer than that. >@@ -525,7 +525,7 @@ > DEFAULT => FIELD_TYPE_UNKNOWN}, > custom => {TYPE => 'BOOLEAN', NOTNULL => 1, > DEFAULT => 'FALSE'}, >- description => {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, >+ description => {TYPE => 'TEXT', NOTNULL => 1}, That can be varchar(128). >@@ -779,7 +779,7 @@ > COLUMN => 'userid', > DELETE => 'CASCADE'}}, > name => {TYPE => 'varchar(64)', NOTNULL => 1}, >- query => {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, >+ query => {TYPE => 'TEXT', NOTNULL => 1}, There *is* a chance of that being longer than 4000 characters. So I guess that will have to be LONGTEXT. >@@ -1065,7 +1065,7 @@ > name => {TYPE => 'varchar(64)', NOTNULL => 1}, > frequency => {TYPE => 'INT2', NOTNULL => 1}, > last_viewed => {TYPE => 'DATETIME'}, >- query => {TYPE => 'MEDIUMTEXT', NOTNULL => 1}, >+ query => {TYPE => 'TEXT', NOTNULL => 1}, Same there. >@@ -1155,7 +1155,7 @@ > COLUMN => 'userid', > DELETE => 'CASCADE'}}, > subject => {TYPE => 'varchar(128)'}, >- body => {TYPE => 'MEDIUMTEXT'}, >+ body => {TYPE => 'TEXT'}, And I think that should be LONGTEXT too.
Attachment #287478 - Flags: review?(mkanat) → review-
Comment on attachment 287478 [details] [diff] [review] Bugzilla::DB::Schema patch Oh, and you also need to update the POD in Schema.pm about the data types, to note that MEDIUMTEXT is a maximum of 4000 characters on the most limited DB, and LONGTEXT is up to 16MB.
Attached patch mediumtext patch (obsolete) — Splinter Review
Attachment #287954 - Flags: review?(mkanat)
Attachment #287954 - Attachment is obsolete: true
Attachment #287954 - Flags: review?(mkanat)
Attached patch mediumtext patch (obsolete) — Splinter Review
Attachment #287957 - Flags: review?(mkanat)
Comment on attachment 287957 [details] [diff] [review] mediumtext patch >@@ -1748,7 +1749,7 @@ >- push(@statements, "ALTER TABLE $table ADD COLUMN $column " . >+ push(@statements, "ALTER TABLE $table". ADD_COLUMN ." $column " . This has nothing to do with this patch. Now, you have to add code to Bugzilla::Install::DB that modifies all of these fields, and makes sure that their content isn't longer than their new limit. If the content is longer than the new limit, you are allowed to die and inform the administrator.
Attachment #287957 - Flags: review?(mkanat) → review-
Blocks: 310717
Attached patch mediumtext.diff (obsolete) — Splinter Review
Attachment #290812 - Flags: review?(mkanat)
Comment on attachment 290812 [details] [diff] [review] mediumtext.diff >+ # 2007-11-29 xiaoou.wu@oracle.com - Bug 153129 >+ $dbh->bz_alter_column('bugs', 'bug_file_loc', >+ { TYPE => 'MEDIUMTEXT'}, ''); > [snip] Okay, I want this all to go into a subroutine at the bottom. It can be called _change_text_types. *All* fields *must* be checked *before* being modified. So, the easiest thing to do is to make a hash of all the new fields and call _check_content_length_before_upgrade. The length needs to be the length that's used by the DB, not some number you type in inside of Bugzilla::Install::DB. Finally, you don't need the fourth argument to bz_alter_column. You only need that when you're setting something NOT NULL that used to be NULL. >+sub _check_length_of_content { >+ my ($table_name, $field_name, $length) = @_; >+ my $dbh = Bugzilla->dbh; >+ my $content_lengths = $dbh->selectcol_arrayref("SELECT LENGTH($field_name) >+ FROM $table_name"); >+ foreach my $content_length ( @$content_lengths ){ >+ if ($content_length > $length) { >+ die("The content of field <$field_name> in table <$table_name>\n" >+ . "is longer than its new limit, <$length> characters.\n" >+ . "Please make sure that it would not happen, or the upgrade" >+ . " will never go on."); You need to show the content that is longer. Show every single string that's too long. If they're longer than 80 characters, cut them off with a "..." *in the middle*, so both the beginning and end show.>--- ../cvs/bugzilla/checksetup.pl 2007-10-19 14:46:10.000000000 +0800 >+Bugzilla::Install::DB::check_content_length_before_upgrade(); No, this should just be a part of Install::DB. I don't want checksetup to check this every time it runs, so wrap it in an "if" like other schema changes in Bugzilla::Install::DB. If you read the file, how we do this will be very obvious.
Attachment #290812 - Flags: review?(mkanat) → review-
(In reply to comment #23) How can I get the size of specific type like TINYTEXT, MEDIUMTEXT, LONGTEXT ? And I think we can just only check those fields that are changed shorter, do you think so?
(In reply to comment #24) > (In reply to comment #23) > How can I get the size of specific type like TINYTEXT, MEDIUMTEXT, LONGTEXT ? You'll have to figure that out. > And I think we can just only check those fields that are changed shorter, do > you think so? Yes, we only need to check the ones that we change to a shorter type. Since in PostgreSQL they're all the same size, we only need to check their sizes in MySQL. So you could just check their sizes in Bugzilla::DB::MySQL::bz_setup_database, if you want. Only do it if their type is the old type, though.
Status: NEW → ASSIGNED
Attached patch mediumtext patch (obsolete) — Splinter Review
Attachment #291368 - Flags: review?(mkanat)
Comment on attachment 291368 [details] [diff] [review] mediumtext patch >diff -ru ../cvs/bugzilla/Bugzilla/Install/DB.pm Bugzilla/Install/DB.pm >@@ -522,6 +522,10 @@ > Bugzilla::Hook::process('install-update_db'); > > $dbh->bz_setup_foreign_keys(); >+ # Check length of the field content before upgrade. >+ # 2007-11-29 xiaoou.wu@oracle.com - Bug 153129 >+ check_content_length_before_upgrade(); No, add it above the --TABLE-- part, just like the very large, obvious comment says to. Also, call it change_text_types() instead. >@@ -2898,6 +2902,72 @@ >+sub check_content_length_before_upgrade { >+ my $dbh = Bugzilla->dbh; >+ return if $dbh->isa('Bugzilla::DB::Oracle'); No. Instead, do: return if $dbh->bz_column_info('longdescs', 'thetext')->{TYPE} eq 'MEDIUMTEXT'; >diff -ru ../cvs/bugzilla/Bugzilla/Constants.pm Bugzilla/Constants.pm >+ MAX_TINYTEXT_SIZE >+ MAX_MEDIUMTEXT_SIZE >+ MAX_LONGTEXT_SIZE These constants should be DB-specific instead. They can have defaults in Bugzilla::DB, and then they can be overridden in DB drivers as necessary. For example, MySQL has a very long MEDIUMTEXT length.
Attachment #291368 - Flags: review?(mkanat) → review-
Attachment #287957 - Attachment is obsolete: true
Attachment #290812 - Attachment is obsolete: true
Attachment #287478 - Attachment is obsolete: true
Attached patch mediumtext patch (obsolete) — Splinter Review
Attachment #291368 - Attachment is obsolete: true
Attachment #291577 - Flags: review?(mkanat)
Comment on attachment 291577 [details] [diff] [review] mediumtext patch >diff -ru ../cvs/bugzilla/Bugzilla/Install/DB.pm Bugzilla/Install/DB.pm >@@ -514,6 +514,9 @@ > > # 2007-08-21 wurblzap@gmail.com - Bug 365378 > _make_lang_setting_dynamic(); >+ >+ # 2007-11-29 xiaoou.wu@oracle.com - Bug 153129 >+ change_text_types(); > > ################################################################ > # New --TABLE-- changes should go *** A B O V E *** this point # >@@ -2898,6 +2901,72 @@ > } > } > >+sub change_text_types { >+ my $dbh = Bugzilla->dbh; >+ return if $dbh->bz_column_info('longdescs', 'thetext')->{TYPE} eq >+ 'MEDIUMTEXT'; Oh, my mistake. I realized that you change this above, so this will always be true, at this point. You'll have to pick something else, like checking the type of attachments.description or something. >+WARNING: Some of your fields have content longer than its new limit $max_length > [snip] This string needs to be localizeable, so it should be using install_string from Bugzilla::Install::Util. >+ }else { Nit: } else { >diff -ru ../cvs/bugzilla/Bugzilla/DB.pm Bugzilla/DB.pm >+use constant MAX_TINYTEXT_SIZE => 255; I just realized, we only need this one, right? So that's my mistake, you can just put this directly as a number into Bugzilla::Install::DB, like you had it originally, and you don't need to check against the type.
Attachment #291577 - Flags: review?(mkanat) → review-
(> >+sub change_text_types { > >+ my $dbh = Bugzilla->dbh; > >+ return if $dbh->bz_column_info('longdescs', 'thetext')->{TYPE} eq > >+ 'MEDIUMTEXT'; > Oh, my mistake. I realized that you change this above, so this will always be > true, at this point. You'll have to pick something else, like checking the type > of attachments.description or something. Do you want it do nothing to current MySQL or Pg? if not, I think it should be something like this: return if $dbh->bz_column_info('attachments', 'description')->{TYPE} eq 'TINYTEXT'; or just check if it is not Oracle.
Attached patch mediumtext patch (obsolete) — Splinter Review
Attachment #291577 - Attachment is obsolete: true
Attachment #292515 - Flags: review?(mkanat)
Attached patch v7 (or so)Splinter Review
Okay, I took your patch, modified it slightly, and this is what I'm going to check in. Thank you.
Attachment #292515 - Attachment is obsolete: true
Attachment #292527 - Flags: review+
Attachment #292515 - Flags: review?(mkanat)
Checking in Bugzilla/DB/Schema.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Schema.pm,v <-- Schema.pm new revision: 1.91; previous revision: 1.90 done Checking in Bugzilla/DB/Schema/Mysql.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Schema/Mysql.pm,v <-- Mysql.pm new revision: 1.18; previous revision: 1.17 done Checking in Bugzilla/DB/Schema/Pg.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Schema/Pg.pm,v <-- Pg.pm new revision: 1.15; previous revision: 1.14 done Checking in Bugzilla/Install/DB.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Install/DB.pm,v <-- DB.pm new revision: 1.45; previous revision: 1.44 done Checking in template/en/default/setup/strings.txt.pl; /cvsroot/mozilla/webtools/bugzilla/template/en/default/setup/strings.txt.pl,v <-- strings.txt.pl new revision: 1.5; previous revision: 1.4 done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Flags: approval+
Resolution: --- → FIXED
Comment on attachment 292527 [details] [diff] [review] v7 (or so) >Index: Bugzilla/DB/Schema.pm >@@ -469,7 +469,7 @@ >- description => {TYPE => 'TEXT'}, >+ description => {TYPE => 'MEDIUMTEXT'}, >Index: Bugzilla/Install/DB.pm >+sub change_text_types { >+ $dbh->bz_alter_column('flagtypes', 'description', >+ { TYPE => 'MEDIUMTEXT', NOTNULL => 1 }); This patch is broken: new installations have NOT NULL => 0 while upgrades get NOT NULL => 1. This makes tinderbox to fail. I will file a bug about it.
Blocks: 409463
Blocks: 824701
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: