Closed
Bug 902504
Opened 12 years ago
Closed 12 years ago
Clean test suite run fails on mysql 5.6
Categories
(Webtools Graveyard :: Elmo, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
3.3
People
(Reporter: peterbe, Assigned: peterbe)
Details
Attachments
(4 files, 5 obsolete files)
When using MySQL 5.6 it's impossible to run::
FORCE_DB=true python manage.py test --noinput
You get the traceback in attachment.
Assignee | ||
Comment 1•12 years ago
|
||
Asking for feedback here because I'm stumped. Without this patch I can't run the test suite because otherwise I get errors such as::
FATAL ERROR - The following SQL query failed: CREATE INDEX `l10nstats_unchangedinfile_6e62a245` ON `l10nstats_unchangedinfile` (`file`);
The error was: (1071, 'Specified key was too long; max key length is 767 bytes')
What are the limitations about indexes on varchars?
Also, we're running it with UTF8 so, is VARCHAR(400) actually 800 which is greater than 767? Characters vs. bytes is still hard for me.
Attachment #786995 -
Flags: feedback?(scabral)
Attachment #786995 -
Flags: feedback?(l10n)
Comment 2•12 years ago
|
||
InnoDB has the limit that a max key length is 767 bytes (MyISAM doesn't have that limit). UTF8 actually is either 3 or 4 bytes, depending on the character, so VARCHAR(400) has a max possible value of 1200 bytes.
Comment 3•12 years ago
|
||
FWIW, it's the index value that has that limit in InnoDB, not the data value. So you can have a varchar(400) field in UTF8 no problem, the data size can be that large. It's making an index that's a problem.
Assignee | ||
Comment 4•12 years ago
|
||
Considering that we (Mozilla and IT as a whole) are going to upgrade to MySQL 5.6 at some point in the nearish future; what else can we do?
I'm now entirely blocked on being able to run the tests on my system.
What say you Axel?
:sheeri,
What's your opinion about having indexes on large fields?
Comment 5•12 years ago
|
||
This is weird. The indexes are fine, actually. They're not perfect, but OK.
./manage.py syncdb --migrate work, for example.
That's the code path that ./manage.py test ends up in for the test db creation, with one difference, it's specifying
'load_initial_data': False
in options in south's migrate command. Not that I understand how that's making a difference.
Comment 6•12 years ago
|
||
Comment on attachment 786995 [details] [diff] [review]
Now I can run the whole test suite in mysql 5.6
Review of attachment 786995 [details] [diff] [review]:
-----------------------------------------------------------------
As I mentioned in the bug, I don't think this is necessary. MySQL docs say that short indexes are only bad if the entries have a unique constraint.
Also things work fine when trying to create the db outside of the test code path.
Attachment #786995 -
Flags: feedback?(l10n) → feedback-
Comment 7•12 years ago
|
||
I have no problem running the test suite locally.
$ mysql --version
mysql Ver 14.14 Distrib 5.5.32, for debian-linux-gnu (x86_64) using readline 6.2
$ ./manage.py test
/home/adrian/mozilla/workspace/elmo/vendor/src/funfactory/funfactory/manage.py:47: UserWarning: You're using an old-style Playdoh layout with a top level __init__.py and apps directories. This is error prone and fights the Zen of Python. See http://playdoh.readthedocs.org/en/latest/getting-started/upgrading.html
warnings.warn("You're using an old-style Playdoh layout with a top "
/home/adrian/mozilla/workspace/elmo/vendor-local/lib/python/pytz/__init__.py:35: UserWarning: Module pytz was already imported from /home/adrian/mozilla/workspace/elmo/vendor-local/lib/python/pytz/__init__.pyc, but /home/adrian/mozilla/workspace/elmo/env/lib/python2.7/site-packages is being added to sys.path
from pkg_resources import resource_stream
nosetests --verbosity 1
Creating test database for alias 'default'...
...................................................................................................................................................................................
----------------------------------------------------------------------
Ran 179 tests in 38.492s
OK
This was achieved from a non-existing test database. The same happens if I force the database creation:
$ FORCE_DB=1 ./manage.py test
/home/adrian/mozilla/workspace/elmo/vendor/src/funfactory/funfactory/manage.py:47: UserWarning: You're using an old-style Playdoh layout with a top level __init__.py and apps directories. This is error prone and fights the Zen of Python. See http://playdoh.readthedocs.org/en/latest/getting-started/upgrading.html
warnings.warn("You're using an old-style Playdoh layout with a top "
/home/adrian/mozilla/workspace/elmo/vendor-local/lib/python/pytz/__init__.py:35: UserWarning: Module pytz was already imported from /home/adrian/mozilla/workspace/elmo/vendor-local/lib/python/pytz/__init__.pyc, but /home/adrian/mozilla/workspace/elmo/env/lib/python2.7/site-packages is being added to sys.path
from pkg_resources import resource_stream
nosetests --verbosity 1
Creating test database for alias 'default'...
Got an error creating the test database: (1007, "Can't create database 'test_elmo'; database exists")
Type 'yes' if you would like to try deleting the test database 'test_elmo', or 'no' to cancel: yes
Destroying old test database 'default'...
...................................................................................................................................................................................
----------------------------------------------------------------------
Ran 179 tests in 54.305s
OK
Assignee | ||
Comment 8•12 years ago
|
||
So, it's not just a problem with tests.
Off of "develop" branch, I can't create the tables.
I created a brand new database called "elmo_tmp", updated settings/local.py accordingly and ran the syncdb && migrate. Failed.
Assignee | ||
Comment 9•12 years ago
|
||
I think that what's really suspect is that when I run `FORCE_DB=1 ./manage.py test --nonput` it eventually barfs on one of the index creations.
BUT, 50 of the 75 tables are created. Just not with the indexes.
mysql> explain mbdb_file;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| path | varchar(400) | NO | | NULL | |
+-------+--------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> CREATE INDEX `mbdb_file_6a8a34e2` ON `mbdb_file` (`path`);
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
I guess, this is the first index creation attempt once the table has been created and this error is breaking things.
:sheeri, can you confirm that it's possible or not possible to create indexes on columns even though they don't have a uniqueness constraint on them?
Comment 10•12 years ago
|
||
Indexes can exist on columns that are non-unique (and they should! they're useful!).
If you want indexes on those columns, you need a prefix index, like
CREATE INDEX `mbdb_file_6a8a34e2` ON `mbdb_file` (`path`(190));
That will create an index on the first 190 characters of "path". (I chose 190, because 190 * 4 = 760, so it's close but doesn't exceed the 767 limit. If you are mostly using 3-byte UTF8 characters, you can use 255 characters....which is, I believe, why they picked 767 as the limit in the first place :D). But if you want to play it safe and it's enough characters for you, 190 chars is what you want.
Assignee | ||
Comment 11•12 years ago
|
||
That command works.
Considering that the above SQL I posted is taken from out of the Django ORM (using the South library) I really don't know how we can inject that extra limit on the length through python.
Comment 12•12 years ago
|
||
It looks like django doesn't actually give you that ability. How frustrating. Everyone just says "oh just hash the long text and index the hash". What a pain.
However, in MySQL 5.6, according to https://dev.mysql.com/doc/refman/5.6/en/innodb-restrictions.html:
"When the innodb_large_prefix configuration option is enabled, this length limit is raised to 3072 bytes, for InnoDB tables that use the DYNAMIC and COMPRESSED row formats."
innodb_large_prefix is off by default but is dynamic (see https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix) - there are some other settings we have to make sure are in place.
If this is for l10n, I believe that's on its own server, so these changes aren't a problem.
Comment 13•12 years ago
|
||
I wonder if we're hitting https://kb.askmonty.org/en/mariadb-versus-mysql-compatibility/#incompatibilities-between-mariadb-52-and-mysql-51:
New SQL_MODE value was added: IGNORE_BAD_TABLE_OPTIONS. If it is not set, using a table, field, or index attribute (option) that is not supported by the chosen storage engine will cause an error. This change might cause warnings in the error log about incorrectly defined tables from the mysql database, fix that with mysql_upgrade.
Comment 14•12 years ago
|
||
The language of that is kind of confusing, but the SQL_MODE overrides the error for a "bad table option". We did not set that on the server side, and I'd be surprised if it was set on the client side (by you).
Updated•12 years ago
|
Attachment #786995 -
Flags: review+
Updated•12 years ago
|
Attachment #786995 -
Flags: feedback?(scabral)
Comment 15•12 years ago
|
||
I got back to this, and I managed to hack django to work, but it's not pretty.
I'm also wondering if the indexes actually work well to begin with, given that most file have a common starting pattern. Like just files starting with 'layout/' are about 20k. I get the feeling that in this particular case, doing an index on the actual hash might actually be what we want. It might also make the stunts we need to do about trailing spaces in filenames have less impact still.
And yes, I did watch Sheeri's indexing talk today. You can tell by the way I spelt indices. ;-)
Comment 16•12 years ago
|
||
<3 re: the indexing talk. Hope it was helpful! (I hate it too, I took Latin in high school, but my college job in a library trained me into using "indexes" instead of "indices")
If you're only doing exact matches, then an index of a hash will work well.
Comment 17•12 years ago
|
||
Comment on attachment 786995 [details] [diff] [review]
Now I can run the whole test suite in mysql 5.6
Review of attachment 786995 [details] [diff] [review]:
-----------------------------------------------------------------
OK, based on the status quo, this is good to go.
I'd like to have a follow-up to create the hash columns and index and use those.
Also, we need a patch for life, still, which has a mock File object that mirrors mbdb.
Also, should the migration for life have the conditional if mbdb isn't on? Probably, as we may at some point actually drop that.
Attachment #786995 -
Flags: feedback- → feedback+
Assignee | ||
Comment 18•12 years ago
|
||
:Pike, so are you OK with me merging this patch? Sheeri gave it the r+ and you the f+ but I'm not sure if your f+ is on the r+ so to say.
Comment 19•12 years ago
|
||
> Also, we need a patch for life, still, which has a mock File object that
> mirrors mbdb.
>
> Also, should the migration for life have the conditional if mbdb isn't on?
> Probably, as we may at some point actually drop that.
At least the first issue needs to be addressed. Probably the latter as well.
Assignee | ||
Comment 20•12 years ago
|
||
Attachment #786995 -
Attachment is obsolete: true
Attachment #801651 -
Flags: review?(l10n)
Assignee | ||
Comment 21•12 years ago
|
||
oops. last diff was done too quickly. wrong commit.
Attachment #801651 -
Attachment is obsolete: true
Attachment #801651 -
Flags: review?(l10n)
Attachment #801801 -
Flags: review?(l10n)
Comment 22•12 years ago
|
||
Comment on attachment 801801 [details] [diff] [review]
bug902504-3.diff
Review of attachment 801801 [details] [diff] [review]:
-----------------------------------------------------------------
Sadly, the life copy of the model migration needs more work. r-.
::: apps/life/migrations/0001_initial.py
@@ +29,5 @@
> + ('id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
> + ('path', self.gf('django.db.models.fields.CharField')(max_length=400)),
> + ))
> + db.send_create_signal('life', ['File'])
> +
This needs to be in a conditional, and the table is mbdb_file, even if constructed through the life/models.py.
Attachment #801801 -
Flags: review?(l10n) → review-
Assignee | ||
Comment 23•12 years ago
|
||
Attachment #801801 -
Attachment is obsolete: true
Attachment #802560 -
Flags: review?(l10n)
Comment 24•12 years ago
|
||
Comment on attachment 802560 [details] [diff] [review]
bug902504-4.diff
Review of attachment 802560 [details] [diff] [review]:
-----------------------------------------------------------------
This didn't address the review comments in full, r-.
::: apps/life/migrations/0001_initial.py
@@ +26,5 @@
> db.send_create_signal('life', ['Branch'])
>
> + if 'mbdb' not in settings.INSTALLED_APPS:
> + # Adding model 'File'
> + db.create_table('life_file', (
no.
@@ +30,5 @@
> + db.create_table('life_file', (
> + ('id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
> + ('path', self.gf('django.db.models.fields.CharField')(max_length=400)),
> + ))
> + db.send_create_signal('life', ['File'])
probably neither. Don't know the signals, really.
Attachment #802560 -
Flags: review?(l10n) → review-
Assignee | ||
Comment 25•12 years ago
|
||
Sorry for my poor understanding. I really don't understand what you mean then.
We already have `db.create_table('mbdb_file' ...` in apps/mbdb/migrations/0001_initial.py
Sorry, but can you try to explain it in some other way that this tired swede can understand?
Comment 26•12 years ago
|
||
Comment on attachment 802560 [details] [diff] [review]
bug902504-4.diff
Review of attachment 802560 [details] [diff] [review]:
-----------------------------------------------------------------
::: apps/life/migrations/0001_initial.py
@@ +26,5 @@
> db.send_create_signal('life', ['Branch'])
>
> + if 'mbdb' not in settings.INSTALLED_APPS:
> + # Adding model 'File'
> + db.create_table('life_file', (
the table name is mbdb_file, see models.py...
@@ +30,5 @@
> + db.create_table('life_file', (
> + ('id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
> + ('path', self.gf('django.db.models.fields.CharField')(max_length=400)),
> + ))
> + db.send_create_signal('life', ['File'])
I suspect the sent signal is db.send_create_signal('mbdb', ['File'])
::: apps/life/models.py
@@ +64,2 @@
> class Meta:
> db_table = 'mbdb_file'
We explicitly specify the table name here, and the rest of life migration etc needs to be consistent.
Assignee | ||
Comment 27•12 years ago
|
||
That's what I had! ...in an earlier patch. At least that's what I think I had. Perhaps I had that in my head but made the wrong diff or something.
Anyway, that explains a lot of the confusion.
Again, sorry for not comprehending.
Attachment #802560 -
Attachment is obsolete: true
Attachment #803824 -
Flags: review?(l10n)
Comment 28•12 years ago
|
||
Sorry, I slowly learn about South, and this conditional model is whack.
I just change my mind as we go, too.
Let's remove that mock model in life/models.py, instead of trying to make it work. If we drop mbdb, we should properly migrate that model and data into life, I think.
For reference, I just went ahead and made south help us out with what had been the right life initial migration. Just corrupted my settings/base.py to not include a whole lot, and then had south generate the initial migration.
That gives the right drops, and the file models, but those need to be ifdef'ed still.
But in the end, as you can see, I just reconsidered while typing this comment, I think our models should reflect what we're doing, and not try to be generic and reusable.
I expect that's gonna help us when it comes down to actually migrating off of mbdb.
Comment 29•12 years ago
|
||
Comment on attachment 803824 [details] [diff] [review]
bug902504-5.diff
Review of attachment 803824 [details] [diff] [review]:
-----------------------------------------------------------------
So sorry, but based on my thinking in the previous comment and change of mind ...
::: apps/life/migrations/0001_initial.py
@@ +224,4 @@
> }
> }
>
> + complete_apps = ['life']
let's not touch the life migration at all ...
::: apps/life/models.py
@@ +64,4 @@
> class Meta:
> db_table = 'mbdb_file'
>
> + path = models.CharField(max_length=400)
.. and remove the mock here completely.
Attachment #803824 -
Flags: review?(l10n) → review-
Assignee | ||
Comment 30•12 years ago
|
||
I hope this is the right one now.
Attachment #803824 -
Attachment is obsolete: true
Attachment #804495 -
Flags: review?(l10n)
Comment 31•12 years ago
|
||
Comment on attachment 804495 [details] [diff] [review]
bug902504-6.diff
Review of attachment 804495 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with a nit, there are still no-op changes to life's migration file. Thanks for the patience.
::: apps/life/migrations/0001_initial.py
@@ +2,4 @@
> import datetime
> from south.db import db
> from south.v2 import SchemaMigration
> +from django.conf import settings
We actually don't want this anymore.
@@ +216,4 @@
> }
> }
>
> + complete_apps = ['life']
... and the newline isn't important enough to touch the file at all.
Attachment #804495 -
Flags: review?(l10n) → review+
Comment 32•12 years ago
|
||
Commit pushed to develop at https://github.com/mozilla/elmo
https://github.com/mozilla/elmo/commit/88ecb9f1a5482479ec266ec54476c22ccc99f627
fixes bug 902504 - Removed indexes on charfields that are max_length=400, r=Pike
Updated•12 years ago
|
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•12 years ago
|
Target Milestone: --- → 3.3
Updated•11 years ago
|
Assignee: nobody → peterbe
Updated•4 years ago
|
Product: Webtools → Webtools Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•