Closed Bug 285111 Opened 20 years ago Closed 20 years ago

Installation checks will not be reliable in a cross-database environment

Categories

(Bugzilla :: Installation & Upgrading, defect, P1)

2.19.2

Tracking


RESOLVED FIXED
Bugzilla 2.20

People

(Reporter: mkanat, Assigned: mkanat)

References

(Blocks 1 open bug)

Details

Basically, we want to store a serialized Schema in the database, and read/change it to do our installation checks. Below is the original email I sent on this subject to developers@, which describes both the problem and the proposed solution.

----------------------------------------------------------

I've come across an interesting theoretical problem today, while thinking about our new cross-platform installation. Basically, here's the problem:

We have a new Bugzilla::DB::Schema module coming, which is almost complete. (See <https://bugzilla.mozilla.org/show_bug.cgi?id=bz-dbschema>.) It maps "generic" types to database-specific types. It's very cool.

We, the developers, cannot really know how any given database will map a "generic" type to a database-specific type. There's just no way at this point to predict that for every database that we might implement. For example, for the PostgreSQL patch, all integers will probably be the same size, namely "integer."

Now, guess how checksetup works? It checks to see if fields have changed their definition! So let's say that we changed a field from a "small integer" to a "large integer," and then we wanted to trigger a change based on that. On MySQL, the triggered change code would run fine. On PostgreSQL, the triggered change would never run, because a "small integer" and a "large integer" are the SAME SIZE.

Now, what if we had a database system where all variable-length text fields had the same underlying implementation? That is, where varchar == tinytext == mediumtext, etc. We couldn't trigger a change on moving something from varchar to text.

Thankfully, this is not an urgent issue. We currently have no databases besides MySQL to upgrade. Even when we implement PostgreSQL support, we will only have to truly upgrade starting with 2.22, although our first 2.21 release may also have some requirements in that area.

I can think of a few choices that would be entirely future-proof:

(1) Store a version number somewhere in the database. When we upgrade, read that version number and run the correct upgrade code. Problem: this makes running checksetup against a CVS tree very risky, where before that was no problem.

(2) Store the entire current schema itself in the database. That way we'll always know what the "generic" type of a field is SUPPOSED to be, even if it's not possible to read that information back out of the database. Problem: that's a somewhat-decent amount of work.

Really, to keep our current functionality, I think that #2 is probably the best solution at this point. With Bugzilla::DB::Schema, this is somewhat easier. In fact, the easiest way would just be to store a Data::Dumper->Dump of the generic {schema} hash inside of Bugzilla::DB::Schema. However, that requires that the internal data structures of Bugzilla::DB::Schema either (1) always stay the same, or (2) always provide backwards compatibility.

Storing the schema in the database will also make my life in implementing cross-db installation a LOT easier. Right now, in order to get the definition of a table column in a cross-database fashion, I have to do a lot of hocus-pocus with DBI.

Now, any advice on how we can best accomplish all of this in a fully maintainable and future-proof way is very welcome. :-)

I think the structures of Bugzilla::DB::Schema (the next version, not quite this version) should be fine. We'll have to implement a serialize() and a deserialize() function, and when we serialize the data, we MUST store the version of the schema that's being stored along with it.
I think that we should use a special DB::Schema version, as opposed to the Bugzilla version itself, since we care about when the Schema internal format changes, not when Bugzilla itself changes. -Max
Now, although we don't need to worry about this TOO much before 2.21, it makes touching checksetup very risky. Also, right this very moment, I need to re-work the schema-checking functions to be cross-database-compatible. If we go this route, it will be much easier.
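To make the Data::Dumper idea concrete, here's a minimal round-trip sketch. The schema fragment and variable names are invented for the example, and a real serialize() would also store the Schema version alongside the text, as the email says:

    use strict;
    use warnings;
    use Data::Dumper;

    # An invented fragment in the "generic type" style.
    my $abstract_schema = {
        bugs => {
            FIELDS => [
                bug_id     => {TYPE => 'MEDIUMSERIAL', NOTNULL => 1},
                resolution => {TYPE => 'varchar(64)',  NOTNULL => 1},
            ],
        },
    };

    # Serialize: Data::Dumper output is itself Perl code, so it can
    # be stored as plain text in the database...
    $Data::Dumper::Sortkeys = 1;
    my $stored_text = Data::Dumper->Dump([$abstract_schema], ['schema']);

    # ...and later eval'ed back into a live data structure.
    my $restored = do {
        our $schema;       # the variable name given to Dump() above
        eval $stored_text;
        die "Could not thaw the stored schema: $@" if $@;
        $schema;
    };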
Assignee: zach → mkanat
Severity: normal → major
Target Milestone: --- → Bugzilla 2.20
Depends on: bz-dbschema
Depends on: 285113
Oh, for the record, I think that we should serialize it to a single row in a brand-new table, called bz_schema (because "schema" is likely to be a reserved word in some database SOMEwhere), which has only two columns:

    version char(4) not null
    schema  text    not null

The table will only have one row at any given time.
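As a minimal sketch of managing that single row (assuming an already-connected DBI handle and the exact column names proposed here; a database where "schema" really is reserved would need the column quoted or renamed):

    use strict;
    use warnings;

    # Read the one stored row: the version tag plus the serialized
    # schema text.
    sub read_stored_schema {
        my ($dbh) = @_;
        return $dbh->selectrow_array(
            'SELECT version, schema FROM bz_schema');
    }

    # Replace the one stored row. Delete-then-insert keeps the table
    # at exactly one row without database-specific UPSERT syntax.
    sub write_stored_schema {
        my ($dbh, $version, $serialized) = @_;
        $dbh->do('DELETE FROM bz_schema');
        $dbh->do('INSERT INTO bz_schema (version, schema) VALUES (?, ?)',
                 undef, $version, $serialized);
    }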
Priority: -- → P1
(In reply to comment #0)
> So let's say that we changed a field from a "small integer" to a "large
> integer," and then we wanted to trigger a change based on that. On MySQL,
> the triggered change code would run fine. On PostgreSQL, the triggered
> change would never run, because a "small integer" and a "large integer"
> are the SAME SIZE.

That seems entirely correct to me! There's nothing to change in that scenario.

> Now, what if we had a database system where all variable-length text
> fields had the same underlying implementation? That is, where varchar ==
> tinytext == mediumtext, etc. We couldn't trigger a change on moving
> something from varchar to text.

Sure you can. When you query the database, you get the actual data type that is implemented. If that's not the type you want it to be, then you change it. I still don't see the problem.
(In reply to comment #3)
> That seems entirely correct to me! There's nothing to change in that scenario.

Apparently I haven't been clear enough, since two people have asked me this at this point. Bugzilla development w.r.t. checksetup works like this:

(1) We upgrade the fashion in which Bugzilla works.

(2) We need to run some migration code on the actual data in the database as a result of this change.

(3) In order to determine whether or not we perform this change, we check the current definition of a field. This tells us whether we're running the "old" Bugzilla or the "new" Bugzilla. This is nice because it allows us to upgrade reliably between CVS versions.

This is NOT about changing the definitions of fields. It's about running migration code. Yes, if we had no migration code to run, there would be nothing to worry about! Everything would be fine. We trigger migration code off of field changes. These triggers must be reliable. The problem here is that our MIGRATION CODE TRIGGERS are not reliable.

> Sure you can. When you query the database, you get the actual data type that
> is implemented.

Actually, you don't always get even that. For example, all I can get out of PostgreSQL about a serial field is that it's an integer with a very strange-looking default. Playing with DBI's column_info function is NOT fun. (Believe me, I was doing it for six hours a day for several days while working on patches in that direction.)
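For reference, here's roughly what poking at DBI's column_info looks like; this is a generic DBI sketch (the table and column names are examples), not Bugzilla code:

    use strict;
    use warnings;

    # Ask the driver what it thinks a column is. The answer is whatever
    # the driver chooses to report: e.g. DBD::Pg describes a serial
    # column as an "integer" whose default is a strange-looking
    # nextval(...) expression.
    sub describe_column {
        my ($dbh, $table, $column) = @_;
        my $sth = $dbh->column_info(undef, undef, $table, $column);
        while (my $col = $sth->fetchrow_hashref) {
            my $type    = defined $col->{TYPE_NAME}  ? $col->{TYPE_NAME}  : '?';
            my $default = defined $col->{COLUMN_DEF} ? $col->{COLUMN_DEF} : 'NULL';
            print "$col->{COLUMN_NAME}: type=$type default=$default\n";
        }
    }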
At Tomas's request, I'm going to write a broad overview of how the system will work:

(1) We store a Schema object in the database. This contains a list of all the tables, columns, and indexes, in the "database-neutral" format. See $self->{schema} in Bugzilla::DB::Schema::new to see what this looks like.

(2) Then, later (in a future version), we're asked to make a column change, and we need to check if the column needs to be updated. Let's say that we're changing the resolution.value field from a varchar(64) to a varchar(255), and we also need to change the names of all the resolutions in the database.

(3) So, checksetup needs to check if the column is already a varchar(255). We implement a function Bugzilla::DB::bz_get_column_def which returns a definition for the current column. This looks like:

    { TYPE => 'VARCHAR(64)', NOTNULL => 1 }

We can see that the current TYPE is not the 'VARCHAR(255)' we want, so the change still needs to happen.

(4) So, we need to run the migration code. We update all of the resolutions in the bugs table to be some new, really long values. This is just a lot of standard SQL and some Perl.

(5) Once the migration code is successful (and only then), we need to change the type of the field. We have a function to do this, called Bugzilla::DB::bz_change_column. Here's how it works:

    (a) We read in the saved Schema from the database.
    (b) We ask the Schema, "Is the type of resolution.value equal to the new type that we want it to be?" That is, "Is it varchar(255) NOT NULL?"
    (c) Schema says, "No, it's not equal." (It's "varchar(64) NOT NULL.")
    (d) We say, "OK, then we need to change it to the new type!"
    (e) We ask the Schema object, "Can you generate an ALTER TABLE statement for this?" It generates the ALTER TABLE, and we run it, to modify the actual database.
    (f) We then update the Schema object with the new definition, so that what's in memory is the same as the actual structure of the database on disk.
    (g) We then save the in-memory Schema back to the database.

Note that because we do (a) and (g) EVERY TIME we call bz_change_column, there's pretty much no way for the database and the Schema to get out of sync. There's also a bz_rename_column which will work in a similar fashion.

Oh, and by the way, at the beginning of all of this there's one other step:

(0) At the very beginning of checksetup's database-migration code, we take a WRITE lock on the bz_schema table, so that only one instance of checksetup can modify it at a time.
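Putting steps (a) through (g) into code, here's a rough sketch of what bz_change_column could look like. Only the function name and the step sequence come from the comment above; the Schema method names (get_column, get_alter_ddl, set_column), the read/write helpers, and the handle slot are all invented for illustration:

    sub bz_change_column {
        my ($self, $table, $column, $new_def) = @_;
        my $dbh = $self->{dbh};    # hypothetical handle slot

        # (a) Read in the saved Schema from the database.
        my $schema = read_schema_from_db($dbh);               # hypothetical

        # (b)-(d) Compare the stored generic definition with the new
        # one; if they already match, there is nothing to do.
        my $current = $schema->get_column($table, $column);   # assumed
        return if defs_equal($current, $new_def);             # hypothetical

        # (e) Have the Schema generate database-specific ALTER TABLE
        # SQL, and run it against the real database.
        my @ddl = $schema->get_alter_ddl($table, $column, $new_def);
        $dbh->do($_) foreach @ddl;

        # (f) Update the in-memory Schema so that what's in memory
        # matches the actual structure of the database on disk.
        $schema->set_column($table, $column, $new_def);       # assumed

        # (g) Save the in-memory Schema back to the bz_schema table,
        # so the stored copy can never drift from the real database.
        write_schema_to_db($dbh, $schema);                    # hypothetical
    }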
Depends on: 285680
Depends on: 285713
Depends on: 285722
Depends on: 285723
Shouldn't the order of (4) and (5) be swapped? Update the schema, then insert the long values.
(In reply to comment #6)
> Shouldn't the order of (4) and (5) be swapped? Update the schema, then
> insert the long values.

Hmmm, actually, they need to be interleaved. You may need to do half of the change to the DB table, then run some migration code, and then finish the table changes. This happens when you change a column between two incompatible types. Imagine changing, e.g., operating systems (formerly an enum, now a table), which are related to the column in the bugs table by name, to be related by id instead. You need to create the column for the id, then go through all the records and convert names to ids, and then drop the old name column.
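A sketch of that interleaving, under heavy assumptions: bz_add_column and bz_drop_column are invented helpers in the spirit of bz_change_column, and the op_sys table layout is made up for the example:

    sub migrate_op_sys_to_ids {
        my ($self, $dbh) = @_;

        # First half of the schema change: add the new id column.
        $self->bz_add_column('bugs', 'op_sys_id', {TYPE => 'INT2'});

        # Migration code in the middle: convert names to ids.
        my $map = $dbh->selectall_arrayref('SELECT value, id FROM op_sys');
        foreach my $row (@$map) {
            my ($name, $id) = @$row;
            $dbh->do('UPDATE bugs SET op_sys_id = ? WHERE op_sys = ?',
                     undef, $id, $name);
        }

        # Second half of the schema change: drop the old name column.
        $self->bz_drop_column('bugs', 'op_sys');
    }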
Flags: blocking2.20+
(In reply to comment #7)
> (In reply to comment #6)
> > Shouldn't the order of (4) and (5) be swapped? Update the schema, then
> > insert the long values.
>
> Hmmm, actually, they need to be interleaved.

Yeah. The process basically just describes exactly how checksetup works now; it just explains what the new underlying implementation will be. What you want to do always depends on the migration code. The reason for generally changing the schema after the migration code is that if the migration code fails, the schema will still be unchanged, and the migration code will try to run again the next time you run checksetup. If you change the schema before the migration code, then if the migration code fails, it will never run again (because it's keyed off of the schema change).
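So the stored column definition doubles as the "migration already ran" marker. A sketch of the resulting pattern, reusing the resolution example from comment 5 (the helper names and the migration sub are assumptions, not real code):

    sub maybe_migrate_resolutions {
        my ($self, $dbh) = @_;

        # The old definition still being in place is the trigger.
        my $def = $self->bz_get_column_def('resolution', 'value');
        return unless $def && lc($def->{TYPE}) eq 'varchar(64)';

        # Migration code runs first. If it dies here, the stored
        # schema is untouched, so this whole block simply runs again
        # on the next checksetup.
        rename_all_resolutions($dbh);    # hypothetical migration sub

        # Only after the migration succeeds do we change the column,
        # which turns the trigger off for all future runs.
        $self->bz_change_column('resolution', 'value',
                                {TYPE => 'varchar(255)', NOTNULL => 1});
    }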
Depends on: 286527
Blocks: 285748
No longer blocks: 285748
Depends on: 285748
Depends on: 286689
Blocks: 286695
Just a thought: Regarding option #1 in the original bug description, this could be accomplished very easily if all commits that changed the database schema incremented a "build number" or "database version" number that was part of the overall version number during development. For example, instead of just 2.21, you could start with 2.21.001, 2.21.002, etc. As long as the version number always increased, this would remove the risk from #1 and make the upgrade process MUCH SIMPLER than option #2.
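For concreteness, here's roughly what that versioning approach would look like (this is the proposal, not what was implemented; the bz_db_version table and the sample steps are invented):

    sub upgrade_by_version {
        my ($dbh) = @_;

        # One entry per schema-changing commit; the list only grows,
        # and the numbers only ever increase.
        my @upgrades = (
            [1, sub { $dbh->do('ALTER TABLE bugs ADD COLUMN foo integer') }],
            [2, sub { $dbh->do('UPDATE bugs SET foo = 0') }],
        );

        my ($current) = $dbh->selectrow_array(
            'SELECT db_version FROM bz_db_version');

        # Run every step above the stored number, in order, recording
        # progress after each one.
        foreach my $step (@upgrades) {
            my ($version, $code) = @$step;
            next if $version <= $current;
            $code->();
            $dbh->do('UPDATE bz_db_version SET db_version = ?',
                     undef, $version);
        }
    }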
(In reply to comment #9)
> this could be accomplished very easily if all commits that changed the
> database schema incremented a "build number" or "database version" number

Yeah, I also considered doing something like that. The problem is that it would fundamentally alter the way that checksetup works, and it would be somewhat difficult to do in the kind of development environment we have with Bugzilla. (I'd become worried about maintenance of the version number, among other problems.) It would also then be somewhat difficult to re-order changes, as we sometimes have to do when we discover that we've made an error.

Really, you have to have a deep, guru-level understanding of checksetup's database-migration code to know what the exact problems would be; otherwise I would agree that it's a much simpler solution. Anyhow, I'm already pretty far into option 2, and it's a very robust solution that, from the developers' perspective, works the same way the old system worked.
Blocks: 288409
> Yeah, I also considered doing something like that. The problem is that it
> would fundamentally alter the way that checksetup works

That's OK; checksetup.pl's current algorithm isn't written in stone.

> it would be somewhat difficult to do in the kind of development environment
> we have with Bugzilla. (I'd become worried about maintenance of the version
> number, among other problems.)

If the version number increments every time there's a schema change, this approach would work just as well for development environments as it does for production environments.

> It would also then be somewhat difficult to re-order changes, as we
> sometimes have to do when we discover that we've made an error.

Has this happened, and why does this approach make it difficult to reorder changes?

> Anyhow, I'm already pretty far into option 2, and it's a very robust solution

That may be so, but implementing a solution is just the beginning. We'll have to maintain it for years, so we should make sure we're picking the best one now, even if you've done work on one already. For maintenance and robustness, there's a lot to be said for simplicity, and versioning seems like a good solution to the problem. Where does it fall down?

> that, from the developers' perspective, works the same way the old system
> worked.

That's not a problem, as we change the way things work all the time (most recently when we moved the schema from checksetup.pl to its own module), and developers are better off with a better solution than a more similar one. We should pick the best solution, no matter how dissimilar it is from what we do in checksetup.pl today.
(In reply to comment #11)
> That may be so, but implementing a solution is just the beginning. We'll
> have to maintain it for years, so we should make sure we're picking the
> best one now, even if you've done work on one already. For maintenance and
> robustness, there's a lot to be said for simplicity, and versioning seems
> like a good solution to the problem. Where does it fall down?

I don't know how familiar you are with the solution Max has proposed and is currently implementing, but I think it's a very good and robust solution. Instead of storing just a schema version number, we are storing the whole schema. It's not that much data, so storage is not a problem, and it should be quite bullet-proof. I personally can't think of a better way to do it :-). That doesn't say anything about the actual implementation, but I personally think that even the implementation is pretty good at the moment, and if there are any problems, they should not be fundamental, IMHO.

The only thing I was thinking could maybe be better would be somehow attaching the migration code to the schema: you compare the current schema with the new one, and if you find a difference, you get the migration code from some hash attached to the schema and run it, so it doesn't have to pollute checksetup, as it does now. But I haven't found any sensible way to implement it yet, so it's still only an idea for a future enhancement :-).
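One possible shape for that idea, purely as a sketch; the key format, both helpers, and the sample migration are all invented:

    # Migration code keyed off the schema difference that triggers it.
    sub run_schema_diff_migrations {
        my ($dbh, $old_schema, $new_schema) = @_;

        my %migration_for = (
            # "table.column: old generic type -> new generic type"
            'resolution.value: varchar(64) -> varchar(255)' => sub {
                my ($dbh) = @_;
                $dbh->do(q{UPDATE bugs SET resolution = 'WORKSFORME'
                           WHERE resolution = 'WFM'});
            },
        );

        # Diff the stored schema against the new one; run any attached
        # migration before applying each individual change.
        foreach my $change (diff_schemas($old_schema, $new_schema)) {
            my $code = $migration_for{ $change->{key} };
            $code->($dbh) if $code;
            apply_change($dbh, $change);    # both helpers hypothetical
        }
    }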
Re: comment 11

You definitely make some good points. :-) One point that I've been thinking about is that for Schema to be entirely compatible with the upcoming Custom Fields code, we'd pretty much have to take the approach that I'm working on anyway, where the Schema constantly reflects the database on the disk.

For the versioning approach, in Bugzilla's development environment, how would I write such a version number in a patch that I could put on a bug?

As far as the reordering question goes, we do frequently have to either reorder changes or take out older changes. The nice thing about the way checksetup works now is that it's based on the current state of the database, so it can even deal with a bit of local DB customization (though not much), and it's easy to see where we need to change things, because we only have to worry about what the state of the database was in the past, and not what version people are running. With versioning, that sort of situation becomes more confusing, at least in my mind.

The paradigm that we use now works, is pretty flexible, and is a known quantity. Changing to versions could be difficult, and its problems aren't known. In other words, I'm certain that my solution will work, and I'm not certain that versions will work and be easy to maintain. The code that I'm writing for my solution will also need to be implemented eventually anyway.

Re: comment 12

I thought about trying to do the same thing, but we'd have to come up with some really good mechanism, and I think it would end up being WAY more complex than what we have now. :-)
I didn't have time to read all of the conversation here so far, so I'm just typing some brainstorming-like ideas. The main issues with checksetup.pl, in my opinion, are:

1) MySQL-based code (until now).

2) Inability to call the DB upgrade code from a Perl script (web installer). The code should be in a .pm module, which kind of happened partially with mkanat's Schema thing, but we need the code that uses Schema.pm in a module as well.

3) Inability to revert to a previous DB. (justdave decided to take one risky patch out of CVS because it made Bugzilla too risky after being committed, and landfill is practically not usable with it applied; those who ran checksetup.pl have no way to go back to the previous DB, from before the patch with the schema changes was applied.)

4) Inability to switch between Bugzillas. For example, I run the tip, and I'd like to go to 2.18-STABLE because it proves to be more stable nowadays (that HASH error thing is not suitable for production environments). However, checksetup.pl doesn't know how to transform a tip DB into a 2.18 DB.

So maybe versioning is not the perfect solution. Probably we should aim more along the lines of having changesets, which can be defined as valid starting with 2.18, or maybe only for the 2.20 branch, or something similar: something to allow us to revert the DB as well, or to allow us to switch between branches. I don't know if such a thing is possible, or if such a design would be easy to maintain and allow all this with the same amount of effort as the current system. But since myk threw in "let's think about this," I thought that the above might be useful. Of course, we might decide to go for a quick fix for 2.20 and open a new bug about the rest.
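To make the changeset idea slightly more concrete, here is one invented shape for it; nothing like this exists in checksetup, and every name and statement below is hypothetical (MySQL syntax for the example):

    # Each changeset knows how to apply itself AND how to revert
    # itself, plus the release it first appears in, so a tool could
    # walk the database forward or backward between branches.
    my @changesets = (
        {
            id    => 'resolution-255',
            since => '2.19.3',    # first version containing this change
            up    => sub {
                my ($dbh) = @_;
                $dbh->do(q{ALTER TABLE resolution
                           CHANGE value value varchar(255) NOT NULL});
            },
            down  => sub {
                my ($dbh) = @_;
                $dbh->do(q{ALTER TABLE resolution
                           CHANGE value value varchar(64) NOT NULL});
            },
        },
    );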
Blocks: 290367
"If it's not a regression from 2.18 and it's not a critical problem with something that's already landed, let's push it off." - Dave
Flags: blocking2.20+
Flags: blocking2.20-
Blocks: 291548
No longer blocks: 286695
Now that all the blockers are resolved, this bug is resolved. Checksetup has essentially been re-written to work in the fashion described in comment 5. The actual implementation of all the individual functions was done in blockers, and the changes to checksetup were done in bug 285722. Soon, I will send out an email describing the new functions, with a pointer to their (usually very detailed) API docs in Bugzilla::DB. The API docs need a bit of cleanup, but that's bug 290533.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Blocks: 287296