Installation checks will not be reliable in a cross-database environment

RESOLVED FIXED in Bugzilla 2.20

Status

()

Bugzilla
Installation & Upgrading
P1
major
RESOLVED FIXED
13 years ago
13 years ago

People

(Reporter: Max Kanat-Alexander, Assigned: Max Kanat-Alexander)

Tracking

(Blocks: 1 bug)

2.19.2
Bugzilla 2.20
Dependency tree / graph
Bug Flags:
blocking2.20 -

Details

(Assignee)

Description

13 years ago
Basically, we want to store a serialized Schema in the database, and
read/change it to do our installation checks.

  Below is the original email I sent on this subject to developers@, which
describes both the problem and the proposed solution.
----------------------------------------------------------

  I've come across an interesting theoretical problem, today, when
thinking about our new cross-platform installation.

        Basically, here's the problem:

        We have a new Bugzilla::DB::Schema module coming, which is almost
complete. (See
<https://bugzilla.mozilla.org/show_bug.cgi?id=bz-dbschema>). It maps
"generic" types to database-specific types. It's very cool.

        We, the developers, cannot really know how any given database will map
a "generic" type to a database-specific type. There's just no way at
this point to predict that for every database that we might implement. 

        For example, for the PostgreSQL patch, all integers will probably be
the same size, namely, "integer."

        Now, guess how checksetup works? It checks to see if fields have
changed their definition! So let's say that we changed a field from a
"small integer" to a "large integer," and then we wanted to trigger a
change based on that. On MySQL, the triggered change code would run
fine. On PostgreSQL, the triggered change would never run, because a
"small integer" and a "large integer" are the SAME SIZE.

        Now, what if we had a database system where all variable-length text
fields had the same underlying implementation? That is, where varchar ==
tinytext == mediumtext, etc. We couldn't trigger a change on moving
something from varchar to text.

        Thankfully, this is not an urgent issue. We currently have no other
databases besides MySQL to upgrade. Even when we implement PostgreSQL
support, we will only have to truly upgrade starting with 2.22, although
our first 2.21 release may also have some requirements in that area.

        I think we have a few choices that I can think of that will be entirely
future-proof:

        (1) Store a version number somewhere in the database. When we upgrade,
read that version number and run the correct upgrade code.
        Problem: This makes running checksetup against a CVS tree very risky,
where before that was no problem.

        (2) Store the entire current schema itself in the database. That way
we'll always know what the "generic" type of a field is SUPPOSED to be,
even if it's not possible to read out that information.
        Problem: That's a somewhat-decent amount of work.

        Really, to keep our current functionality, I think that #2 is probably
the best solution at this point. With Bugzilla::DB::Schema, this is
somewhat easier. In fact, the easiest way would just be to store a
Data::Dumper->Dump of the generic {schema} hash inside of
Bugzilla::DB::Schema. However, that requires that the internal data
structures of Bugzilla::DB::Schema either:

        (1) Always stay the same.
        (2) Always provide backwards-compatibility.

        Storing the schema in the database will also make my life in
implementing cross-db installation a LOT easier. Right now, in order to
get the definition of a table column in a cross-database fashion, I have
to do a lot of hocus-pocus with DBI.

        Now, any advice on how we can best accomplish all of this in a fully
maintainable and future proof way is very welcome. :-) 

        I think the structures of Bugzilla::DB::Schema (the next version, not
quite this version) should be fine. We'll have to implement a
serialize() and deserialize() function, and when we serialize the data,
we MUST store along with it the version of the schema that's being
stored. I think that we should use a special DB::Schema version, as
opposed to the Bugzilla version itself, since we care about when the
Schema internal format changes, not when Bugzilla itself changes.

        -Max
(Assignee)

Comment 1

13 years ago
Now, although we don't need to worry about this TOO much before 2.21, it makes
touching checksetup very risky.

Also, right this very moment, I need to re-work the schema-checking functions to
be cross-database-compatible. If we go this route, it will be much easier.
Assignee: zach → mkanat
Severity: normal → major
Target Milestone: --- → Bugzilla 2.20
(Assignee)

Updated

13 years ago
Depends on: 146679
(Assignee)

Updated

13 years ago
Depends on: 285113
(Assignee)

Comment 2

13 years ago
Oh, for the record, I think that we should serialize it to a single row in a
brand-new table, called bz_schema (because "schema" is likely to be a reserved
word in some database SOMEwhere), which has only two columns:

  version char(4) not null
  schema text not null

  The table will only have one row at any given time.
(Assignee)

Updated

13 years ago
Priority: -- → P1

Comment 3

13 years ago
(In reply to comment #0)
> So let's say that we changed a field from a
> "small integer" to a "large integer," and then we wanted to trigger a
> change based on that. On MySQL, the triggered change code would run
> fine. On PostgreSQL, the triggered change would never run, because a
> "small integer" and a "large integer" are the SAME SIZE.

That seems entirely correct to me! There's nothing to change in that scenario.

>         Now, what if we had a database system where all variable-length text
> fields had the same underlying implementation? That is, where varchar ==
> tinytext == mediumtext, etc. We couldn't trigger a change on moving
> something from varchar to text.

Sure you can. When you query the database, you get the actual data type that is
implemented. If that's not the type you want it to be, then you change it. I
still don't see the problem.
(Assignee)

Comment 4

13 years ago
(In reply to comment #3)
> That seems entirely correct to me! There's nothing to change in that scenario.

  Apparently I haven't been clear enough, since two people have asked me this,
at this point.

  Bugzilla development w.r.t. checksetup works like this:

  (1) We upgrade the fashion in which Bugzilla works.
  (2) We need to do some migration code on the actual data in the database, as a 
      result of this change.
  (3) In order to determine whether or not we perform this change, we check the
      current definition of a field. This tells us whether we're running the
      "old" Bugzilla or the "new" Bugzilla. This is nice because it allows
      us to upgrade reliably between CVS versions.

  This is NOT about changing the definitions of fields. It's about running
migration code.

  Yes, if we had no migration code to run, there would be nothing to worry
about! Everything would be fine. We trigger migration code off of field changes.
These triggers must be reliable. 

  The problem here is that our MIGRATION CODE TRIGGERS are not reliable.

> Sure you can. When you query the database, you get the actual data type that 
> is implemented.

  Actually, you don't always get even that. For example, all I get get out of
PostgreSQL about a serial field is that it's an integer with a very
strange-looking default. Playing with DBI's column_info function is NOT fun.
(Believe me, I've was doing it for six hours a day for several days to get
patches when I was heading in that direction.)
(Assignee)

Comment 5

13 years ago
At Tomas's request, I'm going to write a broad overview of how the system will work:

  (1) We store a Schema object in the database. This contains a list of all the 
      tables, columns, and indexes, in the "database-neutral" format. See 
      $self->{schema} in Bugzilla::DB::Schema::new to see what this looks like.

  (2) Then, later (in a future version), we're asked to make a column change, 
      and we need to check if the column needs to be updated. Let's say that 
      we're changing the resolution.value field from a varchar(64) to a 
      varchar(255), and we also need to change the names of all the resolutions 
      in the database.

  (3) So, checksetup needs to check if the column is already a varchar(255).
      We implement a function Bugzilla::DB::bz_get_column_def which returns
      a definition for the current column. This looks like:
      { TYPE => 'VARCHAR(64)', NOTNULL => 1 }
      We can see that type != 'VARCHAR(64)'.

  (4) So, we need to do the migration code. We update all of the resolutions in 
      the bugs table to be some new, really long values. This is just a lot of
      standard SQL and some perl.

  (5) Once the migration code is successful (and only then), we need to change
      the type of the field. We have a function to do this, called 
      Bugzilla::DB::bz_change_column. Here's how it works:
          (a) We read in the saved Schema from the database.
          (b) We ask the Schema, "Is the type of resolution.value equal to the 
              new type that we want it to be?" That is, "Is it varchar(255) NOT 
              NULL?"
          (c) Schema says, "No, it's not equal." (It's "varchar(64) NOT 
              NULL.")
          (d) We say, "OK, then we need to change it to the new type!"
          (e) We ask the Schema object, "Can you generate
              an ALTER TABLE statement for this?" It generates the ALTER 
              TABLE, and we run it, to modify the actual database.
          (f) We then update the Schema object with the new definition, so that
              what's in memory is the same as the actual structure of the
              database on the disk.
          (g) We then save the in-memory Schema back to the database.

  Note that because we do (a) and (g) EVERY TIME we call bz_change_column,
there's pretty much no way for the database and the Schema to get out of sync.
There's also a bz_rename_column which will work in a similar fashion.

  Oh, and by the way, at the beginning of all of this there's one other step:

  (0) At the very beginning of checksetup's database-migration code, we take a 
      WRITE lock on the bz_schema table, so that only one instance of checksetup 
      can modify it at a time.
(Assignee)

Updated

13 years ago
Depends on: 285680
(Assignee)

Updated

13 years ago
Depends on: 285713
(Assignee)

Updated

13 years ago
Depends on: 285722
(Assignee)

Updated

13 years ago
Depends on: 285723
shouldn't the order of (4) and (5) be swapped?  update the schema then insert
long values.

Comment 7

13 years ago
(In reply to comment #6)
> shouldn't the order of (4) and (5) be swapped?  update the schema then insert
> long values.

Hmmm, actually, they need to be interleaved. You need to do half of the change
of the DB table, then you may need to run some migration code, and then you need
to finish the table changes. This happens when you e.g. change column type
between two incompatible types.
Imagine changing e.g. operating systems (former enum, now table), which are
related with the column in the bugs table by name, to be related by id. You need
to create the column for id, then go through all the records and convert names
to ids, then you want to drop the old name column.
Flags: blocking2.20+
(Assignee)

Comment 8

13 years ago
(In reply to comment #7)
> (In reply to comment #6)
> > shouldn't the order of (4) and (5) be swapped?  update the schema then 
> > insert long values.
> 
> Hmmm, actually, they need to be interleaved.

  Yeah. The process basically just describes exactly how checksetup works now,
but just explains what the new underlying implementation will be.

  It always depends on the migration code, what you want to do.

  The reason for generally changing the schema after the migration code is that
if the migration code fails, the schema will still be unchanged, and the
migration code will try to run again the next time you run checksetup. If you
change the schema before the migration code, then if the migration code fails,
it will never run again (because it's keyed off of the schema change).
(Assignee)

Updated

13 years ago
Depends on: 286527
(Assignee)

Updated

13 years ago
Blocks: 285748
(Assignee)

Updated

13 years ago
No longer blocks: 285748
Depends on: 285748
(Assignee)

Updated

13 years ago
Depends on: 286689
(Assignee)

Updated

13 years ago
Blocks: 286695

Comment 9

13 years ago
Just a thought:

Regarding option #1 in the original bug description, this could be accomplished very easily if all 
commits that changed the database schema incremented a "build number" or "database version" 
number that was part of the overall version number during development.  For example, instead of just 
2.21, you could start with 2.21.001, 2.21.002, etc.

As long as the version number always increased, this would remove the risk from #1 and make the 
upgrade process MUCH SIMPLER than option #2.
(Assignee)

Comment 10

13 years ago
(In reply to comment #9)
> this could be accomplished very easily if all 
> commits that changed the database schema incremented a "build number" or 
> "database version" number

  Yeah, I also considered doing something like that. The problem is that it
would fundamentally alter the way that checksetup works, and it would be
somewhat difficult to do in the kind of development environment we have with
Bugzilla. (I'd become worried about maintenance of the version number, among
other problems.) It would also then be somewhat difficult to re-order changes,
as we have to sometimes do when we discover that we've made an error.

  Really, you kind of have to have deep guru-level understanding of checksetup's
database-migration code to know what the exact problems would be, but otherwise
I would agree that it's a much simpler solution.

  Anyhow, I'm already pretty far into option 2, and it's a very robust solution
that, from the developers' perspective, works the same way the old system worked.
(Assignee)

Updated

13 years ago
Blocks: 288409
>   Yeah, I also considered doing something like that. The problem is that it
> would fundamentally alter the way that checksetup works

That's ok, checksetup.pl's current algorithm isn't written in stone.


> it would be
> somewhat difficult to do in the kind of development environment we have with
> Bugzilla. (I'd become worried about maintenance of the version number, among
> other problems.)

If the version number increments every time there's a schema change, this
approach would work just as well for development environments as it does for
production environments.


> It would also then be somewhat difficult to re-order changes,
> as we have to sometimes do when we discover that we've made an error.

Has this happened, and why why does this approach make it difficult to reorder
changes?


>   Anyhow, I'm already pretty far into option 2, and it's a very robust solution

That may be so, but implementing a solution is just the beginning.  We'll have
to maintain it for years, so we should make sure we're picking the best one now,
even if you've done work on one already.  For maintenance and robustness,
there's a lot to be said for simplicity, and versioning seems like a good
solution to the problem.  Where does it fall down?


> that, from the developers' perspective, works the same way the old system worked.

That's not a problem, as we change the way things work all the time (most
recently when we moved the schema from checksetup.pl to its own module), and
developers are better off with a better solution than a more similar one.  We
should pick the best solution, no matter how dissimilar it is from what we do in
checksetup.pl today.

Comment 12

13 years ago
(In reply to comment #11)
> That may be so, but implementing a solution is just the beginning.  We'll have
> to maintain it for years, so we should make sure we're picking the best one now,
> even if you've done work on one already.  For maintenance and robustness,
> there's a lot to be said for simplicity, and versioning seems like a good
> solution to the problem.  Where does it fall down?

I don't know how familiar you are with the solution Max has proposed and is
currently implementing, but I think it's very good and robust solution. Instead
of storing just schema version number, we are storing the whole schema. It's not
so much data, so storage is not a problem, and it should be quite bullet-proof.
I personaly can't think of better way to do it :-).
That does not say anything about the actual implementation, but I personally
think that even the implementation is pretty good at the moment, and if there
are any problems, they should not be fundamental, IMHO.

The only thing I was thinking could be maybe better would be somehow attaching
the migration code to the schema - you compare current schema with the new one
and if you find difference, you will get the migration code from some hash
attached to the schema and run it, so it does not have to pollute checksetup, as
it does now. But I haven't found any sensible way how to implement it yet, so
it's still only an idea for a future enhancement :-).
(Assignee)

Comment 13

13 years ago
Re: comment 11

You definitely make some good points. :-)

One point that I've been thinking about is that for Schema to be entirely
compatible with the upcoming Custom Fields code, we'd have to pretty much take
the approach that I'm working on anyway, where the Schema constantly reflects
the database on the disk.


For the versioning approach, in Bugzilla's development environment, how would I
write such a version number in a patch that I could put on a bug?


As far as the reordering question goes, we do frequently have to either reorder
changes or take out older changes. The nice thing about the way checksetup works
now is that it's based on the current state of the database, so it can even deal
with a bit of local DB customization (though not much), and it's easy to see
where we need to change things, because we only have to worry about what the
state of the database was in the past, and not what version people are running.
With versioning, that sort of situation becomes more confused in my mind, at the
least.


The paradigm that we use now works, is pretty flexible, and it's a known
quantity. Changing to versions could be difficult, and it's problems aren't known.

In other words, I'm certain that my solution will work, and I'm not certain that
versions will work and be easy to maintain. The code that I'm writing for my
solution will also need to be implemented eventually anyway.


Re: comment 12

I thought about trying to do the same thing, but we'd have to come up with some
really good mechanism, and I think it would end up being WAY more complex than
what we have now. :-)

Comment 14

13 years ago
I didn't have time to read all the conversation here, so far, so I'm just typing
some brainstorming-like idea:

The main issues of checksetup.pl, in my opinion, are:

1) MySQL-based code (until now).
2) inability to call the DB upgrade code from a perl script (web installer) (the
code should be in a .pm module, which kind of happened partially with mkanat's
Schema thing, but we need the code that uses Schema.pm in a module as well)
3) inability to revert to a previous DB (justdave decided to take out of the CVS
one risky patch because it makes Bugzilla too risky after been commited, and
landfill is practically not usable with it applied - those that ran
checksetup.pl have no way to go back to the previous DB, before the patch with
the schema changes was applied).
4) inability to switch between Bugzillas - for example I run the tip, and I'd
like to go to 2.18-STABLE because it proves to be more stable nowadays (that
HASH error thing is not suitable for production environments). However,
checksetup.pl doesn't know how to transform a tip DB into a 2.18 DB.

So maybe versioning is not the perfect solution. Probably we should aim more
into the lines of having changesets, which can be defined as valid starting with
2.18, or maybe only for the 2.20 branch or something similar. Something to allow
us to revert the DB as well, or to allow us to switch between branches.

I don't know if such thing is possible, or if such design would be easy to
maintain and allow this with the same amount of effort as the current system.
But since myk throw in "let's think about this", I thought that the above might
be useful.

Of course, we might decide to go for a quick-fix for 2.20 and open a new bug
about the rest.
(Assignee)

Updated

13 years ago
Blocks: 290367
"If it's not a regression from 2.18 and it's not a critical problem with
something that's already landed, let's push it off." - Dave
Flags: blocking2.20+
Flags: blocking2.20-
(Assignee)

Updated

13 years ago
Blocks: 291548

Updated

13 years ago
No longer blocks: 286695
(Assignee)

Comment 16

13 years ago
Now that all the blockers are resolved, this bug is resolved. Checksetup has
essentially been re-written to work in the fashion described in comment 5. The
actual implementation of all the individual functions was done in blockers, and
the changes to checksetup were done in bug 285722.

Soon, I will send out an email describing the new functions, with a pointer to
their (usually very detailed) API docs in Bugzilla::DB.

The API docs need a bit of cleanup, but that's bug 290533.
Status: NEW → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → FIXED
(Assignee)

Updated

13 years ago
Blocks: 287296
You need to log in before you can comment on or make changes to this bug.