Closed
Bug 309663
Opened 19 years ago
Closed 17 years ago
Oracle thinks that an empty string is a NULL value
Categories
(Bugzilla :: Database, enhancement)
Tracking
()
RESOLVED
FIXED
Bugzilla 3.2
People
(Reporter: lance.larsh, Assigned: xiaoou.wu)
References
Details
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041119 Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041119 In Oracle SQL, empty strings are considered NULL values. A number of tables in the current schema have NOT NULL constraints on columns which are routinely populated with empty strings. When using an Oracle database, these INSERT/UPDATE statements fail because they violate the NOT NULL constraint. In many cases, removing the NOT NULL constraint from affected columns may be the most appropriate fix, but there may be cases where this isn't right. Should we file a separate bug for each case, or maybe just file a separate bug for the ones where removing the NOT NULL won't work? The examples I've found so far, just by running checksetup.pl: RESOLUTION.VALUE (this one will go away when bug 106589 gets fixed) PRODUCTS.MILESTONEURL GROUPS.USERREGEXP PROFILES.DISABLEDTEXT Reproducible: Always Steps to Reproduce: (NOTE: Testing obviously requires a database driver for the DB of interest, and as of this moment the driver for Oracle isn't done yet since this bug blocks it. The steps below show how to test the code against any database for which a driver is available.) 1. Run checksetup.pl on a clean installation. 2. Edit localconfig to set the db_* parameters to point to a fresh db with no schema created yet. 3. Run checksetup.pl again (this time to create the schema). Actual Results: As an example, checksetup.pl fails populating RESOLUTION.VALUE, with the error below: checksetup.pl: DBD::Oracle::st execute failed: ORA-01400: cannot insert NULL into ("BUGS"."RESOLUTION"."VALUE") (DBD ERROR: error possibly near <*> indicator at char 57 in 'INSERT INTO resolution (value,sortkey,isactive) VALUES (:<*>p1,:p2,:p3)') [for Statement "INSERT INTO resolution (value,sortkey,isactive) VALUES (?,?,?)" with ParamValues: :p3=1, :p1='', :p2=100] at ./checksetup.pl line 1775 checksetup.pl: main::PopulateEnumTable('resolution','','FIXED','INVALID','WONTFIX', 'LATER','REMIND','DUPLICATE','WORKSFORME',...) called at ./checksetup.pl line 1809 Expected Results: INSERTs/UPDATEs should succeed.
Comment 1•19 years ago
|
||
OK, this is one I actually have an issue with. An empty string is *not* a NULL value. There's no way to change this behavior in Oracle, per-database? Removing the NOT NULL constraints globally is not acceptable. At the *absolute worst*, we could have the Oracle driver automatically make all columns NULL instead of NOT NULL.
Assignee: database → lance.larsh
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Don't insert empty strings into NOT NULL columns → Oracle thinks that an empty string is a NULL value
Comment 2•19 years ago
|
||
Or, we could override bz_alter_column_raw and bz_add_column to not create NOT NULL constraints ever, on text/varchar/char fields. This is really just so... silly. There's no way to make it behave ANSI-standard?
Reporter | ||
Comment 3•19 years ago
|
||
Yeah, this issue can be a real headache. Unfortunately there's no way to change that behavior in Oracle. (In reply to comment #2) > Or, we could override bz_alter_column_raw and bz_add_column to not create NOT > NULL constraints ever, on text/varchar/char fields. Removing the NOT NULL constraint just for Oracle won't really work, because some queries would end up returning different results on Oracle than on other databases. For example, if the app inserts '' into a row, the query "SELECT * FROM mytable WHERE textcol <> 'whatever'" would return that row on MySQL but would not return that row on Oracle (since SQL's "<>" operator doesn't match NULL values, and Oracle would consider '' as NULL). To really make that work, we'd also have to change the app to always insert NULL instead of '' (in addition to removing the NOT NULL constraint). Then queries would always return the same results on all databases, but it could end up being a ton of work to change the app. Perhaps a better approach (which would be much less work) would be to leave the NOT NULL constraint on the columns, but have the Oracle module use a trigger on each column to change empty strings to a single space on INSERT/UPDATE. A space is still not the same as the empty string, but at least then I think Oracle's behavior would match Sybase, SQL Server and DB2 in their handling of empty strings. It might even work to use DBI's ChopBlanks attribute to turn this single space back into an empty string automatically on SELECT.
Comment 4•19 years ago
|
||
Oy, the headaches. Yeah, I remember this problem quite well from the farce of a Sybase port. :) We did the "make an empty string be a single space" thing, and just ran trim() on everything that came out of the DB that could potentially be empty. (We do this on half the fields already anyway, just extend it to the ones that routinely deal in empty strings). This would work well across all databases if we enforce it across the board. Sybase actually converts an empty string to a space for you (in an attempt to be ANSI-compat, since it knows an empty string isn't NULL, it just can't deal with it internally).
Comment 5•19 years ago
|
||
OK. We need to come up with some way to handle this bug that doesn't impact every Bugzilla developer for all of the future. Anything that would require us to catch things in review won't work -- we'll just end up breaking Oracle support all the time, and it'll be mysterious to our end users. In order to do this... we may have to get into SQL parsing. That is, we may have to detect every time we put an empty string into the database, and replace it with a string containing a control character (some character with a low hex value that will never show up normally). And every time we get that string back out of the database, the driver should convert it to an empty string. This could involve some interesting DBI work, but it would also be valuable code for other databases.
Comment 6•19 years ago
|
||
Do we have anything that actually needs to care about the difference between an empty string and a space? Why not just use a space instead of some mysterious control character? Sybase actually converts empty strings to single spaces for you, and I think Lance mentioned there was a trick that could be pulled server-side on Oracle to make it do the same. As long as we make it a hard and fast rule to trim() anything that comes out of the database that shouldn't trip anyone up.
Comment 7•19 years ago
|
||
I think the problem is that "making it a hard and fast rule" is something that a reviewer has to do, and something we can't automatically enforce in the code. That means we *could* miss it, and probably *will* at some point. And it's the sort of bug that would be *extremely* difficult to track down once it started occuring. As it stands now, I can think of very few places where we trim data that comes out of the DB. Usually we trim it before we put it in.
Reporter | ||
Comment 8•19 years ago
|
||
I know it's not pleasant to think about, but honestly the way to get maximum portability of string handling across all databases is to avoid the use of empty strings altogether, since I think all databases support NULL strings and non-empty strings in a similar fashion. It's only the handling of empty strings that varies significantly. But I won't bring it up again... ;) (In reply to comment #7) > I think the problem is that "making it a hard and fast rule" is something that > a reviewer has to do, and something we can't automatically enforce in the > code. If we go with the "convert empty string to single space" approach, can't we use DBI's ChopBlanks attribute on the connection, so that DBI automatically trims everything that it gets from the DB? Then we don't have to worry about the requiring the app's SQL to trim things. Of course, this approach would cause problems if there's anywhere that the app requires the leading/trailing space to be preserved. Another thing to worry about is that any database which converts empty strings to single spaces will return different results from queries which use SQL functions that compute a string's length (unless of course the query trims the argument to the length function). But this may not be a big deal because I'm not sure the length functions have portable names anyway (meaning we'd likely be avoiding those functions already). Of course, we're still left with the hard-and-fast rule "don't rely on the database preserving space". But if we used ChopBlanks on all databases, at least such problems would appear to all developers, no matter which db they're using.
Comment 9•19 years ago
|
||
The only field I can think of where the leading and trailing spaces are important is the comment field, and attachments (but attachments aren't text anyhow). However, ChopBlanks won't help. The docs say this: "The ChopBlanks attribute can be used to control the trimming of trailing space characters from fixed width character (CHAR) fields. No other field types are affected, even where field values have trailing spaces." I don't think we even have any fixed-width character columns. I'm fairly sure the only way to handle this will be to do some DBI hacking. It would be interesting to investigate how other cross-DB perl libraries (or non-perl libraries) handle this particular problem. I don't think it's really feasible to treat all empty strings as NULL. Somebody could really want to enter an empty string into a field, and I think an empty string is a valid value for some columns, and not for others.
Comment 10•19 years ago
|
||
(In reply to comment #8) > Then we don't have to worry about the requiring the app's SQL to trim things. I meant in the perl after retrieving the values. I don't really think this is anywhere near as big a deal as mkanat is thinking it'll be. I can think of two places in the existing code where we rely on a string coming out of the database to be empty. shutdownhtml, and disabledtext on useraccounts. Fix those two places to trim() the data after it comes out of the database before comparing (or better yet, make them both flags in addition to the text fields, and don't compare the text as a means of telling whether it's active), add the server-side trigger to enforce it on Oracle, and be done with it. Yes, it might trip someone down the road, but realistically, what would you ever do that requires telling the difference between an empty string and a non-empty one in a database field where it doesn't make sense to allow nulls? Although the logic might trip someone who needs to set up something like that, the chances of someone ever setting up something like that are almost nil.
Comment 11•19 years ago
|
||
Well, we're still going to need to hack DBI for data on the way in, or require reviewers to always put in a single space when they mean to put in an empty string. Isn't there an "is empty" Boolean Chart? I'm concerned about overloading "a string with a single space" to mean "an empty string that isn't NULL." It's not so much that I can point out a particular problem with it at the moment (besides ambiguity), but that it requires us to predict a future that we can't predict. That is, I really have no idea if choosing to use a single space for an empty string will cause us problems in the future. The problem is two things, as far as I know: (a) The fact that for Oracle, '' will never be matched by anything unless we use IS NULL. That's the one we've been talking about, mostly. (b) Lance: Does DBI take out an empty string from Oracle as an empty string, or as "undef"? DBI treats NULL as "undef" and we rely on that behavior. We also rely on it not to treat an empty string as "undef." In general, the best solution that I can see is, indeed, as we have all basically decided, to come up with some way to differentiate between NULL and an empty string. I'm just trying to come up with the way that will be the most automatic, and won't require attention from me or another reviewer. Reviewers already have a lot of things to remember about database-compatibility, and they slip by a lot of the time if I'm not reviewing the bug personally.
Comment 12•19 years ago
|
||
(In reply to comment #11) > Well, we're still going to need to hack DBI for data on the way in, or require > reviewers to always put in a single space when they mean to put in an empty > string. Isn't there an "is empty" Boolean Chart? No DBI hacking required, it's all done server-side. > (b) Lance: Does DBI take out an empty string from Oracle as an empty string, > or as "undef"? DBI treats NULL as "undef" and we rely on that behavior. We > also rely on it not to treat an empty string as "undef." There's no such thing as an empty string in Oracle, that's the point of this discussion. If it's actually an empty string, you'll get a NULL back, which will be an undef to Perl. The transition from empty string to a space happens at insertion time, and is done automatically by the server (we don't have to mess with it and neither does DBI) or at least the server can be configured to do so. > I'm just trying to come up with the way that will be the most > automatic, and won't require attention from me or another reviewer. Fix the two places that need to know the difference so they trim() before comparison, and stop worrying about it. ;) It's really a pretty rare situation. The two places that depend on it now are major hacks anyway, and probably shouldn't be depending on it at all to begin with.
Reporter | ||
Comment 13•19 years ago
|
||
(In reply to comment #12) > (In reply to comment #11) > > Well, we're still going to need to hack DBI for data on the way in, or require > > reviewers to always put in a single space when they mean to put in an empty > > string. Isn't there an "is empty" Boolean Chart? > > No DBI hacking required, it's all done server-side. Well, we CAN handle it server-side, as I originally suggested. But this would require creating a trigger (a db schema object) on every NOT NULL column, and it could get messy managing that through schema upgrades. A possibly better alternative I discovered yesterday is that DBD::Oracle already has an option to automatically convert SQL parameters bound to empty Perl strings to ' ' during execute(). This would allow DBD::Oracle to do the '' to ' ' conversion for us without having to maintain all those triggers in the schema. The downside is that it only works for bound parameters, as in: my $sth = $dbh->prepare("INSERT INTO t (textcol) VALUES (?, ?)"); $sth->execute(""); but does not catch embedded SQL string literals, as in: $dbh->do("INSERT INTO t (textcol) VALUES ('')"); Note that using this option would affect every SQL operation (SELECT, INSERT, UPDATE, etc) using bound parameters. It might require some thought to figure out if that would cause problems or not. My first guess is that it should work, since all operations are transformed the same way (assuming you don't use SQL string literals, that is). > There's no such thing as an empty string in Oracle, that's the point of this > discussion. Exactly. Think of it this way: Oracle has no concept of an empty string- only NULL strings and non-empty strings. NULL strings in Oracle are handled by Perl/DBI just like NULL strings from any other db (i.e., undef).
Reporter | ||
Comment 14•19 years ago
|
||
(In reply to comment #11) > (a) The fact that for Oracle, '' will never be matched by anything unless we use > IS NULL. That's the one we've been talking about, mostly. Correct, but we avoid the need to use IS NULL by doing the single space conversion. Also, keep in mind that lots of other big databases (e.g., SQL Server, Sybase, DB2) convert '' to spaces by default anyway, so if you want Bugzilla eventually to run on those databases, they would face the same issue.
Comment 15•19 years ago
|
||
Hrm. The DBD::Oracle solution doesn't sound too terrible, but does it do the same thing for strings coming *out* of the database? That is, in perl, '' is false but ' ' is true (and we likely rely on that all over the place). The only problem with it is requiring people to always use placeholders, which is another point that requires oversight from me. It could also involve possibly serious hacking, depending on exactly what code we'd have to change. I think we're in for changing more than just disabledtext and shutdownhtml, to get total consistency on this. Also, we don't want to *always* use placeholders -- using them frequently prevents the database from caching query results. And there are places we definitely want the DB to cache query results if at all possible. The triggers also sound like they could be a good idea (I don't mind managing schema changes, since that's what DB::Schema does best), but would they also work when we're comparing things in a WHERE clause?
Reporter | ||
Comment 16•19 years ago
|
||
(In reply to comment #15) > Hrm. The DBD::Oracle solution doesn't sound too terrible, but does it do the > same thing for strings coming *out* of the database? That is, in perl, '' is > false but ' ' is true (and we likely rely on that all over the place). Not sure, but from my quick glance at it, I don't think the CompatMode option does anything on the result side. > The triggers also sound like they could be a good idea (I don't mind managing > schema changes, since that's what DB::Schema does best), but would they also > work when we're comparing things in a WHERE clause? For every text column of every table, we would set up a trigger to fire on each INSERT or UPDATE to a column, changing the inserted/updated value to a space whenever the SQL tried to use an empty string, as follows: CREATE TABLE t ( col VARCHAR(32) NOT NULL ); CREATE TRIGGER t_col_tr BEFORE INSERT OR UPDATE OF col ON t FOR EACH ROW WHEN ( NEW.col IS NULL ) BEGIN :NEW.col := ' '; END; As a result, the database really stores the non-empty string that consists of a single space, and the NOT NULL constraint is not violated. Triggers can't do anything on the query side, so we still face the issue of '' not being the same as '' without trimming.
Comment 17•19 years ago
|
||
OK. So, what we need is: * Every INSERT or UPDATE SQL statement containing an empty string needs to insert *A* (where *A* is some particular value we choose) instead of an empty string. Most DBs use a single-space for *A*; I wanted a control character. We could use a single space, I suppose. * Every SQL statement containing an empty string as a WHERE or HAVING criteria needs to automatically replace that empty string with *A*. * Every time we get *A* out of the database, it needs to be translated back into an empty string before anything sees it. All we need is for all of the above to be true, as far as perl is concerned. I don't really care what happens in the DB, as long as those three conditions are all kept. From our discussion here, I'm still fairly certain that the only totally reliable way of accomplishing the above is to override some DBI functions in Bugzilla::DB::Oracle, but if you can come up with some other way of maintaining the above three conditions, I'm OK with that, too.
Reporter | ||
Comment 18•19 years ago
|
||
(In reply to comment #17) > From our discussion here, I'm still fairly certain that the only > totally reliable way of accomplishing the above is to override some DBI > functions in Bugzilla::DB::Oracle Ok, I've added method overrides in Bugzilla::DB::Oracle for all of the relevant DBD::Oracle::db/DBD::Oracle::st methods. Now that I've done the work, the difference between using a single space, a control character, or a control string is trivial. Some of the DBI fetch methods call each other internally, so whatever we do at fetch time to undo what we did at insert time needs to be idempotent. Therefore I opted against using a single space, since it's difficult to make removal of a space idempotent. I decided to use a control string "__BZ_EMPTY_STR__". This will be easier to debug than a single control character since it will be more obvious where to start debugging if the app gets "__BZ_EMPTY_STR__" back from a query.
Comment 19•19 years ago
|
||
(In reply to comment #18) Great, that sounds perfect.
Updated•18 years ago
|
Severity: normal → enhancement
Updated•17 years ago
|
Assignee: lance.larsh → xiaoou.wu
Target Milestone: --- → Bugzilla 3.2
Comment 20•17 years ago
|
||
This was worked around in our creation of Bugzilla::DB::Oracle.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•