Open Bug 285122 (bz-mssql) Opened 19 years ago Updated 8 years ago

[MS-SQL] Support MS SQL Server as a database backend

Categories

(Bugzilla :: Database, enhancement)

enhancement
Not set
normal

Tracking

()

ASSIGNED

People

(Reporter: florian, Assigned: mockodin)

References

(Depends on 10 open bugs, )

Details

Attachments

(1 file)

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
Build Identifier: 

Can i run bugzilla with MSSQL instead of MySQL?

Reproducible: Always
This is not a support forum, so this isn't the place to ask questions. :-) You
can ask questions using the resources in http://www.bugzilla.org/support/

However, we don't already have a bug filed for MS-SQL support, even though I
know that glob is working on it. So I'll take this bug and make it into our
"support MS-SQL" bug.
Assignee: query-and-buglist → bugzilla
Severity: normal → enhancement
Summary: Can i run bugzilla with MSSQL instead of MySQL? → Support MS SQL Server as a database backend
Version: unspecified → 2.19.2
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Query/Bug List → Bugzilla-General
very early startings of ms-sql support.
Component: Bugzilla-General → Database
Summary: Support MS SQL Server as a database backend → [MS-SQL] Support MS SQL Server as a database backend
Comment on attachment 186095 [details] [diff] [review]
very very very early patch

  OK, here's a few comments to guide you on your way. :-)

>Index: Bugzilla/DB/Mssql.pm
>+#use constant BLOB_TYPE => { pg_type => DBD::Pg::PG_BYTEA };
>+use constant REQUIRED_VERSION => '0';

  We should probably figure that out at some point. Does bz_server_version not
return something useful?

>+use constant PROGRAM_NAME => 'Mssql';

  That should probably be "MS-SQL," since it's displayed to end-users.

>+sub new {
>+    # construct the DSN from the parameters we got
>+    my @dsn = (
>+        'Provider=SQLOLEDB.1',

  Does this mean that people have to name their ODBC connection that, or what?

>+        'Persist Security Info=True',

  What's that mean? :-)

>+    my $attributes = {
>+        RaiseError => 0,
>+        AutoCommit => 1,
>+        PrintError => 0,
>+        ShowErrorStatement => 1,
>+        HandleError => \&_handle_error,
>+        TaintIn => 1,
>+        FetchHashKeyName => 'NAME',
>+    };

  You actually don't need to specify $attributes unless you want to override
it.

>+    # all class local variables stored in DBI derived class needs to have
>+    # a prefix 'private_'. See DBI documentation.
>+    $self->{private_bz_tables_locked} = 0;

  That should be handled by db_new... do the other DBs do this in here, also?

>+sub _handle_error {
>+    die "aieee\n";
>+    exit;
>+}

  Hahahaha... why'd you override this?

>+sub sql_regexp {
>+    return "~*";
>+}

  Does MS-SQL actually have a regex operator?

>Index: Bugzilla/DB/Schema/Mssql.pm
>+    # Remove FULLTEXT index types from the schemas.

  Does MS-SQL have any sort of fulltext engine?

>Index: Bugzilla/DB/Schema.pm
>===================================================================
>RCS file: /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Schema.pm,v
>retrieving revision 1.27
>diff -u -r1.27 Schema.pm
>--- Bugzilla/DB/Schema.pm	27 Apr 2005 11:08:58 -0000	1.27
>+++ Bugzilla/DB/Schema.pm	30 Apr 2005 15:39:59 -0000
>@@ -1325,7 +1325,7 @@
>     while (@fields) {
>         my $field = shift(@fields);
>         my $finfo = shift(@fields);
>-        $create_table .= "\t$field\t" . $self->get_type_ddl($finfo);
>+        $create_table .= "\t\"$field\"\t" . $self->get_type_ddl($finfo);

  This will break MySQL, which uses backticks instead of double-quotes. You may
have to override this entire subroutine, or provide a hook of some sort for
quoting into the DB framework. I take it you're running into some reserved
word?


  If you have any Bugzilla::DB questions, feel free to ask them here in the
bug. This bug could become a useful repository of Bugzilla::DB information that
I could later use to make some docs.
> OK, here's a few comments to guide you on your way. :-)

thanks.  unexpected, but thanks :)

> We should probably figure that out at some point. Does bz_server_version not
> return something useful?

haven't got there yet :)

> Does this mean that people have to name their ODBC connection that, or what?

nope, this is a dsn-less connection.  i'm building the ADO connection string
from scratch; that's telling ADO to connect to a sql server.

> 'Persist Security Info=True'
> What's that mean? :-)

remember the password even if the connection is reset.

> >+sub _handle_error {
> >+    die "aieee\n";
> >+    exit;
> >+}
> 
>   Hahahaha... why'd you override this?

one of the problems i'm having right now is DBI::ADO is incorrectly dealing with
information messages returned by ado.  when you create a database, you get back
"blah database created, allocating blah disk" or words to that effect.  ado
throws them back to me as errors, so the $attributes and this sub are just
playthings to see if i can work around that.

> >+sub sql_regexp {
> >+    return "~*";
> >+}
> 
>   Does MS-SQL actually have a regex operator?

not as standard.

right now i'm focusing my efforts on getting checksetup working.  then, once i
have a database to play with, i can deal with the sql :)

> >Index: Bugzilla/DB/Schema/Mssql.pm
> >+    # Remove FULLTEXT index types from the schemas.
> 
>   Does MS-SQL have any sort of fulltext engine?

yes, but it's not installed by default.

http://www.microsoft.com/sql/evaluation/features/fulltext.asp

i've never played with it, so for now i'm nuking the indexes.

> This will break MySQL, which uses backticks instead of double-quotes. You may
> have to override this entire subroutine, or provide a hook of some sort for
> quoting into the DB framework. I take it you're running into some reserved
> word?

this is another stop-gap.  yes, it's a reserved word.  once i have time to pick
this bug up again i'll push for getting that column renamed.

I am going to have a look at this.  Glob - are you still working on this currently?
> Glob - are you still working on this currently?

i am not :( 
Assignee: bugzilla → nobody
Taking.

Thanks the the initial patch Glob! :)
Status: NEW → ASSIGNED
Assignee: nobody → bugzilla.mozilla
Status: ASSIGNED → NEW
OS: Windows Server 2003 → All
Hardware: PC → All
Version: 2.19.2 → 2.20
Not going to be able to attempt this for a while.  Work is killing me and my wedding in September means very little time.  If no one has taken it after that, I'll take it again.
Assignee: bugzilla.mozilla → database
I also have an interest in seeing this work.  (Very happily using Bugzilla with PostGRESQL at the moment, but the question comes up a lot.)  Our shop is almost exclusively MSSQL.

I have limited time, but would be interested in assisting others if I am available while they are working.
I'm also in a MS-centric IT dept, and would be willing to help introduce SQL Server compatibility.  I have a few questions though, before I start any work:

1) What version/milestone should I build to?  I saw another enhancement suggesting a DBCompat module (or something like that) in v3.  On a related note, adding compatibility for various DBs (I could do Oracle and others) would be much easier if the sql strings were all in resource bundles or something.

2) What version(s) of SQL Server should we support?  I suspect I could tune some of the searches to be seriously faster for SQL Server 2005, and the TEXT type is being deprecated in favour of the new varchar(max).  I'm willing to add support SQL 2000 as well, it just means more version checking, more switches, more places for things to go wrong or be written poorly.  Even the schema can be described differently to run faster in 2005.

3) Should I start with the "very very very early", or just go it alone?  In all honesty, I haven't touched perl in nearly 10 years.  The last thing I did was to embed scaled image attachments in the Bugzilla comments, instead of just the link.  (There's an open enhancement to add that functionality, but it seemed overly complicated, so I just did it myself.)  My point is that I'm not up to speed on the latest perl coding standards, and I haven't played with the ADO library at all.  If the patch successfully opens a connection, I'd be happiest using it.  If it doesn't, maybe Bruce or Glob could let me know what needs to be done to get connected?

4) I've never used the Submit Patch feature of Bugzilla.  I've always been with companies that use it internally, so any patches are usually checked in strait away, with comments in the code pointing back at bugzilla.  I'll happily read the docs on that, I'm just offering fair warning in case I stuff it up the first few times.

5) Do we have a checklist of all the sql commands, something I can use to verify my progress?  It would be nice to know just how much work I'm volunteering for.

6) Should I bother with any sort of migration script?  It should be easy to take a MySQL dump and convert it to MSSQL syntax.  But there are probably lots of tools out there that would already provide this functionality; hence I don't think it's worth the effort.

And finally, 7) I haven't noticed where the code for migrating schemas is located.  (ie: When I last upgraded Bugzilla, I vaguely recall that it modified the schema.  I imagine that code is structured like 'if moving from v2.18 to v2.20, run script xyz'.  Related to question 1, I'm wondering at what point the MSSQL module would need to support that migration.  I wouldn't approach that until the end, but I wonder if I need  to leave stubs somewhere to fill that in later.
(In reply to comment #10)
> 1) What version/milestone should I build to?

  Just work against CVS HEAD.

> 2) What version(s) of SQL Server should we support?

  Whatever version will allow the least changes to Bugzilla, as long as the version is in common use.

> 3) Should I start with the "very very very early", or just go it alone?

  It's up to you. Whatever looks best.

> My point is that I'm not up to
> speed on the latest perl coding standards, and I haven't played with the ADO
> library at all.

  Well, come into #mozwebtools on irc.mozilla.org and we can answer your questions and help you out.

  However, you may want to tackle a smaller Bugzilla patch first, to get more familiar with perl and our coding standards.

> 5) Do we have a checklist of all the sql commands, something I can use to
> verify my progress?  It would be nice to know just how much work I'm
> volunteering for.

  You won't need that. You just have to implement the DB modules, and then see how things go.

> 6) Should I bother with any sort of migration script?

  Worry about that later when you need it.

  Anyhow, we already have contrib/bzdbcopy.pl, which should work fine.

> 7) I haven't noticed where the code for migrating schemas is
> located.

  Read: 

  http://www.bugzilla.org/docs/tip/html/api/checksetup.html#How_Checksetup_Works
  http://www.bugzilla.org/docs/tip/html/api/Bugzilla/Install/DB.html#update_table_definitions

> I'm wondering at what point the
> MSSQL module would need to support that migration.

  As soon as we check the code in, ideally.

  At the worst, it must support it from the very first released version with MS-SQL support.
I'd like to pick this bug up.

I have a version of 3.2.3 partially working on MSSQL, still identifying the necessary touch points before moving my dev environment to use the head of the cvs 3.4 branch. I am able to run checksetup to deploy my instance. The catch points are issues of syntax specific to MSSQL. The Syntax change will likely require moving functions out of hard coded SQL string and into subs to format them specific to be requirements. For example:

MYSQL:   NOW() - INTERVAL 30 DAYS
MSSQL:   DATEADD(dd, -30, GETDATE())

Calls to sql_limit would change from:
$query .= ' '.$dbh->sql_limit(10,20);
TO
$query = $dbh->sql_limit($query,10,20);

http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=850&lngWId=5

// MySQL limit clause
SELECT emp_id,lname,fname FROM employee LIMIT 20,10

// MSSQL equivelent
select * from (
 select top 10 emp_id,lname,fname from (
    select top 30 emp_id,lname,fname
    from employee
   order by lname asc
 ) as newtbl order by lname desc
) as newtbl2 order by lname asc


Thus far areas needing attention, but not limited to:
Date Manipulation
REGEX (database side)
LIMIT replication
FK (Multi Path CASCADE no allowed in MSSQL)

The rebuild of the bugzilla back end has made this over all a much simpler integration, the catch points are primarily MSQL side quarks where they are stubbornly not supporting common function names (ex. NOW() vs GETDATE() or TOP vs LIMIT)

Other touchy areas are detection of fulltext index vs indexes, as well as trigger detection will be required for mssql to work around the CASCADE issues in FK.
(In reply to comment #12)
> I'd like to pick this bug up.

  Okay, please read all the comments on this bug, particularly my responses to people's statements/questions.

> I have a version of 3.2.3 partially working on MSSQL, still identifying the
> necessary touch points before moving my dev environment to use the head of the
> cvs 3.4 branch.

  In order for us to accept it, you will have to work against HEAD, not the 3.4 branch.

> MYSQL:   NOW() - INTERVAL 30 DAYS
> MSSQL:   DATEADD(dd, -30, GETDATE())

  If MS-SQL doesn't support the ANSI standard interval syntax (or some interval syntax), we are unlikely to support it.

  What version of MS-SQL are you developing against?

> Calls to sql_limit would change from:
> $query .= ' '.$dbh->sql_limit(10,20);
> TO
> $query = $dbh->sql_limit($query,10,20);

  No, look at how we did that for Oracle instead.

> The rebuild of the bugzilla back end has made this over all a much simpler
> integration, the catch points are primarily MSQL side quarks where they are
> stubbornly not supporting common function names (ex. NOW() vs GETDATE() or TOP
> vs LIMIT)

  You can usually define functions to handle that--that's what we did for Oracle.

> Other touchy areas are detection of fulltext index vs indexes, as well as
> trigger detection will be required for mssql to work around the CASCADE issues
> in FK.

  Oh, I wouldn't worry about fulltext at first, unless you really want to.
Assignee: database → mockodin
(In reply to comment #13)
> > I have a version of 3.2.3 partially working on MSSQL, still identifying the
> > necessary touch points before moving my dev environment to use the head of the
> > cvs 3.4 branch.
> 
>   In order for us to accept it, you will have to work against HEAD, not the 3.4

ah, yes actually what I meant, but not what I typed.

> branch.
> 
> > MYSQL:   NOW() - INTERVAL 30 DAYS
> > MSSQL:   DATEADD(dd, -30, GETDATE())
> 
>   If MS-SQL doesn't support the ANSI standard interval syntax (or some interval
> syntax), we are unlikely to support it.
> 
>   What version of MS-SQL are you developing against?

SQL 2005 SP2, on further reflection the requirement can likely be dropped to SQL 2000 but the initial focus is 2005. The INTERVAL xx unit is not supported in either version. The closest functionality in MSSQL is something like

    SELECT GETDATE() - 10

But is interpreted as seconds. Using that formula the limitation can be overcome by preparing code side:
(example not complete code)

    sub sql_interval {
    my ($self, $interval, $units) = @_;
    

    if ($units eq 'YEAR'){
	$interval = $interval * 31536000;
    } elsif ($units eq 'DAY'){
        $interval = $interval * 86400;
    } elsif ($units eq 'HOUR'){
        $interval = $interval * 3600;
    }
    return $interval;
}

The above method kinda works but is buggy.

Taking into account your next reply (about sql_limit) this can be worked around more effectively in this manner.

NOW() - INTERVAL 1 DAY
would become
DATEADD(DAY, 1, GETDATE())

All of the following are supported:
year, month, day, hour, minute, second, or fraction
http://msdn.microsoft.com/en-us/library/ms186819.aspx

The latter option would be the best option.

> 
> > Calls to sql_limit would change from:
> > $query .= ' '.$dbh->sql_limit(10,20);
> > TO
> > $query = $dbh->sql_limit($query,10,20);
> 
>   No, look at how we did that for Oracle instead.

Ok, I see what we're doing there, it feels painful, but does simplify a number of issues.

> 
> > The rebuild of the bugzilla back end has made this over all a much simpler
> > integration, the catch points are primarily MSQL side quarks where they are
> > stubbornly not supporting common function names (ex. NOW() vs GETDATE() or TOP
> > vs LIMIT)
> 
>   You can usually define functions to handle that--that's what we did for
> Oracle.

A couple functions can't be defined so easily specifically GLOBAL_CAT and REGEXP
They can be created as managed functions or store proc, but I'm taking a guess we don't want to do that. Though as I said it would make it infinitely easier.

> 
> > Other touchy areas are detection of fulltext index vs indexes, as well as
> > trigger detection will be required for mssql to work around the CASCADE issues
> > in FK.
> 
>   Oh, I wouldn't worry about fulltext at first, unless you really want to.

Will probably be the last piece worked on.

For the FK what I'm thinking about is to catch UPDATE CASCADE and DELETE CASCADE and generate the triggers in their stead. It works in my test environment, but I haven't built out support for dropping or altering as of yet. Shouldn't be a big deal to get working. I will likely add a look up before the 'create trigger' statement if exists drop and then create. I usually prefer 'alter trigger' statements but in this case it shouldn't matter enough to warrant it.

Current progress is a semi-functional environment. Install works, I am revising my current changes based on comment 13 to fit with his suggestions/guidelines.  Now developing off cvs head.
Bugzilla/User.pm line 1636

my $eventdata = $dbh->selectrow_array(
        "SELECT eventdata
           FROM tokens
          WHERE (tokentype = 'emailold'
                AND SUBSTRING(eventdata, 1, (" .
                    $dbh->sql_position(q{':'}, 'eventdata') . "-  1)) = ?)
             OR (tokentype = 'emailnew'
                AND SUBSTRING(eventdata, (" .
                    $dbh->sql_position(q{':'}, 'eventdata') . "+ 1)) = ?)",
         undef, ($username, $username));

MSSQL requires the  SUBSTRING be called with 3 parameters, is there any reason to leave out the 3rd param in the second call?

Perhaps updated to:

my $eventdata = $dbh->selectrow_array(
        "SELECT eventdata
           FROM tokens
          WHERE (tokentype = 'emailold'
                AND SUBSTRING(eventdata, 1, (" .
                    $dbh->sql_position(q{':'}, 'eventdata') . "-  1)) = ?)
             OR (tokentype = 'emailnew'
                AND SUBSTRING(eventdata, (" .
                    $dbh->sql_position(q{':'}, 'eventdata') . "+ 1), LENGTH(eventdata)) = ?)",
         undef, ($username, $username));

I would still end up having to swap out the LENGTH function in MSSQL for LEN but you get the idea.
Out of curiosity wouldn't this have been a simpler call with the same level accuracy?

my $eventdata = $dbh->selectrow_array(
        "SELECT eventdata
           FROM tokens
          WHERE (tokentype = 'emailold'
                AND eventdata LIKE ?+':%')
             OR (tokentype = 'emailnew'
                AND eventdata LIKE '%:'+?)",
         undef, ($username, $username));
It looks like the original form of this look up was like comment 16 and was changed in issue 165221 back in 2002

I would think it more appropriate to exclude the possibility using a : in the login name, I could be wrong but I'm fairly certain it's not valid in a email address in any case. Even if your using some other implementation for auth I can't think of an instance where : would be included as a valid character.
(In reply to comment #14)
> The above method kinda works but is buggy.

  Buggy how? If it works for all the cases we actually use in Bugzilla, we should be OK.

> 
> Taking into account your next reply (about sql_limit) this can be worked around
> more effectively in this manner.
> 
> NOW() - INTERVAL 1 DAY
> would become
> DATEADD(DAY, 1, GETDATE())
> [snip]

  No, we can't parse SQL reliably enough to make that change automatically at runtime. Let's do the seconds thing, unless there's some significant flaw in it?

> Ok, I see what we're doing there, it feels painful, but does simplify a number
> of issues.

  Yeah. We're all about simplification. :-)

> A couple functions can't be defined so easily specifically GLOBAL_CAT and
> REGEXP

  GLOBAL_CAT--what's that? We don't use that. You mean string concatenation?

> They can be created as managed functions or store proc, but I'm taking a guess
> we don't want to do that. Though as I said it would make it infinitely easier.

  If you can define REGEXP as a function or stored procedure, that would be OK with me. They can be created by checksetup.pl during bz_setup_database, as long as they are reliable.

> For the FK what I'm thinking about is to catch UPDATE CASCADE and DELETE
> CASCADE and generate the triggers in their stead.

  Yep, that's what we did for Oracle.

> Current progress is a semi-functional environment. Install works, I am revising
> my current changes based on comment 13 to fit with his suggestions/guidelines. 
> Now developing off cvs head.

  Okay. Make sure to deliver things to me incrementally. That is, the smallest working patch that does *something* would be the first part, and we'll want to review Bugzilla::DB::Mssql and Bugzilla::DB::Schema::Mssql in separate bugs that block this one.


(In reply to comment #15)
> MSSQL requires the  SUBSTRING be called with 3 parameters, is there any reason
> to leave out the 3rd param in the second call?

  You know, that looks almost like it could be replaced entirely with sql_string_until, which exists on 3.4 and HEAD, which uses the ANSI-standard syntax for SUBSTRING instead of that function-like syntax.

  Anyhow, this should be discussed and possibly modified in an entirely separate bug from this one (blocking this bug), since it's a self-contained issue.
Status: NEW → ASSIGNED
(In Reply to comment #18)

>Buggy how? If it works for all the cases we actually use in Bugzilla, we
>should be OK.

Buggy is a bad word, more that it make my head hurt. Calculating number is what a database is for, moving it outside just introduces more for the web server to figure out. Figuring out months is problematic as not all months are equal so the logic requires multiple paths to determine the correct increase or decrease in seconds


Bugzilla/Mysql.pm line 122
sub sql_group_concat {
    my ($self, $column, $separator) = @_;
    my $sep_sql;
    if ($separator) {
        $sep_sql = " SEPARATOR $separator";
    }
    return "GROUP_CONCAT($column$sep_sql)";
}

>If you can define REGEXP as a function or stored procedure, that would be OK
>with me. They can be created by checksetup.pl during bz_setup_database, as long
>as they are reliable.

To be clear, this would entail a compiled .dll, source provided of course and the database would need CLR enabled. Fortunately it runs using MSSQL approved .NET thus allowing to run under the safe permissions level which is the most strict. The required assembly is System.Text.RegularExpressions. Obviously this is rather annoying omission in the mssql platform.

Aside from that I will open blocking tickets tonight for Bugzilla::DB::Mssql and Bugzilla::DB::Schema::Mssql

I assume we need a ticket for Bugzilla::Constants.pm to add mssql as well yes?
(In reply to comment #19)
> Buggy is a bad word, more that it make my head hurt. Calculating number is what
> a database is for, moving it outside just introduces more for the web server to
> figure out.

  Well, it's actually much easier to scale web servers than it is to scale database servers. That almost never matters with Bugzilla, though. I'm OK with having that sort of logic happen on the web server side.

> Figuring out months is problematic as not all months are equal so
> the logic requires multiple paths to determine the correct increase or decrease
> in seconds

  Oh...yes, that is an enormous problem, actually. We may need a sql_date_math function. :-( Can you look into how complex that would be (in a separate bug)?

> Bugzilla/Mysql.pm line 122
> sub sql_group_concat {

  What's this in this comment for, by the way? This is one thing you don't need to implement unless MS-SQL has some equivalent.

> To be clear, this would entail a compiled .dll, source provided of course and
> the database would need CLR enabled. Fortunately it runs using MSSQL approved
> .NET thus allowing to run under the safe permissions level which is the most
> strict. The required assembly is System.Text.RegularExpressions. Obviously this
> is rather annoying omission in the mssql platform.

  Ooooh. Okay, yeah, we won't go down that route. Instead, we will check if the database supports sql_regexp, and if it doesn't, we won't show the regexp-related options in Bugzilla's UI.


> Aside from that I will open blocking tickets tonight for Bugzilla::DB::Mssql
> and Bugzilla::DB::Schema::Mssql
> 
> I assume we need a ticket for Bugzilla::Constants.pm to add mssql as well yes?

  Yeah, let's do three bugs there.
(In reply to comment #20)
> > Figuring out months is problematic as not all months are equal so
> > the logic requires multiple paths to determine the correct increase or decrease in seconds
>   Oh...yes, that is an enormous problem, actually. We may need a sql_date_math
> function. :-( Can you look into how complex that would be (in a separate bug)?

I can build a sql function for this, the same issue applies to DATE_FORMAT so they will be similar functions in what/how they parse. Short function make be happy, this will not be very short, but should be fast so is neither here nor there in the grand scheme just a series of if like and if else operators ugly but functional.

> > Bugzilla/Mysql.pm line 122
> > sub sql_group_concat {
>   What's this in this comment for, by the way? This is one thing you don't >need to implement unless MS-SQL has some equivalent.
> GLOBAL_CAT--what's that? We don't use that. You mean string concatenation?

Deleted to much quote :-) and mistyped it anyhow. should have said GROUP_CONCAT
 
> Instead, we will check if the database supports sql_regexp, and if it doesn't, we won't show the regexp-related options in Bugzilla's UI.

Nothing direct, or last I looked will research again, I have wish for this many times in the past. There is RegExg like behavior, more specifically in PADINDEX  param 1 will recognize some regex like operations like [a-zA-Z] I need to read up on its full extent, so long as we're only using it for matching it may work in some form.
PADINEX (corrected name from above) doesn't use REGEX rather uses SQLs LIKE operator syntax so not of any use.

Just to note, there appears to be several places that call sql_rexexp and sql_not_regexp

Bugzilla::DB
Bugzilla::Search
editusers.cgi
Bugzilla::Install::DB

The first three *look* like they test for existence. The last appears to blindly call sql_regexp without checking for existence.

Assuming Bugzilla::Install::DB were corrected, then regexp could be omitted for now.

What is the likelihood we could add a optional function where we could specify an function name and syntax if the DBA implements a custom CLR Function?
(In reply to comment #22)
> PADINEX (corrected name from above) doesn't use REGEX rather uses SQLs LIKE
figures I still miss typed: PADINDEX()
(In reply to comment #21)
> I can build a sql function for this, the same issue applies to DATE_FORMAT so
> they will be similar functions in what/how they parse. Short function make be
> happy, this will not be very short, but should be fast so is neither here nor
> there in the grand scheme just a series of if like and if else operators ugly
> but functional.

  I meant $dbh->sql_date_math, like a Bugzilla::DB function.

> Deleted to much quote :-) and mistyped it anyhow. should have said GROUP_CONCAT

  Oh. Yeah, don't worry about GROUP_CONCAT. No DB except MySQL has it anyway.

> Nothing direct, or last I looked will research again, I have wish for this many
> times in the past. 

  No, I mean we will have a $dbh->bz_supports_regexp function inside of Bugzilla to see whether or not the database we're using has regex functionality, or just a UNIVERSAL::can($dbh, 'sql_regexp').

  But if there is some limited regex support, we could expose that. Most regexes we use in Bugzilla come from the users, so they can limit themselves.
(In reply to comment #22)
> The first three *look* like they test for existence. The last appears to
> blindly call sql_regexp without checking for existence.

  There would be nowhere that checks for sql_regexp's existence, as all our current DBs support it.

> Assuming Bugzilla::Install::DB were corrected, then regexp could be omitted for
> now.

  You shouldn't have to worry about Bugzilla::Install::DB, because you don't have to upgrade from those older versions.

> What is the likelihood we could add a optional function where we could specify
> an function name and syntax if the DBA implements a custom CLR Function?

  Don't worry about it at first, just go with the simple option.
(In reply to comment #25)

>   You shouldn't have to worry about Bugzilla::Install::DB, because you don't
> have to upgrade from those older versions.

Looks like it's called on new installs as well.

Inserting values into the 'op_sys' table:
'All'     sortkey: 100
'Windows' sortkey: 200
'Mac OS'  sortkey: 300
'Linux'   sortkey: 400
'Other'   sortkey: 500
Removing existing compiled templates...
Precompiling templates...done.
Initializing "Dependency Tree Changes" email_setting ...
Can't locate object method "sql_regexp" via package "Bugzilla::DB::Mssql" at Bugzilla/Install/DB.pm line 2608.
Depends on: 487437, 487438, 487439
Depends on: 487443
No longer depends on: 487437, 487438, 487439, 487443
Version: 2.20 → 3.5
(In reply to comment #26)

Bugzilla/Install/DB.pm line 2608 refers to:

sub _clean_control_characters_from_short_desc {
    my $dbh = Bugzilla->dbh;

    # Fixup for Bug 101380
    # "Newlines, nulls, leading/trailing spaces are getting into summaries"

    my $controlchar_bugs =
        $dbh->selectall_arrayref("SELECT short_desc, bug_id FROM bugs WHERE " .
            $dbh->sql_regexp('short_desc', "'[[:cntrl:]]'"));
    if (scalar(@$controlchar_bugs)) {
        my $msg = 'Cleaning control characters from bug summaries...';
        my $found = 0;
        foreach (@$controlchar_bugs) {
            my ($short_desc, $bug_id) = @$_;
            my $clean_short_desc = clean_text($short_desc);
            if ($clean_short_desc ne $short_desc) {
                print $msg if !$found;
                $found = 1;
                print " $bug_id...";
                $dbh->do("UPDATE bugs SET short_desc = ? WHERE bug_id = ?",
                          undef, $clean_short_desc, $bug_id);
            }
        }
        print " done.\n" if $found;
    }
}
re: sql_date_format

I grepped through the code and the only instance where date_format is used only 4 variations of the string are used so I can emulate the function easily in sql.

The format strings are:
%Y%m%d
%Y-%m-%d
%Y.%m.%d %H:%i
%Y.%m.%d %H:%i:%s

So the following should work:
CREATE FUNCTION DATE_FORMAT( @date datetime, @format nvarchar(max) )
RETURNS varchar(max) AS
BEGIN
		
        SET @format = Replace(@format, '%y', DATEPART(yy,@date))
        SET @format = Replace(@format, '%m', RIGHT('00'+CAST(DATEPART(mm,@date) as NVARCHAR(2)),2))
        SET @format = Replace(@format, '%d', RIGHT('00'+CAST(DATEPART(dd,@date) as NVARCHAR(2)),2))
        SET @format = Replace(@format, '%h', RIGHT('00'+CAST(DATEPART(hh,@date) as NVARCHAR(2)),2))
        SET @format = Replace(@format, '%i', RIGHT('00'+CAST(DATEPART(mi,@date) as NVARCHAR(2)),2))
        SET @format = Replace(@format, '%s', RIGHT('00'+CAST(DATEPART(ss,@date) as NVARCHAR(2)),2))
	RETURN @format
END
GO

Note: The replace function in mssql is case-insensitive.

Rather limited but effective in the current scope. MSSQL 2008 has improved DATEPART supporting concepts like h vs hh which 2000 and 2005 still lack, hh only)
Alias: bz-mssql
Depends on: 487437
Depends on: 487438
Depends on: 487439
Depends on: 487443
(In reply to comment #27)
> sub _clean_control_characters_from_short_desc {

  Okay. We can deal with that in a separate bug about making sql_regexp support optional.

(In reply to comment #28)
> So the following should work:
> [snip]

  Why don't you just override sql_date_format like we do for all the other DBs? I don't see a need to make that a stored procedure.
(In reply to comment #29)
> (In reply to comment #27)
> > sub _clean_control_characters_from_short_desc {
> 
>   Okay. We can deal with that in a separate bug about making sql_regexp support
> optional.
I found/remembered an old method for running regex in mssql, has high overhead however. Though since the REGEX feature doesn't appear to be used frequent operations might be ok, has a high overhead cost. Imports a call for vbscript.regex functions as a ole object within the database.

>   Why don't you just override sql_date_format like we do for all the other DBs?
> I don't see a need to make that a stored procedure.

That would be nice yes, but MSSQL doesn't supply a convert facility with comparably abilities: http://msdn.microsoft.com/en-us/library/ms187928.aspx

So I'm reduced to using DATEPART to pull pieces and reassemble. Essentially the same thing concept but force one to built the logic manually. The user defined function in comment 28 is functional.

For all the cool things MSSQL does, it 's written like it's a 'genius', often the advanced stuff it does nicely, the commonsense stuff gets left behind. aka Bumbling Professor syndrome.
(In reply to comment #30)
> I found/remembered an old method for running regex in mssql, has high overhead
> however. Though since the REGEX feature doesn't appear to be used frequent
> operations might be ok, has a high overhead cost. Imports a call for
> vbscript.regex functions as a ole object within the database.

  Oh, cool. Yeah, high overhead is fine, we usually don't expect regex matches to be fast. Well, except that we use it for "all of the words" or "any of the words" (which we probably shouldn't).

> That would be nice yes, but MSSQL doesn't supply a convert facility with
> comparably abilities: http://msdn.microsoft.com/en-us/library/ms187928.aspx

  You could write basically the exact same thing you had in that SQL function in an override to sql_date_format.
(In reply to comment #31)

>   You could write basically the exact same thing you had in that SQL function
> in an override to sql_date_format.

Yes, but the code being generated by sql_date_format is being passed to sql yes?
So what I would be doing to overriding sql_date_format to create syntax to call dbo.Date_Format from comment 28. I think we might be saying the same thing? Or perhaps I'm not understanding the intent of sql_date_format?
Moving my development to MSSQL 2008 as a target DB platform to compensate for inconsistencies between ANSI standards and the MS implementations. 2008 offers better handling for many issues including date handling, filtered indexes, encryption, compression, file storage etc..
Here's a question, how do we want to handle driver support?

Specifically for ODBC the driver used will depend on preference.

my $dsn = "dbi:ODBC:Driver={SQL Native Client};Server=$host;UID=$user;PWD=$pass;Database=$dbname"

Where 'SQL Native Client' should be $driver or similar.

As this need to support cross platform the driver will not be the same. Linux users would likely opt for odbc with freetds or their preference where as Windows Users can use the built-in driver or server specific version like 'Sql Server Native Client 10.0'
(In reply to comment #34)
> Here's a question, how do we want to handle driver support?
> 
> Specifically for ODBC the driver used will depend on preference.

  I'd say that we should make a decision on it for the user, and they can customize it if they want it to work some other way. I suspect most people using SQL Server will be installing Bugzilla on Windows, so we can default to the native client and people can customize if they want?
We can do that, given that making mssql work from a Linux box requires, generally, the user to know their way around. It shouldn't be too much of a hardship.
One more annoying thing, which is kinda sahred between IIS and ActiveState

perl.exe is fairly slow under iis when running bugzilla. So users should opt for PerlEx30.dll as a isapi module in iis. It is perl_mod ish in function from most comments I've seen. The problem here is that in ActivePerl-5.8.x the PerlEx30.dll seems not to play nicely with IIS6+ and requires flipping on IIS5 Isolation Mode compatibility. So we introduce the requirement to run ActivePerl-5.10.x which does have a newer build of PerlEx30.dll.
(In reply to comment #37)
> perl.exe is fairly slow under iis when running bugzilla. So users should opt
> for PerlEx30.dll as a isapi module in iis. 

  Okay, but that is a whole different issue from this patch, so we can discuss it in some separate bug if you want. Or you can post instructions on the Wiki for what people should do, which is more likely what we'd support (wiki instructions).
(In reply to comment #38)
>   Okay, but that is a whole different issue from this patch, so we can discuss
> it in some separate bug if you want. Or you can post instructions on the Wiki
> for what people should do, which is more likely what we'd support (wiki
> instructions).

Agreed. Dan Wierenga also comment that I should probably clarify the problem in
comment 37. ActiveState 5.10 should be required so that IIS6 (Windows 2003) does not have to be downgraded to runs in IIS5 compatibility mode (IIS5 was part of Windows 2000). Running under IIS5 compatibility mode opens up some security risks that are fixed under IIS6 in native mode. Additionally, though I have not tested yet, PerlEx30 from ActiveState 5.10 will run under IIS7 (Windows 2008) at which point downgrading to IIS5 compatibility is not a option.
Quick Update

The following tickets contain the changes necessary to run under MSSQL 2008 and IIS6 (Windows 2003) and (Windows 2008) IIS7:

bug 487437 bug 487438 bug 487439
Depends on: 504411
Depends on: 505362
Depends on: 525033
(In reply to comment #28)
> re: sql_date_format
> Note: The replace function in mssql is case-insensitive.

That is not always true.  Per the MSSQL docs (ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_6tsql/html/8a7aaaf2-62e3-46c0-8e44-fa22290dd86b.htm)

> REPLACE performs comparisons based on the collation of the input. To perform a comparison in a specified collation, you can use COLLATE to apply an explicit collation to the input.

So it sounds like one of three options is needed:

1.  Use the COLLATE clause to force case-insensitive replacement.
2.  Test for upper and lower
3.  Document that the params must be lowercase and make sure the rest of bugzilla uses lowercase.
Target Milestone: --- → Bugzilla 3.8
Whiteboard: [roadmap: 3.8]
Depends on: 526142
Depends on: 526144
(In reply to comment #41)
Or this should work.

SELECT REPLACE(cast(CAST(getdate() as date) as nvarchar),'-','')
SELECT REPLACE(cast(CAST(getdate() as date) as nvarchar),'-','.')
SELECT cast(CAST(getdate() as date) as nvarchar)+' '+ CAST(CAST(getdate() as TIME(0)) as nvarchar)
SELECT LEFT(CAST(getdate() as TIME(0)),5)
SELECT CAST(getdate() as TIME(0))
Depends on: 535821
Depends on: 565720
Target Milestone: Bugzilla 4.0 → Bugzilla 5.0
Depends on: 602165
Depends on: 602926
Ok despite my lack of updates, a brief comment on progress.

I have a now production instance of Bugzilla on MSSQL for my company. This instance has been online since July 27th. Stability has been good, we have found a few issues performance wise but have resolved them.

Main Issue Thus Far:
  Full Text Support:
     To make this work required altering the code in Search.pm, as doesn't allow for overrides from the DB. The existing sql_fulltext is relatively useless as for mssql as the implementation as is posts the result in the select, where and order by which makes mssql keel over. However mssql supports a cleaner more effect usage. Instead of joining bugs_full text you join the Table-Valued function CONTAINSTABLE()

    e.g.  LEFT JOIN CONTAINSTABLE(bug_fulltext, 'FindThis AND FindThat') ft_1 ON
                bugs.bug_id = ft_1.KEY

    You can then call ft_1.RANK in the select, where and order by as bugzilla likes to do. Not doing it this way means using:
CASE WHEN <table alias>.<column> LIKE '%<VALUE>%' THEN 1 ELSE 0 END
In sql_fulltext which is slow, you also end up with only very basic ranking, ie if the value is in 3 comments it is summed as a rank of 3 by adding the CASE WHEN outputs together. Never mind that you'll table scan ALOT of records meaning you better have a lot of disk space to parse results on your sql box, I mean A LOT.

I've adjusted the some of the clr functions to be faster, for instance hybriding the dbo.INSTR function by using regex to find the existence of the substring then using IndexOf is faster than just using IndexOf on all queried rows.

An alternative is LIKE again but you end up having to know your collation type and setting the case sensitive version to use sql_postion vs sql_iposition (the instr functions) LIKE also is slower if your searching for multiple terms on the comment fields either as 'OR' or 'AND' on a single term LIKE appears slightly faster. Remember you can't put a index on VARCHAR(MAX) which is what I substitute as for text. I do this because using text or ntext has other issues that would require even more modifications to use. NVARCHAR support db level compression as well if your using Enterprise edition.

There are also a few extra functions in the clr collection the can be useful if someone has time to investigate them. I use them with custom extension, (displaying custom multi selects in bug list for example)


Next Issue is a bug in Search.pm

There is some duplicate code the slow searches down a lot:

THIS CODE IS USELESS:

    foreach my $f ("short_desc", "longdesc", "bug_file_loc",
                   "status_whiteboard") {
        if (defined $params->param($f)) {
            my $s = trim($params->param($f));
            if ($s ne "") {
                my $n = $f;
                my $q = $dbh->quote($s);
                trick_taint($q);
                my $type = $params->param($f . "_type");
                push(@specialchart, [$f, $type, $s]);
            }
        }
    }

Why? Because about 200 lines above you find this:
    foreach my $field ($params->param()) {
        if (grep { $_->name eq $field } @legal_fields) {
            my $type = $params->param("${field}_type");
            if (!$type) {
                if ($field eq 'keywords') {
                    $type = 'anywords';
                }
                else {
                    $type = 'anyexact';
                }
            }
            $type = 'matches' if $field eq 'content';
            push(@specialchart, [$field, $type,
                                 join(',', $params->param($field))]);
        }
    }


The result of having both is that where conditions against the follow fields:
"short_desc", "longdesc", "bug_file_loc", "status_whiteboard"

Unnessisarily duplicated so search for a word in a comment and you effectively this:

WHERE ((longdescs.thetext LIKE '%blah%') AND (longdescs.thetext LIKE '%blah%'))

Ever wonder why terms show up in the terms list above a bug list multiple times? That's why. Search ANYWORDS on a multiword search on a comment and it explodes exponentially. I just researched that one this week, commenting the "bad" code made searches that took 30 seconds take half or even less time.

My guess is that it is old code left in because it doesn't break anything therefore is missed by QA as an artifact that can be removed.
Thanks for the status update!

I think I'm going to run into some similar problems with fulltext on SQLite, and probably on Pg when we implement fulltext there. I'm going to refactor Search.pm and the DB modules to do fulltext in a fashion that's more workable in a cross-database environment.

As far as that Search.pm slowdown, believe me, there are about a hundred more bugs than that in Search.pm. :-) Many of them are fixed on trunk (which has a completely refactored Search.pm) and many more will be. The bug you pointed out has definitely been fixed. :-)
I see the progress on this has stalled a bit. I have too been running a production instance of Bugzilla on MSSQL for my company for little over a year now and would be interested in helping get this one finalized if possible. 

I know mockodin has probably made more improvements to the preliminary patches he has posted but patches that are in place now with some small tweaks have worked perfectly ok for us so far.
(In reply to Jasmin Sehic from comment #45)
> I see the progress on this has stalled a bit. I have too been running a
> production instance of Bugzilla on MSSQL for my company for little over a
> year now and would be interested in helping get this one finalized if
> possible. 


  Hey Jasmin. That's great to hear, we'd love to have your help on this! Have you read our Contributor's Guide? It's at:

  http://wiki.mozilla.org/Bugzilla:Developers

  There's probably quite a bit of work to be done here, but the way forward is just to start tackling all the blockers for this bug, one by one. For many of them, if they're complex, it would probably be best to talk to me (mkanat) on IRC first, or if I simply am not around and can't be reached, LpSolit also can be really helpful (although I usually manage more of the database-side stuff than he does, which is the only reason I say to try to contact me first).
Ok I'll start from the top and work my way down. I have this running on the bleeding edge Bugzilla trunk so I will post patch updates based on this.
No longer depends on: 602926
@mockodin: what is preventing you from implementing MS SQL support? Which bug(s) exactly? I see the long list of bugs blocking this one in the dependency tree, but there is certainly a root cause blocking your work. I'm not CC'ed to this bug (as I don't want to be spammed here) so feel free to reply here for other users, but it would be nice if you could jump into the IRC channel (#bugzilla) to have some live discussion with core developers. Maybe what blocked your work some years ago no longer applies. We did a lot or work in the backend since 2005.
Whiteboard: [roadmap: 3.8] → [roadmap: 5.0]
Target Milestone: Bugzilla 4.4 → Bugzilla 5.0
No action on this bug in a while, and it's not necessary for 5.0.
Whiteboard: [roadmap: 5.0]
Target Milestone: Bugzilla 5.0 → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: