Closed Bug 232193 Opened 21 years ago Closed 8 years ago

bmo's systems (webheads, database, etc) should use UTC natively for o/s timezone and date storage

Categories

(bugzilla.mozilla.org :: General, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jouni, Assigned: dkl)

References

Details

User Story

Reviewer Testing Steps:

0. Populate the mysql.time_zone_names table with the latest timezone names.
   - http://dev.mysql.com/doc/refman/5.6/en/time-zone-support.html#time-zone-installation
1. Set test container to PST (America/Los_Angeles) and restart DB to set to new TZ. 
   - ln -sf /usr/share/zoneinfo/America/Los_Angeles /etc/localtime
   - Then restart MySQL.
2. Import test data from recent BMO snapshot.
3. Take several samples and record the dates such as creation_ts, comment timestamps, etc.
4. Set timezone of DB server to UTC.
   - sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime
   - Then restart MySQL.
5. Apply approved patch from bug 232193 to codebase
6. Run scripts/convert_datetime.pl to migrate datetime columns and values to UTC.
   - Must provide mysql root password as argument to the script.
7. Set default server timezone to 'America/Los_Angeles' instead of system timezone in default user preferences.
   System timezone should be set UTC on webheads.
8. Log in to Bugzilla as test user that has timezone set to 'system default' in user preferences.
   - Should also be able to manually set to 'America/Los_Angeles' and get the same values.
9. View sample bugs and make sure timestamps match what was previously recorded.

Production Steps:

1. Select collection of bugs and record dates such as creation_ts, delta_ts, comments, etc. 
2. Update the mysql timezone table
   - http://dev.mysql.com/doc/refman/5.6/en/time-zone-support.html#time-zone-installation
3. Block access to BMO (hardhat).
4. Suspend execution of all daemons (push, jobqueue, httpd, etc.).
5. Backup the database by moving a slave out of pool (turn off all replication?)
6. Set system timezone to UTC for the following:
   - sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime
   - all bmo nodes (including but not limited to webheads, jobqueue, admin, db-backup, and push servers)
   - master database and all slaves (restart mysqld after making this change)
7. Update cron jobs to run at UTC-8 instead of UTC so they still run on PST time.
8. Commit bug 232193 and perform full code push.
9. Run scripts/convert_datetime.pl to migrate datetime columns and values to UTC.
   - Must provide mysql root password as argument to the script.
10. Set default server timezone to 'America/Los_Angeles' instead of system timezone in default user preferences.
   System timezone should be set UTC on webheads.
11. Log in to Bugzilla as test user that has timezone set to 'system default' in user preferences.
   - Should also be able to manually set to 'America/Los_Angeles' and get the same values.
12. Verify that the previously recorded dates are displayed properly in the bugs selected.
    - should also test bug creation, searching by dates, charting, and whines
13. If not backout code change and move backup slave to master.
14. If all looks good:
    - re-enable stopped cron jobs (with new UTC-8 configuration) and verify working properly
    - un-hardhat BMO
    - Run some final sanity tests to verify proper operation

Attachments

(1 file, 7 obsolete files)

I propose that b.m.o should use another time zone than PDT. I'm not aware of the
developer distribution around here, but since the fall of Netscape it probably
isn't the same as it used to be. Calculating time differences from PDT is rather
hard, and most people know their Time Zone offset from GMT/UTC better than PDT.
Summary: bmo should use international Time Zone → bmo should use an international Time Zone (UTC/GMT)
If b.m.o decides to switch to UTC, it should use UTC exclusively. GMT was
deprecated more than 30 years ago no matter how much people at BBC are fond of GMT. 
 
Switching from one time zone to another doesn't make things much better, even if
more users find UTC easier to deal with than PST (which isn't clear at all; the
Mozilla Foundation's home is in PST, and we can expect significant activity to
continue to come from them).

Bug 182238 is a much better solution which will make every user's life better in
this respect, so that's what we're going to do.
Severity: minor → enhancement
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → WONTFIX
*** Bug 284411 has been marked as a duplicate of this bug. ***
Component: Bugzilla: Other b.m.o Issues → General
Product: mozilla.org → bugzilla.mozilla.org
Reopening as this has become an issue once again recently with the inability to replicate to another database that is running set up as UTC. We need to do this sooner or later so we can use this bug to track our progress.

http://www.pythian.com/blog/datetime-vs-timestamp/

We need to convert all date fields that are using the DATETIME data type to use TIMESTAMP as well to help this happen. Currently 23 columns are DATETIME type:

bugs.creation_ts
bugs.delta_ts
bugs.lastdiffed
bugs.deadline
bugs_activity.bug_when
longdescs.bug_when
longdescs_tags_activity.bug_when
attachments.creation_ts
attachments.modification_time
audit_log.at_time
flags.creation_date
flags.modification_date
profiles.last_seen_date
profiles_activity.profiles_when
logincookies.last_used
login_failure.login_time
tokens.issuedate
series_data.series_date
whine_schedules.run_next
email_rates.message_ts
bug_user_last_visit.last_visit_ts
user_api_keys.last_used
user_request_log.timestamp

Currently database is configured in Pacific time:

foreach $table in (keys %tables) {
    foreach $column (@{ $tables{$table} }) {
        $dbh->do('ALTER TABLE ? ADD COLUMN ? TIMESTAMP', undef, $table, $column . '_utc');
        $dbh->do("UPDATE TABLE ? SET ? = CONVERT_TZ(?, 'America/Los_Angeles', 'UTC')", undef, $table, $column . '_utc', $column);
    }
}

Restart database as UTC:

my.cnf:
[mysqld]
default_time_zone = 'UTC'

foreach $table in (keys %tables) {
    foreach $column (@{ $tables{$table} }) {
        $dbh->do('ALTER TABLE ? DROP COLUMN ?', undef, $table, $column);
        $dbh->do("ALTER TABLE ? CHANGE ? ? TIMESTAMP', undef, $table, $column . '_utc', $column);
    }
}

That would only be for the database. We would also need to do a lot of code auditing to see where this change could
have negative side effects.

Thoughts
dkl
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
You probably won't want the 2004 default assignee, then.
Assignee: endico → nobody
QA Contact: myk
this bug isn't the same as storing dates as UTC, morphing -- the default user-visible timezone for BMO should continue to be US/Pacific.

off the top of my head:

- your field list isn't complete (eg. it's missing nag_defer.defer_until, flag_state_activity.flag_when, probably more)

- all servers (webheads, adminnode, jobqueue, push, etc) should run UTC as well
  - some datetime math happens client-side
  - it's a bigger task to allow PST webheads talking to a UTC database, and one which is unnecessary
  - scheduling of cronjobs needs to be updated at the same time to ensure there are no gaps in execution
Summary: bmo should use an international Time Zone (UTC/GMT) → bmo's systems (webheads, database, etc) should use UTC natively for o/s timezone and date storage
(In reply to Byron Jones ‹:glob› from comment #6)
> this bug isn't the same as storing dates as UTC, morphing -- the default
> user-visible timezone for BMO should continue to be US/Pacific.
> 
> off the top of my head:
> 
> - your field list isn't complete (eg. it's missing nag_defer.defer_until,
> flag_state_activity.flag_when, probably more)

Sorry. Mistakenly forgot to add in the extension fields. Updated list:

attachments.creation_ts
attachments.modification_time
audit_log.at_time
bug_interest.modification_time
bug_user_last_visit.last_visit_ts
bugs.creation_ts
bugs.deadline
bugs.delta_ts
bugs.lastdiffed
bugs_activity.bug_when
email_rates.message_ts
flag_state_activity.flag_when
flags.creation_date
flags.modification_date
login_failure.login_time
logincookies.last_used
longdescs.bug_when
longdescs_activity.change_when
longdescs_tags_activity.bug_when
naf_defer.defer_until
profiles.last_seen_date
profiles.last_statistics_ts
profiles_activity.profiles_when
push.push_ts
push_backlog.attempt_ts
push_backlog.push_ts
push_backoff.next_atttempt_ts
push_log.processed_ts
push_log.push_ts
push_notify.delta_ts
series_data.series_date
tokens.issuedate
user_api_keys.last_used
user_request_log.timestamp
whine_schedules.run_next

> - all servers (webheads, adminnode, jobqueue, push, etc) should run UTC as
> well
>   - some datetime math happens client-side
>   - it's a bigger task to allow PST webheads talking to a UTC database, and
> one which is unnecessary
>   - scheduling of cronjobs needs to be updated at the same time to ensure
> there are no gaps in execution

Agreed that this is more than just the DB. AWS will be easier as it is all default UTC, even the webheads. Significantly more work to get SCL3 up to date but not impossible. I can look into what it would take client side.

dkl
Attached patch 232193_1.patch (obsolete) — Splinter Review
Reviewer Testing Steps:

1. Set test container to PST (America/Los_Angeles) and restart DB to set to new TZ. You can set it on Linux VM/Docker by 'ln -sf /usr/share/zoneinfo/America/Los_Angeles /etc/localtime'. Then restart MySQL.
2. Import test data from recent BMO snapshot.
3. Take several samples and record the dates such as creation_ts, comment timestamps, etc.
4. Set TZ of DB server to UTC using "$ sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime". Restart MySQL.
5. Apply patch to codebase
6. Run scripts/convert_datetime.pl to migrate datetime columns and values to UTC.
7. Set default server timezone to 'America/Los_Angeles' instead of system timezone in default user preferences. System timezone should now of course be UTC.
8. Log in to Bugzilla as test user that has timezone set to 'system default' in user preferences.
9. View sample bugs and make sure timestamps match what was previously recorded.

Production Steps:

1. Update TZ for webheads, jobqueue, admin, and push servers to UTC. All must match.
2. Update cron jobs to run at UTC-8 instead of UTC so they still run on PST time.
3. Set master and slaves in production to UTC and restart.
4. Repeat 5-7 from above steps on production master.
Assignee: nobody → dkl
Status: REOPENED → ASSIGNED
Attachment #8713692 - Flags: review?(dylan)
Attachment #8713692 - Flags: feedback?(glob)
Comment on attachment 8713692 [details] [diff] [review]
232193_1.patch

Review of attachment 8713692 [details] [diff] [review]:
-----------------------------------------------------------------

from a quick reading of the patch..

> Production Steps:
before:
- update the mysql timezone table
- block access to bmo
- stop all daemons (push, jobqueue, httpd, etc)
- backup the database
> 1. Update TZ for webheads, jobqueue, admin, and push servers to UTC. All must match.
> 2. Update cron jobs to run at UTC-8 instead of UTC so they still run on PST time.
> 3. Set master and slaves in production to UTC and restart.
> 4. Repeat 5-7 from above steps on production master.

please provide steps required to test the script worked, and bmo isn't broken.


note: we can (and should!) test the db conversion on dev and stage, however crons for all three systems are running on the same host that cannot be altered/tested until production.

::: Bugzilla/DB/Schema.pm
@@ +259,5 @@
>                                      NOTNULL => 1, DEFAULT => "''"},
>              bug_severity        => {TYPE => 'varchar(64)', NOTNULL => 1},
>              bug_status          => {TYPE => 'varchar(64)', NOTNULL => 1},
> +            creation_ts         => {TYPE => 'TIMESTAMP'},
> +            delta_ts            => {TYPE => 'TIMESTAMP', NOTNULL => 1},

wouldn't it have been easier to map DATETIME to TIMESTAMP in the mysql layer?

@@ +2050,5 @@
> +    # unless explicitly setup in the table definition. This will change in future releases
> +    # and can be disabled by using 'explicit_defaults_for_timestamp = 1' in my.cnf.
> +    # We do not want to mess with my.cnf unless we have to so we explicitly setup
> +    # TIMESTAMP types to not be automatic.
> +    if ($type =~ /^timestamp$/i) {

no need to fire up the regex engine for a case-insensitive string comparison:
if (lc($type) eq 'timestamp') {

@@ +2053,5 @@
> +    # TIMESTAMP types to not be automatic.
> +    if ($type =~ /^timestamp$/i) {
> +        $type_ddl .= " NULL" if !$finfo->{NOTNULL};
> +        $type_ddl .= " DEFAULT CURRENT_TIMESTAMP" if $finfo->{NOTNULL} && !defined $default;
> +    }

we should throw an error if someone tries to create DATETIME column instead of TIMESTAMP

::: Bugzilla/DB/Schema/Mysql.pm
@@ +121,2 @@
>  
>          DATETIME =>     'datetime',

DATETIME shouldn't be here - right?

::: scripts/convert_datetime.pl
@@ +1,1 @@
> +#!/usr/bin/env perl

use #!/usr/bin/perl to match every other script

@@ +11,5 @@
> +use strict;
> +use warnings;
> +
> +use Bugzilla;
> +use Bugzilla::DB;

this script needs Bugzilla->usage_mode(USAGE_MODE_CMDLINE);

and should have a confirmation step instead of immediately converting the data

@@ +15,5 @@
> +use Bugzilla::DB;
> +
> +my $bugs_dbh = Bugzilla->dbh;
> +
> +my $root_mysql_pw = shift || '';

the script should throw a syntax error if no parameters are provided that states that it needs the mysql root password.

@@ +23,5 @@
> +    db_host   => 'localhost',
> +    db_name   => 'mysql',
> +    db_user   => 'root',
> +    db_pass   => $root_mysql_pw
> +});

read db_host from localconfig instead of hardcoding localhost.

@@ +31,5 @@
> +       FROM INFORMATION_SCHEMA.COLUMNS
> +      WHERE TABLE_SCHEMA = ?
> +            AND DATA_TYPE='datetime'",
> +    undef, Bugzilla->localconfig->{db_name});
> +

check that mysql's timezone table has been populated and is update to date before starting this.
http://dev.mysql.com/doc/refman/5.6/en/time-zone-upgrades.html
Attachment #8713692 - Flags: feedback?(glob)
Moved testing/production to steps to user story field.
User Story: (updated)
Attached patch 232193_2.patch (obsolete) — Splinter Review
Thanks for the speedy feedback. Originally I added the additional TIMESTAMP type to Mysql.pm as I thought later down the road it may would be less confusing to someone make changes to just have both a DATETIME and the TIMESTAMP types. But admittedly it is much simpler to just override DATETIME. Not to mention easier to review. 

I also added the revised instructions to the user story field. So please take a look at that too.

I ran convert_datetime.pl against a sanitized copy of BMO and it took 50m10.379s to complete. So I would guesstimate at least double that for the production migration. We can see when we do the initial migration on dev/stage.

dkl
Attachment #8713692 - Attachment is obsolete: true
Attachment #8713692 - Flags: review?(dylan)
Attachment #8714562 - Flags: review?(dylan)
Attachment #8714562 - Flags: feedback?(glob)
User Story: (updated)
Comment on attachment 8714562 [details] [diff] [review]
232193_2.patch

Review of attachment 8714562 [details] [diff] [review]:
-----------------------------------------------------------------

::: Bugzilla/DB/Schema.pm
@@ +2043,4 @@
>      $type_ddl .= " PRIMARY KEY" if ($finfo->{PRIMARYKEY});
>      $type_ddl .= " NOT NULL" if ($finfo->{NOTNULL});
>  
> +    # TIMESTAMPS as of 5.6.6 still default to

it's weird for mysql specific stuff to live here and not in Bugzilla::DB::Mysql.
you should subclass get_type_ddl and append to the base result.

@@ +2046,5 @@
> +    # TIMESTAMPS as of 5.6.6 still default to
> +    # 'NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'
> +    # unless explicitly setup in the table definition. This will change in future releases
> +    # and can be disabled by using 'explicit_defaults_for_timestamp = 1' in my.cnf.
> +    # We do not want to mess with my.cnf unless we have to so we explicitly setup

"we do not want to mess with" should be "we cannot mess with", because rds.

@@ +2048,5 @@
> +    # unless explicitly setup in the table definition. This will change in future releases
> +    # and can be disabled by using 'explicit_defaults_for_timestamp = 1' in my.cnf.
> +    # We do not want to mess with my.cnf unless we have to so we explicitly setup
> +    # TIMESTAMP types to not be automatic.
> +    if (lc($type) eq 'timestamp'

this should be 'datetime' not 'timestamp'.

::: scripts/convert_datetime.pl
@@ +34,5 @@
> +unless ($mysql_tz_count) {
> +    die "The timezone table mysql.time_zone_name has not been populated." .
> +        "Please populate using instuctions at http://dev.mysql.com/doc/refman/5.6/en/time-zone-support.html#time-zone-installation " .
> +        "and re-run this script.\n";
> +}

you're not checking if the timezone data is up to date.
there's a way to test this on the page i linked in my last comment.

@@ +50,5 @@
> +    exit;
> +}
> +
> +print STDERR <<EOF;
> +About to move convert $total DATETIME columns to TIMESTAMP columns and migrate their values from PST to UTC.

s/move //
Attachment #8714562 - Flags: feedback?(glob)
Attached patch 232193_3.patch (obsolete) — Splinter Review
Attachment #8714562 - Attachment is obsolete: true
Attachment #8714562 - Flags: review?(dylan)
Attachment #8715021 - Flags: review?(dylan)
Attachment #8715021 - Flags: feedback?(glob)
Comment on attachment 8715021 [details] [diff] [review]
232193_3.patch

Review of attachment 8715021 [details] [diff] [review]:
-----------------------------------------------------------------

::: Bugzilla/DB/Schema/Mysql.pm
@@ +401,5 @@
> +sub get_type_ddl {
> +    my $self = shift;
> +    my $finfo = (@_ == 1 && ref($_[0]) eq 'HASH') ? $_[0] : { @_ };
> +    my $type = $finfo->{TYPE};
> +    confess "A valid TYPE was not specified for this column (got " . Dumper($finfo) . ")" unless ($type);

you shouldn't have to duplicate the parent class's sub here.
call it via SUPER and append to the result if required.

@@ +423,5 @@
> +    # 'NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'
> +    # unless explicitly setup in the table definition. This will change in future releases
> +    # and can be disabled by using 'explicit_defaults_for_timestamp = 1' in my.cnf.
> +    # So instead, we explicitly setup TIMESTAMP types to not be automatic.
> +    if (lc($type) eq 'timestamp'

'timestamp' isn't a valid datatype in db_schema, so it can be removed from here.
Attachment #8715021 - Flags: feedback?(glob)
(In reply to Byron Jones ‹:glob› from comment #14)
> @@ +423,5 @@
> > +    # 'NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'
> > +    # unless explicitly setup in the table definition. This will change in future releases
> > +    # and can be disabled by using 'explicit_defaults_for_timestamp = 1' in my.cnf.
> > +    # So instead, we explicitly setup TIMESTAMP types to not be automatic.
> > +    if (lc($type) eq 'timestamp'
> 
> 'timestamp' isn't a valid datatype in db_schema, so it can be removed from
> here.

It depends. When the bz_schema is loaded from the DB at instantiation-time it is adjusted to match the current DB being used. So DATETIME is changed to timestamp. This happens when creating new tables. Which is why I needed to check for either DATETIME or timestamp. For example, when using $dbh->bz_add_column, the type passed in to get_type_ddl is DATETIME and not timestamp.

Will try a different approach.

dkl
Attached patch 232193_4.patch (obsolete) — Splinter Review
glob, also did you get a chance to look at the revised procedure in user story?

Thanks
dkl
Attachment #8715021 - Attachment is obsolete: true
Attachment #8715021 - Flags: review?(dylan)
Attachment #8715378 - Flags: review?(dylan)
Attachment #8715378 - Flags: feedback?(glob)
Fubar. Can you give me an idea on the complexity involved on a system level on switching our SCL3 BMO infrastructure over to UTC? 

1. How hard will it be to set the timezones on each system? Simple puppet change or manual?
2. How many cron scripts do we have set up to run right now and how difficult would it be to get their run times shifted to UTC-8?
3. Is this something we could do during a tree closure window?
4. Do we need to restart the systems or does simply changing the link for /etc/localtime do it?

I will get with a DBA to see what would be involved on the DB side of things. We would need the DB systems to also be restarted after the timezone has changed. This would need to be done prior to running the conversion process.

dkl
Flags: needinfo?(klibby)
(In reply to David Lawrence [:dkl] from comment #16)
> glob, also did you get a chance to look at the revised procedure in user
> story?

yes - i made some changes to it in https://bugzilla.mozilla.org/show_bug.cgi?id=232193#a379268228_13647
Comment on attachment 8715378 [details] [diff] [review]
232193_4.patch

Review of attachment 8715378 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm (note i haven't performed the migration, i'll leave full testing with dylan).
Attachment #8715378 - Flags: feedback?(glob) → feedback+
Comment on attachment 8715378 [details] [diff] [review]
232193_4.patch

Review of attachment 8715378 [details] [diff] [review]:
-----------------------------------------------------------------

r-

Luckily I'm dealing with vm snapshots to trying this again will be easy although it takes a long time. Should I try increasing the disk space first
or the RAM?

::: scripts/convert_datetime.pl
@@ +69,5 @@
> +        $bugs_dbh->bz_rename_column($table, $column, $column . "_pst");
> +        $column_info->{TYPE} = 'TIMESTAMP';
> +        $column_info->{DEFAULT} = 'CURRENT_TIMESTAMP' if $column_info->{NOTNULL} && !$column_info->{DEFAULT};
> +        $bugs_dbh->bz_add_column($table, $column, $column_info);
> +        $bugs_dbh->do("UPDATE $table SET $column = CONVERT_TZ(" . $column . '_pst' . ", 'America/Los_Angeles', 'UTC')");

This can die with DBD::mysql::db do failed: The table 'longdescs' is full [for Statement "ALTER TABLE longdescs ADD COLUMN bug_when TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL"] at Bugzilla/DB.pm line 637.

longdescs is 4.7G for me, and there is more than double that amount of disk space free. When I re-run the script, it will skip longdescs but the install is broken because of a bugs_when_pst field existing and no bugs_when field existing.
Attachment #8715378 - Flags: review?(dylan) → review-
(In reply to Dylan William Hardison [:dylan] from comment #20)
> Comment on attachment 8715378 [details] [diff] [review]
> 232193_4.patch
> 
> Review of attachment 8715378 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r-
> 
> Luckily I'm dealing with vm snapshots to trying this again will be easy
> although it takes a long time. Should I try increasing the disk space first
> or the RAM?
> 
> ::: scripts/convert_datetime.pl
> @@ +69,5 @@
> > +        $bugs_dbh->bz_rename_column($table, $column, $column . "_pst");
> > +        $column_info->{TYPE} = 'TIMESTAMP';
> > +        $column_info->{DEFAULT} = 'CURRENT_TIMESTAMP' if $column_info->{NOTNULL} && !$column_info->{DEFAULT};
> > +        $bugs_dbh->bz_add_column($table, $column, $column_info);
> > +        $bugs_dbh->do("UPDATE $table SET $column = CONVERT_TZ(" . $column . '_pst' . ", 'America/Los_Angeles', 'UTC')");
> 
> This can die with DBD::mysql::db do failed: The table 'longdescs' is full
> [for Statement "ALTER TABLE longdescs ADD COLUMN bug_when TIMESTAMP DEFAULT
> CURRENT_TIMESTAMP NOT NULL"] at Bugzilla/DB.pm line 637.
> 
> longdescs is 4.7G for me, and there is more than double that amount of disk
> space free. When I re-run the script, it will skip longdescs but the install
> is broken because of a bugs_when_pst field existing and no bugs_when field
> existing.

Strange as I have ran this 3-4 times now on my docker environment and not had table full issues. As I understand it, this will create some very large temporary tables in the process and so I would make sure your tmpdir is ample size. Could be a memory issue maybe as my docker container will use all available ram on my system so could explain why it works for me. You may need to increase your VM ram size and/or disk size.

We should take with a DBA about how to safeguard about issues that could come up during the migration process and how to make things faster. Also some ideas might be good to see how we can rollback if something goes wrong in production. Schema changes are non-transactional so we can just wrap it all in a transaction. mpressman, is it you or someone else who would be best suited to give us some advice?

dkl
Flags: needinfo?(mpressman)
It doesn't seem to be the tmp dir, but the /var/lib/mysql dir. It needs more than 2x the size of the table apparently. Attempt #4 is in progress.
Alright, it seems with an 8g /tmp and some plenty (> 2G) on /var/lib/mysql the script completes, except with the errors noted in irc. (reproduced below)

Failed SQL: [ALTER TABLE series_data DROP COLUMN series_date_pst] Error: DBD::mysql::db do failed: Duplicate entry '1' for key 'series_data_series_id_idx' [for Statement "ALTER TABLE series_data DROP COLUMN series_date_pst"] at Bugzilla/DB.pm line 920.

sanitized_bugs_activity.bug_when does not exist in bz_schema and will need to fixed manually.
(In reply to Dylan William Hardison [:dylan] from comment #23)
> Alright, it seems with an 8g /tmp and some plenty (> 2G) on /var/lib/mysql
> the script completes, except with the errors noted in irc. (reproduced below)
> 
> Failed SQL: [ALTER TABLE series_data DROP COLUMN series_date_pst] Error:
> DBD::mysql::db do failed: Duplicate entry '1' for key
> 'series_data_series_id_idx' [for Statement "ALTER TABLE series_data DROP
> COLUMN series_date_pst"] at Bugzilla/DB.pm line 920.
> 
> sanitized_bugs_activity.bug_when does not exist in bz_schema and will need
> to fixed manually.

I hate to ask, but what happens if you reimport a fresh db, checksetup.pl it up to the latest schema and run the script once more to see if the error goes away? I have not encountered that error in my testing.

dkl
(In reply to David Lawrence [:dkl] from comment #17)
> Fubar. Can you give me an idea on the complexity involved on a system level
> on switching our SCL3 BMO infrastructure over to UTC? 
> 
> 1. How hard will it be to set the timezones on each system? Simple puppet
> change or manual?

Timezone can be set through puppet.


> 2. How many cron scripts do we have set up to run right now and how
> difficult would it be to get their run times shifted to UTC-8?

They're all in puppet, so it's pretty easy to change them en masse.


> 3. Is this something we could do during a tree closure window?

Based on your "all must match" note above regarding the change to production, I'm going to say it'll need a TCW so that we can shutdown all services, change the TZ and then restart everything. 

If I wanted to take the naive route, I'd suggest that the Bugzilla app shouldn't care what the host's TZ is. ;-D


> 4. Do we need to restart the systems or does simply changing the link for
> /etc/localtime do it?

Using a link is unsupported these days; it's change /etc/sysconfig/clock and run tzdata-update, now. And docs are ambiguous about restarts; RHEL doc doesn't say anything at all, some people advise rebooting or restarting any apps that use it.
Flags: needinfo?(klibby)
Thanks for the responses.

(In reply to Kendall Libby [:fubar] from comment #25)
> > 3. Is this something we could do during a tree closure window?
> 
> Based on your "all must match" note above regarding the change to
> production, I'm going to say it'll need a TCW so that we can shutdown all
> services, change the TZ and then restart everything. 

Cool. I figured that but wanted to be sure. But it seems from your comment that the time needed is not huge and could be done during an outage window.

> If I wanted to take the naive route, I'd suggest that the Bugzilla app
> shouldn't care what the host's TZ is. ;-D
 
It does in that users can set their preferences to use the servers local timezone as their default choice. 
So if the webserver's timezone is PST, then the user will see their timestamps in PST. So we definitely want
the webserver's timezone to match other systems so needs to be UTC. 
 
> > 4. Do we need to restart the systems or does simply changing the link for
> > /etc/localtime do it?
> 
> Using a link is unsupported these days; it's change /etc/sysconfig/clock and
> run tzdata-update, now. And docs are ambiguous about restarts; RHEL doc
> doesn't say anything at all, some people advise rebooting or restarting any
> apps that use it.

Ah cool. I have been just resetting the symlink in my test system and it does the right thing as well. At least on Centos7. But be a RHEL thing. I do not see a /etc/sysconfig/clock file under Centos. As for rebooting, seems like we will have time for that anyway.

dkl
dkl - i suspect the custom reports will need to be updated.

eg. the 'release tracking report' queries by date; after this change the time frame will be shifted by 8 hours, potentially returning incorrect results.
(In reply to Dylan William Hardison [:dylan] from comment #23)
> Alright, it seems with an 8g /tmp and some plenty (> 2G) on /var/lib/mysql
> the script completes, except with the errors noted in irc. (reproduced below)
> 
> Failed SQL: [ALTER TABLE series_data DROP COLUMN series_date_pst] Error:
> DBD::mysql::db do failed: Duplicate entry '1' for key
> 'series_data_series_id_idx' [for Statement "ALTER TABLE series_data DROP
> COLUMN series_date_pst"] at Bugzilla/DB.pm line 920.
> 
> sanitized_bugs_activity.bug_when does not exist in bz_schema and will need
> to fixed manually.

I actually got this error this time around. Or it may have been there and looked like normal output at first glance.

Converting series_data.series_date to TIMESTAMP...
Renaming column 'series_data.series_date' to
'series_data.series_date_pst'...
Adding new column 'series_date' to the 'series_data' table...
Deleting the 'series_date_pst' column from the 'series_data' table...
Failed SQL: [ALTER TABLE series_data DROP COLUMN series_date_pst] Error: DBD::mysql::db do failed: Duplicate entry '1' for key 'series_data_series_id_idx' [for Statement "ALTER TABLE series_data DROP COLUMN series_date_pst"] at Bugzilla/DB.pm line 920.

I will look into this. One thing off the top of my head is that the unique index was added to the old column at a later time once there was already duped values and now inserting them into the new column is failing.

dkl
(In reply to David Lawrence [:dkl] from comment #28)
> I actually got this error this time around. Or it may have been there and
> looked like normal output at first glance.
> 
> Converting series_data.series_date to TIMESTAMP...
> Renaming column 'series_data.series_date' to
> 'series_data.series_date_pst'...
> Adding new column 'series_date' to the 'series_data' table...
> Deleting the 'series_date_pst' column from the 'series_data' table...
> Failed SQL: [ALTER TABLE series_data DROP COLUMN series_date_pst] Error:
> DBD::mysql::db do failed: Duplicate entry '1' for key
> 'series_data_series_id_idx' [for Statement "ALTER TABLE series_data DROP
> COLUMN series_date_pst"] at Bugzilla/DB.pm line 920.
> 
> I will look into this. One thing off the top of my head is that the unique
> index was added to the old column at a later time once there was already
> duped values and now inserting them into the new column is failing.
> 
> dkl

Hmm

---------

This is a very well known and fully documented limitation of our online ALTER operation. Here is the text:

When running an online ALTER TABLE operation, the thread that runs the ALTER TABLE operation will apply an “online log” of DML operations that were run concurrently on the same table from other connection threads. When the DML operations are applied, it is possible to encounter a duplicate key entry error (ERROR 1062 (23000): Duplicate entry), even if the duplicate entry is only temporary and would be reverted by a later entry in the “online log”. This is similar to the idea of a foreign key constraint check in InnoDB in which constraints must hold during a transaction.

You can read about all limitation of online DDL here:

https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-limitations.html
for bmo, it would be me, however my priorities have for the most part shifted, so if you can ask direct questions, I will try and respond as quickly as I can
Flags: needinfo?(mpressman)
Also, if you have specific mysql questions, not necessarily bmo questions, please direct that to pythian, they have been contracted to handle most of our mysql work, with the exception of work requiring access to sensitive information like bmo
(In reply to Matt Pressman [:mpressman] from comment #30)
> for bmo, it would be me, however my priorities have for the most part
> shifted, so if you can ask direct questions, I will try and respond as
> quickly as I can

Thanks Matt. My main questions for you are from comment 21 where we asked:

1. How we can better safeguard against issues that could come up during the migration process and how to make things faster. How do we determine what the resource limitations would be altering large tables such as the ones in BMO.
2. Also some ideas on how we can rollback if something goes wrong in production. Schema changes are non-transactional so we can just wrap it all in a transaction. Do we just take the master out of replication, and if things go south, do we just make one of the slaves a new master and rebuild the old master? How much work/downtime would that be on the day of the migration?

Questions for Pythian would be comment 28 and comment 29.

1. What is any tips or tricks on how to work around the 'Duplicate entry' error when alter a table's column that is part of a n unique index?

Thanks
dkl
Flags: needinfo?(mpressman)
The best safeguard would be to ensure that the process is worked out. As in comment 22 you discovered the need to have the disk size increased. An alter will make a duplicate temporary table while data is copied over thus requiring the extra space. Secondly, and related to your second question. It would be beneficial to have a slave, most likely the backup have replication disabled so that you can revert to it's copy. Regardless of which method used, it's a good idea to have an active copy that can be quickly copied over. If necessary to go this route. It would take the time to copy over to the other hosts. Maybe, an hour? Another way to perform the schema changes on slaves themselves and work your way up to the master, when the master is all that's left, fail that over and perform the changes on the previous master. This limits downtime as one host can be worked on at a time. There are tools, like pt-online-schema-change from the percona toolkit that may prove beneficial. This is primarily useful in cases of performing live changes. 

For the duplicate entry error when altering a table's column. Ideally, you would drop the index first, before modifying the table, then adding the index back in. I'm not sure of a workaround for altering the schema. You can disable unique checks with set UNIQUE_CHECKS=0; That is typically used for importing data in order to speed up inserts. It may be worth a quick test, but ideally dropping the index first would be the way to go.
Flags: needinfo?(mpressman)
Attached patch 232193_5.patch (obsolete) — Splinter Review
Updated to deal with dropping and re-adding of indexes involved with the columns we are converting. Seems to help with the issue we were having at least on my system. Please give this a try and a fresh snapshot and let me know if it works.

dkl
Attachment #8729557 - Flags: review?(dylan)
Note to self:

I did get the following errors outputted at the end:
autoland_attachments.status_when does not exist in bz_schema and will need to fixed manually.
sanitized_bugs_activity.bug_when does not exist in bz_schema and will need to fixed manually.

Not sure we need to care about the latter since it is only present in the sanitized dumps but I will look into the first one.

Dylan, please verify you see the same.

dkl
(In reply to David Lawrence [:dkl] from comment #35)
> autoland_attachments.status_when

isn't that the old autoland stuff that never got enabled in production?

> sanitized_bugs_activity.bug_when does not exist in bz_schema and will need
> to fixed manually.
> 
> Not sure we need to care about the latter since it is only present in the
> sanitized dumps but I will look into the first one.

iirc sanitized_bugs_activity is a view - can you check if that view is still required?  (iirc it was created for really old metrics stuff which is likely long gone).  i'd start by asking a dba which users have access to that view, and checking when those accounts were last used.
Depends on: 1259257
(In reply to Byron Jones ‹:glob› from comment #36)
> (In reply to David Lawrence [:dkl] from comment #35)
> > autoland_attachments.status_when
> 
> isn't that the old autoland stuff that never got enabled in production?

Yeah. Somehow it is still present in the DB but not in bz_schema. We will need to manually drop the tables. I could do it as part of the convert_datetime.pl script or do it as a separate bug. The entries in the tables are over 3 years old.

> > sanitized_bugs_activity.bug_when does not exist in bz_schema and will need
> > to fixed manually.
> > 
> > Not sure we need to care about the latter since it is only present in the
> > sanitized dumps but I will look into the first one.
> 
> iirc sanitized_bugs_activity is a view - can you check if that view is still
> required?  (iirc it was created for really old metrics stuff which is likely
> long gone).  i'd start by asking a dba which users have access to that view,
> and checking when those accounts were last used.

bug 1259257
Depends on: 1259258
Attachment #8715378 - Attachment is obsolete: true
Attached patch 232193_6.patch (obsolete) — Splinter Review
(In reply to Byron Jones ‹:glob› from comment #27)
> dkl - i suspect the custom reports will need to be updated.
> 
> eg. the 'release tracking report' queries by date; after this change the
> time frame will be shifted by 8 hours, potentially returning incorrect
> results.

I have updated the Release Tracking Report to convert on the fly from UTC to Pacific when searching.

dkl
Attachment #8729557 - Attachment is obsolete: true
Attachment #8729557 - Flags: review?(dylan)
Attachment #8734149 - Flags: review?(dylan)
Comment on attachment 8734149 [details] [diff] [review]
232193_6.patch

meant to set feedback?(glob) to get his opinion on my change for release tracking reports.
Attachment #8734149 - Flags: feedback?(glob)
Comment on attachment 8734149 [details] [diff] [review]
232193_6.patch

Review of attachment 8734149 [details] [diff] [review]:
-----------------------------------------------------------------

CONVERT_TZ looks sane to me.  i assume you've checked all reports that take dates as input.
Attachment #8734149 - Flags: feedback?(glob) → feedback+
Attached patch 232193_7.patch (obsolete) — Splinter Review
- Added timezone conversion for UserActivity report as well.
Attachment #8734149 - Attachment is obsolete: true
Attachment #8734149 - Flags: review?(dylan)
Attachment #8736356 - Flags: review?(dylan)
Comment on attachment 8736356 [details] [diff] [review]
232193_7.patch

Review of attachment 8736356 [details] [diff] [review]:
-----------------------------------------------------------------

for the bit rot, but it completed anyway.
Attachment #8736356 - Flags: review?(dylan) → review-
Attached patch 232193_8.patchSplinter Review
fixed broken patch
Attachment #8736356 - Attachment is obsolete: true
Attachment #8736912 - Flags: review?(dylan)
Comment on attachment 8736912 [details] [diff] [review]
232193_8.patch

Review of attachment 8736912 [details] [diff] [review]:
-----------------------------------------------------------------

I have tested this with the steps outlined in the user story, and confirmed timestamps all over the codebase.

I've paid special attention to the times when US DST comes into effect and when it comes out of effect.
Attachment #8736912 - Flags: review?(dylan) → review+
r+ too. Although for some reason I still had to "git apply --whitespace=fix"
DONE - Step 0: mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql
DONE - skip the time zone Pacific part, since this is already there
DONE - Step 4: Set timezone of DB server to UTC.
   - sudo ln -sf /usr/share/zoneinfo/UTC /etc/localtime
   - Then restart MySQL.

Verified: 

mysql> show variables like '%time_zone%';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| system_time_zone | UTC    |
| time_zone        | SYSTEM |
+------------------+--------+
2 rows in set (0.00 sec)
User Story: (updated)
Blocks: 1270270
This migration has been completed. Closing. Please open new bugs if anyone finds issues related to timestamps, etc. as a result of this migration.

Thanks
dkl
Status: ASSIGNED → RESOLVED
Closed: 21 years ago8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: