Closed
Bug 1378433
Opened 7 years ago
Closed 7 years ago
Update from MySQL 5.6 to 5.7
Categories
(Tree Management :: Treeherder, enhancement, P2)
Tree Management
Treeherder
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
Attachments
(9 files)
47 bytes,
text/x-github-pull-request
|
camd
:
review+
|
Details | Review |
63 bytes,
text/x-github-pull-request
|
dividehex
:
review+
dividehex
:
checkin+
|
Details | Review |
8.71 KB,
text/plain
|
Details | |
63 bytes,
text/x-github-pull-request
|
emorley
:
review+
emorley
:
checkin+
|
Details | Review |
47 bytes,
text/x-github-pull-request
|
Details | Review | |
7.66 KB,
text/plain
|
Details | |
63 bytes,
text/x-github-pull-request
|
emorley
:
review+
emorley
:
checkin+
|
Details | Review |
8.71 KB,
text/plain
|
Details | |
63 bytes,
patch
|
dividehex
:
review+
dividehex
:
checkin+
|
Details | Diff | Splinter Review |
MySQL 5.7 has much better performance for some schema migration operations (eg 2-3 times as fast), as well as a number of other performance and bug fixes.
First step will be testing MySQL 5.7 in Vagrant, then Travis, then on the Heroku prototype's RDS instance, ...
Assignee | ||
Comment 1•7 years ago
|
||
For MySQL 5.6 -> 5.7 upgrades, the upgrade process forces the use of the new temporal types, which if upgrading from tables created by a version older than 5.6.4 will mean very long ALTER TABLES during the upgrade.
Using the scripts at:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_UpgradeDBInstance.MySQL.html
https://cloudiamo.com/2016/11/14/migrating-a-rds-mysql-5-6-to-rds-mysql-5-7/
...it seems as though we don't have any such tables so should be fine. The test on prototype will confirm this (depending on whether it just takes 5 minutes or hours).
Comment 2•7 years ago
|
||
Assignee | ||
Comment 3•7 years ago
|
||
Comment on attachment 8884109 [details] [review]
[treeherder] mozilla:mysql-5.7 > mozilla:master
This switches Vagrant/Travis, and I think is worth landing now rather than waiting for the Terraform pieces, so we get longer to sanity check the new version locally through normal usage.
Notes on changes on the new version:
https://dev.mysql.com/doc/refman/5.7/en/upgrading-from-previous-series.html
Once we're done with this bug (switching all environments to 5.7), we can also look into removing our custom sql_mode setting and using the new stricter default in 5.7 (we had to enable a few things since the 5.6 defaults were pretty lax, but now there are even more things enabled out of the box):
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sql-mode-changes
Attachment #8884109 -
Flags: review?(cdawson)
Assignee | ||
Comment 4•7 years ago
|
||
Attachment #8884252 -
Flags: review?(jwatkins)
Updated•7 years ago
|
Attachment #8884109 -
Flags: review?(cdawson) → review+
Comment 5•7 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/fd75e9ae15fcb2d2181097d05c8e2f040399916f
Bug 1378433 - Update to MySQL 5.7 in Vagrant/Travis
The Vagrant environment is running Ubuntu 16.04, which has a native
mysql-server-5.7 package available, so doesn't need a custom APT
repository to be set.
Travis is still on Ubuntu 14.04 (since they don't offer 16.04 yet),
so has to use the upstream mysql.com repository instead.
In the future I'll be switching both Travis and local development to
use the same Docker images to avoid these kind of differences.
After this lands, we'll want to open a PR against the Terraform configs
to switch the treeherder-dev RDS instance to MySQL 5.7, before doing
the same for stage/prod.
https://github.com/mozilla/treeherder/commit/225c772e38b732e0bb6ee6cb494870ffc7ba4a41
Bug 1378433 - Stop vendoring libmysqlclient in Vagrant/Travis
Now that we're using MySQL 5.7 server, there is no longer a conflict
between our desired choice of client library (which had to be 5.7 to
avoid security vulnerabilities) and the server version.
The library is still vendored on Heroku for now (in `bin/pre_compile`),
but that can stop too, once we switch to the Heroku-16 stack which is
based on Ubuntu 16.04 (bug 1365567).
Comment 6•7 years ago
|
||
Comment on attachment 8884252 [details] [review]
devservices-aws PR#53: Upgrade treeherder-dev to MySQL 5.7
Git merged and terraform has been applied. I temporarily enabled allow_major_version_upgrade to allow the upgrade. Once the upgrade completed and state achieved, allow_major_version_upgrade was removed to be set back to false (default).
~ aws_db_instance.treeherder-dev-rds
allow_major_version_upgrade: "" => "true"
engine_version: "5.6.34" => "5.7.17"
parameter_group_name: "treeherder" => "treeherder-mysql57"
aws_db_instance.treeherder-dev-rds: Still modifying... (ID: treeherder-dev, 1h12m51s elapsed)
aws_db_instance.treeherder-dev-rds: Modifications complete (ID: treeherder-dev)
Attachment #8884252 -
Flags: review?(jwatkins)
Attachment #8884252 -
Flags: review+
Attachment #8884252 -
Flags: checkin+
Assignee | ||
Comment 7•7 years ago
|
||
Thank you for merging/applying that :-)
(In reply to Jake Watkins [:dividehex] from comment #6)
> Git merged and terraform has been applied. I temporarily enabled
> allow_major_version_upgrade to allow the upgrade. Once the upgrade
> completed and state achieved, allow_major_version_upgrade was removed to be
> set back to false (default).
Ah the `allow_major_version_upgrade` setting doesn't appear to exist in the UI so wasn't aware of it. I see it now on:
https://www.terraform.io/docs/providers/aws/r/db_instance.html
Looking at the timestamps in the RDS logs (plus the "1h12m51s elapsed" above) got me concerned initially that the update did need the ALTER TABLEs for legacy date fields after all (comment 1) - however the `mysqlUpgrade` log (attached) shows the upgrade itself was quick - it was just the backup that took a while.
When I checked the instance just now, the parameter group was showing as "pending reboot", so I had to reboot it.
Also, the event log shows:
July 7, 2017 at 10:09:21 PM UTC+1 DB instance shutdown
July 7, 2017 at 10:09:29 PM UTC+1 Backing up DB instance
July 7, 2017 at 10:11:13 PM UTC+1 DB instance shutdown
July 7, 2017 at 10:11:54 PM UTC+1 DB instance restarted
July 7, 2017 at 10:12:43 PM UTC+1 Updated to use DBParameterGroup treeherder-mysql57
July 7, 2017 at 10:52:41 PM UTC+1 Finished DB Instance backup
July 7, 2017 at 10:52:41 PM UTC+1 Database instance patched
July 7, 2017 at 10:52:51 PM UTC+1 Backing up DB instance
July 7, 2017 at 11:16:15 PM UTC+1 Finished DB Instance backup
This makes me think that Terraform didn't apply all the changes as one action, but tired to first apply the parameter group before updating the version. I also noticed there is a bug relating to updating master+replica combos (like we use for prod):
https://github.com/terraform-providers/terraform-provider-aws/issues/775
Given all of the above, I wonder whether we should just do stage/prod by hand (when the time comes), and then update the Terraform configs retrospectively? (Particularly given the need to do the `allow_major_version_upgrade` dance too.)
(In reply to Ed Morley [:emorley] from comment #3)
> Once we're done with this bug (switching all environments to 5.7), we can
> also look into removing our custom sql_mode setting and using the new
> stricter default in 5.7
Note to self:
For MySQL 5.7 the AWS default parameter group actually overrides the new upstream MySQL defaults (presumably so customers don't blame them for breakage), so we can't remove the setting entirely.
Assignee | ||
Comment 8•7 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #7)
> Given all of the above, I wonder whether we should just do stage/prod by
> hand (when the time comes), and then update the Terraform configs
> retrospectively? (Particularly given the need to do the
> `allow_major_version_upgrade` dance too.)
Things appear to be working well on the dev instance, so I think we're ready to try stage soon.
What are you thoughts on the above? :-)
Flags: needinfo?(jwatkins)
Comment 9•7 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #8)
> (In reply to Ed Morley [:emorley] from comment #7)
> > Given all of the above, I wonder whether we should just do stage/prod by
> > hand (when the time comes), and then update the Terraform configs
> > retrospectively? (Particularly given the need to do the
> > `allow_major_version_upgrade` dance too.)
>
> Things appear to be working well on the dev instance, so I think we're ready
> to try stage soon.
>
> What are you thoughts on the above? :-)
I'm fine with doing it either way. The `allow_major_version_upgrade` dance wasn't much of an inconvenience. I would personally opt for rolling the upgrade through terraform again since it worked fine with dev and there is no need to go back and match state after the fact.
It's your call, I'm game for either. Just let me know when and/or drop a PR. :-)
Flags: needinfo?(jwatkins)
Assignee | ||
Comment 10•7 years ago
|
||
My main concern in comment 7 was that Terraform seemed to be applying the changes as two separate steps:
1) Changing parameter group on its own (but not properly applying it due to lack of restart)
2) Updating the MySQL version
This left the prototype instance using the default parameter group until I restarted, which means the wrong DB collation (amongst other things) until I manually restarted the instance.
I was under the impression that if done via the AWS console the two steps would normally be done as one action, avoiding this problem?
Assignee | ||
Comment 11•7 years ago
|
||
Attachment #8886571 -
Flags: review?(jwatkins)
Comment 12•7 years ago
|
||
Comment 13•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/ed34567863fba80eb27f519b6b3cc22366b6a378
Bug 1378433 - Skip test_mysqlclient_tls_enforced() on Travis
Since we have to upgrade from MySQL 5.6 to 5.7 and there are still
enough remnants behind that cause the test to fail. Multiple attempts
have been made to clean them up to no avail, and since this will go
away soon when we switch Travis to using Docker images, it's not worth
spending any more time on it.
Comment 14•7 years ago
|
||
I initiated the upgrade on treeherder-stage rds starting at 9:13am PDT but terraform ended up timing out. The view from the console shows it is still in the process of upgrading. I'm getting the impression the upgrade is stuck at the moment.
aws_db_instance.treeherder-stage-rds: timeout while waiting for state to become 'available' (last state: 'upgrading', timeout: 1h20m0s)
Assignee | ||
Comment 15•7 years ago
|
||
I believe the upgrade itself has completed, but RDS keeps the "upgrading" state until the backup has finished - at least judging from when dev was upgraded. Though the backup really should have completed by now, so not sure.
Assignee | ||
Comment 16•7 years ago
|
||
For posterity, Jake opened an AWS support ticket, whose response was:
"""
I have taken a look at your RDS instance treeherder-stage and can indeed see it is currently going through the update process. Currently a snapshot is being taken of your RDS instance as a security measure which is what is taking a majority of the time. Your RDS instance however should be usable in it's current state.
In order for your RDS state to be in an available state, the current shapshot will need to coplete. Right now your snapshot is approximately at 56%. I can not provide an estimate however of the amount of time it may take for the snapshot to complete as this will depend on the RDS instance workload.
I have confirmed however that the instance is healthy and this process is progressing as expected.
...
"""
That 56% was 4.5 hours after the update started.
The backup finally finished 9 hours after the update started. Compare to the 40 minutes that it took for the dev instance (which is the same instance size and DB size).
The good news is that New Relic shows the DB was only actually inaccessible for ~4 minutes, the rest was just the background backup.
Note I still had to manually restart the instance for the parameter group to apply. Perhaps we need to set the "apply immediately" option when we do prod?
Assignee | ||
Comment 17•7 years ago
|
||
Comment on attachment 8886571 [details] [review]
devservices-aws PR#56: Upgrade treeherder-stage to MySQL 5.7
https://github.com/mozilla-platform-ops/devservices-aws/commit/63b329d157bf2327a0d9b9ce70b8197a757c4479
Attachment #8886571 -
Flags: review?(jwatkins)
Attachment #8886571 -
Flags: review+
Attachment #8886571 -
Flags: checkin+
Assignee | ||
Comment 18•7 years ago
|
||
Assignee | ||
Comment 19•7 years ago
|
||
Everything looking ok for prototype/stage, I start the process for prod once I'm back from PTO next week.
Assignee | ||
Comment 20•7 years ago
|
||
Jake, I don't suppose we could try making this change sometime over the weekend, or else say early Monday, just in case there are any hiccups?
Attachment #8893697 -
Flags: review?(jwatkins)
Comment 21•7 years ago
|
||
I can see about doing this tomorrow morning (EST) if that's acceptable
Assignee | ||
Comment 22•7 years ago
|
||
That would be great - thank you :-)
Assignee | ||
Comment 23•7 years ago
|
||
Comment on attachment 8893697 [details] [review]
devservices-aws PR#60: Upgrade treeherder-prod to MySQL 5.7
https://github.com/mozilla-platform-ops/devservices-aws/commit/c0c3c6a76859108f8d8f08d1dfe92f455ff490a3
And an additional commit to update the replica too:
https://github.com/mozilla-platform-ops/devservices-aws/commit/9fbd55eba5afe8b8ec9c2dc96f8d006f2d0f0d02
Attachment #8893697 -
Flags: review?(jwatkins)
Attachment #8893697 -
Flags: review+
Attachment #8893697 -
Flags: checkin+
Assignee | ||
Comment 24•7 years ago
|
||
Assignee | ||
Comment 25•7 years ago
|
||
Other than cleaning up the now unused RDS treeherder custom mysql5.6 parameter group, we're all done here :-)
Attachment #8895329 -
Flags: review?(jwatkins)
Assignee | ||
Updated•7 years ago
|
Attachment #8895329 -
Flags: review?(klibby)
Comment 26•7 years ago
|
||
Comment on attachment 8895329 [details] [diff] [review]
devservices-aws PR#63: Remove MySQL 5.6 specific parameter group
Review of attachment 8895329 [details] [diff] [review]:
-----------------------------------------------------------------
Terribly sorry for the delay to this review. Cleanup had to take a backseat to relops OKRs. Patch LGTM
Attachment #8895329 -
Attachment is patch: true
Attachment #8895329 -
Attachment mime type: text/x-github-pull-request → text/plain
Attachment #8895329 -
Flags: review?(jwatkins) → review+
Comment 27•7 years ago
|
||
Comment on attachment 8895329 [details] [diff] [review]
devservices-aws PR#63: Remove MySQL 5.6 specific parameter group
Merged and applied
Attachment #8895329 -
Flags: review?(klibby) → checkin+
Assignee | ||
Updated•7 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•