Closed
Bug 1219321
Opened 9 years ago
Closed 9 years ago
Etherpad data integrity issues
Categories
(Infrastructure & Operations :: Change Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: scabral, Unassigned)
Details
Etherpad has data integrity issues, and we'd like to fail over the database to the slave to try to mitigate them.
Comment 1 (Reporter) • 9 years ago
Etherpad db was filling up disk space in /, so I moved the db to /data. In the process, the MyISAM table "store" was marked as crashed and needed to be repaired. The repair restored 120 records:

2015-10-28 16:27:59 13292 [Note] Found 4143174 of 4143054 rows when repairing './etherpad/store'

+----------------+--------+----------+---------------------------------------------------+
| Table          | Op     | Msg_type | Msg_text                                          |
+----------------+--------+----------+---------------------------------------------------+
| etherpad.store | repair | info     | Wrong bytesec: 108-108- 44 at 1116783308; Skipped |
| etherpad.store | repair | info     | Wrong bytesec: 115-115-105 at 1116778176; Skipped |
| etherpad.store | repair | info     | Wrong bytesec:  52- 71-116 at 1116785704; Skipped |
| etherpad.store | repair | info     | Wrong bytesec:  58-106-105 at 1116785548; Skipped |
| etherpad.store | repair | info     | Wrong bytesec:  58- 49-110 at 1116794512; Skipped |
| etherpad.store | repair | warning  | Number of rows changed from 4143054 to 4143174    |
| etherpad.store | repair | status   | OK                                                |
+----------------+--------+----------+---------------------------------------------------+
7 rows in set (35.71 sec)

This resulted in a 4 minute outage from 16:23 to 16:27 UTC (9:23 - 9:27 Pacific).
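For anyone checking the "120 records" figure, here is a minimal Python sketch that pulls the before/after row counts out of the repair warning and takes the difference. The function name and the regular expression are illustrative assumptions, not part of any MySQL tooling; the sample line is copied from the repair output above.

```python
import re

# Matches the warning line that REPAIR TABLE printed in the output above.
ROWS_CHANGED = re.compile(r"Number of rows changed from (\d+) to (\d+)")


def rows_recovered(repair_output: str) -> int:
    """Return (rows after repair) - (rows before repair).

    Hypothetical helper: it only parses text, it does not talk to MySQL.
    """
    match = ROWS_CHANGED.search(repair_output)
    if match is None:
        raise ValueError("no row-count warning found in repair output")
    before, after = (int(group) for group in match.groups())
    return after - before


sample = (
    "| etherpad.store | repair | warning  "
    "| Number of rows changed from 4143054 to 4143174    |"
)
recovered = rows_recovered(sample)
print(recovered)
```

A positive result means the repair found more rows than the table header claimed, which matches the [Note] line (4143174 found of 4143054 expected).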
Comment 2 (Reporter) • 9 years ago
Received a complaint that https://public.etherpad-mozilla.org/p/measurement-team-meeting-notes is blank after the outage. Etherpad's database is not structured in a way that lets us extract text or history without using the API, and there aren't any command-line tools available, so we'd like to fail over to the redundant slave (which didn't crash) in the hope that the history/text is still there.
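As a rough illustration of what "using the API" could look like, the sketch below builds request URLs for a pad's text and revision count. It assumes the instance is Etherpad Lite with its HTTP API (version 1, functions `getText` and `getRevisionsCount`) enabled; this bug does not say which API or version was in use. The base URL and API key are placeholders, and the helper name is hypothetical. Nothing here makes a network request.

```python
from urllib.parse import urlencode


def etherpad_api_url(base_url: str, function: str, api_key: str,
                     pad_id: str, **params: object) -> str:
    """Build an Etherpad Lite HTTP API (v1) URL.

    Assumption: the server exposes /api/1/<function> and takes
    `apikey` and `padID` as query parameters.
    """
    query = urlencode({"apikey": api_key, "padID": pad_id, **params})
    return f"{base_url.rstrip('/')}/api/1/{function}?{query}"


# Placeholder host and key; the pad name is the one from the complaint.
BASE = "https://pad.example.org"
KEY = "PLACEHOLDER_API_KEY"
PAD = "measurement-team-meeting-notes"

count_url = etherpad_api_url(BASE, "getRevisionsCount", KEY, PAD)
text_url = etherpad_api_url(BASE, "getText", KEY, PAD, rev=5)
print(count_url)
print(text_url)
```

If the revision count comes back non-zero, requesting `getText` with an earlier `rev` is one way to see whether older content survives.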
Updated (Reporter) • 9 years ago
Assignee: team73 → server-ops
Component: DB: MySQL → Change Requests
Product: Data & BI Services Team → Infrastructure & Operations
QA Contact: scabral → lypulong
Updated (Reporter) • 9 years ago
Change Request: --- → ?
Comment 3 (Reporter) • 9 years ago
needinfo'ing jakem, as this needs updating: https://mana.mozilla.org/wiki/display/websites/etherpad.mozilla.org#etherpad.mozilla.org-RestartEtherpad
Flags: needinfo?(nmaul)
Comment 4 (Reporter) • 9 years ago
Committed r109826 to the svn sysadmins repo to change the config of the etherpad dbs, swapping master and slave (configs only; nothing changes until the load balancer changes).
Comment 5 (Reporter) • 9 years ago
:atoll stopped etherpad, I updated the load balancer, and :atoll restarted etherpad. Functionality is good. Unfortunately, the pad that lost data had been overwritten with an empty pad, so the slave did not have any history either. There may be other pads that lost data, but because of how etherpad stores data in the db, it's not possible to work out how many pads were affected.
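The ordering of the failover (stop the application, repoint the load balancer, start the application) can be sketched as below. This is only a model of the sequence under stated assumptions: the step functions and the `state` dict are hypothetical stand-ins, not the actual commands or load balancer calls used here. The point is that the backend switch refuses to run while the app could still be writing.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_in_order(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order; stop at the first failure and report it."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            return completed, f"{name} failed: {exc}"
        completed.append(name)
    return completed, None


# Hypothetical in-memory stand-in for the real services.
state = {"app_running": True, "db_backend": "old-master"}


def stop_app() -> None:
    state["app_running"] = False


def repoint_load_balancer() -> None:
    # Guard: never switch backends while the app can still write.
    if state["app_running"]:
        raise RuntimeError("app still running; refusing to switch backends")
    state["db_backend"] = "former-slave"


def start_app() -> None:
    state["app_running"] = True


completed, error = run_in_order([
    ("stop etherpad", stop_app),
    ("update load balancer", repoint_load_balancer),
    ("start etherpad", start_app),
])
print(completed, error, state)
```

Stopping the app first keeps writes from landing on the old master while the pool is being switched.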
Updated • 9 years ago
Change Request: ? → emergency
Comment 6 • 9 years ago
Seems closable.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(nmaul)
Resolution: --- → FIXED