Closed Bug 587736 Opened 14 years ago Closed 14 years ago

tm-breakpad01-master01 is running out of disk - information request

Categories

(mozilla.org Graveyard :: Server Operations, task)

Type: task
Priority: Not set
Severity: critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: laura, Assigned: tellis)

Details

Our PG master is paging to say it's short on disk, so we need to assess the situation.  We need to know:
- Has it been vacuumed/analyzed recently? If so, when?
- How much space is each table using? (How-to: http://wiki.postgresql.org/wiki/Disk_Usage)

We need to address this now before we run out of disk.
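
For the first question, here is a minimal sketch of one way to check. It assumes a PostgreSQL version whose pg_stat_user_tables view carries the last_vacuum/last_analyze timestamp columns (8.2 or later, as far as I know); the server version is not stated in this bug.

```sql
-- Sketch: when was each user table last vacuumed/analyzed,
-- either manually or by autovacuum?
-- A NULL means the statistics collector has no record of that operation.
SELECT schemaname, relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
  ORDER BY relname;
```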
Assignee: server-ops → tellis
I'll answer the second part first, since it's easiest.

                           relation                            |    size    
---------------------------------------------------------------+------------
 public.frames_20090202                                        | 12 GB
 public.reports_20090202                                       | 8228 MB
 public.top_crashes_by_signature                               | 3438 MB
 public.top_crashes_by_signature_signature_key                 | 3119 MB
 public.frames_20090202_pkey                                   | 3056 MB
 public.frames_20100621                                        | 3006 MB
 public.frames_20100719                                        | 2292 MB
 public.frames_20100614                                        | 2136 MB
 public.reports_20100621                                       | 2062 MB
 public.frames_20100607                                        | 1948 MB
 public.frames_20100517                                        | 1802 MB
 public.reports_20100419                                       | 1798 MB
 public.reports_20100719                                       | 1795 MB
 public.frames_20100524                                        | 1776 MB
 public.frames_20100510                                        | 1768 MB
 public.frames_20100503                                        | 1734 MB
 public.frames_20100419                                        | 1710 MB
 public.frames_20100426                                        | 1692 MB
 public.reports_20100426                                       | 1658 MB
 public.frames_20100531                                        | 1609 MB
 public.reports_20100517                                       | 1562 MB
 public.reports_20100503                                       | 1556 MB
 public.extensions_20100621                                    | 1552 MB
 public.reports_20100614                                       | 1550 MB
 public.frames_20100405                                        | 1543 MB
 public.reports_20100510                                       | 1535 MB
 public.extensions_20100719                                    | 1527 MB
 public.frames_20100315                                        | 1524 MB
 public.topcrashurlfactsreports                                | 1520 MB
 public.frames_20100322                                        | 1519 MB
 public.frames_20100809                                        | 1510 MB
 public.frames_20100705                                        | 1510 MB
 public.frames_20100628                                        | 1505 MB
 public.reports_20090202_url_key                               | 1493 MB
 public.frames_20100712                                        | 1484 MB
 public.frames_20100802                                        | 1481 MB
 public.frames_20100329                                        | 1466 MB
 public.extensions_20100621_extension_id_extension_version_idx | 1462 MB
 public.reports_20100607                                       | 1456 MB
 public.extensions_20100614                                    | 1445 MB
 public.reports_20100405                                       | 1430 MB
 public.reports_20100412                                       | 1422 MB
 public.extensions_20100607                                    | 1418 MB
 public.top_crashes_by_signature_osdims_key                    | 1371 MB
 public.frames_20100412                                        | 1368 MB
 public.reports_20100329                                       | 1364 MB
 public.reports_20100315                                       | 1352 MB
 public.extensions_20100419                                    | 1348 MB
 public.extensions_20100503                                    | 1341 MB
 public.reports_20100524                                       | 1336 MB
 public.extensions_20100426                                    | 1325 MB
 public.extensions_20100517                                    | 1324 MB
 public.extensions_20100510                                    | 1323 MB
 public.top_crashes_by_signature_window_end_productdims_id_idx | 1316 MB
 public.extensions_20100524                                    | 1300 MB
 public.extensions_20100531                                    | 1266 MB
 public.frames_20100726                                        | 1265 MB
 public.extensions_20100719_extension_id_extension_version_idx | 1263 MB
 public.frames_20100308                                        | 1221 MB
 public.reports_20100531                                       | 1210 MB
 public.extensions_20100405                                    | 1197 MB
 public.frames_20100301                                        | 1195 MB
 public.reports_20090202_uuid_key                              | 1188 MB
 public.extensions_20100614_extension_id_extension_version_idx | 1187 MB
 public.frames_20100215                                        | 1178 MB
 public.top_crashes_by_url_signature                           | 1163 MB
 public.reports_20090202_date_key                              | 1162 MB
 public.frames_20100125                                        | 1159 MB
 public.frames_20100208                                        | 1148 MB
 public.reports_20100809                                       | 1139 MB
 public.reports_20100705                                       | 1130 MB
 public.frames_20100621_report_id_date_key                     | 1128 MB
 public.extensions_20100607_extension_id_extension_version_idx | 1128 MB
 public.frames_20100118                                        | 1125 MB
 public.extensions_20100322                                    | 1122 MB
 public.reports_20100308                                       | 1121 MB
 public.extensions_20100329                                    | 1111 MB
 public.extensions_20100315                                    | 1110 MB
 public.reports_20100628                                       | 1105 MB
 public.reports_20090202_signature_date_key                    | 1093 MB
 public.frames_20100222                                        | 1086 MB
 public.reports_20100712                                       | 1082 MB
 public.reports_20100802                                       | 1082 MB
 public.top_crashes_by_signature_pkey                          | 1079 MB
 public.extensions_20100419_extension_id_extension_version_idx | 1073 MB
 public.top_crashes_by_signature_productdims_window_end_idx    | 1066 MB
 public.reports_20100726                                       | 1065 MB
 public.extensions_20100503_extension_id_extension_version_idx | 1065 MB
 public.frames_20100111                                        | 1064 MB

There are many more, as you can imagine, but I cut the list short here. If you need more tables listed, let me know and I'll put the full output in a pastebin.
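
To see which family of partitions dominates, the per-relation numbers can be rolled up. This is only a sketch: it assumes the weekly partitions all live in the public schema and end in an eight-digit date suffix, as the names above suggest. pg_total_relation_size counts each table together with its indexes and TOAST data.

```sql
-- Sketch: total on-disk size per partition family (frames, reports, ...).
-- Assumes partition names look like <family>_YYYYMMDD.
SELECT regexp_replace(relname::text, '_[0-9]{8}$', '') AS family,
       count(*) AS n_tables,
       pg_size_pretty(sum(pg_total_relation_size(C.oid))::bigint) AS total_size
  FROM pg_class C
  JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname = 'public'
    AND C.relkind = 'r'               -- ordinary tables only; index sizes are folded in
    AND relname ~ '_[0-9]{8}$'
  GROUP BY 1
  ORDER BY sum(pg_total_relation_size(C.oid)) DESC;
```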
This is my recommendation.  Several weeks ago, as I began work on Socorro 1.8, I noted that the frames tables are no longer used by any read operation in the Python or PHP code.  They are write-only tables, and their continued existence is just a historical artifact.

If we were to drop these tables, I doubt anyone would notice.  I suggest that we drop all but the newest of these partitioned tables.  Then, when 1.8 ships, we drop the rest of them.  That should free up some significant disk space.
It would indeed. So the proposal is to drop these tables:

breakpad=# SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema') and relname like 'frames_%'
  ORDER BY relname limit 20;
                 relation                  |    size    
-------------------------------------------+------------
 public.frames_20090119                    | 8192 bytes
 public.frames_20090119_pkey               | 16 kB
 public.frames_20090119_report_id_date_key | 16 kB
 public.frames_20090126                    | 24 kB
 public.frames_20090126_pkey               | 16 kB
 public.frames_20090126_report_id_date_key | 16 kB
 public.frames_20090202                    | 12 GB
 public.frames_20090202_pkey               | 3056 MB
 public.frames_20090209                    | 528 MB
 public.frames_20090209_pkey               | 138 MB
 public.frames_20090209_report_id_date_key | 194 MB
 public.frames_20090216                    | 517 MB
 public.frames_20090216_pkey               | 135 MB
 public.frames_20090216_report_id_date_key | 190 MB
 public.frames_20090223                    | 508 MB
 public.frames_20090223_pkey               | 133 MB
 public.frames_20090223_report_id_date_key | 187 MB
 public.frames_20090302                    | 539 MB
 public.frames_20090302_pkey               | 140 MB
 public.frames_20090302_report_id_date_key | 197 MB

I can do this tomorrow morning. Can I have a +1 from someone who feels authoritative? This seems like a dangerous operation.
ok, to be safer, rather than dropping these tables, let's truncate them.  That way, if something I've overlooked still needs them, it won't crash because they're gone; it will simply find no data.

Before we act, let's have a quorum give it the thumbs up.  We've got our Socorro meeting tomorrow and I'll bring it up as a topic there.

Delay acting until we give it the thumbs up after the 1pm meeting.
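
For reference, TRUNCATE in PostgreSQL is transactional, so a trial run can be rolled back. A minimal sketch, using one table name from the listing above as the example; the disk space is only released once the transaction commits.

```sql
-- Sketch: truncate inside a transaction so the change can still be undone.
-- TRUNCATE takes an exclusive lock, so anything writing to this table
-- will wait until the transaction ends.
BEGIN;
TRUNCATE TABLE public.frames_20090202;
SELECT count(*) FROM public.frames_20090202;  -- expect 0
-- Use ROLLBACK; here instead of COMMIT; to restore the rows.
COMMIT;
```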
(In reply to comment #4)

> ok, to be safer, rather than dropping these tables, let's truncate them.
+1
(In reply to comment #5)
> (In reply to comment #4)
> 
> > ok, to be safer, rather than dropping these tables, let's truncate them.
> +1

+1
Looks like we have lots of agreement here. I'll wait until after the 1pm meeting to actually do this.
Here's how I created the "truncate table" script:

-- Each "_" in the LIKE pattern matches exactly one character, so only
-- 15-character names such as frames_20090817 match, not their indexes.
SELECT 'truncate table ' || nspname || '.' || relname || '; --' AS "relation",
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND relname LIKE 'frames_________'
    AND pg_relation_size(C.oid) > 0
  ORDER BY relname LIMIT 30;

The output looks like this:

                 relation                  |  size
-------------------------------------------+---------
 truncate table public.frames_20090817; -- | 615 MB
 truncate table public.frames_20090824; -- | 635 MB
 truncate table public.frames_20090831; -- | 645 MB
 truncate table public.frames_20090907; -- | 660 MB
 truncate table public.frames_20090914; -- | 668 MB
 truncate table public.frames_20090921; -- | 676 MB
 truncate table public.frames_20090928; -- | 696 MB
 truncate table public.frames_20091005; -- | 665 MB
 truncate table public.frames_20091012; -- | 799 MB

Then you can just copy/paste the lines of output into the same psql session to run the truncations; the trailing "--" turns the rest of each pasted line (the column separator and the size) into a comment. That is what I did for the first 40 or so tables. That cleared up enough space:

:) df -m
Filesystem           1M-blocks      Used Available Use% Mounted on
/dev/mapper/mpath1p1    503974    412447     91527  82% /mnt/eql/bp-data

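So I stopped clearing space at that point. When the time comes, we can easily drop the rest of these tables by changing the SELECT above to emit "drop table" instead of "truncate table" (and removing the pg_relation_size(C.oid) > 0 condition, since the already-truncated tables are now 0 bytes and would otherwise be skipped).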
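
A sketch of what that "drop table" generator might look like. It assumes every frames partition is an ordinary table in the public schema named frames_YYYYMMDD; the regular expression and the use of quote_ident are my additions, not part of the script that was actually run.

```sql
-- Sketch: emit one DROP TABLE statement per frames partition.
-- There is no size condition, so already-truncated tables are listed too.
SELECT 'DROP TABLE ' || quote_ident(nspname) || '.' || quote_ident(relname) || ';' AS statement
  FROM pg_class C
  JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname = 'public'
    AND C.relkind = 'r'                  -- tables only; indexes go away with their table
    AND relname ~ '^frames_[0-9]{8}$'
  ORDER BY relname;
```

Review the generated statements before pasting them back into psql, since DROP TABLE, unlike TRUNCATE, also removes the table definition.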
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard