Postgres maintenance: vacuum hell, cluster to the rescue
I have an OpenNMS server (a PC really) sitting under a desk at a company that I don't hear from often, but I still get weekly backup mail-recaps.
This week the recap brought bad news: postgres stopped accepting connections until the db is vacuumed to prevent transaction id wraparound.
I have know of this issue of postgres for some time and planned accordingly: a cron job every month would run a full vacuum. Unfortunately in one month the db grew so much that vacuum never reached completion because the customer simply rebooted the server thinking that it was 'stuck'.
This vacuum-stuck-reboot had gone on for months until today. As I knew from prior experience this was going to be a painful experience: a quick du on the database folder reported 60GB of data. Vacuum would probably take days.
Luckily for me I had heard from the great Postgresql 9.0 high performance book that there is another tool in the toolbox called cluster.
So I decided to execute this plan this time:
This week the recap brought bad news: postgres stopped accepting connections until the db is vacuumed to prevent transaction id wraparound.
I have know of this issue of postgres for some time and planned accordingly: a cron job every month would run a full vacuum. Unfortunately in one month the db grew so much that vacuum never reached completion because the customer simply rebooted the server thinking that it was 'stuck'.
This vacuum-stuck-reboot had gone on for months until today. As I knew from prior experience this was going to be a painful experience: a quick du on the database folder reported 60GB of data. Vacuum would probably take days.
Luckily for me I had heard from the great Postgresql 9.0 high performance book that there is another tool in the toolbox called cluster.
So I decided to execute this plan this time:
- delete old events, alarms and notifications (by far the largest tables)
- cluster each table (including the 5 largest)
- vacuum full
Well, believe it or not the whole process took no longer that a couple of hours, most of which spent deleting records. I am not kidding you when I say that other times I have spent days waiting for vacuum to end.
And best of all the database shrunk from 60GB to 7.9GB! Most of the shrinking happened already after clustering.