Sun, Apr 19, 2009

How not to restore a Linux software RAID

We had a disk failure on one of our Xen servers at work last week, and what we thought would be a quick disk replacement turned into a small nightmare. Our setup is fairly “simple”: two RAID1 arrays, one consisting of sda1/sdb1 (/dev/md0, mounted at /) and the other of sda3/sdb3 (/dev/md1, with LVM on top of it). mdadm reported that sdb1 and sdb3 had failed, so we just had to identify which disk in the server was sdb and replace it.
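As a rough illustration of the "which members failed" step, here is a minimal sketch that pulls failed members out of /proc/mdstat-style text, where the kernel marks a failed member with an (F) suffix. The function name failed_members is made up for this example, and it only covers the simple mdN array naming used in the setup above.

```shell
# List RAID members that the kernel has marked as failed.
# Reads /proc/mdstat-style text on stdin and prints "<array> <member>"
# for every member carrying the (F) flag.
# failed_members is a hypothetical helper name, not part of mdadm.
failed_members() {
    awk '
        # Array lines look like: md0 : active raid1 sdb1[1](F) sda1[0]
        $1 ~ /^md[0-9]+$/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)/) {
                    member = $i
                    sub(/\[.*/, "", member)   # drop the "[1](F)" suffix
                    print $1, member
                }
            }
        }
    '
}

# On a live system: failed_members < /proc/mdstat
```

This only tells you the kernel device name. Mapping sdb to a physical drive is a separate step; comparing the serial numbers that appear in the symlink names under /dev/disk/by-id with the labels on the drives is one common way to do it.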

Sat, Jul 26, 2008

Fast Nagios Exim mail queue plugin replacement

We had a problem with the Nagios check_mailq plugin at work: it kept timing out. So I wrote a simple bash script (instead of 610 lines of Perl) that is “compatible” with check_mailq (it supports the same arguments), uses “exim4”, and is very quick. Just drop it in /usr/local/bin/ and adjust your Nagios conf to use it instead of check_mailq.
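To give an idea of how small such a check can be, here is a minimal sketch of the approach. It is not the script from the post. It assumes that only the -w (warning) and -c (critical) thresholds are handled, that the default thresholds of 10 and 20 are acceptable placeholders, and that the output wording is free to differ from check_mailq. The function names queue_status and main are made up for this example. It relies on exim4 -bpc, which prints the number of messages in the queue.

```shell
# Sketch of a check_mailq-style Nagios check for exim4.
# Nagios exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.

# queue_status COUNT WARN CRIT
# Prints a status line and returns the matching Nagios exit code.
queue_status() {
    case $1 in
        ''|*[!0-9]*)
            # Empty or non-numeric input: the queue could not be read.
            echo "UNKNOWN: could not read mail queue"
            return 3
            ;;
    esac
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL: $1 messages in queue"
        return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING: $1 messages in queue"
        return 1
    fi
    echo "OK: $1 messages in queue"
    return 0
}

# main [-w WARN] [-c CRIT]
# The defaults below are placeholder values, not taken from check_mailq.
main() {
    warn=10
    crit=20
    OPTIND=1
    while getopts "w:c:" opt; do
        case $opt in
            w) warn=$OPTARG ;;
            c) crit=$OPTARG ;;
            *) echo "UNKNOWN: usage: -w <warn> -c <crit>"; return 3 ;;
        esac
    done
    # exim4 -bpc prints a single number: the count of queued messages.
    queue_status "$(exim4 -bpc 2>/dev/null)" "$warn" "$crit"
}

# When installing as a plugin, end the file with:
#   main "$@"; exit $?
```

Keeping the threshold logic in its own function means it can be exercised without a running exim4, for example with queue_status 15 10 20, which reports a warning.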