Skip to main content

Using SQLite to keep state in shell scripts

Bash shell scripting is one the things that I miss more on the times I work on Windows. Even as good as shell scripting is, sometimes I wish it was easy to keep track of state across script executions, for instance when a script executes a rsync at short intervals. In that case I don't care if one particula rsync fails, but I definitely care if it fails, say, ten times in a row or for more than a day.

To do this I need some kind of way to keep state and record each run exit status. The simplest approach that I could think of is to use SQLite. Copied straight from the SQLite site: SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. SQLite is the most widely deployed SQL database engine in the world. The source code for SQLite is in the public domain.

As I said, in this example I'll show how to track the status of a series of rsync operations, all run from a central node and pushed to remote nodes. The script is scheduled to run every 30m.

First thing to do in the shell script is to create and init the database unless it does not exist. This is pretty easy (notice the unique constraint, which I'll explain below):

function createdb {
 echo "create table stat(host varchar(20) unique, failcount smallint, lastsuccess datetime);" | sqlite3 db/stats.db

mkdir db > /dev/null 2>&1
test -e db/stats.db || createdb

I put this code at the beginning of the script. If the database gets corrupted I just delete the file and it gets recreated with the next run.

The table structure is easy enough: for each host I want to know when the last successful replication occurred and how many failures since then.

The second important step is to make sure there is a row for each host, because in SQL data must first be inserted and then updated. This is also easy since SQLite support the OR statement: basically we add a unique constraint on the host column and then at every run attempt to insert an initial row for each host. On a traditional db (like Oracle or Postgres) the second insert would trigger an error because the duplicate value violates the unique constraint. Instead on a db that supports OR we can tell the db to ignore the duplicate and keep the existing row. This way I don't have to test if the row exists and insert it if it doesn't with the bonus that the script stays small and simpler:

echo "insert or ignore into stat(host,failcount,lastsuccess) values ('$ip', 0, null);" | sqlite3 db/stats.db  

SQL statements can be fed on stdin or as the second argument of the sqlite3 program. I'll show you how in the last fragment. In this case $ip holds the ip address or hostname used to identify the host in the script.

After the rsync command has run I capture the exit code and store it into the db:

if [ "$status" == "0" ]; then
   echo "update stat set failcount=0, lastsuccess=datetime('now') where host='$ip';" | sqlite3 db/stats.db  
   echo "update stat set failcount=failcount+1 where host='$ip';" | sqlite3 db/stats.db  

At the end of the script I write out failures stats in a file that is later piped into a mail notification:

echo "The following hosts NEVER completed a sync successfully:" > stats.txt
sqlite3 db/stats.db "select host from stat where lastsuccess is null" >> stats.txt
echo "" >> stats.txt
echo "The following hosts did not sync in the last 24 hours:" >> stats.txt
sqlite3 db/stats.db "select host from stat where lastsuccess < (datetime('now','-1 day')) and failcount>0" >> stats.txt


Popular posts from this blog

Indexing Apache access logs with ELK (Elasticsearch+Logstash+Kibana)

Who said that grepping Apache logs has to be boring?

The truth is that, as Enteprise applications move to the browser too, Apache access logs are a gold mine, it does not matter what your role is: developer, support or sysadmin. If you are not mining them you are most likely missing out a ton of information and, probably, making the wrong decisions.
ELK (Elasticsearch, Logstash, Kibana) is a terrific, Open Source stack for visually analyzing Apache (or nginx) logs (but also any other timestamped data).

From 0 to ZFS replication in 5m with syncoid

The ZFS filesystem has many features that once you try them you can never go back. One of the lesser known is probably the support for replicating a zfs filesystem by sending the changes over the network with zfs send/receive.
Technically the filesystem changes don't even need to be sent over a network: you could as well dump them on a removable disk, then receive  from the same removable disk.

RUNDECK job maintenance

Learn more about Rundeck.

Now that I have a fair number of jobs scheduled by Rundeck, how do I periodically prune the job execution history and keep only the last, say, 30 executions for each job?