
Monitoring Oracle tablespace quota with OpenNMS

Going beyond the normal application availability check

One interesting use of the OpenNMS JDBC poller is extracting data from the Oracle administrative database tables, for example tracking tablespace quota usage to detect quota exhaustion, spot sudden usage peaks and graph usage over time.

Graph of quota usage for user [redacted] on tablespace DAT.
Notice the cleaning operation running at 3:30 AM.

Tablespace quotas are an Oracle database feature that lets the DBA limit the amount of storage that any given user can consume on a specific tablespace. This allows the DBA to share tablespaces across users while still keeping each user within predefined usage boundaries. A user who has consumed the whole quota can no longer store data but can still delete it, which allows self-recovery.
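
As a rough illustration of what gets tracked per user and tablespace, the sketch below turns one row of quota data into a usage percentage. It assumes the row carries the bytes used and the quota limit as reported by the dba_ts_quotas view (the BYTES and MAX_BYTES columns), where MAX_BYTES is -1 for an unlimited quota. The function name quota_usage_percent is made up for this example.

```python
from typing import Optional


def quota_usage_percent(bytes_used: int, max_bytes: int) -> Optional[float]:
    """Return quota usage as a percentage, or None when no finite quota is set.

    max_bytes <= 0 covers both "no quota" (0) and "unlimited" (-1).
    """
    if max_bytes <= 0:
        return None
    return 100.0 * bytes_used / max_bytes


print(quota_usage_percent(750, 1000))  # 75.0
print(quota_usage_percent(750, -1))    # None
```

A value close to 100 means the user is about to run out of quota on that tablespace.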



Configuring OpenNMS to monitor quota usage is rather simple. First, make sure that the Oracle JDBC driver is in $OPENNMS_HOME/lib. If it is not, download the jar file and copy it into that directory. Do not restart OpenNMS yet, as we will need to restart it later.

Now cd into $OPENNMS_HOME/etc, make a backup copy of the configuration files (not necessary if you already use version control) and then make the changes described below. The code is available at this location. Note that these files contain only the relevant fragments; it should be straightforward to merge them into the right context of your own files.

Things that you will have to change and adapt to your environment:
  1. User and password of the Oracle user used to connect to the Oracle database and query the dba_ts_quotas table. Your DBA should also take care of granting the appropriate rights to this user.
  2. Hostname and service name for the JDBC URL. The URL appears in two different places, so make sure to change both. If you don't know the right server and service names, consult your DBA. It can get tricky, especially with Oracle RAC configurations. Usually I first try with OPENNMS_JDBC_HOSTNAME, then, if it fails, I fall back to specifying a hostname. Caution: if you specify a hostname, say srvora1, AND assign this service to another host, say srvora2, you will NOT be monitoring quotas on srvora2, but rather on srvora1!
  3. Since 1.10 the datacollection-config.xml fragment can be modularized in the datacollection directory as described here.
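
To make the caution in item 2 concrete, here is a small sketch of how a URL containing the placeholder differs from a URL with a fixed hostname. It only models the substitution as a plain string replacement and is not the OpenNMS code. The port 1521, the service name ORCL and the function name are assumptions for illustration, and the hostnames stand in for whatever address OpenNMS uses for the node.

```python
PLACEHOLDER = "OPENNMS_JDBC_HOSTNAME"


def effective_jdbc_url(url_template: str, node_address: str) -> str:
    """Replace the placeholder with the address of the node the service is assigned to."""
    return url_template.replace(PLACEHOLDER, node_address)


# Hypothetical URLs: port and service name are examples only.
with_placeholder = "jdbc:oracle:thin:@//OPENNMS_JDBC_HOSTNAME:1521/ORCL"
with_fixed_host = "jdbc:oracle:thin:@//srvora1:1521/ORCL"

# Service assigned to srvora2: the placeholder follows the node...
print(effective_jdbc_url(with_placeholder, "srvora2"))
# ...while the fixed hostname still points at srvora1.
print(effective_jdbc_url(with_fixed_host, "srvora2"))
```

The second URL contains no placeholder, so it is unchanged no matter which node the service is assigned to.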



After the required changes are in place, restart OpenNMS and then assign the OracleMonitoring service to the right node. Please note that, due to the way the JDBC URL is constructed, it is not possible to assign this service to more than one database instance, unless they all have the same service name.
To monitor database instances with different service names, one must duplicate the whole configuration, with perhaps the exception of the graphs.

The graphs will be shown only for Oracle users that have quotas set (i.e. MaxBytes > 0) on at least one tablespace. Users without quotas will not be shown in the graphs list for the host.
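
The rule above can be sketched as a filter over quota rows. This is only an illustration of the selection logic, not how OpenNMS builds its graph list. The tuple layout, the function name and the sample data are invented for the example.

```python
from typing import Iterable, List, Tuple

# (username, tablespace, max_bytes) tuples; the layout is an assumption.
QuotaRow = Tuple[str, str, int]


def users_with_graphs(rows: Iterable[QuotaRow]) -> List[str]:
    """Return users that have a positive quota on at least one tablespace."""
    return sorted({user for user, _tablespace, max_bytes in rows if max_bytes > 0})


rows = [
    ("APP1", "DAT", 5_000_000),
    ("APP1", "IDX", -1),
    ("APP2", "DAT", -1),
]
print(users_with_graphs(rows))  # ['APP1']
```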

Thresholds and notifications

To enable notifications we must first establish thresholds; for that, see the last two code fragments in the gist above. I didn't want to create custom UEIs, so I left the UEI fields empty.

Note that I have specified the UserName in the datasource label field. This lets us show useful information in the notification message, such as the tablespace and user that triggered or rearmed the threshold, by inserting the %parm[label]% tag in the subject and/or message body.
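
To illustrate the trigger/rearm cycle and the label substitution, here is a deliberately simplified model of a high threshold. It is not the OpenNMS implementation: it fires on the first sample that reaches the threshold and ignores any trigger count. The function name, the subject text, the sample values and the label SCOTT are all made up.

```python
from typing import List, Tuple


def high_threshold_events(samples: List[float], value: float, rearm: float,
                          label: str) -> List[Tuple[str, str]]:
    """Simplified high threshold: fire once when a sample reaches `value`,
    then stay silent until a sample drops to `rearm` or below."""
    subject = "Quota threshold {state} for %parm[label]%"
    events = []
    triggered = False
    for sample in samples:
        if not triggered and sample >= value:
            triggered = True
            state = "triggered"
        elif triggered and sample <= rearm:
            triggered = False
            state = "rearmed"
        else:
            continue  # no state change, so no event
        # Expand the tag the way a notification subject would show the label.
        text = subject.format(state=state).replace("%parm[label]%", label)
        events.append((state, text))
    return events


# Usage percentages over time; threshold at 90, rearm at 75.
for state, text in high_threshold_events([70, 92, 95, 85, 60], 90, 75, "SCOTT"):
    print(text)
```

With these samples the threshold triggers at 92 and rearms only at 60, because 85 is still above the rearm value.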

