
OpenNMS 15: warm your postgres cache

OpenNMS 15 puts a much higher load on the database than previous versions.
Besides tuning postgres and the OS, and perhaps splitting the app and the db onto different boxes, one thing that I found really makes a difference is having a warm postgres cache.

Additional tip: if you haven't already, put postgres on XFS. There is a reason that RHEL 7 switched to XFS as the default filesystem, and it is performance. You will also find that most postgres people recommend XFS over ext3/4.

If you followed the instructions in my previous post you should have a v_database_cache view in the opennms database. Soon after installing OpenNMS 15 I found that the events relation was barely cached: less than 2% of it was in the cache after one day.

There are probably several reasons for this. Most likely the queries have been improved to use indices instead of scanning the tables, so the table itself is rarely read in full. The downside is that UI performance suffers (it takes 1-2 seconds to display the node pages)[1].

To warm the database cache and improve the general performance and responsiveness of the UI, run this command as the postgres user:
psql -A opennms -c "select * from events; " > /dev/null
If you have a large events table, consider adding a filter (e.g. only events from the last week).
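
A filtered version of the warm-up could look roughly like the sketch below. It assumes the events table has a timestamp column named eventtime (check your schema with \d events first), and warm_query is just a hypothetical helper name for building the SQL, not something OpenNMS ships.

```shell
# Build a cache-warming query limited to the last N days of events.
# ASSUMPTION: the events table has a timestamp column called eventtime.
# warm_query is a hypothetical helper, not part of OpenNMS or postgres.
warm_query() {
    days="${1:-7}"    # default to one week
    case "$days" in
        ''|*[!0-9]*) echo "days must be a whole number" >&2; return 1 ;;
    esac
    printf "select * from events where eventtime > now() - interval '%s days';\n" "$days"
}

# Run as the postgres user; the rows are discarded, only the read matters:
# psql -A opennms -c "$(warm_query 7)" > /dev/null
```

Only the pages touched by the filtered query get pulled into the cache, so percent_of_relation will stay below 100% for the events relation in this case.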

Now check the database cache: percent_of_relation should show a larger value for the events relation. In my case it was 100% (shared_buffers=1GB, events table is ~180 MB) and I found the UI to be much snappier.

opennms=# select * from  v_database_cache ;
            relname            | buffered | buffers_percent | percent_of_relation 
-------------------------------+----------+-----------------+---------------------
 events                        | 181 MB   |            17.7 |               100.0
 notifications                 | 44 MB    |             4.3 |                84.6
 outages                       | 16 MB    |             1.6 |               100.0
 events_ipaddr_idx             | 4128 kB  |             0.4 |                40.9
 bridgemaclink                 | 4704 kB  |             0.4 |               100.7
 events_nodeid_idx             | 4008 kB  |             0.4 |                51.9
 events_nodeid_display_ackuser | 4480 kB  |             0.4 |                42.9
 assets                        | 2848 kB  |             0.3 |               101.1
 snmpinterface                 | 1576 kB  |             0.2 |               100.0
 bridgemaclink_pk_idx2         | 2296 kB  |             0.2 |               100.0

Thanks for reading.

[1] Yes, I am running on somewhat aged hardware (ProLiant DL580 G5, RAID10 on 10K drives, 8 GB RAM, 2 × Xeon).
