Friday, February 06, 2015

Auto-upload Elasticsearch template mapping with Apache Camel

When feeding data into Elasticsearch, one important step is to configure the correct template for the index/type so that, for instance, numeric fields are stored as numbers and can therefore be sorted and compared correctly.

The Elasticsearch Logstash plugin has a handy option just for this purpose. If you are not using Logstash you have to do it yourself, either through configuration management, startup scripts or simply by manually launching the appropriate curl command.

If you followed my previous post on using Apache Camel to feed SQL data into Elasticsearch, it might come naturally to use Camel to upload the template mapping as well.
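To make the manual alternative concrete, here is a minimal Python sketch of what such a curl command amounts to. It is not the Camel route described in the post. It assumes the legacy `PUT /_template/<name>` endpoint of Elasticsearch 1.x, and the host, the template name `mydata` and the field `price` are all hypothetical.

```python
import json
import urllib.request


def build_template_request(base_url, name, template):
    """Build (but do not send) a PUT request that uploads an index template."""
    body = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_template/{name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical template: store "price" as a number in every "mydata-*" index.
TEMPLATE = {
    "template": "mydata-*",
    "mappings": {
        "_default_": {
            "properties": {"price": {"type": "double"}}
        }
    },
}

request = build_template_request("http://localhost:9200", "mydata", TEMPLATE)
# To actually upload it (requires a running node): urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch checkable without a running cluster.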

Tuesday, January 13, 2015

Camel-Elasticsearch: create timestamped indices

One nice feature of the Logstash-Elasticsearch integration is that, by default, Logstash uses timestamped indices when feeding data to Elasticsearch.

This means that each day's data lives in its own index, which simplifies index management. For instance, suppose you only want to keep the last 30 days:

elasticsearch-remove-old-indices.sh -i 30
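The reason daily indices make retention this simple is that the date is part of the index name, so deciding what to drop is just a comparison. The following Python sketch illustrates that selection step. It is not how the script above is implemented; it assumes Logstash-style names of the form `logstash-YYYY.MM.dd`, and the function name and sample index names are hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, today, keep_days=30, prefix="logstash-"):
    """Return the daily indices whose date is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not one of our daily indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired
```

Each returned name could then be removed with a single index delete call, with no need to delete individual documents.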

The Apache Camel Elasticsearch component provides no such feature out of the box, but luckily it is quite easy to implement (once you know what to do /grin).
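Whatever the tool, the core of the feature is deriving the index name from the date of the data. A minimal Python sketch of that naming step follows; it mirrors the Logstash default pattern `logstash-YYYY.MM.dd`, does not show the Camel implementation, and the function name is hypothetical.

```python
from datetime import date


def timestamped_index(prefix, day):
    """Return a Logstash-style daily index name such as 'logstash-2015.01.13'."""
    return f"{prefix}-{day:%Y.%m.%d}"


print(timestamped_index("logstash", date(2015, 1, 13)))  # logstash-2015.01.13
```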

Thursday, October 30, 2014

Extending a LVM logical volume with SaltStack

How do you extend an LVM logical volume on a whole fleet of identical Linux (CentOS) servers at once using SaltStack? Here's how and, thanks to Salt, it only took 5 minutes.

Friday, September 12, 2014

Indexing Apache access logs with ELK (Elasticsearch+Logstash+Kibana)

Who said that grepping Apache logs has to be boring?

Sample of dashboard that can be created with ELK. Pretty impressive, huh?
The truth is that, as enterprise applications move to the browser too, Apache access logs are a gold mine, whatever your role: developer, support or sysadmin.
If you are not mining them, you are most likely missing out on a ton of information and, probably, making the wrong decisions.

ELK (Elasticsearch, Logstash, Kibana) is a terrific open source stack for visually analyzing Apache (or nginx) logs, as well as any other timestamped data.

Wednesday, August 27, 2014

Extract TABLE data from a large postgres SQL dump (with postgis)

What do you do when postgres refuses to import a dump because it contains invalid byte sequences?

Solution: feed the sql script to iconv then import it as usual.

That's easier said than done, especially if your database contains PostGIS data, which must be restored through a custom postgres dump (instructions here).

I recently experienced this issue on a relatively small table in a large-ish database. Since hand-editing the SQL dump is cumbersome (it is over 500MB in size), the only practical, and most elegant, alternative was to do it with a script.

The following is an awk script which will extract the COPY instructions for a given table from a postgres SQL dump:



Usage:
awk -f copy_extract.awk -v TBL=TABLENAME pgdump/database_dump.sql

One liner:
awk -f copy_extract.awk -v TBL=TEST pgdump/db.sql | iconv -f latin1 -t utf8 | psql db
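For readers who prefer Python, the same idea can be sketched as follows. This is not the awk script from the post, only an illustration of the technique. It assumes a plain-format dump in which the table data appears as a `COPY ... FROM stdin;` statement followed by rows and a terminating `\.` line; quoted table names are not handled, and the function name is hypothetical.

```python
import re


def extract_copy_block(lines, table):
    """Yield the COPY statement for `table`, its data rows and the closing '\\.'."""
    # Matches e.g. "COPY test (" or "COPY public.test (".
    start = re.compile(r"^COPY\s+(?:\w+\.)?" + re.escape(table) + r"\s*\(")
    inside = False
    for line in lines:
        if not inside and start.match(line):
            inside = True  # found the COPY statement for our table
        if inside:
            yield line
            if line.rstrip("\r\n") == "\\.":
                inside = False  # end-of-data marker closes the block
```

Because it consumes the dump line by line, it can be fed an open file object without loading the whole 500MB into memory. Opening the dump with `encoding="latin-1"` and writing the result to a file opened with `encoding="utf-8"` would also cover the iconv step.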

Wednesday, June 11, 2014

Ehcache: deploy multiple versions of a Grails app (fix javax.management.InstanceAlreadyExistsException)

When a Grails application uses the Ehcache cache plugin in its default configuration, it can be impossible to deploy multiple versions of the app, even though the container might support it.
The same plugin (in its default configuration) also breaks deploying multiple different Grails apps on the same container.

The problem is in the way the plugin generates the name for the cache (which is then used to register the cache JMX bean): by default the name is set to grails-cache-ehcache. When a second application or another version of the same application is deployed, registration fails because the name already exists. The exception message is the following (indented for clarity):

org.springframework.beans.factory.BeanCreationException:
  Error creating bean with name 'ehCacheManagementService':
  Invocation of init method failed;
  nested exception is net.sf.ehcache.CacheException:
    javax.management.InstanceAlreadyExistsException:
      net.sf.ehcache:type=CacheManager,name=grails-cache-ehcache

The (undocumented) solution is easy to implement. Edit the Config.groovy file and add the following configuration bit:

grails.cache.config = {
  provider {
    name "ehcache-<yourappname>-"+(new Date().format("yyyyMMddHHmmss"))
  }
}

If you are using the ehcache.xml file instead, it might be more difficult to randomize the cache name, but it could be done during the build.

Tested on Grails 2.1.5 and Tomcat 7.

Friday, February 07, 2014

Create an OpenLayers map programmatically

Sometimes it is useful to abstract away the repetitive layer creation code with a configuration-based approach.

For example, consider this very simple map taken from the OpenLayers examples:

How could we avoid repeatedly invoking the layer constructor and instead provide a framework that allows us to instantiate any layer with just configuration? The solution is quite simple.