Saturday, February 16, 2013

Logstash - GROK patterns and nginx access log

I've been using logstash for over a week now, and I think it's a really good tool for bringing some order to your infrastructure. There are too many file formats and protocols, and not enough time to write all the regexps or parsers you would need to understand what's going on on your platform.

Logstash provides an abstraction layer for complex regexps called grok. It allows you to convert "streams" into fields (chunks of information) that may be queried later. We are going to parse nginx access logs using grok.

Hands on
In my case, I had to start learning grok (the grok debugger helped me a lot) in order to parse an nginx (0.7.67-3 on Debian squeeze) access log. Here is the log file format (taken from slicehost):

If you need more detail on the nginx log format, check this.

Here is an example nginx log file:

Now create a directory called "patterns" and create a file "nginx.grok", and put the following pattern there:

I used the same variable names that the nginx site uses as the field names, so it will be easier to track the information in your output.
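To make the idea of "fields" concrete, here is a minimal Python sketch of what a grok pattern boils down to: a regular expression with named captures. It assumes the standard nginx "combined" log format, and the field names are the nginx variable names. The sample line is made up for illustration and does not come from my logs, and this is not the grok pattern itself, only an equivalent plain regex.

```python
import re

# Assumption: nginx "combined" log format. Adjust if your log_format differs.
# Group names mirror the nginx variable names, as the grok fields do.
NGINX_COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = NGINX_COMBINED.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, for illustration only.
sample = ('192.0.2.10 - - [16/Feb/2013:12:30:20 -0430] '
          '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.21.0"')

fields = parse_line(sample)
print(fields["time_local"], fields["status"])
```

Each named group plays the same role as a grok field: once the line is split this way, you can query by status, remote_addr and so on instead of searching raw text.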

Now let's put all this to work in logstash:
And here is the resulting JSON data as shown on stdout:

The date

Something really important to remember is that logstash stores all events in GMT, in the "@timestamp" field. We sent the event time in the "time_local" field, and with the "date" filter we told logstash to use that field as its timestamp.

You may ask: why does logstash change my event's time? The reason is simple: it allows us to correlate events among boxes in different time zones.

Here is what I sent to logstash in nginx $time_local format:
"time_local":["16/Feb/2013:12:30:20 -0430"]
Here is what logstash stored in ISO8601 format:

As you may notice, there is a time difference of 4 hours and 30 minutes. The reason is that the Linux box's time zone is "America/Caracas", which is -4:30 from GMT.
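If you want to check the arithmetic yourself, this small Python sketch performs the same kind of conversion: it parses the $time_local value together with its offset and normalizes it to UTC. It only illustrates the idea and is not how logstash implements the "date" filter; the function name is made up for the example.

```python
from datetime import datetime, timezone

# strptime format matching nginx's $time_local, e.g. 16/Feb/2013:12:30:20 -0430
# Note: %b expects English month abbreviations (the default C locale).
NGINX_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def to_utc_iso8601(time_local):
    """Parse an nginx $time_local string and return it as ISO8601 in UTC."""
    local = datetime.strptime(time_local, NGINX_TIME_FORMAT)
    return local.astimezone(timezone.utc).isoformat()


print(to_utc_iso8601("16/Feb/2013:12:30:20 -0430"))
# prints 2013-02-16T17:00:20+00:00
```

12:30:20 at -04:30 is 17:00:20 in UTC, which is the 4 hours and 30 minutes mentioned above.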

I hope this may be useful for you, enjoy!


  1. Good morning, where do you work? What would you recommend as initial reading on logstash? I'm quite interested. Thanks.

    1. Hi David. I'm Venezuelan and I work in the financial sector. My recommendation is that you buy the book on logstash written by James Turnbull. I have it, and it has honestly helped me a lot:

      The basic tutorial on the logstash site is quite good:

      The other option is to ask on the logstash IRC channel on freenode. It is always very active, and the community is very willing to support people who are learning.

      And of course, if I can help you with anything, write to me and I'll see how I can give you a hand.

  2. and even more up to date: ...