
[Elasticsearch] Batch-Loading Apache Access Logs with Logstash

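Logstash is a log management tool. It provides the following kinds of functionality, each of which can be extended through plugins.

  • input: monitors a source for the events to be logged
  • filter: applies filter processing to events
  • codec: formats the events received from an input into the specified representation
  • output: writes the logs out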

Installing Logstash

$ sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
$ sudo vi /etc/yum.repos.d/logstash.repo
[logstash-2.2]
name=Logstash repository for 2.2.x packages
baseurl=http://packages.elastic.co/logstash/2.2/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
$ sudo yum install logstash

Testing Logstash

$ /opt/logstash/bin/logstash --version
logstash 2.2.2
$ /opt/logstash/bin/logstash -e 'input { stdin { } } output { stdout {} }'
Settings: Default pipeline workers: 2
Logstash startup completed
hello world
2016-03-21T15:38:56.933Z localhost.localdomain hello world
CTRL-D
Logstash shutdown completed

Testing the Access Log Import

$ vi apache-import.conf
input {
  stdin { }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
}

output {
  stdout { codec => rubydebug }
}
$ /opt/logstash/bin/logstash -f apache-import.conf < logs/test_log
Settings: Default pipeline workers: 2
Logstash startup completed
{
  "message" => "157.55.39.187 - - [20/Mar/2016:03:27:52 -0700] \"GET /blog/2014/08/07/ HTTP/1.1\" 200 57529 \"-\" \"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)\"",
  "@version" => "1",
  "@timestamp" => "2016-03-20T10:27:52.000Z",
  "host" => "localhost.localdomain",
  "clientip" => "157.55.39.187",
  "ident" => "-",
  "auth" => "-",
  "timestamp" => "20/Mar/2016:03:27:52 -0700",
  "verb" => "GET",
  "request" => "/blog/2014/08/07/",
  "httpversion" => "1.1",
  "response" => "200",
  "bytes" => "57529",
  "referrer" => "\"-\"",
  "agent" => "\"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)\"",
  "type" => "apache_access"
}
Logstash shutdown completed
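
In the output above, the date filter parsed the "timestamp" field (local time with a -0700 offset) and stored the result in "@timestamp" as UTC. The conversion can be reproduced by hand as a sanity check. This is only a minimal sketch, assuming GNU date (as shipped with CentOS); the variable names are illustrative and have nothing to do with Logstash itself.

```shell
# Timestamp exactly as grok extracted it into the "timestamp" field
apache_ts='20/Mar/2016:03:27:52 -0700'

# GNU date cannot read dd/MMM/yyyy:HH:mm:ss directly, so turn the slashes
# and the first colon into spaces: "20 Mar 2016 03:27:52 -0700"
normalized=$(printf '%s\n' "$apache_ts" | sed -e 's#/# #g' -e 's#:# #')

# Print the instant in UTC, in the same layout Logstash uses for @timestamp
utc_ts=$(date -u -d "$normalized" '+%Y-%m-%dT%H:%M:%S.000Z')
echo "$utc_ts"
```

The printed value should match the "@timestamp" shown in the rubydebug output. If the two differ, the match pattern or the locale setting of the date filter is the first thing to check.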

Connecting to Elasticsearch

$ vi apache-import.conf
input {
  stdin { }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
}

output {
  # stdout { codec => rubydebug }
  elasticsearch { hosts => '127.0.0.1:9200' }
}

$ /opt/logstash/bin/logstash -f apache-import.conf --configtest
Configuration OK
$ /opt/logstash/bin/logstash -f apache-import.conf < logs/test_log
Settings: Default pipeline workers: 2
Logstash startup completed
Logstash shutdown completed
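
With the stdout output commented out, the run prints only the start-up and shutdown messages, so a count query against Elasticsearch is one way to confirm that the events arrived. The following is a rough sketch, assuming curl is installed and that the elasticsearch output was left on its default daily index naming (logstash-YYYY.MM.dd). ES_HOST and count_url are hypothetical variable names introduced only for this example.

```shell
# Host and port as given in the elasticsearch output (assumed default here)
ES_HOST="${ES_HOST:-127.0.0.1:9200}"

# logstash-* covers the daily indices; q= narrows the count to the events
# tagged with the type set by the mutate filter
count_url="http://${ES_HOST}/logstash-*/_count?q=type:apache_access&pretty"
echo "$count_url"

# Query only when a node is actually answering; otherwise just say so
if curl -s -o /dev/null --max-time 2 "http://${ES_HOST}/"; then
  curl -s "$count_url"
else
  echo "Elasticsearch is not reachable at ${ES_HOST}" >&2
fi
```

The "count" value in the response can then be compared with the number of lines in logs/test_log (for example with wc -l).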

$ /opt/logstash/bin/logstash -f apache-import.conf < logs/access_log
$ /opt/logstash/bin/logstash -f apache-import.conf < logs/access_log-20160320
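
When there are more than a couple of rotated logs, the per-file invocations above can be wrapped in a loop. The following is a sketch only: import_logs is a hypothetical helper, not something Logstash provides, and LOGSTASH_BIN and LOGSTASH_CONF are illustrative variable names whose defaults are the paths used in this article.

```shell
# Hypothetical helper (not part of Logstash): run one Logstash batch per file.
# Usage: import_logs logs/access_log logs/access_log-20160320
import_logs() {
  for log_file in "$@"; do
    if [ ! -r "$log_file" ]; then
      echo "skipping unreadable file: $log_file" >&2
      continue
    fi
    echo "importing: $log_file" >&2
    # Same invocation as above: the log is fed to the stdin input
    "${LOGSTASH_BIN:-/opt/logstash/bin/logstash}" \
      -f "${LOGSTASH_CONF:-apache-import.conf}" < "$log_file" || return 1
  done
}
```

The stdin input keeps no record of what it has already read, and this configuration sets no document ID on the elasticsearch output, so importing the same file twice will most likely index every line twice. It is safer to pass each file exactly once.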

[Image: es-apache-import]

References

Building a Log Collection and Analysis Environment with Logstash + Elasticsearch in 15 Minutes - さくらのナレッジ
http://knowledge.sakura.ad.jp/tech/2736/

Package Repositories
https://www.elastic.co/guide/en/logstash/current/package-repositories.html#_yum

Stashing Your First Event: Basic Logstash Example
https://www.elastic.co/guide/en/logstash/current/first-event.html

Reference [2.2]
https://www.elastic.co/guide/en/logstash/current/index.html