# Some notes from experience with grok matching in Logstash.
# This page has a good collection of regular expressions: https://github.com/yangzj1992/articles/blob/master/%E3%80%90%E4%B8%AA%E4%BA%BA%E6%80%BB%E7%BB%93%E3%80%91%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F%E8%AF%AD%E6%B3%95%E5%8F%8A%E5%B8%B8%E7%94%A8%E6%AD%A3%E5%88%99.md
input {
  file {
    path => "c:\Program Files\logstash-2.3.3\jifeng_test1.txt"
  }
}
filter {
  grok {
    match => { "message" => "%{USERNAME_CHINESE_DONG:username}\~\|\^%{EMAILADDRESS_DONG:email}" }
    # A few notes on the match line above. Each log entry looks like this:
    #   anetadmin~|^gfan2011@mappn.com~|^3e46cea508a2595dc54510a10d5bbb1e:203473~|^222.130.140.89
    # The pattern must NOT be written with the separator left unescaped:
    # match => { "message" => "%{USERNAME_CHINESE_DONG:username}~|^%{EMAILADDRESS_DONG:email}" }
    # "|" and "^" are regex metacharacters and must be escaped with "\", as in the active match line above.
    # "~" is not a regex metacharacter, but in my tests escaping it with "\" also works, so both of these are correct:
    # match => { "message" => "%{USERNAME_CHINESE_DONG:username}\~\|\^%{EMAILADDRESS_DONG:email}" }
    # match => { "message" => "%{USERNAME_CHINESE_DONG:username}~\|\^%{EMAILADDRESS_DONG:email}" }
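    # Why the unescaped form goes wrong (a brief sketch based on general regex semantics, not on grok internals):
    # an unescaped "|" means alternation and "^" anchors the start of the line, so "A~|^B" reads as
    # "A followed by ~" OR "B at the start of the line", not as A and B joined by the literal separator "~|^".
    # A variant that should be equivalent (an assumption, untested on this setup) puts "|" in a character class,
    # where it is taken literally:
    # match => { "message" => "%{USERNAME_CHINESE_DONG:username}~[|]\^%{EMAILADDRESS_DONG:email}" }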
    # One more thing: the built-in EMAILADDRESS pattern gave incorrect output here, so I match with a custom pattern, EMAILADDRESS_DONG, instead.
    # EMAILADDRESS_DONG is defined under C:\Program Files\logstash-2.3.3\vendor\bundle\jruby\1.9\gems\logstash-patterns-core-2.0.5\patterns
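    # A possible alternative to editing the bundled patterns directory (a sketch under assumptions, untested here):
    # keep custom patterns in a separate directory and point grok at it with the patterns_dir option.
    # The directory "./patterns", the file name "custom" and the pattern EMAIL_EXAMPLE are hypothetical, and the
    # regex is a simplified illustration, NOT the actual definition of EMAILADDRESS_DONG.
    #
    # Contents of ./patterns/custom (one "NAME regex" pair per line):
    #   EMAIL_EXAMPLE [a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
    #
    # Then, inside this grok block:
    #   patterns_dir => ["./patterns"]
    #   match => { "message" => "%{USERNAME_CHINESE_DONG:username}~\|\^%{EMAIL_EXAMPLE:email}" }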
    # The regex part of match is whitespace-sensitive: do not insert stray spaces, tabs, or similar characters into it.
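    # Sketch of extending the match to all four fields of the sample log entry. Assumptions: the field names
    # "hash" and "ip" are hypothetical, DATA and IP are patterns from the stock logstash-patterns-core set,
    # and the line is untested against the real data:
    # match => { "message" => "%{USERNAME_CHINESE_DONG:username}~\|\^%{EMAILADDRESS_DONG:email}~\|\^%{DATA:hash}~\|\^%{IP:ip}" }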
  }
}
output {
  stdout { codec => rubydebug {} }
  # elasticsearch {
  #   flush_size => 10000
  #   hosts => "localhost"
  #   index => "qq_sgk"
  #   document_type => "qq_sgk-1"
  # }
}