#Is NetFlow streaming data analysis possible with fluentd?
I wanted to do some analysis on the NetFlow data that I receive every day. The analyses I want to do vary: simple pattern matching for a specific IP address, detecting specific traffic patterns, building a network graph and calculating the proximity of certain nodes, and so on. I might use the norikra plug-in for that purpose later, but I am not sure yet.
I know storm, kafka, spark streaming, and even memSQL and VoltDB are good for this purpose, but I wanted to go the quick way for now.
#Capturing NetFlow with fluentd
I use this NetFlow plug-in with fluentd:
2015-12-19 19:08:01 +0000 [info]: fluent/engine.rb:113:block in configure: gem 'fluent-plugin-netflow' version '0.1.1'
2015-12-19 19:08:01 +0000 [info]: fluent/engine.rb:113:block in configure: gem 'fluentd' version '0.12.18'
When I tried to install the plug-in into td-agent, it gave me a warning about json:
WARN: Unresolved specs during Gem::Specification.reset:
json (>= 1.4.3)
WARN: Clearing out unresolved specs.
Please report a bug if this causes problems.
##Prepare docker container
The playground is Docker. Note that you need to map a udp port to receive NetFlow in your container. When you specify a port mapping with -p, the default transport is tcp. I spent some time before I realized why I was not getting any NetFlow data from my routers :(.
I am doing something like:
$ docker run -d -h fluent-nf --name=fluent-nf -p 8888:8888 -p 5140:5140/udp -v /home/me/docker/fluent-nf:/fluent-nf -it fluent-fn
##Netflow plugin Configuration
Just add the example from the plug-in page, plus an output to stdout, to fluent.conf.
<source>
type netflow
tag netflow.event
# optional parameters
#bind 127.0.0.1
bind 0.0.0.0
port 5140
# optional parser parameters
cache_ttl 6000
versions [5, 9]
</source>
<match netflow.**>
@type stdout
@id stdout_output
</match>
##Let's test
###HTTP
$ curl -X POST -d 'json={"json":"message"}' http://<collector_ip>:8888/debug.test
###UDP
echo -n 'foo' | nc -4u -w1 <collector_ip> 5140
You will find a line in stdout. The plug-in tries to parse the udp packet as NetFlow but ends up with a warning about a weird version number :). Now you know udp is reaching the container.
2015-12-19 19:08:53 +0000 [warn]: plugin/parser_netflow.rb:58:call: Ignoring Netflow version v26223
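The weird version number is not random. NetFlow packets start with a 16-bit big-endian version field, so the parser presumably read the first two bytes of 'foo' as the version. A minimal sketch to check the arithmetic (the variable names are mine, not the plug-in's):

```python
import struct

# The payload sent by: echo -n 'foo' | nc -4u -w1 <collector_ip> 5140
payload = b"foo"

# Interpret the first two bytes as an unsigned 16-bit big-endian integer,
# which is where a NetFlow header carries its version number.
(version,) = struct.unpack("!H", payload[:2])

print(version, hex(version))
```

The result is 26223 (0x666f), which is just the ASCII codes of "f" and "o" glued together. It matches the number in the warning above.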
###Netflow simulator
Now, let's inject real (well, not really real) NetFlow using a NetFlow simulator.
2015-12-22 01:31:09 +0000 [debug]: plugin/in_netflow.rb:73:block in receive_data: received logs host="10.2.6.142" data="\x00\x05\x00\x01\r?\xEF\x15<\xFF\xAA\x00`\x17ZvJZ\xAD\f\xFE\xFF\xFF\xFF\n\x00\x00\x01\n\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE8\r?\x04\xB5\r?\xEF\x15\x03\xE8\x00P\x80i\x06\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x00"
2015-12-22 01:31:09 +0000 [warn]: plugin/parser_netflow.rb:58:call: Ignoring Netflow version v5
Something is wrong, man...
So I modified plugin/parser_netflow.rb line 58 to skip the version check.
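To rule out the simulator, the payload from the debug line can be decoded by hand. The sketch below assumes the standard 24-byte NetFlow v5 header layout; the hex string is my own transcription of the first 24 bytes of the logged data, and the names V5_HEADER and header are only for illustration.

```python
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes).
V5_HEADER = struct.Struct("!HHIIIIBBH")

# First 24 bytes of the logged payload, transcribed to hex by hand
# (an assumption: double-check against your own capture).
header = bytes.fromhex(
    "0005" "0001" "0d3fef15" "3cffaa00" "60175a76" "4a5aad0c" "fe" "ff" "ffff"
)

fields = V5_HEADER.unpack(header)
version, count = fields[0], fields[1]

print("version:", version, "flow records:", count)
```

The header says version 5 with one flow record, so the packet itself looks fine and the warning seems to come from the version check rather than from the data.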
2015-12-22 01:59:59 +0000 [debug]: plugin/in_netflow.rb:73:block in receive_data: received logs host="10.2.6.142" data="\x00\x05\x00\x01\x01\xCA\xD7\x7FVx\xAB\\\x00\x00\x00\x00\x00\x00\x016\x00\x00\x00\x00\n\x00\x00\x02\n\x00\x00\x03\x00\x00\x00\x00\x00\x03\x00\x05\x00\x00\x00\x01\x00\x00\x00@\x01\xC9\xED\x1F\x01\xCA\xD7\x7F\x10\x92\x00P\x00\x00\x11\x01\x00\x02\x00\x03 \x1F\x00\x00"
2015-12-22 01:46:04 +0000 netflow.event: {"version":"5","flow_seq_num":"310","engine_type":"0","engine_id":"0","sampling_algorithm":"0","sampling_interval":"0","flow_records":"1","ipv4_src_addr":"10.0.0.2","ipv4_dst_addr":"10.0.0.3","ipv4_next_hop":"0.0.0.0","input_snmp":"3","output_snmp":"5","in_pkts":"1","in_bytes":"64","first_switched":"2015-12-22T01:45:04.000Z","last_switched":"2015-12-22T01:46:04.000Z","l4_src_port":"4242","l4_dst_port":"80","tcp_flags":"0","protocol":"17","src_tos":"1","src_as":"2","dst_as":"3","src_mask":"32","dst_mask":"31","host":"10.2.6.142"}
Yes. Working.
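With records arriving in this shape, the simplest analysis I mentioned at the top, pattern matching for a specific IP address, can be sketched in a few lines. This is only a rough illustration that reads the stdout lines as text; the function names are made up, and the sample record is a trimmed copy of the one above.

```python
import json


def parse_stdout_line(line):
    """Return the record dict from a '<time> <tag>: {json}' line, or None."""
    start = line.find("{")
    if start < 0:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def records_for_ip(lines, ip):
    """Yield records whose source or destination address equals ip."""
    for line in lines:
        record = parse_stdout_line(line)
        if record and ip in (record.get("ipv4_src_addr"), record.get("ipv4_dst_addr")):
            yield record


# Trimmed sample lines: one flow record and one unrelated warning.
sample_lines = [
    '2015-12-22 01:46:04 +0000 netflow.event: {"version":"5","ipv4_src_addr":"10.0.0.2",'
    '"ipv4_dst_addr":"10.0.0.3","l4_src_port":"4242","l4_dst_port":"80","protocol":"17"}',
    "2015-12-19 19:08:53 +0000 [warn]: plugin/parser_netflow.rb:58:call: Ignoring Netflow version v26223",
]

hits = list(records_for_ip(sample_lines, "10.0.0.3"))
print(len(hits), "matching record(s)")
```

For anything beyond a quick check, a proper fluentd output or filter plug-in would be a better fit than scraping stdout, but this is enough to confirm the fields are usable.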
Where is the docker image fluent-fn? I can't find it on Docker Hub.