Sunday, April 29, 2007

Painless Non-Enterprise Netflow

Tonight I released another ubuntutrinux-core snapshot that includes fprobe and a few tools from flow-tools. I'll spare you the introduction to Netflow, except to comment on why it might be useful on Linux/Trinux (it's obvious for routers!) as opposed to other network monitoring tools.

In terms of data, you are getting about what you would get with a port logger such as ippl, or other port listeners that log connections from hosts:

root@gx620:/tmp# flow-cat biglast | flow-print | head
srcIP dstIP prot srcPort dstPort octets packets
24.136.0.111 239.255.255.250 2 0 0 32 1
24.136.0.189 239.255.255.250 2 0 0 32 1
82.211.81.145 24.136.x.y 17 123 123 76 1
24.136.2.30 239.255.255.250 2 0 0 32 1
24.136.2.67 224.0.0.251 2 0 0 32 1
24.136.2.67 239.255.255.253 2 0 0 32 1
10.48.120.1 224.0.0.1 2 0 0 28 1
24.136.0.163 239.255.67.250 2 0 0 32 1
24.136.19.48 224.0.0.253 2 0 0 32 1
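
To put a number on how much of a sample like this is multicast, the flow-print output can be piped through a small script. This is only a sketch in Python, assuming the seven-column layout shown above; the function name count_multicast is made up for illustration, and masked addresses such as 24.136.x.y are simply counted as non-multicast.

```python
import ipaddress

def count_multicast(lines):
    """Return (multicast, total) flow counts from flow-print style lines."""
    multicast = total = 0
    for line in lines:
        fields = line.split()
        # Skip blank lines and the "srcIP dstIP ..." header row.
        if len(fields) != 7 or fields[0] == "srcIP":
            continue
        total += 1
        try:
            # fields[1] is the destination address column.
            if ipaddress.ip_address(fields[1]).is_multicast:
                multicast += 1
        except ValueError:
            # Masked or malformed address (e.g. 24.136.x.y): not counted.
            pass
    return multicast, total

# A few rows copied from the sample output above.
sample = """\
srcIP dstIP prot srcPort dstPort octets packets
24.136.0.111 239.255.255.250 2 0 0 32 1
82.211.81.145 24.136.x.y 17 123 123 76 1
10.48.120.1 224.0.0.1 2 0 0 28 1
"""

print(count_multicast(sample.splitlines()))
```

In practice the same function could read sys.stdin, so that flow-print output is piped straight into it.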

Notice this is mostly multicast cruft on RCN (protocol 2 is IGMP), with the exception of the NTP traffic to the Ubuntu time source. But big deal. Some of you may remember (back in the day!) an NSWC tool called SHADOW (where Northcutt and Irwin made their claim to fame) that was basically a collection of Perl scripts that managed tcpdump file capture and viewing through a web interface.

Well, flow-tools allows you to do a lot of the same stuff with much less overhead, and all from the command line:

$ flow-cat biglast | flow-stat -f5 -S 1 | head -25

# Args: flow-stat -f5 -S 1

#
#
# port flows octets packets
#
80 2464 12632818 162315
53 1099 214515 3053
1026 414 250208 710
32768 405 131967 766
1027 314 155033 314
123 186 14136 186
443 177 333745 3020
7 175 19075 175
5222 141 36842 370
3408 120 15892 220

Basically you "flow-cat" the saved file into a number of different tools, flow-stat being the most useful for me. It's not terribly surprising that HTTP is at the top, nor is the port 1026 traffic hitting my firewall. Damn cable.
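
Because the flow-stat report is plain whitespace-separated text, it is easy to post-process. The following Python sketch assumes the four-column "port flows octets packets" layout shown above and skips the "#" comment lines; parse_port_report is a hypothetical helper name, not part of flow-tools.

```python
def parse_port_report(text):
    """Parse 'port flows octets packets' rows, skipping '#' comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        port, flows, octets, packets = (int(x) for x in line.split())
        rows.append({"port": port, "flows": flows,
                     "octets": octets, "packets": packets})
    return rows

# Three rows copied from the report above.
report = """\
# port flows octets packets
#
80 2464 12632818 162315
53 1099 214515 3053
123 186 14136 186
"""

for row in parse_port_report(report):
    # Average packet size per port is a quick hint at the kind of traffic.
    avg = row["octets"] / row["packets"]
    print(f"port {row['port']:>5}: {avg:8.1f} octets/packet")
```

For port 123 this works out to exactly 76 octets per packet, which matches the single NTP flow in the flow-print sample.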

$ flow-cat biglast | flow-stat

#
# Fields: Total
# Symbols: Disabled
# Sorting: None
# Name: Overall Summary
#
# Args: flow-stat
#
Total Flows : 13168
Total Octets : 380416991
Total Packets : 517136
Total Time (1/1000 secs) (flows): 222894382
Duration of data (realtime) : 34320
Duration of data (1/1000 secs) : 364934
Average flow time (1/1000 secs) : 16926.9733
Average packet size (octets) : 735.6227
Average flow size (octets) : 28889.5043
Average packets per flow : 39.2722
Average flows / second (flow) : 36.1758
Average flows / second (real) : 0.3837
Average Kbits / second (flow) : 8360.8132
Average Kbits / second (real) : 88.6753

After concatenating all this data into a single file, it only took about 800k for about 10 hours of traffic.
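
The averages in the summary can be cross-checked from the totals. This Python sketch assumes that "Duration of data (realtime)" is in seconds and that a Kbit is 1000 bits; both are my assumptions, though the results agree with the report to about three decimal places.

```python
# Totals copied from the flow-stat summary above.
total_flows = 13168
total_octets = 380416991
total_packets = 517136
realtime_secs = 34320  # assumption: "Duration of data (realtime)" is seconds

avg_flow_size = total_octets / total_flows          # octets per flow
avg_packet_size = total_octets / total_packets      # octets per packet
avg_packets_per_flow = total_packets / total_flows
flows_per_sec_real = total_flows / realtime_secs
# assumption: 1 Kbit = 1000 bits
kbits_per_sec_real = total_octets * 8 / 1000 / realtime_secs
hours = realtime_secs / 3600

print(f"Average flow size (octets)    : {avg_flow_size:.4f}")
print(f"Average packet size (octets)  : {avg_packet_size:.4f}")
print(f"Average packets per flow      : {avg_packets_per_flow:.4f}")
print(f"Average flows / second (real) : {flows_per_sec_real:.4f}")
print(f"Average Kbits / second (real) : {kbits_per_sec_real:.4f}")
print(f"Hours of data                 : {hours:.2f}")
```

The 34320 seconds come to roughly nine and a half hours, which is where the "about 10 hours" figure comes from.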

And how did I kick all this off?

First I ran fprobe on Trinux and made sure it was working by testing it with EHNT, which is the quickest way (it took me a while to wander through the flow-tools manpages, and the Ubuntu startup script in /etc/init.d for capturing flows didn't work). I'm using pcap to get this, but there is a version of fprobe that can generate flows from iptables.

#fprobe -u nobody collector-ip:collector-port
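
If EHNT isn't handy, a quick way to confirm that export datagrams are arriving on the collector is to listen on the UDP port and decode the packet header. The sketch below is Python and assumes NetFlow version 5 exports (the ft-v05 capture files further down suggest that is what is in use here); parse_v5_header and listen are names I made up, and port 4444 is just the one used in this post.

```python
import socket
import struct

# NetFlow v5 header: 24 bytes, network byte order.
# version, count, sys_uptime, unix_secs, unix_nsecs, flow_sequence,
# engine_type, engine_id, sampling_interval
V5_HEADER = struct.Struct("!HHIIIIBBH")

def parse_v5_header(datagram):
    """Decode the fixed header of a NetFlow v5 export datagram."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram too short for a NetFlow v5 header")
    (version, count, _uptime, unix_secs, _nsecs,
     flow_sequence, _etype, _eid, _sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not NetFlow v5 (version field is {version})")
    return {"version": version, "count": count,
            "unix_secs": unix_secs, "flow_sequence": flow_sequence}

def listen(port=4444, max_datagrams=5):
    """Print a summary of the first few export datagrams seen on a UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        for _ in range(max_datagrams):
            data, (src, _sport) = sock.recvfrom(65535)
            print(src, parse_v5_header(data))
```

Calling listen(4444) on the collector prints the record count and sequence number of each datagram. It has to run in place of the real collector, not alongside it, since only one process can bind the port.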

BTW, Netflow uses UDP, so you can sniff to make sure the flow updates are being sent. And then on the server (you'll want to be more restrictive about the local and remote addresses, the 0's):

#flow-capture -w /raid/flows/ 0/0/4444 -S20

which creates a directory hierarchy like:

root@gx620:/raid/flows/2007# ls -alR | less

.:
total 0
drwxr-xr-x 3 root root 72 2007-04-29 12:04 .
drwxr-xr-x 3 root root 72 2007-04-29 12:04 ..
drwxr-xr-x 3 root root 80 2007-04-29 12:04 2007-04

./2007-04:
total 2
drwxr-xr-x 3 root root 80 2007-04-29 12:04 .
drwxr-xr-x 3 root root 72 2007-04-29 12:04 ..
drwxr-xr-x 2 root root 2208 2007-04-29 22:15 2007-04-29

./2007-04/2007-04-29:
total 326
drwxr-xr-x 2 root root 2208 2007-04-29 22:15 .
drwxr-xr-x 3 root root 80 2007-04-29 12:04 ..
-rw-r--r-- 1 root root 1145 2007-04-29 12:15 ft-v05.2007-04-29.121249-0500
-rw-r--r-- 1 root root 4500 2007-04-29 12:30 ft-v05.2007-04-29.121904-0500
-rw-r--r-- 1 root root 2153 2007-04-29 12:45 ft-v05.2007-04-29.123001-0500
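
The capture file names appear to encode the export version, the start date and time, and the UTC offset. Under that assumption (it is inferred from the listing above, not taken from the flow-tools documentation), a short Python sketch can turn a name into a timestamp, which is handy for picking out the files covering a given time window. capture_start is a hypothetical helper name.

```python
from datetime import datetime

def capture_start(filename):
    """Parse the start time from a name like ft-v05.2007-04-29.121249-0500."""
    prefix, _, stamp = filename.partition(".")
    if not prefix.startswith("ft-v") or not stamp:
        raise ValueError(f"unexpected capture file name: {filename}")
    # Assumed layout: date, then HHMMSS, then a +/-HHMM UTC offset.
    return datetime.strptime(stamp, "%Y-%m-%d.%H%M%S%z")

# File names copied from the listing above.
names = [
    "ft-v05.2007-04-29.123001-0500",
    "ft-v05.2007-04-29.121249-0500",
    "ft-v05.2007-04-29.121904-0500",
]

for name in sorted(names, key=capture_start):
    print(capture_start(name).isoformat(), name)
```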


Of course there are tons more options. The -S20 just says to write a status report to syslog every 20 minutes, like:

Apr 29 22:20:00 localhost flow-capture[27718]: STAT: now=1177903200 startup=1177867131 src_ip=192.168.100.1 dst_ip=192.168.169.162 d_ver=5 pkts=2918 flows=13828 lost=1 reset=0 filter_drops=0
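
These STAT lines are simple key=value pairs, so they are easy to pull out of syslog for a quick health check. A minimal Python sketch, assuming the line format shown above; parse_stat is a made-up helper name, not something flow-tools provides.

```python
def parse_stat(line):
    """Extract the key=value counters that follow 'STAT:' in a log line."""
    _, sep, rest = line.partition("STAT:")
    if not sep:
        raise ValueError("not a flow-capture STAT line")
    stats = {}
    for token in rest.split():
        key, _, value = token.partition("=")
        # Counters become ints; addresses stay as strings.
        stats[key] = int(value) if value.isdigit() else value
    return stats

# The log line from above.
line = ("Apr 29 22:20:00 localhost flow-capture[27718]: STAT: now=1177903200 "
        "startup=1177867131 src_ip=192.168.100.1 dst_ip=192.168.169.162 "
        "d_ver=5 pkts=2918 flows=13828 lost=1 reset=0 filter_drops=0")

stats = parse_stat(line)
uptime_hours = (stats["now"] - stats["startup"]) / 3600
print(f"up {uptime_hours:.1f} h, {stats['flows']} flows, {stats['lost']} lost")
```

For the line above, now minus startup is 36069 seconds, so the collector had been running for about ten hours.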
