NAME
flowdumper - a grep(1)-like utility for raw flow files
SYNOPSIS
flowdumper [-h] [-v] [-s|S|r|R] [-a|n] [[-I expr] -e expr [-E expr]] [-c] [-B file] [-o output_file] [flow_file [...]]
but usually just:
flowdumper [-s] -e expr flow_file [...]
DESCRIPTION
flowdumper is a grep(1)-like utility for selecting and processing
flows from cflowd or flow-tools raw flow files. The selection criteria are
specified by using the "-e" option described below.
flowdumper's primary features are the ability to:
- Print the content of raw flow files in one of two built-in formats or a
format of the user's own. The built-in "long" format is much like
that produced by the flowdump command supplied with cflowd. The
"short", single-line format is suitable for subsequent
post-processing by line-oriented filters like sed(1).
- Act as a filter, reading raw flow input from either file(s) or standard
input, and producing filtered raw flow output on standard output. This is
similar to how grep(1) is often used on text files.
- Select flows according to practically any criteria that can be expressed
in perl syntax.
The "flow variables" and other symbols available for use in the
"-e" expression are those made available by the Cflow module when
used like this:
use Cflow qw(:flowvars :tcpflags :icmptypes :icmpcodes);
See the Cflow perl documentation for full details on these values (that is, run
"perldoc Cflow").
Most perl syntax is allowed in the expressions specified with the
"-e", "-I", and "-E" options. See the perl man
pages for full details on operators ("man perlop") and functions
("man perlfunc") available for use in those expressions.
If run with no arguments, flowdumper filters standard input to standard output.
The options and their arguments, roughly in order of usefulness, are:
- "-h"
- shows the usage information
mnemonic: 'h'elp
- "-a"
- print all flows
implied if "-e" is not specified
mnemonic: 'a'll
- "-e" expr
- evaluate this expression once per flow
mnemonic: 'e'xpression
- "-c"
- print number of flows matched in input
mnemonic: 'c'ount
- "-s"
- print flows in short (one-line) format, ignored with "-n"
mnemonic: 's'hort
- "-r"
- print flows in the raw/binary flow file format
ignored with "-n"
mnemonic: 'r'aw
- "-R"
- repack and print flows in the raw/binary flow file format
requires "-e", ignored with "-n", useful with "-p"
mnemonic: 'R'epack raw
- "-n"
- don't print matching flows
mnemonic: like "perl -n" or "sed -n"
- "-o" output_file
- send output to the specified file. A single printf(3) string
conversion specifier can be used within the output_file value (such as
"/tmp/%s.txt") to make the output file name a function of the
input file basename.
mnemonic: 'o'utput file
- "-S"
- print flows in the "old" short (one-line) format
ignored with "-n"
mnemonic: 'S'hort
- "-v"
- be verbose with messages
mnemonic: 'v'erbose
- "-V"
- be very verbose with messages (implies "-v")
mnemonic: 'V'ery verbose
- "-I" expr
- eval expression initially, before flow processing
practically useless without "-e"
mnemonic: 'I'nitial expression
- "-E" expr
- eval expression after flow processing is complete
practically useless without "-e"
mnemonic: 'E'ND expression
- "-B" file
- Load the specified BGP dump file using Net::ParseRouteTable.
In your optional expression, you can now refer to these variables:
$dst_as_path_arrayref
$dst_origin_as
$dst_peer_as
$src_as_path_arrayref
$src_origin_as
$src_peer_as
which will cause a lookup. Their values are undefined if the lookup fails.
mnemonic: 'B'GP dump file
- "-p" prefix_mappings_file
- read file containing IPv4 prefix mappings in this format (one per line):
10.42.69.0/24 -> 10.69.42.0/24
...
When specifying this option, you can, and should at some point, call the
ENCODE subroutine in your expressions to have it encode the IP address
flowvars such as $Cflow::exporter, $Cflow::srcaddr, $Cflow::dstaddr, and
$Cflow::nexthop.
mnemonic: 'p'refixes
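To make the "-o" naming rule above concrete, here is a minimal shell sketch of how a "%s" template could expand for each input file. It only mimics the substitution with printf(1) and basename(1). The out_name helper and the sample paths are hypothetical, and flowdumper's own handling of unusual templates may differ.

```shell
# out_name is a hypothetical helper, not part of flowdumper.
# $1 = output template containing a single %s, $2 = input file path
out_name() {
    printf "$1" "$(basename "$2")"
}

# Each input file gets its own output name, built from its basename.
for f in /var/flows/flows.19990101 flows.current; do
    out_name '/tmp/%s.txt' "$f"
    echo
done
# prints:
#   /tmp/flows.19990101.txt
#   /tmp/flows.current.txt
```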
EXAMPLES
Print all flows, in a multi-line format, to a pager:
$ flowdumper -a flows.* |less
Print all the UDP flows to another file using the raw binary flow format:
$ flowdumper -re '17 == $protocol' flows.current > udp_flows.current
Print all TCP flows which have the SYN bit set in the TCP flags:
$ flowdumper -se '6 == $protocol && ($TH_SYN & $tcp_flags)' flows.*
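The test in that expression is a plain protocol comparison combined with a bitwise AND against the TCP flags. As a rough illustration outside of perl, the same logic can be sketched in shell arithmetic. The values 0x02 (SYN) and 0x10 (ACK) are the standard TCP flag bits and 6 is the IP protocol number for TCP; the is_tcp_syn helper is hypothetical.

```shell
TH_SYN=0x02    # SYN bit in the TCP flags octet
TH_ACK=0x10    # ACK bit, used below only to build sample values

# is_tcp_syn is a hypothetical helper, not part of flowdumper.
# $1 = IP protocol number, $2 = TCP flags value for the flow
is_tcp_syn() {
    [ "$1" -eq 6 ] && [ $(( $TH_SYN & $2 )) -ne 0 ]
}

is_tcp_syn 6 $(( $TH_SYN | $TH_ACK )) && echo "TCP, SYN+ACK: selected"
is_tcp_syn 6 $TH_ACK                  || echo "TCP, ACK only: skipped"
is_tcp_syn 17 $TH_SYN                 || echo "UDP: skipped"
```

Because the AND result is non-zero whenever the SYN bit is present, flows whose flags include SYN together with other bits are selected as well.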
Print the first 10 flows to another file using the raw binary flow format:
$ flowdumper -I '$n = 10' -re '$n-- or exit' flows.*0 > head.cflow
Print all flows, each preceded by a line showing its start time (a two-line format):
$ flowdumper -se 'print scalar(localtime($startime)), "\n"' flows.*
Print all flows with the specified source address using a short, single-line
format:
$ flowdumper -se '"10.42.42.42" eq $srcip' flows.*
Do the same thing in a quicker, but less obvious, way:
$ flowdumper -I '
use Socket;
$addr = unpack("N", Socket::inet_aton("10.42.42.42"));
' -se '$addr == $srcaddr' flows.*
(This latter method runs quicker because inet_aton(3) is only called
once, instead of once per flow.)
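The value compared against $srcaddr in the quicker form is simply the dotted-quad address packed into a single 32-bit integer, most significant octet first. A minimal shell sketch of that conversion follows; ip_to_int is a hypothetical helper used only for illustration.

```shell
# ip_to_int is a hypothetical helper, not part of flowdumper or Cflow.
# It packs the four octets of a dotted-quad IPv4 address into one
# integer, most significant octet first. The body runs in a subshell
# so the IFS change does not leak into the caller.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

ip_to_int 10.42.42.42    # prints 170535466 (0x0A2A2A2A)
```

Doing this conversion once, up front, is what lets the per-flow test be a single integer comparison.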
Print all flows with a source address within the specified network/subnet:
$ flowdumper \
-I 'use Socket;
$mask = unpack("N", Socket::inet_aton("10.42.0.0"));
$width = 16' \
-se '$mask == ((0xffffffff << (32-$width)) & $srcaddr)' flows.*
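The comparison above keeps only the leading $width bits of the source address and checks them against the network address. The arithmetic can be tried in isolation with a small shell sketch. It assumes a shell with 64-bit integer arithmetic, writes the addresses as hex literals for brevity, and uses a hypothetical in_subnet helper.

```shell
net=0x0A2A0000     # 10.42.0.0, the role played by $mask above
width=16           # prefix length, the role played by $width above

# in_subnet is a hypothetical helper, not part of flowdumper.
# It succeeds when the integer address in $1 lies inside net/width.
in_subnet() {
    [ $(( (0xffffffff << (32 - $width)) & $1 )) -eq $(( $net )) ]
}

in_subnet 0x0A2A2A2A && echo "10.42.42.42 is inside 10.42.0.0/16"
in_subnet 0x0A450101 || echo "10.69.1.1 is outside 10.42.0.0/16"
```

The shifted constant has bits above the 32nd set, but ANDing it with a 32-bit address discards them, so only the top 16 bits of the address survive.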
Print all flows where either the source or the destination address, but not
both, is within the specified set of networks or subnets:
$ flowdumper \
-I 'use Net::Patricia;
$pt = Net::Patricia->new;
map { $pt->add_string($_, 1) } qw( 10.42.0.0/16
10.69.0.0/16 )' \
-se '1 == ($pt->match_integer($srcaddr) +
$pt->match_integer($dstaddr))' flows.*
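The "but not both" condition relies on adding the two lookup results and requiring the sum to be exactly 1. This assumes each lookup contributes 1 on a match (the value stored with add_string above) and counts as 0 otherwise. A small shell sketch of the resulting truth table, with a hypothetical exactly_one helper:

```shell
# exactly_one is a hypothetical helper, not part of flowdumper.
# $1 and $2 are 1 when the source or destination matched, else 0.
exactly_one() {
    [ $(( $1 + $2 )) -eq 1 ]
}

for src in 0 1; do
    for dst in 0 1; do
        if exactly_one "$src" "$dst"; then
            result=selected
        else
            result=skipped
        fi
        echo "src=$src dst=$dst -> $result"
    done
done
# prints:
#   src=0 dst=0 -> skipped
#   src=0 dst=1 -> selected
#   src=1 dst=0 -> selected
#   src=1 dst=1 -> skipped
```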
Count the total number of "talkers" (unique source host addresses) within the
specified networks or subnets by printing them and piping the output to
sort(1) and wc(1):
$ flowdumper \
-I 'use Net::Patricia;
$pt = Net::Patricia->new;
map { $pt->add_string($_, 1) } qw( 10.42.0.0/16
10.69.0.0/16 )' \
-ne '$pt->match_integer($srcaddr) and print "$srcip\n"' flows.* \
|sort -u |wc -l
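The counting half of that pipeline can be tried on its own, without any flow files. The sketch below feeds a few made-up addresses (sample data, not real output) through the same sort(1) and wc(1) stages; count_unique is a hypothetical helper, and the tr(1) call only strips the padding that some wc implementations add.

```shell
# count_unique is a hypothetical helper, not part of flowdumper.
# It reads one address per line and prints how many distinct lines it saw.
count_unique() {
    sort -u | wc -l | tr -d ' '
}

# Four lines of input, one of them a duplicate.
printf '%s\n' 10.42.0.1 10.69.3.7 10.42.0.1 10.42.0.9 | count_unique
# prints 3
```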
Count the total number of "talkers" (unique source host addresses)
that are within the specified networks or subnets:
$ flowdumper \
-I 'use Net::Patricia;
$pt = Net::Patricia->new;
map { $pt->add_string($_, 1) } qw( 10.42.0.0/16
10.69.0.0/16 );
$talkers = Net::Patricia->new' \
-ne '$pt->match_integer($srcaddr) &&
($talkers->match_integer($srcaddr) or
$talkers->add_string($srcip, 1))' \
-E 'printf("%d\n", $talkers->climb( sub { 1 } ))' flows.*
(For large numbers of flows, this latter method is quicker because it populates
a Net::Patricia trie with the unique addresses and counts the resulting nodes
rather than having to print them to standard output and then having to sort
them to determine how many are unique.)
Select the TCP flows and "ENCODE" the IP addresses according to the
prefix encodings specified in "prefix_encodings.txt":
$ flowdumper -p prefix_encodings.txt -se '6 == $protocol && ENCODE'
Produce a new raw flow file with the IP addresses ENCODEd according to the
prefix encodings specified in "prefix_encodings.txt":
$ flowdumper -p prefix_encodings.txt -Re 'ENCODE' flows > flows.enc
Produce a set of raw flow files that have the $src_as and $dst_as origin AS
values filled in based upon a lookup in an externally specified routing table (in
the file "router.bgp") and have the IP address info replaced with
zeroes (for anonymity):
$ ssh router "show route protocol bgp terse" > router.bgp # Juniper
$ flowdumper \
-B router.bgp \
-e '$src_as = $src_origin_as,
$dst_as = $dst_origin_as,
(($exporter = 0),
($srcaddr = 0),
($src_mask = 0),
($dstaddr = 0),
($dst_mask = 0),
($nexthop = 0), 1)' \
-R \
-o /tmp/%s.cflow_enc \
flows*
NOTES
This utility was inspired by Daniel McRobb's flowdump utility, which is
supplied with cflowd.
flowdumper was originally written as merely a sample of what can be done
with the Cflow perl module, but has since been developed into a more
complete tool.
BUGS
When using the "-B" option, routing table entries that contain AS sets
at the end of the AS path are quietly discarded. (It's not so quiet if you
also specified "-V".) It was necessary to discard these, because I
did not consider AS sets when designing the API and therefore have no way to
communicate more than one origin AS value for a single source or
destination IP address.
There are perhaps some pathological combinations of options that currently do
not produce usage error messages, but should.
Since the expression syntax is that of perl itself, there are lots of useless
expressions that will happily be accepted without complaint. This is
particularly troublesome when trying to track down typos, for instance, with the
flow variable names.
This script probably has the same bugs as the Cflow module, since it's based
upon it.
AUTHOR
Dave Plonka <plonka@doit.wisc.edu>
Copyright (C) 1998-2002 Dave Plonka. This program is free software; you can
redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
SEE ALSO
perl(1), Socket, Net::Netmask, Net::Patricia, Cflow.