This project was moved to https://codeberg.org/BelWue/flowpipeline
Please check the new repository for the most up-to-date version.
A goflow2-compatible flow message processing toolkit
bwNET is a research project of the German federal state of Baden-Württemberg, which aims to provide innovative services within the state's research and education network BelWü. This GitHub Org contains the code pertaining to the monitoring aspect of this project.
This repo contains our flow processing toolkit which enables us and our users to define pipelines for goflow2-compatible flow messages. The flowpipeline project integrates most other parts of our flow processing stack into a single piece of software which can be configured to serve any of the following functions:
- accepting raw Netflow (using goflow2)
- enriching the resulting flow messages (examples/configurations/enricher)
- writing to and reading from Kafka (examples/localkafka)
- dumping flows to cli (e.g. flowdump)
- providing metrics and insights (examples/prometheus)
- and many more...
To get going, choose one of the following deployment methods.
Clone this repo and use go build . to build the binary yourself.
By default, the binary will look for a config.yml in its working directory, so
you'll either want to create one or call it from any example directory (and
maybe follow the instructions there).
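As a rough sketch, a build-from-source session could look like this (the choice of example directory is arbitrary and only serves as an illustration):

# clone the repository and build the binary
git clone https://codeberg.org/BelWue/flowpipeline
cd flowpipeline
go build .

# run it from an example directory so it picks up that directory's config.yml
cd examples/configurations/enricher
../../../flowpipeline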
Download our latest release
and run it, same as if you compiled it yourself.
The flowpipeline releases contain executables for macOS (flowpipeline-darwin) and for Linux (flowpipeline-linux).
The default, dynamically linked version requires a reasonably recent system
(glibc 2.32+, Linux 5.11+ for bpf, mongodb ...) and comes with all features.
As a fallback option, the static binaries will work in older environments
(Rocky Linux 8, Debian 11, ...), but come without the segments that require
CGO/dynamically linked code (bpf, sqlite, mongodb, and plugin support; check
CONFIGURATION.md).
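For example, running a downloaded Linux release binary might look like this (the config path is just a placeholder):

# make the release binary executable and point it at a config
chmod +x flowpipeline-linux
./flowpipeline-linux -c path/to/config.yml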
A ready-to-use container is provided as belwue/flowpipeline; you can find it
on the GitHub container registry.
Configurations referencing other files (geolocation databases, for instance)
will work in a container without extra edits. This is because the volume
mountpoint /config is prepended to file paths in all segments that open files
from their configuration, provided the binary was built with the container build flag.
podman run -v ./examples/configuration/xy:/config flowpipeline
# or
docker run -v ./examples/configuration/xy:/config flowpipeline

We also provide a container displaying example visualizations via prometheus+grafana dashboards (ghcr.io/belwue/flowpipeline-grafana). The example container starts with:
- grafana running on port 3000, starting with default grafana admin credentials (user: admin, password: admin)
- netflow receiver running on port 2055
- sflow receiver running on port 6343
- prometheus running on port 9090
docker run -p 3000:3000 -p 2055:2055/udp -p 6343:6343/udp -p 9090:9090 ghcr.io/belwue/flowpipeline-grafana
The corresponding configuration files are available in /examples/visualization.
Refer to CONFIGURATION.md
for the full guide. Other than that, looking at the examples should give you a
good idea of what the config looks like in detail and what the possible
applications are.
The default configuration file used by flowpipeline is config.yml.
To use another configuration file, specify its location using -c path/to/file.yml.
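For instance, to run one of the example configurations mentioned above (assuming that directory contains a config.yml):

./flowpipeline -c examples/localkafka/config.yml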
A configuration file should start with a segment from the input group. This can then be followed
by one or more segments. The following segments process all flows output by the previous segment.
Most segments output all flows that they consumed from the previous segment.
The exceptions to this are the segments in the filter group.
For the sake of completeness, here's another minimal example
which starts listening for Netflow v9 on port 2055, applies the filter given as
the first argument, and then prints the flows to stdout in a tcpdump-style format.
- segment: goflow
- segment: flowfilter
  config:
    filter: $0
- segment: printflowdump

You'd call it with ./flowpipeline "proto tcp and (port 80 or port 443)", for instance.
For deployments in a production environment, the use of a central Kafka cluster is strongly advised.
This allows distributing multiple redundant flowpipeline instances across multiple georedundant locations.
The different workers can use the kafkaconsumer segment to read from the cluster and the kafkaproducer segment to write back to it.
Redundant workers need to be configured with the same Kafka consumer group on all instances so that flows are not duplicated.
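A minimal worker configuration along these lines might look as follows; note that the exact parameter names (server, topic, group) are assumptions here and the server and topic values are placeholders, so check CONFIGURATION.md for the authoritative keys:

- segment: kafkaconsumer
  config:
    server: kafka.example.com:9092
    topic: flows
    group: flowpipeline-workers
# ... any number of processing segments ...
- segment: kafkaproducer
  config:
    server: kafka.example.com:9092
    topic: flows-processed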
If you find that the existing segments lack some functionality or you require
some very specific behaviour, it is possible to include your own segments as a plugin.
This is done using the -p yourplugin.so command line option and your own
custom module. See
examples/plugin
for a basic example and instructions on how to compile your plugin.
Note that this requires CGO and thus will not work using the static binary releases or in a container.
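As a rough sketch, compiling and loading a plugin could look like this (the output name yourplugin.so comes from the option above; the exact build invocation for your module is documented in examples/plugin):

# build the custom segment as a Go plugin object (requires CGO)
go build -buildmode=plugin -o yourplugin.so .

# load it alongside the regular configuration
./flowpipeline -p yourplugin.so -c config.yml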
Contributions in any form (code, issues, feature requests) are very much welcome.