Version: 1.2.9

Docker Default Installation Guide

In this deployment guide, you will learn how to deploy a basic Netsage setup that includes one sflow and/or one netflow collector. If you have more than one collector of either type, or other special situations, see the Docker Advanced guide.

The Docker containers included in the installation are:

  • rabbit (the local RabbitMQ server)
  • sflow-collector (receives sflow data and writes nfcapd files)
  • netflow-collector (receives netflow data and writes nfcapd files)
  • importer (reads nfcapd files and puts flows into a local rabbit queue)
  • logstash (logstash pipeline that processes flows and sends them to, by default, netsage-elk1.grnoc.iu.edu)
  • ofelia (cron-like downloading of files used by the logstash pipeline)

The code and configs for the importer and logstash pipeline can be viewed in this github repo (netsage-project/netsage-pipeline). See netsage-project/docker-nfdump-collector for code related to the collectors.

1. Set up Data Sources#

The data processing pipeline needs data to ingest in order to do anything, of course. There are three types of data that can be consumed.

  • sflow
  • netflow
  • tstat

At least one of these must be set up on a sensor to provide the incoming flow data. You can do this step later, but it will be helpful to have it working first.

Sflow and netflow data should be exported to the pipeline host where there are collectors (nfcapd and/or sfcapd processes) ready to receive it (see below). To use the default settings, send sflow to port 9998 and netflow to port 9999. On the pipeline host, allow incoming traffic from the flow exporters, of course.

Tstat data should be sent directly to the logstash input RabbitMQ queue on the pipeline host. No collector is needed for tstat data. From there, logstash will grab the data and process it the same way as it processes sflow/netflow data. (See the Docker Advanced guide.)

2. Clone the Netsage Pipeline Project#

If you haven't already, install Docker and Docker Compose and clone this project

git clone https://github.com/netsage-project/netsage-pipeline.git

(If you are upgrading to a new release, see the Upgrade section below!)

Then checkout the right version of the code.

git checkout tag_name

Replace "tag_name" with the release version you intend to use, e.g., "v1.2.8". (The master branch is the development version and is not intended for general use!) Running git status will confirm which branch or tag you are on, e.g., master or v1.2.8.
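The checkout step above can be sketched in a throwaway repo (the real repo is netsage-pipeline; "v1.2.8" is just an example tag name):

```shell
# Throwaway demo of checking out a release tag
cd "$(mktemp -d)"
git init -q demo && cd demo
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty -m "init"
git tag v1.2.8
git checkout -q v1.2.8       # detached HEAD at the release tag
git describe --tags          # prints the tag you are on
```

Checking out a tag leaves you in "detached HEAD" state, which is expected; git describe --tags is a quick way to confirm which release you are running.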

3. Create docker-compose.override.yml#

Information in the docker-compose.yml file tells docker which containers (processes) to run and sets various parameters for them. Settings in the docker-compose.override.yml file will overrule and add to those. Note that docker-compose.yml should not be edited since upgrades will replace it. Put all customizations in the override file, since override files will not be overwritten.

Collector settings may need to be edited by the user, so the information that docker uses to run the collectors is specified (only) in the override file. Therefore, docker-compose.override_example.yml must always be copied to docker-compose.override.yml.

cp docker-compose.override_example.yml docker-compose.override.yml
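For reference, a pared-down override file for a site with only an sflow collector might look roughly like this. This is a sketch, not a drop-in file: keep the actual service definition, image version, and top-level version: line from the bundled example file.

```yaml
version: "3.7"
services:
  sflow-collector:
    image: netsage/nfdump-collector:1.6.18
    ports:
      - "9998:9998/udp"
    volumes:
      - ./data/input_data/sflow:/data
```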

By default, docker will bring up a single netflow collector and a single sflow collector. If this matches your case, you don't need to make any changes to docker-compose.override.yml.

note

If you only have one collector, you should remove or comment out the section for the collector that is not used.

This file also specifies port numbers, and directories for nfcapd files. By default, the sflow collector will listen to udp traffic on localhost:9998, while the netflow collector will listen on port 9999, and data will be written to /data/input_data/. Each collector is namespaced by its type so the sflow collector will write data to /data/input_data/sflow/ and the netflow collector will write data to /data/input_data/netflow/.

Other lines in this file you can ignore for now.

note

You may need to remove all the comments in the override file as they may conflict with the parsing done by docker-compose

4. Create Environment File#

Please copy env.example to .env

cp env.example .env

then edit the .env file to set the sensor names

sflowSensorName=my sflow sensor name
netflowSensorName=my netflow sensor name

Simply change the names to unique identifiers (with spaces or not, no quotes) and you're good to go.

note

These names uniquely identify the source of the data. In elasticsearch, they are saved in the meta.sensor_id field and will be shown in Grafana dashboards. Choose names that are meaningful and unique. For example, your sensor names might be "RNDNet New York Sflow" and "RNDNet Boston Netflow" or "rtr.one.rndnet.edu" and "rtr.two.rndnet.edu". Whatever makes sense in your situation.

  • If you don't set a sensor name, the default docker hostname, which changes each time you run the pipeline, will be used.
  • If you have only one collector, remove or comment out the line for the one you are not using.
  • If you have more than one of the same type of collector, see the "Docker Advanced" documentation.
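The name-setting step can also be done non-interactively, for example with sed. This is a hypothetical sketch: env.example is mocked with just the two name lines, and the sensor names are only examples.

```shell
# Mocked env.example; in the repo, just run: cp env.example .env
cd "$(mktemp -d)"
printf 'sflowSensorName=\nnetflowSensorName=\n' > env.example
cp env.example .env
# Fill in the sensor names (no quotes needed, spaces are fine)
sed -i 's/^sflowSensorName=.*/sflowSensorName=RNDNet New York Sflow/' .env
sed -i 's/^netflowSensorName=.*/netflowSensorName=RNDNet Boston Netflow/' .env
grep SensorName .env
```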

Other settings of note in this file include the following. You will not necessarily need to change these, but be aware.

rabbit_output_host: this defines where the final data will land after going through the pipeline. By default, the last rabbit queue will be on rabbit, i.e., the local rabbitMQ server running in its docker container. Enter a hostname to send to a remote rabbitMQ server (and also set the correct username, password, and queue key/name).

The following Logstash Aggregation Filter settings are exposed in case you wish to use different values. (See comments in the *-aggregation.conf file.) The aggregation filter stitches together long-lasting flows that are seen in multiple nfcapd files, matching by the 5-tuple (source and destination IPs, ports, and protocol) plus sensor name.

aggregation_maps_path: the name of the file to which logstash will write in-progress aggregation data when logstash shuts down. When logstash starts up again, it will read this file in and resume aggregating. The filename is configurable for complex situations, but /data/ is required.

inactivity_timeout: If more than inactivity_timeout seconds have passed between the start of a flow and the start of the last matching flow, or if no matching flow has come in for inactivity_timeout seconds of clock time, assume the flow has ended.

note

Nfcapd files are typically written every 5 minutes. Netsage uses inactivity_timeout = 630 sec (10.5 min) for 5-minute files and 960 sec (16 min) for 15-minute files. (For 5-minute files, this allows one 5-minute gap, or one period during which the number of bits transferred doesn't meet the cutoff.)

max_flow_timeout: If a long-lasting flow is still aggregating when this timeout is reached, arbitrarily cut it off and start a new flow. The default is 24 hours.
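Pulling those settings together, the relevant .env lines might look like this. The variable names, the file name, and the 24-hour value in seconds are illustrative; check your env.example for the exact spellings and defaults.

```
aggregation_maps_path=/data/aggregation-maps-file
inactivity_timeout=630
max_flow_timeout=86400
```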

5. Choose Pipeline Version#

Once you've created the docker-compose.override.yml file and finished adjusting it for any customizations, you're ready to select which version Docker should run.

./scripts/docker_select_version.sh

When prompted, select the same version you checked out earlier. This script will replace the version numbers of docker images in the docker-compose files with the correct values.

Running the Collectors#

After selecting the version to run, you can start the two flow collectors by themselves by running the following line. If you only need one of the collectors, remove the other from this command.

(Or see the next section for how to start all the containers, including the collectors.)

docker-compose up -d sflow-collector netflow-collector

If the collector(s) are running properly, you should see nfcapd files in subdirectories of data/input_data/, and they should have sizes of more than a few hundred bytes. (See Troubleshooting if you have problems.)
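A quick way to spot the problem files is to search for any nfcapd file under the size cutoff. The paths and file sizes below are mocked for illustration; in practice, point find at data/input_data/ on the pipeline host.

```shell
# Mock up a data directory with one too-small and one healthy-looking file
cd "$(mktemp -d)"
mkdir -p data/input_data/sflow
head -c 200  /dev/zero > data/input_data/sflow/nfcapd.202401010000   # too small
head -c 4096 /dev/zero > data/input_data/sflow/nfcapd.202401010005   # looks healthy
# List nfcapd files smaller than 500 bytes (likely an idle or misconfigured collector)
find data/input_data -name 'nfcapd.*' -size -500c
```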

Running the Collectors and Pipeline#

Start up the pipeline (all containers) using:

docker-compose up -d

This will also restart any containers/processes that have died. "-d" runs containers in the background.

You can see the status of the containers and whether any have died (exited) using

docker-compose ps

To check the logs for each of the containers, run

docker-compose logs

Add -f or, e.g., -f logstash to see new log messages as they arrive. --timestamps, --tail, and --since are also useful -- look up details in Docker documentation.

To shut down the pipeline (all containers) use

docker-compose down

Upgrading#

Shut things down

cd {netsage-pipeline directory}
docker-compose down

This will stop all the docker containers, including the importer, logstash, and any collectors. Note that incoming flow data will not be saved during the time the collectors are down.

Update Source Code

To upgrade to a new release, just reset and pull changes including the new release from github. Your customized .env and override files will not be overwritten.

git reset --hard
git pull origin master
warning

git reset --hard will obliterate any changes you have made to non-override files. If necessary, please make sure you commit and save to a feature branch before continuing.

Example: git commit -a -m "Saving local state"; git checkout -b feature/backup; git checkout master

Check/Update Files

  • Compare the new docker-compose.override_example.yml file to your docker-compose.override.yml to see if a new version of Docker is required. Look for, e.g., version: "3.7" at the top. If the version number is different, change it in your docker-compose.override.yml file and upgrade Docker manually.

  • In the same files, see if the version of nfdump has changed. Look for lines like "image: netsage/nfdump-collector:1.6.18". If there has been a change, update the version in the override file. (You do not need to actually perform any update yourself.) Note that you do not need to update the versions of the importer or logstash images. That will be done for you in the "select release version" step coming up.

  • Also compare your .env file with the new env.example file to see if any new lines or sections have been added. Copy new lines into your .env file, making any appropriate changes to example values.

  • If you used the Docker Advanced guide to make a netsage_override.xml file, compare it to netsage_shared.xml to see if there are any changes. This is unlikely.
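For the .env comparison above, comm can list the variables that exist in the new example file but not in your .env. This sketch uses mocked files; run the last command against the repo's env.example and your .env.

```shell
# Mocked files for illustration only
cd "$(mktemp -d)"
printf 'sflowSensorName=x\nrabbit_output_host=rabbit\n' > .env
printf 'sflowSensorName=\nrabbit_output_host=rabbit\nnew_setting=default\n' > env.example
# Print variable names present in env.example but missing from .env
comm -13 <(cut -d= -f1 .env | sort) <(cut -d= -f1 env.example | sort)
```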

Select Release Version

Run these commands to select the new release you want to run. In the first, replace "tag_name" with the version to run (e.g., v1.2.8). When asked by the script, select the same version as the tag you checked out.

git checkout tag_name
./scripts/docker_select_version.sh

Check to be sure docker-compose.yml and docker-compose.override.yml both now have the version number you selected.

Update Docker Containers

Do not forget this step! This applies for both development and release versions.

docker-compose pull

Restart Docker Containers

docker-compose up -d

This will start all the services/containers listed in the docker-compose.yml and docker-compose.override.yml files, including the importer, logstash pipeline, and collectors.

Delete old images

To save space, delete any old images that are not needed.
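One way to do this, assuming nothing else on the host needs the old images, is Docker's built-in prune (this requires a running Docker daemon, so it is shown here without a runnable demo):

```shell
# Remove all images not used by at least one container; prompts before deleting
docker image prune -a
```

Run docker images first if you want to review what is installed before pruning.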