# The NetSage Pipeline
## Description

The NetSage Flow Processing Pipeline includes several components for processing network flow data, covering importing, deidentification, metadata tagging, flow stitching, and more.
## Components

The Pipeline is currently made up of the following components:
- Importer (a collection of Perl scripts)
- Elastic Logstash, which performs a variety of transformations on the data
- RabbitMQ, used for message passing and queuing of tasks (see the sketch below)
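To make the RabbitMQ role concrete, here is a minimal sketch of one component handing a flow record to the next stage. The queue name netsage_deidentifier_raw comes from this document; the host, the pika client library, and the record fields are illustrative assumptions, and the real components are Perl scripts and Logstash, not Python.

```python
import json

import pika  # assumption: the pika client is used here only for illustration

# Connect to the RabbitMQ broker (host and credentials are placeholders).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# The raw queue named in this document; durable so messages survive broker restarts.
channel.queue_declare(queue="netsage_deidentifier_raw", durable=True)

# A hypothetical flow record; the real field names may differ.
flow = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "bytes": 123456}

channel.basic_publish(
    exchange="",                            # default exchange routes by queue name
    routing_key="netsage_deidentifier_raw",
    body=json.dumps(flow),
    properties=pika.BasicProperties(delivery_mode=2),  # mark message persistent
)
connection.close()
```

Downstream, Logstash would consume from such a queue (for example via its rabbitmq input plugin) and apply the tagging and stitching transformations.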
## Sensors and Data Collection

"Testpoints" or "sensors" collect flow data (tstat, sflow, or netflow) and send it to a "pipeline host" for processing (for GlobalNOC, flow-proc.bldc.grnoc.iu.edu or netsage-probe1.grnoc.iu.edu).

Tstat data goes directly into the netsage_deidentifier_raw RabbitMQ queue. The other data is written to nfcapd files.
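For the sflow/netflow path, nfcapd writes time-stamped capture files on a fixed rotation; by nfdump convention (an assumption here, not something this page specifies) they are named nfcapd.YYYYMMDDhhmm. A small sketch of recovering the rotation time from such a file name:

```python
from datetime import datetime
from pathlib import Path

def nfcapd_timestamp(path: Path) -> datetime:
    """Parse the rotation timestamp from a conventional nfcapd file name,
    e.g. 'nfcapd.202403151205' -> 2024-03-15 12:05."""
    stamp = path.name.split(".", 1)[1]
    return datetime.strptime(stamp, "%Y%m%d%H%M")

print(nfcapd_timestamp(Path("nfcapd.202403151205")))  # 2024-03-15 12:05:00
```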
## Importer

The netsage-netflow-importer-daemon reads any new nfcapd files that have come in, after a configurable delay. The importer aggregates the flows within each file and writes the results to the netsage_deidentifier_raw RabbitMQ queue.
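The following Python sketch illustrates that loop under stated assumptions: the watch directory, the delay value, and the read_flows() helper are hypothetical placeholders (the real importer is a Perl daemon, and decoding nfcapd's binary format would require nfdump or an equivalent), while the read-after-delay, aggregate, then publish shape and the queue name follow this section.

```python
import json
import time
from collections import defaultdict
from pathlib import Path

import pika  # assumption: pika used only for illustration

WATCH_DIR = Path("/data/input_data")  # hypothetical; the real path is configurable
DELAY_SECONDS = 900                   # placeholder for the configurable delay
QUEUE = "netsage_deidentifier_raw"    # queue name from this document

def ready_files(seen: set) -> list:
    """Unprocessed nfcapd files whose mtime is older than the delay."""
    cutoff = time.time() - DELAY_SECONDS
    return [p for p in sorted(WATCH_DIR.glob("nfcapd.*"))
            if p not in seen and p.stat().st_mtime < cutoff]

def read_flows(path: Path):
    """Hypothetical helper: decode one nfcapd file into flow dicts
    (in practice this requires nfdump or an equivalent decoder)."""
    raise NotImplementedError

def aggregate(flows):
    """Sum bytes and packets for records sharing a 5-tuple within one file."""
    totals = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for f in flows:
        key = (f["src_ip"], f["dst_ip"], f["src_port"], f["dst_port"], f["protocol"])
        totals[key]["bytes"] += f["bytes"]
        totals[key]["packets"] += f["packets"]
    return totals

channel = pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()
channel.queue_declare(queue=QUEUE, durable=True)

seen: set = set()
for path in ready_files(seen):
    for key, sums in aggregate(read_flows(path)).items():
        record = dict(zip(("src_ip", "dst_ip", "src_port", "dst_port", "protocol"), key))
        record.update(sums)
        channel.basic_publish(exchange="", routing_key=QUEUE, body=json.dumps(record))
    seen.add(path)
```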