Advanced Feature Dashboard

This chapter describes the Latency Differ and Drop Differ dashboards for the DMF Analytics Node, which track NetFlow records. It includes the following section:

Latency Differ and Drop Differ Dashboard

The DANZ Monitoring Fabric (DMF) Latency Differ Dashboard and Drop Differ Dashboard feature provides a near real-time visual representation of latency and drops in the DMF Analytics Node (AN) dedicated to NetFlow Records.

For a given flow, it reports the latency and drop of packets over time between two existing tap points (A, B), with network flows traversing the managed network from A towards B.

This feature introduces the concept of DiffPair, defined as a flow from A towards B.

The dashboards provide clear, concise information about the flows. The data helps determine which applications are running slowly and identifies peak times. A configurable mechanism alerts on abnormal drops and latency.

Introducing DiffPair

When identifying the flows between two tap points or filter interfaces, the data is aggregated as A towards B pairs: a flow originating at point A is received at point B. The term DiffPair refers to this flow as a cohesive set. This newly introduced field in the flow data captures the ingress and egress tap points that bound a flow, and it facilitates tap point filtering and comparison.

Note: It is important to verify the accuracy of the DiffPair data flowing between the tap points when comparing source data to the destination data.

Latency Differ Dashboard

Locate the Latency Differ dashboard by searching for the term Latency Differ.

The dashboard combines a visual representation of NetFlow Latency data in two views. The upper view displays individual flows, while the lower view aggregates A towards B pairs (A > B) or DiffPair.
Figure 1. Latency Differ Dashboard
The following widgets appear in the Latency Differ dashboard:
  • Latency Records By Flows: The pie chart represents the proportions of summed latency per flow. The inner circle displays source IP addresses, the middle circle displays destination IP addresses, and the outermost circle displays destination ports.
  • Latency over time By Flows: The line chart represents the maximum Latency in nanoseconds (ns) over time, split by each flow between source IP and destination IP addresses.
  • Observation Point Selector (A > B or DiffPair): Use the drop-down menus to filter by A > B pair or DiffPair. The point B selector is dependent on point A.
  • Top Latencies By A > B Pair: The pie chart shows the maximum latency summed by A > B points. The inner circle displays the source A tap point, while the outer circle displays the B destination tap point.
  • Latency over time By A > B Pair: The line chart represents maximum Latency in nanoseconds (ns) over time, split by each A > B pair between the source tap point and destination tap point.
    Figure 2. Latency Record by Flows
Select the A > B or DiffPair selection to visualize the data types. Filter the data using A > B points by selecting a single source (A) and one or more receivers (B).
Figure 3. Flow Record with Observation Point Selector
Figure 4. Latency between Points
The dashboard displays the latency between points A and B(s), separated by flows between the points in the upper view or filtered by the A > B pairs in the lower view. The diff records appear on the lower dashboard.
Figure 5. Diff Record over Time

Select individual data points in the visualization for further analysis.

Change the visualization perspective by selecting one or more DiffPairs for analysis.
Figure 6. DiffPair Analysis
Figure 7. Another DiffPair Analysis

Drop Differ Dashboard

Locate the Drop Differ dashboard by searching for the term Drop Differ.

The dashboard combines a visual representation of NetFlow drop data in two views. The upper view displays individual flows, while the lower view aggregates A towards B pairs (A > B) or DiffPair.
Figure 8. Drop Differ Dashboard
The following widgets appear in the Drop Differ dashboard:
  • Drop Records By Flows: The pie chart represents the proportions of dropped packets for each flow, summed. The inner circle displays source IP addresses, the middle circle displays destination IP addresses, and the outermost circle displays destination ports.
  • Max Drops By Flows: The line chart represents the maximum number of dropped packets, separated by each flow between source IP and destination IP addresses. If fewer data points exist, the chart displays them as individual points instead of complete lines.
  • Observation Point Selector (A>B or DiffPair): Use the drop-down menus to filter by A > B pair or DiffPair. The point B selector is dependent on point A.
  • Top Drop A>B: The heat map displays dropped packets summed by A > B points. The map plots the source tap point (A) on the vertical axis and the destination tap point (B) on the horizontal axis.
  • Top Dropping A>B Pairs: The bar chart represents the sum of dropped packets over time, separated by each A > B pair between the source tap point and the destination tap point. It shows the top 10 dropping A > B pairs.
    Figure 9. Top Dropping A>B Pairs

Select the A > B or DiffPair selection to visualize the data types.

Filter the data using A > B Points by selecting a single source (A) and one or more receivers (B).
Figure 10. Data Types Visualization
Figure 11. Single Source Data
  • This provides a dashboard for packet drops between points A and B(s), either split by flows in between those points (Top) or filtered by A > B pairs (bottom) as selected. View the diff records at the bottom of the dashboard.
  • Select individual data points in the visualization for further analysis.
  • Selecting DiffPairs can provide a similar visualization perspective. Choose one or more DiffPairs for analysis.
    Figure 12. DiffPair Analysis for Drop Differ

Configuring Watcher Alerts

Watcher is an Elasticsearch feature that supports the creation of alerts based on conditions checked at set intervals. For more information, refer to: Watcher | Kibana Guide [7.17] | Elastic

The AN includes two built-in example Watcher templates for ease of use. To access the templates, navigate to Stack Management > Watcher.
  • Arista_NetOps_Drop_Differ_Watch
  • arista_NetOps_Latency_Differ_Watch

The templates are disabled by default and require manual configuration before use.

Setting the SMTP Connector
The system dispatches Alerts by email; configure the SMTPForAlerts Connector before use.
  1. Navigate to Stack Management > Connector.
  2. Under Configuration for the SMTPForAlerts Connector, specify the Sender and Service field values.
  3. Sending email alerts may require authentication based on the type of mail service selected.
  4. Test and validate the settings using the Test tab.
    Figure 13. Testing SMTP Connector
Setting the Watchers
  • arista_NetOps_Drop_Differ_Watch:
    1. The watcher sends an alert when the maximum packet drop count in NetFlow during the last 5-minute interval exceeds the historical average (last 7-day average) of dropped packets by a threshold percentage.
    2. By default, this watcher triggers every 10 minutes.
    3. Because comparing all flows combined can be misleading, configure it for a particular Flow and Destination Port.
    4. Search for CHANGE_ME in the watcher and specify the flow and destination port values (introduced to compare each flow and destination port individually instead of comparing all flows together).
    5. Specify the percentage_increase parameter in the condition using a positive value between 0 and 100 (see the condition sketch after these steps).
    6. Enter the recipient's email address for the alert.
    7. Select Save watch.
      Figure 14. NetOps_Drop_Differ_Watch-1
      Figure 15. NetOps_Drop_Differ_Watch-2
      Figure 16. Editing NetOps_Drop_Differ_Watch
  • arista_NetOps_Latency_Differ_Watch:
    1. The watcher sends an alert when NetFlow's maximum latency (or lag) during the last 5-minute interval exceeds the historical average (last 7-day average) latency by a threshold percentage.
    2. By default, this watcher triggers every 10 minutes.
    3. Because comparing all flows combined can be misleading, configure it for a particular Flow and Destination Port.
    4. Search for CHANGE_ME in the watcher and specify the flow and destination port values (introduced to compare each flow and destination port individually instead of comparing all flows together).
    5. Specify the percentage_increase parameter in the condition using a positive value between 0 and 100 (see the condition sketch after these steps).
    6. Enter the recipient's email address for the alert.
    7. Select Save watch.
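Both built-in watchers follow the same pattern: compare the most recent 5-minute value against the 7-day average and alert when the increase exceeds percentage_increase. The exact query and aggregation structure is defined by the shipped templates; the fragment below is only a minimal, hypothetical sketch of how such a comparison can be expressed in a watch condition, with assumed aggregation names (current_max, weekly_avg) and an example threshold of 20 percent:
    "metadata": {
      "percentage_increase": 20
    },
    "condition": {
      "script": {
        "lang": "painless",
        "source": "double avg = ctx.payload.aggregations.weekly_avg.value; double cur = ctx.payload.aggregations.current_max.value; return avg > 0 && cur > avg * (1 + ctx.metadata.percentage_increase / 100.0);"
      }
    }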

Considerations

  • Default Watchers are disabled and must be modified with user-configured alert settings before being enabled.
    Figure 17. Arista_NetOps_Drop_Differ_Watch

Troubleshooting

  • The dashboard obtains its data from the flow-netflow index. If no data is present in the dashboard, verify there is sufficient relevant data in the index.
  • Watchers trigger at a set interval. To troubleshoot issues related to watchers, navigate to Stack Management > Watcher. Select the requisite watcher and navigate to Action statuses to determine whether there is an issue with the last trigger; the same status is available through the Watcher API, as shown after this list.
    Figure 18. Watcher Action Status
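The same action status is available through the Watcher API. For example, run the following request in Dev Tools > Console (shown with one of the built-in watcher IDs); the status section of the response reports when the watch last triggered and the state of each action:
    GET _watcher/watch/arista_NetOps_Drop_Differ_Watch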

Usage Notes

  • The dashboards show only partial drops, not full drops, during a given time; they are configured with a filter on the egress.Tap value so that records with an empty egress.Tap value (full drops) are excluded.
  • A full drop occurs when the flow of packets is observed at the source tap point, but no packet is observed at the destination tap point. The dashboards are configured to filter out full drop flows.
  • A partial drop is a scenario in which the flow of packets is observed at the source tap point, and some, but not all, packets are observed at the destination tap point. The dashboards clearly show partial drop flows.
 

Creating Watcher Alerts for Machine Learning jobs

The following appendix describes the procedure for creating Watcher alerts for machine learning jobs, emails, and remote Syslog servers.

Watcher Alert Workaround

DMF 8.1 uses Elasticsearch 7.2.0, where the inter-container functional calls are HTTP-based. However, DMF 8.3 uses Elasticsearch version 7.13.0, which requires HTTPS-based calls. This requires an extensive change in the system calls used by the Analytics Node (AN), and engineering is working on this effort. Arista recommends the following workaround until the fix is released.
Workaround Summary:
  • Create a Watcher manually using the provided template.
  • Configure the Watcher to select the job ID for the ML job that needs to send alerts.
  • Use ‘webhook’ as the alerting mechanism within the Watcher to send alerts to 3rd party tools like ‘Slack.’
  1. Access the AN's ML job page and click Manage Jobs to list the ML jobs.
  2. If the data feed column shows as stopped, skip to Step 3. If it says started, click the 3 dots for a particular ML job and Stop the data feed for the current ML job.
    Figure 1. Stop Data Feed
  3. After the data feed has stopped, click the 3 dots and start the data feed.
    Figure 2. Start Data Feed
  4. Select the options as shown in the diagram below.
    Figure 3. Job Time Options
  5. Confirm that the data feed has started. Note down the job ID of this ML job.
    Figure 4. ML Job Characteristics
  6. Access the Watchers page.
    Figure 5. Access Watchers
  7. Create an advanced Watcher.
    Figure 6. Create Advanced Watcher
  8. Configure the name of the Watcher (can include whitespace characters), e.g., Latency ML.
  9. Configure the ID of the Watcher (can be alphanumeric, but without whitespace characters), e.g., ml_latency.
  10. Delete the code from the Watch JSON section.
  11. Copy and paste the following code into the Watcher. Replace the highlighted text according to your environment and your ML job parameters.
    {
      "trigger": {
        "schedule": {
          "interval": "107s"
        }
      },
      "input": {
        "search": {
          "request": {
            "search_type": "query_then_fetch",
            "indices": [
              ".ml-anomalies-*"
            ],
            "rest_total_hits_as_int": true,
            "body": {
              "size": 0,
              "query": {
                "bool": {
                  "filter": [
                    {
                      "term": {
                        "job_id": "<use the id of the ML job retrieved in step 6.>"
                      }
                    },
                    {
                      "range": {
                        "timestamp": {
                          "gte": "now-30m"
                        }
                      }
                    },
                    {
                      "terms": {
                        "result_type": [
                          "bucket",
                          "record",
                          "influencer"
                        ]
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "bucket_results": {
                  "filter": {
                    "range": {
                      "anomaly_score": {
                        "gte": 75
                      }
                    }
                  },
                  "aggs": {
                    "top_bucket_hits": {
                      "top_hits": {
                        "sort": [
                          {
                            "anomaly_score": {
                              "order": "desc"
                            }
                          }
                        ],
                        "_source": {
                          "includes": [
                            "job_id",
                            "result_type",
                            "timestamp",
                            "anomaly_score",
                            "is_interim"
                          ]
                        },
                        "size": 1,
                        "script_fields": {
                          "start": {
                            "script": {
                              "lang": "painless",
                              "source": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].value.getMillis()-((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0,ZoneOffset.UTC).toString()+\":00.000Z\"",
                              "params": {
                                "padding": 10
                              }
                            }
                          },
                          "end": {
                            "script": {
                              "lang": "painless",
                              "source": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].value.getMillis()+((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0,ZoneOffset.UTC).toString()+\":00.000Z\"",
                              "params": {
                                "padding": 10
                              }
                            }
                          },
                          "timestamp_epoch": {
                            "script": {
                              "lang": "painless",
                              "source": """doc["timestamp"].value.getMillis()/1000"""
                            }
                          },
                          "timestamp_iso8601": {
                            "script": {
                              "lang": "painless",
                              "source": """doc["timestamp"].value"""
                            }
                          },
                          "score": {
                            "script": {
                              "lang": "painless",
                              "source": """Math.round(doc["anomaly_score"].value)"""
                            }
                          }
                        }
                      }
                    }
                  }
                },
                "influencer_results": {
                  "filter": {
                    "range": {
                      "influencer_score": {
                        "gte": 3
                      }
                    }
                  },
                  "aggs": {
                    "top_influencer_hits": {
                      "top_hits": {
                        "sort": [
                          {
                            "influencer_score": {
                              "order": "desc"
                            }
                          }
                        ],
                        "_source": {
                          "includes": [
                            "result_type",
                            "timestamp",
                            "influencer_field_name",
                            "influencer_field_value",
                            "influencer_score",
                            "isInterim"
                          ]
                        },
                        "size": 3,
                        "script_fields": {
                          "score": {
                            "script": {
                              "lang": "painless",
                              "source": """Math.round(doc["influencer_score"].value)"""
                            }
                          }
                        }
                      }
                    }
                  }
                },
                "record_results": {
                  "filter": {
                    "range": {
                      "record_score": {
                        "gte": 75
                      }
                    }
                  },
                  "aggs": {
                    "top_record_hits": {
                      "top_hits": {
                        "sort": [
                          {
                            "record_score": {
                              "order": "desc"
                            }
                          }
                        ],
                        "_source": {
                          "includes": [
                            "result_type",
                            "timestamp",
                            "record_score",
                            "is_interim",
                            "function",
                            "field_name",
                            "by_field_value",
                            "over_field_value",
                            "partition_field_value"
                          ]
                        },
                        "size": 3,
                        "script_fields": {
                          "score": {
                            "script": {
                              "lang": "painless",
                              "source": """Math.round(doc["record_score"].value)"""
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      "condition": {
        "compare": {
          "ctx.payload.aggregations.bucket_results.doc_count": {
            "gt": 0
          }
        }
      },
      "actions": {
        "log": {
          "logging": {
            "level": "info",
            "text": "Alert for job [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0._source.job_id}}] at [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.timestamp_iso8601.0}}] score [{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.score.0}}]"
          }
        },
        "my_webhook": {
          "webhook": {
            "scheme": "https",
            "host": "hooks.slack.com",
            "port": 443,
            "method": "post",
            "path": "<path for slack>",
            "params": {},
            "headers": {
              "Content-Type": "application/json"
            },
            "body": """{"channel": "#<slack channel name>", "username": "webhookbot", "text":"Alert for job [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0._source.job_id}}] at [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.timestamp_iso8601.0}}] score [{{ctx.payload.aggregations.bucket_results.top_bucket_hits.hits.hits.0.fields.score.0}}]", "icon_emoji": ":exclamation:"}"""
          }
        }
      }
    }
    
  12. Click Create Watch to create the Watcher.

Email Alerts and Remote Syslog Server

Previously, sending Watcher alerts by email required editing configuration files on the command line and restarting the Elasticsearch container.

An update to the Watcher alerts feature provides a simpler configuration method using the Analytics Node UI and supports sending Watcher alerts to remote Syslog servers.

Configuring a Kibana Email Connector

Select an existing Kibana email connector to send email alerts or create a connector by navigating to Stack Management > Rules and Connectors > Connectors > Create Connectors. Complete the following steps:

Figure 7. Rules and Connectors
  1. Configure the fields in the Configuration tab.
  2. Verify the connector works in the Test tab.
    Figure 8. Editing Connector
    Figure 9. Editing Connector to create action

Configuring a Watch

Configure a Watch using the Create threshold alert or Create advanced watch option, described in the following instructions.
Figure 10. Watcher

Create Threshold Alert

  1. Navigate to Stack Management > Watcher > Create > Create threshold alert and configure the alert conditions.
    Figure 11. Creating threshold alert
  2. Add a webhook action with the following fields.
    • Method: POST
    • Scheme: HTTP
    • Host: 169.254.16.1
    • Port: 8000
    • Specify the Body field as follows (an example body appears after these steps):
      • Sending Watcher alerts by email: Enter the required fields: to, subject, message, and kibana_email_connector. Multiple entries in the to field require a comma-separated list of email addresses wrapped in quotes. The kibana_email_connector field references an existing Kibana email connector.
      • Sending Watcher alerts to a remote Syslog server: Enter the required fields: message, protocol, primary_syslog_ip, and primary_syslog_port. If a second Syslog server should receive alerts, include backup_syslog_ip and backup_syslog_port.
        Figure 12. Performing Action for Webhook
    • The Path, Username, and Password fields do not need to be specified.
  3. Test the webhook action using Send Request before selecting Create alert. Depending on the configuration:
    • Verify the receipt of an email at the configured recipient address.
    • Verify the receipt of a syslog message on the remote Syslog server.
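For reference, the following is a hedged example of a webhook Body that combines the email fields with the primary Syslog fields. The connector name matches the SMTPForAlerts connector described earlier; the recipient addresses, Syslog IP address, and port are placeholders for your environment:
    {
      "message": "Watcher alert: the configured condition was met",
      "kibana_email_connector": "SMTPForAlerts",
      "to": "ops-team@example.com, noc@example.com",
      "subject": "DMF Analytics alert",
      "protocol": "UDP",
      "primary_syslog_ip": "10.0.0.50",
      "primary_syslog_port": 514
    }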

Create Advanced Watch

  1. Navigate to Stack Management > Watcher > Create > Create advanced watch and fill out the Name and ID of the Watch.
    Figure 13. Editing Advanced Watch
  2. For the Watch JSON field, the following JSON template configures the forwarding of alerts to email and remote Syslog servers. Configure the alert condition under the input and condition fields. Replace these values with any custom alert condition using the Elastic Painless scripting language. The configuration for forwarding alerts to email and remote Syslog servers is under the actions field.
    {
      "trigger": {
        "schedule": {
          "interval": "1m"
        }
      },
      "input": {
        "http": {
          "request": {
            "scheme": "https",
            "host": "<host>",
            "port": 443,
            "method": "get",
            "path": "/_cluster/health",
            "params": {},
            "headers": {
              "Content-Type": "application/json"
            },
            "auth": {
              "basic": {
                "username": "<user>",
                "password": "<password>"
              }
            }
          }
        }
      },
      "condition": {
        "script": {
          "source": "ctx.payload.status == 'green'",
          "lang": "painless"
        }
      },
      "actions": {
        "webhook_1": {
          "webhook": {
            "host": "169.254.16.1",
            "port": 8000,
            "method": "post",
            "scheme": "http",
            "body": "{\"message\": \"The Elasticsearch cluster status is {{ctx.payload.status}}\",  \"kibana_email_connector\": \"<existing-email-connector>\",  \"to\": \"This email address is being protected from spambots. You need JavaScript enabled to view it.\",  \"subject\": \"Elasticsearch cluster status alert\", \"protocol\": \"UDP\", \"primary_syslog_ip\": \"<remote-syslog-ip>\", \"primary_syslog_port\": <remote-syslog-port>, \"backup_syslog_ip\": \"<remote-syslog-ip>\", \"backup_syslog_port\": <remote-syslog-port>}"
          }
        }
      }
    }
  3. (Optional) To simulate the Watch, you can configure the fields in the Simulate Tab. The webhook action mode must be set to force_execute.
    Figure 14. Simulating Advanced Watch

Troubleshooting

  • If the email alert fails, verify that the value of the kibana_email_connector field matches the name of a Kibana email connector and that this email connector works in the Test tab.

Limitations

  • Remote Syslog messages require UDP. TCP is not supported currently.

Enabling Secure Email Alerts through SMTP Setting

Refresh the page to view the updated SMTP Settings fields.

The following is an example of the UI SMTP Settings in previous releases:
Figure 15. SMTP Setting
After upgrading the Analytics Node from an earlier version to the DMF 8.6.* version, the following changes apply:
  • Server Name, User, Password, Sender, and Timezone no longer appear in the SMTP Settings.
  • A new field, Kibana Email Connector Name, has been added to SMTP Settings.
  • The system retains Recipients and Dedupe Interval and their respective values in SMTP Settings.
  • If previously configured SMTP settings exist:
    • The system automatically creates a Kibana email connector named SMTPForAlerts using the values previously specified in the fields Server Name, User (optional), Password (optional), and Sender.
    • The Kibana Email Connector Name field automatically becomes SMTPForAlerts.
The following settings appear in the UI after the upgrade to the DMF 8.6.* version:
Figure 16. Upgraded SMTP Setting

Troubleshooting

When Apply & Test does not send an email to the designated recipients, verify that the recipient email addresses are comma-separated and spelled correctly. If it still does not work, verify that the designated Kibana email connector matches the name of an existing Kibana email connector. Test that connector by navigating to Stack Management > Rules and Connectors > Connectors, selecting the connector's name, and sending a test email in the Test tab.

Using TACACS+ and RADIUS to Control Access to the Arista Analytics CLI

This appendix describes using TACACS+ and RADIUS servers to control administrative access to the Analytics Node.

Using AAA Services with Arista Analytics

Use remote Authentication, Authorization, and Accounting (AAA) services with TACACS+ or RADIUS servers to control administrative access to the Analytics Node CLI.

The following table lists the accepted Attribute-Value (AV) pairs:
 
Attribute: BSN-User-Role
Values: admin, read-only, bigtap-admin, bigtap-read-only

Note: The remotely authenticated admin and bigtap-admin users and the read-only and bigtap-read-only users have the same privileges. The bigtap-admin and bigtap-read-only values are supported to create BMF-specific entries without affecting the admin and read-only TACACS+ server entries.
You must also create a role in Elasticsearch with the same name as the group configured in the CLI.
Figure 1. Creating a Group in Elasticsearch

A remotely authenticated admin user has full administrative privileges. Read-only users on the switch must be remotely authenticated. Read-only access is not configurable for locally authenticated user accounts.

Read-only users can only access login mode, from which they can view most show commands, with some limitations, including the following:
  • TACACS, SNMP, and user configuration are not visible to the read-only user in the output from the show running-config command.
  • show snmp, show user, and show support commands are disabled for the read-only user.
    Note: Local authentication and authorization take precedence over remote authentication and authorization.
Privileges at the remote TACACS+ server must be configured using the following attribute-value pairs:
  • Supported attribute name: BSN-User-Role
  • Supported attribute values: admin, read-only

Use a TACACS+ server to maintain administrative access control instead of the Analytics Node local database; however, it is a best practice to keep the local database as the secondary authentication and authorization method in case the remote server becomes unavailable.

DMF TACACS+ Configuration

This section describes the configuration required on TACACS+ servers and on the DANZ Monitoring Fabric (DMF) Analytics Node.

Authentication Method

  • Configure the TACACS+ server to accept ASCII authentication packets. Do not use the single connect only protocol feature.
  • The DMF TACACS+ client uses the ASCII authentication method. It does not use PAP.

Device Administration

  • Configure the TACACS+ server to connect to the device administration login service.
  • Do not use a network access connection method, such as PPP.

Group Memberships

  • Create a bigtap-admin group. Make all DANZ Monitoring Fabric users part of this group.
  • TACACS+ group membership is specified using the BSN-User-Role AV Pair as part of TACACS+ session authorization.
  • Configure the TACACS+ server for session authorization, not for command authorization.
    Note: The BSN-User-Role attribute must be specified as Optional in the tac_plus.conf file to use the same user credentials to access ANET and non-ANET devices.

Enabling Remote Authentication and Authorization on the Analytics Node

Use the following commands to configure remote login authentication and authorization. The examples use the SSH default for connection type.
analytics-1# tacacs server host 10.2.3.201
analytics-1# aaa authentication login default group tacacs+ local
analytics-1# aaa authorization exec default group tacacs+ local

All users in the bigtap-admin group on TACACS+ server 10.2.3.201 have full access to the Arista Analytics Node.

User Lockout

Use the following command to lock out an AAA user after a specified number of failed login attempts.
(config)# aaa authentication policy lockout failure F window W duration D
F = max-failures [1..255], W = window [1..(2^32 - 1)], D = duration [1..(2^32 - 1)]
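For example, the following command (with hypothetical values) locks an account for 900 seconds after 5 failed attempts within a 300-second window:
(config)# aaa authentication policy lockout failure 5 window 300 duration 900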

Adding a TACACS+ Server

To view the current TACACS+ configuration, enter the show running-config command, as in the following example:
analytics-1(config-switch)# show run switch BMF-DELIVERY-SWITCH-1 tacacs override-enabled
tacacs server host 1.1.1.1 key 7 020700560208
tacacs server key 7 020700560208
analytics-1(config-switch)#

The output displays the TACACS+ key value as a type 7 secret instead of plaintext.

Complete the following steps to configure the Analytics Node with TACACS+ to control administrative access to the switch.

Identify the IP address of the TACACS+ server and any key required for access using the tacacs server command, which has the following syntax:
tacacs server <server> [key {<plaintext-key> | 0 <plaintext-key> | 7 <encrypted-key>}]
You can enable up to four AAA servers by repeating this command for each server. For example, using a plaintext key, the following command enables TACACS+ with the server running at 10.1.1.1.
analytics -1(config-switch)# tacacs server 10.1.1.1 key 0 secret

If no key is specified, an empty key is used.

Note: Do not use the pound character (#) in the TACACS secret. It is the start of a comment in the PAM config file.

Each TACACS+ server connection can be encrypted using a pre-shared key.

To specify a key for a specific host, use one of the following commands:
analytics-1# tacacs server host <ip-address> key <plaintextkey>
analytics-1# tacacs server host <ip-address> key 0 <plaintextkey>
analytics-1# tacacs server host <ip-address> key 7 <encryptedkey>

Replace plaintextkey with a password up to 63 characters in length. This key can be specified either globally or for each host. The first two forms accept a plaintext (literal) key, and the last form accepts a pseudo-encrypted key, such as that displayed with show running-config.

The global key value is used when no key is specified for a given host. An empty key is assumed when no key is specified either globally or for the host.

The following example uses the key 7 option followed by the encrypted string:
analytics-1(config-switch)# tacacs server 10.1.1.1 key 7 0832494d1b1c11
Note: Be careful while configuring TACACS+ to avoid disabling access to the Analytics Node.

Setting up a TACACS+ Server

Refer to your AAA server documentation for further details or instructions on setting up other servers.

After installing the TACACS+ server, complete the following steps to set up authentication and authorization for Analytics Node with the TACACS+ server:

  1. Configure users and groups.
  2. In the /etc/tacacs/tac_plus.conf file, specify the user credentials and group association.
    # user details
    user = user1 {
    member = anet-vsa-admin
    login = des a9qtD2JXeK0Sk
    }
  3. Configure the groups to use one of the AV pairs supported by the Analytics Node (for example, BSN-User-Role=admin for admin users).
    # group details
    # ANET admin group
    group = anet-vsa-admin {
        service = exec {
        BSN-User-Role="admin"
        }
    }
    # BSN read-only group
    group = anet-vsa-read-only {
        service = exec {
        BSN-User-Role="read-only"
        }
    }
  4. Configure the TACACS+ server and AAA on the Analytics Node.
    tacacs server host <IP address> key <server's secret>
    aaa authentication login default group tacacs+ local
    aaa authorization exec default group tacacs+ local
    aaa accounting exec default start-stop local group tacacs+
This configuration sets authentication and authorization to first connect to the TACACS+ server to verify user credentials and privileges. It checks the user account locally only when the remote server is unreachable. In this example, accounting stores audit logs locally and sends them to the remote server.

Using the Same Credentials for the Analytics Node and Other Devices

The BSN-User-Role attribute must be specified as Optional in the tac_plus.conf file to use the same user credentials to access the Analytics Node and other devices, as shown in the following example.
group = group-admin {
default service = permit
service = exec {
optional BSN-User-Role = "admin"
}
}

RBAC-based Configuration for Non-default Group User

To create an RBAC configuration for a user in a non-default group, complete the following steps:
  1. Create a group AD1.
    group AD1

    Do not associate with any local users.

  2. Use the same group name on the TACACS+ server and associate a user to this group.
    Note: The attribute should be BSN-User-Role, and the value should be the group name.
    The following is an example from the open TACACS+ server configuration.
    group = AD1 {
    service = exec {
    BSN-User-Role="AD1"
    }
    }
  3. After you create the group, associate a user to the group.
    user = user3 {
    member = AD1
    login = cleartext user3
    }
  4. Click save.

Using RADIUS for Managing Access to the Arista Analytics Node

Note: RADIUS does not separate authentication and authorization. Be aware that a user account authorized by a remote RADIUS server uses the password configured for that user on the remote server.
By default, authentication and authorization functions are set to local while the accounting function is disabled. The only supported privilege levels are as follows:
  • admin: Administrator access, including all CLI modes and debug options.
  • read-only: Login access, including most show commands.
Note: The admin and recovery user accounts cannot be authenticated remotely using TACACS. These accounts are always authenticated locally to prevent administrative access from being lost in case a remote AAA server is unavailable.

The admin group provides complete access to all network resources, while the read-only group provides read-only access to all network resources.

DANZ Monitoring Fabric also supports communication with a remote AAA server (TACACS+ or RADIUS). The following summarizes the options available for each function:
  • Accounting: local, local and remote, or remote.
  • Authentication: local, local then remote, remote then local, or remote.
  • Authorization: local, local then remote, remote then local, or remote.
    Note: Fallback to local authentication occurs only when the remote server is unavailable, not when authentication fails.
Privileges at the remote TACACS+ server must be configured using the attribute-value pairs shown in the following table:
 
Supported attribute name: BSN-User-Role
Supported attribute values: admin, read-only, bigtap-admin, bigtap-read-only

The BSN-AVPair attribute sends CLI command activity accounting to the RADIUS server.

Adding a RADIUS Server

Use the following command to specify the remote RADIUS server:
radius server host <server-address> [timeout {<timeout>}][key {{<plaintext>} | 0 {<plaintext>} | 7 {<secret>}}]
For example, the following command identifies the RADIUS server at the IP address 192.168.17.101:
analytics-1(config)# radius server host 192.168.17.101 key admin

You can enter this command up to five times to specify multiple RADIUS servers. The Analytics Node tries to connect to each server in the order they are configured.
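For example, the following command adds a second (hypothetical) RADIUS server that is tried if the first does not respond:
analytics-1(config)# radius server host 192.168.17.102 key admin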

Setting up a FreeRADIUS Server

After installing the FreeRADIUS server, complete the following steps to set up authentication and authorization for the Analytics Node with the RADIUS server:
  1. Create the BSN dictionary and add it to the list of used dictionaries.
    Create the dictionary /usr/share/freeradius/dictionary.bigswitch with the contents below:
    VENDOR Big-Switch-Networks 37538
    BEGIN-VENDOR Big-Switch-Networks
    ATTRIBUTE BSN-User-Role 1 string
    ATTRIBUTE BSN-AVPair 2 string
    END-VENDOR Big-Switch-Networks
  2. Include the bigswitch dictionary in the RADIUS dictionary file: /usr/share/freeradius/dictionary
    $INCLUDE dictionary.bigswitch
  3. Configure a sample user with admin and read-only privileges.
    The following example defines and configures a user. Open the users file /etc/freeradius/users and insert the following entries:
    "user1" Cleartext-Password := "passwd"
    BSN-User-Role := "read-only",
    Note: This shows the VSA's association with the user and its privileges. In an actual deployment, a user database and encrypted passwords are necessary.
    The following example authorizes the user2 for RBAC group AD1:
    "user2" Cleartext-Password := "passwd"
    BSN-User-Role := "AD1",
  4. Configure the RADIUS server and AAA on the Analytics Node.
    radius server host <IP address> key <server's secret>
    aaa authentication login default group radius local
    aaa authorization exec default group radius local
    aaa accounting exec default start-stop group radius local

    This configuration sets authentication and authorization to first connect to the RADIUS server to verify user credentials and privileges. AAA fallback to local occurs only when the remote server is unreachable. In this example, accounting stores audit logs locally and sends them to the remote server.

  5. Add the Analytics Node subnet to the allowed subnets (‘clients.conf’) on the RADIUS server.
    This is required when access to the RADIUS server is limited to allowed clients or subnets. The following is an example of the clients.conf file:
    client anet {
    ipaddr = 10.0.0.0/8
    secret = <server’s secret>
    }
  6. Restart the FreeRADIUS service on the server to enable the configuration.

    The following is an example accounting record sent from the Analytics Node to the RADIUS server after adding the BSN-AVPair attribute to the /usr/share/freeradius/dictionary.bigswitch file.


Backup and Restore

Elasticsearch Snapshot and Restore

Elasticsearch provides a mechanism to snapshot data to a network-attached storage device and to restore from it.

  1. Mount the Network File Storage (NFS) on the Analytics Node.
    1. Create a directory on the remote Ubuntu Server (NFS store). The directory must be owned by the user remoteuser (UID 10000) and the group root (GID 0).
    2. Stop the Elasticsearch container: sudo docker stop elasticsearch
    3. Mount the remote store on /opt/bigswitch/snapshot in the Analytics server.
    4. Start the Elasticsearch container: sudo docker start elasticsearch
  2. Create a snapshot repository by running the following API call:
    curl \
    -k \
    -X PUT \
    -H 'Content-Type:application/json' \
    -d '{"type":"fs","settings":{"location":"/usr/share/elasticsearch/snapshot"}}' \
    -u admin:***** \
    https://169.254.16.2:9201/_snapshot/test_automation
  3. Take a snapshot by running the following API call:
    curl \
    -k \
    -X POST \
    -H 'Content-Type:application/json' \
    -d '{ "indices": ".ds-flow-sflow-stream-2023.08.21-000001", "include_global_state": true, "ignore_unavailable": true, "include_hidden": true}' \
    -u admin:***** \
    https://169.254.16.2:9201/_snapshot/test_automation/test_snap1
  4. To view a snapshot, run the following API call (see also the repository listing example after these steps):
    curl \
    -s -k \
    -H 'Content-Type:application/json' \
    -u admin:***** \
    https://169.254.16.2:9201/_snapshot/test_automation/test_snap1?pretty
  5. To restore a snapshot, run the following API call:
    curl \
    -k \
    -X POST \
    -H 'Content-Type:application/json' \
    -d '{ "indices": ".ds-flow-sflow-stream-2023.08.21-000001", "ignore_unavailable": true, "include_global_state": true, "rename_pattern": "(.+)", "rename_replacement": "restored_$1" }' \
    -u admin:***** \
    https://169.254.16.2:9201/_snapshot/test_automation/test_snap1/_restore
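To list all snapshots in the repository (a quick way to confirm that a snapshot exists before restoring), run the following API call; it assumes the same repository name and credentials as the examples above:
    curl \
    -s -k \
    -u admin:***** \
    https://169.254.16.2:9201/_snapshot/test_automation/_all?pretty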

Import and Export of Saved Objects

The Saved Objects UI helps keep track of and manage saved objects. These objects store data for later use, including dashboards, visualizations, searches, and more. This section explains the procedures for backing up and restoring saved objects in Arista Analytics.

Exporting Saved Objects

 

  1. Open the main menu, then click Main Menu > Management > Saved Objects.
  2. Select the custom-saved objects to export by clicking on their checkboxes.
  3. Click the Export button to download. Arista Networks suggests changing the file name to the nomenclature that suits your environment (for example, clustername_date_saved_objects_<specific_name_or_group_name>.ndjson).
    Note: Arista Networks recommends switching ON include related objects before selecting the export button. If there are any missing dependency objects, selecting include related objects may throw errors, in which case switch it OFF.
  4. The system displays the following notification if the download is successful.
    Figure 1. Verifying a Saved/Downloaded Object
    Note: Recommended Best Practices
    • While creating saved objects, Arista Networks recommends naming conventions that suit your environment. For instance, in the example above, a naming pattern prefixed with “ARISTA” and specifying Type: dashboard is used, which yields a manageable set of items to select individually or all at once. Furthermore, exporting individual dashboards based on their Type is a more appropriate option because it makes tracking modifications to a dashboard easier. Dashboards should use only custom visualizations and searches (i.e., do not depend on default objects that might change during a software upgrade).
    • Do not edit any default objects. Arista Networks suggests saving the new version with a different (custom) name if default objects require editing.
    • Treat the exported files as code and keep them in a source control system so that diffs and rollbacks are possible under standard DevOps practices.

Importing Saved Objects

 

  1. To import one or a group of custom-created objects, navigate to Main Menu > Management > Kibana > Saved Objects.
    Figure 2. Importing a Group of Saved Objects
  2. Click Import and navigate to the NDJSON file that represents the objects to import. By default, saved objects already in Kibana are overwritten by the imported object. The system should display the following screen.
    Figure 3. NDJSON File Import Mechanism
  3. Verify the number of successfully imported objects. Also verify the list of objects, selecting Main Menu > Management > Kibana > Saved Objects > search for imported objects.
    Figure 4. Import Successful Dialog Box

Import and Export of Watchers

Use the Watcher feature to create actions and alerts based on certain conditions and periodically evaluate them using queries on the data. This section explains the procedure of backing up and restoring the Watchers in Arista Analytics.

Exporting Watchers

 

  1. The path parameter required to back up the Watcher configuration is watcher_id. To obtain the watcher_id, go to Main Menu > Management > Watcher > Watcher_ID.
    Figure 5. Find Watcher_ID
  2. Open the main menu, then select Dev Tools > Console. Issue the GET API mentioned below with the watcher_id. The response appears in the output terminal.
    Run the following API call:
    GET _watcher/watch/<watcher_id>
    Replace Watcher_ID with the watcher_id name copied in Step 1.
    Figure 6. GET API
  3. Copy the API response from Step 2 into a .json file, using terminology that suits the environment, and keep track of it. For example, the following nomenclature may be helpful: Arista_pod1_2022-02-03_isp_latency_watcher.json.

Importing Watchers

 

  1. Not all exported fields are needed when importing a Watcher. To filter out the unwanted fields from the exported file, use the jq utility: run jq .watch <exported_watcher.json> and import the output.
    Figure 7. jq Command Output
  2. Click Dev Tools > Console, enter the API PUT _watcher/watch/<watcher_id>, and copy the Step 1 output into the request body (a minimal example follows these steps). Replace <watcher_id> with the desired Watcher name. The output terminal confirms the creation of the Watcher.
    Figure 8. PUT API in Dev Tools Console
  3. Locate the newly created Watcher in the Main menu > Management > Elasticsearch > Watcher > search with Watcher_ID.
    Figure 9. Watcher
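For reference, the pasted body must be a complete watch definition. The following is a minimal, hypothetical example of the PUT request (a simple logging-only watch, not one of the exported Watchers):
    PUT _watcher/watch/example_watch
    {
      "trigger": { "schedule": { "interval": "10m" } },
      "input": { "simple": { "status": "ok" } },
      "condition": { "always": {} },
      "actions": {
        "log_action": {
          "logging": { "text": "Watch {{ctx.watch_id}} executed" }
        }
      }
    }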

Import and Export of Machine Learning Jobs

Machine Learning (ML) automates time series data analysis by creating accurate baselines of normal behavior and identifying anomalous patterns. This section explains ways to back up and restore the Machine Learning jobs in Arista Analytics.

Exporting Machine Learning Jobs

 

  1. Open the main menu, then select Dev Tools > Console. Send a GET _ml/anomaly_detectors/<Job-id> request to Elasticsearch and view the response for the Machine Learning anomaly detection job. Replace <Job-id> with the ML job name. The system displays the following output when executing the GET request.
    Figure 10. Main Menu > Dev Tools > Console
  2. Copy the GET API response of the ML job into a .json file with terminology that suits your environment and keep track of it. An example of appropriate nomenclature might be Arista_pod1_2022-02-03_ML_Source_Latency_ML_job.json.

Importing Machine Learning Jobs

 

  1. Not all exported fields need to be imported; only the description, analysis_config, and data_description fields may be needed. Run jq '.jobs[] | {description, analysis_config, data_description}' <json-filename> and copy the output into the Dev Tools console. Replace <json-filename> with the filename of the previously exported JSON file.
    Run the following command:
    jq '.jobs[] | {description, analysis_config, data_description}' Arista_pod1_2022-02-03_ML_Source_Latency_ML_job.json
    Figure 11. jq Required Fields
  2. Select Dev Tools > Console and paste the Step 1 output into the body of the PUT request (a minimal sketch follows these steps).
    Run the following API call:
    PUT _ml/anomaly_detectors/<ml_job_name>
    Replace ml_job_name with the specific string of the ML Job name.
    Figure 12. PUT ML Jobs API
  3. The successful response to the PUT request confirms the creation of the ML Job. Further, verify imported ML jobs by selecting Main menu > Machine Learning > Job Management > search with ML Job Name.
    Figure 13. ML Job Verification
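A minimal, hypothetical sketch of such a PUT request follows. The job name, field name, and bucket span are placeholders; in practice, the values come from the exported description, analysis_config, and data_description fields:
    PUT _ml/anomaly_detectors/example_latency_job
    {
      "description": "Example: max latency anomaly detection",
      "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
          { "function": "max", "field_name": "latency" }
        ]
      },
      "data_description": {
        "time_field": "timestamp",
        "time_format": "epoch_ms"
      }
    }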

Monitoring Network Performance and Events

This chapter describes how to monitor network performance and identify unusual events. It includes the following sections.

Interfaces Sending or Receiving Traffic

To identify specific interfaces that are sending or receiving traffic, use the following features:
  • DMF Top Filter interfaces
  • Production interfaces
Figure 1. DMF Filter Interfaces
Figure 2. sFlow® > Flow by Production Device & IF

This information derives from the LLDP/CDP exchange between the production and DANZ Monitoring Fabric switches.

Anomalies

Use the following features to recognize unusual activity or events on the network.
  • Comparing dashboards and visualization over time
  • sFlow®* > Count sFlow vs Last Wk
  • New Flows & New Hosts
  • Utilization alerts
  • Machine Learning

Identify unusual activity by comparing the same dashboard for the past hour against the same time last week. For example, the bar visualization of traffic over time shows changing ratios of internal to external traffic, which can highlight an abnormality.

The Count sFlow vs Last Wk visualization in the sFlow dashboard shows the number of unique flows being seen now compared to last week. This visualization indicates unusual network activity and helps pinpoint a Denial of Service (DoS) attack.
Figure 3. Count sFlow vs Last Wk
In a well-inventoried environment, use the New Flows & New Hosts report.
Figure 4. Production Traffic
Configure utilization alerts associated with the following DMF port types:
  • Filter
  • Delivery
  • Core
  • Services
Figure 5. Monitoring Port Utilization Alerts
The other alerts available include the following.
  • The percentage of outbound traffic exceeds the usual thresholds.
  • New hosts appear on the network every 24 hours.
Figure 6. New Host Report
Use machine learning to perform anomaly detection on byte volume and other traffic characteristics over time.
Figure 7. Machine Learning

Application Data Management

Application Data Management (ADM) helps users govern and manage data in business applications like SAP ERP. To use Arista Analytics for ADM, perform the following steps:

  1. Pick a service IP address or block of IP addresses.
  2. Identify the main body of expected communication with adjacent application servers.
  3. Filter down to ports that need to be communicating.
  4. Expand the time horizon to characterize necessary communication completely.
  5. Save as CSV.
  6. Convert the CSV to ACL rules to enforce in the network (see the illustrative sketch after these steps).
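As a purely illustrative sketch (generic, Cisco-style ACL syntax; the addresses and port are hypothetical), a CSV row such as 10.10.1.15,10.20.3.40,tcp,3200 might translate into a rule like:
    permit tcp host 10.10.1.15 host 10.20.3.40 eq 3200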

Machine Learning

Arista Analytics uses machine learning for anomaly detection. The following jobs are available:
  • Single-metric anomaly detection
  • Multimetric anomaly detection
  • Population
  • Advanced
  • Categorization
Figure 13. Machine Learning
For every job, a job ID must be configured. To create a machine learning job:
  • Select the time range
  • Select the appropriate metric
  • Enter details: job ID, description, custom URLs, and calendars to exclude planned outages from the job
Figure 14. Machine Learning Job options

Single-metric anomaly detection uses machine learning on only one metric or field.

Figure 15. Single-metric Anomaly Detection
Multimetric anomaly detection uses machine learning on more than one metric or field. The image below uses two metrics, with the analysis split per L4 app.
Figure 16. Multimetric Anomaly Detection
Population analysis detects network activity that differs from the population of data points. Arista Networks recommends this analysis for high-cardinality data.
Figure 17. Population
A categorization job groups data points into categories and then finds anomalies between them.
Figure 18. Categorization
*sFlow® is a registered trademark of Inmon Corp.

Monitoring Users and Software Running on the Network

This chapter describes using Arista Analytics with the DMF Recorder Node. It includes the following sections.

IP Addresses

This section describes identifying traffic transmitted or received by the source or destination IP address.

Source and Destination Addresses

Figure 1. Identifying Source and Destination IP Addresses
Click an IP address, then click the Magnifying Glass icon (+) to pin the address to the dashboard.
Figure 2. Filtering Results by IP Address

The selected IP address is added to the filters on the dashboard.

Each dashboard has a bar chart depicting traffic on the y-axis and time on the x-axis. To add a time filter, click and drag an area in the All Flows Over Time bar chart.

Unauthorized IP Destinations

To determine if an IP destination that is not authorized is being accessed in your network for a specific period, set the time value in the upper right corner.
Figure 3. Setting the Duration

Select the duration of time for the search.

Type the IP address or the Network ID in the Search field.

The system displays any events associated with the address or network ID.

Geographic Location

Analytics associates public network IP addresses to geographic regions using the MaxMind GeoIP database. Traffic associated with these addresses shows as a heat map on the Map visualization on the sFlow®* dashboard. To filter on a region, draw a box or a polygon around the region.
Figure 4. Geographic Flow Source and Destination

Use the Square tool to draw a square around a region of interest, or use the Polygon tool to draw an irregular shape around a region. The map redraws to zoom in on the selected region and show details about traffic to or from the region.

Software Running in the Network

This section identifies specific applications or operating systems running on network hosts.

Top Talkers Using Well-known Layer-4 Ports

To view top-N statistics for the flows using a well-known L4 port, use the Live L4 Ports table on the Flows dashboard.
Figure 5. Flows > Live L4 Ports
Use the App L4 Port table on the sFlow dashboard when an sFlow generator is configured to send flows to Analytics.
Figure 6. sFlow > App L4 Port

These tables use well-known ports to identify the traffic generated by each application. You can also associate user-defined ports with applications as described in the following section.

Associating Applications with User-defined Layer4 Ports

To associate user-defined ports with applications, complete the following steps:
  1. Select System > Configuration.
  2. Select the Edit control to the right of the Ports section.
    Figure 7. Edit Ports
  3. To copy an existing row, enable the checkbox to the left of the row and select Duplicate from the drop-down menu.
    Figure 8. Duplicate Ports
  4. Type over the port number in the row you copied and enter an associated label.
    For example, assign port 1212 to Customer App X.
  5. Click save.

Software Running on Hosts

The following features identify the software running on hosts in the monitored network.
  • Searching for well-known applications
  • Using Layer4 labels
  • Searching packet captures on the DMF Recorder Node
  • Using the Flows dashboard
  • Using the DHCP dashboard for information about operating systems

The IP block default mapping associates many common applications with specific address ranges. For example, you can identify video traffic by searching for YouTube or Netflix.

L4 label strings identify applications using well-known ports and applications running on user-defined ports after mapping those ports to the applications.

The flow dashboards give an overall sense of who is talking to whom. Click an IP address or L4 port, then click the + that appears to pin it and filter the dashboard by the selection. Every dashboard has a bar chart depicting traffic on the y-axis and time on the x-axis. Add a time filter by clicking and dragging horizontally across the bar chart.

Traffic can also be attributed to a specific user when a source of user-to-IP mappings (OpenVPN is supported) is configured. After that, search for the user string to see traffic attributed to that user over the dashboard period.

The DHCP dashboard indicates the operating systems running on hosts based on information derived from DHCP client requests. The default mapping is copied from the signatures provided by fingerbank.org.
Figure 9. DHCP OS Fingerprinting
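DHCP fingerprinting of this kind typically keys on the ordered option codes in the DHCP parameter request list (option 55) of client requests. The sketch below shows the general idea with hypothetical signatures; it is not the fingerbank.org data set used by Analytics.

# Conceptual sketch of DHCP OS fingerprinting. The signatures below are hypothetical
# placeholders, not the fingerbank.org signatures used by the DHCP dashboard.
HYPOTHETICAL_SIGNATURES = {
    "1,3,6,15,31,33,43,44,46,47,119,121,249,252": "Windows (example signature)",
    "1,121,3,6,15,119,252,95,44,46": "macOS (example signature)",
}

def guess_os(param_request_list):
    """Map the option-55 parameter request list of a DHCP request to an OS guess."""
    fingerprint = ",".join(str(code) for code in param_request_list)
    return HYPOTHETICAL_SIGNATURES.get(fingerprint, "unknown")

print(guess_os([1, 121, 3, 6, 15, 119, 252, 95, 44, 46]))  # macOS (example signature)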

Tools Receiving Traffic

Identify traffic forwarded to a specific tool or host by using the IP Blocks mapping to associate an IP address or a range of IP addresses with a label describing the application. This label then appears on any dashboards or visualizations that display IP Block labels. After mapping, you can search for events associated with the label assigned to the tool.

Refer to the Mapping IP Address Blocks section for details about updating the IP block mapping file.

  1. To edit the IP blocks, select System > Configuration and click the Edit control to the right of the IP blocks section.
    Figure 10. Mapping a Tool to an IP Address: IP Block Edit
  2. To define a new IP block, append a range of IP addresses to the blocks section.
  3. Scroll down and add a tag definition with the same number as the IP block.
    Figure 11. Mapping a Tool to an IP Address: Define Tags
  4. Define the new IP block section tags, including a descriptive name for the specific tool.
  5. Select DMF Network > Policy Statistics.
    This cross-references the information you get by labeling an IP block with information about any policies configured to forward traffic to that IP address.
    Figure 12. DMF Policies
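The effect of the IP block mapping is a longest-prefix lookup from IP address to label. The sketch below illustrates that lookup with the Python ipaddress module; the block ranges and tool names are hypothetical examples, not the actual configuration format.

# Conceptual sketch of IP block-to-label lookup (hypothetical ranges and labels).
import ipaddress

IP_BLOCK_LABELS = {
    ipaddress.ip_network("10.20.0.0/24"): "IDS-Tool-Farm",
    ipaddress.ip_network("10.20.1.0/24"): "Packet-Broker-Lab",
}

def label_for(ip):
    """Return the label of the most specific configured block containing the address."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in IP_BLOCK_LABELS if addr in net]
    if not matches:
        return "unlabeled"
    return IP_BLOCK_LABELS[max(matches, key=lambda net: net.prefixlen)]

print(label_for("10.20.0.7"))  # IDS-Tool-Farm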

User Activity

This section identifies specific users transmitting or receiving traffic on the network.

User Sessions

To identify users transmitting or receiving traffic on the network, use the following features:
  • Flows dashboard
  • sFlow dashboard
  • NetFlow dashboard
  • Open VPN or Active Directory mapping to IP address
The Flows dashboards all provide an overall idea of who communicates on the network (traffic source and destination).
Figure 13. Flows > Flows Source IP Dest IP
Click an IP address or L4 port and use the + control that appears to pin the selection as a dashboard filter. Every dashboard has a bar chart that shows traffic on the y-axis and time on the x-axis.
Figure 14. All Flows Over Time
To filter the display to a specific time, click and drag from left to right over the interesting period.
Figure 15. All Flows Over Time (Specific Time)

You can also identify traffic associated with specific users after using the IP blocks configuration to map them to specific IP addresses. Once the mapping is saved, search for the user string to see traffic attributed to that user over the period displayed on the dashboard.

New Network Users

To identify new network users, use the following features:
  • Comparing the same dashboard for two different periods
  • sFlow > Count sFlow vs Last Wk
  • ARP dashboard
  • New Host Report
The sFlow dashboard provides a Count sFlow vs Last Wk visualization, which shows the number of unique flows being seen now vs. last week.
Figure 16. sFlow > Count sFlow vs Last Wk
The ARP dashboard provides a visualization for Tracked Hosts New-Old-Inactive, Vendor.
Figure 17. ARP > Tracked Hosts New-Old-Inactive, Vendor
To use the New Host report, enable the report and configure where to send alerts on the System > Configuration page.
Figure 18. System > Configuration > New Host Report

Unauthorized Intranet Activity

To identify unauthorized usage of your internal network, use the following features:
  • Categorizing threats as malicious, compromised, APT/zero-day, or known, which enables the association of flows with users and with internal organizations.
  • Searching by username reveals access to different organizations and their applications.
  • For OpenVPN users, the dashboard shows the user's external IP address; an IP from an unexpected geographic location may indicate a compromised account, especially in combination with access at odd hours.
  • The OpenVPN server records logins with IP addresses and computer type, assigns IP addresses inside the lab network, and sends syslog messages for OpenVPN events.
  • Use the DMF Recorder Node to retrieve the original packets for forensic analysis and to obtain evidence of unauthorized activity.

Monitoring Active Directory Users

Configure Windows Active Directory to audit logon and logoff events.
  1. Download and install Winlogbeat from the Elastic website on the Windows machine.
  2. On the Analytics node, run: sudo rm -rf * inside /home/admin/xcollector and then run docker exec xcollect /home/logstash/generate_client_keys.sh <AN IP> client. It generates .pem files in /home/admin/xcollector.
  3. Copy the winlogbeat.yml file from /opt/bigswitch/conf/x_collector/winlogbeat.yml on the Analytics node to the Windows server, replacing the existing file, and edit the Logstash output section:
    #----------------------------- Logstash output ----------------------------------
    output.logstash:
      # Point agent to Analytics IPv4 in hosts below
      hosts: ["10.2.5.10:5043"]

      # List of root certificates for HTTPS server verifications
      ssl.certificate_authorities: ["C:/Program Files/Winlogbeat/security/ca/cacert.pem"]

      # Certificate for SSL client authentication
      ssl.certificate: "C:/Program Files/Winlogbeat/security/clientcert.pem"

      # Client Certificate Key
      ssl.key: "C:/Program Files/Winlogbeat/security/clientkey.pem"
    
  4. Using the recovery account, use an SCP application to transfer the .pem files from the Analytics node to the Windows machine and update their locations in winlogbeat.yml.
  5. On Windows, open PowerShell, navigate to the Winlogbeat installation directory, and run .\install-service-winlogbeat.ps1 to install the Winlogbeat service.
  6. Test the configuration using “winlogbeat test config” to test winlogbeat.yml syntax and “winlogbeat test output” to test connectivity with logstash on the Analytics node.
  7. Run winlogbeat run -e to start Winlogbeat.
*sFlow® is a registered trademark of InMon Corp.

Monitoring DMF Network Health

This chapter describes how to use the dashboards on the DMF Network tab to monitor activity on the DANZ Monitoring Fabric. It includes the following sections.

DMF Network Tab

The DMF Network tab includes dashboards that display the following information visible to the DMF controller:
  • Policy Statistics
  • Interface Statistics
  • SN Statistics
  • Incline Statistics
  • Events
Note: Information displayed on these dashboards requires configuring an ACL for Redis and replicated Redis using the Analytics CLI after the first-boot configuration.

Policy Statistics Dashboard

Click the Policy Statistics tab to display the following dashboard:
Figure 1. DMF Network > Policy Statistics Dashboard
The Policy Statistics dashboard summarizes information about DANZ Monitoring Fabric policy activity and provides the following panels:
  • Top Active Policies
  • Top Filter Interfaces
  • Top Service Interfaces
  • Top Delivery Interfaces
  • Mean Bit Rate
  • Mean Packet Rate
  • Core Switch
  • Policy Core Interface Bit Rate
  • Core Interface
  • Policy Core Interface Packet Rate
  • Records
  • Policies with no traffic

Use the Top Active Policies visualization to verify that your DANZ Monitoring Fabric policies are active and behaving as expected.

Use the Top Filter Interfaces visualization to balance the utilization of your filter interfaces and to ensure that no packets intended for analysis are dropped.

Interface Statistics

Click the Interface Statistics tab to display the following dashboard.
Figure 2. DMF Network > Interface Statistics
The Interface Statistics dashboard summarizes information about DANZ Monitoring Fabric switch interface activity and provides the following panels:
  • Interface Type
  • SVC-Del-Filter IF
  • Core Switch (IF)
  • Core IF
  • Bit Rate
  • Packet Rate
  • Interface Detail

SN (Service Node) Statistics

Click the SN Statistics tab to display the following dashboard.
Figure 3. DMF Network > SN Statistics

Select a service from the pie chart to display statistics for that specific managed service.

Events

Click the Events tab to display the following dashboard.
Figure 4. DMF Network > Events
The Events dashboard summarizes information about DANZ Monitoring Fabric management network events and provides the following panels:
  • Events Over Time
  • Events

Managing the NetFlow Dashboard

This chapter describes managing NetFlow and using the NetFlow dashboard efficiently. Arista Analytics acts as a NetFlow collector for any agent or generator configured with the Analytics server IP address as a collector, including the DMF Service Node and any third-party NetFlow agent. This chapter has the following sections:

NetFlow Optimization

Arista Analytics may consolidate NetFlow records to improve performance.

The Analytics server or cluster consolidates flows received within two seconds into a single flow when the source and destination IP addresses are the same and either the source or the destination L4 port is the same.

For example, ten flows received by the Analytics server within a thirty-second period are consolidated into a single flow if the source and destination IP addresses and the destination port are the same for all the flows and only the source ports differ, or if the source and destination IP addresses and the source port are the same for all the flows and only the destination ports differ. The consolidated flow displays as a single row.
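The following is a minimal sketch of the consolidation rule described above; the field names and record layout are illustrative only, since the actual record schema is internal to Analytics.

# Sketch of the NetFlow consolidation rule described above (illustrative field names).
# Flows with the same source and destination IP addresses that differ only by source
# port within the consolidation window collapse into a single record; a symmetric pass
# keyed on (src_ip, dst_ip, src_port) would merge flows that differ only by destination port.
from collections import defaultdict

def consolidate(flows):
    buckets = defaultdict(list)
    for f in flows:
        buckets[(f["src_ip"], f["dst_ip"], f["dst_port"])].append(f)
    return [
        {"src_ip": k[0], "dst_ip": k[1], "dst_port": k[2],
         "bytes": sum(f["bytes"] for f in group), "flows_merged": len(group)}
        for k, group in buckets.items()
    ]

flows = [{"src_ip": "10.1.1.1", "dst_ip": "10.2.2.2", "src_port": p,
          "dst_port": 443, "bytes": 100} for p in range(40000, 40010)]
print(consolidate(flows))  # ten flows -> one consolidated row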

By default, NetFlow optimization is enabled for NetFlow v5 and disabled for NetFlow v9 and IPFIX. To enable NetFlow optimization for NetFlow v9 and IPFIX, refer to the Consolidating Netflow V9/IPFIX records section.

This consolidation improves Analytics NetFlow performance, allowing more efficient indexing and searching of NetFlow information.

The following figure shows the NF Detail window on the NetFlow dashboard, which provides an example of NetFlow information with optimization.
Figure 1. Analytics NetFlow Optimization

Viewing Filter Interface Information on the NetFlow Dashboard

Add the filter interface name to the NetFlow dashboard to see hop-by-hop forwarding of NetFlow traffic coming from the DMF Service Node. For a specific flow, Arista Analytics shows the filter interface name associated with that flow; if the flow goes through two hops, two filter interface names are displayed for the flow.

Displaying Filter Interface Names

The nFlow by Filter Interface window on the NetFlow dashboard, shown below, can display the filter interface name where traffic is coming in for the NetFlow service. To display this information, enable the records-per-interface option in the NetFlow managed service configuration on the DANZ Monitoring Fabric controller using the commands shown in the following example.
controller(config)# managed-service netflow-managed-service
controller(config-managed-srv)# service-action netflow netflow-delivery-int
controller(config-managed-srv-netflow)# collector 10.8.39.101 udp-port 2055 mtu 1500 records-per-interface
Figure 2. Production Network > NetFlow Dashboard with Filter Interface Name

NetFlow Managed Service Records-per-interface Option

The following example displays the running-config for this configuration.
! managed-service
managed-service netflow-managed-service
	service-interface switch 00:00:4c:76:25:f5:4b:80 ethernet4/3:4
	!
	service-action netflow netflow-delivery-int
		collector 10.8.39.101 udp-port 2055 mtu 1500 records-per-interface
controller(config)# sh running-config bigtap policy netflow-policy
! policy
policy netflow-policy
	action forward
	filter-interface filter-int-eth5
	use-managed-service netflow-managed-service sequence 1 use-service-delivery
	1 match any

After enabling this option, the nFlow by Filter Interface window, shown above, displays the filter interface identified in the policy that uses the NetFlow managed service.

When the production device port connected to the filter interface sends LLDP messages, Arista Analytics also displays the production switch name and the production interface name attached to the filter interface in the nFlow by Production Switch & IF window.

In the example below, wan-tap-1 displays in the nFlow by Filter Interface window. The production device N1524-WAN and the interface Gi1/0/1, connected to filter interface wan-tap-1, are displayed in the nFlow by Production Switch & IF window.
Figure 3. Production Network > NetFlow Dashboard with Filter Interface Name

NetFlow Traffic Coming from Third-party Devices

This section describes displaying third-party device and interface names to show hop-by-hop forwarding of flows when NetFlow traffic comes from a third-party device. When querying a specific flow, Arista Analytics shows the device and interface names associated with that flow; if the flow goes through two hops, it displays the device and interface names for both hops.

Arista Analytics can act as a NetFlow collector for third-party devices. In this case, Arista Analytics displays third-party device management IP addresses and the interface index (iFindex) of the interface for each NetFlow-enabled third-party device.

For example, the nFlow by Production Device & IF window shows that 10.8.39.198 is the third-party device that forwards NetFlow traffic. The iFindex values of the interfaces on that device where NetFlow is enabled are 0, 2, 3, and 4.

To discover the device name and the actual interface name rather than the iFindex, Arista Analytics automatically does an SNMP walk by getting the third-party device management IP from flow information. By default, Analytics uses the SNMP community name public to get the device name and interface name. If the SNMP community name of the third-party device is not public, change it in the Arista Analytics SNMP collector configuration.
Note: The AN DMF 8.3.0 release supports both SNMPv2 and SNMPv3.
Note: For IPFIX and nFlow v9, configure the third-party device to send the iFindex. The Analytics node will do an SNMP walk to get the interface names associated with that iFindex. By default, the iFindex is not sent with IPFIX or nFlow v9. For example, to send the iFindex for IPFIX and nFlow v9, enable match interface input snmp and match interface output snmp under flow record configuration on the third-party device.

To change it, navigate to DMF Analytic > System > Configuration > Analytic Configuration > snmp_collector.

Arista Analytics then performs SNMP polling and displays the third-party device name and the actual interface name in the nFlow by Production Device & IF window.
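Analytics performs this SNMP walk automatically. For reference only, the sketch below shows an equivalent standalone walk of the IF-MIB ifName column using the classic synchronous pysnmp hlapi API; the library choice and device address are assumptions, not part of the product.

# Standalone sketch of the ifIndex-to-interface-name resolution that Analytics
# performs automatically. Assumptions: classic synchronous pysnmp hlapi API and
# SNMPv2c with the community string "public".
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, nextCmd)

IFNAME_OID = "1.3.6.1.2.1.31.1.1.1.1"  # IF-MIB::ifName

def walk_ifnames(host, community="public"):
    """Walk ifName on a device and return a dict of {ifIndex: interface name}."""
    names = {}
    for error_indication, error_status, _idx, var_binds in nextCmd(
            SnmpEngine(), CommunityData(community),
            UdpTransportTarget((host, 161)), ContextData(),
            ObjectType(ObjectIdentity(IFNAME_OID)),
            lexicographicMode=False):  # stop at the end of the ifName subtree
        if error_indication or error_status:
            break
        for oid, value in var_binds:
            if_index = int(oid.prettyPrint().rsplit(".", 1)[-1])
            names[if_index] = str(value)
    return names

print(walk_ifnames("10.8.39.198"))  # device IP from the example above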

To perform the SNMP configuration, complete the following steps:

  1. On the screen shown below, click DMF Analytic > System > Configuration > Analytic Configuration > snmp_collector > Edit.
    Figure 4. Analytic snmp_collector config
    The system displays the following edit dialog.
    Figure 5. Analytic Configuration > snmp_collector > Edit Dialog (SNMPv2 Configuration)
    Figure 6. Analytic Configuration > snmp_collector > Edit Dialog (SNMPv3 Configuration)
  2. Click the community string public to change it to a different value as shown in the following dialog.

    By default, the SNMP collector polls devices every 60 seconds.

  3. To change the SNMP poll interval, click the value 60, change it to the preferred value, and click Save.
After completing this configuration, the third-party device is polled for the device name and interface name, and the nFlow by Production Device & IF window displays them.
Figure 7. Analytic Configuration > snmp_collector > Edit Dialog

Displaying Flows with Out-Discards

The NetFlow dashboard allows displaying flows with out-discards when the NetFlow packets come from third-party devices. To display this information, use the flows via interfaces with SNMP out-discards tab at the top of the Arista Analytics NetFlow dashboard.

To display the flows with out-discards, click the flows via interfaces with SNMP out-discards tab and click the Re-enable button. This window displays the flows with out-discards.

Using the DMF Recorder Node with Analytics

This chapter describes using Arista Analytics with the DANZ Monitoring Fabric Recorder Node. It includes the following sections.

Overview

The DMF Recorder Node records packets from the network to disk and recalls specific packets from disk quickly, efficiently, and at scale. A single DANZ Monitoring Fabric controller can manage multiple DMF Recorder Nodes, delivering packets for recording through DANZ Monitoring Fabric policies. The controller also provides central APIs for interacting with DMF Recorder Nodes to perform packet queries across one or multiple recorders and for viewing errors, warnings, statistics, and the status of connected recorder nodes.

A DANZ Monitoring Fabric policy directs matching packets to one or more recorder interfaces. The DMF Recorder Node interface defines the switch and port used to attach the recorder to the fabric. A DANZ Monitoring Fabric policy treats these as delivery interfaces.

Both NetFlow and TCPflow dashboards have the recorder node visualization.

General Operation

To retrieve packets from the DMF Recorder Node for analysis using Arista Analytics, select the Controller and log in from the Recorder Node window on the NetFlow or Flows dashboard. To add a new Controller, click the small Select Controller icon and add the Controller. After logging in to the DMF Recorder Node, the system displays the following dialog:
Figure 1. DMF Recorder Node

Use the Recorder Node window to compose and submit a query to the DMF Recorder Node: fill in any of the fields shown and click Submit. Use the Switch Controller link at the bottom of the dialog to log in to a different DMF Recorder Node.

Use the Recorder Summary query to determine the number of packets in the recorder database. Then, apply filters to retrieve a reasonable number of packets with the most interesting information.

You can modify the filters in the recorder query until a Size query returns the most beneficial number of packets.

Query Parameters

The following parameters are available for queries:
  • Query Type
    • Size: Retrieve a summary of the matching packets based on the contents and search criteria stored in the recorder node. Here, Size refers to the total frame size of the packet.
    • AppID: Retrieve details about the matching packets based on the contents and search query in the recorder node datastore, where the packets are stored. Use this query to see what applications are in encrypted packets.
    • Packet Data: Retrieve the raw packets that match the query. At the end of a search query, it generates a URL pointing to the location of the pcap if the search query is successful.
    • Packet Objects: Retrieve the packet objects that match the query. At the end of a search query, it generates a URL pointing to the location of the objects (images) if the search query is successful.
    • Replay: In the field that appears, identify the Delivery interface to which the replayed packets are forwarded.
    • FlowAnalysis: Select the flow analysis type (HTTP, HTTP Request, DNS, Hosts, IPv4, IPv6, TCP, TCP Flow Health, UDP, RTP Streams, SIP Correlate, SIP Health).
  • Time/Date Format: Identify the matching packets' time range as an absolute value or relative to a specific time, including the present.
  • Source Info: Match a specific source IP address / MAC Address / CIDR address.
  • Bi-directional: Enabling this will query bi-directional traffic.
  • Destination Info: Match a specific destination IP address / MAC Address / CIDR address.
  • IP Protocol: Match the selected IP protocol.
  • Community ID: Flow hashing.
Additional Parameters
  • VLAN: Match the VLAN ID.
  • Outer VLAN: Match the outer VLAN ID when multiple VLAN IDs exist.
  • Inner/Middle VLAN: Match the inner VLAN ID of two VLAN IDs or the middle VLAN ID of three VLAN IDs.
  • Innermost VLAN: Match innermost VLAN ID of three VLAN IDs.
  • Filter Interfaces: Match packets received at the specified DANZ Monitoring Fabric filter interfaces.
  • Policy Names: Match packets selected by the specified DANZ Monitoring Fabric policies.
  • Max Size: Set the maximum size of the query results in bytes.
  • Max Packets: Limits the number of packets the query returns to this set value.
  • MetaWatch Device ID: Matches on device ID / serial number found in the trailer of the packet stamped by the MetaWatch Switch.
  • MetaWatch Port ID: Matches on application port ID found in the trailer of the packet stamped by the MetaWatch Switch.
  • Packet Recorders: Query a particular DMF Recorder Node. Default is none or not selected; all packet recorders configured on the DANZ Monitoring Fabric receive the query.
  • Dedup: Enable/Disable Dedup.
  • Query Preview: When expanded, this section shows the Stenographer syntax used for the selected query. You can copy the Stenographer query and include it in a REST API request to the DMF Recorder Node (see the sketch after this list).
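As an illustration of the last point, a copied Stenographer expression can be embedded in a REST request. The sketch below uses the Python requests library; the endpoint host, path, and authentication header are placeholders, not the documented DMF Recorder Node API, so consult the DMF REST API documentation for the real values.

# Hypothetical sketch: sending a Stenographer query over REST with the requests library.
# The host, path, and Authorization header are placeholders, NOT the documented
# DMF Recorder Node API.
import requests

RECORDER = "https://recorder.example.com"                    # placeholder host
STENO_QUERY = "host 10.1.1.1 and port 443 and after 2h ago"  # expression copied from Query Preview

response = requests.post(
    RECORDER + "/api/v1/packets/query",           # placeholder path
    headers={"Authorization": "Bearer <token>"},  # placeholder authentication
    json={"query": STENO_QUERY},
    timeout=60,
)
response.raise_for_status()
print(response.json())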

Using Recorder with Analytics

For interactive analysis, any set of packets exceeding 1 GB becomes unwieldy. To reduce the number of packets to a manageable size, complete the following steps:

  1. Use the Summary query to determine the number of packets captured by the Recorder. Apply filters until the packet set is manageable (less than 1 GB).
  2. Search over the metadata from all sources and analyze it to retrieve a limited and useful set of packets based on source address, destination address, timeframe, and other filtering attributes.
  3. Submit the Stenographer query, which is used by the DMF Recorder Node and automatically composed by Arista Analytics.

    You can perform flow analysis without downloading the packets from the Recorder. Select specific rows to show throughput, RTT, out-of-order packets, and re-transmissions. Flow analysis types such as HTTP, HTTP Request, DNS, Hosts, IPv4, IPv6, TCP, TCP Flow Health, UDP, RTP Streams, SIP Correlate, and SIP Streams analyze the flows. Sort and search as required, and save the results to CSV for later analysis. You can search over a given duration of time for an IP address by exact match or prefix match.

    Replay can direct large packet sets to an archive for later analysis; this frees up the Recorder to capture a new packet set.

    Use the DMF Recorder Node to identify the applications on your network that are encrypting packets. Use a Recorder Detail query to see the applications with encrypted packets.

    Refer to the DANZ Monitoring Fabric Deployment Guide for information about installing and setting up the DMF Recorder Node. For details about using the Recorder from the DANZ Monitoring Fabric Controller GUI or CLI, refer to the DANZ Monitoring Fabric User Guide.

Analyzing SIP and RTP for DMF Analytics

This feature describes how Session Initiation Protocol (SIP) packets are parsed in a DANZ Monitoring Fabric (DMF) Analytics Node deployment and presented in a dashboard to allow the retrieval of data packets conveying voice traffic (RTP) from the DMF Recorder Node (RN). DMF accomplishes this by showing logical call information such as the call ID, phone number, and username. After a SIP record is retrieved, the associated IP addresses are used to retrieve packets from the RN, which can then be opened in Wireshark for analysis.

The SIP dashboard is available in Kibana.
Figure 2. SIP Dashboard

DMF Preconditions

The feature requires a physical connection from the DMF Delivery Switch to the 10G Analytics Node (AN) Collector interface, along with the following:
  • A policy configured to filter SIP traffic (UDP port 5060) so that low-rate traffic (< 1 Gbps) is delivered to the AN via the collector interface, with a filter on the Layer 4 port number or a UDF.
  • A LAG that sends SIP control packets to 1, 3, or 5 AN nodes with symmetric hashing enabled and without hot-spotting.
  • A Recorder Node to receive SIP and control packets, recorded with standard key fields.

Configuration

To enable the feature, configure SIP using the broker_address, timestamp-field, and field_group settings. Refer to Field Details for more information on broker_address.

Figure 3. Edit-topic indexer
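The exact schema of the topic indexer entry is defined by the Analytics Node GUI shown above. Purely as an illustration, the three settings named in this section might be grouped as in the sketch below; the surrounding structure and the example values are assumptions.

# Hypothetical illustration of the SIP topic indexer settings named above.
# Only the keys broker_address, timestamp-field, and field_group come from the
# documentation; the example values and surrounding structure are assumptions.
sip_topic_indexer = {
    "broker_address": "10.2.5.10:9092",   # see Field Details for broker_address
    "timestamp-field": "sip_timestamp",   # example value
    "field_group": "sip",                 # example value
}
print(sip_topic_indexer)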

Limitations

The AN DMF 8.5.0 release supports this feature.
  • There is no toggle switch to turn this feature on or off.