Troubleshooting MSS

This section describes how to troubleshoot common issues and perform routine maintenance tasks.

Show Commands

The following show commands help with troubleshooting a ZTX Traffic Mapper.

Check the interface status to ensure that the port-channel and the member ports are connected:

ZTX# show interfaces status
      Port  Name Status    Vlan   Duplex Speed Type       Flags Encapsulation
      Et1/1      connected in Po1 full   10G   10GBASE-SR 
      Et1/2      connected in Po1 full   10G   10GBASE-CR 
      .. 
      Po1        connected routed full   40G   N/A
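
When this check is repeated across many nodes, the output can be screened with a short script. The following Python sketch is illustrative only. It assumes the command output has already been captured as text. The function names are hypothetical, and the sample adds a made-up notconnect row to show a failing member.

```python
import re

# Sample modeled on the output above; the "notconnect" row is hypothetical.
SAMPLE = """\
Port  Name Status     Vlan   Duplex Speed Type       Flags Encapsulation
Et1/1      connected  in Po1 full   10G   10GBASE-SR
Et1/2      notconnect in Po1 full   10G   10GBASE-CR
Po1        connected  routed full   40G   N/A
"""


def unconnected_ports(output):
    """Return Et/Po ports whose row does not contain the 'connected' token."""
    bad = []
    for line in output.splitlines():
        tokens = line.split()
        if not tokens or not re.match(r"(Et|Po)\d", tokens[0]):
            continue  # skip the header and any elided rows
        if "connected" not in tokens[1:]:
            bad.append(tokens[0])
    return bad


def port_channel_members(output):
    """Map each port-channel to the Ethernet ports listed as 'in PoN'."""
    members = {}
    for line in output.splitlines():
        match = re.match(r"\s*(Et\S+)\s.*\bin (Po\d+)\b", line)
        if match:
            members.setdefault(match.group(2), []).append(match.group(1))
    return members


print(unconnected_ports(SAMPLE))
print(port_channel_members(SAMPLE))
```

A port description that itself contains the word "connected" would defeat this simple token check, so treat the result as a first-pass filter rather than a definitive status.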

Check each GRE tunnel interface status to ensure that the status is Up (multiple GRE tunnels may terminate on the interface):

ZTX# show interfaces tunnel 0
      Tunnel0 is up, line protocol is up (connected)
      Hardware is Tunnel, address is 0000.0000.0000
      Tunnel source 10.10.254.1, destination 10.10.254.2
      Tunnel protocol/transport GRE/IP
      Hardware forwarding enabled
      Tunnel transport MTU 1476 bytes (default)
      Tunnel underlay VRF "default"
      Up 3 days, 53 minutes, 22 seconds

Check that flow tracking is active on all GRE monitor tunnels from the TORs:

ZTX# show flow tracking firewall distributed
      Flow Tracking Status
      Type: Distributed Firewall
      Running: yes, enabled by the 'flow tracking firewall distributed' command
      Tracker: flowtrkr
      Active interval: 300000 ms
      Inactive timeout: 15000 ms
      Groups: IPv4
      Exporter: exp
      VRF: default
      Local interface: Loopback0 (10.10.254.1)
      Export format: IPFIX version 10, MTU 9152
      DSCP: 0
      Template interval: 3600000 ms
      Collectors:
      127.0.0.1 port 4739
      Active Ingress Interfaces: 
      Tu0, Tu1 

Check if mirrored flows are seen at the ZTX Traffic Mapper node:

ZTX# show firewall distributed instance session-table
Legend
eph - Ephemeral port
Sessions: 5
VRF   Proto Source/Destination Fwd/Rev Src VTEP IP Fwd/Rev Pkts Fwd/Rev Bytes Complete Half-Open Start Time
----- ----- ------------------ ------------------- ------------ ------------- -------- --------- -------------------
vrf2  UDP   1.1.1.1:50004      10.10.254.2         1            428           0        1         2024-10-28 11:13:09
            1.1.1.4:1001       10.10.254.3         1            428
vrf1  UDP   1.1.1.1:eph        10.10.254.2         5            2140          5        0         2024-10-28 11:13:09
            1.1.1.3:1001       10.10.254.3         5            2140
vrf1  UDP   1.1.1.1:eph        10.10.254.2         5            2140          5        0         2024-10-28 11:13:09
            1.1.1.2:1001       10.10.254.3         5            2140

If mirrored flows are not seen at the ZTX Traffic Mapper node, check for drops:

ZTX# show platform sfe counters | nz
Name                                  Owner            Counter Type Unit    Count
------------------------------------- ---------------- ------------ ------- -----
Tunnel-Global-gre_decap_drop_pkts     Ip4TunDemux      module       packets 400
Tunnel-Global-tun_decap_drop_pkts     Ip4TunDemux      module       packets 260
IpInput_Tunnel0-Stateful_drop_counter IpInput_Tunnel0  module       packets 47
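
A single snapshot shows only cumulative totals, so it often helps to compare two captures taken a short time apart and see which drop counters are still increasing. The Python sketch below is one way to do that, assuming both outputs were saved as text. The function names and the values in the second snapshot are hypothetical.

```python
BEFORE = """\
Name                                  Owner            Counter Type Unit    Count
------------------------------------- ---------------- ------------ ------- -----
Tunnel-Global-gre_decap_drop_pkts     Ip4TunDemux      module       packets 400
Tunnel-Global-tun_decap_drop_pkts     Ip4TunDemux      module       packets 260
IpInput_Tunnel0-Stateful_drop_counter IpInput_Tunnel0  module       packets 47
"""

# Hypothetical later capture: only the GRE decap drop counter has grown.
AFTER = BEFORE.replace("packets 400", "packets 450")


def parse_counts(output):
    """Return {counter name: count} for rows whose last column is a number."""
    counts = {}
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[-1].isdigit():
            counts[tokens[0]] = int(tokens[-1])
    return counts


def growing_drops(before, after):
    """Return {name: increase} for drop counters that grew between captures."""
    old, new = parse_counts(before), parse_counts(after)
    return {
        name: new[name] - old.get(name, 0)  # absent under '| nz' means zero
        for name in new
        if "drop" in name.lower() and new[name] > old.get(name, 0)
    }


print(growing_drops(BEFORE, AFTER))
```

Counters that appear in the result are the ones worth investigating first, since they were still dropping packets during the interval between the two captures.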

If mirrored flows are not seen in CloudVision Portal (CVP), check that the ZTX Traffic Mapper node has exported the flows and that the IPFIX export has no failures:

switch# show agent sfe threads flow cache scan counters
Purged count: 501
IPFIX export count: 354
IPFIX failed export count: 0 
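
The two conditions in this check (flows were exported, and no exports failed) can be expressed as a small script. This is a minimal sketch, assuming the counter output was captured as text. The function names and the wording of the returned messages are hypothetical.

```python
SAMPLE = """\
Purged count: 501
IPFIX export count: 354
IPFIX failed export count: 0
"""


def parse_scan_counters(output):
    """Return {label: value} for every 'label: number' line in the output."""
    counters = {}
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[label.strip()] = int(value.strip())
    return counters


def export_findings(output):
    """Return a list of problems found; an empty list means both checks pass."""
    counters = parse_scan_counters(output)
    findings = []
    if counters.get("IPFIX export count", 0) == 0:
        findings.append("no flows have been exported")
    if counters.get("IPFIX failed export count", 0) > 0:
        findings.append("IPFIX export failures reported")
    return findings


print(export_findings(SAMPLE))
```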

The following is a brief explanation of the fields in the show flow tracking firewall distributed output:

  • Type: Distributed Firewall type for ZTX Traffic Mapper nodes.
  • Running: “yes” indicates that the flow tracking feature is successfully running.
  • Tracker: Name of flow tracker configuration.
  • Active interval: Interval after which an IPFIX data packet is exported for active sessions. The default is 1800000 ms (30 min) and can be modified in MSS Studio.
  • Inactive timeout: Time after which a session is considered inactive if no packets are received. The inactive timeout is not configurable and defaults to 15000 ms.
  • Groups: Currently, only IPv4 packets are supported.
  • Exporter: Name of exporter configuration.
  • VRF: VRF used for IPFIX export.
  • Local interface: Local interface used for IPFIX export.
  • Export format: IPFIX version and IP MTU used for exported IPv4 packets.
  • DSCP: Differentiated Services Code Point value used in the exported IPv4 packet header.
  • Template interval: Time interval between successive IPFIX template exports to the collector.
  • Collectors: List of IPFIX collector IP addresses and ports. 127.0.0.1 indicates the local IPFIX collector running on the ZTX Traffic Mapper device, which sends the exported flows to CVP.
  • Active Ingress Interfaces: Displays all the tunnel interfaces on which IPFIX flow tracking is running.

Tracing

Enabling tracing can seriously impact the switch’s performance in some cases. Please use it cautiously and seek advice from an Arista representative before enabling it in any production environments.

trace Sfe setting IpfixWalker*/*

Considerations

When deploying Multi-domain Segmentation Services (MSS) with the ZTX-7250S Traffic Mapper, several crucial technical aspects must be considered to ensure optimal performance and policy enforcement.

ZTX-7250S Traffic Mapper

Session Capacity and Traffic Throughput

The ZTX-7250S Traffic Mapper monitor node can handle a maximum of 32 million concurrent session entries. Session entries include a mix of aggregate, short-lived ephemeral, and persistent non-ephemeral sessions. Monitor your network's session count to avoid exceeding this limit, which could impact performance or lead to dropped connections. Additionally, the node supports up to 80 Gbps of incoming monitor traffic when all 16 Ethernet ports are actively connected to upstream service Top-of-Rack (TOR) switches. Designing your network to use all available ports maximizes monitoring throughput.
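
The limits above lend themselves to a quick headroom calculation. The Python sketch below uses only the figures stated in this section (32 million sessions, 80 Gbps, 16 ports). The 80% warning threshold and the function names are assumptions chosen for illustration, not product defaults, and the per-port figure is a simple average rather than a documented per-port limit.

```python
MAX_SESSIONS = 32_000_000      # maximum concurrent session entries
MAX_MONITOR_GBPS = 80          # incoming monitor traffic with all ports in use
ETHERNET_PORTS = 16            # ports connected to upstream TOR switches
WARN_THRESHOLD = 0.80          # hypothetical alerting threshold

# Average share of the aggregate throughput per port when all 16 are used.
AVERAGE_GBPS_PER_PORT = MAX_MONITOR_GBPS / ETHERNET_PORTS


def session_utilization(current_sessions):
    """Return the fraction of the session table currently in use."""
    return current_sessions / MAX_SESSIONS


def needs_attention(current_sessions, threshold=WARN_THRESHOLD):
    """Return True when utilization reaches the chosen warning threshold."""
    return session_utilization(current_sessions) >= threshold


print(AVERAGE_GBPS_PER_PORT)
print(session_utilization(24_000_000))
print(needs_attention(28_000_000))
```

For example, 24 million sessions is 75% of the table and stays below the assumed threshold, while 28 million (87.5%) would trigger the warning.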

Self-IP Support

MSS Studio automatically applies a Self-IP rule to permit unicast traffic destined for the device. While this is usually sufficient for control plane protocols, protocols that rely on multicast PDUs, such as PIM, OSPF, or IPv6 Neighbor Discovery for BGPv6, might fail to establish if a default deny any any rule is in place. You must explicitly allow such multicast traffic in your policies.

Layer 2 Devices

Layer 2 (L2) devices on which traffic policy enforcement is applied must still have IP routing enabled to function correctly with MSS.

Network Address Translation

NAT and Policy Rule Limitations

If you enable NAT on 7050S, 720S, or 722S platforms, be aware that the number of supported traffic policy rules will be reduced. This reduction is because NAT and traffic policies share the same Ternary Content Addressable Memory (TCAM) resources. Careful planning of your NAT implementation is necessary to avoid impacting your segmentation policies.

Access Control Lists and Policy-Based Routing

Policy Overlap and Precedence

The interaction between different policy types, specifically Access Control Lists (ACLs), Policy-Based Routing (PBR), and MSS Traffic Policy rules, requires careful consideration. If these policies are configured to apply to overlapping flow attributes, their combined effect might not be as intended. For instance, ACLs can be safely applied to Self-IP traffic, but MSS Studio isn’t aware of any ACLs that impact data packets directly. You must manually account for how these policy mechanisms interact to avoid unintended traffic behavior or security gaps.