Reforming Controller HA Cluster

Removing the default IPv4 or IPv6 permit entry from the sync access list before adding a specific permit rule breaks communication between the active and standby Controllers, and the cluster does not recover on its own. Follow the procedure in this appendix to recover from a split Controller HA cluster.

Controller Cluster Recovery

In this example, Controller-1 (IP: 192.168.55.11, node-id: 23955) is active and Controller-2 (IP: 192.168.39.44, node-id: 1618) is standby. Retrieve the node-id using the show controller details command in the CLI.
DMF-CTL2(config-controller-access-list)# show controller details
Cluster Name : DMF-7050
Cluster UID : a5de38214971de42aa7b51b96ac7345f4f228b20
Cluster Virtual IP : 10.240.130.18
Redundancy Status : redundant
Redundancy Description : Cluster is Redundant
Last Role Change Time : 2022-11-05 00:56:04.862000 UTC
Cluster Uptime : 2 months, 1 week
# IP            Hostname @ Node Id Domain Id State   Status    Uptime
-|-------------|--------|-|-------|---------|-------|---------|---------------|
1 192.168.39.44 DMF-CTL2 * 1618    1         active  connected 2 weeks, 2 days
2 192.168.55.11 DMF-CTL1   23955   1         standby connected 2 weeks, 2 days
~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failover History ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# New Active Time completed                 Node  Reason                Description
-|----------|------------------------------|-----|---------------------|------------------|
1 22049      2022-11-05 00:55:35.994000 UTC 22049 cluster-config-change Changed connection state: cluster configuration changed
Procedure
  1. [Controller-1] Add the default rule 2 permit from 0.0.0.0/0 to access-list sync.
  2. Controller-2 remains standby, so the default rule cannot be added to access-list sync on it until it transitions to active.
  3. [Controller-2] Run the system reset-connection switch all command to change Controller-2 to active.
  4. [Controller-2] Add the default rule 2 permit from 0.0.0.0/0 to access-list sync.
  5. [Controller-2] Enter debug bash, then run the following command:
    sudo bootstraptool -ks /etc/floodlight/auth_credentials.jceks --set 23955,192.168.55.11,6642
    Here, 23955 is the node-id of the old active Controller-1 and 192.168.55.11 is the IP address of the old active Controller-1.
  6. Wait for the cluster to reform.
  7. Controller-1 and Controller-2 may change roles after this recovery procedure; that is, Controller-2 may become active.
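
The command in step 5 takes the old active Controller's node-id, its IP address, and a port as one comma-separated value. As a minimal sketch, the following Python helper assembles that command line from values read off show controller details, so the node-id and IP address are not mistyped. The helper name bootstraptool_set_command and the constant SYNC_PORT are hypothetical and not part of the product. The port 6642 and the keystore path are copied from the command above. Only the IPv4 form shown in this appendix is handled.

```python
import ipaddress

# Port copied from the step 5 command in this appendix; treating it as a
# default is an assumption made for illustration.
SYNC_PORT = 6642

# Keystore path copied verbatim from the step 5 command.
KEYSTORE = "/etc/floodlight/auth_credentials.jceks"


def bootstraptool_set_command(node_id: int, ip: str, port: int = SYNC_PORT) -> str:
    """Return the step 5 command line for the old active Controller.

    node_id and ip are the values shown for that Controller in
    'show controller details'. This is a hypothetical convenience helper;
    it only formats a string and does not run anything.
    """
    if node_id <= 0:
        raise ValueError("node-id must be a positive integer")
    # IPv4Address raises a ValueError subclass on a malformed address.
    addr = ipaddress.IPv4Address(ip)
    return f"sudo bootstraptool -ks {KEYSTORE} --set {node_id},{addr},{port}"


if __name__ == "__main__":
    # Values for the old active Controller-1 from this appendix.
    print(bootstraptool_set_command(23955, "192.168.55.11"))
```

Run the printed command from debug bash on Controller-2, as described in step 5.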