Smart System Upgrade

Smart System Upgrade (SSU) significantly reduces reload time by streamlining and optimizing the reload procedure for upgrades, and by continuing to send LACP PDUs while the CPU is rebooting, keeping port channels operational during the reload. SSU leverages protocols capable of graceful restart to minimize traffic loss during upgrade.

Features capable of hitless restart under SSU include:
  • QinQ
  • 802.3ad Link Aggregation/LACP
  • 802.3x flow control
  • BGP (BGP graceful restart must be enabled: see Configuring BGP)
  • MP-BGP (BGP graceful restart must be enabled: see Configuring BGP)
  • 128-way Equal Cost Multipath Routing (ECMP)
  • VRF
  • route maps
  • L2 MTU
  • QoS
Note: SSU is not compatible with VRRP. If VRRP is configured on the switch, another upgrade method must be used.

Upgrading the EOS image with Smart System Upgrade

Using SSU to upgrade the active EOS image is a five-step process:
  1. Prepare switch for upgrade (Prepare the switch for SSU).
  2. Transfer image file to the switch (Transfer the Image File for SSU). Not required if the desired file is already on the switch.
  3. Modify boot-config file to point to the desired image file (Modify boot-config).
  4. Start the SSU process (Start the SSU Process).
  5. Verify that the upgrade was successful (Verify Success of the Upgrade).
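
The five steps above can be laid out as an ordered checklist of CLI commands. The following is a minimal sketch, not an Arista tool: the function name ssu_command_plan and its parameters are hypothetical, and the commands are only those that appear in this section. It builds the list and does not connect to a switch.

```python
# Hypothetical helper: lists the CLI commands from this section in upgrade order.
def ssu_command_plan(image_url: str, image_name: str) -> list[str]:
    dest = f"flash:/{image_name}"
    return [
        "show reload fast-boot",            # step 1: confirm the config permits SSU
        f"copy {image_url} {dest}",         # step 2: transfer the image
        f"verify /md5 flash:{image_name}",  # step 2: confirm the checksum
        f"boot system {dest}",              # step 3: issued in config mode
        "show boot-config",                 # step 3: confirm the boot target
        "reload fast-boot",                 # step 4: start SSU
        "show boot stages log",             # step 5: confirm the stages completed
        "show reload cause",                # step 5: confirm the reload cause
        "show version",                     # step 5: confirm the running image
    ]


for command in ssu_command_plan("http://10.0.0.10/eos-4.14.4.swi", "eos-4.14.4.swi"):
    print(command)
```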

Prepare the switch for SSU

Note: Configuring BGP graceful restart resets BGP sessions. If configuring BGP graceful restart as part of the SSU process, ensure that BGP sessions are stable and all BGP routing information has been learned and advertised before proceeding with SSU.

Backing Up Critical Software

Before upgrading the EOS image, ensure that copies of the currently running EOS version and the running-config file are available in case of corruption during the upgrade process. To copy the running-config file, use the copy running-config command. In this example, running-config is copied to a file on the switch's flash drive.

switch# copy running-config flash:/cfg_06162014
Copy completed successfully.
switch#

Making Room on the Flash Drive

Determine the size of the new EOS image. Then verify that there is enough space available on the flash drive for two copies of this image, plus a recommended 240 MB (if available) for diagnostic information in case of a fatal error. Use the dir command to check the “bytes free” figure.

switch# dir flash:
Directory of flash:/
-rwx   293168526      Nov 4    22:17   eos4.11.0.swi
-rwx          36      Nov 8    10:24   boot-config
-rwx       37339      Jun 16   14:18   cfg_06162014

606638080 bytes total (602841088 bytes free)
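
The space requirement can be worked through as arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, "240 MB" is read as 240 MiB, and the new image is assumed to be the same size as the image in the listing above (the real size comes from the download itself).

```python
import re

DIAG_RESERVE_BYTES = 240 * 1024 * 1024  # assumption: "240 MB" read as 240 MiB


def bytes_free(dir_output: str) -> int:
    """Extract the 'bytes free' figure from `dir flash:` output."""
    match = re.search(r"\((\d+) bytes free\)", dir_output)
    if match is None:
        raise ValueError("no 'bytes free' figure found")
    return int(match.group(1))


def space_check(dir_output: str, image_bytes: int) -> tuple[bool, bool]:
    """Return (room for two copies, room for two copies plus diagnostics)."""
    free = bytes_free(dir_output)
    needed = 2 * image_bytes
    return free >= needed, free >= needed + DIAG_RESERVE_BYTES


sample = "606638080 bytes total (602841088 bytes free)"
print(space_check(sample, 293168526))  # (True, False)
```

With these figures, two copies need 586337052 bytes, which fits in the 602841088 bytes free, but there is no room left for the recommended diagnostic reserve.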

Verifying Connectivity

Ensure that the switch has a management interface configured with an IP address and default gateway. See Assigning a Virtual IP Address to access the Active Ethernet Management Port and Configuring a Default Route to the Gateway. Confirm that the management interface is connected by using the show interfaces status command, then confirm that the switch can be reached through the network by pinging the default gateway.

switch# show interfaces status
Port    Name     Status     Vlan       Duplex   Speed      Type
Et3/1            notconnect   1         auto    auto     1000BASE-T

<-------OUTPUT OMITTED FROM EXAMPLE-------->
Ma1/1            connected   routed     unconf   unconf    Unknown 

switch# ping 1.1.1.10
PING 1.1.1.10 (1.1.1.10) 72(100) bytes of data.
80 bytes from 1.1.1.10: icmp_seq=1 ttl=64 time=0.180 ms
80 bytes from 1.1.1.10: icmp_seq=2 ttl=64 time=0.076 ms
80 bytes from 1.1.1.10: icmp_seq=3 ttl=64 time=0.084 ms
80 bytes from 1.1.1.10: icmp_seq=4 ttl=64 time=0.073 ms
80 bytes from 1.1.1.10: icmp_seq=5 ttl=64 time=0.071 ms
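
When the ping output is captured by a script, the replies can be counted rather than read by eye. A minimal sketch, assuming reply lines follow the format shown above; the function name is hypothetical.

```python
import re


def ping_reply_count(ping_output: str) -> int:
    """Count echo-reply lines such as '80 bytes from 1.1.1.10: icmp_seq=1 ...'."""
    pattern = r"^\d+ bytes from \S+: icmp_seq=\d+"
    return len(re.findall(pattern, ping_output, flags=re.MULTILINE))


sample = """\
80 bytes from 1.1.1.10: icmp_seq=1 ttl=64 time=0.180 ms
80 bytes from 1.1.1.10: icmp_seq=2 ttl=64 time=0.076 ms
"""
print(ping_reply_count(sample))  # 2
```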

Verifying Configuration

Verify that the switch configuration is valid for SSU by using the show reload fast-boot command. If parts of the configuration are blocking execution of SSU, an error message will be displayed explaining what they are. For SSU to proceed, the configuration conflicts must be corrected before issuing the reload fast-boot command.

switch# show reload fast-boot
'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
switch#
Note: The show reload hitless and reload hitless commands can still be used, but their effect is identical to the commands shown above.
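
If the output of show reload fast-boot is collected by automation, the blocking conditions can be pulled out as a list. This is a minimal sketch that assumes the layout shown in the example above (a "cannot proceed" line followed by indented reasons); the function name is hypothetical. This section does not show the output for a clean configuration, so an empty list here means only that no list of blockers was found.

```python
def ssu_blockers(output: str) -> list[str]:
    """Return the blocking conditions listed in `show reload fast-boot` output."""
    blockers = []
    in_list = False
    for line in output.splitlines():
        if "cannot proceed due to the following" in line:
            in_list = True
        elif in_list and line.startswith((" ", "\t")) and line.strip():
            blockers.append(line.strip())  # indented lines are the reasons
        elif in_list:
            in_list = False  # first non-indented line ends the list
    return blockers


sample = """\
'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
switch#
"""
for reason in ssu_blockers(sample):
    print(reason)
```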

Configuring BGP

For hitless restart of BGP and MP-BGP, BGP graceful restart must first be enabled using the graceful-restart command. The default restart time value (300 seconds) is appropriate for most configurations.

The BGP configuration mode in which the graceful-restart command is issued determines which BGP connections will restart gracefully.

Note: Configuring BGP graceful restart resets BGP sessions. If configuring BGP graceful restart as part of the SSU process, ensure that BGP sessions are stable and all BGP routing information has been learned and advertised before proceeding with SSU.
  • For all BGP connections, use the graceful-restart command in BGP configuration mode:
    switch# config
    switch(config)# router bgp 64496
    switch(config-router-bgp)# graceful-restart
    switch(config-router-bgp)#
  • For all BGP connections in a specific VRF, use the graceful-restart command in BGP VRF configuration mode:
    switch# config
    switch(config)# router bgp 64496
    switch(config-router-bgp)# vrf purple
    switch(config-router-bgp-vrf-purple)# graceful-restart
    switch(config-router-bgp-vrf-purple)# exit
    switch(config-router-bgp)#
  • For all BGP connections in a specific BGP address family, use the graceful-restart command in BGP address-family configuration mode:
    switch# config
    switch(config)# router bgp 64496
    switch(config-router-bgp)# address-family ipv6
    switch(config-router-bgp-af)# graceful-restart
    switch(config-router-bgp-af)# exit
    switch(config-router-bgp)#
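
The three scopes above differ only in the configuration mode entered before graceful-restart. A minimal sketch of generating the matching configuration lines follows; the function name and parameters are hypothetical, and the commands are the ones shown in the examples above.

```python
from typing import Optional


def graceful_restart_config(asn: int, vrf: Optional[str] = None,
                            address_family: Optional[str] = None) -> list[str]:
    """Build the config-mode lines that enable BGP graceful restart at one scope."""
    if vrf and address_family:
        raise ValueError("this sketch handles one scope at a time")
    lines = [f"router bgp {asn}"]
    if vrf:
        lines.append(f"vrf {vrf}")                        # BGP VRF mode
    elif address_family:
        lines.append(f"address-family {address_family}")  # address-family mode
    lines.append("graceful-restart")
    return lines


print(graceful_restart_config(64496))
print(graceful_restart_config(64496, vrf="purple"))
print(graceful_restart_config(64496, address_family="ipv6"))
```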

Transfer the Image File for SSU

The target image must be copied to the file system on the switch, typically onto the flash drive. After verifying that there is space for two copies of the image plus an optional 240MB for diagnostic information, use the copy command to copy the image to the flash drive, then confirm that the new image file has been correctly transferred.

These command examples transfer an image file to the flash drive from various locations.

USB Memory

Command

copy usb1:/sourcefile flash:/destfile

Example

switch# copy usb1:/eos-4.14.4.swi flash:/eos-4.14.4.swi

FTP Server

Command

copy ftp://ftp-source/sourcefile flash:/destfile

Example

switch# copy ftp://user:password@10.0.0.3/eos-4.14.4.swi flash:/eos-4.14.4.swi

SCP

Command

copy scp://scp-source/sourcefile flash:/destfile

Example

switch# copy scp://user@10.1.1.8/user/eos-4.13.2.swi flash:/eos-4.13.2.swi

HTTP

Command

copy http://http-source/sourcefile flash:/destfile

Example

switch# copy http://10.0.0.10/eos-4.14.4.swi flash:/eos-4.14.4.swi

Once the file has been transferred, verify that it is present in the directory, then confirm the MD5 checksum using the verify command. The MD5 checksum is available from the EOS download page of the Arista website.

switch# dir flash:
Directory of flash:/
-rwx     293168526   Nov 4     22:17     eos4.14.2.swi
-rwx            36   Nov 8     10:24     boot-config
-rwx         37339   Jun 16    14:18     cfg_06162014
-rwx     394559902   May 30    02:57     eos4.13.1.swi

606638080 bytes total (208281186 bytes free)
switch# verify /md5 flash:eos-4.14.4.swi 
verify /md5 (flash:eos-4.14.4.swi) =c277a965d0ed48534de6647b12a86991
switch#
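
Comparing a 32-character checksum by eye is error-prone, so the comparison can be scripted. This is a minimal sketch that assumes the verify output has the form shown above; the function names are hypothetical, and the expected value would come from the download page.

```python
import re


def md5_from_verify(output: str) -> str:
    """Extract the MD5 digest from `verify /md5` output."""
    match = re.search(r"=\s*([0-9a-fA-F]{32})\b", output)
    if match is None:
        raise ValueError("no MD5 digest found in output")
    return match.group(1).lower()


def checksum_matches(output: str, expected: str) -> bool:
    """Compare the digest reported by the switch with the published one."""
    return md5_from_verify(output) == expected.strip().lower()


sample = "verify /md5 (flash:eos-4.14.4.swi) =c277a965d0ed48534de6647b12a86991"
print(checksum_matches(sample, "C277A965D0ED48534DE6647B12A86991"))  # True
```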

Modify boot-config

After transferring and confirming the desired image file, use the boot system command to update the boot-config file to point to the new EOS image.

This command changes the boot-config file to point to the image file located in flash memory at eos-4.14.4.swi.

switch# configure terminal
switch(config)# boot system flash:/eos-4.14.4.swi

Use the show boot-config command to verify that the boot-config file is correct:

switch(config)# show boot-config
Software image: flash:/eos-4.14.4.swi
Console speed: (not set)
Aboot password (encrypted): $1$ap1QMbmz$DTqsFYeauuMSa7/Qxbi2l1
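
The check of the boot target can also be scripted. A minimal sketch, assuming the "Software image:" line appears as in the output above; the function name is hypothetical.

```python
from typing import Optional


def configured_image(show_boot_config_output: str) -> Optional[str]:
    """Return the value of the 'Software image' line, or None if it is absent."""
    for line in show_boot_config_output.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip() == "Software image":
            return value.strip()
    return None


sample = """\
Software image: flash:/eos-4.14.4.swi
Console speed: (not set)
"""
print(configured_image(sample) == "flash:/eos-4.14.4.swi")  # True
```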

Save the configuration to the startup-config file with the write command.

switch# write

Start the SSU Process

After updating the boot-config file, verify that your configuration supports SSU (if you have not already done so) by using the show reload fast-boot command. If parts of the configuration are blocking execution of SSU, an error message will be displayed explaining what they are.

switch# show reload fast-boot
'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
switch#

Then start the SSU process using the reload fast-boot command to reload the switch and activate the new image. The CLI will identify any changes that must be made to the configuration before starting SSU, prompt to save any modifications to the system configuration, and request confirmation before reloading.

switch# reload fast-boot
System configuration has been modified. Save? [yes/no/cancel/diff]:y
Copy completed successfully.
Proceed with reload? [confirm]y
Note: The show reload hitless and reload hitless commands can also be used, but their effect is identical to the commands shown above.

Verify Success of the Upgrade

Before making any configuration changes to the switch after reload, verify that the SSU process is complete using the command show boot stages log. If the process is complete, the last message should be “Hitless boot stages complete.”

switch# show boot stages log
Timestamp           Delta Begin Msg
2022-10-03 12:42:06 000.000000 Asu Hitless boot stages started
2022-10-03 12:42:06 000.001592 stage CriticalAgent started
2022-10-03 12:42:06 000.001834   event CriticalAgent:PhyEthtool completed

[ . . . ]

2022-10-03 12:43:02 056.316874 stage BootSanityCheck is complete
2022-10-03 12:43:02 056.317491 Asu Hitless boot stages complete
switch#
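
The completion check on the boot stages log can be expressed as a small test on the last timestamped entry. This is a minimal sketch that assumes log entries begin with a date, as in the output above; the function name is hypothetical.

```python
def boot_stages_complete(log_output: str) -> bool:
    """True when the last timestamped entry reports completed hitless boot stages."""
    entries = [line.strip() for line in log_output.splitlines()
               if line.strip() and line.strip()[0].isdigit()]
    return bool(entries) and entries[-1].endswith("Hitless boot stages complete")


sample = """\
Timestamp           Delta Begin Msg
2022-10-03 12:42:06 000.000000 Asu Hitless boot stages started
2022-10-03 12:43:02 056.316874 stage BootSanityCheck is complete
2022-10-03 12:43:02 056.317491 Asu Hitless boot stages complete
switch#
"""
print(boot_stages_complete(sample))  # True
```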

Completion of the SSU process may also be verified by checking the syslog for the following message:

LAUNCHER-6-BOOT_STATUS: 'reload fast-boot' reconciliation complete

To verify whether the SSU upgrade was successful, use the show reload cause command. If a fatal error occurred during the upgrade process, the switch will have rebooted completely, and the output displays the fatal error along with the directory in which diagnostic information can be found. If the SSU upgrade succeeded, the reload cause reads “Hitless reload requested by the user.”

Fatal Error Display

switch# show reload cause
Reload Cause 1:
-------------------
Fatal error occurred during Asu Hitless boot. (stageMgr - LinkStatusUpdate timed out)

Reload Time:
------------
Reload occurred at Sun Oct 02 12:06:37 2022 PDT.

Recommended Action:
-------------------
The system rebooted due to a fatal error.
If the problem persists, contact your customer support representative.

Debugging Information:
-------------------------------
/mnt/flash/persist/fatalError-2022-10-02_120637
switch#

Successful Upgrade Display

switch# show reload cause
Reload Cause 1:
-------------------
Hitless reload requested by the user.

Reload Time:
------------
Reload occurred at Mon Oct 03 13:29:31 2022 PDT.

Recommended Action:
-------------------
No action necessary.

Debugging Information:
-------------------------------
None available.
switch#
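
The two displays above can be told apart programmatically. This is a minimal sketch that assumes the message wording shown in these examples; the function name and the returned labels are hypothetical.

```python
from typing import Optional


def classify_reload_cause(output: str) -> tuple[str, Optional[str]]:
    """Return ('success' | 'fatal' | 'other', debug directory if one is listed)."""
    if "Hitless reload requested by the user" in output:
        return "success", None
    if "Fatal error occurred" in output:
        # The debugging directory is printed as an absolute path on its own line.
        debug_dir = next((line.strip() for line in output.splitlines()
                          if line.strip().startswith("/")), None)
        return "fatal", debug_dir
    return "other", None


fatal_sample = """\
Reload Cause 1:
-------------------
Fatal error occurred during Asu Hitless boot. (stageMgr - LinkStatusUpdate timed out)

Debugging Information:
-------------------------------
/mnt/flash/persist/fatalError-2022-10-02_120637
"""
print(classify_reload_cause(fatal_sample))
```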

The show version command confirms whether the correct image is loaded. The Software image version line displays the version of the active image file.

switch# show version
Arista DCS-7050QX-32-F
Hardware version: 02.00
Serial number: JPE14071098
System MAC address: 001c.7355.556f
Software image version: 4.14.5F-2353054.eos4145F
Architecture: i386
Internal build version: 4.14.5F-2353054.eos4145F
Internal build ID: e8748ea7-916d-4217-878f-4bfe2adc7122
Uptime: 4 minutes
Total memory: 3981328 kB
Free memory: 1342408 kB
switch#
Note: If a fatal error occurs during the SSU process, the new EOS image will still be loaded and booted.
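
The running version can be extracted from the show version output and compared with the intended release. A minimal sketch, assuming the "Software image version:" line appears as in the output above; the function name is hypothetical, and the prefix comparison is only one possible way to match a release string.

```python
import re
from typing import Optional


def running_version(show_version_output: str) -> Optional[str]:
    """Return the value of the 'Software image version' line, if present."""
    match = re.search(r"^Software image version:\s*(\S+)",
                      show_version_output, flags=re.MULTILINE)
    return match.group(1) if match else None


sample = """\
Hardware version: 02.00
Software image version: 4.14.5F-2353054.eos4145F
Architecture: i386
"""
version = running_version(sample)
print(version is not None and version.startswith("4.14.5F"))  # True
```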