diff --git a/doc/BGP/BGP-router-id.md b/doc/BGP/BGP-router-id.md
new file mode 100644
index 0000000000..d4ed83027b
--- /dev/null
+++ b/doc/BGP/BGP-router-id.md
@@ -0,0 +1,172 @@
+# BGP Router ID Explicitly Configured
+
+- [Revision](#revision)
+- [Definitions/Abbreviations](#definitionsabbreviations)
+- [Scope](#scope)
+- [Overview](#overview)
+- [Requirements](#requirements)
+- [High Level Design](#high-level-design)
+- [Config DB Enhancement](#config-db-enhancement)
+  - [DEVICE_METADATA](#device_metadata)
+
+### Revision
+
+| Revision | Date        | Author                | Change Description |
+| -------- | ----------- | --------------------- | ------------------ |
+| 1.0      | Mar 27 2024 | Yaqiang Zhu, Jing Kan | Initial proposal   |
+
+### Definitions/Abbreviations
+
+| Definitions/Abbreviation | Description |
+| ------------------------ | ----------- |
+| FRR | A free and open source Internet routing protocol suite for Linux and Unix platforms |
+| BGP Router ID | 32-bit value that uniquely identifies a BGP device |
+| AS | Autonomous System |
+| iBGP | Internal Border Gateway Protocol, which is used inside an autonomous system |
+| eBGP | External Border Gateway Protocol, which is used between autonomous systems |
+
+### Scope
+
+This document describes a mechanism that allows users to explicitly configure the BGP router id.
+
+### Overview
+
+Currently, some BGP behavior is hard-coded in SONiC:
+1. The BGP router id is a 32-bit value that uniquely identifies a BGP device. On single-asic devices, SONiC uses the Loopback0 IPv4 address as the BGP router id. On multi-asic devices, SONiC uses the Loopback4096 IPv4 address as the BGP router id (for both iBGP and eBGP). This coupling prevents users from using a customized router id. If the IPv4 address of Loopback0 / Loopback4096 doesn't exist, the BGP router id is not configured, and FRR chooses the largest IP address on the device as the BGP router id. If the router id chosen by FRR is not unique, it is considered an error.
+2. On single-asic devices, SONiC doesn't add BGP peers when no Loopback0 IPv4 address exists. On multi-asic devices, SONiC doesn't add eBGP peers when no Loopback0 IPv4 address exists.
+
+Below is the current workflow for BGP and router id on single-asic devices; it only covers the parts related to Loopback0.
+
+1. After the bgp container starts, the configuration file `/etc/frr/bgpd.conf` for bgpd is rendered. It uses the Loopback0 IPv4 address as the BGP router id; if that address doesn't exist, the BGP router id is not specified.
+2. bgpd starts with the rendered configuration. If the BGP router id is not specified, bgpd chooses an IP address on the device as the BGP router id.
+3. After bgpcfgd starts, it adds BGP peers depending on whether the Loopback0 IPv4 address exists. If the Loopback0 IPv4 address doesn't exist, it stops processing neighbors and returns with a False signal.
+

+Figure: Origin bgp seq (`doc/BGP/img/origin_bgp_seq.png`)
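
The current single-asic workflow above can be condensed into a few lines of Python. This is only an illustrative sketch: the function names and the plain-argument interface are assumptions made for this example and do not mirror the actual template or bgpcfgd code.

```python
from typing import Optional


def current_router_id(loopback0_ipv4: Optional[str]) -> Optional[str]:
    """Router id rendered into bgpd.conf today (hypothetical helper).

    Returning None means no router id is written, so FRR picks one itself.
    """
    return loopback0_ipv4


def current_can_add_peers(loopback0_ipv4: Optional[str]) -> bool:
    """Today, neighbors are only processed when Loopback0 has an IPv4 address."""
    return loopback0_ipv4 is not None
```

With no Loopback0 IPv4 address, `current_router_id(None)` yields `None` and `current_can_add_peers(None)` yields `False`, which is the coupling this document sets out to relax.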

+
+Below is the current workflow for BGP and router id on multi-asic devices; it only covers the parts related to Loopback0 and Loopback4096. Note that, due to space limitations, the figure below only describes the behavior of one asic in a multi-asic system. The other asics behave similarly, except that each starts its own bgp\[x\] container and reads its own config_db.
+
+1. After the bgp container of each asic starts, the configuration file `/etc/frr/bgpd.conf` for bgpd is rendered. It uses the Loopback4096 IPv4 address configured in the corresponding config_db as the BGP router id; if that address doesn't exist, the BGP router id is not specified.
+2. bgpd starts with the rendered configuration. If the BGP router id is not specified, bgpd chooses an IP address on the device as the BGP router id.
+3. After bgpcfgd starts, it adds BGP peers depending on whether the Loopback0 and Loopback4096 IPv4 addresses exist:
+    1. If the Loopback0 IPv4 address doesn't exist, stop processing BGP neighbors and return with a False signal.
+    2. Else
+        1. If the Loopback4096 IPv4 address exists, add the iBGP peer; else stop processing iBGP neighbors and return with a False signal.
+        2. If the current asic is FrontEnd, add the eBGP peer.
+

+Figure: Origin bgp seq multi asic (`doc/BGP/img/origin_bgp_seq_multi_asic.png`)
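
The current multi-asic peer decision can be sketched per peer type as follows. The function name, the `"iBGP"`/`"eBGP"` strings, and the per-peer framing are assumptions made for illustration; the sketch also ignores the exact order in which bgpcfgd processes neighbors.

```python
from typing import Optional


def current_can_add_peer(peer_type: str,
                         loopback0_ipv4: Optional[str],
                         loopback4096_ipv4: Optional[str],
                         is_frontend: bool) -> bool:
    """Hypothetical model of today's multi-asic rules for one asic."""
    # Without a Loopback0 IPv4 address no peer of any kind is added.
    if loopback0_ipv4 is None:
        return False
    if peer_type == "iBGP":
        # Internal peers additionally require a Loopback4096 IPv4 address.
        return loopback4096_ipv4 is not None
    if peer_type == "eBGP":
        # External peers are only added on FrontEnd asics.
        return is_frontend
    raise ValueError(f"unknown peer type: {peer_type!r}")
```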

+
+### Requirements
+
+Add support for users to explicitly configure the BGP router id.
+
+### High Level Design
+
+The enhancement has two aspects:
+
+1. Add a field `bgp_router_id` to `CONFIG_DB["DEVICE_METADATA"]["localhost"]` to support explicitly configuring the BGP router id. On multi-asic devices, this configuration is added to the corresponding config_db of each asic. If `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` is configured, it is always used as the BGP router id. With this change, the BGP router id is chosen as shown in the following table. Note that when bgp_router_id is not configured, the behavior is exactly the same as before.
+
+| | Loopback0/Loopback4096 IPv4 address exists | Loopback0/Loopback4096 IPv4 address doesn't exist |
+|--------------|-------|------------|
+| bgp_router_id configured | Honor bgp_router_id | Honor bgp_router_id |
+| bgp_router_id not configured | Honor Loopback0/Loopback4096 IPv4 address | The FRR default router ID is selected as the largest IP address of the device. When router zebra is not enabled, bgpd can’t get interface information, so the router-id is set to 0.0.0.0 |
+
+2. In the single-asic scenario, remove the strong dependency on the Loopback0 IPv4 address when adding BGP peers if bgp_router_id is configured. With this change, BGP peers are added as shown in the following table. Two points need to be clarified:
+    1. When bgp_router_id is not configured, the behavior is exactly the same as before.
+    2. The logic for adding internal peers is not modified; it still treats the Loopback4096 IPv4 address as required.
+
+| | Loopback0 IPv4 address exists | Loopback0 IPv4 address doesn't exist |
+|--------------|-------|------------|
+| bgp_router_id configured | Add BGP peer | Add BGP peer |
+| bgp_router_id not configured | Add BGP peer | Do not add BGP peer |
+
+#### Single-asic
+
+Below is the new workflow for single-asic devices; the main changes are in steps `1.` and `3.`.
+
+1. After the bgp container starts, the configuration file `/etc/frr/bgpd.conf` for bgpd is rendered.
+    * If `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` exists, use it as the BGP router id.
+    * Else if the Loopback0 IPv4 address exists, use it as the BGP router id.
+    * Else, the BGP router id is not specified. Note that this scenario is out of scope for the current HLD; the behavior when the router id isn't specified is exactly the same as before.
+2. bgpd starts with the rendered configuration. If the router id is not specified, bgpd chooses an IP address on the device as the router id, and BGP cannot work if that router id is not unique in the network.
+3. After bgpcfgd starts, it adds BGP peers based on the configuration.
+    * If the Loopback0 IPv4 address exists, continue to add BGP peers.
+    * Else if `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` exists, continue to add BGP peers.
+    * Else, stop processing neighbors and return with a False signal.
+

+Figure: New bgp seq (`doc/BGP/img/new_bgp_seq.png`)
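
The new single-asic selection order can be sketched in Python as below. The helper names and the idea of passing CONFIG_DB as a plain nested dict are assumptions for illustration only; the real change lives in the bgpd.conf template and in bgpcfgd.

```python
from typing import Optional


def configured_router_id(config_db: dict) -> Optional[str]:
    """Read DEVICE_METADATA|localhost|bgp_router_id from a dict-shaped CONFIG_DB."""
    localhost = config_db.get("DEVICE_METADATA", {}).get("localhost", {})
    return localhost.get("bgp_router_id") or None


def select_router_id(config_db: dict,
                     loopback0_ipv4: Optional[str]) -> Optional[str]:
    """Step 1: bgp_router_id wins, then Loopback0, else leave it to FRR (None)."""
    return configured_router_id(config_db) or loopback0_ipv4


def can_add_peers(config_db: dict, loopback0_ipv4: Optional[str]) -> bool:
    """Step 3: peers are added if Loopback0 IPv4 or bgp_router_id exists."""
    return loopback0_ipv4 is not None or configured_router_id(config_db) is not None
```

For example, with `{"DEVICE_METADATA": {"localhost": {"bgp_router_id": "10.1.0.32"}}}` and no Loopback0 address, the sketch selects `10.1.0.32` and still allows peers to be added, matching the tables in the High Level Design section.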

+
+#### Multi-asic
+
+Below is the new workflow for multi-asic devices; the main changes are in steps `1.` and `3.`. Note that, due to space limitations, the figure below only describes the behavior of one asic in a multi-asic system. The other asics behave similarly, except that each starts its own bgp\[x\] container and reads its own config_db.
+
+1. After the bgp0 container starts, the configuration file `/etc/frr/bgpd.conf` for bgpd is rendered.
+    * If `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` exists, use it as the BGP router id.
+    * Else if the Loopback4096 IPv4 address exists, use it as the BGP router id.
+    * Else, the BGP router id is not specified. Note that this scenario is out of scope for the current HLD; the behavior when the router id isn't specified is exactly the same as before.
+2. bgpd starts with the rendered configuration. If the router id is not specified, bgpd chooses an IP address on the device as the router id, and BGP cannot work if that router id is not unique in the network.
+3. After bgpcfgd starts, it adds BGP peers based on the configuration.
+    * If the current asic is FrontEnd
+        * If the Loopback0 IPv4 address exists or bgp_router_id is configured
+            * Add the eBGP peer.
+            * If the Loopback4096 IPv4 address exists, add the iBGP peer.
+        * Else, stop processing neighbors and return with a False signal.
+    * Else if the current asic is BackEnd
+        * If the Loopback0 IPv4 address exists or bgp_router_id is configured
+            * If the Loopback4096 IPv4 address exists, add the iBGP peer.
+            * Else, stop processing iBGP neighbors and return with a False signal.
+        * Else, stop processing neighbors and return with a False signal.
+

+Figure: New bgp seq multi asic (`doc/BGP/img/new_bgp_seq_multi_asic.png`)
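
The new multi-asic rules for one asic can be sketched as a function that returns which peer types get added. The function names, argument names, and the `"iBGP"`/`"eBGP"` labels are hypothetical and only serve to illustrate the decision tree above.

```python
from typing import Optional, Set


def select_router_id_multi_asic(bgp_router_id: Optional[str],
                                loopback4096_ipv4: Optional[str]) -> Optional[str]:
    """Step 1: bgp_router_id wins, then Loopback4096, else leave it to FRR (None)."""
    return bgp_router_id or loopback4096_ipv4


def peers_to_add(bgp_router_id: Optional[str],
                 loopback0_ipv4: Optional[str],
                 loopback4096_ipv4: Optional[str],
                 is_frontend: bool) -> Set[str]:
    """Step 3: peer types added on one asic (FrontEnd or BackEnd)."""
    # Neither Loopback0 IPv4 nor bgp_router_id: stop, nothing is added.
    if not (loopback0_ipv4 or bgp_router_id):
        return set()
    peers = set()
    if is_frontend:
        peers.add("eBGP")  # eBGP peers exist only on FrontEnd asics
    if loopback4096_ipv4:
        peers.add("iBGP")  # iBGP still requires Loopback4096 IPv4
    return peers
```

The sketch keeps Loopback4096 as a hard requirement for internal peers, in line with the statement that the logic for adding internal peers is not modified.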

+
+### Config DB Enhancement
+
+#### DEVICE_METADATA
+
+**Configuration schema in ABNF format:**
+
+```abnf
+; DEVICE_METADATA table
+key = DEVICE_METADATA|localhost ; Device metadata configuration table
+; field = value
+bgp_router_id = inet:ipv4-address ; Customized BGP router id
+```
+
+**Sample of CONFIG DB snippet:**
+
+```json
+{
+    "DEVICE_METADATA": {
+        "localhost": {
+            "bgp_router_id": "10.1.0.32"
+        }
+    }
+}
+```
+
+**Snippet of `sonic-device_metadata.yang`:**
+
+```
+module sonic-device_metadata {
+    container sonic-device_metadata {
+        container DEVICE_METADATA {
+            container localhost {
+                leaf bgp_router_id {
+                    type inet:ipv4-address;
+                }
+            }
+            /* end of container localhost */
+        }
+        /* end of container DEVICE_METADATA */
+    }
+    /* end of top level container */
+}
+/* end of module sonic-device_metadata */
+```
diff --git a/doc/BGP/img/new_bgp_seq.png b/doc/BGP/img/new_bgp_seq.png
new file mode 100755
index 0000000000..893b2d9c36
Binary files /dev/null and b/doc/BGP/img/new_bgp_seq.png differ
diff --git a/doc/BGP/img/new_bgp_seq_multi_asic.png b/doc/BGP/img/new_bgp_seq_multi_asic.png
new file mode 100755
index 0000000000..0a05961a54
Binary files /dev/null and b/doc/BGP/img/new_bgp_seq_multi_asic.png differ
diff --git a/doc/BGP/img/origin_bgp_seq.png b/doc/BGP/img/origin_bgp_seq.png
new file mode 100755
index 0000000000..5dd5ef3fba
Binary files /dev/null and b/doc/BGP/img/origin_bgp_seq.png differ
diff --git a/doc/BGP/img/origin_bgp_seq_multi_asic.png b/doc/BGP/img/origin_bgp_seq_multi_asic.png
new file mode 100755
index 0000000000..407ca218f9
Binary files /dev/null and b/doc/BGP/img/origin_bgp_seq_multi_asic.png differ
diff --git a/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md b/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md
index 9b01b0d618..349a8d3a0c 100644
--- a/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md
+++ b/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md
@@ -88,6 +88,7 @@
       - [3.2.2.1 Configuration Commands](#3221-configuration-commands)
       - [3.2.2.2 Show Commands](#3222-show-commands)
       - [3.2.2.3 Debug Commands](#3223-debug-commands)
+    + [3.2.3 Multi ASIC support](#323-multi-asics-support)
 - [4 Flow Diagrams](#4-flow-diagrams)
 - [5 Error Handling](#5-error-handling)
 - [6 Serviceability and Debug](#6-serviceability-and-debug)
@@ -872,6 +873,138 @@ show rollback log [exec | verify | status]
 Use the *verbose* option to view additional details while executing the different commands.
+### 3.2.3 Multi ASICs Support
+
+The initial design of the SONiC Generic Configuration Update and Rollback feature primarily focuses on single-ASIC platforms. To cater to the needs of Multi-ASIC platforms, this section introduces enhancements to support configuration updates and rollbacks across multiple ASICs and the host namespace.
+
+#### 3.2.3.1 Overview
+
+In Multi-ASIC SONiC platforms, configurations can be applied either globally, affecting all ASICs, or individually, targeting a specific ASIC or the host. The configuration management tools must, therefore, be capable of identifying and applying configurations based on their intended scope.
+
+#### 3.2.3.2 Namespace-aware Configuration Management
+
+The SONiC utilities for configuration management (apply-patch, checkpoint, and rollback) will be enhanced to become namespace-aware. These utilities will determine the target namespace for each operation from the configuration patch itself or from user input for operations that involve checkpoints and rollbacks.
+
+#### 3.2.3.3 JSON Patch Format Extension
+
+The JSON Patch format will be extended to include the namespace identifier in each operation's path. Paths for operations on the host namespace will be marked with a "localhost" identifier, while those intended for a specific ASIC will include an "asicN" identifier, where N denotes the ASIC number.
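
As a rough illustration of this path convention, a tool could split an extended patch into one standard JSON Patch per namespace. The sketch below is an assumption-laden example, not the sonic-utilities implementation: `split_patch_by_namespace` and `NAMESPACE_RE` are hypothetical names, and the `from` field of `move`/`copy` operations is not handled.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Namespace identifiers described above: "localhost" or "asicN".
NAMESPACE_RE = re.compile(r"localhost|asic\d+")


def split_patch_by_namespace(patch: List[dict]) -> Dict[str, List[dict]]:
    """Group operations by namespace and strip the namespace from each path."""
    per_namespace = defaultdict(list)
    for op in patch:
        # "/asic0/TABLE/key" -> ["", "asic0", "TABLE/key"]
        parts = op["path"].split("/", 2)
        if len(parts) < 2 or parts[0] != "" or not NAMESPACE_RE.fullmatch(parts[1]):
            raise ValueError(f"path has no namespace prefix: {op['path']!r}")
        local_path = "/" + parts[2] if len(parts) > 2 else ""
        per_namespace[parts[1]].append({**op, "path": local_path})
    return dict(per_namespace)
```

Each resulting list is an ordinary JSON Patch that can be applied to the configuration database of its namespace.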
+
+```json
+[
+    {
+        "op": "add",
+        "path": "/asic0/PORTCHANNEL/PortChannel102/admin_status",
+        "value": "down"
+    },
+    {
+        "op": "replace",
+        "path": "/localhost/BGP_DEVICE_GLOBAL/STATE/tsa_enabled",
+        "value": "true"
+    },
+    {
+        "op": "replace",
+        "path": "/asic0/BGP_DEVICE_GLOBAL/STATE/tsa_enabled",
+        "value": "true"
+    },
+    {
+        "op": "replace",
+        "path": "/asic1/BGP_DEVICE_GLOBAL/STATE/tsa_enabled",
+        "value": "true"
+    }
+]
+```
+
+#### 3.2.3.4 Applying Configuration Changes
+
+When applying a configuration patch, the system will:
+
+1. Parse the extended JSON Patch to identify the target namespaces.
+2. Apply the operations intended for the host namespace directly to the host's configuration database.
+3. Apply ASIC-specific operations to the respective ASIC's configuration database.
+
+#### 3.2.3.5 Checkpoints and Rollbacks
+
+Checkpoint and rollback operations will be enhanced to support Multi-ASIC platforms by:
+
+- Capturing the configuration state of all ASICs and the host namespace when creating a checkpoint.
+- Allowing rollbacks to restore the configuration state of all ASICs and the host namespace to a specific checkpoint.
+
+#### 3.2.3.6 Implementation Details
+
+The extension of the SONiC Generic Configuration Update and Rollback feature to support Multi-ASIC platforms enhances the flexibility and manageability of SONiC deployments in complex environments. By introducing namespace-aware configuration management, SONiC can efficiently handle the intricacies of Multi-ASIC platforms, ensuring smooth and reliable operation.
+
+- Namespace-aware Utilities: Update the SONiC configuration utilities to handle namespace identifiers in the configuration patches and command-line options for specifying target namespaces for checkpoints and rollbacks.
+- Validation and Verification: Extend the configuration validation and verification mechanisms to cover Multi-ASIC scenarios, ensuring that configurations are valid and consistent across all ASICs and the host.
+- CLI Enhancements: Introduce new CLI options to specify target namespaces for configuration operations, and to manage checkpoints and rollbacks in a Multi-ASIC environment.
+- Testing: Develop comprehensive test cases to cover Multi-ASIC configuration updates, including scenarios that involve simultaneous updates to multiple ASICs and the host.
+
+| Pull Request | Description |
+| --------- | ----------- |
+|[Add Multi ASIC support for apply-patch](https://github.com/sonic-net/sonic-utilities/pull/3249)|1. Categorize configuration as JSON patch format per ASIC.<br>2. Apply patch per ASIC, including localhost. |
+|[Add Multi ASIC support for Checkpoint and Rollback](https://github.com/sonic-net/sonic-utilities)|To be implemented|
+
+#### 3.2.3.7 Enhancement
+
+Because applying patches or performing other actions on multiple ASICs can be time-consuming, we are introducing the `-p` option to speed up the process. This option assumes that each ASIC functions independently.
+
+| Pull Request | Description |
+| --------- | ----------- |
+|[Add Multi ASIC support for parallel option](https://github.com/sonic-net/sonic-utilities)|To be implemented|
+
+1. apply-patch
+    ```python
+    @config.command('apply-patch')
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='apply the change to all ASICs in parallel')
+    ...
+    def apply_patch(ctx, patch_file_path, format, dry_run, parallel, ignore_non_yang_tables, ignore_path, verbose):
+        ...
+    ```
+2. checkpoint
+    ```python
+    @config.command()
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='take the checkpoint on all ASICs in parallel')
+    ...
+    # click passes the flag as a keyword argument, so the callback must accept `parallel`
+    def checkpoint(ctx, checkpoint_name, parallel, verbose):
+        ...
+    ```
+3. replace
+    ```python
+    @config.command()
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='replace the configuration on all ASICs in parallel')
+    ...
+    def replace(ctx, target_file_path, format, dry_run, parallel, ignore_non_yang_tables, ignore_path, verbose):
+        ...
+    ```
+4. rollback
+    ```python
+    @config.command()
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='roll back the change on all ASICs in parallel')
+    ...
+    def rollback(ctx, checkpoint_name, dry_run, parallel, ignore_non_yang_tables, ignore_path, verbose):
+        ...
+    ```
+5. list_checkpoints
+    ```python
+    @config.command()
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='list the checkpoints of all ASICs in parallel')
+    ...
+    def list_checkpoints(ctx, checkpoint_name, dry_run, parallel, ignore_non_yang_tables, ignore_path, verbose):
+        ...
+    ```
+6. delete_checkpoint
+    ```python
+    @config.command()
+    ...
+    @click.option('-p', '--parallel', is_flag=True, default=False, help='delete the checkpoint on all ASICs in parallel')
+    ...
+    def delete_checkpoint(ctx, checkpoint_name, dry_run, parallel, ignore_non_yang_tables, ignore_path, verbose):
+        ...
+    ```
+
 # 4 Flow Diagrams

 # 5 Error Handling
@@ -909,6 +1042,16 @@ N/A
 | 14 | Dynamic port breakout as described [here](https://github.com/sonic-net/SONiC/blob/master/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md).|
 | 15 | Remove an item that has a default value. |
 | 16 | Modifying items that rely depends on each other based on a `must` condition rather than direct connection such as `leafref` e.g. /CRM/acl_counter_high_threshold (check [here](https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-crm.yang)). |
+| 17 | Add a new ASIC config subtree. |
+| 18 | Add a new ASIC with empty config. |
+| 19 | Independent Patch Application: Apply configuration patches independently to each ASIC without any coordination between them. Verify that each ASIC updates according to its patch and that there are no discrepancies in configurations that might affect system operations. |
+| 20 | Simultaneous Patch Application: Apply configuration patches to all ASICs simultaneously to ensure that simultaneous updates do not cause conflicts or failures. This test checks the system’s ability to handle concurrent configuration changes across multiple independent units. |
+| 21 | Sequential Patch Application: Apply configuration patches to ASICs in a controlled sequence, one after the other. This test aims to check if the order of patch application affects the final system configuration, especially when configurations might not directly interact but could have cumulative effects. |
+| 22 | Patch Rollback Capability: After applying patches, initiate a rollback to previous configurations for each ASIC independently. Verify that each ASIC can revert to its previous state accurately and that the rollback process does not introduce new issues. |
+| 23 | Conditional Patch Application: Apply patches based on conditional checks within each ASIC’s configuration (e.g., only apply a patch if the current firmware version is below a certain level). This test ensures that conditions within patches are evaluated correctly and that the patch is applied only when the conditions are met. |
+| 24 | Cross-ASIC Dependency Verification: While each ASIC operates independently, this test involves applying patches that could potentially have indirect impacts on other ASICs through shared resources or network topology changes. Validate that changes in one ASIC do not adversely affect others. |
+| 25 | Patch Compatibility and Conflict Resolution: Apply patches that introduce changes conflicting with existing configurations across ASICs. This test examines how the system identifies and resolves conflicts, ensuring that the most critical settings are preserved and that any issues are clearly reported. |
+| 26 | Performance Impact Assessment: Measure system performance before and after patch application to determine the impact of configuration changes. This includes monitoring processing speed, memory usage, and network latency to ensure that performance remains within acceptable parameters. |
 ## 9.2 Unit Tests for Checkpoint
 | Test Case | Description |
diff --git a/doc/dash/dash-sonic-kvm.md b/doc/dash/dash-sonic-kvm.md
new file mode 100644
index 0000000000..bd5237060b
--- /dev/null
+++ b/doc/dash/dash-sonic-kvm.md
@@ -0,0 +1,89 @@
+
+# DASH SONiC KVM
+
+
+# Table of contents
+
+- [1 Motivation](#1-motivation)
+- [2 Architecture](#2-architecture)
+- [3 Modules](#3-modules)
+  - [3.1 BMv2 (dataplane engine)](#31-bmv2-dataplane-engine)
+  - [3.2 Dataplane APP](#32-dataplane-app)
+  - [3.3 SAIRedis](#33-sairedis)
+  - [3.4 SWSS](#34-swss)
+  - [3.5 GNMI/APP DB](#35-gnmiapp-db)
+  - [3.6 Other SONiC Services](#36-other-sonic-services)
+- [4 Dataflow](#4-dataflow)
+  - [4.1 Data plane](#41-data-plane)
+  - [4.2 Control plane](#42-control-plane)
+
+# 1 Motivation
+
+1. Provide a Proof of Concept (POC) for development and collaboration. Following the typical SONiC workflow, we can use this virtual switch image to build a virtual testbed, eliminating the need for a complete hardware device. This virtual DPU image enables the creation of a mixed hardware-software testbed or a software-only testbed, applicable to both the control plane and the data plane.
+2. Enable Continuous Integration (CI) via Azure Pipelines (Azp) for SONiC repositories such as sonic-buildimage and sonic-swss.
+
+# 2 Architecture
+
+![BMv2 virtual SONiC](../../images/dash/bmv2-virtual-sonic.svg)
+
+# 3 Modules
+
+## 3.1 BMv2 (dataplane engine)
+
+This component is the original P4 BMv2 container image. It serves as the data plane implementation, a role usually filled by hardware.
+It attaches three types of interfaces: system port (Ethernet), line port (eth), and DPDK port (CPU).
+- Ethernet is used as the system port. Protocol services like BGP and LLDP perform send/receive operations on these interfaces.
+- eth is used as the line port. These are native interfaces in KVM for communication between the inside and outside. The eth ports and Ethernet ports map one-to-one.
+- CPU is used for the DPDK port. The dataplane APP directly manipulates the traffic on these ports.
+
+## 3.2 Dataplane APP
+
+Because P4 and BMv2 have limitations in areas such as flow creation and flow resimulation, the dataplane app module in this virtual DPU is built on the VPP framework and uses the CPU interface to extend the dataplane engine with these extra functions. The dataplane APP also loads the generated shared library, saidash, which communicates with BMv2 via gRPC. SAI APIs that DASH/DPU SONiC does not use (e.g. DTEL) are mocked, as long as SWSS still works. Additionally, this component connects with sairedis through a shim SAI agent (dashsai server - remote dashsai client).
+
+We will have a dedicated doc on the data plane app for more details.
+
+## 3.3 SAIRedis
+
+In the original virtual SONiC, SAIRedis loads saivs. In the new SONiC DASH virtual DPU, it loads the remote dashsai client mentioned above instead.
+
+## 3.4 SWSS
+
+The SWSS on this virtual DPU is almost the same as the one used in the physical DPU. We don't need to make any special changes to it.
+
+## 3.5 GNMI/APP DB
+
+The GNMI and APP DB are identical to those on the physical device. However, this virtual image supports two modes: DPU mode and single device mode. The details of these two modes are described in the following section.
+
+## 3.6 Other SONiC Services
+
+We plan to keep the other services, such as BGP and LLDP, so that the KVM runs the same way SONiC runs on a real DPU.
+
+# 4 Dataflow
+## 4.1 Data plane
+
+All data plane traffic enters the BMv2 simple switch and is forwarded to the target port based on the P4 logic loaded on BMv2.
+
+Here is an example of the data plane:
+```mermaid
+graph TD
+
+%% LLDP packet
+    eth1 --> packet_dispatcher{"Packet dispatcher"}
+    packet_dispatcher -->|LLDP| Ethernet0;
+    Ethernet0 --> lldp_process["LLDP process"];
+
+%% Normal VNET packet
+    packet_dispatcher -->|DASH| dash_pipeline{"DASH Pipeline"}
+    dash_pipeline -->|VNet| eth2;
+
+%% TCP SYN packet
+    dash_pipeline -->|"TCP SYN"| cpu0_in[CPU0];
+    cpu0_in[CPU0] --> dataplane_app["Dataplane APP"];
+    dataplane_app["Dataplane APP"] --> cpu0_out[CPU0];
+    cpu0_out[CPU0] --> dash_pipeline
+```
+
+## 4.2 Control plane
+
+In the physical SmartSwitch, configuration is forwarded via GNMI in the NPU. Accordingly, in the virtual SONiC environment, the SWSS module can receive configuration from an external GNMI service through the management port, eth-midplane. In single device mode, however, the GNMI service can also run within the KVM and forward the configuration directly to the local SWSS module.
+
diff --git a/doc/SONiC_201911_Release_Notes.md b/doc/release-notes/SONiC_201911_Release_Notes.md
similarity index 100%
rename from doc/SONiC_201911_Release_Notes.md
rename to doc/release-notes/SONiC_201911_Release_Notes.md
diff --git a/doc/SONiC_202006_Release_Notes.md b/doc/release-notes/SONiC_202006_Release_Notes.md
similarity index 98%
rename from doc/SONiC_202006_Release_Notes.md
rename to doc/release-notes/SONiC_202006_Release_Notes.md
index 7341afa69c..9cf1f4ec5d 100644
--- a/doc/SONiC_202006_Release_Notes.md
+++ b/doc/release-notes/SONiC_202006_Release_Notes.md
@@ -1,145 +1,145 @@
-# SONiC 202006 Release Notes
-
-This document captures the new features added and enhancements done on existing features/sub-features for the SONiC 202006 release.
- - - -# Table of Contents - - * [Branch and Image Location](#branch-and-image-location) - * [Dependency Version](#dependency-version) - * [Security Updates](#security-updates) - * [Feature List](#feature-list) - * [SAI APIs](#sai-apis) - * [Contributors](#contributors) - - -# Branch and Image Location - -Branch : https://github.com/sonic-net/sonic-buildimage/tree/202006
-Image : https://sonic-jenkins.westus2.cloudapp.azure.com/ (Example - Image for Broadcom based platforms is [here]( https://sonic-jenkins.westus2.cloudapp.azure.com/job/broadcom/job/buildimage-brcm-202006/lastSuccessfulBuild/artifact/target/)) - -# Dependency Version - -|Feature | Version | -| ------------------------- | --------------- | -| Linux kernel version | linux_4.9.0-11-2 (4.9.189-3+deb9u2) | -| SAI version | SAI v1.6.3 | -| FRR | 7.2 | -| LLDPD | 0.9.6-1 | -| TeamD | 1.28-1 | -| SNMPD | 5.7.3+dfsg-1.5 | -| Python | 3.6.0-1 | -| syncd | 1.0.0 | -| swss | 1.0.0 | -| radvd | 2.17-2~bpo9+1 | -| isc-dhcp | 4.3.5-2 ([PR2946](https://github.com/sonic-net/sonic-buildimage/pull/2946) ) | -| sonic-telemetry | 0.1 | -| redis-server/ redis-tools | 5.0.3-3~bpo9+2 | -| Debian version | Continues to use Stretch (Debian version 9) | - -Note : The kernel version is migrated to the version that is mentioned in the first row in the above 'Dependency Version' table. - - -# Security Updates - -1. Kernel upgraded from 4.9.110-3deb9u6 (SONiC Release 201904) to 4.9.168-1+deb9u5 in this SONiC release. - Change log: https://tracker.debian.org/media/packages/l/linux/changelog-4.9.168-1deb9u5 -2. Docker upgraded from 18.09.2\~3-0\~debian-stretch to 18.09.8\~3-0\~debian-stretch. - Change log: https://docs.docker.com/engine/release-notes/#18098 - -# Feature List - -#### Build Improvements -DPKG caching framework provides the infrastructure to cache the sonic module/target .deb files into a local cache by tracking the target dependency files.SONIC build infrastructure is designed as a plugin framework where any new source code can be easily integrated into sonic as a module and that generates output as a .deb file.This provides a huge improvement in build time and also supports the true incremental build by tracking the dependency files. -
**Pull Requests** : [3292](https://github.com/sonic-net/sonic-buildimage/pull/3292), [4117](https://github.com/sonic-net/sonic-buildimage/pull/4117), [4425](https://github.com/sonic-net/sonic-buildimage/pull/4425) - -#### Bulk API for route -This feature provides bulk routes and next hop group members as coded in the PR mentioned below. -
**Pull Requests** : [1238](https://github.com/sonic-net/sonic-swss/pull/1238) - -#### D-Bus to Host Communications -This document describes a means (framework) for an application executed inside a container to securely request the execution of an operation ("action") by the host OS.This framework is intended to be used by the SONiC management and telemetry containers, but can be extended for other application containers as well. -
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/mgmt/Docker%20to%20Host%20communication.md) and below mentioned PR's for more details. -
**Pull Requests** : [4840](https://github.com/sonic-net/sonic-buildimage/pull/4840) - -#### Debian 10 upgrade, base image,driver -This feature provides change in kernel version. By changing the kernel ABI from version 6 to version 6-2, this will allow to disable the kernel ABI check which Debian performs at the very end of the kernel build. -
**Pull Requests** : [145](https://github.com/sonic-net/sonic-linux-kernel/pull/145), [4711](https://github.com/sonic-net/sonic-buildimage/pull/4711) - -#### Dynamic port breakout -Ports can be broken out to different speeds with various lanes in most HW today. However, on SONiC, before this release, the port breakout modes are hard-coded in the profiles and only loaded at initial time. In case we need to have a new port breakout mode, we would potentially need a new image or at least need to restart services which would impact the traffic of the box on irrelevant ports. The feature is to address the above issues. -
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md) and below mentioned PR's for more details. -
**Pull Requests** : [4235](https://github.com/sonic-net/sonic-buildimage/pull/4235), [3910](https://github.com/sonic-net/sonic-buildimage/pull/3910), [1242](https://github.com/sonic-net/sonic-swss/pull/1242), [1219](https://github.com/sonic-net/sonic-swss/pull/1219), [1151](https://github.com/sonic-net/sonic-swss/pull/1151), [1150](https://github.com/sonic-net/sonic-swss/pull/1150), [1148](https://github.com/sonic-net/sonic-swss/pull/1148), [1112](https://github.com/sonic-net/sonic-swss/pull/1112), [1085](https://github.com/sonic-net/sonic-swss/pull/1085), [766](https://github.com/sonic-net/sonic-utilities/pull/766), [72](https://github.com/sonic-net/sonic-platform-common/pull/72), [859](https://github.com/sonic-net/sonic-utilities/pull/859), [767](https://github.com/sonic-net/sonic-utilities/pull/767), [765](https://github.com/sonic-net/sonic-utilities/pull/765), [3912](https://github.com/sonic-net/sonic-buildimage/pull/3912), [3911](https://github.com/sonic-net/sonic-buildimage/pull/3911), [3909](https://github.com/sonic-net/sonic-buildimage/pull/3909), [3907](https://github.com/sonic-net/sonic-buildimage/pull/3907), [3891](https://github.com/sonic-net/sonic-buildimage/pull/3891), [3874](https://github.com/sonic-net/sonic-buildimage/pull/3874), [3861](https://github.com/sonic-net/sonic-buildimage/pull/3861), [3730](https://github.com/sonic-net/sonic-buildimage/pull/3730) - -#### Egress shaping (port, queue) -Quality of Service (QoS) scheduling and shaping features enable better service to certain traffic flows.Queue scheduling provides preferential treatment of traffic classes mapped to specific egress queues. SONiC supports SP, WRR, and DWRR scheduling disciplines.Queue shaping provides control of minimum and maximum bandwidth requirements per egress queue for more effective bandwidth utilization. Egress queues that exceed an average transmission rate beyond the shaper max bandwidth will stop being serviced. 
Additional ingress traffic will continue to be stored on the egress queue until the queue size is exceeded which results in tail drop. -
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/41e55d2762e9267454a4910b42a1eb7ad07acda8/doc/qos/scheduler/SONiC_QoS_Scheduler_Shaper.md) and below mentioned PR's for more details. -
**Pull Requests** : [1296](https://github.com/sonic-net/sonic-swss/pull/1296), [991](https://github.com/sonic-net/sonic-swss/pull/991) - -#### FW utils extension -A modern network switch is a sophisticated equipment which consists of many auxiliary components which are responsible for managing different subsystems (e.g., PSU/FAN/QSFP/EEPROM/THERMAL) and providing necessary interfaces (e.g., I2C/SPI/JTAG).Basically these components are complex programmable logic devices with it's own HW architecture and software. It is very important to always have the latest recommended software version to improve device stability, security and performance. In order to make software update as simple as possible and to provide a nice user frindly interface for various maintenance operations (e.g., install a new FW or query current version) we might need a dedicated FW utility. -
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/fwutil/fwutil.md) and below mentioned PR's for more details. -
**Pull Requests** : [4764](https://github.com/sonic-net/sonic-buildimage/pull/4764), [4758](https://github.com/sonic-net/sonic-buildimage/pull/4758), [941](https://github.com/sonic-net/sonic-utilities/pull/941), [942](https://github.com/sonic-net/sonic-utilities/pull/942), [87](https://github.com/sonic-net/sonic-platform-common/pull/87), [82](https://github.com/sonic-net/sonic-platform-common/pull/82) - -#### Getting docker ready for Debian 10 -This change adds support to build dockers using buster as base.sonic-mgmt-framework docker is updated to build using buster as base. -
**Pull Requests** : [4671](https://github.com/sonic-net/sonic-buildimage/pull/4671), [4727](https://github.com/sonic-net/sonic-buildimage/pull/4727), [4726](https://github.com/sonic-net/sonic-buildimage/pull/4726), [4665](https://github.com/sonic-net/sonic-buildimage/pull/4665), [4515](https://github.com/sonic-net/sonic-buildimage/pull/4515), [4598](https://github.com/sonic-net/sonic-buildimage/pull/4598), [4529](https://github.com/sonic-net/sonic-buildimage/pull/4529), [4480](https://github.com/sonic-net/sonic-buildimage/pull/4480) - -#### Port Mirroring -This feature describes the high level design details on Port/Port-channel mirroring support, dynamic session management, ACL rules can continue to use port/ERSPAN sessions as the action, Configuration CLI for mirror session. -
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/e8c86d1b3a03d6320727ff148966081869461e4a/doc/SONiC_Port_Mirroring_HLD.md) and below mentioned PR's for more details. -
**Pull Requests** : [1314](https://github.com/sonic-net/sonic-swss/pull/1314), [936](https://github.com/sonic-net/sonic-utilities/pull/936) - -#### Proxy ARP -When an interface is enabled with "proxy_arp", the same is enabled in the kernel. ASIC ARP packet action is also updated to trap these packets to CPU in those interfaces. -
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/doc/arp/Proxy%20Arp.md) for more details. -
**Pull Requests** : [617](https://github.com/sonic-net/SONiC/pull/617) - -#### Pytest 100% moved from ansible to Pytest - -#### SPytest -This is an initial version of spytest framework and first set of test scripts for 202006 release. -
Refer [HLD Document](https://github.com/sonic-net/sonic-mgmt/blob/master/spytest/Doc/intro.md) for more details. -
**Pull Requests** : [1533](https://github.com/sonic-net/sonic-mgmt/pull/1533) - -#### Thermal control -Thermal control daemon has been added to monitor the temperature of devices (CPU, ASIC, optical modules, etc) and the running status of fan. It retrieves the switch device temperatures via platform APIs and raises alarms when the high/low thresholds are hit.It also stores temperature values fetched from sensors and thermal device running status to the DB.In addition it provides the policy based thermal control and fan speed tuning in configuration, and we are able to customize and/or add the platform specific policies as needed. -
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md) for more details. -
**Pull Requests** : [73](https://github.com/sonic-net/sonic-platform-common/pull/73), [777](https://github.com/sonic-net/sonic-utilities/pull/777), [49](https://github.com/sonic-net/sonic-platform-daemons/pull/49), [3949](https://github.com/sonic-net/sonic-buildimage/pull/3949),[832](https://github.com/sonic-net/sonic-utilities/pull/832) - -#### PSU and FAN LED management - -The PSU and FAN LED on switch will be set according to PSU and FAN presence and running status, for example if there is a failure happening to PSU or FAN, the corresponding LED will be set to red. -
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md) and [HLD Document](https://github.com/sonic-net/SONiC/pull/591) for more details. -
**Pull Requests** : [4437](https://github.com/sonic-net/sonic-buildimage/pull/4437);[1580](https://github.com/sonic-net/sonic-mgmt/pull/1580);[881](https://github.com/sonic-net/sonic-utilities/pull/881);[54](https://github.com/sonic-net/sonic-platform-daemons/pull/54);[83](https://github.com/sonic-net/sonic-platform-common/pull/83) - -#### PSU, thermal and FAN plugin extension - -On new plugins for fan, thermal and PSU the PSU plugin was extended with voltage, current and power supported, and the fan and thermal plugins were introduced. -
**Pull Requests** : [4041](https://github.com/sonic-net/sonic-buildimage/pull/4041) - - - - -
- - -# SAI APIs - -Please find the list of API's classified along the newly added SAI features. For further details on SAI API please refer [SAI_1.6.3 Release Notes](https://github.com/opencomputeproject/SAI/blob/master/doc/SAI_1.6.3_ReleaseNotes.md) - -| S.No | Feature | -| ---- | --------------------------- | -| 1 | MACSEC | -| 2 | System Port API | - - -# Contributors - -SONiC community would like to thank all the contributors from various companies and the individuals who has contributed for the release. Special thanks to the major contributors - Microsoft, Broadcom, DellEMC, Mellanox, Alibaba, Linkedin, Nephos & Aviz. - -
- - - +# SONiC 202006 Release Notes + +This document captures the new features added and enhancements done on existing features/sub-features for the SONiC 202006 release. + + + +# Table of Contents + + * [Branch and Image Location](#branch-and-image-location) + * [Dependency Version](#dependency-version) + * [Security Updates](#security-updates) + * [Feature List](#feature-list) + * [SAI APIs](#sai-apis) + * [Contributors](#contributors) + + +# Branch and Image Location + +Branch : https://github.com/sonic-net/sonic-buildimage/tree/202006
+Image : https://sonic-jenkins.westus2.cloudapp.azure.com/ (Example - Image for Broadcom based platforms is [here]( https://sonic-jenkins.westus2.cloudapp.azure.com/job/broadcom/job/buildimage-brcm-202006/lastSuccessfulBuild/artifact/target/)) + +# Dependency Version + +|Feature | Version | +| ------------------------- | --------------- | +| Linux kernel version | linux_4.9.0-11-2 (4.9.189-3+deb9u2) | +| SAI version | SAI v1.6.3 | +| FRR | 7.2 | +| LLDPD | 0.9.6-1 | +| TeamD | 1.28-1 | +| SNMPD | 5.7.3+dfsg-1.5 | +| Python | 3.6.0-1 | +| syncd | 1.0.0 | +| swss | 1.0.0 | +| radvd | 2.17-2~bpo9+1 | +| isc-dhcp | 4.3.5-2 ([PR2946](https://github.com/sonic-net/sonic-buildimage/pull/2946) ) | +| sonic-telemetry | 0.1 | +| redis-server/ redis-tools | 5.0.3-3~bpo9+2 | +| Debian version | Continues to use Stretch (Debian version 9) | + +Note : The kernel version is migrated to the version that is mentioned in the first row in the above 'Dependency Version' table. + + +# Security Updates + +1. Kernel upgraded from 4.9.110-3deb9u6 (SONiC Release 201904) to 4.9.168-1+deb9u5 in this SONiC release. + Change log: https://tracker.debian.org/media/packages/l/linux/changelog-4.9.168-1deb9u5 +2. Docker upgraded from 18.09.2\~3-0\~debian-stretch to 18.09.8\~3-0\~debian-stretch. + Change log: https://docs.docker.com/engine/release-notes/#18098 + +# Feature List + +#### Build Improvements +DPKG caching framework provides the infrastructure to cache the sonic module/target .deb files into a local cache by tracking the target dependency files.SONIC build infrastructure is designed as a plugin framework where any new source code can be easily integrated into sonic as a module and that generates output as a .deb file.This provides a huge improvement in build time and also supports the true incremental build by tracking the dependency files. +
**Pull Requests** : [3292](https://github.com/sonic-net/sonic-buildimage/pull/3292), [4117](https://github.com/sonic-net/sonic-buildimage/pull/4117), [4425](https://github.com/sonic-net/sonic-buildimage/pull/4425) + +#### Bulk API for route +This feature provides bulk routes and next hop group members as coded in the PR mentioned below. +
**Pull Requests** : [1238](https://github.com/sonic-net/sonic-swss/pull/1238) + +#### D-Bus to Host Communications +This document describes a means (framework) for an application executed inside a container to securely request the execution of an operation ("action") by the host OS.This framework is intended to be used by the SONiC management and telemetry containers, but can be extended for other application containers as well. +
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/mgmt/Docker%20to%20Host%20communication.md) and below mentioned PR's for more details. +
**Pull Requests** : [4840](https://github.com/sonic-net/sonic-buildimage/pull/4840) + +#### Debian 10 upgrade, base image,driver +This feature provides change in kernel version. By changing the kernel ABI from version 6 to version 6-2, this will allow to disable the kernel ABI check which Debian performs at the very end of the kernel build. +
**Pull Requests** : [145](https://github.com/sonic-net/sonic-linux-kernel/pull/145), [4711](https://github.com/sonic-net/sonic-buildimage/pull/4711) + +#### Dynamic port breakout +Ports can be broken out to different speeds with various lanes in most HW today. However, on SONiC, before this release, the port breakout modes are hard-coded in the profiles and only loaded at initial time. In case we need to have a new port breakout mode, we would potentially need a new image or at least need to restart services which would impact the traffic of the box on irrelevant ports. The feature is to address the above issues. +
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md) and below mentioned PR's for more details. +
**Pull Requests** : [4235](https://github.com/sonic-net/sonic-buildimage/pull/4235), [3910](https://github.com/sonic-net/sonic-buildimage/pull/3910), [1242](https://github.com/sonic-net/sonic-swss/pull/1242), [1219](https://github.com/sonic-net/sonic-swss/pull/1219), [1151](https://github.com/sonic-net/sonic-swss/pull/1151), [1150](https://github.com/sonic-net/sonic-swss/pull/1150), [1148](https://github.com/sonic-net/sonic-swss/pull/1148), [1112](https://github.com/sonic-net/sonic-swss/pull/1112), [1085](https://github.com/sonic-net/sonic-swss/pull/1085), [766](https://github.com/sonic-net/sonic-utilities/pull/766), [72](https://github.com/sonic-net/sonic-platform-common/pull/72), [859](https://github.com/sonic-net/sonic-utilities/pull/859), [767](https://github.com/sonic-net/sonic-utilities/pull/767), [765](https://github.com/sonic-net/sonic-utilities/pull/765), [3912](https://github.com/sonic-net/sonic-buildimage/pull/3912), [3911](https://github.com/sonic-net/sonic-buildimage/pull/3911), [3909](https://github.com/sonic-net/sonic-buildimage/pull/3909), [3907](https://github.com/sonic-net/sonic-buildimage/pull/3907), [3891](https://github.com/sonic-net/sonic-buildimage/pull/3891), [3874](https://github.com/sonic-net/sonic-buildimage/pull/3874), [3861](https://github.com/sonic-net/sonic-buildimage/pull/3861), [3730](https://github.com/sonic-net/sonic-buildimage/pull/3730) + +#### Egress shaping (port, queue) +Quality of Service (QoS) scheduling and shaping features enable better service to certain traffic flows.Queue scheduling provides preferential treatment of traffic classes mapped to specific egress queues. SONiC supports SP, WRR, and DWRR scheduling disciplines.Queue shaping provides control of minimum and maximum bandwidth requirements per egress queue for more effective bandwidth utilization. Egress queues that exceed an average transmission rate beyond the shaper max bandwidth will stop being serviced. 
Additional ingress traffic will continue to be stored on the egress queue until the queue size is exceeded, which results in tail drop. +
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/41e55d2762e9267454a4910b42a1eb7ad07acda8/doc/qos/scheduler/SONiC_QoS_Scheduler_Shaper.md) and below mentioned PR's for more details. +
**Pull Requests** : [1296](https://github.com/sonic-net/sonic-swss/pull/1296), [991](https://github.com/sonic-net/sonic-swss/pull/991) + +#### FW utils extension +A modern network switch is a sophisticated piece of equipment consisting of many auxiliary components that manage different subsystems (e.g., PSU/FAN/QSFP/EEPROM/THERMAL) and provide the necessary interfaces (e.g., I2C/SPI/JTAG). These components are complex programmable logic devices, each with its own HW architecture and software. It is important to always run the latest recommended software version to improve device stability, security and performance. To make software updates as simple as possible and to provide a user-friendly interface for maintenance operations (e.g., installing new FW or querying the current version), a dedicated FW utility is needed. +
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/master/doc/fwutil/fwutil.md) and below mentioned PR's for more details. +
**Pull Requests** : [4764](https://github.com/sonic-net/sonic-buildimage/pull/4764), [4758](https://github.com/sonic-net/sonic-buildimage/pull/4758), [941](https://github.com/sonic-net/sonic-utilities/pull/941), [942](https://github.com/sonic-net/sonic-utilities/pull/942), [87](https://github.com/sonic-net/sonic-platform-common/pull/87), [82](https://github.com/sonic-net/sonic-platform-common/pull/82) + +#### Getting docker ready for Debian 10 +This change adds support for building dockers using buster as the base. The sonic-mgmt-framework docker is updated to build using buster as the base. +
**Pull Requests** : [4671](https://github.com/sonic-net/sonic-buildimage/pull/4671), [4727](https://github.com/sonic-net/sonic-buildimage/pull/4727), [4726](https://github.com/sonic-net/sonic-buildimage/pull/4726), [4665](https://github.com/sonic-net/sonic-buildimage/pull/4665), [4515](https://github.com/sonic-net/sonic-buildimage/pull/4515), [4598](https://github.com/sonic-net/sonic-buildimage/pull/4598), [4529](https://github.com/sonic-net/sonic-buildimage/pull/4529), [4480](https://github.com/sonic-net/sonic-buildimage/pull/4480) + +#### Port Mirroring +This feature describes the high level design details on Port/Port-channel mirroring support, dynamic session management, ACL rules can continue to use port/ERSPAN sessions as the action, Configuration CLI for mirror session. +
Refer [HLD document](https://github.com/sonic-net/SONiC/blob/e8c86d1b3a03d6320727ff148966081869461e4a/doc/SONiC_Port_Mirroring_HLD.md) and below mentioned PR's for more details. +
**Pull Requests** : [1314](https://github.com/sonic-net/sonic-swss/pull/1314), [936](https://github.com/sonic-net/sonic-utilities/pull/936) + +#### Proxy ARP +When an interface is enabled with "proxy_arp", the same is enabled in the kernel. ASIC ARP packet action is also updated to trap these packets to CPU in those interfaces. +
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/doc/arp/Proxy%20Arp.md) for more details. +
**Pull Requests** : [617](https://github.com/sonic-net/SONiC/pull/617) + +#### Pytest 100% moved from ansible to Pytest + +#### SPytest +This is an initial version of spytest framework and first set of test scripts for 202006 release. +
Refer [HLD Document](https://github.com/sonic-net/sonic-mgmt/blob/master/spytest/Doc/intro.md) for more details. +
**Pull Requests** : [1533](https://github.com/sonic-net/sonic-mgmt/pull/1533) + +#### Thermal control +A thermal control daemon has been added to monitor the temperature of devices (CPU, ASIC, optical modules, etc.) and the running status of the fans. It retrieves the switch device temperatures via platform APIs and raises alarms when the high/low thresholds are hit. It also stores the temperature values fetched from sensors and the running status of thermal devices in the DB. In addition, it provides policy-based thermal control and fan speed tuning through configuration, and platform-specific policies can be customized and/or added as needed. +
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md) for more details. +
**Pull Requests** : [73](https://github.com/sonic-net/sonic-platform-common/pull/73), [777](https://github.com/sonic-net/sonic-utilities/pull/777), [49](https://github.com/sonic-net/sonic-platform-daemons/pull/49), [3949](https://github.com/sonic-net/sonic-buildimage/pull/3949),[832](https://github.com/sonic-net/sonic-utilities/pull/832) + +#### PSU and FAN LED management + +The PSU and FAN LED on switch will be set according to PSU and FAN presence and running status, for example if there is a failure happening to PSU or FAN, the corresponding LED will be set to red. +
Refer [HLD Document](https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md) and [HLD Document](https://github.com/sonic-net/SONiC/pull/591) for more details. +
**Pull Requests** : [4437](https://github.com/sonic-net/sonic-buildimage/pull/4437);[1580](https://github.com/sonic-net/sonic-mgmt/pull/1580);[881](https://github.com/sonic-net/sonic-utilities/pull/881);[54](https://github.com/sonic-net/sonic-platform-daemons/pull/54);[83](https://github.com/sonic-net/sonic-platform-common/pull/83) + +#### PSU, thermal and FAN plugin extension + +On new plugins for fan, thermal and PSU the PSU plugin was extended with voltage, current and power supported, and the fan and thermal plugins were introduced. +
**Pull Requests** : [4041](https://github.com/sonic-net/sonic-buildimage/pull/4041) + + + + +
+ + +# SAI APIs + +Please find the list of APIs classified along the newly added SAI features. For further details on the SAI API, please refer to the [SAI_1.6.3 Release Notes](https://github.com/opencomputeproject/SAI/blob/master/doc/SAI_1.6.3_ReleaseNotes.md) + +| S.No | Feature | +| ---- | --------------------------- | +| 1 | MACSEC | +| 2 | System Port API | + + +# Contributors + +The SONiC community would like to thank all the contributors from various companies and the individuals who have contributed to the release. Special thanks to the major contributors - Microsoft, Broadcom, DellEMC, Mellanox, Alibaba, Linkedin, Nephos & Aviz. + +
+ + + diff --git a/doc/SONiC_202012_Release_Notes.md b/doc/release-notes/SONiC_202012_Release_Notes.md similarity index 100% rename from doc/SONiC_202012_Release_Notes.md rename to doc/release-notes/SONiC_202012_Release_Notes.md diff --git a/doc/SONiC_202106_Release_Notes.md b/doc/release-notes/SONiC_202106_Release_Notes.md similarity index 100% rename from doc/SONiC_202106_Release_Notes.md rename to doc/release-notes/SONiC_202106_Release_Notes.md diff --git a/doc/SONiC_202111_Release_Notes.md b/doc/release-notes/SONiC_202111_Release_Notes.md similarity index 100% rename from doc/SONiC_202111_Release_Notes.md rename to doc/release-notes/SONiC_202111_Release_Notes.md diff --git a/doc/SONiC_202205_Release_Notes.md b/doc/release-notes/SONiC_202205_Release_Notes.md similarity index 100% rename from doc/SONiC_202205_Release_Notes.md rename to doc/release-notes/SONiC_202205_Release_Notes.md diff --git a/doc/SONiC_202211_Release_Notes.md b/doc/release-notes/SONiC_202211_Release_Notes.md similarity index 100% rename from doc/SONiC_202211_Release_Notes.md rename to doc/release-notes/SONiC_202211_Release_Notes.md diff --git a/doc/SONiC_202305_Release_Notes.md b/doc/release-notes/SONiC_202305_Release_Notes.md similarity index 100% rename from doc/SONiC_202305_Release_Notes.md rename to doc/release-notes/SONiC_202305_Release_Notes.md diff --git a/doc/SONiC_202311_Release_Notes.md b/doc/release-notes/SONiC_202311_Release_Notes.md similarity index 100% rename from doc/SONiC_202311_Release_Notes.md rename to doc/release-notes/SONiC_202311_Release_Notes.md diff --git a/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-architecture.png b/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-architecture.png new file mode 100644 index 0000000000..ab856c5476 Binary files /dev/null and b/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-architecture.png differ diff --git 
a/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-design.md b/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-design.md new file mode 100644 index 0000000000..32d141881e --- /dev/null +++ b/doc/smart-switch/smart-switch-database-architecture/smart-switch-database-design.md @@ -0,0 +1,415 @@ +# Smart Switch Database design + +## Table of Content + +- [Smart Switch Database design](#smart-switch-database-design) + - [Table of Content](#table-of-content) + - [Revision](#revision) + - [Scope](#scope) + - [Definitions/Abbreviations](#definitionsabbreviations) + - [Overview](#overview) + - [Requirements](#requirements) + - [Architecture Design](#architecture-design) + - [Database services](#database-services) + - [NPU side](#npu-side) + - [DPU side](#dpu-side) + - [Database flow](#database-flow) + - [Update Overlay Objects via GNMI:](#update-overlay-objects-via-gnmi) + - [Update Object Status:](#update-object-status) + - [Update Counters and Meters:](#update-counters-and-meters) + - [High-Level Design](#high-level-design) + - [SAI API](#sai-api) + - [Configuration and management](#configuration-and-management) + - [CLI/YANG model Enhancements](#cliyang-model-enhancements) + - [Config DB Enhancements](#config-db-enhancements) + - [Warmboot and Fastboot Design Impact](#warmboot-and-fastboot-design-impact) + - [Memory Consumption](#memory-consumption) + - [DPU\_APPL\_DB](#dpu_appl_db) + - [Global Tables](#global-tables) + - [Per ENI Tables](#per-eni-tables) + - [DPU\_APPL\_STATE\_DB/DPU\_STATE\_DB](#dpu_appl_state_dbdpu_state_db) + - [Global Tables](#global-tables-1) + - [Per ENI Tables](#per-eni-tables-1) + - [Restrictions/Limitations](#restrictionslimitations) + - [Testing Requirements/Design](#testing-requirementsdesign) + - [Unit Test cases](#unit-test-cases) + - [System Test cases](#system-test-cases) + - [Open/Action items - if any](#openaction-items---if-any) + +### Revision + +| Rev | Date | Author | Change 
Description | +| :-: | :--: | :----: | -------------------------------- | +| 0.1 | | Ze Gan | Initial version. Database design | + +### Scope + +This document provides a high-level design for Smart Switch database. + +### Definitions/Abbreviations + +| Term | Meaning | +| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| NPU | Network Processing Unit | +| DPU | Data Processing Unit | +| DB | Database | +| GNMI | gRPC Network Management Interface | +| overlay objects | All objects defined in the [sonic-dash-api](https://github.com/sonic-net/sonic-dash-api) | +| midplane bridge | Defined in the [smart-switch-ip-address-assignment](https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/ip-address-assigment/smart-switch-ip-address-assignment.md) | + +### Overview + +The Smart Switch comprises two integral components: the Network Processing Unit (NPU) and the Data Processing Unit (DPU), both operating on the SONiC OS. The database stack encompasses the entire database infrastructure for both the NPU and DPU. However, due to memory limitations on the DPU, certain overlay objects, such as DASH objects, are stored in the NPU. + +In addition, dedicated database containers are maintained in the NPU for each DPU, serving the purpose of resource management within the Smart Switch architecture. This separation allows for efficient handling of database-related operations and ensures optimal utilization of resources across the entire Smart Switch. + +### Requirements + +- All databases, including those on both NPU and DPU, must be accessible through the GNMI server. +- Each DPU database instance on the NPU is associated with a unique TCP port and domain Unix socket path. +- All DPU database instances on the NPU will be bound to the IP address of the midplane bridge. 
+- All database instances on the NPU share the same network namespace to facilitate seamless communication. +- DPUs can access their respective overlay database instances on the NPU using the IP of the midplane bridge and a pre-assigned unique TCP port. + +### Architecture Design + +![smart-switch-database-architecture](smart-switch-database-architecture.png) + +#### Database services + +##### NPU side + +In this section, the focus is on illustrating the maintenance of DPU overlay databases within the NPU. It's essential to note that the traditional database services of both NPU and DPU remain unchanged and do not necessitate further design modifications. + +The management of DPU overlay databases within the NPU is orchestrated through existing SONiC database services. The daemon, named "featured," retains the responsibility for initiating, terminating, enabling, and disabling the DPU overlay database services. This interaction is facilitated using the systemctl tool. + +To determine the number of DPUs, the "featured" daemon should leverage the platform API. However, for the sake of implementation simplicity, the initial implementation reads the number of DPUs directly from the platform_env.conf file. + +```shell +cat /usr/share/sonic/device/$PLATFORM/platform_env.conf +NUM_DPU=2 +``` + +To align with the established multi-ASIC design in SONiC, a new field, `"has_per_dpu_scope": "True"`, is introduced in the database feature table within config_db.json. This field plays a crucial role in ensuring that each DPU database instance is initiated within a dedicated database container. This design approach maintains consistency with SONiC's existing architecture while accommodating the specific requirements of DPU overlay databases. 
+ +```json +# config_db.json + +"database": { + "auto_restart": "always_enabled", + "delayed": "False", + "has_global_scope": "True", + "has_per_asic_scope": "True", + "has_per_dpu_scope": "True", # New field for DPU database service + "high_mem_alert": "disabled", + "state": "always_enabled", + "support_syslog_rate_limit": "true" +}, +``` + +Our design also extends its multi-ASIC principles by introducing a database_global.json file. It includes two critical fields: + +- container_name: This field uniquely maps to the DPU's index, ensuring a clear association between the database instances and the respective DPU. +- include: The include field serves as a pointer to the location of the DPU's database configuration. + +```json +{ + "INCLUDES": [ + { + "include": "../../redis/sonic-db/database_config.json" + }, + { + "container_name": "dpu0", + "include": "../../redisdpu0/sonic-db/database_config.json" + }, + { + "container_name": "dpu1", + "include": "../../redisdpu1/sonic-db/database_config.json" + } + ], + "VERSION": "1.0" +} +``` + +Within the NPU, the management of DPU overlay databases involves specific configurations. Each DPU overlay database instance is bound to the IP address of the midplane bridge (169.254.200.254 by default). The TCP port assignment follows a predictable pattern, with each DPU ID associated with a unique port (6381 + DPU ID). 
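
The addressing rule above can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the Unix socket path pattern (`/var/run/redisdpu<ID>/redis.sock`) is an assumption inferred from the examples in this document rather than a confirmed part of the implementation.

```python
# Hypothetical helper: names are illustrative, not part of SONiC.
MIDPLANE_BRIDGE_IP = "169.254.200.254"  # default midplane bridge address from the text
BASE_PORT = 6381                         # port assigned to DPU ID 0


def dpu_redis_endpoint(dpu_id: int) -> dict:
    """Return the Redis endpoint settings for the overlay DB of one DPU."""
    if dpu_id < 0:
        raise ValueError("DPU ID must be non-negative")
    return {
        "hostname": MIDPLANE_BRIDGE_IP,
        # Port rule stated in the design: 6381 + DPU ID.
        "port": BASE_PORT + dpu_id,
        # Assumed path pattern, mirroring the per-DPU examples.
        "unix_socket_path": f"/var/run/redisdpu{dpu_id}/redis.sock",
    }
```

Calling `dpu_redis_endpoint(1)` under these assumptions yields port 6382 and the `redisdpu1` socket path, matching the second DPU in the example configuration.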
+ +Here is an example with two DPUs: + +```json +# DPU0: /var/run/redisdpu0/sonic-db/database_config.json +"redis": { + "hostname": "169.254.200.254", + "port": 6381, + "unix_socket_path": "/var/run/redisdpu0/redis.sock", + "persistence_for_warm_boot": "yes", + "database_type": "dpudb" +} +# DPU1: /var/run/redisdpu1/sonic-db/database_config.json +"redis": { + "hostname": "169.254.200.254", + "port": 6382, + "unix_socket_path": "/var/run/redisdpu1/redis.sock", + "persistence_for_warm_boot": "yes", + "database_type": "dpudb" +} +``` + +Four new databases are introduced for the DPU overlay database instance: + +```json +"DPU_APPL_DB": { + "id": 15, + "separator": ":", + "instance": "redis", + "format": "proto" +}, +"DPU_APPL_STATE_DB": { + "id": 16, + "separator": "|", + "instance": "redis" +}, +"DPU_STATE_DB": { + "id": 17, + "separator": "|", + "instance": "redis" +}, +"DPU_COUNTERS_DB": { + "id": 18, + "separator": ":", + "instance": "redis" +} +``` + +##### DPU side + +In the architecture of our Smart Switch, DPU operation involves accessing both local and remote database services. This section focuses on the interaction with remote database services in the NPU, specifically for overlay objects. + +The DPU uses a DHCP server hosted on the NPU, ensuring each DPU fetches a predetermined and consistent IP address. Leveraging this IP address, the DPU determines the TCP port of its associated overlay database. Following this design principle, the DPU autonomously generates the requisite configuration for remote database services. 
+ +``` json +# /var/run/redis/sonic-db/database_config.json +{ + "INSTANCES": { + "redis": { + "hostname": "127.0.0.1", + "port": 6379, + "unix_socket_path": "/var/run/redis/redis.sock", + "persistence_for_warm_boot": "yes" + }, + "remote_redis": { + "hostname": "169.254.200.254", + "port": 6381 + } + }, + "DATABASES": { + "APPL_DB": { + "id": 0, + "separator": ":", + "instance": "redis" + }, + "ASIC_DB": { + "id": 1, + "separator": ":", + "instance": "redis" + }, + "COUNTERS_DB": { + "id": 2, + "separator": ":", + "instance": "redis" + }, + "LOGLEVEL_DB": { + "id": 3, + "separator": ":", + "instance": "redis" + }, + "CONFIG_DB": { + "id": 4, + "separator": "|", + "instance": "redis" + }, + "PFC_WD_DB": { + "id": 5, + "separator": ":", + "instance": "redis" + }, + "FLEX_COUNTER_DB": { + "id": 5, + "separator": ":", + "instance": "redis" + }, + "STATE_DB": { + "id": 6, + "separator": "|", + "instance": "redis" + }, + "SNMP_OVERLAY_DB": { + "id": 7, + "separator": "|", + "instance": "redis" + }, + "RESTAPI_DB": { + "id": 8, + "separator": "|", + "instance": "redis" + }, + "GB_ASIC_DB": { + "id": 9, + "separator": ":", + "instance": "redis" + }, + "GB_COUNTERS_DB": { + "id": 10, + "separator": ":", + "instance": "redis" + }, + "GB_FLEX_COUNTER_DB": { + "id": 11, + "separator": ":", + "instance": "redis" + }, + "APPL_STATE_DB": { + "id": 14, + "separator": ":", + "instance": "redis" + }, + "DPU_APPL_DB": { + "id": 15, + "separator": ":", + "instance": "remote_redis", + "format": "proto" + }, + "DPU_APPL_STATE_DB": { + "id": 16, + "separator": "|", + "instance": "remote_redis" + }, + "DPU_STATE_DB": { + "id": 17, + "separator": "|", + "instance": "remote_redis" + }, + "DPU_COUNTERS_DB": { + "id": 18, + "separator": ":", + "instance": "remote_redis" + } + }, + "VERSION": "1.0" +} + +``` + +#### Database flow + +This section outlines critical workflows interacting with the DPU overlay database. 
+ +##### Update Overlay Objects via GNMI: + +GNMI communicates with the SWSS of the DPU over ZMQ. In parallel, a backup copy of each object is written asynchronously to the DPU_APPL_DB. This backup serves purposes such as debugging, migration, and possible future uses. + +##### Update Object Status: + +The SWSS of the DPU updates the DPU_APPL_STATE_DB and DPU_STATE_DB whenever the corresponding objects are updated. These updates can be triggered either by GNMI message commands or by internal service logic. + +##### Update Counters and Meters: + +Flex counter management in Syncd of the DPU handles the update of counters and meters for overlay objects. Traditional counters are also managed through this mechanism. + +These workflows describe how the DPU overlay database interacts with the other components within the Smart Switch. The DPUs access their respective database instances via the IP address of the midplane bridge and the assigned TCP port. Concurrently, GNMI accesses these instances through the Unix domain socket. + +### High-Level Design + +### SAI API + +N/A + +### Configuration and management + +An enhanced database CLI offers the capability to convert binary messages within the DPU_APPL_DB into human-readable text. 
+
+#### CLI/YANG model Enhancements
+
+```yang
+  container sonic-feature {
+    container FEATURE {
+      leaf has_per_dpu_scope {
+        description "This configuration indicates that only one service will be
+                     spawned per DPU";
+        type feature-scope-status;
+        default "false";
+      }
+    }
+  }
+```
+
+#### Config DB Enhancements
+
+Refer to section: [Database services](#database-services)
+
+### Warmboot and Fastboot Design Impact
+
+N/A
+
+### Memory Consumption
+
+#### DPU_APPL_DB
+
+The estimated memory consumption for the Smart Switch database is calculated from entry sizes sourced from the [sonic-dash-api](https://github.com/sonic-net/sonic-dash-api) repository at this [commit](https://github.com/sonic-net/sonic-dash-api/tree/d4448c78b4e0afd1ec6dfaa390aef5c650cee4b3) and from entry counts derived from the [DASH high-level design scaling requirements](https://github.com/sonic-net/DASH/blob/main/documentation/general/dash-sonic-hld.md#14-scaling-requirements).
+
+The following tables comprise two parts: global tables and per ENI tables. When calculating the total size per card, the memory consumption of each per ENI table is multiplied by the number of ENIs (64, matching the DASH_ENI_TABLE entry count).
+
+##### Global Tables
+
+| Table name              | Entry size (bytes) | No. of entries in the Table | Total size per card (KB) |
+| ----------------------- | ------------------ | --------------------------- | ------------------------ |
+| DASH_VNET_TABLE         | 448                | 1,024                       | 448                      |
+| DASH_ENI_TABLE          | 208                | 64                          | 13                       |
+| DASH_PREFIX_TAG(IPv6)   | 229,492            | 32                          | 7,172                    |
+| DASH_VNET_MAPPING_TABLE | 216                | 10,000,000                  | 2,109,375                |
+
+##### Per ENI Tables
+
+| Table name                     | Entry size (bytes) | No. of entries in the Table per ENI | Total size per card (KB) |
+| ------------------------------ | ------------------ | ----------------------------------- | ------------------------ |
+| DASH_ACL_RULE_TABLE(IPv6)      | 2,488              | 10,000                              | 1,555,000                |
+| DASH_ROUTE_RULE_TABLE(inbound) | 176                | 10,000                              | 110,000                  |
+| DASH_ROUTE_TABLE(outbound)     | 264                | 100,000                             | 1,650,000                |
+
+Based on the provided data and calculations, the estimated memory consumption for the DPU_APPL_DB is approximately **5.18GB** per card.
+
+#### DPU_APPL_STATE_DB/DPU_STATE_DB
+
+The DPU_APPL_STATE_DB and DPU_STATE_DB store only the key and status of each object, not its metadata. This results in a smaller memory footprint than the DPU_APPL_DB. The estimated memory consumption for these databases is approximately 2.45GB.
+
+##### Global Tables
+
+| Table name              | Entry size (bytes) | No. of entries in the Table | Total size per card (KB) |
+| ----------------------- | ------------------ | --------------------------- | ------------------------ |
+| DASH_VNET_TABLE         | 88                 | 1,024                       | 88                       |
+| DASH_ENI_TABLE          | 144                | 64                          | 7                        |
+| DASH_PREFIX_TAG(IPv6)   | 104                | 32                          | 3                        |
+| DASH_VNET_MAPPING_TABLE | 144                | 10,000,000                  | 1,406,250                |
+
+##### Per ENI Tables
+
+| Table name                     | Entry size (bytes) | No. of entries in the Table per ENI | Total size per card (KB) |
+| ------------------------------ | ------------------ | ----------------------------------- | ------------------------ |
+| DASH_ACL_RULE_TABLE(IPv6)      | 104                | 10,000                              | 65,000                   |
+| DASH_ROUTE_RULE_TABLE(inbound) | 160                | 10,000                              | 100,000                  |
+| DASH_ROUTE_TABLE(outbound)     | 160                | 100,000                             | 1,000,000                |
+
+### Restrictions/Limitations
+
+### Testing Requirements/Design
+
+#### Unit Test cases
+
+No separate test for this feature is required. The feature will be tested implicitly by the other DASH tests.
+
+#### System Test cases
+
+No separate test for this feature is required.
The feature will be tested implicitly by the other DASH tests.
+
+### Open/Action items - if any
+
+1. Platform API for fetching the number of DPUs
diff --git a/doc/sonic-gns3/GNS3 VM for SONiC.md b/doc/sonic-gns3/GNS3 VM for SONiC.md
new file mode 100644
index 0000000000..048180e540
--- /dev/null
+++ b/doc/sonic-gns3/GNS3 VM for SONiC.md
@@ -0,0 +1,88 @@
+# SONiC on GNS3 VM
+
+GNS3 is an environment that allows simulation of networking equipment in realistic scenarios. It can be used to emulate, configure, test, and troubleshoot networks in a simulated environment. GNS3 allows you to run a small network topology consisting of only a few devices on your Windows 10 laptop, or larger network topologies using a GNS3 server that is installed on an Ubuntu Linux server. See [GNS3 online documentation](https://docs.gns3.com/) and [Getting started](https://docs.gns3.com/docs/) with GNS3 for complete information.
+
+Use GNS3 to run SONiC simulator VMs. GNS3 consists of the following components:
+
+### For Windows Environment
+
+ - GNS3 user interface: Used to create and visualize network connections in the Windows environment.
+
+### For Client Server Model
+
+ - GNS3 client: Used to create and visualize complex network connections in the Windows environment.
+ - GNS3 server: Controls SONiC VM execution (natively supported on Ubuntu Linux running on a Dell server).
+
+### GNS3 VM installation overview
+
+1. Install GNS3 in a Windows environment using the [GNS3 VM installation guide](https://docs.gns3.com/docs/getting-started/installation/windows/#:~:text=The%20following%20are%20the%20optimal%20requirements%20for%20a,%2F%20RVI%20Series%20or%20Intel%20VT-X%20%2F%20EPT).
+2. Download the SONiC image from the [azure pipeline](https://sonic-build.azurewebsites.net/ui/sonic/pipelines) to the Windows environment.
+3. Import the SONiC image into the GNS3 VM environment.
+4. Build your SONiC topology virtual devices.
+5. Log in and configure each device.
+
+### GNS3 VM set up
+
+Once the GNS3 VM is installed and the application is opened, a window like the one below appears:
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image1.jpg)
+
+
+In the GNS3 window, create a project.
+
+a. Select File->New blank project.
+b. Name: Enter a new project name.
+c. Location: The default projects folder name changes to the new project name.
+d. Click OK.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image2.jpg)
+
+The project window opens. The window title displays the name of the new project.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image3.jpg)
+
+Install a SONiC image for the GNS3 appliance file.
+ - Go to [SONiC pipeline](https://sonic-build.azurewebsites.net/ui/sonic/pipelines) and select the version of the SONiC image that you want to use.
+ - Select a SONiC build image zip file and click Download. The zip file contains a SONiC image file.
+ - In the Windows environment, extract the SONiC image file.
+
+In the GNS3 project window, click New template in the left corner of the screen and select the option "Manually create a new template".
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image4.jpg)
+
+Under the new tab, select Qemu VM and then select a new template as shown below. Enter the desired type of device and its RAM details as recommended.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image5.jpg)
+
+A new device now appears in the left side panel. Use it to configure the device template.
+ - In the QEMU VM template configuration window, under the General Settings tab, change the RAM size to 8192 MB (8GB) and the vCPU number to 4.
+ - Select Auto Start Console to automatically open the console when the Community SONiC appliances start.
+ - Click OK to save the changes.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image6.jpg)
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image7.jpg)
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image11.jpg)
+
+### Build your network topology
+
+In the GNS3 project window, click the Browse Routers icon on the left side bar. Drag and drop Community SONiC devices into the middle project frame as required for your network topology. Place each device in the appropriate location on the screen. To rename a switch, click its icon and overwrite the text.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image8.jpg)
+
+Connect the Community SONiC switches. Select the "Add a link" icon on the left side bar. Click a switch in the project frame and select an available port in the drop-down list.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image9.jpg)
+
+Drag the connection line to another switch, click the switch icon, and select a port from the drop-down list to establish the link.
+
+![](https://github.com/prasanna228/prasannaSONiC/blob/main/doc/sonic-gns3/image10.jpg)
+
+Repeat this step to connect each Community SONiC device.
+
+Configure the Management IP address on each Community SONiC switch. Start each Community SONiC switch by right-clicking the icon and selecting Start. The connections from the switch to other devices in the project frame turn from red to green and the console window opens.
+
+In the console window of each Community SONiC switch, log in by entering the default username admin and the default password YourPaSsWoRd.
+
+Access Configuration mode in the Community SONiC command-line interface to configure each switch. See the Community [SONiC User Guide](https://github.com/sonic-net/SONiC/blob/master/doc/SONiC-User-Manual.md) for configuration information and procedures.
diff --git a/doc/sonic-gns3/image1.jpg b/doc/sonic-gns3/image1.jpg new file mode 100644 index 0000000000..d6e8b685dd Binary files /dev/null and b/doc/sonic-gns3/image1.jpg differ diff --git a/doc/sonic-gns3/image10.jpg b/doc/sonic-gns3/image10.jpg new file mode 100644 index 0000000000..b67f07fe33 Binary files /dev/null and b/doc/sonic-gns3/image10.jpg differ diff --git a/doc/sonic-gns3/image11.jpg b/doc/sonic-gns3/image11.jpg new file mode 100644 index 0000000000..db1500eb1f Binary files /dev/null and b/doc/sonic-gns3/image11.jpg differ diff --git a/doc/sonic-gns3/image2.jpg b/doc/sonic-gns3/image2.jpg new file mode 100644 index 0000000000..8add6d0aac Binary files /dev/null and b/doc/sonic-gns3/image2.jpg differ diff --git a/doc/sonic-gns3/image3.jpg b/doc/sonic-gns3/image3.jpg new file mode 100644 index 0000000000..6a6d97ce50 Binary files /dev/null and b/doc/sonic-gns3/image3.jpg differ diff --git a/doc/sonic-gns3/image4.jpg b/doc/sonic-gns3/image4.jpg new file mode 100644 index 0000000000..d8401113bb Binary files /dev/null and b/doc/sonic-gns3/image4.jpg differ diff --git a/doc/sonic-gns3/image5.jpg b/doc/sonic-gns3/image5.jpg new file mode 100644 index 0000000000..e6e6dc8850 Binary files /dev/null and b/doc/sonic-gns3/image5.jpg differ diff --git a/doc/sonic-gns3/image6.jpg b/doc/sonic-gns3/image6.jpg new file mode 100644 index 0000000000..828c5bd346 Binary files /dev/null and b/doc/sonic-gns3/image6.jpg differ diff --git a/doc/sonic-gns3/image7.jpg b/doc/sonic-gns3/image7.jpg new file mode 100644 index 0000000000..07def67e57 Binary files /dev/null and b/doc/sonic-gns3/image7.jpg differ diff --git a/doc/sonic-gns3/image8.jpg b/doc/sonic-gns3/image8.jpg new file mode 100644 index 0000000000..49bf8d54da Binary files /dev/null and b/doc/sonic-gns3/image8.jpg differ diff --git a/doc/sonic-gns3/image9.jpg b/doc/sonic-gns3/image9.jpg new file mode 100644 index 0000000000..36c8ac170e Binary files /dev/null and b/doc/sonic-gns3/image9.jpg differ diff --git 
a/doc/voq/fabric.md b/doc/voq/fabric.md index d963e304d0..0127014722 100644 --- a/doc/voq/fabric.md +++ b/doc/voq/fabric.md @@ -24,7 +24,7 @@ # List of Figures # Revision -| Rev | Date | Author | Change Description | +| Rev | Date | Author | Change Description | |:---:|:-----------:|:------------------:|--------------------| | 1 | Aug-28 2020 | Ngoc Do, Eswaran Baskaran (Arista Networks) | Initial Version | | 1.1 | Sep-1 2020 | Ngoc Do, Eswaran Baskaran (Arista Networks) | Add hotswap handling | @@ -33,15 +33,16 @@ | 3 | Jun-3 2022 | Cheryl Sanchez, Jie Feng (Arista Networks) | Update on fabric link monitoring | | 3.1 | Mar-30 2023 | Jie Feng (Arista Networks) | Update Overview, SAI API and Configuration and management section | | 3.2 | May-01 2023 | Jie Feng (Arista Networks) | Update Counter tables information | +| 3.3 | Oct-31 2023 | Jie Feng (Arista Networks) | Update clear fabric counter commands | # Scope This document covers: - + - Bring up of fabric ports in a VOQ chassis. -- Monitoring the fabric ports in forwarding and fabric chips. +- Monitoring the fabric ports in forwarding and fabric chips. -This document builds on top of the VOQ chassis architecture discussed [here](https://github.com/sonic-net/SONiC/blob/master/doc/voq/architecture.md) and the multi-ASIC architecture discussed [here](https://github.com/sonic-net/SONiC/blob/2f320430c8199132c686c06b5431ab93a86fb98f/doc/multi_asic/SONiC_multi_asic_hld.md). +This document builds on top of the VOQ chassis architecture discussed [here](https://github.com/sonic-net/SONiC/blob/master/doc/voq/architecture.md) and the multi-ASIC architecture discussed [here](https://github.com/sonic-net/SONiC/blob/2f320430c8199132c686c06b5431ab93a86fb98f/doc/multi_asic/SONiC_multi_asic_hld.md). 
# Definitions/Abbreviations @@ -49,7 +50,7 @@ This document builds on top of the VOQ chassis architecture discussed [here](htt |------|--------------------|--------------------------------| | SSI | Supervisor SONiC Instance | SONiC OS instance on a central supervisor module that controls a cluster of forwarding instances and the interconnection fabric. | | NPU | Network Processing Unit | Refers to the forwarding engine on a device that is responsible for packet forwarding. | -| ASIC | Application Specific Integrated Circuit | In addition to NPUs, also includes fabric chips that could forward packets or cells. | +| ASIC | Application Specific Integrated Circuit | In addition to NPUs, also includes fabric chips that could forward packets or cells. | | cell | Fabric Data Units | The data units that traverse a cell-based chassis fabric. | # Overview @@ -58,12 +59,12 @@ This document provides an overview of the SONiC support for fabric ports that ar # 1 Requirements -Fabric ports are used in systems in which there are multiple forwarding ASICs are required to be connected. Traffic passes from one front panel port in a forwarding ASIC over a fabric network to one or multiple front panel ports on one or other ASICs. The fabric network is formed using fabric ASICs. Fabric links on the fabric network connect fabric ports on forwarding ASICs to fabric ports on fabric ASICs. +Fabric ports are used in systems in which there are multiple forwarding ASICs are required to be connected. Traffic passes from one front panel port in a forwarding ASIC over a fabric network to one or multiple front panel ports on one or other ASICs. The fabric network is formed using fabric ASICs. Fabric links on the fabric network connect fabric ports on forwarding ASICs to fabric ports on fabric ASICs. High level requirements: -- SONiC needs to form a fabric network among forwarding ASICs, monitor and manage it. Monitoring could include link statistics, error monitoring and reporting, etc. 
-- SONiC should be able to initialize fabric asics and manage them similar to how forwarding ASICs are managed - using syncd and sairedis calls. +- SONiC needs to form a fabric network among forwarding ASICs, monitor and manage it. Monitoring could include link statistics, error monitoring and reporting, etc. +- SONiC should be able to initialize fabric asics and manage them similar to how forwarding ASICs are managed - using syncd and sairedis calls. # 2 Design @@ -77,7 +78,7 @@ For each fabric ASIC, there will be: - Swss container - Syncd container -Unlike forwarding ASICs, fabric ASICs do not have any front panel ports, but only fabric ports. So all the front panel port related containers like lldp, teamd and bgpd can be disabled for fabric ASICs. +Unlike forwarding ASICs, fabric ASICs do not have any front panel ports, but only fabric ports. So all the front panel port related containers like lldp, teamd and bgpd can be disabled for fabric ASICs. ## 2.2 Database Schemas @@ -134,7 +135,7 @@ Note that Linecard Sonic instances will also have STATE_DB|FABRIC_PORT_TABLE as ## 2.3 System Initialization -As part of multi-ASIC support, /etc/sonic/generated_services.conf contains the list of services which will be created for each asic when the system boots up. This is read by systemd-sonic-generator to generate the service files for each container that needs to run. +As part of multi-ASIC support, /etc/sonic/generated_services.conf contains the list of services which will be created for each asic when the system boots up. This is read by systemd-sonic-generator to generate the service files for each container that needs to run. Since the fabric ASIC doesn’t need lldp, bgpd and teamd containers to run, systemd-sonic-generator will be modified to not start these services for the fabric ASICs. 
A per-platform file called `asic_disabled_services` can list the services that are not needed for a given ASIC and systemd-sonic-generator will not generate the service files for these containers. For example, ``` @@ -152,7 +153,7 @@ PMON will be responsible for detecting card presence and hotswap events using th ## 2.5 Orchagent -Orchagent creates the switch using the SAI API similar to creating the switch for a forwarding ASIC, except that the switch type will be fabric. When the ASIC is initialized, all the fabric ports are initialized by default. The fabric ports are a subtype of SAI Port object and it can be obtained by getting all the fabric port objects from SAI. Since there are no front panel ports on a fabric ASIC, port_config.ini will be empty and portsyncd will not run. +Orchagent creates the switch using the SAI API similar to creating the switch for a forwarding ASIC, except that the switch type will be fabric. When the ASIC is initialized, all the fabric ports are initialized by default. The fabric ports are a subtype of SAI Port object and it can be obtained by getting all the fabric port objects from SAI. Since there are no front panel ports on a fabric ASIC, port_config.ini will be empty and portsyncd will not run. On fabric ASICs, OrchDaemon will only monitor and manage fabric ports. It will not maintain cpu port and front panel port related ochres, such as PortsOrch, IntfsOrch, NeighborOrch, VnetOrch, QosOrch, TunnelOrch, and etc. To simplify the change, we will just create FabricOrchDaemon inheriting OrchDaemon for fabric ASICs and this will only run FabricPortsOrch, the module responsible for managing fabric ports. @@ -206,11 +207,11 @@ The design of fabric link monitor is intentionally scoped to use local component ### 2.8.1 Monitor Fabric Link Status -Unhealthy fabric links may lead to traffic drops. Fabric link monitoring is an important tool to minimize traffic loss. 
The fabric link monitor algorithm monitors fabric link status and isolates the link if one or more criteria are true. By isolating a fabric link, the link is still up in the physical layer, but is taken out of service and does not distribute traffic. This feature is needed on both fabric ASICs and forwarding ASICs. +Unhealthy fabric links may lead to traffic drops. Fabric link monitoring is an important tool to minimize traffic loss. The fabric link monitor algorithm monitors fabric link status and isolates the link if one or more criteria are true. By isolating a fabric link, the link is still up in the physical layer, but is taken out of service and does not distribute traffic. This feature is needed on both fabric ASICs and forwarding ASICs. #### 2.8.1.1 Fabric link monitoring criteria -The fabric link monitoring algorithm checks two type of errors on a link: crc errors and uncorrectable errors. +The fabric link monitoring algorithm checks two type of errors on a link: crc errors and uncorrectable errors. The criteria can be extended to include checking other errors later. @@ -234,7 +235,7 @@ If more than #crcCells out of #rxCells received cells seen with error, the fabri ``` > config fabric port monitor poll threshold isolation <#polls> ``` -The above command can be used to set the number of consecutive polls in which the threshold needs to be detected to isolate a link. +The above command can be used to set the number of consecutive polls in which the threshold needs to be detected to isolate a link. ``` > config fabric port monitor poll threshold recovery <#polls> @@ -249,7 +250,7 @@ The above command sets the number of consecutive polls in which no error is dete > config fabric port unisolate [port_id] ``` -Besides the fabric link monitoring algorithm, the above two commands are added. The commands can be used to manually isolate and unisolate a fabric link ( i.e. take the link out of service and put the link back into service ). 
The two commands can help us debug on the system as well as force isolate a fabric link. +Besides the fabric link monitoring algorithm, the above two commands are added. The commands can be used to manually isolate and unisolate a fabric link ( i.e. take the link out of service and put the link back into service ). The two commands can help us debug on the system as well as force isolate a fabric link. ### 2.8.2 Monitor Fabric Capacity @@ -269,11 +270,11 @@ A show command is added to display the fabric capacity on a system. > show fabric monitor capacity Monitored fabric capacity threshold: 90% -ASIC Operating Total # % Last Event Last Time +ASIC Operating Total # % Last Event Last Time Links of Links ----- ------ -------- ---- ---------- --------- 0 110 112 98 None Never -1 112 112 100 None Never +1 112 112 100 None Never .... ``` @@ -297,7 +298,7 @@ The following proposed CLI is used to show the traffic among fabric links on bot –------ ----- --------- ---------- 0 1 0 36113 .... - 0 19 0 36107 + 0 19 0 36107 0 20 0 36110 .... ``` @@ -446,6 +447,18 @@ Command to display fabric counters queue. > show fabric counters queue ``` +Command to clear fabric counters port. + +``` +sonic-clear fabriccountersport +``` + +Command to clear fabric counters queue. + +``` +sonic-clear fabriccountersqueue +``` + Command to display fabric status. ``` @@ -509,11 +522,11 @@ The existing warmboot/fastboot feature is not affected due to this design. # 6 Testing -Fabric port testing will rely on sonic-mgmt tests that can run on chassis hardware. +Fabric port testing will rely on sonic-mgmt tests that can run on chassis hardware. - Test fabric port mapping: To verify the fabric mapping, we can inspect the remote switch ID that are saved in the STATE_DB and match that with the known chassis architecture. More comprehensive information about this testing can be found in the Chassis Fabric Test Plan document, which is available at testplan/Chassis-fabric-test-plan.md. 
-- Test traffic and counters: Send traffic through the chassis and verify traffic going through fabric ports via counters. +- Test traffic and counters: Send traffic through the chassis and verify traffic going through fabric ports via counters. - Test fabric port monitoring: * Use the CLI to isolate/unisolate fabric ports, and verify whether the corresponding STATE_DB entries are updated. @@ -522,7 +535,7 @@ Fabric port testing will rely on sonic-mgmt tests that can run on chassis hardwa # 7 Open/Action items - if any -- In this proposal, all fabric ports on fabric ASICs or forwarding ASICs that join to form the fabric network will be enabled even when there are no peer ports available. We could provide a config model for the platforms to express the expected fabric connectivity and turn off unnecessary fabric ports. +- In this proposal, all fabric ports on fabric ASICs or forwarding ASICs that join to form the fabric network will be enabled even when there are no peer ports available. We could provide a config model for the platforms to express the expected fabric connectivity and turn off unnecessary fabric ports. - Fabric ports that do not have a peer port will show up as a ‘down’ port. Fabric ports that do have a peer port could also go ‘down’ and there is no current way to differentiate this from a fabric port that does not have a peer port. This can be detected if the config model can express the expected fabric connectivity. 
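The isolation and recovery thresholds described in section 2.8.1.1 of fabric.md can be sketched as a small per-link state machine. This is a minimal sketch under stated assumptions, not the FabricPortsOrch implementation: the class and field names are hypothetical and only mirror the CLI arguments (`<#crcCells>`, `<#rxCells>`, `<#polls>`), only the CRC ratio criterion is modeled (uncorrectable errors are left out), and "no error is detected" is read literally as zero errored cells in a poll.

```python
from dataclasses import dataclass


@dataclass
class LinkMonitor:
    # Hypothetical names mirroring the CLI arguments described in the text.
    crc_cells: int       # <#crcCells>: errored cells allowed ...
    rx_cells: int        # <#rxCells>: ... per this many received cells
    isolate_polls: int   # consecutive over-threshold polls before isolating
    recover_polls: int   # consecutive error-free polls before unisolating
    isolated: bool = False
    bad_streak: int = 0
    clean_streak: int = 0

    def poll(self, errored_cells: int, received_cells: int) -> bool:
        """Feed the counter deltas of one poll; return the isolation state."""
        # Compare errored/received against crc_cells/rx_cells by
        # cross-multiplying, which also avoids dividing by zero.
        over_threshold = (
            errored_cells * self.rx_cells > self.crc_cells * received_cells
        )

        if over_threshold:
            self.bad_streak += 1
            self.clean_streak = 0
        elif errored_cells == 0:
            self.clean_streak += 1
            self.bad_streak = 0
        else:
            # Errors seen but below the threshold: both streaks are broken.
            self.bad_streak = 0
            self.clean_streak = 0

        if not self.isolated and self.bad_streak >= self.isolate_polls:
            self.isolated = True
        elif self.isolated and self.clean_streak >= self.recover_polls:
            self.isolated = False
        return self.isolated


monitor = LinkMonitor(crc_cells=1, rx_cells=1000, isolate_polls=2, recover_polls=3)
polls = [(5, 1000), (7, 1000), (0, 1000), (0, 1000), (0, 1000)]
states = [monitor.poll(errored, received) for errored, received in polls]
print(states)
```

In this run the link is isolated after the second over-threshold poll and put back into service after three consecutive error-free polls. Requiring consecutive polls in both directions keeps a single noisy sample from flapping the link.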
diff --git a/images/dash/bmv2-virtual-sonic.svg b/images/dash/bmv2-virtual-sonic.svg
new file mode 100644
index 0000000000..0170d223e7
--- /dev/null
+++ b/images/dash/bmv2-virtual-sonic.svg
@@ -0,0 +1,431 @@
[SVG markup omitted. Diagram title: "Single device mode". Shape labels: SONiC KVM; eth-midplane <PCIe mgmt> (KVM inf); eth0 <mgmt> (KVM inf); eth1 <Ether> (KVM inf); eth2 <Ether> (KVM inf); Ethernet0 <Hostif> (tun); Ethernet1 <Hostif> (tun); Cpu0 <Dpdk-port> (tun?); SWSS; SAIREDIS; remote dashsai client; dashsai server; saidash; BMv2 (BMv2 container); Dataplane APP; GNMI; APP DB; Other SONiC Services (BGP, LLDP, ...). Connector labels: RPC, GRPC, VPP, Counters Meter.]