Extended port operational status with various error and fault status #2060

Open
wants to merge 4 commits into base: master

Conversation

prgeor
Contributor

@prgeor prgeor commented Aug 3, 2024

The port status change notification now carries an additional field that indicates which port error/fault conditions caused the port operational status to go down. A capability attribute is added so that callers can query whether the switch supports the new fault/error status in the port status change notification.

The sai_port_error_status_t bitmap can indicate more than one port error condition at once by setting the bit that corresponds to each reason the port went down. Ideally, when the port operational status goes UP, the bitmap value should be zero, i.e. no errors.

Adding the error status to the port link status change notification allows better correlation of the port operational status with the port error/fault conditions that may have caused the port down event.
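
A minimal sketch of how a consumer might decode such a bitmap when the notification arrives. Every `ex_`/`EX_` name below is a hypothetical stand-in written for this illustration, not a definition from inc/saiport.h, and the bit assignments are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not the SAI definitions. */
typedef enum {
    EX_PORT_OPER_STATUS_UP,
    EX_PORT_OPER_STATUS_DOWN
} ex_port_oper_status_t;

/* One bit per error condition, so several can be reported together. */
enum {
    EX_PORT_ERROR_STATUS_CLEAR            = 0,
    EX_PORT_ERROR_STATUS_MAC_LOCAL_FAULT  = 1 << 0,
    EX_PORT_ERROR_STATUS_MAC_REMOTE_FAULT = 1 << 1,
    EX_PORT_ERROR_STATUS_FEC_SYNC_LOSS    = 1 << 2
};

/* Assumed payload: oper status and error bitmap travel in one record. */
typedef struct {
    uint64_t              port_id;
    ex_port_oper_status_t port_state;
    uint32_t              port_error_status; /* OR of the bits above */
} ex_port_notification_t;

/* Writes a comma-separated list of the set error names into buf. */
static const char *ex_describe_errors(uint32_t errors, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } names[] = {
        { EX_PORT_ERROR_STATUS_MAC_LOCAL_FAULT,  "mac_local_fault"  },
        { EX_PORT_ERROR_STATUS_MAC_REMOTE_FAULT, "mac_remote_fault" },
        { EX_PORT_ERROR_STATUS_FEC_SYNC_LOSS,    "fec_sync_loss"    }
    };
    size_t i;

    if (len == 0) {
        return buf;
    }
    buf[0] = '\0';
    if (errors == EX_PORT_ERROR_STATUS_CLEAR) {
        snprintf(buf, len, "none");
        return buf;
    }
    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (errors & names[i].bit) {
            size_t used = strlen(buf);
            snprintf(buf + used, len - used, "%s%s",
                     used ? "," : "", names[i].name);
        }
    }
    return buf;
}

/* Example handler: the down event and its causes are logged together. */
static void ex_on_port_state_change(const ex_port_notification_t *n)
{
    char reasons[128];

    if (n->port_state == EX_PORT_OPER_STATUS_DOWN) {
        printf("port 0x%llx down, errors: %s\n",
               (unsigned long long)n->port_id,
               ex_describe_errors(n->port_error_status, reasons, sizeof reasons));
    } else {
        printf("port 0x%llx up\n", (unsigned long long)n->port_id);
    }
}
```

Because the bitmap arrives in the same record as the oper status, the handler never has to match a down event against a separately delivered error event.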

@prgeor prgeor force-pushed the mac-fault branch 2 times, most recently from 83ff60a to 2f120bb Compare August 3, 2024 06:15
inc/saiport.h Outdated
* @type bool
* @flags READ_ONLY
*/
SAI_PORT_ATTR_MAC_LOCAL_FAULT_STATUS,
Collaborator

Is it better to add this as PORT_ATTR_FAULT_STATUS and then define an enum with values such as MAC local fault and remote fault? Other fault types could be added in the future.

@prgeor prgeor changed the title Add port attribute to get MAC local and remote fault status Extended port operational status with various error and fault status Aug 6, 2024
@JaiOCP
Contributor

JaiOCP commented Aug 12, 2024

@prgeor @lguohan
We discussed this topic offline with other vendors. There is an issue with having a separate port error notification: its callback and the port oper status callback can arrive in any order. There is also the question of when the error state is cleared.

#2048

The consensus is that, ideally, the port errors should be part of the existing oper_status_t port notification.
Please look at the private branch (the PR is not opened yet):

    typedef enum _sai_port_oper_status_t
    {
        /** Unknown */
        SAI_PORT_OPER_STATUS_UNKNOWN,

        /** Up */
        SAI_PORT_OPER_STATUS_UP,

        /** Down */
        SAI_PORT_OPER_STATUS_DOWN,

        /** Test Running */
        SAI_PORT_OPER_STATUS_TESTING,

        /** Not Present */
        SAI_PORT_OPER_STATUS_NOT_PRESENT,

        /** Remote-fault observed on the port */
        SAI_PORT_OPER_STATUS_REMOTE_FAULT,

        /** Local-fault for example loss of signal and more */
        SAI_PORT_OPER_STATUS_LOCAL_FAULT,

        /** Pre-emphasis set failed when configuration is done */
        SAI_PORT_OPER_STATUS_PREEMPHASIS_SET_FAILED,

        /** FEC set failed when configuration is done */
        SAI_PORT_OPER_STATUS_FEC_SET_FAILED,

        /** Speed set failed when configuration is done */
        SAI_PORT_OPER_STATUS_SPEED_SET_FAILED,

        /** Interface type set failed when configuration is done */
        SAI_PORT_OPER_STATUS_IF_TYPE_SET_FAILED,

        /** Media type set failed when configuration is done */
        SAI_PORT_OPER_STATUS_MEDIA_TYPE_SET_FAILED,

        ....
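
A rough sketch of how a consumer could interpret such an extended enum, assuming the values proposed above. The `ex_`/`EX_` names are hypothetical copies made for this illustration; they come from a private branch and are not in any released header.

```c
#include <stdbool.h>

/* Hypothetical subset of the proposed extended oper status values. */
typedef enum {
    EX_OPER_STATUS_UNKNOWN,
    EX_OPER_STATUS_UP,
    EX_OPER_STATUS_DOWN,
    EX_OPER_STATUS_TESTING,
    EX_OPER_STATUS_NOT_PRESENT,
    EX_OPER_STATUS_REMOTE_FAULT,
    EX_OPER_STATUS_LOCAL_FAULT,
    EX_OPER_STATUS_FEC_SET_FAILED
} ex_oper_status_t;

/* Only UP counts as link up; every fault value implies the link is down. */
static bool ex_link_is_up(ex_oper_status_t s)
{
    return s == EX_OPER_STATUS_UP;
}

/* The down reason is derived from the same value that reports the state. */
static const char *ex_down_reason(ex_oper_status_t s)
{
    switch (s) {
    case EX_OPER_STATUS_REMOTE_FAULT:   return "remote fault";
    case EX_OPER_STATUS_LOCAL_FAULT:    return "local fault";
    case EX_OPER_STATUS_FEC_SET_FAILED: return "FEC set failed";
    case EX_OPER_STATUS_UP:             return "none";
    default:                            return "unspecified";
    }
}
```

With a single value there is one callback, so no ordering problem arises and the reason clears implicitly when the status returns to UP. The trade-off is that an enum can report only one reason at a time.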

@prgeor
Contributor Author

prgeor commented Aug 19, 2024

@JaiOCP we decided not to use the existing notification. Instead we created a new notification, which can be extended in the future with new faults beyond local/remote faults.

@JaiOCP
Contributor

JaiOCP commented Aug 19, 2024 via email

inc/saiport.h Outdated
Comment on lines 80 to 109
typedef enum _sai_port_error_status_t
{
/** No errors */
SAI_PORT_ERROR_STATUS_CLEAR = 0,

SAI_PORT_ERROR_STATUS_MAC_LOCAL_FAULT = 1,

SAI_PORT_ERROR_STATUS_MAC_REMOTE_FAULT = 2,

SAI_PORT_ERROR_STATUS_FEC_SYNC_LOSS = 4,

SAI_PORT_ERROR_STATUS_FEC_LOSS_ALIGNMENT_MARKER = 8
} sai_port_error_status_t;

Contributor

There is already an enum sai_port_err_status_t that has similar statuses. Can we reuse it?

Contributor Author

@harshitgulati18 thanks for pointing this out. sai_port_err_status_list_t is a list of port errors that has to be traversed to find out which error is set, and that becomes less efficient as the list grows. This PR proposes a 64-bit bitmap of errors instead, which is more efficient to check. I guess 64 bits are sufficient to cover port-related errors...
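
To illustrate the difference, a small sketch comparing the two lookups. The `ex_` types and helpers are hypothetical and only mimic the shape of a counted list and a 64-bit bitmap; they are not the SAI definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counted list of error codes. */
typedef struct {
    uint32_t       count;
    const int32_t *list;
} ex_err_status_list_t;

/* List form: every query walks the entries until it finds a match. */
static bool ex_list_has(const ex_err_status_list_t *l, int32_t err)
{
    uint32_t i;

    for (i = 0; i < l->count; i++) {
        if (l->list[i] == err) {
            return true;
        }
    }
    return false;
}

/* Bitmap form: a single mask test, independent of how many errors are set. */
static bool ex_bitmap_has(uint64_t bitmap, uint64_t bit)
{
    return (bitmap & bit) != 0;
}
```

The bitmap also fits in a fixed-size field, so it can be carried inside a notification without a separately allocated buffer.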

inc/saiport.h Outdated

SAI_PORT_ERROR_AUTONEG_FAILED=0x8,

SAI_PORT_ERROR_LINK_TRAINING_FAILED=0x10,

duplication, appears twice

SAI_PORT_ERROR_STATUS_FEC_SYNC_LOSS = 4,

SAI_PORT_ERROR_STATUS_FEC_LOSS_ALIGNMENT_MARKER = 8
} sai_port_error_status_t;

please add SAI_PORT_ERROR_STATUS_HIGH_SER and SAI_PORT_ERROR_STATUS_HIGH_BER

Contributor

@mikeberesford mikeberesford left a comment

I believe we were going to mark sai_port_err_status_list_t/SAI_PORT_ATTR_ERR_STATUS_LIST as deprecated?

/**
* @brief Attribute bitmap data for #SAI_PORT_ATTR_ERROR_STATUS
*/
typedef enum _sai_port_error_status_t
Contributor

are any other values from sai_port_err_status_t needed?

Contributor Author

@mikeberesford may I know which ones you would like added here? I am not sure if I can duplicate the attributes here, but I can try...

Contributor

I'm personally mostly interested in local/remote fault, just want to make sure the others are not useful or are covered.

Contributor

In addition to Local/Remote fault, SAI_PORT_ERR_STATUS_CRC_RATE would also be good to include here.

Contributor Author

@harshitgulati18 added now

Signed-off-by: Prince George <prgeor@microsoft.com>
@prgeor
Contributor Author

prgeor commented Sep 16, 2024

@mikeberesford may I know what is blocking?

@mikeberesford
Contributor

I believe we were going to mark sai_port_err_status_list_t/SAI_PORT_ATTR_ERR_STATUS_LIST as deprecated?

Signed-off-by: Prince George <prgeor@microsoft.com>
@prgeor
Contributor Author

prgeor commented Sep 19, 2024

@mikeberesford can you check now?

@mikeberesford
Contributor

Please add @deprecated true to SAI_PORT_ATTR_ERR_STATUS_LIST as well?

Signed-off-by: Prince George <prgeor@microsoft.com>
@prgeor
Contributor Author

prgeor commented Sep 19, 2024

@mikeberesford added

@rlhui rlhui added the reviewed PR is discussed in SAI Meeting label Sep 19, 2024
@mikeberesford
Contributor

Still not done? I don't appear to be able to leave a comment on the line, but I still see:

    /**
     * @brief Port Down Error Status
     *
     * @type sai_port_err_status_list_t
     * @flags READ_ONLY
     */
    SAI_PORT_ATTR_ERR_STATUS_LIST,

Signed-off-by: Prince George <prgeor@microsoft.com>
@@ -1935,6 +1988,7 @@ typedef enum _sai_port_attr_t
*
* @type sai_port_err_status_list_t
* @flags READ_ONLY
* @deprecated true
Contributor Author

@mikeberesford now added here

Contributor

Thank you!

Contributor Author

@prgeor prgeor left a comment

@JaiOCP can you review?

Labels
reviewed PR is discussed in SAI Meeting
8 participants