Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Arista] Disable CPU C-States other than C1 #16703

Merged
merged 1 commit into from
Oct 14, 2023

Conversation

Staphylo
Copy link
Collaborator

@Staphylo Staphylo commented Sep 26, 2023

Why I did it

Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

Work item tracking
  • Microsoft ADO (number only): 25489102

How I did it

Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it

Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Description for the changelog

Disable CPU C-States other than C1 for Arista products

@Staphylo Staphylo requested a review from lguohan as a code owner September 26, 2023 14:26
@Staphylo
Copy link
Collaborator Author

@StormLiangMS @lipxu

@lipxu lipxu self-requested a review September 26, 2023 16:08
Copy link
Contributor

@lipxu lipxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @Staphylo , may you please let us know what's the motivation for this PR? Are we encountering any specific issues, or is there a performance concern driving this change? if so, how much improvement can be expected? thanks a lot

@Staphylo
Copy link
Collaborator Author

Hi @lipxu this changes bring Arista products to parity with others using ONIE on the regard of CPU cstates.
See #16339 and #16371
This change removes the corner case where the CPU needs to spend extra time to process some data because it first has to exit a higher cstate.
A real world example in networking would be the CPU receiving the control plane packet but since it's idle it has to spend a few cycles more to actually start processing it.

@lguohan lguohan added the device label Sep 28, 2023
@Blueve
Copy link
Contributor

Blueve commented Oct 13, 2023

@lguohan @yxieca I think we are ready to merge this PR, could you help?

@yxieca yxieca merged commit d760fb9 into sonic-net:master Oct 14, 2023
3 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Oct 14, 2023
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202205: #16887

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Oct 14, 2023
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202211: #16888

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Oct 17, 2023
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202305: #16905

mssonicbld pushed a commit that referenced this pull request Oct 17, 2023
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
@mssonicbld
Copy link
Collaborator

@Staphylo cherry pick PR didn't pass PR checker. Please check!!!
#16888

2 similar comments
@mssonicbld
Copy link
Collaborator

@Staphylo cherry pick PR didn't pass PR checker. Please check!!!
#16888

@mssonicbld
Copy link
Collaborator

@Staphylo cherry pick PR didn't pass PR checker. Please check!!!
#16888

mssonicbld pushed a commit that referenced this pull request Oct 20, 2023
Why I did it
Networking devices need to be responsive. Such responsiveness is harmed when the CPU change state.
There is a latency penalty when a CPU is idle (e.g C2) and need to exit this state to come back to C1 state.
To prevent this from happening the CPU should be forced to remain in C1 state.

How I did it
Generalize the cstate forcing to C1 to all Arista products.
This is done by adding processor.max_cstate=1 to the kernel cmdline for all CPUs.
Additionally Intel CPUs also need intel_idle.max_cstate=0 to fallback to the acpi_idle driver.

How to verify it
Check that processor.max_cstate=1 is present on the cmdline for AMD CPUs
Check that both processor.max_cstate=1 and intel_idle.max_cstate=0 are present on the cmdline for Intel CPUs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants