[BUG] udev high CPU usage and trouble shutting down on archlinux-zen-6.5.8 #142

Open
BlackOutedMind opened this issue Oct 26, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@BlackOutedMind

BlackOutedMind commented Oct 26, 2023

Describe the bug
After installing EnvyControl I rebooted into hybrid mode, but my system sat at the shutdown screen with the following text:
[  229.011736] systemd-shutdown[1]: Waiting for process: 361 ((udev-worker)), 350 ((udev-worker)), 318 ((udev-worker)), 328 ((udev-worker)), 302 (systemd-udevd)
[  246.869244] INFO: task (udev-worker):318 blocked for more than 122 seconds.
[  246.869514]       Tainted: P           OE      6.5.8-zen1-1-zen #1
[  246.869780] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.870140] INFO: task (udev-worker):328 blocked for more than 122 seconds.
[  246.870412]       Tainted: P           OE      6.5.8-zen1-1-zen #1
[  246.870686] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.870996] INFO: task (udev-worker):361 blocked for more than 122 seconds.
[  246.871310]       Tainted: P           OE      6.5.8-zen1-1-zen #1
[  246.871588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
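
For anyone comparing logs, here is a minimal sketch for pulling the blocked tasks out of a saved copy of the kernel log. It assumes the warnings follow the "INFO: task NAME:PID blocked for more than N seconds" wording seen above. The function name blocked_tasks is made up for this example.

```python
import re

# Matches the hung-task warning as it appears in the log above, e.g.
# "INFO: task (udev-worker):318 blocked for more than 122 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(log_text):
    """Return a (task name, pid, seconds) tuple for each hung-task warning."""
    return [
        (m["name"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]
```

Passing the text of a saved log to blocked_tasks() gives one tuple per warning, which makes it easy to see whether the same udev workers are stuck across reboots.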
To Reproduce
Steps to reproduce the behavior:

  1. Run sudo envycontrol -s integrated
  2. Reboot
  3. After booting, the system is fine until, after some time, udev kicks in with high CPU usage.

System Information:

  • Model: Lenovo Legion 5i 15IHM05H
  • Distro: Arch Linux
  • Kernel: 6.5.8-zen
  • DE/WM and Display Manager (if applicable): Gnome 45 with GDM
  • EnvyControl version: 3.3.0
  • Nvidia driver version: 35.113.01
  • lspci output:
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 02)
00:02.0 VGA compatible controller: Intel Corporation CometLake-H GT2 [UHD Graphics] (rev 05)
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 02)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Comet Lake PCH Thermal Controller
00:14.0 USB controller: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller
00:14.2 RAM memory: Intel Corporation Comet Lake PCH Shared SRAM
00:14.3 Network controller: Intel Corporation Comet Lake PCH CNVi WiFi
00:15.0 Serial bus controller: Intel Corporation Comet Lake PCH Serial IO I2C Controller #0
00:15.1 Serial bus controller: Intel Corporation Comet Lake PCH Serial IO I2C Controller #1
00:16.0 Communication controller: Intel Corporation Comet Lake HECI Controller
00:17.0 SATA controller: Intel Corporation Device 06d3
00:1d.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #9 (rev f0)
00:1d.6 PCI bridge: Intel Corporation Device 06b6 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Comet Lake LPC Controller
00:1f.3 Audio device: Intel Corporation Comet Lake PCH cAVS
00:1f.4 SMBus: Intel Corporation Comet Lake PCH SMBus Controller
00:1f.5 Serial bus controller: Intel Corporation Comet Lake PCH SPI Controller
01:00.0 VGA compatible controller: NVIDIA Corporation TU106M [GeForce RTX 2060 Mobile] (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU106 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller: NVIDIA Corporation TU106 USB Type-C UCSI Controller (rev a1)
06:00.0 Non-Volatile memory controller: Silicon Motion, Inc. SM2263EN/SM2263XT (DRAM-less) NVMe SSD Controllers (rev 03)
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

Additional context
systemctl status systemd-udevd

systemd-udevd[26734]: /etc/udev/rules.d/99-switch.rules:1 Unknown group 'plugdev', ignoring.

sudo envycontrol -s integrated --verbose

Switching to integrated mode
Successfully disabled nvidia-persistenced.service
INFO: Created file /etc/modprobe.d/blacklist-nvidia.conf
# Automatically generated by EnvyControl

blacklist nouveau
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_uvm
blacklist nvidia_modeset
alias nouveau off
alias nvidia off
alias nvidia_drm off
alias nvidia_uvm off
alias nvidia_modeset off

INFO: Created file /lib/udev/rules.d/50-remove-nvidia.rules
# Automatically generated by EnvyControl

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{power/control}="auto", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{power/control}="auto", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{power/control}="auto", ATTR{remove}="1"

# Remove NVIDIA VGA/3D controller devices
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x03[0-9]*", ATTR{power/control}="auto", ATTR{remove}="1"

Operation completed successfully
Please reboot your computer for changes to take effect!
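
To see which devices the generated 50-remove-nvidia.rules would act on, here is a rough sketch that mimics its vendor and class matching. It assumes udev compares ATTR values with shell-style glob patterns, which Python's fnmatch approximates, and that sysfs reports the class as a string like "0x030000". The vendor ID and class patterns are copied from the rules file above. The function name matching_rule is made up for this example.

```python
from fnmatch import fnmatchcase

NVIDIA_VENDOR = "0x10de"

# Class patterns copied from the generated rules file, in the same order.
CLASS_PATTERNS = {
    "0x0c0330": "USB xHCI host controller",
    "0x0c8000": "USB Type-C UCSI controller",
    "0x040300": "audio device",
    "0x03[0-9]*": "VGA/3D controller",
}


def matching_rule(vendor, pci_class):
    """Return the label of the first rule that matches, or None if none does."""
    if vendor != NVIDIA_VENDOR:
        return None
    for pattern, label in CLASS_PATTERNS.items():
        if fnmatchcase(pci_class, pattern):
            return label
    return None
```

Assuming the standard PCI class codes, the three NVIDIA functions in the lspci output (01:00.0, 01:00.2, 01:00.3) would report 0x030000, 0x0c0330 and 0x0c8000, so each of them should match one of the rules.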
BlackOutedMind added the bug label on Oct 26, 2023
@NoelJacob

Same problem here. Output of inxi -Fxz:

System:
Kernel: 6.7.1-1-cachyos-bore arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
Desktop: KDE Plasma v: 5.27.10 Distro: Garuda Linux base: Arch Linux
Machine:
Type: Laptop System: HP product: Victus by HP Laptop 16-e0xxx v: N/A
serial: <filter>
Mobo: HP model: 88EE v: 80.73 serial: <filter> UEFI: AMI v: F.19
date: 10/17/2023
Battery:
ID-1: BAT0 charge: 70.1 Wh (100.0%) condition: 70.1/70.1 Wh (100.0%)
volts: 17.4 min: 15.4 model: HP Primary status: full
CPU:
Info: 6-core model: AMD Ryzen 5 5600H with Radeon Graphics bits: 64
type: MT MCP arch: Zen 3 rev: 0 cache: L1: 384 KiB L2: 3 MiB L3: 16 MiB
Speed (MHz): avg: 601 high: 2819 min/max: 400/4280 cores: 1: 400 2: 400
3: 400 4: 400 5: 400 6: 400 7: 2819 8: 400 9: 400 10: 400 11: 400 12: 400
bogomips: 79052
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: NVIDIA GA106M [GeForce RTX 3060 Mobile / Max-Q]
vendor: Hewlett-Packard driver: nvidia v: 545.29.06 arch: Ampere
bus-ID: 01:00.0
Device-2: AMD Cezanne [Radeon Vega Series / Radeon Mobile Series]
vendor: Hewlett-Packard driver: amdgpu v: kernel arch: GCN-5 bus-ID: 06:00.0
temp: 47.0 C
Device-3: Quanta HP Wide Vision HD Camera driver: uvcvideo type: USB
bus-ID: 1-3:2
Display: server: X.Org v: 21.1.11 with: Xwayland v: 23.2.4 driver: X:
loaded: amdgpu,nvidia unloaded: modesetting,nouveau dri: radeonsi
gpu: amdgpu resolution: 1920x1080
API: EGL v: 1.5 drivers: nvidia,radeonsi,swrast platforms:
active: gbm,x11,surfaceless,device inactive: wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: amd mesa v: 23.3.4-arch1.2
glx-v: 1.4 direct-render: yes renderer: AMD Radeon Graphics (radeonsi
renoir LLVM 16.0.6 DRM 3.56 6.7.1-1-cachyos-bore)
API: Vulkan v: 1.3.276 drivers: radv,nvidia,llvmpipe surfaces: xcb,xlib
devices: 3
Audio:
Device-1: NVIDIA GA106 High Definition Audio vendor: Hewlett-Packard
driver: snd_hda_intel v: kernel bus-ID: 01:00.1
Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor vendor: Hewlett-Packard
driver: snd_rn_pci_acp3x v: kernel bus-ID: 06:00.5
Device-3: AMD Family 17h/19h HD Audio vendor: Hewlett-Packard
driver: snd_hda_intel v: kernel bus-ID: 06:00.6
API: ALSA v: k6.7.1-1-cachyos-bore status: kernel-api
Server-1: PipeWire v: 1.0.1 status: n/a (root, process)
Network:
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: Hewlett-Packard driver: r8169 v: kernel port: e000 bus-ID: 02:00.0
IF: eno1 state: down mac: <filter>
Device-2: Realtek RTL8852AE 802.11ax PCIe Wireless Network Adapter
vendor: Hewlett-Packard driver: rtw89_8852ae v: N/A port: d000
bus-ID: 03:00.0
IF: wlo1 state: down mac: <filter>
IF-ID-1: enp6s0f4u1 state: unknown speed: -1 duplex: half mac: <filter>
Bluetooth:
Device-1: Realtek Bluetooth Radio driver: btusb v: 0.8 type: USB
bus-ID: 1-4:3
Report: btmgmt ID: hci0 rfk-id: 1 state: down bt-service: enabled,running
rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
Device-2: OPPO RMX3392 driver: rndis_host v: kernel type: USB
bus-ID: 3-1:2
Drives:
Local Storage: total: 476.94 GiB used: 25.24 GiB (5.3%)
ID-1: /dev/nvme0n1 vendor: Western Digital model: WD PC SN810
SDCPNRY-512G-1006 size: 476.94 GiB temp: 39.9 C
Partition:
ID-1: / size: 86.2 GiB used: 24.89 GiB (28.9%) fs: btrfs dev: /dev/nvme0n1p4
ID-2: /boot/efi size: 1.03 GiB used: 363.1 MiB (34.4%) fs: vfat
dev: /dev/nvme0n1p1
ID-3: /home size: 86.2 GiB used: 24.89 GiB (28.9%) fs: btrfs
dev: /dev/nvme0n1p4
ID-4: /var/log size: 86.2 GiB used: 24.89 GiB (28.9%) fs: btrfs
dev: /dev/nvme0n1p4
ID-5: /var/tmp size: 86.2 GiB used: 24.89 GiB (28.9%) fs: btrfs
dev: /dev/nvme0n1p4
Swap:
ID-1: swap-1 type: zram size: 14.96 GiB used: 0 KiB (0.0%) dev: /dev/zram0
Sensors:
System Temperatures: cpu: 53.9 C mobo: N/A gpu: amdgpu temp: 47.0 C
Fan Speeds (rpm): fan-1: 2191 fan-2: 2390
Info:
Processes: 320 Uptime: 6m Memory: total: 16 GiB note: est.
available: 14.96 GiB used: 3.62 GiB (24.2%) Init: systemd Compilers:
gcc: 13.2.1 clang: 16.0.6 Packages: 1287 Shell: Sudo v: 1.9.15p5
inxi: 3.3.31

@autodistries

Hi, the same thing happens to me every time I reboot from integrated GPU mode.
I might also add that my NVIDIA GPU is not turned off, even in integrated mode (GreenWithEnvy reports a fairly stable 2 W consumption).
EnvyControl version is 3.3.1 and it reports integrated mode.
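
A quick way to cross-check whether the NVIDIA device is still present in integrated mode is to look at sysfs directly. This is a minimal sketch, assuming the usual /sys/bus/pci/devices layout with a vendor file and a power/runtime_status file per device. The function name nvidia_devices is made up for this example.

```python
from pathlib import Path


def nvidia_devices(root="/sys/bus/pci/devices"):
    """List (PCI address, runtime power status) for each NVIDIA device found."""
    base = Path(root)
    if not base.is_dir():
        return []
    found = []
    for dev in sorted(base.iterdir()):
        vendor_file = dev / "vendor"
        # 0x10de is the NVIDIA vendor ID used by the EnvyControl udev rules.
        if not vendor_file.is_file() or vendor_file.read_text().strip() != "0x10de":
            continue
        status_file = dev / "power" / "runtime_status"
        status = status_file.read_text().strip() if status_file.is_file() else "unknown"
        found.append((dev.name, status))
    return found
```

If the removal rules worked, I would expect nvidia_devices() to return an empty list in integrated mode. An entry with status "active" would be consistent with the GPU still drawing power.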

Output of inxi -Fxz:

CPU:
Info: 8-core (4-mt/4-st) model: 12th Gen Intel Core i5-12450H bits: 64
type: MST AMCP arch: Alder Lake rev: 3 cache: L1: 704 KiB L2: 7 MiB
L3: 12 MiB
Speed (MHz): avg: 928 high: 1824 min/max: 400/4400:3300 cores: 1: 565
2: 400 3: 1562 4: 400 5: 400 6: 1483 7: 1703 8: 400 9: 1824 10: 400 11: 1602
12: 400 bogomips: 59904
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
Device-1: Intel Alder Lake-P GT1 [UHD Graphics]
vendor: Acer Incorporated ALI driver: i915 v: kernel arch: Gen-12.2
bus-ID: 0000:00:02.0
Device-2: NVIDIA AD107M [GeForce RTX 4060 Max-Q / Mobile]
vendor: Acer Incorporated ALI driver: N/A arch: Lovelace
bus-ID: 0000:01:00.0
Device-3: Quanta ACER HD User Facing driver: N/A type: USB bus-ID: 3-6:2
Display: x11 server: X.Org v: 21.1.11 with: Xwayland v: 23.2.4 driver: X:
loaded: modesetting,nvidia unloaded: nouveau dri: iris gpu: i915
resolution: 1920x1080~144Hz
API: EGL v: 1.5 drivers: iris,nvidia,swrast platforms:
active: x11,surfaceless,device inactive: gbm,wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: intel mesa v: 24.0.1-arch1.1
glx-v: 1.4 direct-render: yes renderer: Mesa Intel Graphics (ADL GT2)
API: Vulkan v: 1.3.276 drivers: intel,nvidia,llvmpipe surfaces: xcb,xlib
devices: 3
Audio:
Device-1: Intel Alder Lake PCH-P High Definition Audio
vendor: Acer Incorporated ALI driver: sof-audio-pci-intel-tgl
bus-ID: 0000:00:1f.3
API: ALSA v: k6.7.6-zen1-1-zen status: kernel-api
Server-1: PipeWire v: 1.0.3 status: active
Network:
Device-1: Intel Alder Lake-P PCH CNVi WiFi vendor: Rivet Networks Dual Band
Wi-Fi 6 Killer AX1650i 160MHz 2x2 driver: iwlwifi v: kernel
bus-ID: 0000:00:14.3
IF: wlp0s20f3 state: up mac: <filter>
Device-2: Realtek Killer E2600 GbE vendor: Acer Incorporated ALI
driver: r8169 v: kernel port: 3000 bus-ID: 0000:2b:00.0
IF: enp43s0 state: down mac: <filter>
Bluetooth:
Device-1: Intel AX201 Bluetooth driver: btusb v: 0.8 type: USB
bus-ID: 3-10:3
Report: btmgmt ID: hci0 rfk-id: 2 state: down bt-service: enabled,running
rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
RAID:
Hardware-1: Intel Volume Management Device NVMe RAID Controller driver: vmd
v: 0.6 bus-ID: 0000:00:0e.0
Drives:
Local Storage: total: 1.38 TiB used: 111.62 GiB (7.9%)
ID-1: /dev/nvme0n1 vendor: Micron model: 3400 MTFDKBA512TFH
size: 476.94 GiB temp: 31.9 C
ID-2: /dev/nvme1n1 vendor: Transcend model: TS1TMTE250S size: 931.51 GiB
temp: 31.9 C
Partition:
ID-1: / size: 444.03 GiB used: 111.56 GiB (25.1%) fs: btrfs
dev: /dev/nvme0n1p6
ID-2: /boot/efi size: 256 MiB used: 61.4 MiB (24.0%) fs: vfat
dev: /dev/nvme0n1p1
ID-3: /home size: 444.03 GiB used: 111.56 GiB (25.1%) fs: btrfs
dev: /dev/nvme0n1p6
ID-4: /var/log size: 444.03 GiB used: 111.56 GiB (25.1%) fs: btrfs
dev: /dev/nvme0n1p6
ID-5: /var/tmp size: 444.03 GiB used: 111.56 GiB (25.1%) fs: btrfs
dev: /dev/nvme0n1p6
Swap:
ID-1: swap-1 type: zram size: 15.31 GiB used: 0 KiB (0.0%) dev: /dev/zram0
Sensors:
System Temperatures: cpu: 40.0 C mobo: N/A
Fan Speeds (rpm): N/A
Info:
Memory: total: 16 GiB note: est. available: 15.31 GiB used: 3.05 GiB (19.9%)
Processes: 366 Uptime: 3m Init: systemd
Packages: 1473 Compilers: gcc: 13.2.1 Shell: fish v: 3.7.0 inxi: 3.3.33

@autodistries

autodistries commented Mar 1, 2024

I have very little knowledge of this area; I just stumbled on something that might be a fix.

TL;DR: After installing or updating the NVIDIA drivers in hybrid (or full NVIDIA?) mode, reboot (this might be unnecessary), switch to integrated mode, reboot again, and reinstall acpi_call-dkms. A forced shutdown is needed after that install, but then things seem to work fine (the NVIDIA GPU is off in integrated mode and shutdown goes through).

The rest of this comment describes my adventure with that. Not a very scientific process, though.


The two problems (no shutdown, no power-down of the dGPU in integrated mode) were fixed after installing acpi_call-dkms.
It seems that sometimes the NVIDIA card is turned off and sometimes not. The laptop has no problem shutting down when the NVIDIA GPU is off.
The bug reappears sometimes (probably every time kernel modules are rebuilt or the NVIDIA driver is updated). I have limited data, but reinstalling acpi_call-dkms (which certainly rebuilds something) fixed the bug after a reboot.
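
Since the bug seems tied to module rebuilds, it may help to record which relevant modules are actually loaded after each reboot. This is a small sketch that parses the contents of /proc/modules. It assumes the module shipped by acpi_call-dkms is named acpi_call. The NVIDIA and nouveau names come from the blacklist file earlier in this thread. The function name loaded_watched_modules is made up for this example.

```python
# Module names to look for: acpi_call is an assumed name, and the rest are
# taken from the generated blacklist-nvidia.conf.
WATCHED = ("acpi_call", "nvidia", "nvidia_drm", "nvidia_uvm", "nvidia_modeset", "nouveau")


def loaded_watched_modules(proc_modules_text, watched=WATCHED):
    """Return the watched module names that appear in /proc/modules content."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in watched if name in loaded]
```

Calling it as loaded_watched_modules(open("/proc/modules").read()) in integrated mode should ideally list acpi_call and none of the NVIDIA modules.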

At one point I thought it was specific to the NVIDIA driver version:

I updated the NVIDIA packages today to version 550.76-2. I could not get GreenWithEnvy to work.
Other symptoms included (but were not limited to):

  • In hybrid mode, suspending and then resuming would leave me on the last frame rendered before the suspend, with no way to interact with the computer directly.
  • In integrated mode, the NVIDIA GPU was present and running, and I got the same problem as OP when trying to reboot.

I rolled back to the previous version of the NVIDIA packages I had, 550.67-1. After a reboot, things worked again.

But I updated my system today to NVIDIA 550.78-1, and after doing the TL;DR steps, things work.
