
Scale: registering network devices (register_netdev) takes a long time and causes high CPU usage

We have a network application that makes a netlink call; in the kernel module we then call register_netdev to create the network device we need. This works without any issues for small numbers of devices.

When we create 8k network devices all at once, creation takes approximately 2 minutes to complete, after which CPU usage remains high, mainly from many processes named systemd-udevd and ifquery:

root@NetwAA:~# pgrep -l systemd
1925 systemd-journal
1946 systemd-udevd
2130 systemd-logind
13381 systemd-udevd
13385 systemd-udevd
13393 systemd-udevd
13399 systemd-udevd
13405 systemd-udevd
13411 systemd-udevd
13415 systemd-udevd
13423 systemd-udevd
13428 systemd-udevd
13432 systemd-udevd
13441 systemd-udevd
13446 systemd-udevd
13452 systemd-udevd
13459 systemd-udevd
13462 systemd-udevd
13468 systemd-udevd
13475 systemd-udevd
13483 systemd-udevd
...

root@NetwAA:~# pgrep -l ifquery
13388 ifquery
13392 ifquery
13397 ifquery
13404 ifquery
13409 ifquery
13417 ifquery
...

top - 00:55:05 up 1:14, 2 users, load average: 70.09, 30.47, 12.17
Tasks: 462 total, 12 running, 450 sleeping, 0 stopped, 0 zombie
%Cpu(s): 19.0 us, 74.1 sy, 0.0 ni, 6.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 15937.0 total, 13024.0 free, 2283.3 used, 629.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 13308.5 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1946 root 20 0 26412 11004 3896 R 99.3 0.1 2:22.81 systemd-u+
13551 root 20 0 7660 6820 1556 D 36.5 0.0 0:12.04 ifquery
13564 root 20 0 5812 4816 1404 R 15.6 0.0 0:10.69 ifquery
13656 root 20 0 2908 1968 1460 D 12.0 0.0 0:10.28 ifquery
13590 root 20 0 4096 3396 1564 D 11.6 0.0 0:10.29 ifquery
13711 root 20 0 3568 2856 1556 D 11.3 0.0 0:10.49 ifquery
13572 root 20 0 3964 3104 1540 D 11.0 0.0 0:12.23 ifquery
13784 root 20 0 4096 3236 1404 R 11.0 0.0 0:10.94 ifquery
13471 root 20 0 2380 1812 1564 D 10.6 0.0 0:10.91 ifquery
13640 root 20 0 8848 8020 1440 D 10.6 0.0 0:10.65 ifquery
13701 root 20 0 8056 7300 1512 D 10.0 0.0 0:11.42 ifquery
13630 root 20 0 3304 2592 1556 D 9.6 0.0 0:11.44 ifquery
13704 root 20 0 4492 3640 1548 D 9.6 0.0 0:10.96 ifquery
13392 root 20 0 9112 8304 1460 D 9.0 0.1 0:11.99 ifquery
13521 root 20 0 7660 6704 1440 D 9.0 0.0 0:12.84 ifquery
13770 root 20 0 8716 7768 1452 D 9.0 0.0 0:12.06 ifquery

High CPU usage by systemd-udevd and ifquery

After a long time (about 10-15 minutes):

root@NetwAA:~# udevadm monitor
UDEV [538.005910] add /devices/virtual/net/xe9.924/queues/tx-0 (queues)
UDEV [538.011380] add /devices/virtual/net/xe9.934/queues/rx-0 (queues)
UDEV [538.248471] add /devices/virtual/net/xe9.933 (net)
UDEV [538.266858] add /devices/virtual/net/xe9.976 (net)
UDEV [538.358951] add /devices/virtual/net/xe9.933/queues/rx-0 (queues)
UDEV [538.359507] add /devices/virtual/net/xe9.933/queues/tx-0 (queues)
UDEV [538.359594] add /devices/virtual/net/xe9.936/queues/rx-0 (queues)
UDEV [538.360094] add /devices/virtual/net/xe9.934/queues/tx-0 (queues)
UDEV [538.360121] add /devices/virtual/net/xe9.936/queues/tx-0 (queues)
UDEV [538.609054] add /devices/virtual/net/xe9.954 (net)
UDEV [538.624835] add /devices/virtual/net/xe9.968 (net)
UDEV [538.797069] add /devices/virtual/net/xe9.932 (net)
UDEV [538.804163] add /devices/virtual/net/xe9.954/queues/rx-0 (queues)
UDEV [538.813290] add /devices/virtual/net/xe9.954/queues/tx-0 (queues)
UDEV [538.813778] add /devices/virtual/net/xe9.968/queues/rx-0 (queues)
UDEV [538.814202] add /devices/virtual/net/xe9.968/queues/tx-0 (queues)
UDEV [538.814664] add /devices/virtual/net/xe9.976/queues/rx-0 (queues)
UDEV [538.815270] add /devices/virtual/net/xe9.976/queues/tx-0 (queues)
UDEV [539.581034] add /devices/virtual/net/xe9.953 (net)
UDEV [539.634047] add /devices/virtual/net/xe9.978 (net)
...
...
UDEV [1277.422066] add /devices/virtual/net/xe9.7999/queues/rx-0 (queues)
UDEV [1277.425117] add /devices/virtual/net/xe9.7987 (net)
UDEV [1277.425660] add /devices/virtual/net/xe9.7987/queues/tx-0 (queues)
UDEV [1277.426005] add /devices/virtual/net/xe9.7987/queues/rx-0 (queues)
UDEV [1277.436997] add /devices/virtual/net/xe9.7996 (net)
UDEV [1277.437450] add /devices/virtual/net/xe9.7996/queues/rx-0 (queues)
UDEV [1277.437550] add /devices/virtual/net/xe9.7996/queues/tx-0 (queues)
UDEV [1277.473557] add /devices/virtual/net/xe9.7997 (net)
UDEV [1277.474560] add /devices/virtual/net/xe9.7997/queues/tx-0 (queues)
UDEV [1277.474615] add /devices/virtual/net/xe9.7997/queues/rx-0 (queues)

root@NetwAA:~#
root@NetwAA:~#
root@NetwAA:~# pgrep -l ifquery
root@NetwAA:~# pgrep -l systemd
1 systemd
1925 systemd-journal
1946 systemd-udevd
2130 systemd-logind
root@NetwAA:~#
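
To put a number on how fast udev is draining the backlog, the `udevadm monitor` lines above can be summarized with a small script. This is only a sketch: the function name `summarize` and the regular expression are my own assumptions based on the line format shown in this post, not part of any udev tooling. The sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this post:
#   UDEV [538.248471] add /devices/virtual/net/xe9.933 (net)
LINE_RE = re.compile(
    r"^(UDEV|KERNEL)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\(([^)]+)\)$"
)


def summarize(lines):
    """Return (events per subsystem, events per second) for UDEV lines.

    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group(1) != "UDEV":
            continue
        counts[match.group(5)] += 1          # subsystem, e.g. "net" or "queues"
        stamps.append(float(match.group(2)))  # timestamp in seconds
    span = max(stamps) - min(stamps) if len(stamps) > 1 else 0.0
    rate = len(stamps) / span if span > 0 else 0.0
    return counts, rate


# A few lines copied from the monitor output in this post.
sample = [
    "UDEV [538.005910] add /devices/virtual/net/xe9.924/queues/tx-0 (queues)",
    "UDEV [538.011380] add /devices/virtual/net/xe9.934/queues/rx-0 (queues)",
    "UDEV [538.248471] add /devices/virtual/net/xe9.933 (net)",
    "UDEV [539.634047] add /devices/virtual/net/xe9.978 (net)",
]

counts, rate = summarize(sample)
print(dict(counts), f"{rate:.1f} events/s")
```

Running the same function over the full capture (for example, by saving the monitor output to a file and passing its lines in) would show whether the event rate stays flat or degrades as more devices exist. The excerpt's timestamps run from about 538 s to 1277 s, which matches the 10-15 minutes mentioned above.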

What is happening in the background in the kernel?
Where is the bottleneck? Is it due to contention on the RTNL lock (rtnl_lock)?
