Skip to content

OCPBUGS-18662:rps: trigger udev even per queue #816

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 16, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1 +1 @@
SUBSYSTEM=="net", ACTION=="add", ENV{DEVPATH}!="/devices/virtual/net/*", TAG+="systemd", ENV{SYSTEMD_WANTS}="update-rps@%k.service"
SUBSYSTEM=="queues", ACTION=="add", ENV{DEVPATH}=="/devices/pci*/queues/rx*", TAG+="systemd", ENV{SYSTEMD_WANTS}="update-rps@%p.service"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the devpath for patching events is almost full sysfs queue directory - just missing the '/sys' prefix...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the path should start from /devices and not from /sys this is how it worked so far.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tal-or: I was not clear in the sentence above, I'm sorry. What I mean is that since the devpath for patching events is almost full sysfs queue directory, we can exploit such fact to simplify the action, as hopefully better described in my comment on set-rps-mask.sh.
The above sentence and the comment there are supposed to be read one after another as part of a the same message

40 changes: 9 additions & 31 deletions assets/performanceprofile/scripts/set-rps-mask.sh
Original file line number Diff line number Diff line change
@@ -1,38 +1,16 @@
#!/usr/bin/env bash

dev=$1
[ -n "${dev}" ] || { echo "The device argument is missing" >&2 ; exit 1; }
path=${1}
[ -n "${path}" ] || { echo "The device path argument is missing" >&2 ; exit 1; }

mask=$2
mask=${2}
[ -n "${mask}" ] || { echo "The mask argument is missing" >&2 ; exit 1; }

dev_dir="/sys/class/net/${dev}"
queue_path=${path%rx*}

function find_dev_dir {
systemd_devs=$(systemctl list-units -t device | grep sys-subsystem-net-devices | cut -d' ' -f1)
queue_num=${path#*queues/}
# replace '/' with '-'
queue_num="${queue_num/\//-}"
Copy link

@pabeni pabeni Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: is this substitution required? If I read correctly $path is not escaped anymore and this is basically de-escaping the queue nr part

Another minor note: you could possibly mention in the commit message that this change additionally avoid looping on all the rx queue, just for better future memory ;)

both nits are not intended to block the MR, just to be considered if a new version should be needed for any reason

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: is this substitution required? If I read correctly $path is not escaped anymore and this is basically de-escaping the queue nr part

when the $path is not escaped the queue number looks like rx\n and not rx-n as I would expect.
this is why this substitution is necessary.


for systemd_dev in ${systemd_devs}; do
dev_sysfs=$(systemctl show "${systemd_dev}" -p SysFSPath --value)

dev_orig_name="${dev_sysfs##*/}"
if [ "${dev_orig_name}" = "${dev}" ]; then
dev_name="${systemd_dev##*-}"
dev_name="${dev_name%%.device}"
if [ "${dev_name}" = "${dev}" ]; then # disregard the original device unit
continue
fi

echo "${dev} device was renamed to $dev_name"
dev_dir="/sys/class/net/${dev_name}"
break
fi
done
}

[ -d "${dev_dir}" ] || find_dev_dir # the net device was renamed, find the new name
[ -d "${dev_dir}" ] || { sleep 5; find_dev_dir; } # search failed, wait a little and try again
[ -d "${dev_dir}" ] || { echo "${dev_dir}" directory not found >&2 ; exit 0; } # the interface disappeared, not an error

for i in "${dev_dir}"/queues/rx-*/rps_cpus; do
echo "${mask}" > "${i}"
done
# set rps affinity for the queue
echo "${mask}" > "/sys${queue_path}${queue_num}/rps_cpus"
Original file line number Diff line number Diff line change
Expand Up @@ -514,7 +514,7 @@ func getOfflineCPUs(offlineCpus string) []*unit.UnitOption {
}

func getRPSUnitOptions(rpsMask string) []*unit.UnitOption {
cmd := fmt.Sprintf("%s %%i %s", getBashScriptPath(setRPSMask), rpsMask)
cmd := fmt.Sprintf("%s %%I %s", getBashScriptPath(setRPSMask), rpsMask)
return []*unit.UnitOption{
// [Unit]
// Description
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ spec:
path: /usr/local/bin/hugepages-allocation.sh
user: {}
- contents:
source: data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKZGV2PSQxClsgLW4gIiR7ZGV2fSIgXSB8fCB7IGVjaG8gIlRoZSBkZXZpY2UgYXJndW1lbnQgaXMgbWlzc2luZyIgPiYyIDsgZXhpdCAxOyB9CgptYXNrPSQyClsgLW4gIiR7bWFza30iIF0gfHwgeyBlY2hvICJUaGUgbWFzayBhcmd1bWVudCBpcyBtaXNzaW5nIiA+JjIgOyBleGl0IDE7IH0KCmRldl9kaXI9Ii9zeXMvY2xhc3MvbmV0LyR7ZGV2fSIKCmZ1bmN0aW9uIGZpbmRfZGV2X2RpciB7CiAgc3lzdGVtZF9kZXZzPSQoc3lzdGVtY3RsIGxpc3QtdW5pdHMgLXQgZGV2aWNlIHwgZ3JlcCBzeXMtc3Vic3lzdGVtLW5ldC1kZXZpY2VzIHwgY3V0IC1kJyAnIC1mMSkKCiAgZm9yIHN5c3RlbWRfZGV2IGluICR7c3lzdGVtZF9kZXZzfTsgZG8KICAgIGRldl9zeXNmcz0kKHN5c3RlbWN0bCBzaG93ICIke3N5c3RlbWRfZGV2fSIgLXAgU3lzRlNQYXRoIC0tdmFsdWUpCgogICAgZGV2X29yaWdfbmFtZT0iJHtkZXZfc3lzZnMjIyovfSIKICAgIGlmIFsgIiR7ZGV2X29yaWdfbmFtZX0iID0gIiR7ZGV2fSIgXTsgdGhlbgogICAgICBkZXZfbmFtZT0iJHtzeXN0ZW1kX2RldiMjKi19IgogICAgICBkZXZfbmFtZT0iJHtkZXZfbmFtZSUlLmRldmljZX0iCiAgICAgIGlmIFsgIiR7ZGV2X25hbWV9IiA9ICIke2Rldn0iIF07IHRoZW4gIyBkaXNyZWdhcmQgdGhlIG9yaWdpbmFsIGRldmljZSB1bml0CiAgICAgICAgICAgICAgY29udGludWUKICAgICAgZmkKCiAgICAgIGVjaG8gIiR7ZGV2fSBkZXZpY2Ugd2FzIHJlbmFtZWQgdG8gJGRldl9uYW1lIgogICAgICBkZXZfZGlyPSIvc3lzL2NsYXNzL25ldC8ke2Rldl9uYW1lfSIKICAgICAgYnJlYWsKICAgIGZpCiAgZG9uZQp9CgpbIC1kICIke2Rldl9kaXJ9IiBdIHx8IGZpbmRfZGV2X2RpciAgICAgICAgICAgICAgICAjIHRoZSBuZXQgZGV2aWNlIHdhcyByZW5hbWVkLCBmaW5kIHRoZSBuZXcgbmFtZQpbIC1kICIke2Rldl9kaXJ9IiBdIHx8IHsgc2xlZXAgNTsgZmluZF9kZXZfZGlyOyB9ICAjIHNlYXJjaCBmYWlsZWQsIHdhaXQgYSBsaXR0bGUgYW5kIHRyeSBhZ2FpbgpbIC1kICIke2Rldl9kaXJ9IiBdIHx8IHsgZWNobyAiJHtkZXZfZGlyfSIgZGlyZWN0b3J5IG5vdCBmb3VuZCA+JjIgOyBleGl0IDA7IH0gIyB0aGUgaW50ZXJmYWNlIGRpc2FwcGVhcmVkLCBub3QgYW4gZXJyb3IKCmZvciBpIGluICIke2Rldl9kaXJ9Ii9xdWV1ZXMvcngtKi9ycHNfY3B1czsgZG8KICBlY2hvICIke21hc2t9IiA+ICIke2l9Igpkb25lCg==
source: data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKcGF0aD0kezF9ClsgLW4gIiR7cGF0aH0iIF0gfHwgeyBlY2hvICJUaGUgZGV2aWNlIHBhdGggYXJndW1lbnQgaXMgbWlzc2luZyIgPiYyIDsgZXhpdCAxOyB9CgptYXNrPSR7Mn0KWyAtbiAiJHttYXNrfSIgXSB8fCB7IGVjaG8gIlRoZSBtYXNrIGFyZ3VtZW50IGlzIG1pc3NpbmciID4mMiA7IGV4aXQgMTsgfQoKcXVldWVfcGF0aD0ke3BhdGglcngqfQoKcXVldWVfbnVtPSR7cGF0aCMqcXVldWVzL30KIyByZXBsYWNlICcvJyB3aXRoICctJwpxdWV1ZV9udW09IiR7cXVldWVfbnVtL1wvLy19IgoKIyBzZXQgcnBzIGFmZmluaXR5IGZvciB0aGUgcXVldWUKZWNobyAiJHttYXNrfSIgPiAiL3N5cyR7cXVldWVfcGF0aH0ke3F1ZXVlX251bX0vcnBzX2NwdXMi
verification: {}
group: {}
mode: 448
Expand Down Expand Up @@ -68,7 +68,7 @@ spec:
path: /etc/sysctl.d/99-default-rps-mask.conf
user: {}
- contents:
source: data:text/plain;charset=utf-8;base64,U1VCU1lTVEVNPT0ibmV0IiwgQUNUSU9OPT0iYWRkIiwgRU5We0RFVlBBVEh9IT0iL2RldmljZXMvdmlydHVhbC9uZXQvKiIsIFRBRys9InN5c3RlbWQiLCBFTlZ7U1lTVEVNRF9XQU5UU309InVwZGF0ZS1ycHNAJWsuc2VydmljZSIK
source: data:text/plain;charset=utf-8;base64,U1VCU1lTVEVNPT0icXVldWVzIiwgQUNUSU9OPT0iYWRkIiwgRU5We0RFVlBBVEh9PT0iL2RldmljZXMvcGNpKi9xdWV1ZXMvcngqIiwgVEFHKz0ic3lzdGVtZCIsIEVOVntTWVNURU1EX1dBTlRTfT0idXBkYXRlLXJwc0AlcC5zZXJ2aWNlIg==
verification: {}
group: {}
mode: 420
Expand Down Expand Up @@ -124,7 +124,7 @@ spec:

[Service]
Type=oneshot
ExecStart=/usr/local/bin/set-rps-mask.sh %i 0
ExecStart=/usr/local/bin/set-rps-mask.sh %I 0
name: update-rps@.service
- contents: |
[Unit]
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
creationTimestamp: null
name: performance-manual
ownerReferences:
- apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
name: manual
uid: ""
spec:
kubeletConfig:
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
anonymous: {}
webhook:
cacheTTL: 0s
x509: {}
authorization:
webhook:
cacheAuthorizedTTL: 0s
cacheUnauthorizedTTL: 0s
containerRuntimeEndpoint: ""
cpuManagerPolicy: static
cpuManagerPolicyOptions:
full-pcpus-only: "true"
cpuManagerReconcilePeriod: 5s
evictionHard:
imagefs.available: 15%
memory.available: 100Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
kubeReserved:
memory: 500Mi
logging:
flushFrequency: 0
options:
json:
infoBufferSize: "0"
verbosity: 0
memoryManagerPolicy: Static
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
reservedMemory:
- limits:
memory: 1100Mi
numaNode: 0
reservedSystemCPUs: "0"
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
systemReserved:
memory: 500Mi
topologyManagerPolicy: single-numa-node
volumeStatsAggPeriod: 0s
machineConfigPoolSelector:
matchLabels:
machineconfiguration.openshift.io/role: worker-cnf
status:
conditions: null
Loading