-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support fetching PSI stats for cgroupv2 containers - takeover #3679
Conversation
@thaJeztah as you reviewed #3358 and I tried to fix your suggestions in this PR using also the commits you already checked. Can you check if this PR looks fine from code point of view? |
There is another effort in #3627. Both this PR and #3627 has the same bug, see #3627 (comment) |
events.go
Outdated
println("from is nil") | ||
return | ||
} | ||
if to == nil { | ||
println("to is nil") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were these leftovers from debugging this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes and right now the PR is not in state of ready to review.
I think I need a decision from maintainers if it makes sense to go down the pointer change or not and if not what needs to be changed to make it able to merge.
@kolyshkin @dcantah do you know who could help me to answer these questions, because otherwise I don't feel that I want to invest my time?
@kolyshkin @thaJeztah can we figure out a path forward here? My read is that #3358 was close; had maintainer LGTM and some nits to fix. I think @dqminh wasn't convinced that @thaJeztah's changes were needed for the original PR and could done in a follow-up, but that conversation just ended with no resolution/decision. @thaJeztah worked in a branch on @dqminh's fork, and then opened (still in Draft state) #3627 as a carry with his suggestions Then @szuecs opened this PR (#3679) with those changes and his own attempts to respond to @thaJeztah's suggestions around pointer types for the PSI data. I created a branch starting back at #3358 and @kolyshkin's originals nits fixed without anything else. But I just realized how silly it would be to open a 4th PR when the original and 2 carrys are in limbo. There are consumers/users who clearly want the PSI capabilities and we are now past one year without anything merged. I'm happy to help here, but I don't want to step on these existing attempts unless no one has time/ability to drive to completion. If so, I'm happy to at least attempt to finish this work on a more reasonable timeline than one more year :) Apologies if my interpretations are not quite right; happy to hear any corrections to my assumptions; they are just from reading through the history and comments to date. |
@estesp I am all in to change whatever is needed to make it to an agreed change, but I have no idea what needs to be done. |
Signed-off-by: liuqiming.lqm <liuqiming.lqm@alibaba-inc.com>
ping @thaJeztah when you have time; per our discussion |
From my side we can also do a short online meeting. I am in Berlin/Europe timezone, but I can also meet at 8pm if you are in US west. Maybe we can connect via kubernetes slack in sig-node channel. What do you think? |
Thanks for estesp summarizing the PR status on comment #3679 (comment) After reading it through, below are a few things we need to have some agreement for @szuecs to move the PR forward:
My thoughts for 1): The purpose of whether using the pointer or structure is to help Below are possible options for discussion: a) With no pointers on all PSIData, PSIStats, CpuStats: https://go.dev/play/p/1QFeuFiWowx
b) With pointers on PSIData, and structure on PSIStats, CpuStats: https://go.dev/play/p/QCiOr-ndWjU
c) With no pointers on all PSIData, PSIStats, CpuStats , and add omitempty on PSIData like ThrottlingData: https://go.dev/play/p/CcuVZ-Q22YU
d) Whether make pointers on PSIStats, CpuStats: I don't think it matters a lot, as long as the code has the right logic to initialize the pointer, handle nil pointer cases if using the pointer. Or using the structure without omitempty (since with omitempty, it doesn't achieve its purpose like CpuUsage and ThrottlingData, which can be refactored later on) Overall, b) makes sense to me as
when PSI enabled, for CPU, it is:
when PSI enabled, for memory/io, it is:
@mrunalp when you have time, could you please help review the PR as well and share your thoughts about how we could get the change merged ? |
/cc @bobbypage |
dc3b9de
to
bfac6b1
Compare
@kolyshkin ptal |
From my side c) makes most sense and I can provide the change if there is an agreement. |
szuecs, I have updated my comments above. I think 3) is not what we wanted after a second thought, as it doesn't help the purpose ii) and iii) for PSI enabled/disabled, and CPU value case. |
@Dragoncell I prepared b) or c) locally. We just need a decision and I can push the one which is the favorite here. |
We read output from the following files if they exists: - cpu.pressure - memory.pressure - io.pressure Each are in format: ``` some avg10=0.00 avg60=0.00 avg300=0.00 total=0 full avg10=0.00 avg60=0.00 avg300=0.00 total=0 ``` Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
@kolyshkin I rebased to make the PR clean. Can you make a decision based on #3679 (comment) ? |
OK, at this point it is easier for me to fix all the things myself; there are too many and I don't want a bunch of nitpicks in here. As for deciding on pointer vs non-pointer, let's say since PSI is optional, we should make Also,
is not true -- even in our test case we check cpu.some and cpu.full (although these two sets of numbers are the same). Anyway, PTAL #3900. For the record, here's full diff from this PR: diff --git a/events.go b/events.go
index 79672381..47aa1abe 100644
--- a/events.go
+++ b/events.go
@@ -111,13 +111,6 @@ information is displayed once every 5 seconds.`,
},
}
-func convertPSI(from *cgroups.PSIData, to *cgroups.PSIData) {
- to.Avg10 = from.Avg10
- to.Avg60 = from.Avg60
- to.Avg300 = from.Avg300
- to.Total = from.Total
-}
-
func convertLibcontainerStats(ls *libcontainer.Stats) *types.Stats {
cg := ls.CgroupStats
if cg == nil {
@@ -136,8 +129,7 @@ func convertLibcontainerStats(ls *libcontainer.Stats) *types.Stats {
s.CPU.Throttling.Periods = cg.CpuStats.ThrottlingData.Periods
s.CPU.Throttling.ThrottledPeriods = cg.CpuStats.ThrottlingData.ThrottledPeriods
s.CPU.Throttling.ThrottledTime = cg.CpuStats.ThrottlingData.ThrottledTime
- convertPSI(&cg.CpuStats.PSI.Some, &s.CPU.PSI.Some)
- convertPSI(&cg.CpuStats.PSI.Full, &s.CPU.PSI.Full)
+ s.CPU.PSI = cg.CpuStats.PSI
s.CPUSet = types.CPUSet(cg.CPUSetStats)
@@ -147,8 +139,7 @@ func convertLibcontainerStats(ls *libcontainer.Stats) *types.Stats {
s.Memory.Swap = convertMemoryEntry(cg.MemoryStats.SwapUsage)
s.Memory.Usage = convertMemoryEntry(cg.MemoryStats.Usage)
s.Memory.Raw = cg.MemoryStats.Stats
- convertPSI(&cg.MemoryStats.PSI.Some, &s.Memory.PSI.Some)
- convertPSI(&cg.MemoryStats.PSI.Full, &s.Memory.PSI.Full)
+ s.Memory.PSI = cg.MemoryStats.PSI
s.Blkio.IoServiceBytesRecursive = convertBlkioEntry(cg.BlkioStats.IoServiceBytesRecursive)
s.Blkio.IoServicedRecursive = convertBlkioEntry(cg.BlkioStats.IoServicedRecursive)
@@ -158,8 +149,7 @@ func convertLibcontainerStats(ls *libcontainer.Stats) *types.Stats {
s.Blkio.IoMergedRecursive = convertBlkioEntry(cg.BlkioStats.IoMergedRecursive)
s.Blkio.IoTimeRecursive = convertBlkioEntry(cg.BlkioStats.IoTimeRecursive)
s.Blkio.SectorsRecursive = convertBlkioEntry(cg.BlkioStats.SectorsRecursive)
- convertPSI(&cg.BlkioStats.PSI.Some, &s.Blkio.PSI.Some)
- convertPSI(&cg.BlkioStats.PSI.Full, &s.Blkio.PSI.Full)
+ s.Blkio.PSI = cg.BlkioStats.PSI
s.Hugetlb = make(map[string]types.Hugetlb)
for k, v := range cg.HugetlbStats {
diff --git a/libcontainer/cgroups/fs2/fs2.go b/libcontainer/cgroups/fs2/fs2.go
index 2373006d..47b67afc 100644
--- a/libcontainer/cgroups/fs2/fs2.go
+++ b/libcontainer/cgroups/fs2/fs2.go
@@ -9,7 +9,6 @@ import (
"github.com/opencontainers/runc/libcontainer/cgroups"
"github.com/opencontainers/runc/libcontainer/cgroups/fscommon"
"github.com/opencontainers/runc/libcontainer/configs"
- "golang.org/x/sys/unix"
)
type parseError = fscommon.ParseError
@@ -115,17 +114,15 @@ func (m *Manager) GetStats() (*cgroups.Stats, error) {
if err := statCpu(m.dirPath, st); err != nil && !os.IsNotExist(err) {
errs = append(errs, err)
}
- // psi (since kernel 4.20)
- // open *.pressure file returns
- // - ErrNotExist when kernel < 4.20 or CONFIG_PSI is disabled
- // - ENOTSUP when we requires psi=1 in kernel command line to enable PSI support
- if err := statPSI(m.dirPath, "cpu.pressure", st); err != nil && !errors.Is(err, os.ErrNotExist) && !errors.Is(err, unix.ENOTSUP) {
+ // PSI (since kernel 4.20).
+ var err error
+ if st.CpuStats.PSI, err = statPSI(m.dirPath, "cpu.pressure"); err != nil {
errs = append(errs, err)
}
- if err := statPSI(m.dirPath, "memory.pressure", st); err != nil && !errors.Is(err, os.ErrNotExist) && !errors.Is(err, unix.ENOTSUP) {
+ if st.MemoryStats.PSI, err = statPSI(m.dirPath, "memory.pressure"); err != nil {
errs = append(errs, err)
}
- if err := statPSI(m.dirPath, "io.pressure", st); err != nil && !errors.Is(err, os.ErrNotExist) && !errors.Is(err, unix.ENOTSUP) {
+ if st.BlkioStats.PSI, err = statPSI(m.dirPath, "io.pressure"); err != nil {
errs = append(errs, err)
}
// hugetlb (since kernel 5.6)
diff --git a/libcontainer/cgroups/fs2/psi.go b/libcontainer/cgroups/fs2/psi.go
index 157109aa..603c6eaf 100644
--- a/libcontainer/cgroups/fs2/psi.go
+++ b/libcontainer/cgroups/fs2/psi.go
@@ -2,98 +2,84 @@ package fs2
import (
"bufio"
+ "errors"
"fmt"
"os"
"strconv"
"strings"
+ "golang.org/x/sys/unix"
+
"github.com/opencontainers/runc/libcontainer/cgroups"
)
-func statPSI(dirPath string, file string, stats *cgroups.Stats) error {
- if stats == nil {
- return fmt.Errorf("invalid Stats pointer is nil")
- }
+func statPSI(dirPath string, file string) (*cgroups.PSIStats, error) {
f, err := cgroups.OpenFile(dirPath, file, os.O_RDONLY)
if err != nil {
- return err
+ if errors.Is(err, os.ErrNotExist) || errors.Is(err, unix.ENOTSUP) {
+ // Open *.pressure file returns:
+ // - ErrNotExist when kernel < 4.20 or when CONFIG_PSI is disabled;
+ // - ENOTSUP when psi=1 kernel cmdline is required for PSI support.
+ // In both cases, stats are not available, so return nil.
+ err = nil
+ }
+ return nil, err
}
defer f.Close()
- var psistats *cgroups.PSIStats
- switch file {
- case "cpu.pressure":
- psistats = &stats.CpuStats.PSI
- case "memory.pressure":
- psistats = &stats.MemoryStats.PSI
- case "io.pressure":
- psistats = &stats.BlkioStats.PSI
- }
-
+ var psistats cgroups.PSIStats
sc := bufio.NewScanner(f)
for sc.Scan() {
parts := strings.Fields(sc.Text())
+ var pv *cgroups.PSIData
switch parts[0] {
case "some":
- data, err := parsePSIData(parts[1:])
- if err != nil {
- return err
- }
- psistats.Some = *data
+ pv = &psistats.Some
case "full":
- data, err := parsePSIData(parts[1:])
+ pv = &psistats.Full
+ }
+ if pv != nil {
+ *pv, err = parsePSIData(parts[1:])
if err != nil {
- return err
+ return nil, &parseError{Path: dirPath, File: file, Err: err}
}
- psistats.Full = *data
}
}
if err := sc.Err(); err != nil {
- return &parseError{Path: dirPath, File: file, Err: err}
- }
- return nil
-}
-
-func setFloat(s string, f *float64) error {
- if f == nil {
- return fmt.Errorf("invalid pointer *float64 is nil")
- }
- v, err := strconv.ParseFloat(s, 64)
- if err != nil {
- return fmt.Errorf("invalid PSI value: %q", s)
+ return nil, &parseError{Path: dirPath, File: file, Err: err}
}
- *f = v
-
- return nil
+ return &psistats, nil
}
-func parsePSIData(psi []string) (*cgroups.PSIData, error) {
+func parsePSIData(psi []string) (cgroups.PSIData, error) {
data := cgroups.PSIData{}
for _, f := range psi {
kv := strings.SplitN(f, "=", 2)
if len(kv) != 2 {
- return nil, fmt.Errorf("invalid psi data: %q", f)
+ return data, fmt.Errorf("invalid psi data: %q", f)
}
+ var pv *float64
switch kv[0] {
case "avg10":
- if err := setFloat(kv[1], &data.Avg10); err != nil {
- return nil, err
- }
+ pv = &data.Avg10
case "avg60":
- if err := setFloat(kv[1], &data.Avg60); err != nil {
- return nil, err
- }
+ pv = &data.Avg60
case "avg300":
- if err := setFloat(kv[1], &data.Avg300); err != nil {
- return nil, err
- }
+ pv = &data.Avg300
case "total":
v, err := strconv.ParseUint(kv[1], 10, 64)
if err != nil {
- return nil, fmt.Errorf("invalid PSI value: %q", f)
+ return data, fmt.Errorf("invalid %s PSI value: %w", kv[0], err)
}
data.Total = v
}
+ if pv != nil {
+ v, err := strconv.ParseFloat(kv[1], 64)
+ if err != nil {
+ return data, fmt.Errorf("invalid %s PSI value: %w", kv[0], err)
+ }
+ *pv = v
+ }
}
- return &data, nil
+ return data, nil
}
diff --git a/libcontainer/cgroups/fs2/psi_test.go b/libcontainer/cgroups/fs2/psi_test.go
index b8b8dc63..f70c19f7 100644
--- a/libcontainer/cgroups/fs2/psi_test.go
+++ b/libcontainer/cgroups/fs2/psi_test.go
@@ -9,10 +9,10 @@ import (
"github.com/opencontainers/runc/libcontainer/cgroups"
)
-const examplePSIData = `some avg10=1.71 avg60=2.36 avg300=2.57 total=230548833
+func TestStatCPUPSI(t *testing.T) {
+ const examplePSIData = `some avg10=1.71 avg60=2.36 avg300=2.57 total=230548833
full avg10=1.00 avg60=1.01 avg300=1.00 total=157622356`
-func TestStatCPUPSI(t *testing.T) {
// We're using a fake cgroupfs.
cgroups.TestMode = true
@@ -23,12 +23,12 @@ func TestStatCPUPSI(t *testing.T) {
t.Fatal(err)
}
- var stats cgroups.Stats
- if err := statPSI(fakeCgroupDir, "cpu.pressure", &stats); err != nil {
- t.Error(err)
+ st, err := statPSI(fakeCgroupDir, "cpu.pressure")
+ if err != nil {
+ t.Fatal(err)
}
- if !reflect.DeepEqual(stats.CpuStats.PSI, cgroups.PSIStats{
+ if !reflect.DeepEqual(*st, cgroups.PSIStats{
Some: cgroups.PSIData{
Avg10: 1.71,
Avg60: 2.36,
@@ -42,6 +42,6 @@ func TestStatCPUPSI(t *testing.T) {
Total: 157622356,
},
}) {
- t.Errorf("unexpected PSI result: %+v", stats.CpuStats.PSI)
+ t.Errorf("unexpected PSI result: %+v", st)
}
}
diff --git a/libcontainer/cgroups/stats.go b/libcontainer/cgroups/stats.go
index 13a0eafd..8ff1fbb5 100644
--- a/libcontainer/cgroups/stats.go
+++ b/libcontainer/cgroups/stats.go
@@ -47,7 +47,7 @@ type PSIStats struct {
type CpuStats struct {
CpuUsage CpuUsage `json:"cpu_usage,omitempty"`
ThrottlingData ThrottlingData `json:"throttling_data,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type CPUSetStats struct {
@@ -102,7 +102,7 @@ type MemoryStats struct {
UseHierarchy bool `json:"use_hierarchy"`
Stats map[string]uint64 `json:"stats,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type PageUsageByNUMA struct {
@@ -147,7 +147,7 @@ type BlkioStats struct {
IoMergedRecursive []BlkioStatEntry `json:"io_merged_recursive,omitempty"`
IoTimeRecursive []BlkioStatEntry `json:"io_time_recursive,omitempty"`
SectorsRecursive []BlkioStatEntry `json:"sectors_recursive,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type HugetlbStats struct {
diff --git a/tests/integration/events.bats b/tests/integration/events.bats
index f736ff96..b7607003 100644
--- a/tests/integration/events.bats
+++ b/tests/integration/events.bats
@@ -37,20 +37,23 @@ function teardown() {
runc run -d --console-socket "$CONSOLE_SOCKET" test_busybox
[ "$status" -eq 0 ]
- # stress the cpu a little bit
- runc exec test_busybox dd if=/dev/zero bs=1M count=5 of=/dev/null
+ # Stress the CPU a little bit.
+ runc exec test_busybox dd if=/dev/zero bs=1M count=42 of=/dev/null
[ "$status" -eq 0 ]
+ runc exec test_busybox sh -c 'tail /sys/fs/cgroup/*.pressure'
+
runc events --stats test_busybox
[ "$status" -eq 0 ]
- # fetch stats to see PSI metrics
+
+ # Check PSI metrics.
+ jq '.data.cpu.psi' <<<"${lines[0]}"
for psi_type in some full; do
for psi_metric in avg10 avg60 avg300 total; do
- [[ "$(echo "${lines[0]}" | jq .data.cpu.psi.$psi_type.$psi_metric)" != "" ]]
+ echo -n "checking .data.cpu.psi.$psi_type.$psi_metric != 0: "
+ jq -e '.data.cpu.psi.'$psi_type.$psi_metric' != 0' <<<"${lines[0]}"
done
done
- # total must have been more than 0
- [[ "$(echo "${lines[0]}" | jq .data.cpu.psi.some.total)" != "0" ]]
}
function test_events() {
diff --git a/types/events.go b/types/events.go
index cae53b24..e28ac8c3 100644
--- a/types/events.go
+++ b/types/events.go
@@ -24,9 +24,9 @@ type Stats struct {
NetworkInterfaces []*NetworkInterface `json:"network_interfaces"`
}
-type PSIData cgroups.PSIData
+type PSIData = cgroups.PSIData
-type PSIStats cgroups.PSIStats
+type PSIStats = cgroups.PSIStats
type Hugetlb struct {
Usage uint64 `json:"usage,omitempty"`
@@ -50,7 +50,7 @@ type Blkio struct {
IoMergedRecursive []BlkioEntry `json:"ioMergedRecursive,omitempty"`
IoTimeRecursive []BlkioEntry `json:"ioTimeRecursive,omitempty"`
SectorsRecursive []BlkioEntry `json:"sectorsRecursive,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type Pids struct {
@@ -77,7 +77,7 @@ type CpuUsage struct {
type Cpu struct {
Usage CpuUsage `json:"usage,omitempty"`
Throttling Throttling `json:"throttling,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type CPUSet struct {
@@ -108,7 +108,7 @@ type Memory struct {
Kernel MemoryEntry `json:"kernel,omitempty"`
KernelTCP MemoryEntry `json:"kernelTCP,omitempty"`
Raw map[string]uint64 `json:"raw,omitempty"`
- PSI PSIStats `json:"psi,omitempty"`
+ PSI *PSIStats `json:"psi,omitempty"`
}
type L3CacheInfo struct { |
Since there is no movement in #3358 nor in dqminh#37 I decided to create the PR myself with using the same commits.
I rebased on main, so I hope you can easily review it.