Skip to content

Conversation

@wxnzb
Copy link
Contributor

@wxnzb wxnzb commented Jul 24, 2025

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @wxnzb, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new set of unit tests for BPF cgroup socket programs. These tests are designed to validate the behavior of connection handling and redirection logic within a BPF-based service mesh environment, ensuring proper management of network namespace cookies and correct routing of connections through various service configurations, including those involving waypoints.

Highlights

  • New BPF Unit Test: Added workload_cgroup_sock_test.o to the build system, enabling the compilation of a new BPF program specifically for unit testing cgroup socket functionality.
  • Cgroup Socket Program Testing: Integrated a new testCgroupSock suite into the existing Go-based BPF unit test framework, expanding test coverage for BPF programs interacting with cgroup sockets.
  • Comprehensive Connection Redirection Tests: Included detailed tests for the BPF_CGROUP_SOCK_CONNECT4 hook, covering critical scenarios such as recording and removing kmesh managed network namespace cookies, and connection redirection to backend services both with and without a configured waypoint, as well as service-level waypoint redirection.
  • eBPF Program Type Support: Updated the BPF test framework to correctly recognize and handle ebpf.CGroupSockAddr program types during the loading and preparation of eBPF specifications.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds unit tests for cgroup_sock BPF programs. The changes are well-structured, introducing a new test file, updating the Makefile for compilation, and adding the test logic in Go.

My review focuses on improving the robustness and maintainability of the new test code. I've identified a few critical and high-severity issues that should be addressed:

  • The use of log.Fatalf in tests, which can prematurely terminate the entire test suite.
  • Potential panics from unchecked errors and improper handling of network connections.
  • A resource leak due to an unclosed network listener.

Additionally, I've included some medium-severity suggestions to enhance code quality by reducing duplication and adhering to language-specific conventions. Addressing these points will make the new tests more reliable and easier to maintain in the long run.

UpstreamID: 1,
}
if err := FrontendMap.Update(&f_key, &f_val, ebpf.UpdateAny); err != nil {
log.Fatalf("Update failed: %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Using log.Fatalf inside a test is a critical issue because it will terminate the entire test process, preventing other tests from running. You should use t.Fatalf instead, which will only fail the current test and allow the test suite to continue with other tests.

This issue occurs in multiple places in this file:

  • line 894
  • line 924
  • line 1017
  • line 1049
  • line 1144
  • line 1174

Please replace all instances of log.Fatalf with t.Fatalf.

Suggested change
log.Fatalf("Update failed: %v", err)
t.Fatalf("Update failed: %v", err)

Comment on lines 933 to 944
conn, err := (&net.Dialer{
LocalAddr: &net.TCPAddr{
IP: net.ParseIP(localIP),
Port: clientPort,
},
Timeout: 2 * time.Second,
}).Dial("tcp4", serverSocket)
defer conn.Close()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The error returned from Dial is not being checked. If Dial fails, conn will be nil, and the subsequent defer conn.Close() will cause a panic when the test function returns. You must check the error and handle it appropriately.

Suggested change
conn, err := (&net.Dialer{
LocalAddr: &net.TCPAddr{
IP: net.ParseIP(localIP),
Port: clientPort,
},
Timeout: 2 * time.Second,
}).Dial("tcp4", serverSocket)
defer conn.Close()
conn, err := (&net.Dialer{
LocalAddr: &net.TCPAddr{
IP: net.ParseIP(localIP),
Port: clientPort,
},
Timeout: 2 * time.Second,
}).Dial("tcp4", serverSocket)
if err != nil {
t.Fatalf("Dial failed: %v", err)
}
defer conn.Close()

serverSocket := localIP + ":" + strconv.Itoa(serverPort)
var testPort uint16 = 55555
testIpPort := localIP + ":" + strconv.Itoa(int(htons(testPort)))
_, err := net.Listen("tcp4", testIpPort)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The net.Listen call can fail, but the error is not being checked. Additionally, the returned listener is not being closed, which will cause a resource leak. You should check the error and ensure the listener is closed using defer.

Suggested change
_, err := net.Listen("tcp4", testIpPort)
testListener, err := net.Listen("tcp4", testIpPort)
if err != nil {
t.Fatalf("Failed to listen on test port %s: %v", testIpPort, err)
}
defer testListener.Close()

Comment on lines 51 to 56
WORKLOAD_CGROUP_SOCK_FLAGS = -I$(ROOT_DIR)/bpf/kmesh/ \
-I$(ROOT_DIR)/bpf/kmesh/probes \
-I$(ROOT_DIR)/bpf/kmesh/workload/include \
-I$(ROOT_DIR)/api/v2-c
workload_cgroup_sock_test.o: workload_cgroup_sock_test.c
$(QUIET) $(CLANG) $(CLANG_FLAGS) $(WORKLOAD_CGROUP_SOCK_FLAGS) -c $< -o $@

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The variable WORKLOAD_CGROUP_SOCK_FLAGS is a duplicate of the existing WORKLOAD_SOCKOPS_FLAGS defined on line 41. To improve maintainability and avoid redundancy, you can reuse the existing variable for the new build rule.

workload_cgroup_sock_test.o: workload_cgroup_sock_test.c
	$(QUIET) $(CLANG) $(CLANG_FLAGS) $(WORKLOAD_SOCKOPS_FLAGS) -c $< -o $@

Comment on lines 870 to 922
type ip_addr struct {
Raw [16]byte
}

type frontend_key struct {
Addr ip_addr
}

type frontend_value struct {
UpstreamID uint32
}

FrontendMap := coll.Maps["km_frontend"]
var f_key frontend_key
ip4 := net.ParseIP(localIP).To4()
if ip4 == nil {
t.Fatalf("invalid IPv4 address")
}
copy(f_key.Addr.Raw[0:4], ip4)
//construct the value
f_val := frontend_value{
UpstreamID: 1,
}
if err := FrontendMap.Update(&f_key, &f_val, ebpf.UpdateAny); err != nil {
log.Fatalf("Update failed: %v", err)
}
//populate the km_backend
BackendMap := coll.Maps["km_backend"]
const MAX_SERVICE_COUNT = 10
type backend_key struct {
BackendUID uint32
}

type backend_value struct {
Addr ip_addr
ServiceCount uint32
Service [MAX_SERVICE_COUNT]uint32
WpAddr ip_addr
WaypointPort uint32
}
//construct the key
b_key := backend_key{
BackendUID: 1,
}
//construct the value
b_val := backend_value{
Addr: ip_addr{Raw: [16]byte{}},
ServiceCount: 0,
Service: [MAX_SERVICE_COUNT]uint32{},
WpAddr: ip_addr{Raw: [16]byte{}},
WaypointPort: 0,
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These struct types (ip_addr, frontend_key, frontend_value, backend_key, backend_value) are redefined in each test case. This leads to significant code duplication and makes the code harder to maintain.

Consider defining these types once at a shared scope, such as at the beginning of the testCgroupSock function or at the package level, and reuse them across all test cases.

if ip4 == nil {
t.Fatalf("invalid IPv4 address")
}
copy(f_key.Addr.Raw[0:4], ip4) // 高位放IPv4

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment is in Chinese. To ensure the codebase is accessible and maintainable for all contributors, please write comments in English.

Suggested change
copy(f_key.Addr.Raw[0:4], ip4) // 高位放IPv4
copy(f_key.Addr.Raw[0:4], ip4) // Store IPv4 in the high bytes of the address

t.Run(tt.objFilename, tt.run())
}
}
func load_bpf_prog_to_cgroup(t *testing.T, objFilename string, progName string, cgroupPath string) (*ebpf.Collection, link.Link) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In Go, function names should use CamelCase instead of snake_case. Please rename load_bpf_prog_to_cgroup to loadBpfProgToCgroup to follow idiomatic Go style. This also applies to the htons function on line 1252.

Suggested change
func load_bpf_prog_to_cgroup(t *testing.T, objFilename string, progName string, cgroupPath string) (*ebpf.Collection, link.Link) {
func loadBpfProgToCgroup(t *testing.T, objFilename string, progName string, cgroupPath string) (*ebpf.Collection, link.Link) {

@hzxuzhonghu hzxuzhonghu requested a review from Copilot July 25, 2025 06:41
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds unit tests for the workload_cgroup_sock component, introducing comprehensive testing for BPF CGROUP socket connection handling including managed process tracking, backend routing with and without waypoints, and service-based waypoint routing.

Key changes:

  • Adds new test file to include the workload cgroup socket source code
  • Implements 4 comprehensive test cases covering different BPF cgroup socket scenarios
  • Updates build configuration to compile the new test object file

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
test/bpf_ut/workload_cgroup_sock_test.c Simple include directive to incorporate cgroup socket source code for testing
test/bpf_ut/bpftest/workload_test.go Main test implementation with 4 test scenarios for cgroup socket functionality
test/bpf_ut/bpftest/bpf_test.go Updates program type filter to include CGroupSockAddr for test compatibility
test/bpf_ut/Makefile Adds build rules and flags for compiling the new cgroup socket test object
Comments suppressed due to low confidence (3)

test/bpf_ut/bpftest/workload_test.go:772

  • The type name 'unitTests_BUILD_CONTEXT' uses inconsistent naming convention. Consider using 'unitTestsBuildContext' or 'UnitTestsBuildContext' to follow Go naming conventions.
	tests := []unitTests_BUILD_CONTEXT{

test/bpf_ut/bpftest/workload_test.go:775

  • The type name 'unitTest_BUILD_CONTEXT' uses inconsistent naming convention. Consider using 'unitTestBuildContext' or 'UnitTestBuildContext' to follow Go naming conventions.
			uts: []unitTest_BUILD_CONTEXT{

test/bpf_ut/bpftest/workload_test.go:1146

  • Variable 'BackendMap' is misleading as it actually references the 'km_service' map, not a backend map. Consider renaming to 'ServiceMap' for clarity.
						BackendMap := coll.Maps["km_service"]

serverSocket := localIP + ":" + strconv.Itoa(serverPort)
var testPort uint16 = 55555
testIpPort := localIP + ":" + strconv.Itoa(int(htons(testPort)))
_, err := net.Listen("tcp4", testIpPort)
Copy link

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The listener created here is not stored or closed, which may cause resource leaks. The error is also ignored. Either store the listener and defer its closure, or handle the error appropriately.

Suggested change
_, err := net.Listen("tcp4", testIpPort)
listener, err := net.Listen("tcp4", testIpPort)
if err != nil {
t.Fatalf("Failed to create listener on %s: %v", testIpPort, err)
}
defer listener.Close()

Copilot uses AI. Check for mistakes.
wpIP := net.ParseIP(localIP).To4()
//insert value
s_val := service_value{
WpAddr: ip_addr{Raw: [16]byte{}}, // waypoint IP全0,表示无
Copy link

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment contains Chinese characters. Consider using English for consistency: '// waypoint IP all zeros, indicates none'

Suggested change
WpAddr: ip_addr{Raw: [16]byte{}}, // waypoint IP全0,表示无
WpAddr: ip_addr{Raw: [16]byte{}}, // waypoint IP all zeros, indicates none

Copilot uses AI. Check for mistakes.
//insert value
s_val := service_value{
WpAddr: ip_addr{Raw: [16]byte{}}, // waypoint IP全0,表示无
WaypointPort: uint32(testPort), //
Copy link

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty comment should either be removed or completed with meaningful description of what the waypoint port represents.

Suggested change
WaypointPort: uint32(testPort), //
WaypointPort: uint32(testPort), // Port used for waypoint communication in the service

Copilot uses AI. Check for mistakes.
}
}
lk, err := link.AttachCgroup(link.CgroupOptions{
Path: constants.Cgroup2Path,
Copy link

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cgroup attachment uses 'constants.Cgroup2Path' instead of the provided 'cgroupPath' parameter, which could cause the program to attach to the wrong cgroup.

Suggested change
Path: constants.Cgroup2Path,
Path: cgroupPath,

Copilot uses AI. Check for mistakes.
tc_mark_encrypt_test.o \
tc_mark_decrypt_test.o
tc_mark_decrypt_test.o \
workload_cgroup_sock_test.o
Copy link

Copilot AI Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace should be removed for consistency with the rest of the Makefile.

Suggested change
workload_cgroup_sock_test.o
workload_cgroup_sock_test.o

Copilot uses AI. Check for mistakes.
@LiZhenCheng9527
Copy link
Contributor

Please fix lint error and use git commit -s -m sign per commits.

wxnzb added 4 commits July 30, 2025 16:21
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
@codecov
Copy link

codecov bot commented Jul 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 38.24%. Comparing base (dddab4c) to head (63193d4).
⚠️ Report is 31 commits behind head on main.
see 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0847367...63193d4. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: sweeywu <sweetwx6@gmail.com>
@kmesh-bot kmesh-bot added size/XXL and removed size/XL labels Aug 1, 2025
wxnzb added 2 commits August 1, 2025 13:31
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
@kmesh-bot kmesh-bot added size/XL and removed size/XXL labels Aug 1, 2025
Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one comments, can you abstract the redundant code{like simulate kmesh manage and check map} into help functions

Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
wxnzb added 3 commits August 18, 2025 21:11
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
@wxnzb wxnzb mentioned this pull request Sep 11, 2025
6 tasks
Signed-off-by: sweeywu <sweetwx6@gmail.com>
@kmesh-bot
Copy link
Collaborator

Adding label do-not-merge/contains-merge-commits because PR contains merge commits, which are not allowed in this repository.
Use git rebase to reapply your commits on top of the target branch. Detailed instructions for doing so can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Signed-off-by: sweeywu <sweetwx6@gmail.com>
Signed-off-by: sweeywu <sweetwx6@gmail.com>
@LiZhenCheng9527
Copy link
Contributor

/lgtm
/approve

@kmesh-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu, LiZhenCheng9527

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot kmesh-bot merged commit 932c468 into kmesh-net:main Oct 15, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants