Iperf3 lp_JA to review #2019

---
title: Microbenchmark and tune network performance with iPerf3 and Linux traffic control

minutes_to_complete: 30

who_is_this_for: This is an introductory topic for performance engineers, Linux system administrators, and application developers who want to microbenchmark, simulate, or tune the networking performance of distributed systems.

learning_objectives:
- Run accurate network microbenchmark tests using iPerf3.
- Simulate real-world network conditions using Linux Traffic Control (tc).
- Tune basic Linux kernel parameters to improve network performance.

prerequisites:
- Basic understanding of networking principles such as Transmission Control Protocol/Internet Protocol (TCP/IP) and User Datagram Protocol (UDP).
- Access to two [Arm-based cloud instances](https://learn.arm.com/learning-paths/servers-and-cloud-computing/csp/).

author: Kieran Hejmadi
subjects: Performance and Architecture
armips:
- Neoverse
tools_software_languages:
- iPerf3
operatingsystems:
- Linux

further_reading:
- resource:
title: iPerf3 user manual
link: https://iperf.fr/iperf-doc.php
type: documentation

weight: 3
layout: learningpathall
---

With your systems configured and reachable, you can now use iPerf3 to microbenchmark TCP and UDP performance between your Arm-based systems.

## Microbenchmark the TCP connection

Start by running `iperf3` in server mode on the `SERVER` system:

```bash
iperf3 -s
```

This starts the server on the default TCP port 5201.

You should see:

```output
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
```

If the default port 5201 is already in use, use the `-p` flag to specify another.
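
For example, you could run the test on an alternative port; the port number below is only an illustration:

```bash
# Start the server on port 5202 instead of the default 5201
iperf3 -s -p 5202

# On the client, target the same port
iperf3 -c SERVER -p 5202
```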

{{% notice Tip %}}
If you already have an `iperf3` server running, terminate it with:
```bash
sudo kill $(pgrep iperf3)
```
{{% /notice %}}
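
If you are unsure whether an iPerf3 server is still running, one quick check is to look for a listener on the port, for example with the `ss` utility that ships with most Linux distributions:

```bash
# Show listening TCP sockets and filter for the default iperf3 port
ss -tlnp | grep 5201
```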

## Run a TCP test from the client

On the client node, run the following command to start a simple 10-second microbenchmark using the TCP protocol:

```bash
iperf3 -c SERVER -V
```
Replace `SERVER` with your server's hostname or private IP address. The `-V` flag enables verbose output.

The output is similar to:

```output
snd_tcp_congestion cubic
rcv_tcp_congestion cubic

iperf Done.
```
## TCP result highlights

- The `Cwnd` column shows the congestion window size, which corresponds to the number of TCP segments allowed in flight before an acknowledgment (`ACK`) is received from the server. This value grows as the connection stabilizes and adapts to link quality.

- The `CPU Utilization` row shows usage on both the sender and the receiver. If you are migrating your workload to a different platform, such as from x86 to Arm, this is a useful metric to compare.

- The `snd_tcp_congestion cubic` and `rcv_tcp_congestion cubic` variables show the congestion control algorithm used.

- `Bitrate` shows the throughput achieved. In this example, the `t4g.xlarge` AWS instance saturates its available 5 Gbps of network bandwidth.

![instance-network-size#center](./instance-network-size.png "Instance network size")
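
If you want to go beyond the default 10-second run, iPerf3 accepts a few commonly used options; the values below are only examples:

```bash
# Run a longer 30-second test with 4 parallel streams
iperf3 -c SERVER -t 30 -P 4

# Reverse the direction so the server sends and the client receives
iperf3 -c SERVER -R
```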

## UDP result highlights

You can also microbenchmark the `UDP` protocol using the `-u` flag with iPerf3. Unlike TCP, UDP does not guarantee packet delivery, which means some packets might be lost in transit.

To evaluate UDP performance, focus on the server-side statistics, particularly:

* Packet loss percentage

* Jitter (variation in packet arrival time)

These metrics help assess reliability and responsiveness under real-time conditions.

UDP is commonly used in latency-sensitive applications such as:

* Online gaming

* Voice over IP (VoIP)

* Video conferencing and streaming

Because it avoids the overhead of retransmission and ordering, UDP is ideal for scenarios where timely delivery matters more than perfect accuracy.
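
Note that iPerf3 sends UDP traffic at a modest default target bitrate, so a low UDP throughput figure does not by itself mean the link is slow. You can raise the target with the `-b` flag; the value below is only an example:

```bash
# Send UDP traffic at a target bitrate of 1 Gbit/s
iperf3 -c SERVER -u -b 1G
```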

Run the following command from the client to send two parallel UDP streams with the `-P 2` option:

```bash
iperf3 -c SERVER -V -u -P 2
```

Look at the server output and you can see that no packets (0%) were lost during the short test:

```output
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[SUM] 0.00-10.00 sec 2.51 MBytes 2.10 Mbits/sec 0.015 ms 0/294 (0%) receiver
```

Additionally on the client side, the two streams saturated two of the four cores in the system:

```output
CPU Utilization: local/sender 200.3% (200.3%u/0.0%s), remote/receiver 0.2% (0.0%u/0.2%s)
```

This demonstrates that UDP throughput is CPU-bound when pushing multiple streams.
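
If CPU usage becomes the limiting factor in your own tests, one option to experiment with is pinning iPerf3 to specific cores with the `-A` flag; the core numbers below are illustrative:

```bash
# Pin the client process to core 2 and the server-side process to core 3
iperf3 -c SERVER -u -P 2 -A 2,3
```
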
---
title: Set up Arm-based Linux systems for network performance testing with iPerf3
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Environment setup and Learning Path focus

To benchmark bandwidth and latency between Arm-based systems, you'll need to configure two Linux machines running on Arm.

You can use AWS EC2 instances with Graviton processors, or Linux virtual machines from any other cloud service provider.

This tutorial also shows you how to compare network performance between:

* Two cloud-based instances
* One local system and one cloud instance

The setup instructions below use AWS EC2 instances connected within a Virtual Private Cloud (VPC).

To get started, create two Arm-based Linux instances, with each instance serving a distinct role:

* One acting as a client
* One acting as a server

The instructions below use two `t4g.xlarge` instances running Ubuntu 24.04 LTS.

## Install software dependencies

iPerf3 is a powerful open-source command-line tool for measuring the maximum achievable bandwidth on IP networks.

Begin by installing iPerf3 on both the client and server systems:

```bash
sudo apt update
sudo apt install iperf3 -y
```

{{% notice Note %}}
If you are prompted to start `iperf3` as a daemon you can answer no.
If you're prompted to run `iperf3` as a daemon, answer "no".
{{% /notice %}}
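
You can optionally confirm the installation and check the installed version:

```bash
iperf3 --version
```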

## Update security rules

If you're working in a cloud environment like AWS, you must update the default security rules to enable specific inbound and outbound protocols.

To do this, follow the steps below in the AWS console:

* Navigate to the **Security** tab for each instance.
* Configure the **Inbound rules** to allow the following protocols:
* `ICMP` (for ping)
* All UDP ports (for UDP tests)
* TCP port 5201 (for iPerf3 traffic between the client and server systems)

![example_traffic#center](./example_traffic_rules.png "AWS console view")

{{% notice Warning %}}
For secure internal communication, set the source to your instance’s security group. This avoids exposing traffic to the internet while allowing traffic between your systems.

You can restrict the range further by:

* Opening only TCP port 5201

* Allowing all UDP ports (or a specific range)
{{% /notice %}}
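
If you prefer the AWS CLI to the console, a rule like the one below opens TCP port 5201 to members of the same security group; the group ID shown is a placeholder for your own:

```bash
# Allow TCP 5201 from instances in the same security group (placeholder ID)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 5201 \
  --source-group sg-0123456789abcdef0
```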

## Update the local DNS

To avoid using IP addresses directly, add the other system's IP address to the `/etc/hosts` file.

You can find private IPs in the AWS dashboard, or by running:

```bash
hostname -I
ip address
ifconfig
```
## On the client

On the client, add the server's IP address to `/etc/hosts` and assign it the name `SERVER`:

```output
127.0.0.1 localhost
10.248.213.104 SERVER
```
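
If you prefer, you can append the entry from the shell instead of editing the file manually; replace the example IP with your server's private IP:

```bash
# Append the SERVER entry to /etc/hosts (IP shown is an example)
echo "10.248.213.104 SERVER" | sudo tee -a /etc/hosts
```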

## On the server

On the server, add the client's IP address to `/etc/hosts` and assign it the name `CLIENT`:

```output
127.0.0.1 localhost
10.248.213.105 CLIENT
```

| Instance Name | Role | Description |
|---------------|--------|------------------------------------|
| SERVER | Server | Runs `iperf3` in listen mode |
| CLIENT | Client | Initiates performance tests |

## Confirm the server is reachable

Finally, confirm the client can reach the server by using the ping command below, which also pings localhost for comparison:

```bash
ping SERVER -c 3 && ping 127.0.0.1 -c 3
```

The output below shows that both SERVER and localhost (127.0.0.1) are reachable.

Localhost response times are typically ~10× faster than remote systems, though actual values vary based on system location and network conditions.

```output
PING SERVER (10.248.213.104) 56(84) bytes of data.
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
rtt min/avg/max/mdev = 0.022/0.027/0.032/0.004 ms
```
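
If you want a more stable latency estimate before benchmarking, you can send more probes at a shorter interval; the values below are only an example:

```bash
# 20 probes at 0.2-second intervals for a tighter RTT estimate
ping SERVER -c 20 -i 0.2
```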

Now that your systems are configured, the next step is to measure the available network bandwidth between them.