Commit dbe69e4

Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improve performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsockopt to retrieve netns cookie

   - ip: treat lowest address of an IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: Packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2 parents a6eaf38 + b6df007 commit dbe69e4

1,908 files changed, with 109,791 additions and 28,910 deletions.

Lines changed: 78 additions & 0 deletions

@@ -0,0 +1,78 @@
+What:           /sys/devices/platform/soc@X/XXXXXXX.ipa/
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The /sys/devices/platform/soc@X/XXXXXXX.ipa/ directory
+                contains read-only attributes exposing information about
+                an IPA device.  The X values could vary, but are typically
+                "soc@0/1e40000.ipa".
+
+What:           .../XXXXXXX.ipa/version
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/version file contains the IPA hardware
+                version, as a period-separated set of two or three integers
+                (e.g., "3.5.1" or "4.2").
+
+What:           .../XXXXXXX.ipa/feature/
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/feature/ directory contains a set of
+                attributes describing features implemented by the IPA
+                hardware.
+
+What:           .../XXXXXXX.ipa/feature/rx_offload
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/feature/rx_offload file contains a
+                string indicating the type of receive checksum offload
+                that is supported by the hardware.  The possible values
+                are "MAPv4" or "MAPv5".
+
+What:           .../XXXXXXX.ipa/feature/tx_offload
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/feature/tx_offload file contains a
+                string indicating the type of transmit checksum offload
+                that is supported by the hardware.  The possible values
+                are "MAPv4" or "MAPv5".
+
+What:           .../XXXXXXX.ipa/modem/
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/modem/ directory contains a set of
+                attributes describing properties of the modem execution
+                environment reachable by the IPA hardware.
+
+What:           .../XXXXXXX.ipa/modem/rx_endpoint_id
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/modem/rx_endpoint_id file contains
+                the AP endpoint ID that receives packets originating from
+                the modem execution environment.  The "rx" is from the
+                perspective of the AP; this endpoint is considered an "IPA
+                producer".  An endpoint ID is a small unsigned integer.
+
+What:           .../XXXXXXX.ipa/modem/tx_endpoint_id
+Date:           June 2021
+KernelVersion:  v5.14
+Contact:        Alex Elder <elder@kernel.org>
+Description:
+                The .../XXXXXXX.ipa/modem/tx_endpoint_id file contains
+                the AP endpoint ID used to transmit packets destined for
+                the modem execution environment.  The "tx" is from the
+                perspective of the AP; this endpoint is considered an "IPA
+                consumer".  An endpoint ID is a small unsigned integer.
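
The attribute layout documented in this ABI entry could be read from user space roughly as follows. This is a minimal sketch and not part of the commit: `parse_ipa_version` and `read_ipa_info` are hypothetical helper names, and the device path in the example is only the "typical" one the description mentions, so it may not exist on a given system.

```python
from pathlib import Path

# Attribute paths, relative to the .ipa directory, as listed in the ABI entry.
IPA_ATTRIBUTES = (
    "version",
    "feature/rx_offload",
    "feature/tx_offload",
    "modem/rx_endpoint_id",
    "modem/tx_endpoint_id",
)


def parse_ipa_version(text):
    """Turn a period-separated version such as "3.5.1" or "4.2" into a tuple of ints."""
    parts = tuple(int(p) for p in text.strip().split("."))
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected IPA version format: {text!r}")
    return parts


def read_ipa_info(ipa_dir):
    """Read each documented attribute under ipa_dir; attributes that are absent are skipped."""
    base = Path(ipa_dir)
    info = {}
    for rel in IPA_ATTRIBUTES:
        path = base / rel
        if path.is_file():
            info[rel] = path.read_text().strip()
    return info


if __name__ == "__main__":
    # Assumed device path; the soc@X/XXXXXXX part varies between platforms.
    print(read_ipa_info("/sys/devices/platform/soc@0/1e40000.ipa"))
```

Because the attributes are read-only text files, the sketch only reads and strips them; converting the endpoint IDs to integers is left to the caller.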

Documentation/RCU/checklist.rst

Lines changed: 34 additions & 21 deletions

@@ -211,27 +211,40 @@ over a rather long period of time, but improvements are always welcome!
         of the system, especially to real-time workloads running on
         the rest of the system.
 
-7.      As of v4.20, a given kernel implements only one RCU flavor,
-        which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
-        If the updater uses call_rcu() or synchronize_rcu(),
-        then the corresponding readers may use rcu_read_lock() and
-        rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
-        or any pair of primitives that disables and re-enables preemption,
-        for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
-        If the updater uses synchronize_srcu() or call_srcu(),
-        then the corresponding readers must use srcu_read_lock() and
-        srcu_read_unlock(), and with the same srcu_struct.  The rules for
-        the expedited primitives are the same as for their non-expedited
-        counterparts.  Mixing things up will result in confusion and
-        broken kernels, and has even resulted in an exploitable security
-        issue.
-
-        One exception to this rule: rcu_read_lock() and rcu_read_unlock()
-        may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
-        in cases where local bottom halves are already known to be
-        disabled, for example, in irq or softirq context.  Commenting
-        such cases is a must, of course!  And the jury is still out on
-        whether the increased speed is worth it.
+7.      As of v4.20, a given kernel implements only one RCU flavor, which
+        is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
+        If the updater uses call_rcu() or synchronize_rcu(), then
+        the corresponding readers may use:  (1) rcu_read_lock() and
+        rcu_read_unlock(), (2) any pair of primitives that disables
+        and re-enables softirq, for example, rcu_read_lock_bh() and
+        rcu_read_unlock_bh(), or (3) any pair of primitives that disables
+        and re-enables preemption, for example, rcu_read_lock_sched() and
+        rcu_read_unlock_sched().  If the updater uses synchronize_srcu()
+        or call_srcu(), then the corresponding readers must use
+        srcu_read_lock() and srcu_read_unlock(), and with the same
+        srcu_struct.  The rules for the expedited RCU grace-period-wait
+        primitives are the same as for their non-expedited counterparts.
+
+        If the updater uses call_rcu_tasks() or synchronize_rcu_tasks(),
+        then the readers must refrain from executing voluntary
+        context switches, that is, from blocking.  If the updater uses
+        call_rcu_tasks_trace() or synchronize_rcu_tasks_trace(), then
+        the corresponding readers must use rcu_read_lock_trace() and
+        rcu_read_unlock_trace().  If an updater uses call_rcu_tasks_rude()
+        or synchronize_rcu_tasks_rude(), then the corresponding readers
+        must use anything that disables interrupts.
+
+        Mixing things up will result in confusion and broken kernels, and
+        has even resulted in an exploitable security issue.  Therefore,
+        when using non-obvious pairs of primitives, commenting is
+        of course a must.  One example of non-obvious pairing is
+        the XDP feature in networking, which calls BPF programs from
+        network-driver NAPI (softirq) context.  BPF relies heavily on RCU
+        protection for its data structures, but because the BPF program
+        invocation happens entirely within a single local_bh_disable()
+        section in a NAPI poll cycle, this usage is safe.  The reason
+        that this usage is safe is that readers can use anything that
+        disables BH when updaters use call_rcu() or synchronize_rcu().
 
 8.      Although synchronize_rcu() is slower than is call_rcu(), it
         usually results in simpler code.  So, unless update performance is
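
The updater/reader pairing rules in the rewritten item 7 can be summarized as a lookup table. The sketch below is illustrative Python rather than kernel code: `READER_RULES` and `readers_for` are hypothetical names, and the table only restates the checklist text. It is not an API the kernel provides.

```python
# Updater primitives -> what the checklist says the matching readers may or must use.
READER_RULES = {
    ("call_rcu", "synchronize_rcu"): [
        "rcu_read_lock() / rcu_read_unlock()",
        "any pair that disables and re-enables softirq, "
        "e.g. rcu_read_lock_bh() / rcu_read_unlock_bh()",
        "any pair that disables and re-enables preemption, "
        "e.g. rcu_read_lock_sched() / rcu_read_unlock_sched()",
    ],
    ("call_srcu", "synchronize_srcu"): [
        "srcu_read_lock() / srcu_read_unlock() with the same srcu_struct",
    ],
    ("call_rcu_tasks", "synchronize_rcu_tasks"): [
        "no voluntary context switches (no blocking)",
    ],
    ("call_rcu_tasks_trace", "synchronize_rcu_tasks_trace"): [
        "rcu_read_lock_trace() / rcu_read_unlock_trace()",
    ],
    ("call_rcu_tasks_rude", "synchronize_rcu_tasks_rude"): [
        "anything that disables interrupts",
    ],
}


def readers_for(updater):
    """Return the reader-side options the checklist lists for an updater primitive."""
    name = updater.rstrip("()")
    # Expedited primitives follow the same rules as their non-expedited counterparts.
    if name.endswith("_expedited"):
        name = name[: -len("_expedited")]
    for updaters, readers in READER_RULES.items():
        if name in updaters:
            return readers
    raise KeyError(f"no rule recorded for {updater!r}")


if __name__ == "__main__":
    for option in readers_for("call_rcu()"):
        print(option)
```

The XDP case described in the new text falls under the second option for `call_rcu()`: a NAPI poll cycle runs inside a `local_bh_disable()` section, which is a primitive pair that disables and re-enables softirq.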

Documentation/bpf/index.rst

Lines changed: 14 additions & 0 deletions

@@ -12,6 +12,19 @@ BPF instruction-set.
 The Cilium project also maintains a `BPF and XDP Reference Guide`_
 that goes into great technical depth about the BPF Architecture.
 
+libbpf
+======
+
+Libbpf is a userspace library for loading and interacting with bpf programs.
+
+.. toctree::
+   :maxdepth: 1
+
+   libbpf/libbpf
+   libbpf/libbpf_api
+   libbpf/libbpf_build
+   libbpf/libbpf_naming_convention
+
 BPF Type Format (BTF)
 =====================
 
@@ -84,6 +97,7 @@ Other
    :maxdepth: 1
 
    ringbuf
+   llvm_reloc
 
 .. Links:
 .. _networking-filter: ../networking/filter.rst

Documentation/bpf/libbpf/libbpf.rst

Lines changed: 14 additions & 0 deletions

@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+libbpf
+======
+
+This is documentation for libbpf, a userspace library for loading and
+interacting with bpf programs.
+
+All general BPF questions, including kernel functionality, libbpf APIs and
+their application, should be sent to the bpf@vger.kernel.org mailing list.
+You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the
+mailing list and search its `archive <https://lore.kernel.org/bpf/>`_.
+Please search the archive before asking new questions. It very well might
+be that this was already addressed or answered before.

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+API
+===
+
+This documentation is autogenerated from header files in libbpf, tools/lib/bpf
+
+.. kernel-doc:: tools/lib/bpf/libbpf.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/btf.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/xsk.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_tracing.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_core_read.h
+   :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_endian.h
+   :internal:

Lines changed: 37 additions & 0 deletions

@@ -0,0 +1,37 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+Building libbpf
+===============
+
+libelf and zlib are internal dependencies of libbpf and thus are required to link
+against and must be installed on the system for applications to work.
+pkg-config is used by default to find libelf, and the program called
+can be overridden with PKG_CONFIG.
+
+If using pkg-config at build time is not desired, it can be disabled by
+setting NO_PKG_CONFIG=1 when calling make.
+
+To build both static libbpf.a and shared libbpf.so:
+
+.. code-block:: bash
+
+    $ cd src
+    $ make
+
+To build only static libbpf.a library in directory build/ and install them
+together with libbpf headers in a staging directory root/:
+
+.. code-block:: bash
+
+    $ cd src
+    $ mkdir build root
+    $ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install
+
+To build both static libbpf.a and shared libbpf.so against a custom libelf
+dependency installed in /build/root/ and install them together with libbpf
+headers in a build directory /build/root/:
+
+.. code-block:: bash
+
+    $ cd src
+    $ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make

tools/lib/bpf/README.rst renamed to Documentation/bpf/libbpf/libbpf_naming_convention.rst

Lines changed: 12 additions & 18 deletions

@@ -1,7 +1,7 @@
 .. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
 
-libbpf API naming convention
-============================
+API naming convention
+=====================
 
 libbpf API provides access to a few logically separated groups of
 functions and types. Every group has its own naming convention
@@ -10,14 +10,14 @@ new function or type is added to keep libbpf API clean and consistent.
 
 All types and functions provided by libbpf API should have one of the
 following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
-``perf_buffer_``.
+``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
 
 System call wrappers
 --------------------
 
 System call wrappers are simple wrappers for commands supported by
 sys_bpf system call. These wrappers should go to ``bpf.h`` header file
-and map one-on-one to corresponding commands.
+and map one to one to corresponding commands.
 
 For example ``bpf_map_lookup_elem`` wraps ``BPF_MAP_LOOKUP_ELEM``
 command of sys_bpf, ``bpf_prog_attach`` wraps ``BPF_PROG_ATTACH``, etc.
@@ -49,10 +49,6 @@ object, ``bpf_object``, double underscore and ``open`` that defines the
 purpose of the function to open ELF file and create ``bpf_object`` from
 it.
 
-Another example: ``bpf_program__load`` is named for corresponding
-object, ``bpf_program``, that is separated from other part of the name
-by double underscore.
-
 All objects and corresponding functions other than BTF related should go
 to ``libbpf.h``. BTF types and functions should go to ``btf.h``.
 
@@ -72,11 +68,7 @@ of both low-level ring access functions and high-level configuration
 functions. These can be mixed and matched. Note that these functions
 are not reentrant for performance reasons.
 
-Please take a look at Documentation/networking/af_xdp.rst in the Linux
-kernel source tree on how to use XDP sockets and for some common
-mistakes in case you do not get any traffic up to user space.
-
-libbpf ABI
+ABI
 ==========
 
 libbpf can be both linked statically or used as DSO. To avoid possible
@@ -116,7 +108,8 @@ This bump in ABI version is at most once per kernel development cycle.
 
 For example, if current state of ``libbpf.map`` is:
 
-.. code-block::
+.. code-block:: c
+
         LIBBPF_0.0.1 {
                 global:
                         bpf_func_a;
@@ -128,7 +121,8 @@ For example, if current state of ``libbpf.map`` is:
 , and a new symbol ``bpf_func_c`` is being introduced, then
 ``libbpf.map`` should be changed like this:
 
-.. code-block::
+.. code-block:: c
+
         LIBBPF_0.0.1 {
                 global:
                         bpf_func_a;
@@ -148,7 +142,7 @@ Format of version script and ways to handle ABI changes, including
 incompatible ones, described in details in [1].
 
 Stand-alone build
-=================
+-------------------
 
 Under https://github.com/libbpf/libbpf there is a (semi-)automated
 mirror of the mainline's version of libbpf for a stand-alone build.
@@ -157,12 +151,12 @@ However, all changes to libbpf's code base must be upstreamed through
 the mainline kernel tree.
 
 License
-=======
+-------------------
 
 libbpf is dual-licensed under LGPL 2.1 and BSD 2-Clause.
 
 Links
-=====
+-------------------
 
 [1] https://www.akkadia.org/drepper/dsohowto.pdf
     (Chapter 3. Maintaining APIs and ABIs).
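
The naming rules in this file (an allowed prefix, plus a double underscore separating an object name from its method) can be checked mechanically. The following is a minimal sketch in Python, not a tool shipped with libbpf: `has_allowed_prefix` and `split_object_method` are hypothetical helpers, and the prefix tuple simply copies the list as it stands after this change.

```python
# Prefixes listed in the naming-convention document after this commit.
ALLOWED_PREFIXES = (
    "bpf_",
    "btf_",
    "libbpf_",
    "xsk_",
    "btf_dump_",
    "ring_buffer_",
    "perf_buffer_",
)


def has_allowed_prefix(symbol):
    """True if the symbol starts with one of the documented libbpf prefixes."""
    return symbol.startswith(ALLOWED_PREFIXES)


def split_object_method(symbol):
    """Split an object-style name at the first double underscore.

    Returns (object, method), or None when the name has no object/method
    separator, as is the case for plain system call wrappers.
    """
    obj, sep, method = symbol.partition("__")
    if not sep or not obj or not method:
        return None
    return obj, method


if __name__ == "__main__":
    for name in ("bpf_object__open", "bpf_map_lookup_elem", "my_helper"):
        print(name, has_allowed_prefix(name), split_object_method(name))
```

Here `bpf_object__open` splits into the object `bpf_object` and the method `open`, while `bpf_map_lookup_elem` has an allowed prefix but no object part, which matches its role as a system call wrapper.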
