Skip to content

Proposal: Network Devices #1239

@aojea

Description

@aojea

The spec describes Devices that are container based, but there are another class of Devices, Network Devices that are defined per namespace, quoting "Linux Device Drivers, Second Edition , Chapter 14. Network Drivers"

Chapter 14. Network Drivers
We are now through discussing char and block drivers and are ready to move on to the fascinating world of networking. Network interfaces are the third standard class of Linux devices, and this chapter describes how they interact with the rest of the kernel.

The role of a network interface within the system is similar to that of a mounted block device. A block device registers its features in the blk_dev array and other kernel structures, and it then “transmits” and “receives” blocks on request, by means of its request function. Similarly, a network interface must register itself in specific data structures in order to be invoked when packets are exchanged with the outside world.

There are a few important differences between mounted disks and packet-delivery interfaces. To begin with, a disk exists as a special file in the /dev directory, whereas a network interface has no such entry point. The normal file operations (read, write, and so on) do not make sense when applied to network interfaces, so it is not possible to apply the Unix “everything is a file” approach to them. Thus, network interfaces exist in their own namespace and export a different set of operations.

Although you may object that applications use the read and write system calls when using sockets, those calls act on a software object that is distinct from the interface. Several hundred sockets can be multiplexed on the same physical interface.

But the most important difference between the two is that block drivers operate only in response to requests from the kernel, whereas network drivers receive packets asynchronously from the outside. Thus, while a block driver is asked to send a buffer toward the kernel, the network device asks to push incoming packets toward the kernel. The kernel interface for network drivers is designed for this different mode of operation.

Network drivers also have to be prepared to support a number of administrative tasks, such as setting addresses, modifying transmission parameters, and maintaining traffic and error statistics. The API for network drivers reflects this need, and thus looks somewhat different from the interfaces we have seen so far.

Network Devices are also used for providing connectivity to the network namespaces, and commonly container runtimes use the CNI specification to provide this capacity of adding a network device to the namespace and configure its networking parameters.

Runc already has the concept of network device and how to configure it, in addition to the CNI specifixation https://github.com/opencontainers/runc/tree/main/libcontainer

package configs

// Network defines configuration for a container's networking stack
//
// The network configuration can be omitted from a container causing the
// container to be setup with the host's networking stack
type Network struct {

https://github.com/opencontainers/runc/blob/main/libcontainer/configs/network.go#L3-L51

The spec already has a reference to the network in https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md#network , that references network devices, but does not allow to specify the network devices that will be part of the namespace.

However, there are cases that a Kubernetes Pod or container may want to add, in a declarative way, existing Network Devices to the namespace, it is important to mention that the Network Device configuration or creation is non-goal and is left out of the spec on purpose.

The use cases for adding network devices to namespaces are more common lately with the new AI accelerators devices that are presented as network devices to the system, but they are not really considered as an usual network device. Ref: https://lwn.net/Articles/955001/ (Available Jan 4th without subscription)

The proposal is to be able to add existing Network devices to a linux namespace by referencing them https://docs.kernel.org/networking/netdevices.html, in a similar way to the existing definition of Devices

Linux defines an structure like this one in https://man7.org/linux/man-pages/man7/netdevice.7.html

       This man page describes the sockets interface which is used to
       configure network devices.

       Linux supports some standard ioctls to configure network devices.
       They can be used on any socket's file descriptor regardless of
       the family or type.  Most of them pass an ifreq structure:

           struct ifreq {
               char ifr_name[IFNAMSIZ]; /* Interface name */
               union {
                   struct sockaddr ifr_addr;
                   struct sockaddr ifr_dstaddr;
                   struct sockaddr ifr_broadaddr;
                   struct sockaddr ifr_netmask;
                   struct sockaddr ifr_hwaddr;
                   short           ifr_flags;
                   int             ifr_ifindex;
                   int             ifr_metric;
                   int             ifr_mtu;
                   struct ifmap    ifr_map;
                   char            ifr_slave[IFNAMSIZ];
                   char            ifr_newname[IFNAMSIZ];
                   char           *ifr_data;
               };
           };

though we only need the index or the name to be able to reference one interface

   Normally, the user specifies which device to affect by setting
   ifr_name to the name of the interface or ifr6_ifindex to the
   index of the interface.  All other members of the structure may
   share memory.
## <a name="configLinuxNetDevices" />NetDevices

**`netDevices`** (array of objects, OPTIONAL) lists of network devices that MUST be available in the container network namespace.

Each entry has the following structure:

* **`name`** *(string, REQUIRED)* - name of the network device in the host.
* **`properties`** *(object, OPTIONAL)* - properties the network device per https://man7.org/linux/man-pages/man7/netdevice.7.html in the container namespace.
    has the following structure:
    * **`name`** *(string, OPTIONAL)* - name of the network device in the network namespace.
    * **`address`** *(string, OPTIONAL)* - address of the network device in the network namespace
    * **`mask`** *(string, OPTIONAL)* - mask of the network device in the network namespace
    * **`mtu`** *(uint16, OPTIONAL)* - MTU size of the network device in the network namespace

### Example

```json
"netdevices": [
    {
        "name": "eth0",
        "properties": {
              name: "ns1",
              address: "192.168.0.1",
              mask: "255.255.255.0",
              mtu: 1500, 
        }
}
    },
    {
        "name": "ens4",
    }
]

Proposal: #1240
runc prototype: https://github.com/opencontainers/runc/compare/main...aojea:runc:netdevices?expand=1

References:

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions