S3K Design

Henrik Akira Karlsson edited this page Nov 2, 2023 · 3 revisions

Overview

S3K has a static number of single-threaded processes, each associated with a table of capabilities. The capabilities include access to memory, IPC, process monitoring, and time slices.

Capabilities

Capabilities are tokens that grant a process access to resources. In S3K, capabilities are objects within the kernel that are owned by user processes. A user process uses its capabilities through system calls by referencing the capability.

Capability types

The capabilities of S3K include:

  • Time Slice: Granting access to a slice of execution time.
  • Memory Slice: Granting management rights over physical memory.
  • PMP Frame: Granting access to physical memory by interacting with RISC-V's PMP unit.
  • Monitor Slice: Granting rights to suspend/resume, grant/take capabilities, and read/write registers of another process.
  • Channel Slice: Granting management rights over IPC channels.
  • Server/Client Socket: Access to IPC endpoints for sending and receiving capabilities and data.

A more detailed explanation of each capability is given in the relevant sections below.

Capability Table

In S3K, the capabilities of a process reside within a per-process capability table (ctable). The size of the ctable is fixed at compile time. Processes reference capabilities by their index in the capability table.
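As a rough illustration of index-based lookup in a statically sized table, here is a minimal sketch. The sizes `N_PROC` and `N_CAP`, the type names, and the helper `ctable_get` are assumptions made for this example, not S3K's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; in S3K the real values are fixed at compile time. */
#define N_PROC 4
#define N_CAP 8

/* Hypothetical capability kinds, following the list of types above. */
typedef enum {
    CAP_NONE = 0, /* empty slot */
    CAP_TIME,
    CAP_MEMORY,
    CAP_PMP,
    CAP_MONITOR,
    CAP_CHANNEL,
    CAP_SOCKET
} cap_type_t;

typedef struct {
    cap_type_t type;
    uint64_t begin, end; /* resource range; meaning depends on the type */
} cap_t;

/* One statically allocated table per process (zero-initialized: all empty). */
static cap_t ctable[N_PROC][N_CAP];

/* Look up slot `idx` of process `pid`.
 * Returns NULL if either index is out of range or the slot is empty. */
const cap_t *ctable_get(size_t pid, size_t idx)
{
    if (pid >= N_PROC || idx >= N_CAP)
        return NULL;
    const cap_t *cap = &ctable[pid][idx];
    return cap->type == CAP_NONE ? NULL : cap;
}
```

Because the table is a fixed-size array, a lookup is a bounds check plus an array access, which keeps the cost of resolving a capability index constant.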

Initial capabilities and capability derivation

The initial capabilities of a system are created at compile time and are assigned to the boot process, the process with PID 0. From the initial capabilities, the boot process can derive new capabilities. An important security property is that a derived capability permits access only to a subset of the original capability's resources. Derived capabilities can then be given to other processes, granting them access to resources. We refer to capabilities that can derive new capabilities as management capabilities.

Slice Capability

To facilitate secure resource management, we introduce slice capabilities. A slice capability grants control over a contiguous span of a resource such as memory (memory slice), IPC channels (channel slice), processes (monitor slice), or time (time slice). A slice is partitioned into two contiguous segments: the allocated segment, consisting of resources whose management rights have been allocated to child capabilities; and the free segment, consisting of unallocated resources that we can use directly to access resources or to derive new capabilities.

[Insert Image]

We have the following invariants for a slice capability:

  • child slices are disjoint,
  • children are contained within the parent's allocated segment, and
  • free segments of all slices are disjoint.

Together, these invariants ensure that each slice capability provides exclusive access to the resources contained within its free segment (modulo parent slices).

To maintain these invariants, we derive from a slice capability as follows:

  • Check that the new capability is contained within the original's free segment.
  • If the new capability is a management capability, update the allocated segment to include the new capability.

These steps prevent us from deriving overlapping slice capabilities.
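The two derivation steps can be sketched as follows. The layout is an assumption made for illustration: a slice is the range `[begin, end)`, with `[begin, mark)` as the allocated segment and `[mark, end)` as the free segment. The names `slice_t` and `slice_derive` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slice layout: [begin, mark) is allocated, [mark, end) is free. */
typedef struct {
    uint64_t begin;
    uint64_t mark;
    uint64_t end;
} slice_t;

/* Try to derive the range [begin, end) from `parent`.
 * `is_management` says whether the new capability can itself derive
 * capabilities. On success the new capability is written to `child`. */
bool slice_derive(slice_t *parent, uint64_t begin, uint64_t end,
                  bool is_management, slice_t *child)
{
    /* Step 1: the range must be non-empty and inside the free segment. */
    if (begin >= end || begin < parent->mark || end > parent->end)
        return false;

    /* A fresh child has an empty allocated segment. */
    *child = (slice_t){ .begin = begin, .mark = begin, .end = end };

    /* Step 2: a management child extends the parent's allocated segment. */
    if (is_management)
        parent->mark = end;
    return true;
}
```

In this model, deriving a management child that does not start exactly at `mark` moves the skipped range into the allocated segment as well, so both segments stay contiguous.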

Capability Revocation and Capability Derivation Tree

Processes can also revoke capabilities if they own the parent capability. In S3K, we track capability derivation via a capability derivation tree (CDT). Every capability in the system resides in the CDT. When we derive a new capability, we add it to the CDT as a child of the original capability. The CDT allows us to revoke capabilities by recursively deleting the child capabilities of a capability.

Slice capability

Once we have revoked the child capabilities of a slice capability, we reset its allocated segment to empty (NULL), since no capability remains within it. This allows us to derive new capabilities from the whole slice again.
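Revocation on a slice can be sketched with a small parent-pointer tree. This is an illustrative model, not S3K's data structure: the node pool, the `mark` field (end of the allocated segment), and the functions `cdt_add` and `cdt_revoke` are all assumptions. The sketch deletes descendants with a loop rather than with recursion, which has the same effect.

```c
#include <stdbool.h>
#include <stdint.h>

#define CDT_MAX 16    /* hypothetical fixed pool size */
#define CDT_NONE (-1) /* "no parent" marker */

typedef struct {
    bool used;
    int parent;                /* index of the parent node, or CDT_NONE */
    uint64_t begin, mark, end; /* slice fields: [begin, mark) is allocated */
} cdt_node_t;

static cdt_node_t cdt[CDT_MAX];

/* Add a slice [begin, end) as a child of `parent`.
 * Returns the new index, or CDT_NONE if the pool is full. */
int cdt_add(int parent, uint64_t begin, uint64_t end)
{
    for (int i = 0; i < CDT_MAX; i++) {
        if (!cdt[i].used) {
            cdt[i] = (cdt_node_t){ .used = true, .parent = parent,
                                   .begin = begin, .mark = begin, .end = end };
            return i;
        }
    }
    return CDT_NONE;
}

/* True if `node` is a strict descendant of `ancestor`. */
static bool cdt_is_descendant(int node, int ancestor)
{
    for (int p = cdt[node].parent; p != CDT_NONE; p = cdt[p].parent)
        if (p == ancestor)
            return true;
    return false;
}

/* Revoke: delete every descendant of `i`, then reset its allocated segment. */
void cdt_revoke(int i)
{
    for (int j = 0; j < CDT_MAX; j++) {
        /* Deleted nodes keep their parent index, so ancestor walks that
         * pass through already deleted nodes still reach `i`. */
        if (cdt[j].used && cdt_is_descendant(j, i))
            cdt[j].used = false;
    }
    cdt[i].mark = cdt[i].begin; /* allocated segment is empty again */
}
```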

Basic Capability Operations

The following system calls are supported by all capabilities:

  • s3k_cap_read(i): Read the description of the i'th capability.
  • s3k_cap_move(i, j): Move the i'th capability to the j'th slot.
  • s3k_cap_delete(i): Delete the i'th capability. Children of the capability will be inherited by the parent.
  • s3k_cap_derive(i, j, cap): Create capability object cap at the j'th slot from the i'th capability.
  • s3k_cap_revoke(i): Recursively delete the children of the i'th capability, then reset the i'th capability if applicable.
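The difference between delete and revoke is that delete removes only the capability itself and hands its children to its parent. A minimal sketch of that reparenting, using a hypothetical parent-pointer array (`node_t`, `nodes`, and `cap_delete` are illustrative names, not the kernel's):

```c
#include <stdbool.h>

#define N_NODES 8      /* hypothetical pool size */
#define NODE_NONE (-1) /* "no parent" marker */

typedef struct {
    bool used;
    int parent; /* index of the parent capability, or NODE_NONE */
} node_t;

static node_t nodes[N_NODES];

/* Delete capability `i`: its children become children of its parent. */
void cap_delete(int i)
{
    for (int j = 0; j < N_NODES; j++) {
        if (nodes[j].used && nodes[j].parent == i)
            nodes[j].parent = nodes[i].parent;
    }
    nodes[i].used = false;
}
```

After deleting a node in the middle of a chain, the grandchild is still reachable from the original ancestor, so a later revoke on that ancestor still removes it.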

Synchronization

To enhance determinism and simplify kernel design, we introduce asynchrony between a process's resource access and capability grants. Consider a scenario where process A is granted access to memory region M when it is scheduled. During A's execution, a process B running in parallel revokes A's access to M. To prevent A from accessing M after the revocation, we would need to interrupt A via hardware to clear its permissions to M. This is complex to implement, has high overhead, and introduces non-deterministic behavior in process A. To avoid these problems, we establish explicit synchronization points.

Synchronization points are defined moments during the execution of a running process when its allocated memory and execution time resources are synchronized with the capability system. In S3K, synchronization points occur when a process enters the running state, explicitly calls sync or sync_mem, or performs IPC or process monitoring calls. sync and sync_mem are system calls designed specifically for process synchronization. The former synchronizes both execution time and memory rights, while the latter synchronizes memory rights only.

It is important to note that a process always synchronizes at the end of its allocated time slice, with no postponement possible. This ensures that, in the example above, process B can rely on process A synchronizing after a predetermined amount of time and thus losing access to memory region M.
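The example with processes A and B can be modeled in a few lines. This is a toy model under stated assumptions: the struct `proc_t`, its fields, and the tick-based time slice are invented for illustration and do not reflect how the kernel stores this state.

```c
#include <stdbool.h>

/* Hypothetical per-process state for a single memory region M. */
typedef struct {
    bool cap_grants_m;   /* capability system: a capability for M exists */
    bool active_m;       /* access currently in effect for the process */
    unsigned ticks_left; /* remaining ticks of the current time slice */
} proc_t;

/* Synchronization point: align effective access with the capabilities. */
void sync_point(proc_t *p)
{
    p->active_m = p->cap_grants_m;
}

/* Entering the running state is a synchronization point. */
void proc_schedule(proc_t *p, unsigned slice_len)
{
    p->ticks_left = slice_len;
    sync_point(p);
}

/* Revocation by another process changes only the capability state. */
void revoke_m(proc_t *p)
{
    p->cap_grants_m = false;
}

/* Advance one tick. Returns false once the time slice has ended, at which
 * point the process is synchronized unconditionally. */
bool proc_tick(proc_t *p)
{
    if (p->ticks_left > 0)
        p->ticks_left--;
    if (p->ticks_left == 0) {
        sync_point(p);
        return false;
    }
    return true;
}
```

In this model, a revocation issued while A runs has no effect on `active_m` until the slice ends, which gives B a fixed upper bound on how long A can still touch M.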

Memory Management and Protection

The kernel, tailored for resource-constrained real-time applications, relies on RISC-V's PMP for memory protection. It uses two types of capabilities: memory slices for managing access to physical memory regions, and PMP frames for interacting with the PMP unit and accessing memory. From a memory slice, we can derive and revoke memory slices and PMP frames. Once a PMP frame has been derived from a memory slice, the memory slice is locked. While the memory slice is locked, we cannot derive new memory slices from it, but we can still derive PMP frames (PMP frames do not grant resource management rights, so they are not included in the allocated segment of a slice capability). The lock prevents us from deriving new memory slices that overlap PMP frames, while still allowing us to create shared memory via overlapping PMP frames.
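The locking rule can be sketched as follows. The field names (`mark`, `locked`) and both functions are assumptions made for this example, and the sketch ignores the address-encoding constraints that a real PMP configuration imposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory slice: [begin, mark) allocated, [mark, end) free. */
typedef struct {
    uint64_t begin, mark, end;
    bool locked; /* set once a PMP frame has been derived */
} mem_slice_t;

typedef struct {
    uint64_t begin, end;
} pmp_frame_t;

/* Derive a child memory slice; refused while the parent is locked. */
bool mem_derive_slice(mem_slice_t *parent, uint64_t begin, uint64_t end,
                      mem_slice_t *child)
{
    if (parent->locked)
        return false;
    if (begin >= end || begin < parent->mark || end > parent->end)
        return false;
    *child = (mem_slice_t){ .begin = begin, .mark = begin, .end = end,
                            .locked = false };
    parent->mark = end; /* child slices are management capabilities */
    return true;
}

/* Derive a PMP frame from the free segment. Frames may overlap each other,
 * and they do not extend the allocated segment. */
bool mem_derive_pmp(mem_slice_t *parent, uint64_t begin, uint64_t end,
                    pmp_frame_t *frame)
{
    if (begin >= end || begin < parent->mark || end > parent->end)
        return false;
    *frame = (pmp_frame_t){ .begin = begin, .end = end };
    parent->locked = true;
    return true;
}
```

Since a PMP frame leaves `mark` untouched, two frames over the same range both pass the containment check, which is what makes shared memory possible in this model.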

Time Management

Inter-Process Communication

The kernel has a static number of IPC channels, managed using the channel slice capability. From a channel slice capability, we can derive and revoke channel slices and socket capabilities. A socket capability represents an endpoint of an IPC channel, and we distinguish between two types of socket capabilities: the server socket, used by the receiver or server of an IPC channel, and the client socket, used by the senders or clients of an IPC channel. From a channel slice we can derive the server socket, and from a server socket we can derive (and revoke) client sockets. Note that each channel has at most one server socket but can have multiple client sockets.

When deriving a server socket, the channel is associated with a mode related to the yield policy of the channel, and permissions are assigned that dictate what the server and clients can transmit through the channel, for example capabilities or data.

There are three system calls associated with the socket capabilities: send, receive, and sendReceive. Clients can only use the send and sendReceive system calls. The send system call is employed for the straightforward task of sending a capability along with data. On the other hand, sendReceive combines sending a message with atomically waiting for a reply.

Servers use receive to wait for incoming client messages, and use send to reply to the latest client from which they received a message. When a server needs to send a reply and then wait for the next message atomically, it uses the sendReceive system call.
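The socket rules described in this section (one server socket per channel, client sockets derived from the server socket, and clients limited to send and sendReceive) can be summarized in a small model. All names here are hypothetical, and the sketch tracks only which operations are permitted, not message transfer.

```c
#include <stdbool.h>

#define N_CHANNELS 4 /* hypothetical; the real count is fixed statically */

typedef enum { SOCK_SERVER, SOCK_CLIENT } sock_kind_t;
typedef enum { IPC_SEND, IPC_RECEIVE, IPC_SENDRECEIVE } ipc_call_t;

typedef struct {
    bool has_server;    /* at most one server socket per channel */
    unsigned n_clients; /* any number of client sockets */
} channel_t;

static channel_t channels[N_CHANNELS];

/* Derive the server socket of channel `ch` from a channel slice. */
bool derive_server_socket(unsigned ch)
{
    if (ch >= N_CHANNELS || channels[ch].has_server)
        return false;
    channels[ch].has_server = true;
    return true;
}

/* Derive a client socket; this requires an existing server socket. */
bool derive_client_socket(unsigned ch)
{
    if (ch >= N_CHANNELS || !channels[ch].has_server)
        return false;
    channels[ch].n_clients++;
    return true;
}

/* Clients may only send or sendReceive; servers may use all three calls. */
bool ipc_call_allowed(sock_kind_t kind, ipc_call_t call)
{
    if (kind == SOCK_CLIENT)
        return call == IPC_SEND || call == IPC_SENDRECEIVE;
    return true;
}
```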

Process Monitoring

For system setup and management, S3K provides the monitor slice capability. This capability grants a process the right to perform various critical operations on another process, including suspending and resuming it, granting and taking capabilities, donating execution time, and setting its registers.

Additionally, from a monitor slice, we can derive new monitor slices. This allows us to delegate process monitoring rights to other processes, for example to create one process monitor per subsystem or to establish a hierarchy of process monitors.