LoLa (Low Latency) IPC is a safe, zero-copy, shared-memory-based IPC mechanism. This document describes its high-level architecture concept.
At the beginning of the xPAD project, nearly all customer functions were deployed into one executable. One of the reasons was the need for high-frequency data exchange between individual parts, while the IPC mechanisms provided by our ara::com vendor were not performant enough.
Thus, the need arose for an IPC mechanism that:
- utilizes zero-copy
- provides minimum latency overhead
- is ASIL-B ready
Due to the limited time frame it was decided that LoLa will not implement a fully-fledged ara::com, but rather only a subset of its functionality. Namely, it only provides event-based communication, which will be extended over time.
Since LoLa will not implement all communication mechanisms that ara::com foresees, such as methods and fields, it needs to exist in parallel to, and be operated independently of, any adaptive AUTOSAR stack[1].
Still, it will be necessary to migrate certain applications towards LoLa. In order to make this as easy as possible and to ensure that developers don't need to learn another API, adaptive AUTOSAR shall be mimicked where possible - only with the difference that we will not use the ara C++ namespace, but a BMW one[2]. The sub-section below will go into more detail on which adaptive AUTOSAR features specifically will be needed.
The Communication Management Specification of adaptive AUTOSAR foresees two major building blocks that implement ara::com. One is the so-called frontend, the other the network binding. The idea is that the frontend does not change depending on which network binding is selected. Meaning, the frontend stays the same no matter whether we use SOME/IP or Shared Memory as network binding. In order to be as flexible as possible and to reduce compilation times in the CI, we want to follow the Multi Target Build concept. In summary, it shall be possible to configure at runtime which network binding shall be used[3]. While this does not pay off at the moment, since we have only one network binding, it ensures that no deployment information leaks into the frontend and thus we can reduce compilation times (e.g. by only generating C++ libraries per interface).
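To illustrate the idea, a minimal sketch of such a runtime-selectable network binding is shown below. The interface and all names (NetworkBinding, SharedMemoryBinding, SelectBinding) are illustrative assumptions, not the actual mw::com API.

```cpp
#include <cstddef>
#include <memory>
#include <string_view>

// Hypothetical frontend-facing binding interface: the frontend only depends on
// this abstraction and never sees deployment details of a concrete binding.
class NetworkBinding {
  public:
    virtual ~NetworkBinding() = default;
    virtual void Send(const void* data, std::size_t size) = 0;
};

// Illustrative shared-memory binding; a SOME/IP binding could be added later
// without touching the frontend.
class SharedMemoryBinding final : public NetworkBinding {
  public:
    void Send(const void* /*data*/, std::size_t /*size*/) override {
        // ... write the sample into the shared memory data segment ...
    }
};

// Runtime selection based on the deployment configuration.
std::unique_ptr<NetworkBinding> SelectBinding(std::string_view configured_binding) {
    if (configured_binding == "SHM") {
        return std::make_unique<SharedMemoryBinding>();
    }
    return nullptr;  // further bindings (e.g. SOME/IP) would be handled here
}
```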
The basic idea of our LoLa concept is to use two main operating system facilities:
- Shared Memory[4]: Shall be used for the heavy lifting of data exchange [6]
- Message Passing[5]: Shall be used as notification mechanism [7]
We decided for this side channel since implementing a notification system via shared memory would require condition variables. These condition variables would require a mutex, which would require read-write access. This could lead to the situation that a malicious process locks the mutex forever and thus breaks any event notification. In general, any kind of notification shall be exchanged via the message passing facilities [7]. The sub-section below will go into more detail on the Message Passing Facilities.
The usage of shared memory has some implications. First, any synchronization regarding thread-safety / process-safety needs to be performed by the user. Second, the memory that is shared between the processes is directly mapped into their virtual address space. This implies that it is easy for a misbehaving process to destroy or manipulate any data within this memory segment. In order to cope with the latter, we split up the shared memory into three segments.
- First, a segment where only the to-be-exchanged data is provided. This segment shall be read-only for consumers and only writeable by the producer [8]. This ensures that nobody besides the producer process can manipulate the provided data.
- The second and third segment shall contain the necessary control information for the data segment[9]. Necessary control information can include atomics that are used to synchronize the access to the data segment. Since this kind of access requires write access, we split the shared memory segments for control data by ASIL level. This way it can be ensured that no lower-ASIL process interferes with higher-ASIL ones. More information on shared memory handling can be found in the sub-section below.
One of the main ideas in this concept is the split of control data from sample (user) data. In order to ensure a mapping, the shared memory segments are divided into slots [10] [11]. By convention, we then define that the slot indexes correlate. Meaning, slot 0 in the control data is used to synchronize slot 0 in the sample data. More information on these slots and the underlying algorithm can be found in the sub-section below.
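The following sketch illustrates this slot-index correlation. The field layout is only an assumption for illustration (the concrete control information is defined in the detailed design), and in the real system each of the three arrays lives in its own shared memory segment rather than in one struct.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed per-slot control information; the real layout is implementation defined.
struct ControlSlot {
    std::atomic<std::uint64_t> timestamp;  // order of the produced samples
    std::atomic<std::uint32_t> use_count;  // consumers currently reading this slot
};

template <typename SampleType, std::size_t kSlots>
struct EventStorage {
    // Control segment for ASIL-B participants.
    std::array<ControlSlot, kSlots> control_asil_b;
    // Control segment for QM participants.
    std::array<ControlSlot, kSlots> control_qm;
    // Data segment: writable by the producer only, read-only for consumers.
    std::array<SampleType, kSlots> samples;
    // Convention: control_*[i] synchronizes the access to samples[i].
};
```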
As already mentioned earlier, we will only implement the ara::com specification partially. In the following paragraphs, we will list all necessary requirements from the SWS Communication Management of the 19.03 adaptive AUTOSAR Standard. We will further give reasoning for deviations or for parts that are not implemented.
In general we take over all structuring parts that define where which types are defined. It shall only be noted that we will deviate when it comes to implementing the ara-namespace. See [2].
- Folder Structure
- Service header files existence
- Inclusion of common header file
- Namespace of Service header file
- Service skeleton namespace
- Service proxy namespace
- Service events namespace
- Common header file existence
- Inclusion of Types header file
- Inclusion of Implementation Types header files
- Service Identifier Type definitions
- Namespace for Service Identifier Type definitions
- Types header file existence
- Types header file namespace
- Data Type declarations in Types header file
- Implementation Types header files existence
- Data Type definitions for AUTOSAR Data Types in Implementation Types header files
- Implementation Types header file namespace
Also the major API types are taken over without changes. We only have to adjust requirement [19], since the underlying requirement would require this data-type in the ara::core namespace. Here we will again diverge to fulfill requirement [2].
- Instance Specifier Class
- Instance Identifier Class
- Instance Identifier Container Class
- Find Service Handle
- Handle Type Class
- Copy semantics of Handle Type Class
- Move semantics of Handle Type Class
- Service Handle Container
- Find Service Handler
For events we will provide only the types listed below. Any E2E types are not necessary, since we do not need to support End-2-End protection (see [17]). Types associated with the Subscription State are also not necessary, since there is no use-case on customer function side and we try to keep the overhead minimal. The same holds for unused types like custom future/promise implementations, which are only necessary for method support. Custom variant or optional types will not be implemented either, since these are already provided by amp. Last but not least, the ScaleLinearAndTexttable class has no usage within BMW, thus it can be skipped as well.
Adaptive AUTOSAR clearly defines which data-types can be transmitted via its communication mechanism. We orient ourselves strongly on these concepts. Our adaptive AUTOSAR generator, aragen, shall support the following requirements (a hedged sketch of a generated type follows the list below).
- Data Type Mapping
- Provide data type definitions
- Avoid Data Type redeclaration
- Naming of data type by short name
- Supported Primitive Cpp Implementation Types
- Primitive fixed width integer types
- StdCppImplementationDataType of category ARRAY with one dimension
- Array Data Type with more than one dimension
- Structure Data Type
- Element specification typed by CppImplementationDataType
- StdCppImplementationDataType with the category STRING
- StdCppImplementationDataType with the category VECTOR with one dimension
- Vector Data Type with more than one dimension
- Data Type redefinition
- Enumeration Data Type
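As an illustration, a structure implementation data type with fixed width primitives and a one-dimensional array might be generated roughly like this. The type, its fields, and the namespace are hypothetical and do not reflect actual aragen output.

```cpp
#include <array>
#include <cstdint>

namespace bmw::xpad::example {  // hypothetical namespace, not actual generated code

// StdCppImplementationDataType of category STRUCTURE
struct WheelSpeeds {
    std::uint32_t timestamp_us;              // primitive fixed width integer type
    std::array<std::int16_t, 4U> speed_rpm;  // ARRAY category with one dimension
};

}  // namespace bmw::xpad::example
```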
It shall be noted that some specifics are not supported for transmission and will not be generated. This includes:
- associative maps
- variants
- optionals in struct
- ScaleLinearAndTexttable
- custom allocators
The latter is especially important, since our implementation will need a custom allocator to ensure correct shared memory handling.
In addition to these requirements, we need to clearly specify the maximum size for each container. This is due to the fact that the shared memory needs to be preallocated[4]. If we need to preallocate on startup, the maximum size needs to be calculated in advance[12]. In order to do that, it is necessary to know the maximum number of elements in each container. This solves the same underlying issue as `` and SW_CM_00450 in a more generic way, and is thus a custom BMW extension.
[Shall enable definition of max-elements for container types]()
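The sketch below shows why the max-elements extension matters: with a bounded container type (the BoundedVector here is hypothetical) whose capacity comes from the model, the worst-case slot size and thus the required segment size can be computed before the shared memory is created. Alignment and allocator overhead of the real implementation are ignored for brevity.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity vector: the capacity stems from the configured
// max-elements attribute, so its worst-case size is known up front.
template <typename T, std::size_t kMaxElements>
struct BoundedVector {
    std::array<T, kMaxElements> storage{};
    std::size_t size{0U};
};

// Example event payload using a container with max-elements = 64.
struct ObjectList {
    BoundedVector<std::uint32_t, 64U> object_ids;
};

// Worst-case size of the data segment for one event with 8 slots.
constexpr std::size_t kSlots{8U};
constexpr std::size_t kDataSegmentSize{kSlots * sizeof(ObjectList)};
```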
The error concept cannot be taken over either, since it relies heavily on the infrastructure provided by ara::core. Within BMW we have a similar infrastructure with bmw::Result. Instead of following the corresponding ara::core-based requirements, we define that any error shall be reported via bmw::Result.
[Utilize bmw::Result for any error reporting]()
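A small sketch of the intended error reporting style. The Result shown here is only a self-contained stand-in; the real bmw::Result type and its error codes are provided by the BMW infrastructure and may differ in detail.

```cpp
#include <cstdint>
#include <utility>
#include <variant>

namespace bmw {  // stand-in, not the real implementation
enum class ComErrc : std::uint8_t { kInvalidConfiguration = 1U };

template <typename T>
class Result {
  public:
    static Result MakeValue(T value) { return Result{std::move(value)}; }
    static Result MakeError(ComErrc error) { return Result{error}; }
    bool HasValue() const { return std::holds_alternative<T>(data_); }

  private:
    explicit Result(T value) : data_{std::move(value)} {}
    explicit Result(ComErrc error) : data_{error} {}
    std::variant<T, ComErrc> data_;
};
}  // namespace bmw

// LoLa APIs report failures via the returned Result instead of raising
// exceptions or using the ara::core error infrastructure.
bmw::Result<std::uint16_t> ParseConfiguredSlotCount(bool config_is_valid) {
    if (!config_is_valid) {
        return bmw::Result<std::uint16_t>::MakeError(bmw::ComErrc::kInvalidConfiguration);
    }
    return bmw::Result<std::uint16_t>::MakeValue(8U);
}
```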
The general API for proxies and skeletons can be taken over completely.
- Service skeleton class
- Service skeleton Event class
- Service proxy class
- Service proxy Event class
- Declaration of Construction Token
- Creation of Construction Token
- Method to offer a service
- Method to stop offering a service
There are many ways to create a skeleton. Within BMW we only want to support one way: using exception-less mechanisms and an Instance Specifier. All other creation methods are not supported. Since we don't support method processing either, our constructors don't allow specifying the processing mode. A usage sketch follows the list below.
- Exception-less creation of service skeleton using Instance Spec
- Copy semantics of service skeleton class
- Move semantics of service skeleton class
- Send event where application is responsible for the data
- Send event where Communication Management is responsible for the data
- Allocating data for event transfer
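A hedged usage sketch of the skeleton side under these constraints. All type names, the include paths, and the exact signatures (InstanceSpecifier::Create, Create, Allocate, Send) are assumptions that mirror the ara::com style; the actual BMW-namespace API may differ.

```cpp
#include <utility>

#include "bmw/mw/com/types.h"           // hypothetical include path
#include "example/tireinfo_skeleton.h"  // hypothetical generated service header

void OfferAndSendTirePressure() {
    // Exception-less creation via InstanceSpecifier only; no processing mode
    // can be passed, since method processing is not supported.
    const auto specifier = bmw::mw::com::InstanceSpecifier::Create("xpad/lola/TireInfoPort").value();
    auto skeleton = example::skeletons::TireInfoSkeleton::Create(specifier).value();
    skeleton.OfferService();

    // "Communication Management is responsible for the data": allocate a slot
    // directly in the shared memory data segment, fill it, and send zero-copy.
    auto sample = skeleton.tire_pressure.Allocate();
    *sample = 2400U;  // e.g. pressure in millibar
    skeleton.tire_pressure.Send(std::move(sample));

    skeleton.StopOfferService();
}
```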
The same story is true for proxies. We only allow polling-based search of proxies, and only with an InstanceSpecifier. This also excludes the support for ANY.
- Find service with immediately returned request using Instance Spec
- Exception-less creation of service proxy
- Copy semantics of service proxy class
- Move semantics of service proxy class
The proxy event handling is mostly the same as in ara::com. Only the re-establishment of subscriptions and any kind of subscription state query are not supported. The former would require a fully working service discovery, which was dismissed due to timing constraints. The latter is not used within BMW and was thus dismissed. A usage sketch follows the list below.
- Method to subscribe to a service event
- Ensure memory allocation of maxSampleCount samples
- Method to unsubscribe from a service event
- Method to update the event cache
- Signature of Callable f
- Sequence of actions in GetNewSamples
- Return Value
- Reentrancy
- Query Free Sample Slots
- Return Value of GetFreeSampleCount
- Calculation of Free Sample Count
- Possibility of exceeding sample count by one
- Enable service event trigger
- Disable service event trigger
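The corresponding hedged usage sketch for the proxy side; again, all names, include paths, and signatures are assumptions mirroring the ara::com style.

```cpp
#include "bmw/mw/com/types.h"        // hypothetical include path
#include "example/tireinfo_proxy.h"  // hypothetical generated service header

void PollTirePressure() {
    const auto specifier = bmw::mw::com::InstanceSpecifier::Create("xpad/lola/TireInfoPort").value();

    // Polling-based search only, via InstanceSpecifier (no StartFindService, no ANY).
    const auto handles = example::proxies::TireInfoProxy::FindService(specifier).value();
    if (handles.empty()) {
        return;  // service not (yet) offered
    }
    auto proxy = example::proxies::TireInfoProxy::Create(handles.front()).value();

    // Subscribe with the maximum number of samples held in parallel.
    proxy.tire_pressure.Subscribe(3U);

    // GetNewSamples hands out zero-copy references into the data segment.
    proxy.tire_pressure.GetNewSamples(
        [](auto sample_ptr) {
            const auto pressure = *sample_ptr;  // read-only view into shared memory
            static_cast<void>(pressure);
        },
        3U);

    proxy.tire_pressure.Unsubscribe();
}
```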
As stated earlier, we want our implementation to be Multi-Target-Build ready. So does adaptive AUTOSAR, which is why we can take over these requirements.
- Change of Service Interface Deployment
- Change of Service Instance Deployment
- Change of Network Configuration
The security chapter of the SWS Communication Management defines a possibility to restrict communication. While this is generally good, we agreed with our stakeholders that this kind of functionality will not be configured via ARXML. Thus, we will not follow these requirements (e.g. SWS_COM_90002), since they describe the direct connection with the AUTOSAR Meta Model. Our security requirements are custom-made in the sections below.
In addition to the previous API-centric requirements, there are some more functional requirements, which are grouped in this section. Any network-binding-specific requirements like the ones for DDS or SOME/IP are obviously irrelevant, since we implement a User-Defined network binding.
- Uniqueness of offered service
- Protocol where a service is offered
- InstanceSpecifier check during creation of service skeleton
- FIFO semantics
- No implicit context switches
- Event Receive Handler call serialization
- Functionality after event received
The Message Passing facilities - under QNX implemented via QNX Message Passing - will not be used to synchronize the access to the shared memory segments; this is done via the control segments. We utilize message passing for notifications only, e.g. subscription handling and event notifications (see the derived requirements below).
This is done since there is no need to implement an additional notification handling via shared memory, which would only be possible by using mutexes and condition variables. The utilization of mutexes would make the implementation of wait-free algorithms more difficult. As illustrated in the graphic below, a process should provide one message passing port per supported ASIL level to receive data[26]. In order to ensure that messages received from QM processes will not influence ASIL messages, each message passing port shall use a dedicated thread to wait for new messages[27]. Further, it must be possible to register callbacks for the mentioned messages[28]. These callbacks shall then be invoked in the context of the port's waiting thread[29]. This way we can ensure that messages are received in a serialized manner.
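An illustrative sketch of this notification handling: one receiver per supported ASIL level, each with a dedicated thread that waits for messages and invokes the registered callbacks in its own context. All names are assumptions; under QNX the blocking wait would map to QNX message passing.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <thread>
#include <utility>

enum class MessageId : std::uint8_t { kSubscribe = 1U, kEventNotification = 2U };

// Hypothetical receiver: one instance per supported ASIL level, so that QM
// traffic cannot delay or disturb ASIL-B notification handling.
class MessagePassingReceiver {
  public:
    void Register(const MessageId id, std::function<void()> callback) {
        callbacks_[id] = std::move(callback);
    }

    void Start() {
        // Dedicated thread per receive port: all callbacks are invoked in this
        // thread's context, which serializes the message handling.
        worker_ = std::thread([this]() {
            for (;;) {
                const MessageId id = WaitForNextMessage();  // blocking OS call, e.g. QNX MsgReceive()
                const auto entry = callbacks_.find(id);
                if (entry != callbacks_.end()) {
                    entry->second();
                }
            }
        });
    }

  private:
    MessageId WaitForNextMessage();  // OS-specific, omitted in this sketch
    std::map<MessageId, std::function<void()>> callbacks_;
    std::thread worker_;
};
```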
POSIX-based operating systems generally support two kinds of shared memory: file-backed and anonymous. The former is represented by a file within the file system, while the latter is not directly visible to other processes. We decided for the former, in order to utilize the file system for a minimal service discovery[13],[14]. In order to avoid fault propagation over restarts of the system, any shared memory used for communication shall not be persistent[15]. Processes will identify shared memory segments by their name. The name will be commonly known by producers and consumers and is deduced from additional parameters like, for example, the service id and instance id[16] (see the sketch below). When it comes to the granularity of the data stored in the shared memory segments, multiple options can be considered. We could have one triplet of shared memory segments per process, or one triplet of shared memory segments per event within a service instance. The former would make the ASIL split of segments quite hard, while the latter would explode the number of necessary segments within the system. As a trade-off, we decided to have one triplet of shared memory segments per service instance[17].
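A sketch of the file-backed creation and the commonly known naming scheme. Only the lola prefix is fixed by the derived requirement; the exact name format shown here is an assumption.

```cpp
#include <fcntl.h>     // O_CREAT, O_EXCL, O_RDWR
#include <sys/mman.h>  // shm_open
#include <sys/stat.h>  // S_IRUSR, S_IWUSR

#include <cstdint>
#include <string>

// Commonly known name, deduced from service id and instance id. Only the
// "lola" prefix is fixed by requirement; the remaining format is illustrative.
std::string DataSegmentName(const std::uint16_t service_id, const std::uint16_t instance_id) {
    return "/lola-data-" + std::to_string(service_id) + "-" + std::to_string(instance_id);
}

int CreateDataSegment(const std::uint16_t service_id, const std::uint16_t instance_id) {
    // File-backed (named) shared memory: its existence in the file system acts
    // as the minimal service discovery. It is not persistent; the producer
    // removes it again via shm_unlink() when it stops offering the service.
    const std::string name = DataSegmentName(service_id, instance_id);
    return shm_open(name.c_str(), O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
}
```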
It is possible to map shared memory segments to a fixed virtual address. However, this is highly discouraged by POSIX and leads to undefined behaviour[18]. Thus, shared memory segments will be mapped to different virtual addresses in different processes. In consequence, no raw pointer can be stored within shared memory, since it would be invalid within another process. Only offset pointers (fancy pointers, relative pointers) shall be stored within shared memory segments[19].
The usage of shared memory does not involve the operating system once the shared memory segments are set up. Thus, the operating system can no longer ensure freedom from interference between processes that have access to these shared memory regions. In order to restrict access, we use the ACL support of the operating system[20],[21]. In addition to the restricted permissions, we have to ensure that a corrupted shared memory region cannot influence other process-local memory regions. This can be ensured by performing active bounds checking. The only way data corruption could propagate out of a shared memory region is if a pointer within the shared memory region points outside of it; a write through such a pointer could forward the memory corruption. The basic idea to overcome such a scenario is to check that any pointer stays within the bounds of the shared memory region. Since only offset pointers can be stored in a shared memory region anyway, this active bounds check can be performed whenever an offset pointer is dereferenced[22]. The last possible impact can be on timing: if another process, for example, wrongly locks a mutex within the shared memory region and another process then waits for this lock, we would end up in a deadlock. While this should not harm any safety goal, we still strive for wait-free algorithms to avoid such situations[23].
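A sketch of an offset (relative) pointer whose dereference performs the active bounds check against the mapped region; the real implementation is defined in the detailed design.

```cpp
#include <cstddef>
#include <cstdint>

// Stores an offset relative to the pointer object itself, so the value stays
// valid although every process maps the segment at a different virtual address.
template <typename T>
class OffsetPtr {
  public:
    void Set(const T* const target) {
        offset_ = reinterpret_cast<std::intptr_t>(target) - reinterpret_cast<std::intptr_t>(this);
    }

    // Active bounds checking: the resolved address must lie completely inside
    // the mapped shared memory region, otherwise a corrupted offset could
    // propagate memory corruption outside of the region.
    T* Get(const void* const region_start, const std::size_t region_size) const {
        const auto address = static_cast<std::uintptr_t>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
        const auto start = reinterpret_cast<std::uintptr_t>(region_start);
        if ((address < start) || ((address + sizeof(T)) > (start + region_size))) {
            return nullptr;  // out of bounds: treat as corrupted, do not dereference
        }
        return reinterpret_cast<T*>(address);
    }

  private:
    std::intptr_t offset_{0};
};
```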
A slot shall contain all necessary meta-information to synchronize data access[30]. This information most certainly needs to include a timestamp to indicate the order of the produced data within the slots. Additionally, a use count is needed, indicating whether a slot is currently in use by a process. The concrete data is implementation defined and must be covered by the detailed design.
The main idea of the algorithm is that a producer shall always be able to store one new data sample[31]. If it cannot find a free slot, this indicates a contract violation, which means that a QM process misbehaved. In such a case, the producer shall exclude any QM consumer from the communication[32]. A simplified sketch of this slot selection follows below.
This whole idea builds upon the split of shared memory segments by ASIL level. This way we can ensure that a QM process will not degrade the ASIL level of a communication path. In the other direction, where we have a QM producer, it is still possible for an ASIL-B consumer to consume the QM data. In that case the data will always remain QM, since it is impossible for the middleware to apply additional checks to enhance the quality of the data. This can only be done on application layer level.
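A simplified sketch of the producer-side slot selection described above; the real wait-free algorithm, including the exact atomics protocol, is part of the detailed design.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>

// Assumed per-slot control information (see the layout sketch earlier).
struct ControlSlot {
    std::atomic<std::uint64_t> timestamp{0U};  // 0 means "never written"
    std::atomic<std::uint32_t> use_count{0U};  // consumers currently reading the slot
};

// The producer claims the unused slot with the oldest data for the next sample.
// If no unused slot exists, the configured contract (slot count versus the
// consumers' maxSampleCount) was violated by a misbehaving QM consumer; the
// caller then withdraws QM communication for this service instance.
template <std::size_t kSlots>
std::optional<std::size_t> SelectSlotForNextSample(std::array<ControlSlot, kSlots>& control) {
    std::optional<std::size_t> candidate{};
    std::uint64_t oldest_timestamp{UINT64_MAX};
    for (std::size_t index = 0U; index < kSlots; ++index) {
        const auto in_use = control[index].use_count.load(std::memory_order_acquire);
        const auto timestamp = control[index].timestamp.load(std::memory_order_relaxed);
        if ((in_use == 0U) && (timestamp < oldest_timestamp)) {
            oldest_timestamp = timestamp;
            candidate = index;
        }
    }
    return candidate;  // std::nullopt signals the contract violation
}
```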
This section gives an overview of the incoming requirements towards this concept (system requirements) and the outgoing requirements derived from this concept (software component requirements).
The codebeamer aggregation can be found here.
- Intra-SoC communication using events
- Support for zero-copy shared memory IPC
- Support for synchronous sending of data over IPC
- Prevent memory fragmentation in real-time processes
- IPC synchronization support
- No direct usage of POSIX IPC functions by application
- IPC communication shall be whitelisted
- The platform shall provide a mechanism to block IPC communication
- All platform processes shall use a different user id
- No platform process shall run as the user root
- All platform processes shall run as a normal user with limited privileges
- The platform shall limit the number of shared resources between processes
- IPC communication shall be integrity-protected
- The CDC Platform shall have the ability to access IPC (ara::com) communication
- IPC Tracing for Development Purposes
- Internal ECU Communication before Middleware Startup
- Use end-to-end protection for communication
- Freedom from interference on Application Software
- Static and automatic memory
The codebeamer aggregation can be found here.
- Operate LoLa in parallel to adaptive AUTOSAR
- Use bmw specific namespace
- Enable Multi-Target-Build
- OS shall provide Shared Memory IPC
- OS shall provide Message Passing IPC
- User data shall be exchanged via Shared Memory
- Notifications shall be exchanged via Message Passing
- User data shall be provided in a separate read-only shared memory segment
- One shared memory segment per ASIL level for control data
- The shared memory segments shall be divided into slots
- The number of slots shall be configurable
- Calculate necessary Shared-Memory size prior to creating it
- Shared Memory segments used by mw::com shall start with the prefix lola
- A service shall be marked as found if its underlying file exists
- Files that are used for Shared Memory IPC shall not be persistent
- Shared Memory segments shall be identified via a commonly known name
- There shall be one triplet of shared memory segments per service instance
- Shared Memory segments shall not be mapped to a fixed virtual address
- Only offset pointer within Shared Memory segments
- The operating system shall support ACL for shared memory segments
- Only configured UIDs shall have access to the LoLa shared memory segments
- Perform Active Bounds Checking on dereferencing of offset pointers
- Synchronizing Shared Memory regions shall ensure wait-freedom
- Subscription handling shall be implemented via Message Passing
- Event Notifications shall be implemented via Message Passing
- One message passing receive port per ASIL-Level per process
- Each message passing port should use a custom thread
- It shall be possible to register a callback for all exchanged messages
- Registered callbacks for messages shall be invoked in the context of the respective waiting thread
- A control slot shall contain all necessary information for synchronizing data access
- A producer shall always be able to store new data
- On contract violation, QM communication for the affected service instance shall be withdrawn
Additionally, the requirements derived from the adaptive AUTOSAR requirements shall be considered, as described here.
The high-level software architecture can be found here: <:27112/collaborator/document/4ff2a028-8ac7-4f7a-b42e-dd90816fec81?viewId=718e62d1-c6e4-4751-9458-40ec38ac39a2&viewType=model§ionId=856b1ace-a576-49fb-ac5e-b030656afa6f>
The detailed design can be found here: </swh/ddad_platform/tree/master/aas/mw/com/design>
The operation of shared memory is always a security concern, since it makes it easier for an attacker to access the memory space of another process.
This is especially true if two processes have read/write access to the same pages. We are confident that our applied mechanisms, like reduced access to shared memory segments and active bounds checking, prevent any further attack vectors.
The only scenario that is not covered is an attack against the control segments. An attacker could, in the worst case, null all usage counters. In that scenario a race condition could occur where data that is being read is written at the same time, causing incomplete data reads.
This is a drawback that comes with the benefit of less overhead for read/write synchronization, which greatly reduces our latency. At this point in time we accept this drawback in favour of the performance benefit.
Software Component Failure Analysis: <> Functional Failure Analysis: <>
LoLa represents a substantial infrastructure part of our safety goals. Thus, additional AoUs towards the application side will result from LoLa.
An overview can be found here: <>
This is a performance measure. Thus, we expect a reduction in communication latency and a reduced memory footprint, since there is no longer the need for memory copies.
There are no implications on diagnostics, since there will be no diagnostic job used.
There will be a substantial need for testing. A concrete test plan is only possible after the FMEA. At this point we are sure that all tests that heavily rely on the operating system can be executed utilizing ITF tests. All other tests can be conducted as unit tests.
For some use cases a loosely typed interface to services, which can be created dynamically at runtime without the need for compile-time dependencies, would be favorable. For this, the DII concept has been created, which LoLa will implement for event communication in IPNext.