LoLa (Low Latency) IPC is a safe, zero-copy, shared-memory-based IPC mechanism. This document describes its high-level architecture concept.
At the beginning of the xPAD project, nearly all customer functions were deployed into one executable. One of the reasons was the need for high-frequency data exchange between individual parts, while the IPC mechanisms provided by our ara::com vendor were not performant enough.
Thus, the need arose for an IPC mechanism that:
- utilizes zero-copy
- provides minimum latency overhead
- is ASIL-B ready
Due to the limited time frame it was decided that LoLa will not implement a fully-fledged ara::com, but rather only a subset of its functionality. Namely, it only provides event-based communication, which will be extended over time.
Since LoLa will not implement all communication mechanisms that ara::com foresees, such as methods and fields, it needs to exist in parallel to, and be operated independently of, any adaptive AUTOSAR stack[1].
Still, it will be necessary to migrate certain applications towards LoLa. In order to make this as easy as possible and to ensure that developers don't need to learn another API, adaptive AUTOSAR shall be mimicked where possible - only with the difference that we will not use the ara C++ namespace, but a BMW one[2]. The sub-section below will go into more detail on which adaptive AUTOSAR features specifically will be needed.
The Communication Management Specification of adaptive AUTOSAR foresees two major building blocks that implement ara::com. One is the so-called frontend, the other the network binding. The idea is that the frontend does not change depending on which network binding is selected. Meaning, the frontend stays the same no matter whether we use SOME/IP or Shared Memory as network binding. In order to be as flexible as possible and to reduce compilation times in the CI, we want to follow the Multi Target Build concept. In summary, it shall be possible to configure at runtime which network binding shall be used[3]. While this does not pay off at the moment, since we have only one network binding, it ensures that no deployment information leaks into the frontend and thus we can reduce compilation times (e.g. by only generating C++ libraries per interface).
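To illustrate the idea, a minimal sketch of such a runtime-selectable network binding is shown below. The interface and all names (NetworkBinding, SharedMemoryBinding, SelectBinding) are illustrative assumptions, not the actual mw::com API.

```cpp
#include <cstddef>
#include <memory>
#include <string_view>

// Hypothetical frontend-facing binding interface: the frontend only depends on
// this abstraction and never sees deployment details of a concrete binding.
class NetworkBinding {
  public:
    virtual ~NetworkBinding() = default;
    virtual void Send(const void* data, std::size_t size) = 0;
};

// Illustrative shared-memory binding; a SOME/IP binding could be added later
// without touching the frontend.
class SharedMemoryBinding final : public NetworkBinding {
  public:
    void Send(const void* /*data*/, std::size_t /*size*/) override {
        // ... write the sample into the shared memory data segment ...
    }
};

// Runtime selection based on the deployment configuration.
std::unique_ptr<NetworkBinding> SelectBinding(std::string_view configured_binding) {
    if (configured_binding == "SHM") {
        return std::make_unique<SharedMemoryBinding>();
    }
    return nullptr;  // further bindings (e.g. SOME/IP) would be handled here
}
```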
The basic idea of our LoLa concept is to use two main operating system facilities:
- Shared Memory[4]: Shall be used for the heavy lifting of data exchange [6]
- Message Passing[5]: Shall be used as notification mechanism [7]
We decided for this side channel since implementing a notification system via shared memory would require condition variables. These condition variables would require a mutex, which would require read-write access. This could lead to the situation that a malicious process locks the mutex forever and thus breaks any event notification. In general, any kind of notification shall be exchanged via the message passing facilities [7]. The sub-section below will go into more detail on the Message Passing Facilities.
The usage of shared memory has some implications. First, any synchronization regarding thread-safety / process-safety needs to be performed by the user. Second, the memory that is shared between the processes is directly mapped into their virtual address space. This implies that it is easy for a misbehaving process to destroy or manipulate any data within this memory segment. In order to cope with the latter, we split up the shared memory into three segments.
- First, a segment where only the to-be-exchanged data is provided. This segment shall be read-only for consumers and only writeable by the producer [8]. This ensures that nobody besides the producer process can manipulate the provided data.
- The second and third segment shall contain the necessary control information for the data segment[9]. Necessary control information can include atomics that are used to synchronize the access to the data segment. Since this kind of access requires write access, we split the shared memory segments for control data by ASIL level. This way it can be ensured that no lower-ASIL process interferes with higher-ASIL ones. More information on shared memory handling can be found in the sub-section below.
One of the main ideas in this concept is the split of control data from sample (user) data. In order to ensure a mapping, the shared memory segments are divided into slots [10] [11]. By convention, we then define that the slot indexes correlate. Meaning, slot 0 in the control data is used to synchronize slot 0 in the sample data. More information on these slots and the underlying algorithm can be found in the sub-section below.
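The following sketch illustrates this slot-index correlation. The field layout is only an assumption for illustration (the concrete control information is defined in the detailed design), and in the real system each of the three arrays lives in its own shared memory segment rather than in one struct.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed per-slot control information; the real layout is implementation defined.
struct ControlSlot {
    std::atomic<std::uint64_t> timestamp;  // order of the produced samples
    std::atomic<std::uint32_t> use_count;  // consumers currently reading this slot
};

template <typename SampleType, std::size_t kSlots>
struct EventStorage {
    // Control segment for ASIL-B participants.
    std::array<ControlSlot, kSlots> control_asil_b;
    // Control segment for QM participants.
    std::array<ControlSlot, kSlots> control_qm;
    // Data segment: writable by the producer only, read-only for consumers.
    std::array<SampleType, kSlots> samples;
    // Convention: control_*[i] synchronizes the access to samples[i].
};
```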
As already mentioned earlier, we will only implement the ara::com specification partially. In the following paragraphs, we will list all necessary requirements from the SWS Communication Management of the 19.03 adaptive AUTOSAR Standard. We will further give reasoning for deviations or for parts that are not implemented.
In general we take over all structuring parts that define where which types are defined. It shall only be noted that we will deviate when it comes to implementing the ara-namespace. See [2].
- Folder Structure
- Service header files existence
- Inclusion of common header file
- Namespace of Service header file
- Service skeleton namespace
- Service proxy namespace
- Service events namespace
- Common header file existence
- Inclusion of Types header file
- Inclusion of Implementation Types header files
- Service Identifier Type definitions
- Namespace for Service Identifier Type definitions
- Types header file existence
- Types header file namespace
- Data Type declarations in Types header file
- Implementation Types header files existence
- Data Type definitions for AUTOSAR Data Types in Implementation Types header files
- Implementation Types header file namespace
Also the major API types are taken over without changes. We only have to adjust requirement [19], since the underlying requirement would require this data-type in the ara::core namespace. Here we will again diverge to fulfill requirement [2].
- Instance Specifier Class
- Instance Identifier Class
- Instance Identifier Container Class
- Find Service Handle
- Handle Type Class
- Copy semantics of Handle Type Class
- Move semantics of Handle Type Class
- Service Handle Container
- Find Service Handler
For events we will provide only the types listed below. Any E2E types are not necessary, since we do not need to support End-2-End protection (see [17]). Types associated with the Subscription State are also not necessary, since there is no use-case on customer function side and we try to keep the overhead minimal. The same holds for unused types like custom future/promise implementations, which are only necessary for method support. Custom variant or optional types will not be implemented either, since these are already provided by amp. Last but not least, the ScaleLinearAndTexttable class has no usage within BMW, thus it can be skipped as well.
Adaptive AUTOSAR clearly defines which data-types can be transmitted via its communication mechanism. We orient ourselves strongly on these concepts. Our adaptive AUTOSAR generator, aragen, shall support the following requirements (a hedged sketch of a generated type follows the list below).
- Data Type Mapping
- Provide data type definitions
- Avoid Data Type redeclaration
- Naming of data type by short name
- Supported Primitive Cpp Implementation Types
- Primitive fixed width integer types
- StdCppImplementationDataType of category ARRAY with one dimension
- Array Data Type with more than one dimension
- Structure Data Type
- Element specification typed by CppImplementationDataType
- StdCppImplementationDataType with the category STRING
- StdCppImplementationDataType with the category VECTOR with one dimension
- Vector Data Type with more than one dimension
- Data Type redefinition
- Enumeration Data Type
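As an illustration, a structure implementation data type with fixed width primitives and a one-dimensional array might be generated roughly like this. The type, its fields, and the namespace are hypothetical and do not reflect actual aragen output.

```cpp
#include <array>
#include <cstdint>

namespace bmw::xpad::example {  // hypothetical namespace, not actual generated code

// StdCppImplementationDataType of category STRUCTURE
struct WheelSpeeds {
    std::uint32_t timestamp_us;              // primitive fixed width integer type
    std::array<std::int16_t, 4U> speed_rpm;  // ARRAY category with one dimension
};

}  // namespace bmw::xpad::example
```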
It shall be noted that some specifics are not supported for transmission and will not be generated. This includes:
- associative maps
- variants
- optionals in struct
- ScaleLinearAndTexttable
- custom allocators
The latter is especially important, since our implementation will need a custom allocator to ensure correct shared memory handling.
In addition to these requirements, we need to clearly specify the maximum size for each container. This is due to the fact that the shared memory needs to be preallocated[4]. If we need to preallocate on startup, the maximum size needs to be calculated in advance[12]. In order to do that, it is necessary to know the maximum number of elements in each container. This solves the same underlying issue as `` and SW_CM_00450 in a more generic way, and is thus a custom BMW extension.
[Shall enable definition of max-elements for container types]()
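The sketch below shows why the max-elements extension matters: with a bounded container type (the BoundedVector here is hypothetical) whose capacity comes from the model, the worst-case slot size and thus the required segment size can be computed before the shared memory is created. Alignment and allocator overhead of the real implementation are ignored for brevity.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity vector: the capacity stems from the configured
// max-elements attribute, so its worst-case size is known up front.
template <typename T, std::size_t kMaxElements>
struct BoundedVector {
    std::array<T, kMaxElements> storage{};
    std::size_t size{0U};
};

// Example event payload using a container with max-elements = 64.
struct ObjectList {
    BoundedVector<std::uint32_t, 64U> object_ids;
};

// Worst-case size of the data segment for one event with 8 slots.
constexpr std::size_t kSlots{8U};
constexpr std::size_t kDataSegmentSize{kSlots * sizeof(ObjectList)};
```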
The error concept cannot be taken over either, since it relies heavily on the infrastructure provided by ara::core. Within BMW we have a similar infrastructure with bmw::Result. Instead of following the corresponding ara::core-based requirements, we define that any error shall be reported via bmw::Result.
[Utilize bmw::Result for any error reporting]()
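A small sketch of the intended error reporting style. The Result shown here is only a self-contained stand-in; the real bmw::Result type and its error codes are provided by the BMW infrastructure and may differ in detail.

```cpp
#include <cstdint>
#include <utility>
#include <variant>

namespace bmw {  // stand-in, not the real implementation
enum class ComErrc : std::uint8_t { kInvalidConfiguration = 1U };

template <typename T>
class Result {
  public:
    static Result MakeValue(T value) { return Result{std::move(value)}; }
    static Result MakeError(ComErrc error) { return Result{error}; }
    bool HasValue() const { return std::holds_alternative<T>(data_); }

  private:
    explicit Result(T value) : data_{std::move(value)} {}
    explicit Result(ComErrc error) : data_{error} {}
    std::variant<T, ComErrc> data_;
};
}  // namespace bmw

// LoLa APIs report failures via the returned Result instead of raising
// exceptions or using the ara::core error infrastructure.
bmw::Result<std::uint16_t> ParseConfiguredSlotCount(bool config_is_valid) {
    if (!config_is_valid) {
        return bmw::Result<std::uint16_t>::MakeError(bmw::ComErrc::kInvalidConfiguration);
    }
    return bmw::Result<std::uint16_t>::MakeValue(8U);
}
```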
The general API for proxies and skeletons can be taken over completely.
- Service skeleton class
- Service skeleton Event class
- Service proxy class
- Service proxy Event class
- Declaration of Construction Token
- Creation of Construction Token
- Method to offer a service
- Method to stop offering a service
There are many ways to create a skeleton. Within BMW we only want to support one way: using exception-less mechanisms and an Instance Specifier. All other creation methods are not supported. Since we don't support method processing either, our constructors don't allow specifying the processing mode. A usage sketch follows the list below.
- Exception-less creation of service skeleton using Instance Spec
- Copy semantics of service skeleton class
- Move semantics of service skeleton class
- Send event where application is responsible for the data
- Send event where Communication Management is responsible for the data
- Allocating data for event transfer
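A hedged usage sketch of the skeleton side under these constraints. All type names, the include paths, and the exact signatures (InstanceSpecifier::Create, Create, Allocate, Send) are assumptions that mirror the ara::com style; the actual BMW-namespace API may differ.

```cpp
#include <utility>

#include "bmw/mw/com/types.h"           // hypothetical include path
#include "example/tireinfo_skeleton.h"  // hypothetical generated service header

void OfferAndSendTirePressure() {
    // Exception-less creation via InstanceSpecifier only; no processing mode
    // can be passed, since method processing is not supported.
    const auto specifier = bmw::mw::com::InstanceSpecifier::Create("xpad/lola/TireInfoPort").value();
    auto skeleton = example::skeletons::TireInfoSkeleton::Create(specifier).value();
    skeleton.OfferService();

    // "Communication Management is responsible for the data": allocate a slot
    // directly in the shared memory data segment, fill it, and send zero-copy.
    auto sample = skeleton.tire_pressure.Allocate();
    *sample = 2400U;  // e.g. pressure in millibar
    skeleton.tire_pressure.Send(std::move(sample));

    skeleton.StopOfferService();
}
```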
The same story is true for proxies. We only allow polling-based search of proxies, and only with an InstanceSpecifier. This also excludes the support for ANY.
- Find service with immediately returned request using Instance Spec
- Exception-less creation of service proxy
- Copy semantics of service proxy class
- Move semantics of service proxy class
The proxy event handling is mostly the same as in ara::com. Only the re-establishment of subscriptions and any kind of subscription state query are not supported. The former would require a fully working service discovery, which was dismissed due to timing constraints. The latter is not used within BMW and was thus dismissed. A usage sketch follows the list below.
- Method to subscribe to a service event
- Ensure memory allocation of maxSampleCount samples
- Method to unsubscribe from a service event
- Method to update the event cache
- Signature of Callable f
- Sequence of actions in GetNewSamples
- Return Value
- Reentrancy
- Query Free Sample Slots
- Return Value of GetFreeSampleCount
- Calculation of Free Sample Count
- Possibility of exceeding sample count by one
- Enable service event trigger
- Disable service event trigger
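The corresponding hedged usage sketch for the proxy side; again, all names, include paths, and signatures are assumptions mirroring the ara::com style.

```cpp
#include "bmw/mw/com/types.h"        // hypothetical include path
#include "example/tireinfo_proxy.h"  // hypothetical generated service header

void PollTirePressure() {
    const auto specifier = bmw::mw::com::InstanceSpecifier::Create("xpad/lola/TireInfoPort").value();

    // Polling-based search only, via InstanceSpecifier (no StartFindService, no ANY).
    const auto handles = example::proxies::TireInfoProxy::FindService(specifier).value();
    if (handles.empty()) {
        return;  // service not (yet) offered
    }
    auto proxy = example::proxies::TireInfoProxy::Create(handles.front()).value();

    // Subscribe with the maximum number of samples held in parallel.
    proxy.tire_pressure.Subscribe(3U);

    // GetNewSamples hands out zero-copy references into the data segment.
    proxy.tire_pressure.GetNewSamples(
        [](auto sample_ptr) {
            const auto pressure = *sample_ptr;  // read-only view into shared memory
            static_cast<void>(pressure);
        },
        3U);

    proxy.tire_pressure.Unsubscribe();
}
```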
As stated earlier, we want our implementation to be Multi-Target-Build ready. So does adaptive AUTOSAR, which is why we can take over these requirements.
- Change of Service Interface Deployment
- Change of Service Instance Deployment
- Change of Network Configuration
The security chapter of the SWS Communication Management defines a possibility to restrict communication. While this is generally good, we agreed with our stakeholders that this kind of functionality will not be configured via ARXML. Thus, we will not follow these requirements (e.g. SWS_COM_90002), since they describe the direct connection with the AUTOSAR Meta Model. Our security requirements are custom-made in the sections below.
In addition to the previous API-centric requirements, there are some more functional requirements, which are grouped in this section. Any network-binding-specific requirements like the ones for DDS or SOME/IP are obviously irrelevant, since we implement a User-Defined network binding.
- Uniqueness of offered service
- Protocol where a service is offered
- InstanceSpecifier check during creation of service skeleton
- FIFO semantics
- No implicit context switches
- Event Receive Handler call serialization
- Functionality after event received
The Message Passing facilities - under QNX implemented via QNX Message Passing - will not be used to synchronize the access to the shared memory segments; this is done via the control segments. We utilize message passing for notifications only, e.g. subscription handling and event notifications (see the derived requirements below).
This is done since there is no need to implement an additional notification handling via shared memory, which would only be possible by using mutexes and condition variables. The utilization of mutexes would make the implementation of wait-free algorithms more difficult. As illustrated in the graphic below, a process should provide one message passing port per supported ASIL level to receive data[26]. In order to ensure that messages received from QM processes will not influence ASIL messages, each message passing port shall use a dedicated thread to wait for new messages[27]. Further, it must be possible to register callbacks for the mentioned messages[28]. These callbacks shall then be invoked in the context of the port's waiting thread[29]. This way we can ensure that messages are received in a serialized manner.
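An illustrative sketch of this notification handling: one receiver per supported ASIL level, each with a dedicated thread that waits for messages and invokes the registered callbacks in its own context. All names are assumptions; under QNX the blocking wait would map to QNX message passing.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <thread>
#include <utility>

enum class MessageId : std::uint8_t { kSubscribe = 1U, kEventNotification = 2U };

// Hypothetical receiver: one instance per supported ASIL level, so that QM
// traffic cannot delay or disturb ASIL-B notification handling.
class MessagePassingReceiver {
  public:
    void Register(const MessageId id, std::function<void()> callback) {
        callbacks_[id] = std::move(callback);
    }

    void Start() {
        // Dedicated thread per receive port: all callbacks are invoked in this
        // thread's context, which serializes the message handling.
        worker_ = std::thread([this]() {
            for (;;) {
                const MessageId id = WaitForNextMessage();  // blocking OS call, e.g. QNX MsgReceive()
                const auto entry = callbacks_.find(id);
                if (entry != callbacks_.end()) {
                    entry->second();
                }
            }
        });
    }

  private:
    MessageId WaitForNextMessage();  // OS-specific, omitted in this sketch
    std::map<MessageId, std::function<void()>> callbacks_;
    std::thread worker_;
};
```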
POSIX-based operating systems generally support two kinds of shared memory: file-backed and anonymous. The former is represented by a file within the file system, while the latter is not directly visible to other processes. We decided for the former, in order to utilize the file system for a minimal service discovery[13],[14]. In order to avoid fault propagation over restarts of the system, any shared memory used for communication shall not be persistent[15]. Processes will identify shared memory segments by their name. The name will be commonly known by producers and consumers and is deduced from additional parameters like, for example, the service id and instance id[16] (see the sketch below). When it comes to the granularity of the data stored in the shared memory segments, multiple options can be considered. We could have one triplet of shared memory segments per process, or one triplet of shared memory segments per event within a service instance. The former would make the ASIL split of segments quite hard, while the latter would explode the number of necessary segments within the system. As a trade-off, we decided to have one triplet of shared memory segments per service instance[17].
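A sketch of the file-backed creation and the commonly known naming scheme. Only the lola prefix is fixed by the derived requirement; the exact name format shown here is an assumption.

```cpp
#include <fcntl.h>     // O_CREAT, O_EXCL, O_RDWR
#include <sys/mman.h>  // shm_open
#include <sys/stat.h>  // S_IRUSR, S_IWUSR

#include <cstdint>
#include <string>

// Commonly known name, deduced from service id and instance id. Only the
// "lola" prefix is fixed by requirement; the remaining format is illustrative.
std::string DataSegmentName(const std::uint16_t service_id, const std::uint16_t instance_id) {
    return "/lola-data-" + std::to_string(service_id) + "-" + std::to_string(instance_id);
}

int CreateDataSegment(const std::uint16_t service_id, const std::uint16_t instance_id) {
    // File-backed (named) shared memory: its existence in the file system acts
    // as the minimal service discovery. It is not persistent; the producer
    // removes it again via shm_unlink() when it stops offering the service.
    const std::string name = DataSegmentName(service_id, instance_id);
    return shm_open(name.c_str(), O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
}
```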
It is possible to map shared memory segments to a fixed virtual address. However, this is highly discouraged by POSIX and leads to undefined behaviour[18]. Thus, shared memory segments will be mapped to different virtual addresses in different processes. In consequence, no raw pointer can be stored within shared memory, since it would be invalid within another process. Only offset pointers (fancy pointers, relative pointers) shall be stored within shared memory segments[19].
The usage of shared memory does not involve the operating system once the shared memory segments are set up. Thus, the operating system can no longer ensure freedom from interference between processes that have access to these shared memory regions. In order to restrict access, we use the ACL support of the operating system[20],[21]. In addition to the restricted permissions, we have to ensure that a corrupted shared memory region cannot influence other process-local memory regions. This can be ensured by performing active bounds checking. The only way data corruption could propagate out of a shared memory region is if a pointer within the shared memory region points outside of it; a write through such a pointer could forward the memory corruption. The basic idea to overcome such a scenario is to check that any pointer stays within the bounds of the shared memory region. Since only offset pointers can be stored in a shared memory region anyway, this active bounds check can be performed whenever an offset pointer is dereferenced[22]. The last possible impact can be on timing: if another process, for example, wrongly locks a mutex within the shared memory region and another process then waits for this lock, we would end up in a deadlock. While this should not harm any safety goal, we still strive for wait-free algorithms to avoid such situations[23].
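A sketch of an offset (relative) pointer whose dereference performs the active bounds check against the mapped region; the real implementation is defined in the detailed design.

```cpp
#include <cstddef>
#include <cstdint>

// Stores an offset relative to the pointer object itself, so the value stays
// valid although every process maps the segment at a different virtual address.
template <typename T>
class OffsetPtr {
  public:
    void Set(const T* const target) {
        offset_ = reinterpret_cast<std::intptr_t>(target) - reinterpret_cast<std::intptr_t>(this);
    }

    // Active bounds checking: the resolved address must lie completely inside
    // the mapped shared memory region, otherwise a corrupted offset could
    // propagate memory corruption outside of the region.
    T* Get(const void* const region_start, const std::size_t region_size) const {
        const auto address = static_cast<std::uintptr_t>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
        const auto start = reinterpret_cast<std::uintptr_t>(region_start);
        if ((address < start) || ((address + sizeof(T)) > (start + region_size))) {
            return nullptr;  // out of bounds: treat as corrupted, do not dereference
        }
        return reinterpret_cast<T*>(address);
    }

  private:
    std::intptr_t offset_{0};
};
```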
A slot shall contain all necessary meta-information to synchronize data access[30]. This information most certainly needs to include a timestamp to indicate the order of the produced data within the slots. Additionally, a use count is needed, indicating whether a slot is currently in use by a process. The concrete data is implementation defined and must be covered by the detailed design.
The main idea of the algorithm is that a producer shall always be able to store one new data sample[31]. If it cannot find a free slot, this indicates a contract violation, which means that a QM process misbehaved. In such a case, the producer shall exclude any QM consumer from the communication[32]. A simplified sketch of this slot selection follows below.
This whole idea builds upon the split of shared memory segments by ASIL level. This way we can ensure that a QM process will not degrade the ASIL level of a communication path. In the other direction, where we have a QM producer, it is still possible for an ASIL-B consumer to consume the QM data. In that case the data will always remain QM, since it is impossible for the middleware to apply additional checks to enhance the quality of the data. This can only be done on application layer level.
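A simplified sketch of the producer-side slot selection described above; the real wait-free algorithm, including the exact atomics protocol, is part of the detailed design.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>

// Assumed per-slot control information (see the layout sketch earlier).
struct ControlSlot {
    std::atomic<std::uint64_t> timestamp{0U};  // 0 means "never written"
    std::atomic<std::uint32_t> use_count{0U};  // consumers currently reading the slot
};

// The producer claims the unused slot with the oldest data for the next sample.
// If no unused slot exists, the configured contract (slot count versus the
// consumers' maxSampleCount) was violated by a misbehaving QM consumer; the
// caller then withdraws QM communication for this service instance.
template <std::size_t kSlots>
std::optional<std::size_t> SelectSlotForNextSample(std::array<ControlSlot, kSlots>& control) {
    std::optional<std::size_t> candidate{};
    std::uint64_t oldest_timestamp{UINT64_MAX};
    for (std::size_t index = 0U; index < kSlots; ++index) {
        const auto in_use = control[index].use_count.load(std::memory_order_acquire);
        const auto timestamp = control[index].timestamp.load(std::memory_order_relaxed);
        if ((in_use == 0U) && (timestamp < oldest_timestamp)) {
            oldest_timestamp = timestamp;
            candidate = index;
        }
    }
    return candidate;  // std::nullopt signals the contract violation
}
```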
This section gives an overview of the incoming requirements towards this concept (system requirements) and the outgoing requirements derived from this concept (software component requirements).
The codebeamer aggregation can be found here.
- Intra-SoC communication using events
- Support for zero-copy shared memory IPC
- Support for synchronous sending of data over IPC
- Prevent memory fragmentation in real-time processes
- IPC synchronization support
- No direct usage of POSIX IPC functions by application
- IPC communication shall be whitelisted
- The platform shall provide a mechanism to block IPC communication
- All platform processes shall use a different user id
- No platform process shall run as the user root
- All platform processes shall run as a normal user with limited privileges
- The platform shall limit the number of shared resources between processes
- IPC communication shall be integrity-protected
- The CDC Platform shall have the ability to access IPC (ara::com) communication
- IPC Tracing for Development Purposes
- Internal ECU Communication before Middleware Startup
- Use end-to-end protection for communication
- Freedom from interference on Application Software
- Static and automatic memory
The codebeamer aggregation can be found here.
- Operate LoLa in parallel to adaptive AUTOSAR
- Use bmw specific namespace
- Enable Multi-Target-Build
- OS shall provide Shared Memory IPC
- OS shall provide Message Passing IPC
- User data shall be exchanged via Shared Memory
- Notifications shall be exchanged via Message Passing
- User data shall be provided in a separate read-only shared memory segment
- One shared memory segment per ASIL level for control data
- The shared memory segments shall be divided into slots
- The number of slots shall be configurable
- Calculate necessary Shared-Memory size prior to creating it
- Shared Memory segments used by mw::com shall start with the prefix lola
- A service shall be marked as found if its underlying file exists
- Files that are used for Shared Memory IPC shall not be persistent
- Shared Memory segments shall be identified via a commonly known name
- There shall be one triplet of shared memory segments per service instance
- Shared Memory segments shall not be mapped to a fixed virtual address
- Only offset pointer within Shared Memory segments
- The operating system shall support ACL for shared memory segments
- Only configured UIDs shall have access to the LoLa shared memory segments
- Perform Active Bounds Checking on dereferencing of offset pointers
- Synchronizing Shared Memory regions shall ensure wait-freedom
- Subscription handling shall be implemented via Message Passing
- Event Notifications shall be implemented via Message Passing
- One message passing receive port per ASIL-Level per process
- Each message passing port should use a custom thread
- It shall be possible to register a callback for all exchanged messages
- Registered callbacks for messages shall be invoked in the context of the respective waiting thread
- A control slot shall contain all necessary information for synchronizing data access
- A producer shall always be able to store new data
- On contract violation, QM communication for the affected service instance shall be withdrawn
Additionally, the requirements derived from the adaptive AUTOSAR requirements shall be considered, as described here.
The high-level software architecture can be found here: <:27112/collaborator/document/4ff2a028-8ac7-4f7a-b42e-dd90816fec81?viewId=718e62d1-c6e4-4751-9458-40ec38ac39a2&viewType=model§ionId=856b1ace-a576-49fb-ac5e-b030656afa6f>
The detailed design can be found here: </swh/ddad_platform/tree/master/aas/mw/com/design>
The operation of shared memory is always a security concern, since it makes it easier for an attacker to access the memory space of another process.
This is especially true if two processes have read/write access to the same pages. We are confident that our applied mechanisms, like reduced access to shared memory segments and active bounds checking, prevent any further attack vectors.
The only scenario that is not covered is an attack against the control segments. An attacker could, in the worst case, null all usage counters. In that scenario a race condition could occur where data that is being read is written at the same time, causing incomplete data reads.
This is a drawback that comes with the benefit of less overhead for read/write synchronization, which greatly reduces our latency. At this point in time we accept this drawback in favour of the performance benefit.
Software Component Failure Analysis: <> Functional Failure Analysis: <>
LoLa represents a substantial infrastructure part of our safety goals. Thus, additional AoUs towards the application side will result from LoLa.
An overview can be found here: <>
This is a performance measure. Thus, we expect a reduction in communication latency and a reduced memory footprint, since there is no longer the need for memory copies.
There are no implications on diagnostics, since there will be no diagnostic job used.
There will be a substantial need for testing. A concrete test plan is only possible after the FMEA. At this point we are sure that all tests that heavily rely on the operating system can be executed utilizing ITF tests. All other tests can be conducted as unit tests.
For some use cases a loosely typed interface to services, which can be created dynamically at runtime without the need for compile-time dependencies, would be favorable. For this, the DII concept has been created, which LoLa will implement for event communication in IPNext.