vys architecture
The vys system comprises a set of visibility stream producer processes (a.k.a. servers), a set of visibility stream consumer processes (a.k.a. clients), and a protocol to transfer visibility data spectra from producers to consumers using OFED/OFS technologies. There are two phases in the transmission protocol for every visibility spectrum. The first phase is the transmission of "what" and "where" information about a spectrum, and the second phase is the transmission of the spectral data.
The first phase, the transmission of so-called "signal messages", is designed to allow for loose coupling between the producers and consumers. Signal messages, which are sent for all spectral data that appear on the visibility stream, comprise metadata that describe which spectral data are available and on which server process they reside. With this design, producers need no knowledge of the set of active consumers at any time, nor of the spectra those consumers wish to receive. Similarly, the consumers require no a priori knowledge of which producers will provide which spectral data. To meet these design objectives, the signal messages are sent via multicast over InfiniBand. To reduce the rate of multicast messages, each multicast message contains the "what" and "where" information for a sequence of spectra.
Given the current VLA system implementation, the strongest guarantee that can be made about the association of WIDAR correlator data products with CBE processes is that, for every WIDAR configuration, the complete time series of lag frames for any given data product will be received by a single CBE process. In other words, each spectral data product in a WIDAR configuration, with fixed baseline, spectral window, and polarization product, will be received by one CBE process for the entirety of the configuration. Thus, it is natural that each signal message contains a short time series of the "what" and "where" of a single data product, which leads to the following structure of the signal messages:
struct vys_spectrum_info {
    uint64_t data_addr; /* location of spectral data on the server */
    uint64_t timestamp; /* spectral data timestamp */
    uint8_t digest[VYS_DATA_DIGEST_SIZE]; /* to validate data after retrieval */
};
struct vys_signal_msg_payload {
    struct sockaddr_in sockaddr; /* server identification */
    uint16_t num_channels; /* number of channels in a spectrum */
    uint8_t stations[2]; /* station ids */
    uint8_t spectral_window_index; /* spectral window id */
    uint8_t stokes_index; /* stokes product id */
    uint8_t num_spectra; /* number of spectra in the following array */
    struct vys_spectrum_info infos[]; /* per spectra information */
};
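Because `infos` is a flexible array member, the size of a signal message depends on `num_spectra`. The following is a minimal sketch of that size arithmetic, not code taken from vys: the value of `VYS_DATA_DIGEST_SIZE` is assumed for illustration only, and the helpers `signal_msg_payload_size` and `max_spectra_per_msg` are hypothetical names.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value, for illustration only; the real constant is defined by vys. */
#define VYS_DATA_DIGEST_SIZE 16

struct vys_spectrum_info {
    uint64_t data_addr;
    uint64_t timestamp;
    uint8_t digest[VYS_DATA_DIGEST_SIZE];
};

struct vys_signal_msg_payload {
    struct sockaddr_in sockaddr;
    uint16_t num_channels;
    uint8_t stations[2];
    uint8_t spectral_window_index;
    uint8_t stokes_index;
    uint8_t num_spectra;
    struct vys_spectrum_info infos[];
};

/* Hypothetical helper: bytes occupied by a payload with num_spectra entries. */
size_t
signal_msg_payload_size(uint8_t num_spectra)
{
    return offsetof(struct vys_signal_msg_payload, infos)
        + (size_t)num_spectra * sizeof(struct vys_spectrum_info);
}

/* Hypothetical helper: largest num_spectra whose payload fits in max_bytes.
 * The result is capped at UINT8_MAX because num_spectra is a uint8_t. */
uint8_t
max_spectra_per_msg(size_t max_bytes)
{
    size_t header = offsetof(struct vys_signal_msg_payload, infos);
    size_t n;

    if (max_bytes < header)
        return 0;
    n = (max_bytes - header) / sizeof(struct vys_spectrum_info);
    return (uint8_t)(n > UINT8_MAX ? UINT8_MAX : n);
}
```

A sender could use `max_spectra_per_msg` with the payload limit of its multicast datagrams to decide how many spectra to batch into one signal message; the `uint8_t` type of `num_spectra` bounds that count at 255 regardless of the datagram size.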
The second phase of data transmission occurs when a client retrieves whatever spectral data it chooses, based on the contents of the signal messages. This phase is implemented via RDMA (Remote Direct Memory Access), which is available on most OFED fabrics, including InfiniBand. The primary advantage of using RDMA for data retrieval is that it can proceed without the participation of any server CPU in the transfer, minimizing the effect of client data retrieval on the servers. In the two C language structures shown above, the fields `sockaddr` and `data_addr` are used by clients to retrieve the associated spectral data using RDMA read operations. A third value is required on the initiating side of an RDMA read transfer, the so-called "remote key"; this value is provided to clients once by the vys protocol, when the InfiniBand connection is made between a client and a server, and is not carried by the signal messages.
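To illustrate how the three values come together on the client, here is a minimal sketch under stated assumptions. The types `server_conn` and `read_request` and the function `make_read_request` are hypothetical and are not part of vys; `bytes_per_channel` is likewise an assumed parameter, since the sample format is not described here. The sketch stops short of posting the read. With libibverbs, the resulting values would typically populate the `wr.rdma.remote_addr` and `wr.rdma.rkey` fields of an `ibv_send_wr` whose opcode is `IBV_WR_RDMA_READ`.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side record of one server connection (not a vys type). */
struct server_conn {
    struct sockaddr_in sockaddr; /* server identity, as in signal messages */
    uint32_t rkey;               /* remote key, learned once at connection */
};

/* Hypothetical bundle of the values needed to start one RDMA read. */
struct read_request {
    const struct server_conn *conn; /* connection on which to post the read */
    uint64_t remote_addr;           /* data_addr from the signal message */
    uint32_t rkey;                  /* from the connection, not the message */
    uint32_t length;                /* number of bytes to read */
};

/* Two addresses name the same server if IPv4 address and port match. */
static int
same_server(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_addr.s_addr == b->sin_addr.s_addr
        && a->sin_port == b->sin_port;
}

/* Fill *req for one spectrum. Returns 0 on success, or -1 if the client
 * holds no connection to the server named in the signal message. */
int
make_read_request(const struct server_conn *conns, size_t num_conns,
                  const struct sockaddr_in *server, uint64_t data_addr,
                  uint16_t num_channels, uint32_t bytes_per_channel,
                  struct read_request *req)
{
    assert(conns != NULL || num_conns == 0);
    assert(server != NULL && req != NULL);
    for (size_t i = 0; i < num_conns; i++) {
        if (same_server(&conns[i].sockaddr, server)) {
            req->conn = &conns[i];
            req->remote_addr = data_addr;
            req->rkey = conns[i].rkey;
            req->length = (uint32_t)num_channels * bytes_per_channel;
            return 0;
        }
    }
    return -1;
}
```

The lookup by `sockaddr` reflects the point made above: the signal message says which server and which address, while the remote key comes from state the client saved when it connected.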
It is important to note that to retrieve data by RDMA from a server process, the data must reside in a region of "registered memory" in the server. Registered memory regions are a limited resource, so a server process can keep spectral data in them for a limited time only. The organization of data buffers in a region of registered memory may be viewed as a ring buffer that has a write pointer but no read pointer. Thus, any data buffer identified by the "where" fields of a signal message will contain the data identified by the "what" fields for a limited time only. Because vys is designed to avoid synchronization between producers and consumers, access to server data buffers is unsynchronized. To compensate, the signal messages contain a data digest value, to be used by clients to validate the data after they have been copied to the client's memory. An invalid digest value indicates that the retrieved data are not the data that were described by the associated signal message, and that the server data buffer had already been (partially) reused by the server before the client read operation completed.
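The interaction between the write-only ring and the digest check can be modeled in a few lines. This is a toy sketch under stated assumptions, not the vys implementation: the slot count, slot size, and all function names are hypothetical, a plain `memcpy` stands in for the RDMA read, and a 64-bit FNV-1a hash stands in for the digest, whose actual algorithm is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_SIZE 8   /* stand-in for VYS_DATA_DIGEST_SIZE (assumed) */
#define SLOT_BYTES 32   /* bytes per spectrum buffer (assumed) */
#define NUM_SLOTS 4     /* buffers in the registered region (assumed) */

/* Toy model of a server's registered memory: a write index, no read index. */
struct ring {
    uint8_t slots[NUM_SLOTS][SLOT_BYTES];
    size_t write_index;
};

/* Stand-in digest: 64-bit FNV-1a over the data bytes. */
void
toy_digest(const uint8_t *data, size_t len, uint8_t out[DIGEST_SIZE])
{
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    memcpy(out, &h, DIGEST_SIZE);
}

/* Server side: store a spectrum in the next slot, overwriting whatever was
 * there, and report the slot ("where") and digest for the signal message. */
size_t
ring_publish(struct ring *r, const uint8_t *data, uint8_t digest[DIGEST_SIZE])
{
    size_t slot = r->write_index;
    memcpy(r->slots[slot], data, SLOT_BYTES);
    toy_digest(data, SLOT_BYTES, digest);
    r->write_index = (slot + 1) % NUM_SLOTS;
    return slot;
}

/* Client side: copy the slot without any synchronization, then validate.
 * Returns 1 if the copy matches the signaled digest, 0 if it does not. */
int
ring_fetch(const struct ring *r, size_t slot,
           const uint8_t expected[DIGEST_SIZE], uint8_t out[SLOT_BYTES])
{
    uint8_t actual[DIGEST_SIZE];
    assert(slot < NUM_SLOTS);
    memcpy(out, r->slots[slot], SLOT_BYTES); /* stands in for the RDMA read */
    toy_digest(out, SLOT_BYTES, actual);
    return memcmp(actual, expected, DIGEST_SIZE) == 0;
}
```

In this model, a client that fetches a slot promptly sees a matching digest, while a client that waits until the server has wrapped around the ring copies newer data and the check fails, which is the signal to discard the copy.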