Function-as-a-Service (FaaS) applications can harness the distributed nature of the fog and take advantage of its benefits, such as real-time processing and reduced bandwidth usage. The FaaS programming paradigm allows applications to be divided into independent units called “functions.” However, deciding how to place those units in the fog is challenging: the fog contains diverse, potentially resource-constrained nodes, geographically spanning from the Cloud to the edges of the IP network. These nodes must be efficiently shared between the multiple applications that need to use the fog.

We introduce “fog node ownership,” a concept where fog nodes are owned by different actors who choose to provide computing resources in exchange for remuneration. This concept allows the reach of the fog to be extended dynamically, without central supervision by a single decision-maker, as currently assumed in the literature. To the end user, the fog appears as a single unified FaaS platform. We use auctions to incentivize fog nodes to join and compete for executing functions.

Our auctions let fog nodes independently put a price on candidate functions to run. This introduces the need for a “Marketplace,” a trusted third party that manages the auctions. Clients wanting to run functions communicate their requirements using Service Level Agreements (SLAs) that provide guarantees over allocated resources or network latency. These contracts are propagated from the Marketplace to a node and relayed to its neighbors.
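As a purely illustrative sketch of this flow, a client might submit a function together with its SLA to the Marketplace as follows; the endpoint, port, and JSON field names here are hypothetical, not the actual `manager` API:

```shell
# Hypothetical request; the endpoint, port and fields are illustrative only,
# see manager/ for the real Marketplace API.
curl -X PUT http://marketplace.example:3003/function \
  -H 'Content-Type: application/json' \
  -d '{
        "function_image": "ghcr.io/user/echo:latest",
        "memory_mb":      256,
        "cpu_millicores": 500,
        "max_latency_ms": 150,
        "max_price":      0.05
      }'
# The Marketplace then propagates the SLA to a fog node, which bids on it
# and relays it to its neighbors; bids flow back and the auction picks a winner.
```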
Key features of Global Integration of Reverse Auctions and Fog Functions (GIRAFF):
- Standard stack (Kubernetes, OpenFaaS)
- Efficient programming in Rust to enable effective collaboration (also cross-compiling in the future?)
- Nix to scientifically reproduce the experiments and maintain the same development environment for everyone
- Functions to deploy
- Grid’5000 support thanks to EnosLib
Additional info
This project started as a sponsored internship while I was a student at the National Institute of Applied Sciences Rennes (INSA Rennes), pursuing a second master’s degree (SIF) under University of Rennes 1, University of Southern Brittany (UBS), ENS Rennes, INSA Rennes and CentraleSupélec.
A thesis financed by the “Centre INRIA de l’Université de Rennes” pursues this work.

Built with:
- Nix
- Rust
- Go
- Kubernetes (K3S)
- OpenFaaS
- EnosLib
This repo makes extensive use of `just` as a powerful CLI facilitator.
Here is an overview of the content of this repo:

```
.
├── testbed            # EnosLib code to interact with Grid'5000 (build + deployment of a live environment at true scale)
├── manager            # code of the marketplace and the fog_node
├── iot_emulation      # sends requests to fog nodes in the experiments to measure their response time, etc.
└── openfaas-functions # code of the fog FaaS functions
```
- Install Nix. Nix portable is not officially supported; however, it may be enough to explore the project without installing Nix.
- Append to `/etc/nix/nix.conf`:

  ```
  extra-experimental-features = nix-command flakes
  max-jobs = auto
  cores = 0
  log-lines = 50
  builders-use-substitutes = true
  trusted-users = root <YOUR USERNAME>
  ```

  This enables commands such as `nix develop`, multithreading, bigger logs and the usage of the project's Cachix cache.
- `nix develop` (or configure direnv).
- All the usual commands can be found in the justfiles; just type `just --list`.
These commands work when `flake.nix` and `justfile` are present in the directory you are currently in.
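A quick smoke test that the setup works:

```shell
nix develop   # drops you into the project's dev shell (requires a flake.nix here)
just --list   # shows every recipe defined in the local justfile
```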
This section concerns the OpenFaaS functions, the code for the fog nodes under `manager/`, the code for `iot_emulation`, and the code for the proxy.
Usually, `nix develop` gets you started. I code using VS Code, so the Direnv extension will make VS Code use all the applications/environment loaded in the `nix develop` shell; e.g., you can use the Rust/Golang LSP servers/toolchains inside VS Code without ever “installing” them on the computer.
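For example, a minimal setup (assuming direnv together with nix-direnv support is installed) might be:

```shell
# In a directory containing flake.nix (e.g. manager/):
echo "use flake" > .envrc   # tell direnv to load the flake's dev shell
direnv allow                # trust the .envrc; VS Code's Direnv extension then picks it up
```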
VM details are located in `testbed/iso/flake.nix`; there lies the whole configuration of the experimental VM. In `testbed/flake.nix` you can also see the VM used to deploy the code on Grid’5000.
To start the same VM as in the experiments, go to `testbed/iso` and enter a `nix develop`. Then starting the VM is a matter of `just vm`. Once the VM is started, you can connect to it with `just ssh-in`. This process is used to start the VM used to develop the fog node software.
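Putting those steps together:

```shell
cd testbed/iso
nix develop   # enter the dev shell defined by testbed/iso/flake.nix
just vm       # start the experimental VM
# once the VM is up (possibly from a second terminal):
just ssh-in   # connect to the running VM over SSH
```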
This uses the exact same VMs as the previous section. To generate the VM disk and send it to Grid’5000, enter `just upload <rennes|nancy|...>`; the disk will be uploaded.
Then go to `testbed`. Once this is done, you can configure the experiments in `.env`; `.experiments.env` handles the variations across multiple runs. These values are used by `integration.py`, which handles the deployment, and `definitions.py`, which defines the fog network and some Kubernetes configurations.
Then, `just upload` will rsync both the master’s VM and the previously described files to the configured Grid’5000 cluster. Finally, `just master_exec <ghcr.io username> <experiment name>` will start the experiment as configured. Do not forget to make the ghcr.io image public; the different parts of GIRAFF use labels so that only one image has to be public.
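Putting the deployment steps together (`rennes` is one example site; substitute your target site and the placeholders):

```shell
(cd testbed/iso && just upload rennes)   # generate the VM disk and upload it to the chosen Grid'5000 site

cd testbed
# edit .env and .experiments.env to configure the runs
just upload                              # rsync the master's VM and the experiment files to the cluster
just master_exec <ghcr.io username> <experiment name>   # start the configured experiment
```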
Note that `.experiments.env` contains some options to gracefully handle failures; since I use GNU Parallel, one can “resume” a job that had some failures before.
In the end, the experimental results will be available on the cluster in `metrics-arks`. You can retrieve them with the command `just get_metrics_back`, which will download them into the local `metrics-arks` folder.
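For instance (assuming the recipe lives in `testbed` next to the other deployment recipes):

```shell
cd testbed
just get_metrics_back   # downloads the results from the cluster into ./metrics-arks
```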
Azure FaaS traces were released in 2020. Those traces have been characterized by probabilistic laws [Hernod], described as follows:
- execution times follow a highly variable log-normal law;
- functions live in the range of milliseconds to minutes;
- functions are billed at a millisecond granularity;
- the median execution time is 600 ms;
- the 99th-percentile execution time is more than 140 seconds;
- 0.6% of functions account for 90% of total invocations (they also test a balanced multi-function workload, where each function receives the same load);
- arrivals follow an open-loop Poisson law;
- the arrival burstiness index is -0.26;
- the total number of invocations does not vary much;
- invocations follow diurnal patterns, as the Cloud does too;
- functions are busy-spun for the execution duration, repeating a timed math operation over and over.
In their simulations/experiments, they state they use:
- mu = -0.38
- sigma = 2.36

For their experiments, they use MSTrace and select a subset of the real traces to be re-run on the FaaS platform. Functions are given 256 MB of RAM.
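As a quick sanity check of these parameters (my own back-of-the-envelope computation, not taken from the paper), the log-normal median and 99th percentile come out as:

```math
\mathrm{median} = e^{\mu} = e^{-0.38} \approx 0.68\ \mathrm{s}, \qquad
p_{99} = e^{\mu + z_{0.99}\,\sigma} = e^{-0.38 + 2.326 \times 2.36} \approx 165\ \mathrm{s}
```

which is consistent with the reported ~600 ms median and >140 s 99th percentile.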
- A Student’s t distribution is used for function latency requirements (as we know neither the real sample size nor the standard deviation). The distribution is divided into three buckets: functions that do not care about latency (the threshold t is at 10 s), normal functions that want their responses back in a usable time (t = 150 ms), and low-latency functions (t = 15 ms). The two extrema make up 5% of the total number of functions. We use df = 10 with -2.25 and 2.25 as the cutoffs for the extrema.
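Checking those cutoffs against the stated 5% (again my own computation, not from the source material):

```math
P\left(\lvert T_{10} \rvert > 2.25\right) = 2\,P\left(T_{10} > 2.25\right) \approx 2 \times 0.024 \approx 4.8\%
```

so the two extreme buckets together cover roughly 5% of the functions, matching the text.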
Our own functions are:
- `echo` (Rust): simply registers the arrival time of the requests in Influx; it can forward the request to a next function.

We borrowed functions from EdgeFaaSBench:
- `speech-to-text` (Python)
This flake has two modes: `just labExport` will start a JupyterLab with LaTeX support and TikZ export for the graphs; `just lab` starts a lighter version without LaTeX and TikZ.
Then, data analysis is done using R inside the JupyterLab server.
With article submissions, you can find the raw data in the Releases page. Take the latest, extract it into `testbed` under a directory named `metrics-arks`, then run `just lab` and you will be able to explore the data. Please note that this process is heavy on the CPU and especially the RAM; I used some systemd magic to prevent my computer from using too much RAM and to kill the program if it does.
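For instance (the archive name below is illustrative; take the latest release asset):

```shell
cd testbed
mkdir -p metrics-arks
tar -xf ~/Downloads/metrics-arks.tar.gz -C metrics-arks   # hypothetical archive name
just lab   # then open the printed JupyterLab URL and explore the data with R
```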
To develop locally, here are the simple steps to get started (consolidated in the sketch after this list):
- start the VM: `cd testbed/iso; just vm`
- start the iot_emulation: `cd iot_emulation; just run`
- start the manager (fog node && market): `cd manager; just run <ip: local IP (not localhost)>`
- upload the functions to the registry: `cd openfaas-functions; just`
- run parts of the experimental configuration: `cd manager; just expe <ip: same as before>`
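Put together, with `192.168.1.42` standing in for your machine’s LAN IP (a placeholder; do not use localhost):

```shell
# Run each long-running step in its own terminal.
(cd testbed/iso && just vm)             # 1. start the development VM
(cd iot_emulation && just run)          # 2. start iot_emulation
(cd manager && just run 192.168.1.42)   # 3. start the fog node and the market (use your LAN IP)
(cd openfaas-functions && just)         # 4. upload the functions to the registry
(cd manager && just expe 192.168.1.42)  # 5. run parts of the experimental configuration
```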
Tip: on Grid’5000, you can paste a Tailscale auth token (ephemeral; reusable) into `~/tailscale_authkey` in your home directory to automatically connect newly spawned VMs to your tailnet. That way you can, for example, access Jaeger to see the logs from the comfort of your web browser.
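For example (the key value is a placeholder; generate your own from the Tailscale admin console):

```shell
# On the Grid'5000 frontend; tskey-auth-XXXX is a placeholder for your own key.
echo 'tskey-auth-XXXX' > ~/tailscale_authkey
chmod 600 ~/tailscale_authkey   # keep the token private
```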
Most of the flakes are compatible with both Linux and macOS. However, packages targeting Linux (like the VM) can only be generated from Linux. This could be extended to enable full cross-platform support.
Please open an issue or contact me from the info in my GitHub profile so that I may be of assistance.
This project is licensed under the MIT license.
See LICENSE for more information.
Thanks to these awesome resources that were used during the development of this project:
- GNU Parallel
- EnosLib
- Grid’5000
- TCP latency/throughput estimation: T. J. Hacker, B. D. Athey, and B. Noble, “The end-to-end performance effects of parallel TCP sockets on a lossy wide-area network,” in Proceedings of the 16th International Parallel and Distributed Processing Symposium, Ft. Lauderdale, FL: IEEE, 2002, doi: 10.1109/IPDPS.2002.1015527; and J. Bolliger, T. Gross, and U. Hengartner, “Bandwidth modeling for network-aware applications,” in INFOCOM ’99, March 1999.