Physical properties of the surface layer of the soil. Continuous Raster Layer of organic carbon content, in g/kg, with a resolution of 500m.
Soil organic carbon is a measureable component of soil organic matter. Sequestering carbon in SOC is seen as one way to mitigate climate change by reducing atmospheric carbon dioxide. The argument is that small increases of SOC over very large areas will significantly reduce net carbon dioxide emissions from agriculture. For the reduction to be long-lasting, organic matter would have to be in the more stable or resistant fractions.
Dataset: https://esdac.jrc.ec.europa.eu/themes/topsoil-soil-organic-carbon-lucas
The ingestion workflow is based on a microservice pipeline developed in python and orchestrated using the Nuclio serverless framework.
These microservices are scalable, highly available and communicate with each other using a Kafka broker/topic enabling the distribution of process workload with buffering and replay capabilities.
The source code of the microservices involved in the pipeline can be found in the ./pipelines/esdacdirectory:
- 00-http-handler: expose an API to trigger the ingestion pipeline. It forges a download command from an http POST request and send it to next microservice (01-http-processor).
- 01-http-processor: upon receipt of a download command, it downloads and ingests the submitted GeoTIFF archive and sends each pixel as a GeoJSON feature to the next microservice (02-http-ingestion).
- 02-http-ingestion: upon receipt of a GeoJSON feature, a reprojection to the EPSG:4326 CRS is performed before upserting the feature in a MongoDB collection.
The 00-boostrap.yaml file contains the declarative configuration of two Kafka topics and the soc Mongodb collection with a 2dsphere index on the geometry field.
The first step is to encapsulate the microservices in Docker containers. The actions of building, tagging and publishing Docker images can be performed using the provided Makefiles.
For each microservice (00-http-handle,01-http-processor and 02-http-ingestion) execute the following bash commands:
- make buildto build the container image
- make tagto tag the newly built container image
- make publishto publish the container image on your own Docker repository.
The repository, name and tag of each Docker container image can be overridden using the environment variables DOCKER_REPO, IMAGE_NAME and IMAGE_TAG:
$> make DOCKER_REPO=index.docker.io IMAGE_NAME=eufast/soc-pipe-http-trigger IMAGE_TAG=0.2.0 tagThe Nuclio Serveless framework provides a simple way to describe and deploy microservices (seen as functions) on top of platforms like Kubernetes. In our case, Nuclio handles the subscription to the kafka topics and the execution of the function upon receipt of messages.
A basic configuration is provided and works as is. Don't forget to specify the Docker image created previously. Many configurations are available as the number of replicas, environment variables, resources and triggers.
apiVersion: nuclio.io/v1beta1
kind: Function
metadata:
  name: soc-pipe-http-trigger
  namespace: fast-platform
  labels:
    platform: fast
    module: core
    data: soc
spec:
  alias: latest
  description: Forge a download command from an http POST request
  handler: main:handler
  image: eufast/soc-pipe-http-trigger:0.1.0
  replicas: 1
  maxReplicas: 3
  targetCPU: 80
  runtime: python:3.6
  env:
  - name: KAFKA_BOOTSTRAP_SERVER
    value: "kafka-broker.kafka:9092"
  - name: TARGET_TOPIC
    value: soc-pipe-download
  resources:
    requests:
      cpu: 10m
      memory: 64Mi
    limits:
      cpu: 1
      memory: 1Gi 
  triggers:
    http:
      attributes:
        ingresses:
          "dev":
            host: api.fast.sobloo.io
            paths:
            - /v1/fast/data/soc
      kind: http
      maxWorkers: 5
  version: -1
status:
  state: waitingForResourceConfigurationPlease refer to the official Nuclio documentation for more information.
To deploy the pipeline on Kubernetes, apply the following YAML manifests:
$> kubectl create -f 00-bootstrap.yaml$> cd pipelines/esdac$> kubectl create -f 00-http-trigger/function.yaml
$> kubectl create -f 01-archive-processor/function.yaml
$> kubectl create -f 02-datastore-ingestion/function.yamlThen check the status of Kubernetes pods:
$> kubectl -n fast-platform get pod -l module=core,data=soc --show-all
NAME                                          READY     STATUS      RESTARTS   AGE
bootstrap-soc-data-9mzx7                      0/2       Completed   0          31s
soc-pipe-archive-processor-597d946c65-bnsct   1/1       Running     0          3s
soc-pipe-http-trigger-75c6dfdb57-sf2db        1/1       Running     0          13s
soc-pipe-ingestion-74c6f788b-dql66            1/1       Running     0          6sTo trigger the ingestion of the data, execute a HTTP POST request as shown below using Postman:

