
maintained by dataroots

dataroots.png

Terraforming Snowflake

Part of Weather Nowcasting - weather prediction on an NVIDIA Jetson with a CI/CD pipeline on the cloud. This repository shows how Terraform can be used to set up our Snowflake resources. On top of that, we present how to load the contents of an Azure container into a Snowflake table in near real time.

General

This is part of the March 2022 graduation project of the rootsacademy: an application that predicts the weather up to 4 hours into the future based on measurements obtained from a sensor installed on the roof of Dataroots' building. These predictions are based on real-time measurements of temperature, humidity, pressure, light and sound, together with a camera pointed towards the sky. We also wanted to practice building the CI/CD pipeline and the connection between the edge device and the cloud. More information can also be found here.

Cloud architecture

Here's an overview of our cloud architecture:

  • Provisioning the required resources and environments
  • Managing the flow of sensor data from the edge device into the IoT hub on Azure
  • Connecting the IoT endpoint (blob container) to Snowflake using Azure Event Grid
  • Managing the flow of images from the device into a blob storage container on Azure and from the storage container into the ML workspace
  • Developing a CI/CD pipeline that automates the build and deployment of a model on an edge device

architecture.png

Resources

Snowflake

  • Database
  • Warehouse
  • Schema
  • Tables
    • One for sensor information
    • One for predictions
  • File format
  • Storage & notification integration
  • Pipes (one for each of the tables)
  • Stages (one for each of the tables)
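As a rough illustration, the sketch below shows how these objects could be declared with the public Snowflake Terraform provider (Snowflake-Labs/snowflake, formerly chanzuckerberg/snowflake). The object names, placeholder URLs and variables are assumptions and may differ from this module's actual code:

```hcl
# Sketch only: object names, placeholders and variables are assumptions.

resource "snowflake_database" "weather" {
  name = "WEATHER_DB"
}

resource "snowflake_warehouse" "ingest" {
  name           = "INGEST_WH"
  warehouse_size = "XSMALL"
  auto_suspend   = 60
}

resource "snowflake_schema" "sensors" {
  database = snowflake_database.weather.name
  name     = "SENSORS"
}

# One table for sensor readings; the predictions table is analogous.
resource "snowflake_table" "sensor_data" {
  database = snowflake_database.weather.name
  schema   = snowflake_schema.sensors.name
  name     = "SENSOR_DATA"

  column {
    name = "RECORDED_AT"
    type = "TIMESTAMP_NTZ"
  }
  column {
    name = "PAYLOAD"
    type = "VARIANT"
  }
}

resource "snowflake_file_format" "json" {
  name        = "JSON_FORMAT"
  database    = snowflake_database.weather.name
  schema      = snowflake_schema.sensors.name
  format_type = "JSON"
}

# Lets Snowflake read from the Azure blob container.
resource "snowflake_storage_integration" "azure" {
  name                      = "AZURE_STORAGE_INTEGRATION"
  type                      = "EXTERNAL_STAGE"
  storage_provider          = "AZURE"
  enabled                   = true
  azure_tenant_id           = var.tenant_id # assumed variable
  storage_allowed_locations = ["azure://<account>.blob.core.windows.net/sensor-data"]
}

# Forwards Azure Event Grid "blob created" notifications to Snowpipe.
resource "snowflake_notification_integration" "events" {
  name                            = "AZURE_EVENT_INTEGRATION"
  enabled                         = true
  type                            = "QUEUE"
  notification_provider           = "AZURE_STORAGE_QUEUE"
  azure_storage_queue_primary_uri = var.storage_queue_uri # assumed variable
  azure_tenant_id                 = var.tenant_id
}

# External stage pointing at the container (one per table).
resource "snowflake_stage" "sensor_data" {
  name                = "SENSOR_STAGE"
  database            = snowflake_database.weather.name
  schema              = snowflake_schema.sensors.name
  url                 = "azure://<account>.blob.core.windows.net/sensor-data"
  storage_integration = snowflake_storage_integration.azure.name
}

# Snowpipe that auto-ingests new files from the stage (one per table).
resource "snowflake_pipe" "sensor_data" {
  name        = "SENSOR_PIPE"
  database    = snowflake_database.weather.name
  schema      = snowflake_schema.sensors.name
  auto_ingest = true
  integration = snowflake_notification_integration.events.name
  copy_statement = <<-SQL
    COPY INTO ${snowflake_database.weather.name}.${snowflake_schema.sensors.name}.${snowflake_table.sensor_data.name}
    FROM @${snowflake_database.weather.name}.${snowflake_schema.sensors.name}.${snowflake_stage.sensor_data.name}
  SQL
}
```

With this wiring, Azure Event Grid publishes blob-created events to a storage queue, the notification integration forwards them to Snowpipe, and the pipe's COPY statement loads the new files from the stage into the table.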

Edge & ML resources:

  • IoT hub, along with its dependencies, e.g. azure container registry
  • Storage accounts
    • One for storing the images sent from the IoT device
    • One for storing the sensor data
    • One for storing the predictions
  • Machine learning workspace, along with its dependencies, e.g. key vault, application insights
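A similarly rough sketch of the corresponding azurerm resources; the names, SKUs and resource group below are illustrative assumptions, and only one of the three storage accounts is shown:

```hcl
# Sketch only: illustrative names and SKUs, not the module's actual definitions.

resource "azurerm_resource_group" "weather" {
  name     = "rg-weather-nowcasting"
  location = "westeurope"
}

# IoT hub receiving sensor messages from the edge device.
resource "azurerm_iothub" "edge" {
  name                = "iot-weather-nowcasting"
  resource_group_name = azurerm_resource_group.weather.name
  location            = azurerm_resource_group.weather.location

  sku {
    name     = "S1"
    capacity = 1
  }
}

# Container registry used for the edge deployment.
resource "azurerm_container_registry" "edge" {
  name                = "crweathernowcasting"
  resource_group_name = azurerm_resource_group.weather.name
  location            = azurerm_resource_group.weather.location
  sku                 = "Basic"
}

# One storage account per data flow (images, sensor data, predictions);
# only the sensor-data account is shown here, the others are analogous.
resource "azurerm_storage_account" "sensor_data" {
  name                     = "stweathersensordata"
  resource_group_name      = azurerm_resource_group.weather.name
  location                 = azurerm_resource_group.weather.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "azurerm_storage_container" "sensor_data" {
  name                  = "sensor-data"
  storage_account_name  = azurerm_storage_account.sensor_data.name
  container_access_type = "private"
}

# ML workspace plus the dependencies it requires.
resource "azurerm_application_insights" "ml" {
  name                = "appi-weather-nowcasting"
  resource_group_name = azurerm_resource_group.weather.name
  location            = azurerm_resource_group.weather.location
  application_type    = "web"
}

resource "azurerm_key_vault" "ml" {
  name                = "kv-weather-nowcasting"
  resource_group_name = azurerm_resource_group.weather.name
  location            = azurerm_resource_group.weather.location
  tenant_id           = var.tenant_id # assumed variable
  sku_name            = "standard"
}

resource "azurerm_machine_learning_workspace" "ml" {
  name                    = "mlw-weather-nowcasting"
  resource_group_name     = azurerm_resource_group.weather.name
  location                = azurerm_resource_group.weather.location
  application_insights_id = azurerm_application_insights.ml.id
  key_vault_id            = azurerm_key_vault.ml.id
  storage_account_id      = azurerm_storage_account.sensor_data.id

  identity {
    type = "SystemAssigned"
  }
}
```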

CI/CD actions

In this project we validate and plan our Terraform code as part of a GitHub workflow; more specifically, we execute the following four steps:

  • Init command: terraform init -backend-config backend/staging.tfvars
  • Format command: terraform fmt -check -recursive
  • Validate command: terraform validate
  • Plan command: terraform plan

The above actions are triggered on a pull request; once the pull request has been reviewed, approved and closed, a new workflow runs and the changes are applied in production: terraform apply -var-file vars/prod.tfvars
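The backend and variable files referenced by these commands are ordinary Terraform configuration; the sketch below shows what they could contain, assuming remote state in an Azure storage account (all names and variables are assumptions, not the module's actual files):

```hcl
# backend.tf -- sketch assuming remote state in an Azure storage account
terraform {
  backend "azurerm" {}
}

# backend/staging.tfvars -- passed to `terraform init -backend-config ...`
resource_group_name  = "rg-terraform-state"
storage_account_name = "stterraformstate"
container_name       = "tfstate"
key                  = "staging.terraform.tfstate"

# vars/prod.tfvars -- passed to `terraform apply -var-file ...`
environment       = "prod"
snowflake_account = "xy12345"
snowflake_region  = "west-europe.azure"
```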

Prerequisites

Getting started

We explain everything in our blog post. Briefly, you can do the following:

  • Find your Snowflake account locator and region ID
  • Create an RSA key pair
  • Create a service user for Terraform to authenticate with
  • Set up the resources
  • Grant access to storage containers
  • Verify/apply changes
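Once the account locator, region and service user are known, the provider can be configured to authenticate with the RSA key pair. A minimal sketch, assuming the public Snowflake provider; the variable and user names are illustrative assumptions:

```hcl
terraform {
  required_providers {
    snowflake = {
      # "chanzuckerberg/snowflake" in older provider versions
      source = "Snowflake-Labs/snowflake"
    }
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

# Key-pair authentication with the service user created for Terraform.
provider "snowflake" {
  account          = var.snowflake_account          # account locator, e.g. "xy12345"
  region           = var.snowflake_region           # region id, e.g. "west-europe.azure"
  username         = "TERRAFORM_SVC"                # assumed service user name
  role             = "SYSADMIN"
  private_key_path = var.snowflake_private_key_path # path to the RSA private key
}
```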

Acknowledgements

Special thanks go to our supervisors at Dataroots for guiding us throughout this project.

Contact
