rewrite snowflake quickstart guide #83

Merged on Jun 24, 2025 (7 commits)
Binary file added content/en/getting-started/.DS_Store
Binary file not shown.
239 changes: 176 additions & 63 deletions content/en/getting-started/quickstart/index.md
@@ -9,22 +9,23 @@ description: Get started with LocalStack for Snowflake in a few simple steps

## Introduction

This guide explains how to set up the Snowflake emulator and develop a Python program using the Snowflake Connector for Python (`snowflake-connector-python`) to interact with emulated Snowflake running on your local machine.
This guide explains how to set up the Snowflake emulator and use the Snowflake CLI to interact with Snowflake resources running on your local machine. You'll learn how to create a Snowflake database, schema, and table, upload data to a stage, and load that data into the table. The quickstart is designed to familiarize you with the Snowflake emulator and its capabilities.

## Prerequisites

- [`localstack` CLI](https://docs.localstack.cloud/getting-started/installation/#localstack-cli)
- [LocalStack for Snowflake]({{< ref "installation" >}})
- Python 3.10 or later
- [`snowflake-connector-python` library](https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-install)
- [`localstack` CLI](https://docs.localstack.cloud/getting-started/installation/#localstack-cli)
- [Snowflake CLI]({{< ref "user-guide/integrations/snow-cli" >}})

LocalStack for Snowflake works with popular Snowflake integrations to run your SQL queries. This guide uses the [Snowflake CLI]({{< ref "user-guide/integrations/snow-cli" >}}), but you can also use [SnowSQL]({{< ref "user-guide/integrations/snowsql" >}}), [DBeaver]({{< ref "user-guide/integrations/dbeaver" >}}) or the [LocalStack Web Application]({{< ref "user-guide/user-interface" >}}) for this purpose.

## Instructions

Before you begin, pull the Snowflake emulator image (`localstack/snowflake`) and start the container:

{{< command >}}
$ export LOCALSTACK_AUTH_TOKEN=<your_auth_token>
$ IMAGE_NAME=localstack/snowflake:latest localstack start
$ localstack start --stack snowflake
{{< / command >}}

Check the emulator's availability by running:
@@ -36,89 +37,193 @@ $ curl -d '{}' snowflake.localhost.localstack.cloud:4566/session
</disable-copy>
{{< / command >}}
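
The same availability check can be scripted. The following is a minimal sketch using only Python's standard library; the `emulator_is_up` helper is hypothetical, and it assumes that a healthy emulator answers the `/session` endpoint with HTTP 200 over plain HTTP on port 4566.

```python
import urllib.request

# Same endpoint as the curl command above (curl defaults to plain HTTP).
SESSION_URL = "http://snowflake.localhost.localstack.cloud:4566/session"


def emulator_is_up(url=SESSION_URL, timeout=5):
    """Return True if a POST with an empty JSON body gets HTTP 200 (assumed healthy)."""
    request = urllib.request.Request(url, data=b"{}", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, DNS failures, timeouts, and HTTP errors.
        return False
```

Call `emulator_is_up()` after `localstack start` has finished; a `False` result usually means the container is still starting or is not reachable.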

### Connect to the Snowflake emulator
In this quickstart, we'll create a student records database that demonstrates how to:

- Create databases, schemas, and tables
- Create stages and upload data using the PUT command
- Load data from CSV files into tables
- Query your data

Create a new Python file named `main.py` and use the following code to connect to the Snowflake emulator:
### Create database, schema & table

```python
import snowflake.connector as sf
Create the Snowflake database named `STUDENT_RECORDS_DEMO` and use it:

sf_conn_obj = sf.connect(
    user="test",
    password="test",
    account="test",
    database="test",
    host="snowflake.localhost.localstack.cloud",
)
```sql
CREATE DATABASE IF NOT EXISTS STUDENT_RECORDS_DEMO;
USE DATABASE STUDENT_RECORDS_DEMO;
```

Specify the `host` parameter as `snowflake.localhost.localstack.cloud` and the other parameters as `test` to avoid connecting to the real Snowflake instance.
The output should be:

### Create and execute a query
```bash
+-----------------------------------------------------+
| status |
|-----------------------------------------------------|
| Database STUDENT_RECORDS_DEMO successfully created. |
+-----------------------------------------------------+
```

Extend the Python program to insert rows from a list object into the emulated Snowflake table. Create a cursor object and execute the query:
Create a Snowflake schema named `PUBLIC` and use it:

```python
print("1. Insert lot of rows from a list object to Snowflake table")
print("2. Creating a cursor object")
sf_cur_obj = sf_conn_obj.cursor()
```sql
CREATE SCHEMA IF NOT EXISTS PUBLIC;
USE SCHEMA PUBLIC;
```

print("3. Executing a query on cursor object")
try:
    sf_cur_obj.execute(
        "create or replace table "
        "ability(name string, skill string )")
The output should be:

    rows_to_insert = [('John', 'SQL'), ('Alex', 'Java'), ('Pete', 'Snowflake')]

    sf_cur_obj.executemany(
        " insert into ability (name, skill) values (%s,%s) " ,rows_to_insert)
```bash
+---------------------------------------------+
| result |
|---------------------------------------------|
| public already exists, statement succeeded. |
+---------------------------------------------+
```

    sf_cur_obj.execute("select name, skill from ability")
Finally, create the `STUDENT_DATA` table in the database:

```sql
CREATE OR REPLACE TABLE STUDENT_DATA (
    student_id VARCHAR(50),
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    email VARCHAR(200),
    enrollment_date DATE,
    gpa FLOAT,
    major VARCHAR(100)
);
```

    print("4. Fetching the results")
    result = sf_cur_obj.fetchall()
    print("Total # of rows :" , len(result))
    print("Row-1 =>",result[0])
    print("Row-2 =>",result[1])
finally:
    sf_cur_obj.close()
The output should be:

```bash
+------------------------------------------+
| status |
|------------------------------------------|
| Table STUDENT_DATA successfully created. |
+------------------------------------------+
```

This program creates a table named `ability`, inserts rows, and fetches the results.
### Create file format & stage

### Run the Python program
Now, create a file format for CSV files:

Execute the Python program with:
```sql
CREATE OR REPLACE FILE FORMAT csv_format
    TYPE = CSV
    FIELD_DELIMITER = ','
    SKIP_HEADER = 1
    NULL_IF = ('NULL', 'null')
    EMPTY_FIELD_AS_NULL = TRUE;
```

{{< command >}}
$ python main.py
{{< / command >}}
The output should be:

```bash
+----------------------------------------------+
| status |
|----------------------------------------------|
| File format CSV_FORMAT successfully created. |
+----------------------------------------------+
```
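
To make the file format options more concrete, here is a minimal Python sketch of how a loader might interpret `SKIP_HEADER`, `NULL_IF`, and `EMPTY_FIELD_AS_NULL`. It is a local illustration only, not the emulator's implementation; the `parse_with_format` helper and the sample text are hypothetical.

```python
import csv
import io


def parse_with_format(text, skip_header=1, null_if=("NULL", "null"),
                      empty_field_as_null=True, delimiter=","):
    """Parse CSV text while mimicking the csv_format options (illustrative only)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for index, record in enumerate(reader):
        if index < skip_header:
            continue  # SKIP_HEADER = 1 drops the first line
        cleaned = []
        for field in record:
            # NULL_IF markers and (optionally) empty fields become None
            if field in null_if or (empty_field_as_null and field == ""):
                cleaned.append(None)
            else:
                cleaned.append(field)
        rows.append(cleaned)
    return rows


# Hypothetical three-column sample, not the quickstart data set
sample = "student_id,first_name,gpa\nS001,John,3.75\nS002,,NULL\n"
print(parse_with_format(sample))
```

In this sketch, the header line is skipped, and both the empty `first_name` and the literal `NULL` in the second record are turned into `None`.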

Create a stage for uploading files:

```sql
CREATE OR REPLACE STAGE student_data_stage
FILE_FORMAT = csv_format;
```

The output should be:

```bash
+-----------------------------------------------------+
| ?COLUMN? |
|-----------------------------------------------------|
| Stage area STUDENT_DATA_STAGE successfully created. |
+-----------------------------------------------------+
```

### Upload and load sample data

Create a new file named `student_data.csv` with sample student records:

```csv
student_id,first_name,last_name,email,enrollment_date,gpa,major
S001,John,Smith,john.smith@university.edu,2023-08-15,3.75,Computer Science
S002,Alice,Johnson,alice.johnson@university.edu,2023-08-15,3.92,Mathematics
S003,Bob,Williams,bob.williams@university.edu,2022-08-15,3.45,Engineering
S004,Carol,Brown,carol.brown@university.edu,2024-01-10,3.88,Physics
S005,David,Davis,david.davis@university.edu,2023-08-15,2.95,Biology
```
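
If you would rather generate the file from a script, the following minimal sketch writes the same records with Python's standard `csv` module. The `write_student_csv` helper is hypothetical, and the file is written to the current working directory.

```python
import csv

HEADER = ["student_id", "first_name", "last_name", "email",
          "enrollment_date", "gpa", "major"]

# Sample records copied from the CSV listing above
ROWS = [
    ["S001", "John", "Smith", "john.smith@university.edu", "2023-08-15", "3.75", "Computer Science"],
    ["S002", "Alice", "Johnson", "alice.johnson@university.edu", "2023-08-15", "3.92", "Mathematics"],
    ["S003", "Bob", "Williams", "bob.williams@university.edu", "2022-08-15", "3.45", "Engineering"],
    ["S004", "Carol", "Brown", "carol.brown@university.edu", "2024-01-10", "3.88", "Physics"],
    ["S005", "David", "Davis", "david.davis@university.edu", "2023-08-15", "2.95", "Biology"],
]


def write_student_csv(path="student_data.csv"):
    """Write the header and sample records using Unix line endings."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(ROWS)
    return path


write_student_csv()
```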

Upload the CSV file to the stage using the PUT command:

```sql
PUT file://student_data.csv @student_data_stage AUTO_COMPRESS=TRUE;
```

{{< alert title="Note" >}}
Adjust the file path to the location of your `student_data.csv` file.
{{< /alert >}}

The output should show the file upload status:

```bash
source |target |source_size|target_size|source_compression|target_compression|status |message|
----------------+-------------------+-----------+-----------+------------------+------------------+--------+-------+
student_data.csv|student_data.csv.gz| 425| 262|NONE |GZIP |UPLOADED| |
```

Now load the data from the stage into the table:

```sql
COPY INTO STUDENT_DATA
FROM @student_data_stage
ON_ERROR = 'CONTINUE';
```
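
The load step can also be driven from a script instead of the Snowflake CLI. The sketch below assumes `snowflake-connector-python` is installed and reuses the connection parameters from the earlier, Python-based version of this guide (all `test`, with the emulator host); the `run_statements` helper and `LOAD_STATEMENTS` list are hypothetical names, and the stage and table are assumed to exist already.

```python
LOAD_STATEMENTS = [
    "USE DATABASE STUDENT_RECORDS_DEMO",
    "USE SCHEMA PUBLIC",
    "COPY INTO STUDENT_DATA FROM @student_data_stage ON_ERROR = 'CONTINUE'",
]


def run_statements(statements=LOAD_STATEMENTS):
    """Execute each statement against the local emulator and collect the results."""
    # Imported lazily so the module can be loaded without the connector installed.
    import snowflake.connector as sf

    conn = sf.connect(
        user="test",
        password="test",
        account="test",
        database="test",
        host="snowflake.localhost.localstack.cloud",  # emulator, not a real account
    )
    try:
        cursor = conn.cursor()
        try:
            return [cursor.execute(sql).fetchall() for sql in statements]
        finally:
            cursor.close()
    finally:
        conn.close()
```

Call `run_statements()` once the emulator is running and the file has been uploaded to the stage; it returns one list of rows per statement.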

### Verify data loading

```sql
USE DATABASE STUDENT_RECORDS_DEMO;
USE SCHEMA PUBLIC;

SELECT COUNT(*) as total_students FROM STUDENT_DATA;
```

The output should be:

```bash
Insert lot of rows from a list object to Snowflake table
1. Insert lot of rows from a list object to Snowflake table
2. Creating a cursor object
3. Executing a query on cursor object
4. Fetching the results
Total # of rows : 3
Row-1 => ('John', 'SQL')
Row-2 => ('Alex', 'Java')
+----------------+
| TOTAL_STUDENTS |
|----------------|
| 5 |
+----------------+
```

Similarly, you can query the student details based on their GPA:

```sql
SELECT first_name, last_name, major, gpa
FROM STUDENT_DATA
WHERE gpa >= 3.8
ORDER BY gpa DESC;
```

Verify the results by navigating to the LocalStack logs:
The output should be:

```bash
2024-02-22T06:03:13.627 INFO --- [ asgi_gw_0] localstack.request.http : POST /session/v1/login-request => 200
2024-02-22T06:03:16.122 WARN --- [ asgi_gw_0] l.packages.core : postgresql will be installed as an OS package, even though install target is _not_ set to be static.
2024-02-22T06:03:45.917 INFO --- [ asgi_gw_0] localstack.request.http : POST /queries/v1/query-request => 200
2024-02-22T06:03:46.016 INFO --- [ asgi_gw_1] localstack.request.http : POST /queries/v1/query-request => 200
2024-02-22T06:03:49.361 INFO --- [ asgi_gw_0] localstack.request.http : POST /queries/v1/query-request => 200
2024-02-22T06:03:49.412 INFO --- [ asgi_gw_1] localstack.request.http : POST /session => 200
FIRST_NAME|LAST_NAME|MAJOR |GPA |
----------+---------+-----------+----+
Alice |Johnson |Mathematics|3.92|
Carol |Brown |Physics |3.88|
```
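
As a sanity check on the expected result, the same filter can be reproduced locally on the sample rows. This is a minimal sketch that mirrors the `WHERE` and `ORDER BY` clauses in plain Python; the `top_students` helper is hypothetical and does not talk to the emulator.

```python
# Sample rows from student_data.csv: (first_name, last_name, major, gpa)
students = [
    ("John", "Smith", "Computer Science", 3.75),
    ("Alice", "Johnson", "Mathematics", 3.92),
    ("Bob", "Williams", "Engineering", 3.45),
    ("Carol", "Brown", "Physics", 3.88),
    ("David", "Davis", "Biology", 2.95),
]


def top_students(rows, min_gpa=3.8):
    """Mirror WHERE gpa >= min_gpa ORDER BY gpa DESC."""
    selected = [row for row in rows if row[3] >= min_gpa]
    return sorted(selected, key=lambda row: row[3], reverse=True)


for first, last, major, gpa in top_students(students):
    print(first, last, major, gpa)
# prints:
# Alice Johnson Mathematics 3.92
# Carol Brown Physics 3.88
```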

Optionally, you can also query your Snowflake resources and data using the LocalStack Web Application, which provides a **Worksheet** tab to run your SQL queries.

<img src="snowflake-web-ui.png" alt="Running SQL queries using LocalStack Web Application" width="900"/>

### Destroy the local infrastructure

To stop LocalStack and remove locally created resources, use:
@@ -127,9 +232,17 @@ To stop LocalStack and remove locally created resources, use:
$ localstack stop
{{< / command >}}

LocalStack is ephemeral and doesn't persist data across restarts. It runs inside a Docker container, and once it’s stopped, all locally created resources are automatically removed. In a future release of the Snowflake emulator, we will provide proper persistence and integration with our [Cloud Pods](https://docs.localstack.cloud/user-guide/state-management/cloud-pods/) feature as well.
LocalStack is ephemeral and doesn't persist data across restarts. It runs inside a Docker container, and once it's stopped, all locally created resources are automatically removed. To persist the state of your LocalStack for Snowflake instance, please check out our guide on [State Management]({{< ref "user-guide/state-management" >}}).

## Next Steps

Now that you've completed the quickstart, here are some additional features you can explore:

- **Load data from cloud storage**: Load data through our [Storage Integrations]({{< ref "user-guide/storage-integrations" >}}) (currently supporting AWS S3) or with a script (see [Snowflake Drivers]({{< ref "user-guide/snowflake-drivers" >}})).
- **Automate data ingestion**: Configure [Snowpipe]({{< ref "user-guide/snowpipe" >}}) for automated data ingestion from external sources.
- **Use your favorite tools**: Keep working with your favorite tools while developing on LocalStack for Snowflake locally. See [Integrations]({{< ref "user-guide/integrations" >}}).

## Next steps
## Further Reading

You can now explore the following resources to learn more about the Snowflake emulator:
