Add Snowflake loader support with full-stack integration #380

Open
Copilot wants to merge 7 commits into main from copilot/add-support-for-snowflake

Conversation


Copilot AI commented Jan 21, 2026

Implementation Plan for Snowflake Loader

  • Add snowflake-connector-python dependency to Pipfile
  • Create snowflake_loader.py following the pattern of mysql_loader.py and postgres_loader.py
    • Implement SnowflakeLoader class inheriting from BaseLoader
    • Implement _execute_sample_query method with Snowflake-specific random sampling
    • Implement _serialize_value method for JSON serialization
    • Implement _parse_snowflake_url method to parse Snowflake connection URLs (see the parsing sketch after this list)
    • Implement load method to extract and load schema into graph
    • Implement extract_tables_info method
    • Implement extract_columns_info method
    • Implement extract_foreign_keys method
    • Implement extract_relationships method
    • Implement is_schema_modifying_query method
    • Implement refresh_graph_schema method (fixed async generator consumption)
    • Implement execute_sql_query method
  • Update schema_loader.py to support Snowflake URL detection
  • Update text2sql.py to support Snowflake database type detection
  • Create test_snowflake_loader.py following the pattern of existing loader tests
  • Install dependencies and verify code quality with pylint
  • Run linting to ensure code quality (all files pass with 10/10)
  • Address code review feedback and fix async generator issue
  • Run security scan (0 vulnerabilities found)
  • Create comprehensive documentation (docs/snowflake_loader.md)
  • SECURITY FIX: Update snowflake-connector-python to 3.13.2 (patched version)
  • SECURITY FIX: Fix SQL injection vulnerabilities with proper identifier validation and parameterization
  • FRONTEND: Add Snowflake support to DatabaseModal UI component
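
The plan above names a dedicated _parse_snowflake_url step. As a rough illustration of what that kind of parsing involves (not the merged implementation; the helper name, error messages, and defaults below are assumptions based on the URL format and defaults described elsewhere in this PR):

# Illustrative sketch only -- not the code merged in this PR.
from urllib.parse import urlparse, parse_qs, unquote


def parse_snowflake_url(url: str) -> dict:
    """Split snowflake://user:pass@account/database/schema?warehouse=... into connector kwargs."""
    parsed = urlparse(url)
    if parsed.scheme != "snowflake":
        raise ValueError(f"Not a Snowflake URL: {url!r}")
    path_parts = [part for part in parsed.path.split("/") if part]
    if not path_parts:
        raise ValueError("Snowflake URL must include a database name")
    query = parse_qs(parsed.query)
    return {
        "user": unquote(parsed.username or ""),
        "password": unquote(parsed.password or ""),
        "account": parsed.hostname,                                    # Snowflake account identifier
        "database": path_parts[0],
        "schema": path_parts[1] if len(path_parts) > 1 else "PUBLIC",  # default described later in this PR
        "warehouse": query.get("warehouse", ["COMPUTE_WH"])[0],        # default described later in this PR
    }

For example, parse_snowflake_url("snowflake://user:pass@myorg-account/ANALYTICS?warehouse=WH") would yield schema "PUBLIC" and warehouse "WH" alongside the explicit account, database, and credentials.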

Summary

Successfully implemented complete end-to-end Snowflake support for QueryWeaver with comprehensive security and full-stack integration:

  • Connection URL format: snowflake://username:password@account/database/schema?warehouse=warehouse_name (mapped onto connector parameters in the sketch after this list)
  • Full feature parity with MySQL and PostgreSQL loaders
  • Comprehensive test coverage with 20+ unit tests including security validation tests
  • Zero security vulnerabilities detected (updated to patched version 3.13.2)
  • SQL injection protection via identifier validation and parameterized queries
  • Perfect code quality (10/10 pylint score)
  • Complete documentation including usage examples and troubleshooting guide
  • Frontend integration providing full UI support for Snowflake connections
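
To make the connection-URL bullet concrete, the parsed URL fields would typically map onto snowflake.connector.connect() keyword arguments roughly as below. This is a hedged usage sketch with placeholder values, not code from this PR; the exact wiring inside the loader may differ.

# Placeholder values; in the loader these come from the parsed connection URL.
import snowflake.connector
from snowflake.connector import DictCursor

config = {
    "user": "username",
    "password": "password",
    "account": "account",        # Snowflake account identifier (URL host part)
    "database": "database",
    "schema": "PUBLIC",          # default when the URL omits a schema
    "warehouse": "COMPUTE_WH",   # default when the URL omits ?warehouse=
}

conn = snowflake.connector.connect(**config)
try:
    cursor = conn.cursor(DictCursor)  # dict rows keep column names, convenient for schema extraction
    cursor.execute("SELECT CURRENT_DATABASE(), CURRENT_SCHEMA(), CURRENT_WAREHOUSE()")
    print(cursor.fetchone())
finally:
    conn.close()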

Security Fixes Applied

All SQL injection vulnerabilities identified in code review have been fixed:

  1. Added _validate_identifier() method to validate all identifiers against an alphanumeric pattern (the overall pattern is sketched after this list)
  2. Added _quote_identifier() method to safely escape double quotes in identifiers
  3. Converted all queries to use parameterized queries for user-controlled values (schema_name, table_name)
  4. Database names are validated and quoted, but not parameterized (Snowflake limitation for INFORMATION_SCHEMA queries)
  5. Added comprehensive tests for validation logic including malicious input attempts
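
A minimal sketch of the pattern these five points describe, assuming helper names that mirror the bullets above; the exact regex, signatures, and query text in the merged loader may differ.

import re

_IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9_$]+$")


def validate_identifier(name: str) -> str:
    # Point 1: reject identifiers outside a conservative character set.
    if not _IDENTIFIER_RE.match(name):
        raise ValueError(f"Invalid Snowflake identifier: {name!r}")
    return name


def quote_identifier(name: str) -> str:
    # Point 2: escape embedded double quotes, then wrap in double quotes.
    return '"' + name.replace('"', '""') + '"'


def fetch_tables(cursor, db_name: str, schema_name: str):
    # Point 4: the database name cannot be bound as a parameter inside the
    # INFORMATION_SCHEMA reference, so it is validated and quoted instead.
    database = quote_identifier(validate_identifier(db_name))
    # Point 3: user-controlled values such as schema_name are bound parameters.
    cursor.execute(
        f"""
        SELECT TABLE_NAME, COMMENT
        FROM {database}.INFORMATION_SCHEMA.TABLES
        WHERE TABLE_SCHEMA = %s
          AND TABLE_TYPE = 'BASE TABLE'
        ORDER BY TABLE_NAME;
        """,
        (schema_name,),
    )
    return cursor.fetchall()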

Frontend Changes

Added complete Snowflake integration to the DatabaseModal component (app/src/components/modals/DatabaseModal.tsx):

  • Database Type Selection: New "Snowflake" option in the dropdown with cyan color indicator
  • Connection URL Mode: Updated placeholder to show Snowflake URL format with helper text explaining defaults
  • Manual Entry Mode: Snowflake-specific fields for guided connection setup:
    • Account (required) - Snowflake account identifier
    • Database (required) - Database name
    • Schema (optional, defaults to PUBLIC)
    • Warehouse (optional, defaults to COMPUTE_WH)
    • Username (required)
    • Password (optional)
  • Smart Validation: Updated form validation to handle Snowflake's different required fields
  • URL Construction: Proper Snowflake connection URL building for both manual and URL entry modes (see the sketch after this list)
  • User Experience: Maintains consistent UI/UX patterns with PostgreSQL and MySQL options

Users can now connect to Snowflake databases through the UI using either direct connection URLs or the guided manual entry form.
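
The actual change lives in the TSX component, but the URL-building rules described above (required account/database/username, optional password, PUBLIC and COMPUTE_WH defaults) can be sketched in Python for illustration; this is not code from the PR.

from urllib.parse import quote


def build_snowflake_url(account: str, database: str, username: str,
                        password: str = "", schema: str = "PUBLIC",
                        warehouse: str = "COMPUTE_WH") -> str:
    # Mirrors the manual-entry defaults described above.
    auth = quote(username, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"snowflake://{auth}@{account}/{database}/{schema}?warehouse={warehouse}"


# build_snowflake_url("myorg-account", "ANALYTICS", "alice", "s3cret")
# -> 'snowflake://alice:s3cret@myorg-account/ANALYTICS/PUBLIC?warehouse=COMPUTE_WH'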

Original prompt

This section details the original issue you should resolve.

<issue_title>Add support for Snowflake</issue_title>
<issue_description>Add support for Snowflake loader.

See reference: https://github.com/FalkorDB/QueryWeaver/tree/main/api/loaders</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits January 21, 2026 14:18
Co-authored-by: gkorland <753206+gkorland@users.noreply.github.com>
Co-authored-by: gkorland <753206+gkorland@users.noreply.github.com>
Co-authored-by: gkorland <753206+gkorland@users.noreply.github.com>
@gkorland gkorland requested a review from Copilot January 21, 2026 14:25
@gkorland

@CodeRabbit review


coderabbitai bot commented Jan 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…ction vulnerability

Co-authored-by: gkorland <753206+gkorland@users.noreply.github.com>

coderabbitai bot commented Jan 21, 2026

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gkorland gkorland marked this pull request as ready for review January 21, 2026 14:26
@gkorland gkorland requested a review from galshubeli January 21, 2026 14:26

github-actions bot commented Jan 21, 2026

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Package (version, overall score) and check breakdown:

pip/asn1crypto 1.5.1 (score: 🟢 4.8)
  • Code-Review ⚠️ 1: Found 5/30 approved changesets -- score normalized to 1
  • Security-Policy 🟢 10: security policy file detected
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • Packaging ⚠️ -1: packaging workflow not detected
  • Maintained ⚠️ 0: 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
  • Token-Permissions ⚠️ 0: detected GitHub workflow tokens with excessive permissions
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • Binary-Artifacts 🟢 10: no binaries found in the repo
  • Pinned-Dependencies ⚠️ 0: dependency not pinned by hash detected -- score normalized to 0
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected
  • Fuzzing 🟢 10: project is fuzzed
  • License 🟢 10: license file detected
  • Signed-Releases ⚠️ -1: no releases found
  • Branch-Protection ⚠️ 0: branch protection not enabled on development/release branches
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0

pip/certifi 2026.1.4 (score: 🟢 6.8)
  • Code-Review 🟢 4: Found 2/5 approved changesets -- score normalized to 4
  • Maintained 🟢 10: 15 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
  • Binary-Artifacts 🟢 10: no binaries found in the repo
  • Security-Policy 🟢 10: security policy file detected
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • Token-Permissions 🟢 10: GitHub workflow tokens follow principle of least privilege
  • Pinned-Dependencies 🟢 5: dependency not pinned by hash detected -- score normalized to 5
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected
  • License 🟢 9: license file detected
  • Fuzzing ⚠️ 0: project is not fuzzed
  • Signed-Releases ⚠️ -1: no releases found
  • Packaging 🟢 10: packaging workflow detected
  • Branch-Protection ⚠️ 0: branch protection not enabled on development/release branches
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0

pip/cffi 1.17.1 (score: 🟢 5.6)
  • Code-Review 🟢 9: Found 12/13 approved changesets -- score normalized to 9
  • Maintained 🟢 10: 24 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
  • Token-Permissions ⚠️ 0: detected GitHub workflow tokens with excessive permissions
  • Binary-Artifacts 🟢 9: binaries present in source code
  • Packaging ⚠️ -1: packaging workflow not detected
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • Pinned-Dependencies ⚠️ 0: dependency not pinned by hash detected -- score normalized to 0
  • Security-Policy ⚠️ 0: security policy file not detected
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected
  • License 🟢 9: license file detected
  • Fuzzing ⚠️ 0: project is not fuzzed
  • Signed-Releases ⚠️ -1: no releases found
  • Branch-Protection ⚠️ -1: internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0

pip/cryptography 44.0.3 (score: 🟢 8)
  • Code-Review 🟢 10: all changesets reviewed
  • Maintained 🟢 10: 30 commit(s) and 17 issue activity found in the last 90 days -- score normalized to 10
  • Security-Policy 🟢 10: security policy file detected
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • License 🟢 9: license file detected
  • Branch-Protection ⚠️ -1: internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md
  • Signed-Releases ⚠️ -1: no releases found
  • Token-Permissions 🟢 9: detected GitHub workflow tokens with excessive permissions
  • Packaging 🟢 10: packaging workflow detected
  • Fuzzing 🟢 10: project is fuzzed
  • Binary-Artifacts 🟢 10: no binaries found in the repo
  • Pinned-Dependencies 🟢 6: dependency not pinned by hash detected -- score normalized to 6
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0
  • Vulnerabilities 🟢 3: 7 existing vulnerabilities detected

pip/filelock 3.20.3 (score: Unknown)

pip/pyopenssl 24.3.0 (score: 🟢 6.3)
  • Maintained 🟢 10: 16 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 10
  • Code-Review 🟢 10: all changesets reviewed
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • Binary-Artifacts 🟢 10: no binaries found in the repo
  • Token-Permissions ⚠️ 0: detected GitHub workflow tokens with excessive permissions
  • Pinned-Dependencies ⚠️ 2: dependency not pinned by hash detected -- score normalized to 2
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • Security-Policy ⚠️ 0: security policy file not detected
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected
  • License 🟢 10: license file detected
  • Fuzzing ⚠️ 0: project is not fuzzed
  • Signed-Releases ⚠️ -1: no releases found
  • Packaging 🟢 10: packaging workflow detected
  • Branch-Protection ⚠️ -1: internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0

pip/snowflake-connector-python 3.13.2 (score: 🟢 5.6)
  • Maintained 🟢 10: 30 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
  • Security-Policy 🟢 9: security policy file detected
  • Code-Review 🟢 10: all changesets reviewed
  • CII-Best-Practices ⚠️ 0: no effort to earn an OpenSSF best practices badge detected
  • License 🟢 10: license file detected
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • Token-Permissions ⚠️ 0: detected GitHub workflow tokens with excessive permissions
  • Packaging ⚠️ -1: packaging workflow not detected
  • Binary-Artifacts ⚠️ 0: binaries present in source code
  • Branch-Protection 🟢 5: branch protection is not maximal on development and all release branches
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected
  • Signed-Releases 🟢 8: 5 out of the last 5 releases have a total of 5 signed artifacts.
  • Fuzzing ⚠️ 0: project is not fuzzed
  • Pinned-Dependencies ⚠️ 0: dependency not pinned by hash detected -- score normalized to 0
  • SAST ⚠️ 0: SAST tool is not run on all commits -- score normalized to 0

pip/tomlkit 0.14.0 (score: Unknown)

pip/urllib3 2.6.3 (score: 🟢 8.6)
  • Binary-Artifacts 🟢 10: no binaries found in the repo
  • Branch-Protection 🟢 5: branch protection is not maximal on development and all release branches
  • CI-Tests 🟢 10: 27 out of 27 merged PRs checked by a CI test -- score normalized to 10
  • CII-Best-Practices 🟢 5: badge detected: Passing
  • Code-Review 🟢 8: Found 19/22 approved changesets -- score normalized to 8
  • Contributors 🟢 10: project has 125 contributing companies or organizations
  • Dangerous-Workflow 🟢 10: no dangerous workflow patterns detected
  • Dependency-Update-Tool 🟢 10: update tool detected
  • Fuzzing 🟢 10: project is fuzzed
  • License 🟢 10: license file detected
  • Maintained 🟢 10: 30 commit(s) and 11 issue activity found in the last 90 days -- score normalized to 10
  • Packaging 🟢 10: packaging workflow detected
  • Pinned-Dependencies 🟢 8: dependency not pinned by hash detected -- score normalized to 8
  • SAST 🟢 10: SAST tool is run on all commits
  • Security-Policy 🟢 10: security policy file detected
  • Signed-Releases ⚠️ 0: Project has not signed or included provenance with any releases.
  • Token-Permissions 🟢 10: GitHub workflow tokens follow principle of least privilege
  • Vulnerabilities 🟢 10: 0 existing vulnerabilities detected

Scanned Files

  • Pipfile.lock


Copilot AI left a comment


Pull request overview

This PR adds comprehensive support for Snowflake database connections to QueryWeaver, enabling schema extraction and Text2SQL functionality for Snowflake databases. The implementation follows the existing loader pattern established by MySQL and PostgreSQL loaders.

Changes:

  • Added Snowflake connector dependency and implemented a complete loader with schema extraction, foreign key detection, and query execution capabilities
  • Extended database type detection in schema_loader.py and text2sql.py to recognize Snowflake URLs
  • Created comprehensive test suite with 15 unit tests covering URL parsing, value serialization, and schema operations
  • Added detailed documentation explaining Snowflake connection format, features, and troubleshooting

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 8 comments.

Summary per file:
  • api/loaders/snowflake_loader.py: Core Snowflake loader implementation with schema extraction, query execution, and connection URL parsing
  • api/core/schema_loader.py: Added Snowflake URL detection and loader registration
  • api/core/text2sql.py: Added Snowflake database type detection for query execution
  • tests/test_snowflake_loader.py: Comprehensive unit tests for Snowflake loader functionality
  • docs/snowflake_loader.md: Complete documentation for Snowflake loader usage and troubleshooting
  • Pipfile: Added snowflake-connector-python dependency

Comment on lines 221 to 228
cursor.execute(f"""
SELECT TABLE_NAME, COMMENT
FROM {db_name}.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = '{schema_name}'
AND TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME;
""")


Copilot AI Jan 21, 2026


SQL injection vulnerability in query construction. The schema_name parameter is directly interpolated into the SQL query without parameterization. If user input can influence schema_name, this could allow SQL injection. Use parameterized queries or properly escape the schema_name value.

Suggested change
cursor.execute(f"""
SELECT TABLE_NAME, COMMENT
FROM {db_name}.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = '{schema_name}'
AND TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME;
""")
# Validate database and schema names to prevent SQL injection via identifiers
if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
    raise ValueError(f"Invalid Snowflake database name: {db_name!r}")
if not re.fullmatch(r"[A-Za-z0-9_]+", schema_name):
    raise ValueError(f"Invalid Snowflake schema name: {schema_name!r}")
query = f"""
    SELECT TABLE_NAME, COMMENT
    FROM "{db_name}".INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA = %s
    AND TABLE_TYPE = 'BASE TABLE'
    ORDER BY TABLE_NAME;
"""
cursor.execute(query, (schema_name,))

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Added _validate_identifier() to validate all identifiers and converted to parameterized queries using %s for schema_name while safely quoting db_name.

Comment on lines 277 to 286
SELECT
COLUMN_NAME,
DATA_TYPE,
IS_NULLABLE,
COLUMN_DEFAULT,
COMMENT
FROM {db_name}.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = '{schema_name}'
AND TABLE_NAME = '{table_name}'
ORDER BY ORDINAL_POSITION;

Copilot AI Jan 21, 2026


SQL injection vulnerability in query construction. The db_name, schema_name, and table_name parameters are directly interpolated into the SQL query without parameterization. Use parameterized queries or properly escape these values.

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. All queries now use parameterized queries for schema_name and table_name, with db_name validated and safely quoted.

Comment on lines 294 to 302
SELECT COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY';

Copilot AI Jan 21, 2026


SQL injection vulnerability in query construction. The db_name, schema_name, and table_name parameters are directly interpolated without parameterization. This pattern appears multiple times in the file and should be addressed consistently.

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Consistent pattern applied across all SQL queries with identifier validation and parameterization.

Comment on lines 65 to 73
query = f"""
SELECT DISTINCT "{col_name}"
FROM "{table_name}"
WHERE "{col_name}" IS NOT NULL
SAMPLE ({sample_size * 10} ROWS)
LIMIT {sample_size};
"""
cursor.execute(query)


Copilot AI Jan 21, 2026


SQL injection vulnerability. The table_name and col_name parameters are directly interpolated into the query without parameterization. Even though they have double quotes, this doesn't fully protect against SQL injection if the values contain double quotes themselves.

Suggested change
query = f"""
SELECT DISTINCT "{col_name}"
FROM "{table_name}"
WHERE "{col_name}" IS NOT NULL
SAMPLE ({sample_size * 10} ROWS)
LIMIT {sample_size};
"""
cursor.execute(query)
# Validate identifiers to prevent SQL injection via table/column names.
# Allow only alphanumeric characters, underscore, dollar sign, and dot.
identifier_pattern = re.compile(r'^[A-Za-z0-9_\$\.]+$')
if not identifier_pattern.match(table_name):
    raise ValueError(f"Invalid table name: {table_name!r}")
if not identifier_pattern.match(col_name):
    raise ValueError(f"Invalid column name: {col_name!r}")
# Ensure sample_size is a positive integer.
if not isinstance(sample_size, int) or sample_size <= 0:
    raise ValueError(f"sample_size must be a positive integer, got {sample_size!r}")
sample_rows = sample_size * 10
query = """
    SELECT DISTINCT "{col_name}"
    FROM "{table_name}"
    WHERE "{col_name}" IS NOT NULL
    SAMPLE (%s ROWS)
    LIMIT %s;
""".format(col_name=col_name, table_name=table_name)
cursor.execute(query, (sample_rows, sample_size))

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Added identifier validation and safe quoting with _quote_identifier() that escapes double quotes, plus parameterized sample_size values.


# Execute query
url = "snowflake://user:pass@account/testdb/PUBLIC?warehouse=COMPUTE_WH"
result = SnowflakeLoader.execute_sql_query("INSERT INTO users VALUES (1, 'test')", url)

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test doesn't verify that the connection commits the transaction or that the query was actually executed with the correct parameters. Consider adding assertions to verify mock_cursor.execute was called with the expected SQL query.

Comment on lines 293 to 317
cursor.execute(f"""
SELECT COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY';
""")
primary_keys = {row['COLUMN_NAME'] for row in cursor.fetchall()}

# Get foreign key information
cursor.execute(f"""
SELECT COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'FOREIGN KEY';
""")

Copilot AI Jan 21, 2026


SQL injection vulnerability in foreign key query. Similar to other queries in this file, the db_name, schema_name, and table_name are interpolated without parameterization.

Suggested change
cursor.execute(f"""
SELECT COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY';
""")
primary_keys = {row['COLUMN_NAME'] for row in cursor.fetchall()}
# Get foreign key information
cursor.execute(f"""
SELECT COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'FOREIGN KEY';
""")
safe_db_name = db_name.replace('"', '""')
pk_query = f"""
    SELECT COLUMN_NAME
    FROM "{safe_db_name}".INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    JOIN "{safe_db_name}".INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
        ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
        AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
        AND tc.TABLE_NAME = kcu.TABLE_NAME
    WHERE tc.TABLE_SCHEMA = %s
        AND tc.TABLE_NAME = %s
        AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY';
"""
cursor.execute(pk_query, (schema_name, table_name))
primary_keys = {row['COLUMN_NAME'] for row in cursor.fetchall()}
# Get foreign key information
fk_query = f"""
    SELECT COLUMN_NAME
    FROM "{safe_db_name}".INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    JOIN "{safe_db_name}".INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
        ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
        AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
        AND tc.TABLE_NAME = kcu.TABLE_NAME
    WHERE tc.TABLE_SCHEMA = %s
        AND tc.TABLE_NAME = %s
        AND tc.CONSTRAINT_TYPE = 'FOREIGN KEY';
"""
cursor.execute(fk_query, (schema_name, table_name))

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Applied parameterization for schema_name and table_name with validated and quoted db_name.

Comment on lines 384 to 400
SELECT
tc.CONSTRAINT_NAME,
kcu.COLUMN_NAME,
ccu.TABLE_NAME AS REFERENCED_TABLE_NAME,
ccu.COLUMN_NAME AS REFERENCED_COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
JOIN {db_name}.INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu
ON tc.CONSTRAINT_NAME = ccu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = ccu.TABLE_SCHEMA
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.TABLE_NAME = '{table_name}'
AND tc.CONSTRAINT_TYPE = 'FOREIGN KEY'
ORDER BY tc.CONSTRAINT_NAME, kcu.ORDINAL_POSITION;

Copilot AI Jan 21, 2026


SQL injection vulnerability in extract_foreign_keys method. The db_name, schema_name, and table_name parameters are directly interpolated without parameterization.

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Parameterized schema_name and table_name with validated db_name.

Comment on lines 430 to 446
SELECT
tc.TABLE_NAME,
tc.CONSTRAINT_NAME,
kcu.COLUMN_NAME,
ccu.TABLE_NAME AS REFERENCED_TABLE_NAME,
ccu.COLUMN_NAME AS REFERENCED_COLUMN_NAME
FROM {db_name}.INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
JOIN {db_name}.INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
ON tc.CONSTRAINT_NAME = kcu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = kcu.TABLE_SCHEMA
AND tc.TABLE_NAME = kcu.TABLE_NAME
JOIN {db_name}.INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu
ON tc.CONSTRAINT_NAME = ccu.CONSTRAINT_NAME
AND tc.TABLE_SCHEMA = ccu.TABLE_SCHEMA
WHERE tc.TABLE_SCHEMA = '{schema_name}'
AND tc.CONSTRAINT_TYPE = 'FOREIGN KEY'
ORDER BY tc.TABLE_NAME, tc.CONSTRAINT_NAME;

Copilot AI Jan 21, 2026


SQL injection vulnerability in extract_relationships method. The db_name and schema_name parameters are directly interpolated without parameterization.

Copilot AI (PR author) replied:

Fixed in commit eb9efa0. Parameterized schema_name with validated db_name.

@gkorland gkorland requested a review from Naseem77 January 21, 2026 14:26
Copilot AI changed the title from "[WIP] Add support for Snowflake loader" to "Add Snowflake loader support" Jan 21, 2026
Copilot AI requested a review from gkorland January 21, 2026 14:27
@gkorland

@copilot please handle the spell checker

<htmlcontent> docs/snowflake_loader.md: html>body>ul>li
--------------------------------------------------------------------------------
DDL
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/snowflake_loader.md: html>body>ul>li
--------------------------------------------------------------------------------
DML
--------------------------------------------------------------------------------

…n and parameterization

- Add _validate_identifier() to validate all database/schema/table/column names
- Add _quote_identifier() to safely quote identifiers with double-quote escaping
- Update all SQL queries to use parameterized queries for user-controlled values
- Add comprehensive validation tests
- Add Snowflake to spellcheck wordlist

Co-authored-by: gkorland <753206+gkorland@users.noreply.github.com>

Copilot AI commented Jan 24, 2026

@copilot please handle the spell checker

...

Added "Snowflake" to .github/wordlist.txt in commit eb9efa0.

@galshubeli

@copilot The PR lacks the frontend side; there is no option to choose Snowflake in app/src/components/modals/DatabaseModal.tsx

- Add Snowflake as a database type option in the dropdown (with cyan color indicator)
- Update connection URL placeholder to include Snowflake format example
- Add Snowflake-specific fields for manual entry: account, schema, warehouse
- Update validation logic to handle Snowflake's different required fields
- Update URL building logic to construct proper Snowflake connection URLs
- Update dialog description to mention Snowflake support

Co-authored-by: galshubeli <124919062+galshubeli@users.noreply.github.com>

Copilot AI commented Feb 2, 2026

@copilot The PR lacks the frontend side; there is no option to choose Snowflake in app/src/components/modals/DatabaseModal.tsx

Added Snowflake support to the frontend in commit 7e96588.

Changes made to DatabaseModal.tsx:

  1. Database Type Dropdown - Added "Snowflake" option with cyan color indicator (appears alongside PostgreSQL and MySQL)

  2. Connection URL Mode - When Snowflake is selected:

    • Placeholder shows: snowflake://username:password@account/database/schema?warehouse=warehouse_name
    • Helper text explains schema defaults to PUBLIC and warehouse to COMPUTE_WH
  3. Manual Entry Mode - Snowflake-specific fields:

    • Account (required) - Snowflake account identifier (e.g., myorg-account)
    • Database (required) - Database name
    • Schema (optional) - Defaults to PUBLIC
    • Warehouse (optional) - Defaults to COMPUTE_WH
    • Username (required)
    • Password (optional)
  4. Form Behavior:

    • Updated validation to require account/database/username for Snowflake
    • URL builder constructs proper Snowflake connection URLs
    • Form reset includes new Snowflake fields

The UI now provides a complete interface for connecting to Snowflake databases with both URL and manual entry options.

Copilot AI changed the title from "Add Snowflake loader support" to "Add Snowflake loader support with full-stack integration" Feb 2, 2026


Development

Successfully merging this pull request may close these issues.

Add support for Snowflake
