-
Notifications
You must be signed in to change notification settings - Fork 65
Feat/sql redis query #467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat/sql redis query #467
Conversation
|
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces SQL query support to redisvl by adding a new SQLQuery class that translates SQL-like queries into Redis FT.SEARCH and FT.AGGREGATE commands. The implementation wraps the external sql-redis package to provide a more familiar SQL interface for querying Redis.
Changes:
- Added SQLQuery class that translates SQL SELECT statements to Redis queries
- Integrated SQLQuery execution into SearchIndex.query() method
- Added comprehensive integration tests covering various SQL operators and aggregations
- Created documentation notebook demonstrating SQL query usage
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Adds sql-redis as optional dependency for [sql] extra |
| uv.lock | Updates lock file with sql-redis and sqlglot dependencies |
| redisvl/query/sql.py | New SQLQuery class implementation with redis_query_string() method |
| redisvl/query/init.py | Exports SQLQuery from redisvl.query module |
| redisvl/index/index.py | Adds _sql_query() method and integrates SQLQuery into query() dispatcher |
| tests/integration/test_sql_redis.py | Comprehensive integration tests for SQL query functionality |
| docs/user_guide/12_sql_to_redis_queries.ipynb | New user guide notebook demonstrating SQL query usage |
| docs/user_guide/02_hybrid_queries.ipynb | Unintentional execution artifacts from re-running cells |
Comments suppressed due to low confidence (1)
redisvl/index/index.py:1150
- The docstring for the query method still references the old type hint "Union[BaseQuery, AggregateQuery, HybridQuery]" in the Args section (line 1150), but the actual type hint now includes SQLQuery. Update the docstring to reflect that SQLQuery is also accepted.
"""Execute a query on the index.
This method takes a BaseQuery, AggregationQuery, or HybridQuery object directly, and
handles post-processing of the search.
Args:
query (Union[BaseQuery, AggregateQuery, HybridQuery]): The query to run.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """Execute a query on the index. | ||
|
|
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring for the query method states "This method takes a BaseQuery, AggregationQuery, or HybridQuery object directly" but should also mention SQLQuery since it's now a supported query type. Additionally, the Args section references the old type hint without SQLQuery.
| # Substitute non-bytes params in SQL before translation | ||
| sql = self.sql | ||
| for key, value in self.params.items(): | ||
| placeholder = f":{key}" | ||
| if isinstance(value, (int, float)): | ||
| sql = sql.replace(placeholder, str(value)) | ||
| elif isinstance(value, str): | ||
| sql = sql.replace(placeholder, f"'{value}'") | ||
| # bytes (vectors) are handled separately |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter substitution logic only handles int, float, and str types, but doesn't handle bytes (vectors) which are mentioned in the comment. However, looking at the tests, bytes parameters like vectors are passed in the params dict and seem to work. This suggests sql-redis handles bytes parameters internally. Consider adding a code comment explaining that bytes parameters are passed through to sql-redis's translate method rather than being substituted in the SQL string, to clarify the intent.
| # Decode key if bytes | ||
| str_key = key.decode("utf-8") if isinstance(key, bytes) else key | ||
| # Decode value if bytes | ||
| str_value = value.decode("utf-8") if isinstance(value, bytes) else value |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The decoding logic assumes all bytes values should be decoded to strings. However, this may incorrectly decode values that should remain as bytes (e.g., binary data, vector embeddings). Consider checking if the value is actually a UTF-8 string before decoding, or preserve the original type for non-string data. You could wrap the decode in a try-except to handle non-UTF-8 bytes gracefully.
| # Decode key if bytes | |
| str_key = key.decode("utf-8") if isinstance(key, bytes) else key | |
| # Decode value if bytes | |
| str_value = value.decode("utf-8") if isinstance(value, bytes) else value | |
| # Decode key if bytes, but preserve non-UTF-8 bytes | |
| if isinstance(key, bytes): | |
| try: | |
| str_key = key.decode("utf-8") | |
| except UnicodeDecodeError: | |
| str_key = key | |
| else: | |
| str_key = key | |
| # Decode value if bytes, but preserve non-UTF-8 bytes | |
| if isinstance(value, bytes): | |
| try: | |
| str_value = value.decode("utf-8") | |
| except UnicodeDecodeError: | |
| str_value = value | |
| else: | |
| str_value = value |
| " MAX(age) as max_age,\n", | ||
| " AVG(age) as avg_age,\n", | ||
| " STDEV(age) as std_age,\n", | ||
| " FIRST_VALUE(age) as fist_value_age,\n", |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in the alias name: "fist_value_age" should be "first_value_age" (missing 'r' in 'first'). This appears in both the SQL query and the output, suggesting it's consistently misspelled throughout.
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Butm the index is still in place\n", |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "Butm" should be "But" (extra 'm' at the end).
| "# Butm the index is still in place\n", | |
| "# But the index is still in place\n", |
| "# await index.clear()" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 19, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Butm the index is still in place\n", | ||
| "# await index.exists()" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 20, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Remove / delete the index in its entirety\n", | ||
| "# await index.delete()" |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commented cleanup code uses await keywords (async syntax) but this notebook doesn't appear to be using async/await elsewhere. The SearchIndex methods like clear(), exists(), and delete() are synchronous methods, not async. Remove the await keywords from these commented lines.
| "pillow>=11.3.0", | ||
| ] | ||
| sql = [ | ||
| "sql-redis @ file:///Users/robert.shelton/Documents/sql-redis/dist/sql_redis-0.1.0-py3-none-any.whl", |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sql-redis dependency is using an absolute local file path specific to a developer's machine (/Users/robert.shelton/Documents/sql-redis/...). This will break on other systems and in CI/CD environments. According to the PR description, this is blocked on a release/update of sql-redis package. Before merging, this should be changed to reference a published version on PyPI (e.g., "sql-redis>=0.1.0").
Note: this pr is not ready for merge until a blocking release/update on sql-redis package.
Spec for SQLQuery class
Make sql-like commands available to be translated into Redis queries via redisvl to cut down on syntax overhead for engineers.
Ex:
This code would then produce the equivalent redis query to be executed against the database:
Expose a method on the object:
.redis_query_string()such that you can easily inspect the resulting redis query constructed from SQLQuery class invocation.Ex:
Packaging and dependencies
In order to use the
SQLQueryclass, user will have to install the optional dependency on sql-redis. This can be accomplished with the commandpip install redisvl[sql].Tested/supported operators
SELECT title, category FROM {index} WHERE category = 'electronics'FT.SEARCH test_index "@category:{electronics}" RETURN 2 title categorySELECT title, category FROM {index} WHERE category != 'electronics'FT.SEARCH test_index "-@category:{electronics}" RETURN 2 title categorySELECT title, category FROM {index} WHERE category IN ('books', 'accessories')FT.SEARCH test_index "@category:{books|accessories}" RETURN 2 title categorySELECT title, price FROM {index} WHERE price > 100FT.SEARCH test_index "@price:[(100 +inf]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price >= 25 AND price <= 50FT.SEARCH test_index "@price:[25 +inf] @price:[-inf 50]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price = 45FT.SEARCH test_index "@price:[45 45]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price != 45FT.SEARCH test_index "-@price:[45 45]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price < 50FT.SEARCH test_index "@price:[-inf (50]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price >= 25 AND price <= 50FT.SEARCH test_index "@price:[25 +inf] @price:[-inf 50]" RETURN 2 title priceSELECT title, price FROM {index} WHERE price IN (45, 55, 65)SELECT title, price FROM {index} WHERE price BETWEEN 40 AND 60FT.SEARCH test_index "@price:[40 60]" RETURN 2 title priceSELECT title, name FROM {index} WHERE title = 'laptop'FT.SEARCH test_index "@title:laptop" RETURN 2 title nameSELECT title, name FROM {index} WHERE title != 'laptop'FT.SEARCH test_index "-@title:laptop" RETURN 2 title nameSELECT title, name FROM {index} WHERE title IN ('Python', 'Redis')SELECT title, vector_distance(embedding, :vec) AS score FROM {index} LIMIT 3FT.SEARCH test_index "*=>[KNN 3 @embedding $vector AS score]" PARAMS 2 vector $vector DIALECT 2 RETURN 2 title score LIMIT 0 3SELECT title, cosine_distance(embedding, :vec) AS vector_distance FROM {index} LIMIT 3FT.SEARCH test_index "*=>[KNN 3 @embedding $vector AS vector_distance]" PARAMS 2 vector $vector DIALECT 2 RETURN 2 title vector_distance LIMIT 0 3Tested/supported aggregation reducer functions
SELECT COUNT(*) as total FROM {index}FT.AGGREGATE test_index "*" GROUPBY 0 REDUCE COUNT 0 AS totalSELECT category, COUNT(*) as count FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 1 category GROUPBY 1 @category REDUCE COUNT 0 AS countSELECT category, COUNT_DISTINCT(title) as unique_titles FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE COUNT_DISTINCT 1 @title AS unique_titlesSELECT SUM(price) as total FROM {index}FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE SUM 1 @price AS totalSELECT category, SUM(price) as total_price FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE SUM 1 @price AS total_priceSELECT MIN(price) as min_price FROM {index}FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE MIN 1 @price AS min_priceSELECT category, MIN(price) as min_price FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE MIN 1 @price AS min_priceSELECT MAX(price) as max_price FROM {index}FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE MAX 1 @price AS max_priceSELECT category, MAX(price) as max_price FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE MAX 1 @price AS max_priceSELECT category, AVG(price) as avg_price FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category price GROUPBY 1 @category REDUCE AVG 1 @price AS avg_priceSELECT STDDEV(price) as price_stddev FROM {index}FT.AGGREGATE test_index "*" LOAD 1 price GROUPBY 0 REDUCE STDDEV 1 @price AS price_stddevSELECT category, QUANTILE(price, 0.5) as median_price FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 price category GROUPBY 1 @category REDUCE QUANTILE 2 @price 0.5 AS median_priceSELECT category, ARRAY_AGG(title) as titles FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE TOLIST 1 @title AS titlesSELECT category, FIRST_VALUE(title) as first_title FROM {index} GROUP BY categoryFT.AGGREGATE test_index "*" LOAD 2 category title GROUPBY 1 @category REDUCE FIRST_VALUE 1 @title AS first_titleTODO: subsequent work that will follow in separate PRs