Passing <ServerName\InstanceName>, as the host value, to the MSSQL connection string doesn't work. #546

Closed
manuelcmachado opened this issue Sep 28, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@manuelcmachado

What language are you using?

Python.

What version are you using?

0.3.2

What database are you using?

MSSQL

What dataframe are you using?

Polars

Can you describe your bug?

My MSSQL Server database uses ServerName\InstanceName as its host. When I pass that value as the host in mssql://host:port/db?trusted_connection=true, the Polars read_database() and read_database_uri() methods throw "RuntimeError: parse error: invalid domain character".
If the MSSQL Server is set up with ServerName only, it works just fine.
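
A minimal sketch of why the backslash is likely rejected: the error text matches the host validation described in the WHATWG URL Standard, which forbids `\` (along with characters such as `<`, `>`, and `:`) in a domain. This assumes ConnectorX applies that standard's rules when it parses the connection string, which this issue does not confirm. `forbidden_domain_chars` is a hypothetical helper written for illustration; it is not part of Polars or ConnectorX.

```python
# Forbidden host code points listed in the WHATWG URL Standard,
# plus "%" and DELETE, which the standard additionally forbids in domains.
FORBIDDEN_DOMAIN_CHARS = set("\x00\t\n\r #/:<>?@[\\]^|%\x7f")


def forbidden_domain_chars(host: str) -> list[str]:
    """Return the characters in `host` that are not allowed in a URL domain.

    Hypothetical helper: it mirrors the standard's rules, not any
    specific library's implementation.
    """
    return [
        ch
        for ch in host
        # C0 control characters (below U+0020) are forbidden as well.
        if ch in FORBIDDEN_DOMAIN_CHARS or ord(ch) < 0x20
    ]


# A named instance contains a backslash, which is flagged.
print(forbidden_domain_chars(r"ServerName\InstanceName"))  # ['\\']

# A plain server name passes the check.
print(forbidden_domain_chars("ServerName"))  # []
```

Running a check like this before building the URI turns the opaque parse error into a clear statement of which character is the problem. It does not make the named instance reachable.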

What are the steps to reproduce the behavior?

If possible, please include a minimal simple example including:

Database setup if the error only happens on specific data or data type

Table schema and example data

Example query / code
import polars as pl
import time
import connectorx as cx
import pyarrow

rdb_type = 'mssql'
server_name = r'<servername>\<instancename>'  # raw string keeps the backslash literal
port = 1433  # usually 1433
database_name = 'AdventureWorksDW2022'

uri = f"{rdb_type}://{server_name}:{port}/{database_name}?trusted_connection=true"
query = """
        SELECT ProductKey, DateKey, MovementDate, UnitCost, UnitsIn, UnitsOut, UnitsBalance
        FROM AdventureWorksDW2022.dbo.FactProductInventory;
        """
start_time = time.time()
df = pl.read_database_uri(query, uri)# by default Polars uses connectorx as its connection engine
execution_time = (time.time() - start_time)

print(f'Reading data from the FactProductInventory table in the {database_name} database, in MSSQL Server, takes {execution_time} seconds')

What is the error?

RuntimeError Traceback (most recent call last)
Cell In[8], line 7
2 query = """
3 SELECT ProductKey, DateKey, MovementDate, UnitCost, UnitsIn, UnitsOut, UnitsBalance
4 FROM AdventureWorksDW2022.dbo.FactProductInventory;
5 """
6 start_time = time.time()
----> 7 df = pl.read_database_uri(query, uri)# by default Polars uses connectorx as its connection engine
8 execution_time = (time.time() - start_time)
10 print(f'Reading data from the FactProductInventory table in the {database_name} database, in MSSQL Server, takes {execution_time} seconds')

File ~\AppData\Roaming\Python\Python310\site-packages\polars\io\database.py:450, in read_database_uri(query, uri, partition_on, partition_range, partition_num, protocol, engine, schema_overrides)
447 engine = "connectorx"
449 if engine == "connectorx":
--> 450 return _read_sql_connectorx(
451 query,
452 connection_uri=uri,
453 partition_on=partition_on,
454 partition_range=partition_range,
455 partition_num=partition_num,
456 protocol=protocol,
457 schema_overrides=schema_overrides,
458 )
459 elif engine == "adbc":
460 if not isinstance(query, str):

File ~\AppData\Roaming\Python\Python310\site-packages\polars\io\database.py:486, in _read_sql_connectorx(query, connection_uri, partition_on, partition_range, partition_num, protocol, schema_overrides)
480 except ModuleNotFoundError:
481 raise ModuleNotFoundError(
482 "connectorx is not installed"
483 "\n\nPlease run pip install connectorx>=0.3.2."
484 ) from None
--> 486 tbl = cx.read_sql(
487 conn=connection_uri,
488 query=query,
489 return_type="arrow2",
490 partition_on=partition_on,
491 partition_range=partition_range,
492 partition_num=partition_num,
493 protocol=protocol,
494 )
495 return from_arrow(tbl, schema_overrides=schema_overrides)

File ~\miniconda3\lib\site-packages\connectorx\__init__.py:297, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
294 except ModuleNotFoundError:
295 raise ValueError("You need to install pyarrow first")
--> 297 result = _read_sql(
298 conn,
299 "arrow2" if return_type in {"arrow2", "polars", "polars2"} else "arrow",
300 queries=queries,
301 protocol=protocol,
302 partition_query=partition_query,
303 )
304 df = reconstruct_arrow(result)
305 if return_type in {"polars", "polars2"}:

RuntimeError: parse error: invalid domain character

@manuelcmachado added the "bug" (Something isn't working) label Sep 28, 2023
@manuelcmachado changed the title from "Passing <ServerName\InstanceName> as the host value to the MSSQL connection string does not work." to "Passing <ServerName\InstanceName>, as the host value, to the MSSQL connection string doesn't work." Sep 28, 2023
@manuelcmachado
Author

Closing this issue. The solution posted in #140 (comment) solved my problem.
