Commit 8fbf45a

FEAT: Adding Ouput Converter APIs (#190)
### Work Item / Issue Reference

> [AB#34913](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34913)
> [AB#34914](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34914)
> [AB#34915](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34915)
> [AB#34916](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34916)

-------------------------------------------------------------------

### Summary

This pull request adds a flexible output converter system to the database connection, allowing custom Python functions to be registered for converting SQL data types when fetching results. It introduces new methods for managing these converters and updates the row handling logic to apply them automatically. Comprehensive tests are added to verify correct registration, retrieval, removal, and integration of output converters, including edge cases and chaining behavior.

**Core output converter system:**

* Added methods to `Connection` (`add_output_converter`, `get_output_converter`, `remove_output_converter`, `clear_output_converters`) for registering and managing output converter functions for specific SQL types. These converters are called when values of the registered SQL type are read from the database. (`mssql_python/connection.py`)
* Updated the `Row` class to apply registered output converters automatically to values fetched from the database, with fallback logic for string types and robust error handling. (`mssql_python/row.py`)

**Testing and validation:**

* Added extensive tests for output converter management, including adding, retrieving, removing, clearing, chaining, temporary replacement, and integration during data fetching. Tests also cover edge cases such as handling `NULL` values and using multiple converters at once. (`tests/test_003_connection.py`)
* Introduced helper converter functions for specific SQL types (e.g., `DATETIMEOFFSET` and custom string handling) to support and validate the new converter system in tests. (`tests/test_003_connection.py`)

**Test and utility enhancements:**

* Imported additional modules (`struct`, `datetime`, `timezone`, and constants) to support new test cases and converter logic. (`tests/test_003_connection.py`)

---------

Co-authored-by: Jahnvi Thakkar <jathakkar@microsoft.com>
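The summary mentions a helper converter for `DATETIMEOFFSET`, but the test file is not shown on this page. A minimal sketch of what such a converter might look like follows. It assumes the converter receives the raw 20-byte ODBC `SQL_SS_TIMESTAMPOFFSET_STRUCT` (the layout commonly used in pyodbc-style converters); the function name and the sample bytes are hypothetical, and this commit does not confirm which byte format the driver actually passes.

```python
import struct
from datetime import datetime, timedelta, timezone


def datetimeoffset_to_datetime(raw):
    """Hypothetical converter: raw DATETIMEOFFSET bytes -> aware datetime.

    Assumption: `raw` is a little-endian SQL_SS_TIMESTAMPOFFSET_STRUCT, i.e.
    six 16-bit fields (year..second), a 32-bit nanosecond fraction, and two
    16-bit timezone fields (hours, minutes).
    """
    if raw is None:  # NULL values are documented to arrive as None
        return None
    year, month, day, hour, minute, second, nanos, tz_h, tz_m = struct.unpack(
        "<6hI2h", raw
    )
    tz = timezone(timedelta(hours=tz_h, minutes=tz_m))
    # datetime only stores microseconds, so the nanosecond fraction is truncated.
    return datetime(year, month, day, hour, minute, second, nanos // 1000, tz)


# Build a sample value by hand instead of querying a database.
sample = struct.pack("<6hI2h", 2017, 3, 16, 10, 35, 18, 500_000_000, -6, 0)
converted = datetimeoffset_to_datetime(sample)
```

With an open connection `conn` (assumed here), registration would look like `conn.add_output_converter(-155, datetimeoffset_to_datetime)`, where -155 is the SQL Server type code for `DATETIMEOFFSET`.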
1 parent aa3a705 commit 8fbf45a

File tree

5 files changed

+689
-150
lines changed

mssql_python/connection.py

Lines changed: 91 additions & 0 deletions
@@ -14,6 +14,7 @@
 import re
 import codecs
 from typing import Any
+import threading
 from mssql_python.cursor import Cursor
 from mssql_python.helpers import add_driver_to_connection_str, sanitize_connection_string, sanitize_user_input, log
 from mssql_python import ddbc_bindings
@@ -187,6 +188,10 @@ def __init__(self, connection_str: str = "", autocommit: bool = False, attrs_bef
         # TODO: Think and implement scenarios for multi-threaded access to cursors
         self._cursors = weakref.WeakSet()
 
+        # Initialize output converters dictionary and its lock for thread safety
+        self._output_converters = {}
+        self._converters_lock = threading.Lock()
+
         # Auto-enable pooling if user never called
         if not PoolingManager.is_initialized():
             PoolingManager.enable()
@@ -531,6 +536,92 @@ def cursor(self) -> Cursor:
         cursor = Cursor(self)
         self._cursors.add(cursor)  # Track the cursor
         return cursor
+
+    def add_output_converter(self, sqltype, func) -> None:
+        """
+        Register an output converter function that will be called whenever a value
+        with the given SQL type is read from the database.
+
+        Thread-safe implementation that protects the converters dictionary with a lock.
+
+        ⚠️ WARNING: Registering an output converter will cause the supplied Python function
+        to be executed on every matching database value. Do not register converters from
+        untrusted sources, as this can result in arbitrary code execution and security
+        vulnerabilities. This API should never be exposed to untrusted or external input.
+
+        Args:
+            sqltype (int): The integer SQL type value to convert, which can be one of the
+                          defined standard constants (e.g. SQL_VARCHAR) or a database-specific
+                          value (e.g. -151 for the SQL Server 2008 geometry data type).
+            func (callable): The converter function which will be called with a single parameter,
+                            the value, and should return the converted value. If the value is NULL
+                            then the parameter passed to the function will be None, otherwise it
+                            will be a bytes object.
+
+        Returns:
+            None
+        """
+        with self._converters_lock:
+            self._output_converters[sqltype] = func
+            # Pass to the underlying connection if native implementation supports it
+            if hasattr(self._conn, 'add_output_converter'):
+                self._conn.add_output_converter(sqltype, func)
+        log('info', f"Added output converter for SQL type {sqltype}")
+
+    def get_output_converter(self, sqltype):
+        """
+        Get the output converter function for the specified SQL type.
+
+        Thread-safe implementation that protects the converters dictionary with a lock.
+
+        Args:
+            sqltype (int or type): The SQL type value or Python type to get the converter for
+
+        Returns:
+            callable or None: The converter function or None if no converter is registered
+
+        Note:
+            ⚠️ The returned converter function will be executed on database values. Only use
+            converters from trusted sources.
+        """
+        with self._converters_lock:
+            return self._output_converters.get(sqltype)
+
+    def remove_output_converter(self, sqltype):
+        """
+        Remove the output converter function for the specified SQL type.
+
+        Thread-safe implementation that protects the converters dictionary with a lock.
+
+        Args:
+            sqltype (int or type): The SQL type value to remove the converter for
+
+        Returns:
+            None
+        """
+        with self._converters_lock:
+            if sqltype in self._output_converters:
+                del self._output_converters[sqltype]
+                # Pass to the underlying connection if native implementation supports it
+                if hasattr(self._conn, 'remove_output_converter'):
+                    self._conn.remove_output_converter(sqltype)
+        log('info', f"Removed output converter for SQL type {sqltype}")
+
+    def clear_output_converters(self) -> None:
+        """
+        Remove all output converter functions.
+
+        Thread-safe implementation that protects the converters dictionary with a lock.
+
+        Returns:
+            None
+        """
+        with self._converters_lock:
+            self._output_converters.clear()
+            # Pass to the underlying connection if native implementation supports it
+            if hasattr(self._conn, 'clear_output_converters'):
+                self._conn.clear_output_converters()
+        log('info', "Cleared all output converters")
 
     def execute(self, sql: str, *args: Any) -> Cursor:
         """
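The four methods above follow one pattern: a plain dictionary keyed by SQL type, with every read and write wrapped in the same lock. The sketch below isolates that pattern so it can be run without a database. `ConverterRegistry` is a hypothetical stand-in, not a class in `mssql_python`, and it leaves out the logging and the pass-through to the native connection. It also illustrates the "temporary replacement" scenario the commit message says the tests cover.

```python
import threading


class ConverterRegistry:
    """Hypothetical stand-in for the converter bookkeeping added to Connection."""

    def __init__(self):
        self._converters = {}
        self._lock = threading.Lock()

    def add(self, sqltype, func):
        with self._lock:
            self._converters[sqltype] = func  # re-adding a type overwrites it

    def get(self, sqltype):
        with self._lock:
            return self._converters.get(sqltype)  # None if nothing registered

    def remove(self, sqltype):
        with self._lock:
            self._converters.pop(sqltype, None)  # removing a missing type is a no-op

    def clear(self):
        with self._lock:
            self._converters.clear()


SQL_WVARCHAR = -9  # ODBC type code for SQL_WVARCHAR

registry = ConverterRegistry()
registry.add(SQL_WVARCHAR, lambda raw: raw.decode("utf-16-le").upper())

# Temporary replacement: save the current converter, swap it, then restore it.
saved = registry.get(SQL_WVARCHAR)
registry.add(SQL_WVARCHAR, lambda raw: raw.decode("utf-16-le").lower())
lowered = registry.get(SQL_WVARCHAR)("AbC".encode("utf-16-le"))
registry.add(SQL_WVARCHAR, saved)
restored = registry.get(SQL_WVARCHAR)("AbC".encode("utf-16-le"))

registry.remove(SQL_WVARCHAR)
after_remove = registry.get(SQL_WVARCHAR)
```

The lock only guards the dictionary itself. A converter returned by `get` runs outside the lock, so a converter that is slow or that registers other converters does not block or deadlock other threads.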

mssql_python/row.py

Lines changed: 58 additions & 1 deletion
@@ -24,7 +24,13 @@ def __init__(self, cursor, description, values, column_map=None):
             column_map: Optional pre-built column map (for optimization)
         """
         self._cursor = cursor
-        self._values = values
+        self._description = description
+
+        # Apply output converters if available
+        if hasattr(cursor.connection, '_output_converters') and cursor.connection._output_converters:
+            self._values = self._apply_output_converters(values)
+        else:
+            self._values = values
 
         # TODO: ADO task - Optimize memory usage by sharing column map across rows
         # Instead of storing the full cursor_description in each Row object:
@@ -42,6 +48,57 @@ def __init__(self, cursor, description, values, column_map=None):
 
         self._column_map = column_map
 
+    def _apply_output_converters(self, values):
+        """
+        Apply output converters to raw values.
+
+        Args:
+            values: Raw values from the database
+
+        Returns:
+            List of converted values
+        """
+        if not self._description:
+            return values
+
+        converted_values = list(values)
+
+        for i, (value, desc) in enumerate(zip(values, self._description)):
+            if desc is None or value is None:
+                continue
+
+            # Get SQL type from description
+            sql_type = desc[1]  # type_code is at index 1 in description tuple
+
+            # Try to get a converter for this type
+            converter = self._cursor.connection.get_output_converter(sql_type)
+
+            # If no converter found for the SQL type but the value is a string or bytes,
+            # try the WVARCHAR converter as a fallback
+            if converter is None and isinstance(value, (str, bytes)):
+                from mssql_python.constants import ConstantsDDBC
+                converter = self._cursor.connection.get_output_converter(ConstantsDDBC.SQL_WVARCHAR.value)
+
+            # If we found a converter, apply it
+            if converter:
+                try:
+                    # If value is already a Python type (str, int, etc.),
+                    # we need to convert it to bytes for our converters
+                    if isinstance(value, str):
+                        # Encode as UTF-16LE for string values (SQL_WVARCHAR format)
+                        value_bytes = value.encode('utf-16-le')
+                        converted_values[i] = converter(value_bytes)
+                    else:
+                        converted_values[i] = converter(value)
+                except Exception:
+                    # Log the exception for debugging without leaking sensitive data
+                    if hasattr(self._cursor, 'log'):
+                        self._cursor.log('debug', 'Exception occurred in output converter', exc_info=True)
+                    # If conversion fails, keep the original value
+                    pass
+
+        return converted_values
+
     def __getitem__(self, index):
         """Allow accessing by numeric index: row[0]"""
         return self._values[index]
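The per-row logic in `_apply_output_converters` can be exercised in isolation. The sketch below restates it as a free function under a few assumptions: converters live in a plain dict rather than on a connection, the description entries are shortened to `(name, type_code)` pairs, and `apply_converters` and the sample data are hypothetical names. It shows the three behaviors of the hunk above: `None` values are skipped, `str` values fall back to the `SQL_WVARCHAR` converter and are encoded as UTF-16LE first, and a failing converter leaves the original value in place.

```python
SQL_INTEGER = 4    # ODBC type code for SQL_INTEGER
SQL_VARCHAR = 12   # ODBC type code for SQL_VARCHAR
SQL_WVARCHAR = -9  # ODBC type code for SQL_WVARCHAR


def apply_converters(values, description, converters):
    """Hypothetical free-function version of Row._apply_output_converters."""
    converted = list(values)
    for i, (value, desc) in enumerate(zip(values, description)):
        if desc is None or value is None:
            continue  # NULLs and undescribed columns pass through untouched
        converter = converters.get(desc[1])  # type_code sits at index 1
        if converter is None and isinstance(value, (str, bytes)):
            converter = converters.get(SQL_WVARCHAR)  # string fallback
        if converter is None:
            continue
        try:
            # Converters expect bytes, so str values are encoded first.
            raw = value.encode("utf-16-le") if isinstance(value, str) else value
            converted[i] = converter(raw)
        except Exception:
            pass  # on failure the original value is kept
    return converted


description = [("id", SQL_INTEGER), ("name", SQL_VARCHAR), ("note", SQL_WVARCHAR)]
converters = {SQL_WVARCHAR: lambda raw: raw.decode("utf-16-le").upper()}

# "ada" has type SQL_VARCHAR (no converter), so the WVARCHAR fallback applies.
row = apply_converters([7, "ada", None], description, converters)

# An int has no .decode(), so this converter raises and the value survives.
failed = apply_converters([7], [("id", SQL_INTEGER)], {SQL_INTEGER: lambda raw: raw.decode()})
```

One consequence of the fallback is that a converter registered for `SQL_WVARCHAR` also sees string and bytes values from columns of other types that have no converter of their own.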

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ pybind11
 coverage
 unittest-xml-reporting
 setuptools
+psutil
