FEAT: Explicit parameters and improved type hints for bulkcopy#420
FEAT: Explicit parameters and improved type hints for bulkcopy#420bewithgaurav wants to merge 6 commits intomainfrom
Conversation
…IDE discoverability - All options now explicit in function signature - Pass params directly to pycore (no kwargs dict conversion) - Matches mssql-tds explicit params API Based on community feedback from discussion #414
c90f0f1 to
fa2eab4
Compare
There was a problem hiding this comment.
Pull request overview
Refactors Cursor._bulkcopy to replace **kwargs bulk-copy options with an explicit, type-hinted parameter list and updated docstring.
Changes:
- Replaced
**kwargsin_bulkcopywith explicit parameters (e.g.,batch_size,timeout,column_mappings, and bulk-copy flags). - Expanded
_bulkcopydocstring to document supported bulk copy options. - Updated the
pycore_cursor.bulkcopy(...)invocation to pass the new explicit parameters.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
mssql_python/cursor.py
Outdated
| self, | ||
| table_name: str, | ||
| data: Iterable[Union[Tuple, List]], | ||
| batch_size: Optional[int] = None, |
There was a problem hiding this comment.
batch_size is typed as Optional[int], but the validation accepts floats (isinstance(batch_size, (int, float))) while the error message says “positive integer”. Please align the type hint, validation, and message (either require an int, or explicitly support non-integer values and document why).
| batch_size: Optional[int] = None, | |
| batch_size: Optional[Union[int, float]] = None, |
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changesNo lines with coverage information in this diff. 📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 84.1%
mssql_python.cursor.py: 84.7%🔗 Quick Links
|
After Saurabh's review, Rust bulkcopy now uses direct types (bool, usize) instead of Option<T>. Python must not pass None values - instead omit the kwarg and let PyO3 use its defaults.
…/microsoft/mssql-python into bewithgaurav/fix-bulkcopy-kwargs
| return True | ||
|
|
||
| def _bulkcopy( | ||
| self, table_name: str, data: Iterable[Union[Tuple, List]], **kwargs |
There was a problem hiding this comment.
Could you please provide more details in the PR summary for this change?
This API change may not be scalable in future as any new parameter addition need to go through the API contract change which is not advisable.
There was a problem hiding this comment.
The conversation around new API spec is happening in Github Discussions topic #414
I'll update the PR summary with the new spec details and link to the above discussion shortly
will ping once done
There was a problem hiding this comment.
done - updated, pls recheck and let me know if you have any comments
There was a problem hiding this comment.
I am still not convinced about moving out from **kwargs completly.
Can we group the parameters logically with couple of parameter?
There was a problem hiding this comment.
I am still not convinced about moving out from **kwargs completly. Can we group the parameters logically with couple of parameter?
Hi @subrata-ms, can you please be more specific with regard to why you're not convinced here? We discussed in the thread linked above and those of us on the end user side who decided to participate seemed to agree that this is a good design. As someone who has worked with SQL Server bulk copy libraries for over 20 years in a number of different languages I feel that this is the best possible option from a usability perspective: The parameters will be obvious and easily discovered by users. Introducing a required external object here won't be especially helpful and users will just have to jump through another hoop when making small changes. And while it does seem like there are a touch more parameters here than we usually see, the real concern should be forward maintainability -- but there is none. These basic parameters haven't changed in all of the time I've been working with SQL Server and I seriously doubt they ever will.
There was a problem hiding this comment.
Hi @amachanic , I would suggest, we discuss these API design approach options in the API Design Review forum so we can gather broader feedback. ( #414 ).
- batch_size: int = 0 (0 means server optimal) - timeout: int = 30 - column_mappings: Optional[Union[List[str], List[Tuple[int, str]]]] = None - All boolean flags: bool = False (not Optional[bool] = None) Updated docstring with two column_mappings formats: - Simple: List[str] - position = source index - Advanced: List[Tuple[int, str]] - explicit (index, column) mapping Simplified bulkcopy call to pass params directly (no kwargs dict) since Rust API now uses explicit defaults per Saurabh's review.
Work Item / Issue Reference
Summary
Implements Revision 2 of the BulkCopy API spec from community feedback in #414.
Replaces
**kwargswith explicit, type-hinted parameters for better IDE autocomplete and discoverabilityChanges
API Signature
What Changed
**kwargsOptional[bool] = Nonebool = Falsebatch_sizeOptional[int] = Noneint = 0(0 = server optimal)column_mappingsList[Tuple[Union[str, int], str]]Union[List[str], List[Tuple[int, str]]]Column Mappings
Per @amachanic's feedback, now supports simpler format:
Simple -
List[str]:Advanced -
List[Tuple[int, str]]: