
PostgreSQL unit test tries to access non-existent 'precision' attribute #1003

Closed
bkline opened this issue Jan 4, 2022 · 17 comments · Fixed by #1006

Comments

@bkline
Contributor

bkline commented Jan 4, 2022

Environment

  • Python: 3.10.1
  • pyodbc: built from current HEAD of master branch
  • OS: macOS 11.6.2 (for client) Debian 10.2.1-6 (for server)
  • DB: PostgreSQL 14.1
  • driver: psqlodbcw.so 13.02.0000 (supports ODBC version 03.51)
  • unixODBC 2.3.9

Issue

The test_columns() unit test in pgtests.py tries to access row.precision for each Row object in the result set returned by the call to cursor.columns(). This causes the test to fail with an error.

======================================================================
ERROR: test_columns (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 575, in test_columns
    assert row.precision == 3, row.precision
AttributeError: 'pyodbc.Row' object has no attribute 'precision'

According to the documentation, precision is not a column in the rows returned by cursor.columns().
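For anyone hitting this before a fix lands, a defensive accessor can paper over the naming difference. This is a hypothetical workaround sketch, not pyodbc's actual fix; the two attribute names probed are the only assumption:

```python
def get_column_size(row):
    """Return the column size from a cursor.columns() Row, regardless of
    whether the driver reports it under the ODBC 3 name ('column_size')
    or the older ODBC 2 name ('precision')."""
    for attr in ("column_size", "precision"):
        value = getattr(row, attr, None)
        if value is not None:
            return value
    raise AttributeError("row exposes neither 'column_size' nor 'precision'")
```

With a helper like this, the assertion in test_columns() could be written as `assert get_column_size(row) == 3` and would pass against either driver generation.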

@bkline
Contributor Author

bkline commented Jan 4, 2022

And for a bonus question: does anyone know why the name for the table schema value is table_schem instead of table_schema? I don't see any explanation in the documentation for this odd spelling, nor in the code comments.

@gordthompson
Collaborator

I am unable to reproduce your issue.

$ python3 tests3/pgtests.py -t test_columns -vv
Library: /home/gord/git/pyodbc/build/lib.linux-x86_64-3.8/pyodbc.cpython-38-x86_64-linux-gnu.so
/home/gord/git/pyodbc/tests3 --> /home/gord/git/pyodbc
python:  3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0]
pyodbc:  4.0.33b6+test_with_dsn /home/gord/git/pyodbc/build/lib.linux-x86_64-3.8/pyodbc.cpython-38-x86_64-linux-gnu.so
odbc:    03.52
driver:  psqlodbcw.so 12.01.0000
         supports ODBC version 03.51
os:      Linux
unicode: Py_Unicode=4 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
test_columns (__main__.PGTestCase) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.814s

OK

driver: freetds stable 1.3.6

Really? AFAIK PostgreSQL doesn't speak TDS.

@bkline
Contributor Author

bkline commented Jan 4, 2022

PostgreSQL doesn't speak TDS.

Sorry, that was copy/pasted from another ticket. I have fixed the description above.

% python3 tests3/pgtests.py -t test_columns -vv
Library: /Users/bkline/repos/pyodbc/build/lib.macosx-11-x86_64-3.10/pyodbc.cpython-310-darwin.so
/Users/bkline/repos/pyodbc/tests3 --> /Users/bkline/repos/pyodbc
python:  3.10.1 (main, Dec  6 2021, 23:20:29) [Clang 13.0.0 (clang-1300.0.29.3)]
pyodbc:  4.0.33b6 /Users/bkline/repos/pyodbc/build/lib.macosx-11-x86_64-3.10/pyodbc.cpython-310-darwin.so
odbc:    03.52
driver:  psqlodbcw.so 13.02.0000
         supports ODBC version 03.51
os:      Darwin
unicode: Py_Unicode=4 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
test_columns (__main__.PGTestCase) ... ERROR

======================================================================
ERROR: test_columns (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 575, in test_columns
    assert row.precision == 3, row.precision
AttributeError: 'pyodbc.Row' object has no attribute 'precision'

----------------------------------------------------------------------
Ran 1 test in 0.054s

FAILED (errors=1)

@bkline
Contributor Author

bkline commented Jan 4, 2022

All these years using the unittest module and I never knew about the -t testname option. 🤦

@gordthompson
Collaborator

Okay, so apparently it's not strictly a Python 3.10 thing:

$ python3.10 tests3/pgtests.py -t test_columns -vv
Library: /home/gord/git/pyodbc/build/lib.linux-x86_64-3.10/pyodbc.cpython-310-x86_64-linux-gnu.so
/home/gord/git/pyodbc/tests3 --> /home/gord/git/pyodbc
python:  3.10.1 (main, Dec 21 2021, 17:46:38) [GCC 9.3.0]
pyodbc:  4.0.33b7+test_with_dsn /home/gord/git/pyodbc/build/lib.linux-x86_64-3.10/pyodbc.cpython-310-x86_64-linux-gnu.so
odbc:    03.52
driver:  psqlodbcw.so 12.01.0000
         supports ODBC version 03.51
os:      Linux
unicode: Py_Unicode=4 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
test_columns (__main__.PGTestCase) ... ok

----------------------------------------------------------------------
Ran 1 test in 1.719s

OK

Can you run odbcinst -j to check your version of unixODBC?

@bkline
Contributor Author

bkline commented Jan 4, 2022

The version of unixODBC is 2.3.9.

@bkline
Contributor Author

bkline commented Jan 4, 2022

The documentation says:

columns(table=None, catalog=None, schema=None, column=None)
Creates a result set of column information in the specified tables using the SQLColumns function.

Each row has the following columns:

table_cat
table_schem
table_name
column_name
data_type
type_name
column_size
buffer_length
decimal_digits
num_prec_radix
nullable
remarks
column_def
sql_data_type
sql_datetime_sub
char_octet_length
ordinal_position
is_nullable: One of SQL_NULLABLE, SQL_NO_NULLS, SQL_NULLS_UNKNOWN.

# columns in table x
for row in cursor.columns(table='x'):
    print(row.column_name)

Why would the test be written to look for a column which—according to that documentation—shouldn't even be there?

@gordthompson
Collaborator

Why would the test be written to look for a column which—according to that documentation—shouldn't even be there?

Good question. Maybe different drivers deviate somewhat from the ODBC spec. It's certainly one of the columns returned by the PostgreSQL driver:

[ODBC][4745][1641328615.293688][SQLDescribeCol.c][504]
        Exit:[SQL_SUCCESS]                
            Column Name = [PRECISION]                
            Data Type = 0x7ffd53ceae2a -> 4                
            Column Size = 0x7ffd53ceae30 -> 10                
            Decimal Digits = 0x7ffd53ceae2c -> 0                
            Nullable = 0x7ffd53ceae2e -> 1

re: "Maybe different drivers deviate somewhat", above - apparently they do:

[pgtests.py]

        row = results['xΏz']
        assert row.type_name == 'varchar'
        assert row.precision == 4, row.precision

[sqlservertests.py]

        row = results['xΏz']
        assert row.type_name == 'varchar'
        assert row.column_size == 4, row.column_size

@bkline
Contributor Author

bkline commented Jan 4, 2022

Does this mean that the pyodbc documentation quoted above is only correct for some versions of some drivers?

@gordthompson
Collaborator

Does this mean that the pyodbc documentation quoted above is only correct for some versions of some drivers?

It seems so. It's probably based on what SQL Server ODBC does. For reference, here are the columns that PostgreSQL ODBC returns (for me, at least):

>>> import pyodbc
>>> cnxn = pyodbc.connect("DRIVER=PostgreSQL Unicode;SERVER=192.168.0.199;UID=scott;PWD=tiger;DATABASE=test", autocommit=True)
>>> crsr = cnxn.cursor()
>>> crsr.execute("CREATE TABLE t (id int primary key)")
<pyodbc.Cursor object at 0x7f6e39da1e30>
>>> col_info = crsr.columns("t")
>>> from pprint import pprint
>>> pprint([x[0] for x in col_info.description])
['table_qualifier',
 'table_owner',
 'table_name',
 'column_name',
 'data_type',
 'type_name',
 'precision',
 'length',
 'scale',
 'radix',
 'nullable',
 'remarks',
 'column_def',
 'sql_data_type',
 'sql_datetime_sub',
 'char_octet_length',
 'ordinal_position',
 'is_nullable',
 'display_size',
 'field_type',
 'auto_increment',
 'physical number',
 'table oid',
 'base typeid',
 'typmod',
 'table info']

@bkline
Contributor Author

bkline commented Jan 4, 2022

I would have expected that if pyodbc is documenting the names of the columns (which seems like the right thing to do, if the method is going to be at all useful), it would be taking care of mapping names used by lower layers to the documented names. That way (among other things) we could be sure that at least for a given version of the package, the documentation and the software would match each other.

I found what seems like a relevant commit for the psqlodbc driver, in which they say they are switching to "ODBC 3 column names for the result set of catalog functions." This commit was made in late August, so if the driver you're running is earlier than that, we may have found the explanation for why you see different results for the tests than I see.

@gordthompson
Collaborator

You're probably right. You are using "psqlodbcw.so 13.02.0000" while I'm using "psqlodbcw.so 12.01.0000".

@bkline
Contributor Author

bkline commented Jan 4, 2022

Sure enough, the release you're running (12.01.0000) was tagged almost exactly two years ago. The release I'm running (13.02.0000) was released in September (a month after the commit I linked to above).

@bkline
Contributor Author

bkline commented Jan 4, 2022

It's not 100% clear that I'm reading the pipeline test logs correctly, but if I am, the pipeline is running with release 11.01.0000 of psqlodbc, tagged May 2019.

@gordthompson
Collaborator

I would have expected that if pyodbc is documenting the names of the columns (which seems like the right thing to do, if the method is going to be at all useful), it would be taking care of mapping names used by lower layers to the documented names. That way (among other things) we could be sure that at least for a given version of the package, the documentation and the software would match each other.

The problem there is that ODBC drivers apparently feel free to change names and throw in any number of extra columns that may or may not exist in other implementations. (For example, there are 18 pyodbc-documented columns and psqlodbcw.so 12.01.0000 returns 26 columns.) Trying to document all possible variants would be … tedious. :)

I found what seems like a relevant commit for the psqlodbc driver, in which they say they are switching to "ODBC 3 column names for the result set of catalog functions." This commit was made in late August, so if the driver you're running is earlier than that, we may have found the explanation for why you see different results for the tests than I see.

Excellent detective work there, BTW.

@bkline
Contributor Author

bkline commented Jan 4, 2022

TL;DR: The tests should be verifying that the behavior of the software matches the documentation.

The problem there is that ODBC drivers apparently feel free to change names ....

Absolutely! That's the very reason that pyodbc should present a well-documented method which exposes as stable a list of properties/names as possible. In this case, the names exposed by ODBC 3 seem like the logical choice. Or, if the currently documented names don't already match the ODBC 3 names, another reasonable option would be to stick with the names currently in the documentation (with annotations pointing out how the names exposed by pyodbc correspond to the ODBC 3 names). Yes, we have absolutely no control over when the lower layers change the names they use. But we can provide a consistent, correctly documented interface to those names/properties.

Trying to document all possible variants would be … tedious. :)

If you thought I was suggesting that we document all of the variants, then I didn't do a good enough job of explaining what I was hoping for. When I wrote "... it [pyodbc] should be mapping names used by lower layers to the documented names" I meant that the documented names should be the same regardless of what names are used by the drivers. The mapping would happen in the code. For example (in pseudo-code):

ASK THE DRIVER FOR THE COLUMN PROPERTIES
IF THE DRIVER GAVE US A PROPERTY NAMED 'column_size':
    MAP THE VALUE TO `column_size`
OTHERWISE, IF THE DRIVER GAVE US A PROPERTY NAMED 'precision':
    MAP THE VALUE TO `column_size`
....
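Concretely, the pseudo-code could be sketched like this (illustrative only; the alias table, function name, and dict-based input are invented for the example, not pyodbc internals):

```python
# Map driver-specific catalog column names onto the documented names,
# so user code always sees a stable interface. The alias tuples below
# are examples, not an exhaustive survey of driver behavior.
DOCUMENTED_ALIASES = {
    "column_size": ("column_size", "precision", "length"),
    "table_schem": ("table_schem", "table_owner"),
    "decimal_digits": ("decimal_digits", "scale"),
}

def normalize(row_dict):
    """row_dict maps the driver's raw column names to values; the result
    uses only the documented names, taking the first alias present."""
    normalized = {}
    for documented, candidates in DOCUMENTED_ALIASES.items():
        for name in candidates:
            if name in row_dict:
                normalized[documented] = row_dict[name]
                break
    return normalized
```

The point is that the lookup order encodes the preference (ODBC 3 name first, legacy names as fallbacks), so a driver upgrade that renames a column doesn't change what callers see.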

If that represents an unacceptably high level of effort (which it may well; this is a volunteer-driven, open-source project, after all), then it would be preferable for the documentation to avoid listing any column names to be found in the Row objects, and to say instead something along the lines of ...

This package offers a non-standard columns() method to provide the properties for each column in a table as exposed by the underlying drivers, using the property names returned by those drivers. It is the responsibility of user code to dynamically discover which names are used for the columns of the returned Row objects. For example ....
A more standard, fully documented interface for a subset of this information is provided in the Cursor object's description attribute.

And (assuming this is the road we go down) the unit tests should be testing that description attribute, not volatile properties whose names can change at the whim of the driver vendors. We absolutely don't want the unit tests to be at the mercy of what the driver vendors do, to the extent we can avoid it.
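As a sketch of what such a test might lean on: DB-API 2.0 defines cursor.description as a sequence of 7-item tuples (name, type_code, display_size, internal_size, precision, scale, null_ok), so a helper keyed on those fixed positions is immune to driver naming. The helper name and the sample tuples here are invented for illustration:

```python
def find_column(description, column_name):
    """Locate a column's 7-item DB-API description tuple by name."""
    for entry in description:
        if entry[0] == column_name:
            return entry
    raise LookupError(f"no column named {column_name!r}")
```

A test could then assert on, for example, `find_column(cursor.description, name)[3]` (internal_size) rather than on a catalog column whose name varies by driver.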

I would further recommend that all of the methods and properties which are not part of the Python DB-API 2.0 be clearly identified as package-specific extensions.

To be clear, none of the projects I work on use ODBC to talk with PostgreSQL. But when I submit a PR I would like to know that all of the unit tests pass, including DBMS-specific tests for stacks I don't necessarily use myself, when running under code which I have modified. It's not good enough that the tests happen to pass for the stacks used by the CI-pipeline.

Excellent detective work there, BTW.

Thanks! 😅

@bkline
Contributor Author

bkline commented Jan 5, 2022

Confirming that the test_columns test which fails running with psqlodbcw.so 13.02.0000 passes when I point to a freshly-built psqlodbcw.so 12.01.0000, with everything else in the stack identical.

gordthompson added a commit to gordthompson/pyodbc that referenced this issue Jan 6, 2022
v-makouz pushed a commit to v-makouz/pyodbc that referenced this issue Sep 9, 2022
* Add support for Python 3.10, drop EOL 3.5 (mkleehammer#952)

* Remove duplicate entry in pyi stub (mkleehammer#979)

* Replace deprecated SafeConfigParser with ConfigParser (mkleehammer#953)

* Designate connection string as optional (mkleehammer#987)

* Fix spelling typos (mkleehammer#985)

Co-authored-by: Gord Thompson <gord@gordthompson.com>

* Fix for DSN Names with non-ASCII chars (mkleehammer#951)

* Fix for DSN Names with non-ASCII chars

Fixes: mkleehammer#948

Co-authored-by: bamboo <bamboo@galaxy.ad>
Co-authored-by: Gord Thompson <gordon.d.thompson@gmail.com>

* Added InterfaceError to pyodbc.pyi. (mkleehammer#1013)

Co-authored-by: Benjamin Holder <bholder@rpagency.com>

* Upgrade deprecated unicode encoding calls (mkleehammer#792)

* Do not include .pyc artifacts in source tarball mkleehammer#742

* Build wheels with cibuildwheels on GitHub Actions

Fixes mkleehammer#175
Ref mkleehammer#688
Closes mkleehammer#668
Closes mkleehammer#685
Fixes mkleehammer#441 and pretty much most issues that mention
` sql.h: No such file or directory`

This also need to setup some PyPI keys for automated uploads.

* Install unixodbc-dev for Linux wheels

* Enable GitHub Actions for pull requests

* Use Debian based `manylinux_2_24` image

* `apt-get` update before installing in wheel build

* Use PEP 440 version name required for wheels

* Skip building 32-bit wheels

* 4.0.dev0 for default version, because test_version() wants 3 parts here

Checked this won't shadow released minor version (credit goes to @hugovk)

    >>> from packaging.version import Version
    >>> Version("4.0.dev0") > Version("4.0.24")
    False

* Had to use Debian image for PyPy too

* Disable PyPy wheels

https://cibuildwheel.readthedocs.io/en/stable/options/#build-selection

PyPy is missing some C functions that `pyodbc` needs.

* Update README.md

* Avoid error when testing with DSN= connection

Fixes: mkleehammer#1000

* Disable setencoding/setdecoding in tests3/pgtests.py

Fixes: mkleehammer#1004

* Adjust test_columns() in tests3/pgtests.py for newer driver versions

Fixes: mkleehammer#1003

* Move driver version check out of function

* Add comment to _get_column_size()

* Fix memory leak with decimal parameters

Fixes: mkleehammer#1026

* Create codeql-analysis.yml

* Bugfix/sql param data memory leak (mkleehammer#703)

* Updated .gitignore

* * Created a test file for the specific scenario

* * Updated doc of test file for the specific SQLParamData scenario

* * Fixed the test file for the specific SQLParamData scenario by Py_XDECREF the PyObject with 1 reference.

* * Improved the test to close the cursor and set it to None, then forcing the gc

* * Changed the fix of the memory leak and updated the test.

* * Removed redundant empty line

* * Converted tabs to spaces

* * Moved variable out of conn's scope

* Update gitignore, remove duplicated

* Replace deprecated PyUnicode_FromUnicode(NULL, size) calls (mkleehammer#998)

Current versions of Python write a deprecation warning message to
stderr, which breaks CGI scripts running under web servers which
fold stderr into stdout. Likely breaks other software. This change
replaces the deprecated calls with PyUnicode_New(size, maxchar).
The accompanying code to populate the new objects has also been
rewritten to use the new PyUnicode APIs.

* Making pyodbc compatible with PostgreSQL infinity dates, returning MINYEAR and MAXYEAR to python, instead of values out of python's limits

* Removing autoformat from code

* Removing autoformat from code

* Add odbc_config support on mac and m1 homebrew dir

* Note EOL of 2.7 support in README (mkleehammer#945)

* Fix version of CI generated wheels

The CI system is checking out exact tags like "git checkout 4.0.33", which results in a
detached HEAD.  The version calculation was adding the commit hash.

* Fix for mkleehammer#1082 libraries in Linux wheels (mkleehammer#1084)

* use argparse instead of optparse (mkleehammer#1089)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Alex Nelson <alexander.nelson@nist.gov>
Co-authored-by: Kian Meng, Ang <kianmeng.ang@gmail.com>
Co-authored-by: Gord Thompson <gord@gordthompson.com>
Co-authored-by: bamboo <bamboo@galaxy.ad>
Co-authored-by: Gord Thompson <gordon.d.thompson@gmail.com>
Co-authored-by: bdholder <benjamin.holder@rocketmail.com>
Co-authored-by: Benjamin Holder <bholder@rpagency.com>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Co-authored-by: Michael Fladischer <FladischerMichael@fladi.at>
Co-authored-by: Anatoli Babenia <anatoli@rainforce.org>
Co-authored-by: Francisco Morales <51379487+jose598@users.noreply.github.com>
Co-authored-by: Gord Thompson <gordthompson@users.noreply.github.com>
Co-authored-by: Michael Kleehammer <michael@kleehammer.com>
Co-authored-by: Gilad Leifman <leifmangilad@gmail.com>
Co-authored-by: Bob Kline <bkline@rksystems.com>
Co-authored-by: Leandro Scott <lsrzj@yahoo.com>
Co-authored-by: Jordan Mendelson <jordan@zenzen.org>
Co-authored-by: Keith Erskine <toastie604@gmail.com>