-
Notifications
You must be signed in to change notification settings - Fork 19
[DB-40533] Add support for new type VECTOR(<dim>,DOUBLE) #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Add encode/decode support for new VECTOR datatype. A result column of VECTOR(<dim>, DOUBLE) shows as type datatype.VECTOR_DOUBLE in the metadata description. It is returned as datatype.Vector with subtype datatype.Vector.DOUBLE which can be used as a list. For a parameter to be detected as VECTOR(<dim>,DOUBLE) it also needs to use the datatype.Vector datatype. Alternatively, binding a string is also possible - which currently includes binding a list of numbers which will then be converted to string. This commit also adds tests for the described behavior.
- remove whitespace line - add documentation
|
||
# the actual vector: Each value as double in little endian encoding | ||
for val in value: | ||
self.__output += struct.pack('<d', float(val)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work if the value is NULL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A NULL should be set via putNull
, not putVectorDouble
.
But assuming someone would put some kind of NULL-value in here: What NULL representation would you expect there? I'll try to outline a few:
- If value is None this path will not be taken (we only reach here through
putValue
below which callsputVector
nly for instances of Vector - If value is an empty list this would encode an empty list. I did not include that in the tests, but I am pretty sure that would trigger a server side error with a VECTOR(0) not being supported or the like. Could add a test and ensure it works correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider my comments on detecting the server version in the test.
Otherwise it looks OK to me.
@@ -149,10 +151,37 @@ def __cmp__(self, other): | |||
return -1 | |||
|
|||
|
|||
class Vector(list): | |||
"""A specific type for SQL VECTOR(<dim>, DOUBLE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not quite right, is it? A Vector element doesn't have to be a DOUBLE (I mean, it does now but it might not in the future)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. It is supposed to be usable with other element types as well. Will fix that.
# only activate this tests if tested against version 8 or above | ||
cursor.execute("select cast(substring_index(release_ver, '.', 1) as int)" | ||
" from system.nodes limit 1") | ||
row = cursor.fetchone() | ||
database_major_version = row[0] | ||
if database_major_version < 8: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a problem with this, per se.
I think it would be cleaner to obtain this information once somewhere in conftest.py and make it available there, for example maybe in the DATABASE_FIXTURE since that already creates a connection.
The current DATABASE_FIXTURE is the connection arguments but you could add this there and then extract it again in the nuodb_base.NuoBase._setup().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into that. If it isn't extremely hard (which I doubt) I'll do that.
cursor.execute("select cast(substring_index(release_ver, '.', 1) as int)" | ||
" from system.nodes limit 1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be safer and easier to use geteffectiveversion() and compare based on that integer value, rather than parsing the version string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just didn't know we have that function. Much better idea to use that.
Add encode/decode support for new VECTOR datatype.
A result column of VECTOR(, DOUBLE) shows as type datatype.VECTOR_DOUBLE in the metadata description.
It is returned as datatype.Vector with subtype
datatype.Vector.DOUBLE which can be used as a list.
For a parameter to be detected as VECTOR(,DOUBLE) it also needs to use the datatype.Vector datatype. Alternatively, binding a string is also possible - which currently includes binding a list of numbers which will then be converted to string.
This commit also adds tests for the described behavior.