Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type logic refactor #1222

Merged
merged 84 commits into from
May 24, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
84 commits
Select commit Hold shift + click to select a range
af4fa7c
add db type alias information to db types endpoint
dmos62 Mar 17, 2022
41c6623
fix set membership test
dmos62 Mar 17, 2022
9b294c8
Merge branch 'master' into add_alias_info_to_db_types_endpoint
dmos62 Mar 17, 2022
10bf354
Exploratory comments and minor refactor
dmos62 Mar 17, 2022
d644c37
Some more slight refactors and comments
dmos62 Mar 17, 2022
b52b3ab
+
dmos62 Mar 18, 2022
57cf327
Fix None filtering
dmos62 Mar 18, 2022
95595e3
Merge branch 'add_alias_info_to_db_types_endpoint' into some-type-log…
dmos62 Mar 18, 2022
d98271d
Add utilities to DatabaseType for looking up its ischema key
dmos62 Mar 18, 2022
9264c0d
Linter fixes
dmos62 Mar 18, 2022
9576f27
Minor refactor
dmos62 Mar 21, 2022
430c8af
Add get_sa_class method to DatabaseType
dmos62 Mar 21, 2022
f0517e8
Get rid of aliases information from DatabaseType
dmos62 Mar 21, 2022
754e793
Access ischema_names dict directly
dmos62 Mar 21, 2022
da659d3
Add DatabaseType::is_available_on_engine
dmos62 Mar 21, 2022
e5023cb
Delete get_available_types, some constants
dmos62 Mar 21, 2022
69fc2b2
Various type-related refactors
dmos62 Mar 23, 2022
c2357fe
Remove leftover aliases namespace
dmos62 Mar 23, 2022
0fd467e
Minor refactor
dmos62 Mar 23, 2022
3a73895
Linter fixes
dmos62 Mar 23, 2022
4a9aa6d
Merge branch 'master' into some-type-logic-refactor
dmos62 Mar 24, 2022
e614e2c
Remove other aliases remnants
dmos62 Mar 24, 2022
278d885
Minor changes
dmos62 Mar 24, 2022
50a4425
Various fixes
dmos62 Mar 24, 2022
5f3044a
Convert group of tests into a single parametrized test
dmos62 Mar 24, 2022
d39a436
Revert to using multiple tests
dmos62 Mar 24, 2022
898cd73
More refactoring
dmos62 Mar 25, 2022
4f4cb7a
Various refactors
dmos62 Mar 25, 2022
1ad0fb3
columns test_alter.py passing
dmos62 Mar 29, 2022
38156a4
A few more suites are passing; currently working on types test_cast
dmos62 Mar 31, 2022
dd8b004
Merge branch 'master' into some-type-logic-refactor
dmos62 Apr 14, 2022
6ff9d87
db/tests/types/ are passing
dmos62 Apr 21, 2022
ecb508d
db/tests/records/ is passing
dmos62 Apr 21, 2022
0b224ef
db/tests/tables/ is passing
dmos62 Apr 21, 2022
dab7c5c
add previously renamed file
dmos62 Apr 21, 2022
79db6be
db/tests/ passing
dmos62 Apr 21, 2022
29d792a
some initial changes to mathesar ns
dmos62 Apr 21, 2022
d0e8a8c
mathesar/tests/api/test_column_api.py passing
dmos62 Apr 27, 2022
d23a2d3
mathesar/tests/api/test_data_file_api.py is passing
dmos62 Apr 27, 2022
8752637
fix more mathesar namespace tests
dmos62 Apr 29, 2022
350e03a
All mathesar namespace tests are passing
dmos62 May 1, 2022
050534b
Fix last failing (non-integration) test
dmos62 May 1, 2022
7892e3e
Satisfy flake8
dmos62 May 1, 2022
88844cf
Merge branch 'master' into some-type-logic-refactor
dmos62 May 1, 2022
9047c2b
Refactor/fix number validation
dmos62 May 2, 2022
a81398c
Some post-merge test fixes
dmos62 May 2, 2022
6540b3b
Satisfy flake8
dmos62 May 2, 2022
beb365f
Undo accidental search and replace on a mathesar_ui file
dmos62 May 2, 2022
b2e988b
Fix whitespace
dmos62 May 2, 2022
cedd891
Remove dead code
dmos62 May 2, 2022
7f5d331
Typo
dmos62 May 2, 2022
8317427
Update integ. tests and fixtures
dmos62 May 3, 2022
d107bf8
Update frontend UI type configs
dmos62 May 3, 2022
902267a
Merge branch 'master' into some-type-logic-refactor
dmos62 May 3, 2022
58887d2
Have Dockerfile setting survive rebuilds
dmos62 May 5, 2022
952aeea
Fix some tests
dmos62 May 6, 2022
25c9f6f
Minor changes
dmos62 May 9, 2022
ff23257
Some merge conflict and bug fixes
dmos62 May 16, 2022
6c2ef2c
Some minor changes; enabled model reset for all mathesar tests
dmos62 May 16, 2022
d08a5af
Reduce fixture scope
dmos62 May 17, 2022
d33efb8
Add sqlalchemy_utils to requirements
dmos62 May 17, 2022
e0d9fd3
Make django test database short lived
dmos62 May 17, 2022
0873b11
Support xdist testing; optimize fixture scopes for performance
dmos62 May 19, 2022
6b5561f
Merge branch 'master' into some-type-logic-refactor
dmos62 May 20, 2022
cf911cc
Add xdist to requirements-dev.txt
dmos62 May 20, 2022
07fc8fc
Many merge conflict fixes and refactors
dmos62 May 20, 2022
3696692
Fix display options serializer
dmos62 May 23, 2022
f2cec65
Various fixes
dmos62 May 23, 2022
305556f
Merge branch 'master' into some-type-logic-refactor
dmos62 May 23, 2022
3bd9c7a
Remove logging from fixtures
dmos62 May 23, 2022
b4dc592
Minor changes
dmos62 May 23, 2022
eecdfae
Linter fixes
dmos62 May 23, 2022
8357b6e
Skip a hard-to-fix e2e test
dmos62 May 23, 2022
ec16cc3
Vulture fixes
dmos62 May 23, 2022
f8ff2cb
Linter fixes
dmos62 May 23, 2022
0708bc0
Add -n auto to setup.cfg so that tests run in parallel automatically
dmos62 May 23, 2022
dc03636
Disable xdist for e2e tests
dmos62 May 24, 2022
b133c4f
Get rid of non-canonical DatabaseTypes
dmos62 May 24, 2022
62fa079
Remove is_alias DatabaseType property
dmos62 May 24, 2022
7cdfa7c
Update E2E setup docs
seancolsen May 24, 2022
91adbb7
Use correct case for URI type
seancolsen May 24, 2022
bfcf549
Disable automatic test parallelization: fails with >7 threads
dmos62 May 24, 2022
80b368e
Merge branch 'some-type-logic-refactor' of github.com:centerofci/math…
dmos62 May 24, 2022
1b53d4c
Merge branch 'master' into some-type-logic-refactor
dmos62 May 24, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Exploratory comments and minor refactor
  • Loading branch information
dmos62 committed Mar 17, 2022
commit 10bf35471ac06e82af94d75aff16a907042f0126
35 changes: 20 additions & 15 deletions db/columns/operations/infer_types.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,36 +39,41 @@
}


def _get_reverse_type_map(engine):
supported_types = get_supported_alter_column_types(engine)
reverse_type_map = {v: k for k, v in supported_types.items()}
reverse_type_map.update(
def _get_type_classes_mapped_to_type_ids(engine):
"""
Returns SA type classes mapped to their type names/ids (as compiled by SA's PG dialect).
"""
supported_types = get_supported_alter_column_types(engine, friendly_names=True)
type_classes_to_ids = {v: k for k, v in supported_types.items()}
# NOTE: below dict merge seems to add some meta-entries to this map, which later, in
# infer_column_type, are used to leverage recursion.
type_classes_to_ids.update(
{
Text: base.STRING,
TEXT: base.STRING,
VARCHAR: base.STRING,
}
)
return reverse_type_map
return type_classes_to_ids


def infer_column_type(schema, table_name, column_name, engine, depth=0, type_inference_dag=TYPE_INFERENCE_DAG):
if depth > MAX_INFERENCE_DAG_DEPTH:
raise DagCycleError("The type_inference_dag likely has a cycle")
reverse_type_map = _get_reverse_type_map(engine)
type_classes_to_ids = _get_type_classes_mapped_to_type_ids(engine)

table = reflect_table(table_name, schema, engine)
column_type = table.columns[column_name].type.__class__
column_type_str = reverse_type_map.get(column_type)
column_type_class = table.columns[column_name].type.__class__
column_type_id = type_classes_to_ids.get(column_type_class)

logger.debug(f"column_type_str: {column_type_str}")
logger.debug(f"column_type_id: {column_type_id}")
table_oid = get_oid_from_table(table_name, schema, engine)
for type_str in type_inference_dag.get(column_type_str, []):
for type_id in type_inference_dag.get(column_type_id, []):
try:
with engine.begin() as conn:
alter_column_type(table_oid, column_name, engine, conn, type_str)
logger.info(f"Column {column_name} altered to type {type_str}")
column_type = infer_column_type(
alter_column_type(table_oid, column_name, engine, conn, type_id)
logger.info(f"Column {column_name} altered to type {type_id}")
column_type_class = infer_column_type(
schema,
table_name,
column_name,
Expand All @@ -81,6 +86,6 @@ def infer_column_type(schema, table_name, column_name, engine, depth=0, type_inf
# a type is appropriate for a column fails.
except DatabaseError:
logger.info(
f"Cannot alter column {column_name} to type {type_str}"
f"Cannot alter column {column_name} to type {type_id}"
)
return column_type
return column_type_class
3 changes: 3 additions & 0 deletions db/types/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@


CHAR = 'char'
# NOTE: 'string' doesn't seem to be a Postgres type identifier
# is STRING's purpose only to act as a meta-type in the recursive
# db.columns.operations.infer_types::infer_column_type?
STRING = 'string'
VARCHAR = 'varchar'

Expand Down
111 changes: 79 additions & 32 deletions db/types/operations/cast.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@
from db.types import base, email, money, multicurrency, uri
from db.types.exceptions import UnsupportedTypeException

# TODO it would be ideal to import these types regularly (from PostgresType import BIGINT, ...),
# and to pass around Enum instances, not strings.

# DB type name strings
BIGINT = base.PostgresType.BIGINT.value
BOOLEAN = base.PostgresType.BOOLEAN.value
Expand All @@ -21,6 +24,7 @@
TEXT = base.PostgresType.TEXT.value

# one-off strings representing keys in ischema_names
# TODO consider formalizing these as Enums, like PostgresType or MathesarCustomType
CHAR = base.CHAR
STRING = base.STRING
VARCHAR = base.VARCHAR
Expand All @@ -29,11 +33,12 @@
EMAIL = base.MathesarCustomType.EMAIL.value
MATHESAR_MONEY = base.MathesarCustomType.MATHESAR_MONEY.value
MULTICURRENCY_MONEY = base.MathesarCustomType.MULTICURRENCY_MONEY.value
URI = base.MathesarCustomType.URI.value

TIME_WITHOUT_TIME_ZONE = base.PostgresType.TIME_WITHOUT_TIME_ZONE.value
TIME_WITH_TIME_ZONE = base.PostgresType.TIME_WITH_TIME_ZONE.value
TIMESTAMP_WITH_TIME_ZONE = base.PostgresType.TIMESTAMP_WITH_TIME_ZONE.value
TIMESTAMP_WITHOUT_TIME_ZONE = base.PostgresType.TIMESTAMP_WITHOUT_TIME_ZONE.value
URI = base.MathesarCustomType.URI.value

# only needed for ischema lookup
FULL_VARCHAR = base.PostgresType.CHARACTER_VARYING.value
Expand All @@ -48,7 +53,19 @@
MONEY_ARR_FUNC_NAME = "get_mathesar_money_array"


def get_supported_alter_column_types(engine, friendly_names=True):
# TODO is it safe to have type names in two forms: friendly and unfriendly?
# NOTE: you could say that there are two distinct control flow branches in this
# method (practically two methods):
# both return only types available on the engine;
# both return a map whose values are SA type classes;
# friendly_names being true returns a map where keys are "canonical" (or "friendly") type names
# (according to some definition of canonical and friendly);
# friendly_names being false returns a map where keys are type names, as compiled by SA's PG
# dialect.
#
# TODO it's not obvious that the type set returned is column-alteration-specific; maybe change
# name or refactor to be more general?
def get_supported_alter_column_types(engine, friendly_names):
"""
Returns a list of valid types supported by mathesar for the given engine.

Expand All @@ -57,42 +74,72 @@ def get_supported_alter_column_types(engine, friendly_names=True):
friendly_names: sets whether to use "friendly" service-layer or the
actual DB-layer names.
"""
dialect_types = base.get_available_types(engine)
available_type_ids_to_classes = base.get_available_types(engine)

# TODO is this map about aliases?
# we've a dedicated namespace for aliases now, maybe leverage that.
#
# NOTE: below map's keys will only be used in the friendly_names=True case
# NOTE: below map's values seem to be "ischema_names"-friendly, as opposed to the keys.
# for example, VARCHAR is mapped to FULL_VARCHAR (CHARACTER_VARYING); I presume that's because
# VARCHAR is not a key in ischema_names.
#
# NOTE: below map's roles seem to be
# - map aliases to canonicals (VARCHAR -> CHARACTER_VARYING)
# - map mathesar custom type id to its qualified id (EMAIL -> MATHESAR_TYPES.EMAIL)
# - the mystical STRING -> NAME
# TODO this can be refactored into explicit steps.
# that will also solve the ambiguous name `friendly_type_map`.
friendly_type_map = {
# Default Postgres types
BIGINT: dialect_types.get(BIGINT),
BOOLEAN: dialect_types.get(BOOLEAN),
CHAR: dialect_types.get(FULL_CHAR),
DATE: dialect_types.get(DATE),
DECIMAL: dialect_types.get(DECIMAL),
DOUBLE_PRECISION: dialect_types.get(DOUBLE_PRECISION),
FLOAT: dialect_types.get(FLOAT),
INTEGER: dialect_types.get(INTEGER),
INTERVAL: dialect_types.get(INTERVAL),
MONEY: dialect_types.get(MONEY),
NUMERIC: dialect_types.get(NUMERIC),
REAL: dialect_types.get(REAL),
SMALLINT: dialect_types.get(SMALLINT),
STRING: dialect_types.get(NAME),
TEXT: dialect_types.get(TEXT),
TIME_WITHOUT_TIME_ZONE: dialect_types.get(TIME_WITHOUT_TIME_ZONE),
TIME_WITH_TIME_ZONE: dialect_types.get(TIME_WITH_TIME_ZONE),
TIMESTAMP_WITH_TIME_ZONE: dialect_types.get(TIMESTAMP_WITH_TIME_ZONE),
TIMESTAMP_WITHOUT_TIME_ZONE: dialect_types.get(TIMESTAMP_WITHOUT_TIME_ZONE),
VARCHAR: dialect_types.get(FULL_VARCHAR),
BIGINT: BIGINT,
BOOLEAN: BOOLEAN,
CHAR: FULL_CHAR,
DATE: DATE,
DECIMAL: DECIMAL,
DOUBLE_PRECISION: DOUBLE_PRECISION,
FLOAT: FLOAT,
INTEGER: INTEGER,
INTERVAL: INTERVAL,
MONEY: MONEY,
NUMERIC: NUMERIC,
REAL: REAL,
SMALLINT: SMALLINT,
# TODO what is the logic behind mapping (I presume) SA's string type to PG name type?
STRING: NAME,
TEXT: TEXT,
TIME_WITHOUT_TIME_ZONE: TIME_WITHOUT_TIME_ZONE,
TIME_WITH_TIME_ZONE: TIME_WITH_TIME_ZONE,
TIMESTAMP_WITH_TIME_ZONE: TIMESTAMP_WITH_TIME_ZONE,
TIMESTAMP_WITHOUT_TIME_ZONE: TIMESTAMP_WITHOUT_TIME_ZONE,
VARCHAR: FULL_VARCHAR,
# Custom Mathesar types
EMAIL: dialect_types.get(email.DB_TYPE),
MATHESAR_MONEY: dialect_types.get(money.DB_TYPE),
MULTICURRENCY_MONEY: dialect_types.get(multicurrency.DB_TYPE),
URI: dialect_types.get(uri.DB_TYPE),
EMAIL: email.DB_TYPE,
MATHESAR_MONEY: money.DB_TYPE,
MULTICURRENCY_MONEY: multicurrency.DB_TYPE,
URI: uri.DB_TYPE,
}
# NOTE: below available_type_ids_to_classes.get calls may return None.
# one case where that will happen is if it's a custom MA type
# that's not installed on the database.
friendly_type_map = {
k: available_type_ids_to_classes.get(v)
for k, v
in friendly_type_map.items()
if v is not None
}
# NOTE: friendly_names decides what the keys of the resulting map will be
if friendly_names:
type_map = {k: v for k, v in friendly_type_map.items() if v is not None}
type_map = {
k: v
for k, v
in friendly_type_map.items()
}
else:
type_map = {
val().compile(dialect=engine.dialect): val
for val in friendly_type_map.values()
if val is not None
v().compile(dialect=engine.dialect): v
for v
in friendly_type_map.values()
}
return type_map

Expand All @@ -101,7 +148,7 @@ def get_supported_alter_column_db_types(engine):
return set(
[
type_().compile(dialect=engine.dialect)
for type_ in get_supported_alter_column_types(engine).values()
for type_ in get_supported_alter_column_types(engine, friendly_names=True).values()
]
)

Expand Down