Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Type cleanup #1628

Merged
merged 19 commits into from
Jul 22, 2021
Merged

Type cleanup #1628

merged 19 commits into from
Jul 22, 2021

Conversation

mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Jul 15, 2021

Background:

type::TypeId has been another bee in my bonnet like TransientValue was, which was also a stopgap as other parts of the system were built without the execution engine or optimizer yet. As of right now we have about 5 different type definitions in the system: PostgresValueType in the network layer, Peloton-like type::TypeId in /include/type, other Postgres types in the parser, and SqlTypeId and TypeId in /include/execution/sql.

Overview:

Remove type::TypeId and replace with execution::sql::SqlTypeId.

Changes:

  • Finishes what Remove type::TransientValue and related classes #911 started by removing the rest of the /type folder.
  • Remove TypeId::VAR_ARRAY. Doesn't seem like it was used and not a real type.
  • Remove ColumnMapInfo from ColumnMap in the SqlTable. This went in with TPL2 and stashed type info at the storage layer. That's a huge footgun for how we plan to do schema change in the future, and the catalog is the sole authority of transactional schema information.
  • CTEs stash their temp tables' schemas in the CatalogAccessor just like their temp tables. TableVectorIterator was relying on the ColumnMapInfo and now requests schema for all tables from CatalogAccessor, so this is transparent to the execution engine.
  • Remove insertion of INVALID type into pg_type since that's not a real type.
  • pg_type_impl uses the type sizes defined in sql.h/.cpp rather than hardcoded.
  • VarlenEntry moved to its own header file.

Future work:

  • This partially addresses Fix made up types #1183, but we still have SqlTypeId::Variadic which is used in pg_proc incorrectly. This will take a larger rewrite of pg_proc behavior to actually match Postgres (see pg_proc PROALLARGTYPES does not follow Postgres semantics. #1359). I tried to do it in this PR, but the blast radius gets bigger than I'd like for an already-large PR.
  • Discuss REAL type. TPL supports it, but the front-end doesn't and neither does the storage layer. Current behavior just does a magic trick and uses DOUBLE, but maybe we should explicitly tell the user they're not getting REALs?
  • Discuss renaming TPL's Real to Double since that's what it actually is, and it's confusing in the current context.
  • Discuss merging PostgresValueType with SqlTypeId to reduce type translation. This would bump the latter to being an int32_t and has a large blast radius.
  • Discuss fixing the whole "NULLs come in as INVALID type" until fixed much later by the binder and execution engine. Large blast radius on this one and not in the scope of this PR.

@mbutrovich mbutrovich added in-progress This PR is being actively worked on and not ready to be reviewed or merged. Mark PRs with this. best-practice Style fixes or refactor in the code base. Mark issues with this. ready-for-ci Indicate that this build should be run through CI. labels Jul 15, 2021
@mbutrovich mbutrovich self-assigned this Jul 15, 2021
@noisepage-checks
Copy link

Major Decrease in Performance

STOP: this PR has a major negative performance impact

tps (%change) benchmark_type wal_device details
0.01% tpcc RAM disk
Detailsmaster tps=22075.29, commit tps=22077.46, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.5% tpcc None
Detailsmaster tps=29682.29, commit tps=29236.31, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-1.84% tpcc HDD
Detailsmaster tps=21863.97, commit tps=21462.16, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
-0.89% tatp RAM disk
Detailsmaster tps=6691.93, commit tps=6632.34, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-8.34% tatp None
Detailsmaster tps=7611.21, commit tps=6976.28, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
-4.03% tatp HDD
Detailsmaster tps=6718.74, commit tps=6447.76, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@codecov
Copy link

codecov bot commented Jul 16, 2021

Codecov Report

Merging #1628 (964dcb4) into master (a628efe) will decrease coverage by 0.04%.
The diff coverage is 68.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1628      +/-   ##
==========================================
- Coverage   80.80%   80.76%   -0.05%     
==========================================
  Files         774      773       -1     
  Lines       55360    55420      +60     
==========================================
+ Hits        44736    44759      +23     
- Misses      10624    10661      +37     
Impacted Files Coverage Δ
src/catalog/index_schema.cpp 0.00% <0.00%> (ø)
src/common/strong_typedef.cpp 100.00% <ø> (ø)
src/include/binder/bind_node_visitor.h 100.00% <ø> (ø)
src/include/binder/binder_context.h 100.00% <ø> (ø)
src/include/catalog/catalog_accessor.h 100.00% <ø> (ø)
src/include/catalog/database_catalog.h 88.88% <ø> (ø)
src/include/execution/ast/context.h 100.00% <ø> (ø)
src/include/execution/compiler/codegen.h 100.00% <ø> (ø)
src/include/execution/exec/execution_context.h 96.42% <ø> (ø)
src/include/execution/sql/chaining_hash_table.h 91.91% <ø> (-1.02%) ⬇️
... and 122 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a628efe...964dcb4. Read the comment docs.

@noisepage-checks
Copy link

Major Decrease in Performance

STOP: this PR has a major negative performance impact

tps (%change) benchmark_type wal_device details
-2.0% tpcc RAM disk
Detailsmaster tps=22075.29, commit tps=21633.42, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-2.61% tpcc None
Detailsmaster tps=29682.29, commit tps=28908.03, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-3.8% tpcc HDD
Detailsmaster tps=21863.97, commit tps=21032.35, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
-0.74% tatp RAM disk
Detailsmaster tps=6691.93, commit tps=6642.08, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-8.91% tatp None
Detailsmaster tps=7611.21, commit tps=6932.97, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
-2.1% tatp HDD
Detailsmaster tps=6718.74, commit tps=6577.68, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@noisepage-checks
Copy link

Major Decrease in Performance

STOP: this PR has a major negative performance impact

tps (%change) benchmark_type wal_device details
-5.42% tpcc RAM disk
Detailsmaster tps=22645.95, commit tps=21418.96, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.65% tpcc None
Detailsmaster tps=29869.69, commit tps=29375.53, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-2.84% tpcc HDD
Detailsmaster tps=22296.69, commit tps=21662.53, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
-1.04% tatp RAM disk
Detailsmaster tps=6736.69, commit tps=6666.54, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-9.37% tatp None
Detailsmaster tps=7664.09, commit tps=6945.89, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
-0.96% tatp HDD
Detailsmaster tps=6587.19, commit tps=6524.18, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@noisepage-checks
Copy link

Major Decrease in Performance

STOP: this PR has a major negative performance impact

tps (%change) benchmark_type wal_device details
-4.29% tpcc RAM disk
Detailsmaster tps=22809.02, commit tps=21830.9, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
2.16% tpcc None
Detailsmaster tps=28697.74, commit tps=29316.21, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-1.06% tpcc HDD
Detailsmaster tps=21709.66, commit tps=21478.49, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
3.0% tatp RAM disk
Detailsmaster tps=6693.45, commit tps=6894.43, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-6.69% tatp None
Detailsmaster tps=7544.53, commit tps=7040.09, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
-0.57% tatp HDD
Detailsmaster tps=6588.52, commit tps=6550.96, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@noisepage-checks
Copy link

Major Decrease in Performance

STOP: this PR has a major negative performance impact

tps (%change) benchmark_type wal_device details
-2.5% tpcc RAM disk
Detailsmaster tps=22277.15, commit tps=21720.45, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.26% tpcc None
Detailsmaster tps=29725.95, commit tps=29350.44, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-2.14% tpcc HDD
Detailsmaster tps=21870.22, commit tps=21402.68, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
2.68% tatp RAM disk
Detailsmaster tps=6659.17, commit tps=6837.56, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-8.29% tatp None
Detailsmaster tps=7555.48, commit tps=6929.49, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
-0.14% tatp HDD
Detailsmaster tps=6630.08, commit tps=6620.86, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@mbutrovich mbutrovich removed the in-progress This PR is being actively worked on and not ready to be reviewed or merged. Mark PRs with this. label Jul 22, 2021
@mbutrovich mbutrovich added the ready-for-review This PR passes all checks and is ready to be reviewed. Mark PRs with this. label Jul 22, 2021
@mbutrovich
Copy link
Contributor Author

Ran TATP and YCSB read only with 20 terminals on dev9 and didn't see a performance change.

@mbutrovich mbutrovich merged commit 845aedf into cmu-db:master Jul 22, 2021
@mbutrovich mbutrovich deleted the nuke_type_id branch July 22, 2021 20:05
@mbutrovich mbutrovich mentioned this pull request Jul 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
best-practice Style fixes or refactor in the code base. Mark issues with this. ready-for-ci Indicate that this build should be run through CI. ready-for-review This PR passes all checks and is ready to be reviewed. Mark PRs with this.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant