This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Support training models from psql SET commands #1585

Merged: 3 commits into cmu-db:master on May 24, 2021


17zhangw
Member

Description

This PR adds support for training the OU model, the interference model, and the forecast model from psql. It exposes three dedicated train_X_model knobs, each of which can be triggered with the SET command.

The PR also introduces the following settings (some of which could possibly be coalesced). They can likewise be altered from psql via the SET command to adjust how the models are trained:

  • interference_model_input_path: path to the directory containing the data used to train the interference model
  • interference_model_train_methods: comma-delimited list of methods used to train the interference model (e.g., "rf")
  • interference_model_train_timeout: timeout for training the interference model
  • interference_model_pipeline_sample_rate: sampling rate of pipeline metric OUs for the interference model
  • ou_model_input_path: path to the directory containing the data used to train the OU model
  • ou_model_train_methods: comma-delimited list of methods used to train the OU model (e.g., "lr,rf")
  • ou_model_train_timeout: timeout for training the OU model
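
As a rough illustration of how these settings might be driven from a script, the sketch below builds the SET statements for an OU model training run as plain strings, which could then be pasted into psql. The ou_model_* setting names come from the list above. The train_ou_model knob name is inferred from the train_X_model pattern, and the example path, the timeout value and its unit, and the 'true' trigger value are all assumptions rather than documented behavior.

```python
from typing import List, Sequence


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_ou_training_statements(
    input_path: str, methods: Sequence[str], timeout: int
) -> List[str]:
    """Return SET statements that configure and then trigger OU model training.

    The setting names are taken from the PR description; the values and the
    trigger knob (train_ou_model) are assumptions made for illustration.
    """
    return [
        f"SET ou_model_input_path = {quote_literal(input_path)};",
        # Methods are passed as one comma-delimited string, e.g. "lr,rf".
        f"SET ou_model_train_methods = {quote_literal(','.join(methods))};",
        # The unit of the timeout is not stated in the PR description.
        f"SET ou_model_train_timeout = {int(timeout)};",
        # Assumed knob name and value, following the train_X_model pattern.
        "SET train_ou_model = 'true';",
    ]


if __name__ == "__main__":
    # Hypothetical path and timeout, chosen only for the example.
    for stmt in build_ou_training_statements("/tmp/ou_data", ["lr", "rf"], 120000):
        print(stmt)
```

Issuing the configuration statements before the trigger keeps the run from starting with stale settings, assuming the trigger reads the settings at the time it fires.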

17zhangw self-assigned this on May 20, 2021
17zhangw added the in-progress and ready-for-ci labels on May 20, 2021
@noisepage-checks

Minor Decrease in Performance

Be warned: this PR may have decreased the throughput of the system slightly.

tps (%change) benchmark_type wal_device details
-2.37% tpcc RAM disk
Details: master tps=22769.53, commit tps=22230.27, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
1.12% tpcc None
Details: master tps=28567.15, commit tps=28887.68, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
1.01% tpcc HDD
Details: master tps=21726.02, commit tps=21945.97, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
7.26% tatp RAM disk
Details: master tps=6504.48, commit tps=6977.0, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
-0.19% tatp None
Details: master tps=7512.73, commit tps=7498.8, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
5.97% tatp HDD
Details: master tps=6436.91, commit tps=6821.09, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32
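
The percentages in these reports appear to be the relative change of the commit's throughput against master, i.e. (commit tps - master tps) / master tps. That formula is inferred from the numbers shown here, not taken from the bot's source code. A minimal sketch, with a hypothetical function name:

```python
def tps_percent_change(master_tps: float, commit_tps: float) -> float:
    """Relative throughput change of the commit versus master, in percent.

    Positive values mean the commit is faster than master. The formula is
    inferred from the figures in the report, not from the bot itself.
    """
    return round((commit_tps - master_tps) / master_tps * 100.0, 2)


if __name__ == "__main__":
    # tpcc on RAM disk, using the master/commit values from the report above.
    print(tps_percent_change(22769.53, 22230.27))
    # tatp on RAM disk.
    print(tps_percent_change(6504.48, 6977.0))
```

Under this reading, a negative value is a throughput drop and a positive value is a gain, regardless of the report's headline.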

@codecov

codecov bot commented May 20, 2021

Codecov Report

Merging #1585 (2182dc9) into master (33bc907) will decrease coverage by 0.09%.
The diff coverage is 14.92%.


@@            Coverage Diff             @@
##           master    #1585      +/-   ##
==========================================
- Coverage   81.75%   81.66%   -0.10%     
==========================================
  Files         739      739              
  Lines       52045    52112      +67     
==========================================
+ Hits        42550    42556       +6     
- Misses       9495     9556      +61     
Impacted Files Coverage Δ
src/include/self_driving/planning/pilot.h 0.00% <0.00%> (ø)
src/include/settings/settings_manager.h 100.00% <ø> (ø)
src/settings/settings_callbacks.cpp 39.44% <0.00%> (-16.91%) ⬇️
src/include/settings/settings_defs.h 100.00% <100.00%> (ø)
src/include/storage/block_access_controller.h 88.23% <0.00%> (-5.89%) ⬇️
src/network/network_io_wrapper.cpp 82.25% <0.00%> (-3.23%) ⬇️
src/storage/index/hash_index.cpp 89.87% <0.00%> (-1.27%) ⬇️
src/include/execution/sql/chaining_hash_table.h 91.91% <0.00%> (-1.02%) ⬇️
src/storage/arrow_serializer.cpp 86.25% <0.00%> (-0.63%) ⬇️
src/include/main/db_main.h 89.40% <0.00%> (-0.24%) ⬇️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 33bc907...2182dc9.

17zhangw requested a review from linmagit on May 20, 2021, 21:57
17zhangw added the ready-for-review label and removed the in-progress label on May 20, 2021
Member

linmagit left a comment

Looks good! I'm just wondering what would happen if we specify something like --train_forecast_model when we start the system. At the very least, we should make sure it doesn't crash. Beyond that, will the system train the model or just ignore the flag? Either way is probably fine, but we should document the behavior.

@noisepage-checks

Minor Decrease in Performance

Be warned: this PR may have decreased the throughput of the system slightly.

tps (%change) benchmark_type wal_device details
-1.44% tpcc RAM disk
Details: master tps=22554.9, commit tps=22230.27, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-0.06% tpcc None
Details: master tps=28904.05, commit tps=28887.68, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-0.03% tpcc HDD
Details: master tps=21953.18, commit tps=21945.97, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
7.34% tatp RAM disk
Details: master tps=6500.06, commit tps=6977.0, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
1.05% tatp None
Details: master tps=7421.06, commit tps=7498.8, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
5.81% tatp HDD
Details: master tps=6446.34, commit tps=6821.09, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@noisepage-checks

Minor Decrease in Performance

Be warned: this PR may have decreased the throughput of the system slightly.

tps (%change) benchmark_type wal_device details
-0.51% tpcc RAM disk
Details: master tps=22345.3, commit tps=22230.27, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.17% tpcc None
Details: master tps=29229.71, commit tps=28887.68, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-0.68% tpcc HDD
Details: master tps=22096.96, commit tps=21945.97, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
8.53% tatp RAM disk
Details: master tps=6428.73, commit tps=6977.0, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
2.75% tatp None
Details: master tps=7298.16, commit tps=7498.8, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
8.05% tatp HDD
Details: master tps=6312.8, commit tps=6821.09, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@17zhangw
Member Author

Documented the behavior. We just ignore it if it is specified during startup.
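
One way to picture the "ignored at startup" behavior is the sketch below. It is purely hypothetical: the class, its methods, and the startup flag are illustrative names, not NoisePage's actual settings callback code.

```python
from typing import Callable


class TrainKnob:
    """Hypothetical stand-in for a train_X_model setting and its callback."""

    def __init__(self, train_fn: Callable[[], None]) -> None:
        self._train_fn = train_fn
        self._startup_done = False
        self.runs = 0  # number of training runs actually triggered

    def finish_startup(self) -> None:
        """Mark the system as fully started; later SETs may trigger training."""
        self._startup_done = True

    def set(self, value: bool) -> bool:
        """Return True if training was triggered, False if the request was ignored."""
        if not value or not self._startup_done:
            # A value supplied during startup (e.g. on the command line)
            # is silently ignored instead of crashing or training.
            return False
        self._train_fn()
        self.runs += 1
        return True


if __name__ == "__main__":
    knob = TrainKnob(lambda: print("training forecast model"))
    print(knob.set(True))   # during startup: ignored
    knob.finish_startup()
    print(knob.set(True))   # after startup: triggers training
```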

17zhangw requested a review from linmagit on May 24, 2021, 18:24
Member

linmagit left a comment

LGTM!

@noisepage-checks

Minor Decrease in Performance

Be warned: this PR may have decreased the throughput of the system slightly.

tps (%change) benchmark_type wal_device details
2.42% tpcc RAM disk
Details: master tps=21921.96, commit tps=22452.61, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.78% tpcc None
Details: master tps=28881.54, commit tps=28368.38, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
-1.06% tpcc HDD
Details: master tps=22270.55, commit tps=22033.52, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
4.61% tatp RAM disk
Details: master tps=6427.64, commit tps=6723.93, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
2.81% tatp None
Details: master tps=7242.02, commit tps=7445.62, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
7.49% tatp HDD
Details: master tps=6338.04, commit tps=6812.48, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

17zhangw added the ready-to-merge label and removed the ready-for-review label on May 24, 2021
linmagit merged commit b13282e into cmu-db:master on May 24, 2021
Labels: ready-for-ci, ready-to-merge