This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Switch Replication Messages to Message Pack #1572

Merged
merged 1 commit into cmu-db:master from message-pack-cherrypick
May 1, 2021

Conversation

jkosh44
Contributor

@jkosh44 jkosh44 commented Apr 30, 2021

Description

This PR switches our message serialization implementation for replication from JSON to Message Pack.

See #1570 for more details.
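To illustrate why the switch helps, here is a minimal sketch (not the PR's actual code, and the field names "txn_id" and "type" are hypothetical) that hand-encodes a tiny replication-style message using the MessagePack fixmap/fixstr/positive-fixint formats from the MessagePack specification, then compares the size against JSON:

```python
import json

def pack_small_map(obj):
    """Hand-encode a dict of short str keys -> small non-negative ints using
    MessagePack's fixmap (1000XXXX), fixstr (101XXXXX), and positive fixint
    (0XXXXXXX) formats. Only a sketch; real code would use a msgpack library."""
    assert len(obj) < 16
    out = bytearray([0x80 | len(obj)])      # fixmap header: low nibble = map size
    for key, val in obj.items():
        kb = key.encode("utf-8")
        assert len(kb) < 32 and 0 <= val < 128
        out.append(0xA0 | len(kb))          # fixstr header: low 5 bits = byte length
        out += kb
        out.append(val)                     # positive fixint: the value itself
    return bytes(out)

msg = {"txn_id": 7, "type": 1}              # hypothetical message fields
packed = pack_small_map(msg)
as_json = json.dumps(msg).encode("utf-8")
print(len(packed), len(as_json))            # 15 24
```

The binary encoding skips quotes, braces, and ASCII digits, so every small message is shorter and cheaper to parse, which adds up on the replication hot path.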

Performance

Current TPCC numbers

(Scale factor 1, 8 threads)

Replication   Durability   Reqs per second
Sync          Sync         13.19878036
Async         Sync         96.29715975
Async         Async        225.1782607
Disabled      Sync         208.8917964
Disabled      Async        259.5795886

Updated TPCC numbers with Message Pack

(Scale factor 1, 8 threads)

Replication   Durability   Reqs per second
Sync          Sync         16.099501826675176
Async         Sync         230.38818864265522
Async         Async        223.0805671394837
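A quick sanity check on the two tables above (a sketch using only the numbers quoted in this PR) shows the per-configuration speedup of MessagePack over JSON:

```python
# TPCC requests/sec from the two tables above, keyed by Replication/Durability.
json_tps = {
    "Sync/Sync": 13.19878036,
    "Async/Sync": 96.29715975,
    "Async/Async": 225.1782607,
}
msgpack_tps = {
    "Sync/Sync": 16.099501826675176,
    "Async/Sync": 230.38818864265522,
    "Async/Async": 223.0805671394837,
}

# Speedup = MessagePack throughput / JSON throughput for each configuration.
for cfg in json_tps:
    print(cfg, round(msgpack_tps[cfg] / json_tps[cfg], 2))
```

So sync replication improves about 1.22x, async replication with sync durability improves about 2.39x, and the async/async configuration is essentially unchanged (0.99x), consistent with serialization being the bottleneck only when replication is on the commit path.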

@jkosh44 jkosh44 requested a review from lmwnshn April 30, 2021 21:54
@jkosh44 jkosh44 self-assigned this Apr 30, 2021
@jkosh44 jkosh44 added labels Apr 30, 2021: performance (Performance related issues or changes.), ready-for-ci (Indicate that this build should be run through CI.), ready-for-review (This PR passes all checks and is ready to be reviewed. Mark PRs with this.)
Contributor

@lmwnshn lmwnshn left a comment


I ran some quick unscientific numbers on my machine

TPCC 30 seconds SF10

On first run,

  • no replication sync durability 2060
  • async replication sync durability 1700

On second run, ran immediately after the first,

  • no replication sync durability 495
  • async replication sync durability 600

Much faster than before; I think enabling async replication used to halve the txn req/s?

Nice! But to see whether the "tpcc slows down over time" issue affects others, can you try reproducing the above on dev10? Thanks!

@jkosh44
Contributor Author

jkosh44 commented May 1, 2021

This is OLTPBench Scale Factor 10, 30 seconds, 4 terminals on dev10

No Replication Sync Durability Message Pack

First Run: 197.83327765645456 requests/sec
Second Run: 190.1275127759948 requests/sec

Sync Replication Sync Durability Message Pack

First Run: 8.533247457290898 requests/sec
Second Run: 8.466533122910848 requests/sec

Async Replication Sync Durability Message Pack

First Run: 182.53032011064028 requests/sec
Second Run: 187.42700035617074 requests/sec

@noisepage-checks

Minor Decrease in Performance

Be warned: this PR may have decreased the throughput of the system slightly.

tps (%change)   benchmark_type   wal_device
-0.15%          tpcc             RAM disk
  Details: master tps=22481.16, commit tps=22447.62, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=RAM disk, max_connection_threads=32
-1.05%          tpcc             None
  Details: master tps=29456.63, commit tps=29148.01, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=None, max_connection_threads=32
1.88%           tpcc             HDD
  Details: master tps=21486.66, commit tps=21891.19, query_mode=extended, benchmark_type=tpcc, scale_factor=32.0000, terminals=32, client_time=60, weights={'Payment': 43, 'Delivery': 4, 'NewOrder': 45, 'StockLevel': 4, 'OrderStatus': 4}, wal_device=HDD, max_connection_threads=32
6.83%           tatp             RAM disk
  Details: master tps=6552.52, commit tps=6999.81, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=RAM disk, max_connection_threads=32
0.53%           tatp             None
  Details: master tps=7420.32, commit tps=7459.88, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=None, max_connection_threads=32
7.57%           tatp             HDD
  Details: master tps=6437.02, commit tps=6924.2, query_mode=extended, benchmark_type=tatp, scale_factor=1.0000, terminals=16, client_time=60, weights={'GetAccessData': 35, 'UpdateLocation': 14, 'GetNewDestination': 10, 'GetSubscriberData': 35, 'DeleteCallForwarding': 2, 'InsertCallForwarding': 2, 'UpdateSubscriberData': 2}, wal_device=HDD, max_connection_threads=32

@codecov

codecov bot commented May 1, 2021

Codecov Report

Merging #1572 (62829b4) into master (ea19034) will decrease coverage by 0.02%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           master    #1572      +/-   ##
==========================================
- Coverage   81.96%   81.94%   -0.03%     
==========================================
  Files         735      735              
  Lines       51698    51698              
==========================================
- Hits        42376    42362      -14     
- Misses       9322     9336      +14     
Impacted Files Coverage Δ
src/include/replication/replication_messages.h 0.00% <ø> (ø)
src/replication/replication_messages.cpp 0.00% <0.00%> (ø)
src/network/network_io_wrapper.cpp 77.41% <0.00%> (-8.07%) ⬇️
src/include/storage/block_access_controller.h 88.23% <0.00%> (-5.89%) ⬇️
...network/postgres/postgres_protocol_interpreter.cpp 81.33% <0.00%> (-5.34%) ⬇️
src/include/execution/sql/chaining_hash_table.h 90.90% <0.00%> (-2.03%) ⬇️
src/network/connection_handle.cpp 62.50% <0.00%> (-0.79%) ⬇️
src/traffic_cop/traffic_cop.cpp 74.90% <0.00%> (-0.73%) ⬇️
src/util/query_exec_util.cpp 85.91% <0.00%> (-0.71%) ⬇️
src/execution/sql/sorter.cpp 96.73% <0.00%> (-0.55%) ⬇️
... and 4 more


Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea19034...62829b4.

@jkosh44
Contributor Author

jkosh44 commented May 1, 2021

This is OLTPBench Scale Factor 10, 30 seconds, 4 terminals on dev10

No Replication Sync Durability JSON

First Run: 186.92910415581636 requests/sec
Second Run: 193.56030142970215 requests/sec

Async Replication Sync Durability JSON

First Run: 81.23117654541093 requests/sec
Second Run: 79.19858577197348 requests/sec

@lmwnshn lmwnshn merged commit 866623c into cmu-db:master May 1, 2021
@jkosh44 jkosh44 deleted the message-pack-cherrypick branch June 18, 2021 17:49