TiKV Rust Client RFC #7
# Summary

Introduce a full-featured, official TiKV (and PD) Rust client. It is intended to be used as a reference implementation, or to provide C-compatible bindings for future clients.

# Motivation

Currently, users of TiKV must use [TiDB's Go Client](https://github.com/pingcap/tidb/blob/master/store/tikv/client.go), which is not well packaged or documented. We would like to ensure that users can easily use TiKV and PD without needing to use TiDB.

We think this would help encourage community participation in the TiKV project and associated libraries. During talks with several potential corporate users, we discovered an interest in using TiKV for use cases such as caching and raw key-value storage.

# Detailed design

## Supported targets

We will target the `stable` channel of Rust, starting with the Rust 2018 edition. Beginning with Rust 2018 means we do not need to concern ourselves with an upgrade path.

We will also support the most recent `nightly` version of Rust, but users should not feel the need to reach for `nightly` unless they are already using it.

## Naming

The [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/naming.html) do not prescribe any particular crate name convention.

We choose to name the crate `tikv_client` to conform to the constraints presented to us by the Rust compiler. Cargo permits both `tikv-client` and `tikv_client`, but `rustc` does not permit `tikv-client`, so we choose `tikv_client` to reduce mental overhead.

Choosing to separate `tikv` and `client` helps potentially unfamiliar users immediately understand the intent of the package. `tikvclient`, while understandable, is not immediately parsable by a human.

All structures and functions will otherwise follow the [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/), some of which will be enforced by `clippy`.

## Installation

To use the client programmatically, users will add the `tikv-client` crate to the dependencies in their `Cargo.toml`, then import it with `use tikv_client;`. Unfortunately, this inconsistency is unavoidable: hyphens are not valid in Rust identifiers, so a crate published as `tikv-client` is referenced in code as `tikv_client`.

To use the command line client, users will install the binary via `cargo install tikv-client` and then run it as `tikv-client`. Users who want a different name can alias it in their shell.

## Usage

### Two types of APIs

If I recall correctly we need to advise users that they are mutually exclusive, and they must choose one or the other. I think that is worth mentioning in this section.

Added, PTAL @Hoverbear

TiKV provides two types of APIs for developers:

- The Raw Key-Value API

  If your application scenario does not need distributed transactions or MVCC (Multi-Version Concurrency Control) and only needs atomicity guarantees for a single key, you can use the Raw Key-Value API.

- The Transactional Key-Value API

  If your application scenario requires distributed ACID transactions and atomicity across multiple keys within a transaction, you can use the Transactional Key-Value API.

Generally, the Raw Key-Value API has higher throughput and lower latency compared to the Transactional Key-Value API. If distributed ACID transactions are not required, the Raw Key-Value API is preferred over the Transactional Key-Value API for better performance and ease of use.

The client exposes these two APIs in two separate modules for developers to choose from.

### The common data types

- Key: raw binary data
- Value: raw binary data
- KvPair: Key-value pair type
- KeyRange: Half-open interval of keys
- Config: Configuration for client

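The common types above could be modeled as thin newtypes over byte vectors. The following is a minimal sketch, not the crate's actual definitions: the field layouts, the `From` impls, and the `contains` helper are assumptions made for illustration.

```rust
use std::ops::Range;

/// Hypothetical newtype for a key: raw binary data.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Key(Vec<u8>);

/// Hypothetical newtype for a value: raw binary data.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Value(Vec<u8>);

/// Hypothetical key-value pair.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct KvPair(Key, Value);

/// Hypothetical half-open interval of keys: `start` is included, `end` is not.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct KeyRange {
    start: Key,
    end: Key,
}

impl From<Vec<u8>> for Key {
    fn from(bytes: Vec<u8>) -> Self {
        Key(bytes)
    }
}

impl From<Vec<u8>> for Value {
    fn from(bytes: Vec<u8>) -> Self {
        Value(bytes)
    }
}

impl From<(Key, Value)> for KvPair {
    fn from(pair: (Key, Value)) -> Self {
        KvPair(pair.0, pair.1)
    }
}

impl From<Range<Key>> for KeyRange {
    fn from(range: Range<Key>) -> Self {
        KeyRange { start: range.start, end: range.end }
    }
}

impl KeyRange {
    /// Returns true when `key` lies in `[start, end)`.
    pub fn contains(&self, key: &Key) -> bool {
        &self.start <= key && key < &self.end
    }
}

fn main() {
    let a: Key = b"a".to_vec().into();
    let c: Key = b"c".to_vec().into();
    let pair: KvPair = (a.clone(), Value::from(b"1".to_vec())).into();
    let range: KeyRange = (a.clone()..c.clone()).into();

    println!("{:?}", pair);
    println!("a in range: {}, c in range: {}", range.contains(&a), range.contains(&c));
}
```

With conversions like these, the `K: Into<Key>` and `R: Into<KeyRange>` bounds used throughout the API let callers pass whichever form they already have.
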
### Raw Key-Value API Basic Usage

To use the Raw Key-Value API, take the following steps:

1. Create an instance of Config to specify the endpoints of PD (Placement Driver) and an optional security config.

```rust
let config = Config::new(vec!["127.0.0.1:2379"]);
```

2. Create a Raw Key-Value client.

```rust
let client = RawClient::new(&config);
```

3. Call the Raw Key-Value client methods to access the data on TiKV. The Raw Key-Value API contains the following methods:

```rust
fn get<K, C>(&self, key: K, cf: C) -> KvFuture<Value>
where
    K: Into<Key>,
    C: Into<Option<String>>;

fn batch_get<I, K, C>(&self, keys: I, cf: C) -> KvFuture<Vec<KvPair>>
where
    I: IntoIterator<Item = K>,
    K: Into<Key>,
    C: Into<Option<String>>;

fn put<P, C>(&self, pair: P, cf: C) -> KvFuture<()>
where
    P: Into<KvPair>,
    C: Into<Option<String>>;

fn batch_put<I, P, C>(&self, pairs: I, cf: C) -> KvFuture<()>
where
    I: IntoIterator<Item = P>,
    P: Into<KvPair>,
    C: Into<Option<String>>;

fn delete<K, C>(&self, key: K, cf: C) -> KvFuture<()>
where
    K: Into<Key>,
    C: Into<Option<String>>;

fn batch_delete<I, K, C>(&self, keys: I, cf: C) -> KvFuture<()>
where
    I: IntoIterator<Item = K>,
    K: Into<Key>,
    C: Into<Option<String>>;

fn scan<R, C>(&self, range: R, limit: u32, key_only: bool, cf: C) -> KvFuture<Vec<KvPair>>
where
    R: Into<KeyRange>,
    C: Into<Option<String>>;

fn batch_scan<I, R, C>(
    &self,
    ranges: I,
    each_limit: u32,
    key_only: bool,
    cf: C,
) -> KvFuture<Vec<KvPair>>
where
    I: IntoIterator<Item = R>,
    R: Into<KeyRange>,
    C: Into<Option<String>>;

fn delete_range<R, C>(&self, range: R, cf: C) -> KvFuture<()>
where
    R: Into<KeyRange>,
    C: Into<Option<String>>;
```

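Every method above takes the column family as `C: Into<Option<String>>`, which lets callers pass either `None` or an owned `String`. A minimal sketch of that pattern follows; the free function `resolve_cf` and the fallback name `"default"` are assumptions for illustration and are not part of the proposed API.

```rust
/// Hypothetical helper: resolves the optional column family argument,
/// falling back to an assumed default name.
fn resolve_cf<C>(cf: C) -> String
where
    C: Into<Option<String>>,
{
    cf.into().unwrap_or_else(|| "default".to_owned())
}

fn main() {
    // `None` selects the fallback.
    println!("{}", resolve_cf(None));
    // An owned `String` is wrapped in `Some` by the standard `From<T> for Option<T>` impl.
    println!("{}", resolve_cf("write".to_owned()));
    // An explicit `Option<String>` is passed through unchanged.
    println!("{}", resolve_cf(Some("lock".to_owned())));
}
```

Note that `&str` does not implement `Into<Option<String>>`, so string literals need an explicit `.to_owned()` or `String::from` with this bound.
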
#### Usage example of the Raw Key-Value API

```rust
use futures::future::Future;
use tikv_client::raw::Client;
use tikv_client::*;

fn main() {
    // Connect through PD; `wait()` blocks until the client is ready.
    let config = Config::new(vec!["127.0.0.1:3379"]);
    let raw = raw::RawClient::new(&config)
        .wait()
        .expect("Could not connect to tikv");

    let key: Key = b"Company".to_vec().into();
    let value: Value = b"PingCAP".to_vec().into();

    // Write the pair, then read it back.
    raw.put((key.clone(), value.clone()), None)
        .wait()
        .expect("Could not put kv pair to tikv");
    println!("Successfully put {:?}:{:?} to tikv", key, value);

    let value = raw
        .get(key.clone(), None)
        .wait()
        .expect("Could not get value");
    println!("Found val: {:?} for key: {:?}", value, key);

    // Delete the key and confirm that a subsequent get fails.
    raw.delete(key.clone(), None)
        .wait()
        .expect("Could not delete value");
    println!("Key: {:?} deleted", key);

    raw.get(key, None)
        .wait()
        .expect_err("Get returned value for not existing key");
}
```

The output looks like:

```bash
Successfully put Key([67, 111, 109, 112, 97, 110, 121]):Value([80, 105, 110, 103, 67, 65, 80]) to tikv
Found val: Value([80, 105, 110, 103, 67, 65, 80]) for key: Key([67, 111, 109, 112, 97, 110, 121])
Key: Key([67, 111, 109, 112, 97, 110, 121]) deleted
```

The Raw Key-Value client is a client of the TiKV server and only supports the GET/BATCH_GET/PUT/BATCH_PUT/DELETE/BATCH_DELETE/SCAN/BATCH_SCAN/DELETE_RANGE commands. The Raw Key-Value client can be safely and concurrently accessed by multiple threads. Therefore, one client per process is generally enough.

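Because the client is safe to share, a process can wrap a single instance in an `Arc` and hand clones to worker threads. The sketch below uses a hypothetical in-memory `MockRawClient` (a `Mutex<BTreeMap>` stand-in, not the proposed `RawClient`) purely to show the sharing pattern.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical in-memory stand-in for a raw client; the real client would talk to TiKV.
#[derive(Default)]
struct MockRawClient {
    store: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl MockRawClient {
    fn put(&self, key: Vec<u8>, value: Vec<u8>) {
        self.store.lock().unwrap().insert(key, value);
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.store.lock().unwrap().get(key).cloned()
    }
}

/// Spawns `workers` threads that all write through the same client instance.
fn write_concurrently(client: &Arc<MockRawClient>, workers: u8) {
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let client = Arc::clone(client);
            thread::spawn(move || client.put(vec![i], vec![i, i]))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}

fn main() {
    // One client for the whole process, shared by reference counting.
    let client = Arc::new(MockRawClient::default());
    write_concurrently(&client, 4);
    println!("{:?}", client.get(&[3]));
}
```
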
### Try the Transactional Key-Value API

The Transactional Key-Value API is more complicated than the Raw Key-Value API. Some transaction-related concepts are listed as follows.

- Client

  Like the Raw Key-Value client, a Client is a client to a TiKV cluster.

- Snapshot

  A Snapshot is the state of a Client at a particular point in time, and it provides some read-only methods. Multiple reads from the same Snapshot are guaranteed to be consistent.

- Transaction

  Like a transaction in SQL, a Transaction represents a series of read and write operations performed within the Client. Internally, a Transaction consists of a Snapshot for reads and a buffer for all writes. The default isolation level of a Transaction is Snapshot Isolation.

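To make the "Snapshot for reads, buffer for writes" structure concrete, here is a minimal single-process sketch. `MockStore`, `MockTxn`, and their methods are hypothetical, and letting reads see the transaction's own buffered writes is an assumption of this sketch. It ignores timestamps, conflict detection, and networking, all of which a real client would need.

```rust
use std::collections::BTreeMap;

type Bytes = Vec<u8>;

/// Hypothetical committed state of the store.
#[derive(Default)]
struct MockStore {
    data: BTreeMap<Bytes, Bytes>,
}

/// Hypothetical transaction: a point-in-time snapshot plus buffered writes.
struct MockTxn {
    snapshot: BTreeMap<Bytes, Bytes>,
    // `Some(v)` is a buffered set, `None` is a buffered delete.
    writes: BTreeMap<Bytes, Option<Bytes>>,
}

impl MockStore {
    /// Starts a transaction by copying the current committed state.
    fn begin(&self) -> MockTxn {
        MockTxn { snapshot: self.data.clone(), writes: BTreeMap::new() }
    }
}

impl MockTxn {
    /// Reads check the write buffer first (an assumption), then the snapshot.
    fn get(&self, key: &[u8]) -> Option<Bytes> {
        match self.writes.get(key) {
            Some(buffered) => buffered.clone(),
            None => self.snapshot.get(key).cloned(),
        }
    }

    fn set(&mut self, key: Bytes, value: Bytes) {
        self.writes.insert(key, Some(value));
    }

    fn delete(&mut self, key: Bytes) {
        self.writes.insert(key, None);
    }

    /// Consumes the transaction and applies the buffer to the store.
    fn commit(self, store: &mut MockStore) {
        for (key, write) in self.writes {
            match write {
                Some(value) => {
                    store.data.insert(key, value);
                }
                None => {
                    store.data.remove(&key);
                }
            }
        }
    }

    /// Consumes the transaction and discards the buffer.
    fn rollback(self) {}
}

fn main() {
    let mut store = MockStore::default();

    let mut txn = store.begin();
    txn.set(b"key1".to_vec(), b"value1".to_vec());
    println!("inside txn: {:?}", txn.get(b"key1"));
    println!("store before commit: {} keys", store.data.len());
    txn.commit(&mut store);
    println!("store after commit: {} keys", store.data.len());

    let mut txn = store.begin();
    txn.delete(b"key1".to_vec());
    txn.rollback();
    println!("store after rollback: {} keys", store.data.len());
}
```

Taking `self` by value in `commit` and `rollback` mirrors the signatures proposed in this RFC: once a transaction is finished, the compiler rejects any further use of it.
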
To use the Transactional Key-Value API, take the following steps:

1. Create an instance of Config to specify the endpoints of PD (Placement Driver) and an optional security config.

```rust
let config = Config::new(vec!["127.0.0.1:2379"]);
```

2. Create a Transactional Key-Value client.

```rust
let client = TxnClient::new(&config);
```

3. (Optional) Modify data using a Transaction.

   The lifecycle of a Transaction is: _begin → {get, set, delete, scan} → {commit, rollback}_.

4. Call the Transactional Key-Value API's methods to access the data on TiKV. The Transactional Key-Value API contains the following methods:

```rust
fn begin(&self) -> KvFuture<Transaction>;

fn get<K>(&self, key: K) -> KvFuture<Value>
where
    K: Into<Key>;

fn set<P>(&mut self, pair: P) -> KvFuture<()>
where
    P: Into<KvPair>;

fn delete<K>(&mut self, key: K) -> KvFuture<()>
where
    K: Into<Key>;

fn seek<K>(&self, key: K) -> KvFuture<Scanner>
where
    K: Into<Key>;

fn commit(self) -> KvFuture<()>;

fn rollback(self) -> KvFuture<()>;
```

#### Usage example of the Transactional Key-Value API

```rust
use futures::{Async, Future, Stream};
use tikv_client::transaction::{Client, Mutator, Retriever, TxnClient};
use tikv_client::*;

// Writes all pairs in a single transaction.
fn puts<P, I>(client: &TxnClient, pairs: P)
where
    P: IntoIterator<Item = I>,
    I: Into<KvPair>,
{
    let mut txn = client.begin().wait().expect("Could not begin transaction");
    for pair in pairs {
        txn.set(pair).wait().expect("Could not set key value pair");
    }
    txn.commit().wait().expect("Could not commit transaction");
}

// Reads a single key in its own transaction.
fn get(client: &TxnClient, key: Key) -> Value {
    let txn = client.begin().wait().expect("Could not begin transaction");
    txn.get(key).wait().expect("Could not get value")
}

// Prints up to `limit` pairs, starting at `start`.
fn scan(client: &TxnClient, start: Key, limit: usize) {
    let txn = client.begin().wait().expect("Could not begin transaction");
    let mut scanner = txn.seek(start).wait().expect("Could not seek to start key");
    let mut remaining = limit;
    while remaining > 0 {
        match scanner.poll() {
            Ok(Async::Ready(Some(pair))) => {
                remaining -= 1;
                println!("{:?}", pair);
            }
            // Stop at the end of the stream, when no item is ready, or on error.
            _ => break,
        }
    }
}

// Deletes all keys in a single transaction.
fn dels<P>(client: &TxnClient, keys: P)
where
    P: IntoIterator<Item = Key>,
{
    let mut txn = client.begin().wait().expect("Could not begin transaction");
    for key in keys {
        txn.delete(key).wait().expect("Could not delete key");
    }
    txn.commit().wait().expect("Could not commit transaction");
}

fn main() {
    let config = Config::new(vec!["127.0.0.1:3379"]);
    let txn = TxnClient::new(&config)
        .wait()
        .expect("Could not connect to tikv");

    // set
    let key1: Key = b"key1".to_vec().into();
    let value1: Value = b"value1".to_vec().into();
    let key2: Key = b"key2".to_vec().into();
    let value2: Value = b"value2".to_vec().into();
    puts(&txn, vec![(key1, value1), (key2, value2)]);

    // get
    let key1: Key = b"key1".to_vec().into();
    let value1 = get(&txn, key1.clone());
    println!("{:?}", (key1, value1));

    // scan
    let key1: Key = b"key1".to_vec().into();
    scan(&txn, key1, 10);

    // delete
    let key1: Key = b"key1".to_vec().into();
    let key2: Key = b"key2".to_vec().into();
    dels(&txn, vec![key1, key2]);
}
```

The output looks like:

```bash
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
```

## Programming model

The client instance is thread safe, and all the interfaces return futures, so users can use the client in either an asynchronous or a synchronous way. A dedicated event loop thread is created for each client instance to drive the reactor and make progress.

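The idea of a per-client event loop thread, with callers choosing whether and when to block, can be sketched with the standard library alone. Here `std::sync::mpsc` channels stand in for futures and the reactor; `MockClient`, `Request`, and `get` are hypothetical names, and a real client would drive network I/O instead of an in-memory map.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical request: a key to look up plus a channel to send the reply on.
struct Request {
    key: String,
    reply: Sender<Option<String>>,
}

/// Hypothetical client holding the sending half of its worker thread's inbox.
struct MockClient {
    requests: Sender<Request>,
}

impl MockClient {
    fn new(data: HashMap<String, String>) -> Self {
        let (requests, inbox) = channel::<Request>();
        // One background thread per client instance processes requests in order.
        // The loop ends when every sender (that is, the client) has been dropped.
        thread::spawn(move || {
            for req in inbox {
                // Ignore send errors: the caller may have dropped the receiver.
                let _ = req.reply.send(data.get(&req.key).cloned());
            }
        });
        MockClient { requests }
    }

    /// Returns immediately with a handle; the caller decides when to block on it.
    fn get(&self, key: &str) -> Receiver<Option<String>> {
        let (reply, handle) = channel();
        self.requests
            .send(Request { key: key.to_owned(), reply })
            .expect("event loop thread has stopped");
        handle
    }
}

fn main() {
    let mut data = HashMap::new();
    data.insert("Company".to_owned(), "PingCAP".to_owned());
    let client = MockClient::new(data);

    // Issue two requests without blocking, then wait for both results.
    let first = client.get("Company");
    let second = client.get("missing");
    println!("{:?}", first.recv().unwrap());
    println!("{:?}", second.recv().unwrap());
}
```

Calling `recv()` right after `get` gives the synchronous style, analogous to `.wait()` in the examples above, while holding on to several handles before receiving gives a simple form of the asynchronous style.
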
## Tooling

The `tikv_client` crate will be tested with Travis CI using Rust's standard testing framework. We will also add benchmarks using `criterion` in the future. For public functions which process user input, we will seek to use property-based or fuzz testing tools such as `quickcheck` to find subtle bugs.

The CI will validate that all code is warning free, passes `rustfmt`, and passes a `clippy` check without lint warnings.

All code that reaches the `master` branch should not output errors when the following commands are run:

```shell
cargo fmt --all -- --check
cargo clippy --all -- -D clippy::all
cargo test --all -- --nocapture
cargo bench --all -- --test
```

# Drawbacks

Choosing not to create a Rust TiKV client would mean the current state of clients remains the same.

It is likely that in the future we would end up creating a client in some other form due to customer demand.

# Alternatives

## Package the Go client

Choosing to do this would likely be considerably less work. The code is already written, so most of the work would be documenting and packaging. Unfortunately, Go does not share the same performance characteristics and FFI capabilities as Rust, so it is a poor core binding for future libraries. Due to the limited abstractions available in Go (it does not have a linear type system), we may not be able to create the semantic abstractions possible in a Rust client, reducing the quality of implementations referencing the client.

## Choose another language

We can choose another language such as C, C++, or Python.

A C client would be the most portable and would allow future users and customers to bind to the library as they wish. This quality is maintained in Rust, so it is not an advantage for C. Choosing to implement this client in C or C++ means we must take extra steps to support multiple packaging systems, string libraries, and other choices which would not be needed in languages like Ruby, Node.js, or Python.

Choosing to use Python or Ruby for this client would likely be considerably less work than C/C++, as there is a reduced error surface. However, these languages do not offer good FFI bindings and often require starting up a language runtime. We suspect that if we implement a C/C++/Rust client, future dynamic language libraries will be able to bind to the Rust client, allowing them to be written quickly and easily.

# Unresolved questions

There are some concerns about integration testing. While we can use the mock `Pd` available in TiKV to mock PD, we do not currently have something similar for TiKV. We suspect implementing a mock TiKV will be the easiest method.

@lilin90 Any comments?
Both are OK. But I think we can add the Java client now: https://github.com/tikv/client-java