Commit b2f352c

crud: add readview support
Added readview support for select and pairs. Closes #343
1 parent 2d3d479 commit b2f352c

11 files changed

+3945
-36
lines changed

CHANGELOG.md

Lines changed: 11 additions & 6 deletions
@@ -5,12 +5,17 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
66
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
77

8+
## Unreleased
9+
10+
### Added
11+
* Read view support for select and pairs (#343).
12+
813
## [1.2.0] - 07-06-23
914

1015
### Added
11-
* Add `noreturn` option for operations:
12-
`insert`, `insert_object`, `insert_many`, `insert_object_many`,
13-
`replace`, `replace_object`, `replace_many`, `insert_object_many`,
16+
* Add `noreturn` option for operations:
17+
`insert`, `insert_object`, `insert_many`, `insert_object_many`,
18+
`replace`, `replace_object`, `replace_many`, `insert_object_many`,
1419
`upsert`, `upsert_object`, `upsert_many`, `upsert_object_many`,
1520
`update`, `delete` (#267).
1621

@@ -39,16 +44,16 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
3944
## [1.0.0] - 02-02-23
4045

4146
### Added
42-
* Add timeout condition for the validation of master presence in
47+
* Add timeout condition for the validation of master presence in
4348
replicaset and for the master connection (#95).
4449
* Support Cartridge clusterwide configuration for `crud.cfg` (#332).
4550

4651
### Changed
4752
* **Breaking**: forbid using space id in `crud.len` (#255).
4853

4954
### Fixed
50-
* Add validation of the master presence in replicaset and the
51-
master connection to the `utils.get_space` method before
55+
* Add validation of the master presence in replicaset and the
56+
master connection to the `utils.get_space` method before
5257
receiving the space from the connection (#331).
5358
* Fix fiber cancel on schema reload timeout in `call_reload_schema` (PR #336).
5459

README.md

Lines changed: 218 additions & 20 deletions
@@ -37,6 +37,12 @@ It also provides the `crud-storage` and `crud-router` roles for
3737
- [Count](#count)
3838
- [Call options for crud methods](#call-options-for-crud-methods)
3939
- [Statistics](#statistics)
40+
- [Read view](#read-view)
41+
- [Creating a read view](#creating-a-read-view)
42+
- [Closing a read view](#closing-a-read-view)
43+
- [Read view select](#read-view-select)
44+
- [Read view select conditions](#read-view-select-conditions)
45+
- [Read view pairs](#read-view-pairs)
4046
- [Cartridge roles](#cartridge-roles)
4147
- [Usage](#usage)
4248
- [License](#license)
@@ -237,8 +243,8 @@ where:
237243
* `noreturn` (`?boolean`) - suppress successfully processed tuple
238244
(first return value is `nil`). `false` by default
239245
* `fetch_latest_metadata` (`?boolean`) - guarantees the
240-
up-to-date metadata (space format) in first return value, otherwise
241-
it may not take into account the latest migration of the data format.
246+
up-to-date metadata (space format) in first return value, otherwise
247+
it may not take into account the latest migration of the data format.
242248
Performance overhead is up to 15%. `false` by default
243249

244250
Returns metadata and array contains one inserted row, error.
@@ -308,8 +314,8 @@ where:
308314
* `noreturn` (`?boolean`) - suppress successfully processed tuples
309315
(first return value is `nil`). `false` by default
310316
* `fetch_latest_metadata` (`?boolean`) - guarantees the
311-
up-to-date metadata (space format) in first return value, otherwise
312-
it may not take into account the latest migration of the data format.
317+
up-to-date metadata (space format) in first return value, otherwise
318+
it may not take into account the latest migration of the data format.
313319
Performance overhead is up to 15%. `false` by default
314320

315321
Returns metadata and array with inserted rows, array of errors.
@@ -450,8 +456,8 @@ where:
450456
vshard router instance. Set this parameter if your space is not
451457
a part of the default vshard cluster
452458
* `fetch_latest_metadata` (`?boolean`) - guarantees the
453-
up-to-date metadata (space format) in first return value, otherwise
454-
it may not take into account the latest migration of the data format.
459+
up-to-date metadata (space format) in first return value, otherwise
460+
it may not take into account the latest migration of the data format.
455461
Performance overhead is up to 15%. `false` by default
456462

457463
Returns metadata and array contains one row, error.
@@ -493,8 +499,8 @@ where:
493499
* `noreturn` (`?boolean`) - suppress successfully processed tuple
494500
(first return value is `nil`). `false` by default
495501
* `fetch_latest_metadata` (`?boolean`) - guarantees the
496-
up-to-date metadata (space format) in first return value, otherwise
497-
it may not take into account the latest migration of the data format.
502+
up-to-date metadata (space format) in first return value, otherwise
503+
it may not take into account the latest migration of the data format.
498504
Performance overhead is up to 15%. `false` by default
499505

500506
Returns metadata and array contains one updated row, error.
@@ -535,8 +541,8 @@ where:
535541
* `noreturn` (`?boolean`) - suppress successfully processed tuple
536542
(first return value is `nil`). `false` by default
537543
* `fetch_latest_metadata` (`?boolean`) - guarantees the
538-
up-to-date metadata (space format) in first return value, otherwise
539-
it may not take into account the latest migration of the data format.
544+
up-to-date metadata (space format) in first return value, otherwise
545+
it may not take into account the latest migration of the data format.
540546
Performance overhead is up to 15%. `false` by default
541547

542548
Returns metadata and array contains one deleted row (empty for vinyl), error.
@@ -588,8 +594,8 @@ where:
588594
* `noreturn` (`?boolean`) - suppress successfully processed tuple
589595
(first return value is `nil`). `false` by default
590596
* `fetch_latest_metadata` (`?boolean`) - guarantees the
591-
up-to-date metadata (space format) in first return value, otherwise
592-
it may not take into account the latest migration of the data format.
597+
up-to-date metadata (space format) in first return value, otherwise
598+
it may not take into account the latest migration of the data format.
593599
Performance overhead is up to 15%. `false` by default
594600

595601
Returns inserted or replaced rows and metadata or nil with error.
@@ -659,8 +665,8 @@ where:
659665
* `noreturn` (`?boolean`) - suppress successfully processed tuples
660666
(first return value is `nil`). `false` by default
661667
* `fetch_latest_metadata` (`?boolean`) - guarantees the
662-
up-to-date metadata (space format) in first return value, otherwise
663-
it may not take into account the latest migration of the data format.
668+
up-to-date metadata (space format) in first return value, otherwise
669+
it may not take into account the latest migration of the data format.
664670
Performance overhead is up to 15%. `false` by default
665671

666672
Returns metadata and array with inserted/replaced rows, array of errors.
@@ -801,8 +807,8 @@ where:
801807
* `noreturn` (`?boolean`) - suppress successfully processed tuple
802808
(first return value is `nil`). `false` by default
803809
* `fetch_latest_metadata` (`?boolean`) - guarantees the
804-
up-to-date metadata (space format) in first return value, otherwise
805-
it may not take into account the latest migration of the data format.
810+
up-to-date metadata (space format) in first return value, otherwise
811+
it may not take into account the latest migration of the data format.
806812
Performance overhead is up to 15%. `false` by default
807813

808814
Returns metadata and empty array of rows or nil, error.
@@ -868,8 +874,8 @@ where:
868874
* `noreturn` (`?boolean`) - suppress successfully processed tuples
869875
(first return value is `nil`). `false` by default
870876
* `fetch_latest_metadata` (`?boolean`) - guarantees the
871-
up-to-date metadata (space format) in first return value, otherwise
872-
it may not take into account the latest migration of the data format.
877+
up-to-date metadata (space format) in first return value, otherwise
878+
it may not take into account the latest migration of the data format.
873879
Performance overhead is up to 15%. `false` by default
874880

875881
Returns metadata and array of errors.
@@ -1014,8 +1020,8 @@ where:
10141020
* `yield_every` (`?number`) - number of tuples processed on storage to yield after,
10151021
`yield_every` should be > 0, default value is 1000
10161022
* `fetch_latest_metadata` (`?boolean`) - guarantees the
1017-
up-to-date metadata (space format) in first return value, otherwise
1018-
it may not take into account the latest migration of the data format.
1023+
up-to-date metadata (space format) in first return value, otherwise
1024+
it may not take into account the latest migration of the data format.
10191025
Performance overhead is up to 15%. `false` by default
10201026

10211027

@@ -1541,6 +1547,198 @@ support preserving stats between role reload
15411547
(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
15421548
thus this feature will be unsupported for `metrics` driver.
15431549

1550+
### Read view
1551+
1552+
A read view is an in-memory snapshot of the entire database that isn’t affected by future data modifications. Read views allow you to retrieve data using the `read_view_object:select()` and `read_view_object:pairs()` operations.
1553+
1554+
Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.
1555+
1556+
To reduce memory consumption and improve performance, Tarantool creates read views using the copy-on-write technique. Duplicating the entire data set is not required: Tarantool duplicates only the blocks modified after a read view is created.
1557+
1558+
Read views have the following limitations:
1559+
1560+
* Only the memtx engine is supported.
1561+
* Read views are available starting from Tarantool Enterprise v2.11.0.
1562+
1563+
#### Creating a read view
1564+
1565+
To create a read view, call the `crud.readview()` function.
1566+
1567+
```lua
1568+
local foo = crud.readview(opts)
1569+
```
1570+
1571+
where:
1572+
1573+
* `opts`:
1574+
* `name` (`?string`) - name of the read view
1575+
* `timeout` (`?number`) - `vshard.call` timeout (in seconds)
1576+
1577+
**Example:**
1578+
1579+
```lua
1580+
local foo = crud.readview({name = 'foo', timeout = 3})
1581+
```
1582+
1583+
#### Closing a read view
1584+
1585+
When a read view is no longer needed, close it using the `read_view_object:close()` method because a read view may consume a substantial amount of memory.
1586+
1587+
```lua
1588+
local foo = crud.readview()
1589+
foo:close(opts)
1590+
```
1591+
1592+
where:
1593+
1594+
* `opts`:
1595+
* `timeout` (`?number`) - `vshard.call` timeout (in seconds)
1596+
1597+
Otherwise, a read view is closed implicitly when the read view object is collected by the Lua garbage collector.
1598+
1599+
**Example:**
1600+
1601+
```lua
1602+
local foo = crud.readview()
1603+
foo:close({timeout = 3})
1604+
```
1605+
1606+
#### Read view select
1607+
1608+
`read_view_object:select()` supports multi-conditional selects, treating a cluster as a single space, same as `crud.select`.
1609+
1610+
```lua
1611+
local foo = crud.readview()
1612+
local objects, err = foo:select(space_name, conditions, opts)
1613+
foo:close()
1614+
```
1615+
1616+
where:
1617+
1618+
* `space_name` (`string`) - name of the space
1619+
* `conditions` (`?table`) - array of [select conditions](#select-conditions)
1620+
* `opts`:
1621+
* `first` (`?number`) - the maximum count of the objects to return.
1622+
If negative value is specified, the objects behind `after` are returned
1623+
(`after` option is required in this case). [See pagination examples](doc/select.md#pagination).
1624+
* `after` (`?table`) - tuple after which objects should be selected
1625+
* `batch_size` (`?number`) - number of tuples to process per one request to storage
1626+
* `bucket_id` (`?number|cdata`) - bucket ID
1627+
* `force_map_call` (`?boolean`) - if `true`
1628+
then the map call is performed without any optimizations even
1629+
if full primary key equal condition is specified
1630+
* `timeout` (`?number`) - `vshard.call` timeout (in seconds)
1631+
* `fields` (`?table`) - field names for getting only a subset of fields
1632+
* `fullscan` (`?boolean`) - if `true` then a critical log entry will be skipped
1633+
on potentially long `select`, see [avoiding full scan](doc/select.md#avoiding-full-scan).
1634+
* `vshard_router` (`?string|table`) - Cartridge vshard group name or
1635+
vshard router instance. Set this parameter if your space is not
1636+
a part of the default vshard cluster
1637+
* `yield_every` (`?number`) - number of tuples processed on storage to yield after,
1638+
`yield_every` should be > 0, default value is 1000
1639+
* `fetch_latest_metadata` (`?boolean`) - guarantees the
1640+
up-to-date metadata (space format) in first return value, otherwise
1641+
it may not take into account the latest migration of the data format.
1642+
Performance overhead is up to 15%. `false` by default
1643+
1644+
1645+
Returns metadata and array of rows, error.
1646+
1647+
**Example:**
1648+
1649+
```lua
1650+
local foo = crud.readview()
1651+
foo:select('customers', nil, {batch_size=1, fullscan=true})
1652+
---
1653+
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
1654+
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
1655+
rows:
1656+
- [1, 477, 'Elizabeth', 12]
1657+
- [2, 401, 'Mary', 46]
1658+
- [3, 2804, 'David', 33]
1659+
- [4, 1161, 'William', 81]
1660+
- [5, 1172, 'Jack', 35]
1661+
- [6, 1064, 'William', 25]
1662+
- [7, 693, 'Elizabeth', 18]
1663+
- null
1664+
...
1665+
crud.insert('customers', {8, box.NULL, 'Elizabeth', 23})
1666+
---
1667+
- rows:
1668+
- [8, 185, 'Elizabeth', 23]
1669+
metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
1670+
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
1671+
- null
1672+
...
1673+
foo:select('customers', nil, {batch_size=1, fullscan=true})
1674+
---
1675+
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
1676+
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
1677+
rows:
1678+
- [1, 477, 'Elizabeth', 12]
1679+
- [2, 401, 'Mary', 46]
1680+
- [3, 2804, 'David', 33]
1681+
- [4, 1161, 'William', 81]
1682+
- [5, 1172, 'Jack', 35]
1683+
- [6, 1064, 'William', 25]
1684+
- [7, 693, 'Elizabeth', 18]
1685+
- null
1686+
...
1687+
foo:close()
1688+
```
1689+
1690+
##### Read view select conditions
1691+
1692+
Select conditions for `read_view_object:select()` are the same as [select conditions](#select-conditions) for `crud.select`.
1693+
1694+
**Example:**
1695+
1696+
```lua
1697+
foo = crud.readview()
1698+
foo:select('customers', {{'<=', 'age', 35}}, {first = 10})
1699+
---
1700+
- metadata:
1701+
- {'name': 'id', 'type': 'unsigned'}
1702+
- {'name': 'bucket_id', 'type': 'unsigned'}
1703+
- {'name': 'name', 'type': 'string'}
1704+
- {'name': 'age', 'type': 'number'}
1705+
rows:
1706+
- [5, 1172, 'Jack', 35]
1707+
- [3, 2804, 'David', 33]
1708+
- [6, 1064, 'William', 25]
1709+
- [7, 693, 'Elizabeth', 18]
1710+
- [1, 477, 'Elizabeth', 12]
1711+
...
1712+
foo:close()
1713+
```
1714+
1715+
#### Read view pairs
1716+
1717+
You can iterate across a distributed space using the `read_view_object:pairs()` method.
1718+
Its arguments are the same as [`read_view_object:select()`](#read-view-select) arguments except
1719+
`fullscan` (it does not exist because `crud.pairs` does not generate a critical
1720+
log entry on potentially long requests) and negative `first` values aren't
1721+
allowed.
1722+
Pass the `use_tomap` flag (`false` by default) to iterate over objects instead of flat tuples.
1723+
1724+
**Example:**
1725+
1726+
```lua
1727+
foo = crud.readview()
1728+
local tuples = {}
1729+
for _, tuple in foo:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = false}) do
1730+
-- {5, 1172, 'Jack', 35}
1731+
table.insert(tuples, tuple)
1732+
end
1733+
1734+
local objects = {}
1735+
for _, object in foo:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = true}) do
1736+
-- {id = 5, name = 'Jack', bucket_id = 1172, age = 35}
1737+
table.insert(objects, object)
1738+
end
1739+
foo:close()
1740+
```
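
Because a read view is a frozen snapshot, it suits paginated reads: every page is taken from the same data even if writes arrive between requests. The sketch below combines the `first` and `after` options described above. `select_all_pages` and its parameters are hypothetical names used for illustration and are not part of crud. It assumes that `rv:select()` returns a table with a `rows` array on success and `nil` plus an error on failure.

```lua
-- Hypothetical helper, not part of the crud module.
-- rv is a read view object, for example the result of crud.readview().
local function select_all_pages(rv, space_name, conditions, page_size)
    local rows = {}
    local after = nil -- no cursor for the first page

    while true do
        -- Assumption: the result is {metadata = ..., rows = {...}} or nil, err.
        local res, err = rv:select(space_name, conditions, {
            first = page_size,
            after = after,
        })
        if res == nil then
            return nil, err
        end

        for _, row in ipairs(res.rows) do
            table.insert(rows, row)
        end

        -- A short page means the snapshot has no more matching tuples.
        if #res.rows < page_size then
            break
        end

        -- Continue after the last tuple of the current page.
        after = res.rows[#res.rows]
    end

    return rows
end
```

Collecting every page into one table is only reasonable for small result sets; for larger ones, processing each page inside the loop keeps memory use bounded.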
1741+
15441742
## Cartridge roles
15451743

15461744
`cartridge.roles.crud-storage` is a Tarantool Cartridge role that depends on the

crud.lua

Lines changed: 6 additions & 0 deletions
@@ -20,6 +20,7 @@ local borders = require('crud.borders')
2020
local sharding_metadata = require('crud.common.sharding.sharding_metadata')
2121
local utils = require('crud.common.utils')
2222
local stats = require('crud.stats')
23+
local readview = require('crud.readview')
2324

2425
local crud = {}
2526

@@ -147,6 +148,10 @@ crud.reset_stats = stats.reset
147148
-- @function storage_info
148149
crud.storage_info = utils.storage_info
149150

151+
-- @refer readview.new
152+
-- @function readview
153+
crud.readview = readview.new
154+
150155
--- Initializes crud on node
151156
--
152157
-- Exports all functions that are used for calls
@@ -174,6 +179,7 @@ function crud.init_storage()
174179
count.init()
175180
borders.init()
176181
sharding_metadata.init()
182+
readview.init()
177183

178184
_G._crud.storage_info_on_storage = utils.storage_info_on_storage
179185
end
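
With `crud.readview` exported here, callers become responsible for closing each read view, since an open one may hold a substantial amount of memory. A minimal sketch of a wrapper that closes the read view on every code path could look like the following. `with_readview` is a hypothetical helper and is not part of this commit. It assumes that `crud.readview()` returns `nil` plus an error on failure, which the README in this commit does not state explicitly.

```lua
-- Hypothetical helper, not part of the crud module.
-- crud_module is whatever require('crud') returns on a router.
local function with_readview(crud_module, opts, fn)
    -- Assumption: on failure readview() returns nil and an error.
    local rv, err = crud_module.readview(opts)
    if rv == nil then
        return nil, err
    end

    -- pcall ensures the read view is closed even if fn raises.
    local ok, res, res_err = pcall(fn, rv)
    rv:close()

    if not ok then
        -- Re-raise the original error after cleanup.
        error(res, 0)
    end
    return res, res_err
end
```

On a router this might be called as `with_readview(require('crud'), {name = 'report', timeout = 3}, function(rv) return rv:select('customers', {{'<=', 'age', 35}}, {first = 10}) end)`. Passing the module as an argument keeps the helper easy to exercise with a stub.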
