Release v1.2.0 · apache/horaedb

Upgrade Guide

NOTE: This guide applies only to upgrading CeresDB v1.1.0 to CeresDB v1.2.0. Ignore it if you are deploying a brand-new CeresDB cluster with v1.2.0.

v1.2.0 contains some incompatible changes, so it is important to upgrade carefully:

1. First, stop all instances of CeresDB and CeresMeta;
2. Upgrade CeresMeta first by referring to the Upgrade Guide of CeresMeta;
3. When upgrading CeresDB, update the config as follows:

- Rename the config section [analytic.compaction_config] to [analytic.compaction] if you use it;
- Add the config section [cluster_deployment.etcd_client] if your CeresDB cluster runs in WithMeta mode:

[cluster_deployment.etcd_client]
server_addrs = ['127.0.0.1:2379']
root_path = "/rootPath"

NOTE: root_path must be /rootPath when upgrading from v1.1.0.
4. After updating the CeresDB config, start the CeresDB server;
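
The two config changes in step 3 can be applied by hand, but a small script may help when many nodes need the same edit. The following is a minimal sketch, not an official migration tool: the function name `migrate_config` and the `with_meta` flag are hypothetical, the etcd address is the example value from this guide, and the script treats the config as plain text rather than parsing TOML.

```python
import re

# Example values taken from this guide; adjust server_addrs for your cluster.
ETCD_SECTION = (
    "[cluster_deployment.etcd_client]\n"
    "server_addrs = ['127.0.0.1:2379']\n"
    'root_path = "/rootPath"\n'
)


def migrate_config(text: str, with_meta: bool) -> str:
    """Return a v1.2.0-style config for a v1.1.0 config given as TOML text.

    Hypothetical helper: it only rewrites section headers and appends the
    etcd client section; it does not validate the rest of the config.
    """
    # Rename [analytic.compaction_config] (and any sub-table header under it)
    # to [analytic.compaction]. Only header lines are touched.
    text = re.sub(
        r"(?m)^(\s*\[)analytic\.compaction_config(?=[.\]])",
        r"\g<1>analytic.compaction",
        text,
    )
    # Append the etcd client section only for WithMeta clusters, and only once.
    if with_meta and "[cluster_deployment.etcd_client]" not in text:
        if not text.endswith("\n"):
            text += "\n"
        text += "\n" + ETCD_SECTION
    return text
```

Running the function twice on the same text leaves the result unchanged, so it should be safe to re-run on a config that has already been migrated. Review the output before starting the server.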

Major Features

Enhancements to InfluxQL support:
- Support queries with aggregators;
- #854 optimizes the InfluxQL planner to load tables on demand instead of loading all of them when the planner is initialized;
- Replace influxdb_iox with CeresDB/influxql to remove unnecessary dependencies introduced by influxdb_iox;
Enhancements to the proxy module:
- Implement the proxy as a separate module;
- Support forwarding table requests in the proxy;
- Support reads and writes on partition tables in the proxy;
- Recover the metadata of partition tables from CeresMeta instead of CeresDB in the proxy;
Improvements to write performance:
- #822 fixes the problem that a compaction schedule triggered by the flush procedure may block the write procedure;
- #814 is a big change set that replaces the write queue with a table-level lock to reduce write contention;
- #843 adjusts the flush strategy to avoid frequent write stalls;
- #861 introduces level 1 for SSTs. Level 0 SSTs, which are generated by flushing, currently do not contain complex indexes such as the xor-filter, which leads to faster flushing and fewer write stalls;
Enhancements to observability:
- #774 introduces the hotspot recorder, which can be used to find the tables with the highest write/read throughput in a specific time window;
- #827 and #831 provide more metrics for all stages of the write procedure, which can be used to troubleshoot write performance problems; the Grafana dashboard config has already been updated;
- #817 introduces the CPU profiler, so a CPU flamegraph can be generated with a single HTTP request to the CeresDB server;
Support for the new failover and load-balancing mechanism (for more details, refer to the [Release Note v1.2.0] of CeresMeta):
- #706 and #853 implement ETCD-based distributed locks for shards. A shard can be opened or closed only while its shard lock is held, which completely avoids the data corruption caused by multiple shard leaders;
- Support automatic failover of CeresDB nodes, that is, service recovery is handled automatically without any manual intervention;
- Support automatic load balancing based on consistent hashing, which keeps shards evenly distributed across the nodes of the cluster when the number of cluster nodes increases or decreases;
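
To illustrate the consistent-hashing idea behind the load-balancing item above, here is a minimal generic sketch of a hash ring with virtual nodes. It is not CeresMeta's implementation: the class `HashRing`, the `vnodes` parameter, the node names, and the `shard-<id>` key format are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring (illustrative, not CeresMeta's code)."""

    def __init__(self, nodes, vnodes=128):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns several virtual positions to smooth the distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def node_for(self, shard_id: int) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # The first virtual node at or after the shard's position owns it;
        # wrap around to the start of the ring when none is found.
        idx = bisect.bisect_left(self._ring, (_hash(f"shard-{shard_id}"),))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    before = {s: ring.node_for(s) for s in range(256)}
    ring.add_node("node-d")
    moved = [s for s in before if ring.node_for(s) != before[s]]
    print(f"{len(moved)} of {len(before)} shards moved to the new node")
```

The property that matters for rebalancing is that when a node joins, the only shards that change owner are the ones taken over by the new node, and when it leaves, they return to their previous owners. The rest of the cluster is left undisturbed.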

Thanks

Heartfelt thanks to @zouxiang1993 for helping troubleshoot write performance issues.

What's Changed

fix: simplify the logs in query path (#770) by @zouxiang1993 in #776
fix: remove FixedSizeArena by @ShiKaiWi in #772
chore(deps): bump time from 0.1.44 to 0.3.15 by @dependabot in #761
feat: add default schema config by @jiacai2050 in #782
fix: remove body limit for influxql request by @jiacai2050 in #783
feat: add integration tests for influxql request by @jiacai2050 in #784
feat: add java integration tests by @jiacai2050 in #786
chore(deps): bump log4j-core from 2.8.2 to 2.17.1 in /integration_tests/sdk/java by @dependabot in #789
chore(deps): bump junit from 4.12 to 4.13.1 in /integration_tests/sdk/java by @dependabot in #788
fix: timestamp column should not be auto added by @chunshao90 in #787
chore: route use read_runtime by @chunshao90 in #794
feat: influxql support show measurements by @jiacai2050 in #795
chore: bump version to 1.1.0 by @jiacai2050 in #797
feat: impl getTableInfo in remoteEngine service by @chunshao90 in #793
feat: add rust sdk test by @Rachelint in #791
fix: avoid error when disk cache miss by @ShiKaiWi in #790
feat: impl get_table_info in remote_engine_client by @chunshao90 in #798
fix: avoid send empty record batch to client by @ShiKaiWi in #796
chore: remove useless cluster_version by @chunshao90 in #804
refactor: make tsbs more configurable by @ShiKaiWi in #805
fix: avoid break when drop wal table failed by @MachaelLee in #806
feat: implement route interface in http protocol by @MachaelLee in #803
refactor: bump datafusion, add influxql aggregator support by @jiacai2050 in #778
fix: add router when build request context for mysql by @jiacai2050 in #809
feat: hotspot recorder by @MachaelLee in #774
feat: introduce TableOperator to encapsulate operation of tables by @Rachelint in #808
feat: expose rocksdb background jobs option by @jiacai2050 in #812
feat: integration test support env filter by @jiacai2050 in #811
chore: bump datafusion by @jiacai2050 in #810
feat: convert nanoseconds to milliseconds automatically by @dust1 in #780
feat: add cpu profiler by @jiacai2050 in #817
feat: upgrade rust-rocksdb by @ShiKaiWi in #821
feat: avoid blocking the write procedure because of compaction schedule by @ShiKaiWi in #822
feat: query partition table with proxy in grpc service by @chunshao90 in #802
feat: influxql support fill syntax by @jiacai2050 in #824
feat: install dev dependencies in make file by @MachaelLee in #815
chore: remove unused dependency by @chunshao90 in #823
feat: replace bg runtime with default and compact runtime by @ShiKaiWi in #826
chore: add commit id of nightly docker image by @chunshao90 in #829
chore: add write batch metrics by @jiacai2050 in #827
feat: http query with proxy by @chunshao90 in #807
feat: add metrics for write procedure by @ShiKaiWi in #831
feat: impl prom query with proxy by @chunshao90 in #833
feat: support write partition table in grpc service by @chunshao90 in #828
fix: improve remote write performance by using separate runtime by @ShiKaiWi in #837
chore: update ob client version by @MachaelLee in #835
chore: remove unnecessary deps by @jiacai2050 in #838
chore(deps): bump h2 from 0.3.16 to 0.3.17 by @dependabot in #841
feat: tsbs support more write options by @ShiKaiWi in #839
feat: support write batch in remote engine by @Rachelint in #840
feat: serialize table operations by lock rather than queue by @ShiKaiWi in #814
feat: avoid frequent write stall by @ShiKaiWi in #843
fix: wrong default write batch size for run_tsbs by @ShiKaiWi in #845
chore: clean forward configs by @jiacai2050 in #847
feat: refactor manifest to get snapshot in memory by @Rachelint in #825
chore: rename module sql to query_frontend by @Rachelint in #849
feat: forward request in grpc write by @chunshao90 in #844
chore: bump obkv client version by @MachaelLee in #850
feat: support domain name as the ceresdb node addr by @ShiKaiWi in #852
refactor: implement the distributed lock of shard by @ZuLiangWang in #706
feat: compaction support different level by @jiacai2050 in #848
fix: avoid panic when convert prom result by @jiacai2050 in #851
feat: only collecting all tables on demand in influxql planner by @Rachelint in #854
refactor: shard lock module by @ShiKaiWi in #853
feat: support prom remote query forward by @jiacai2050 in #855
feat: support querying partition table in prom query and http query by @chunshao90 in #857
fix: build filter when needed by @jiacai2050 in #861
feat: rename the compaction_config to compaction and adjust interval by @ShiKaiWi in #862
refactor: implement prom remote query by convert to datafusion plan directly by @jiacai2050 in #860
refactor: remove runtime from request context by @jiacai2050 in #859
test: add prometheus integration tests by @jiacai2050 in #864
chore: proxy as a separate module by @chunshao90 in #865
fix: fix write partition table by @chunshao90 in #869
chore: add some commands in Makefile by @chunshao90 in #866
fix: fix evict logic in remote client by @Rachelint in #872
fix: drop partition table by @chunshao90 in #871
chore!: rename table name in table_kv based wal by @Rachelint in #868
Revert "chore!: rename table name in table_kv based wal" by @ShiKaiWi in #873

Full Changelog: v1.1.0...v1.2.0
