[DISCUSS] Add GreptimeDB as storage option #13722
Replies: 2 comments
Thanks for submitting this. GreptimeDB is new to the whole SkyWalking PMC/committer team. I think, first of all, you need to find an active committer to work with you as a sponsor. Because of painful history, we can't accept this kind of big PR/feature and the responsibility it puts on the PMC. Ref: https://skywalking.apache.org/docs/main/next/en/faq/why-clickhouse-not-supported/. So, after you have a committer taking responsibility (integration review, license process, integration tests, documents), we could assign you a plugin repo, e.g. skywalking-with-greptimedb, same as our graal-distro (https://github.com/apache/skywalking-graalvm-distro). First thing I noticed: you are using the MySQL driver (GPL-v2), which means, same as JDBC-MySQL in native support, your release will not be able to include it, and the same applies to the distro image.
@wu-sheng I completely understand the process. Let's keep this open in case anyone is interested in discussing it. I'll try to run some benchmarks if possible.
[DISCUSS] Add GreptimeDB as Storage Option
Motivation
GreptimeDB is an open-source observability database designed to unify the processing of metrics, logs, and traces. It is Apache-2.0 licensed, written in Rust (query engine powered by Apache DataFusion), and supports SQL and PromQL natively. It exposes MySQL and PostgreSQL wire protocols for reads and a gRPC protocol with language SDKs (Java, Go, etc.) for writes.
I looked at how SkyWalking's storage layer works, and think GreptimeDB is a reasonable fit. Here's why:
- **Simpler storage lifecycle management.** The JDBC plugin creates day-suffixed tables (`table_20240301`, `table_20240302`, ...) and rotates them for TTL. GreptimeDB handles TTL natively per table (`WITH ('ttl' = '7d')`), so the plugin creates each table once and the storage engine cleans up expired data. This eliminates the `IHistoryDeleteDAO` logic entirely; our implementation is a no-op.
- **Time-range queries go through partition pruning instead of index scans.** SkyWalking's query DAOs almost always filter by time range first. In the JDBC plugin, this goes through `WHERE time_bucket BETWEEN ? AND ?` with an INVERTED INDEX. In GreptimeDB, we convert `time_bucket` to a `TIMESTAMP TIME INDEX` column (`greptime_ts`), which is the table's partition key, so the storage engine skips irrelevant partitions before touching any index. A concrete example: `TraceQueryDAO.queryBasicTraces()` filters by `greptime_ts >= ? AND greptime_ts <= ?`, which hits partition pruning directly rather than scanning an index over all partitions.
- **Upsert without UPDATE.** SkyWalking metrics are aggregated in the OAP's Java layer (`combine()` + `calculate()`), then the full row is written to storage. The JDBC plugin uses `INSERT ... ON DUPLICATE KEY UPDATE` or equivalent. GreptimeDB's `merge_mode=last_row` achieves the same thing with a plain INSERT: given the same primary key and timestamp, the later write wins. No UPDATE statement needed.
- **Append-only workloads skip deduplication.** Records (traces, logs, alarms) are write-once. GreptimeDB's `append_mode=true` tells the storage engine to skip deduplication checks on these tables.
- **Object storage as the primary storage layer.** GreptimeDB can use S3, Azure Blob Storage, etc. as its storage backend with disaggregated compute/storage. I don't have SkyWalking-specific cost numbers to share, but the architecture means users in cloud environments can scale storage independently from compute. This is documented in detail at GreptimeDB Storage Location.
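To make the table options above concrete, here is a hedged SQL sketch; the table and column names are illustrative, not the plugin's actual DDL:

```sql
-- Illustrative metrics table: TTL and last-row merge are handled by the engine,
-- so there is no day-suffixed table rotation and IHistoryDeleteDAO is a no-op.
CREATE TABLE service_cpm (
  greptime_ts TIMESTAMP TIME INDEX,  -- derived from SkyWalking's time_bucket
  entity_id   STRING,
  value       BIGINT,
  PRIMARY KEY (entity_id)
) WITH ('ttl' = '7d', 'merge_mode' = 'last_row');

-- Upsert without UPDATE: same primary key + timestamp means the later write wins.
INSERT INTO service_cpm VALUES ('2024-03-01 00:00:00', 'svc-a', 10);
INSERT INTO service_cpm VALUES ('2024-03-01 00:00:00', 'svc-a', 12);  -- this row survives

-- Illustrative record table: append-only, deduplication checks skipped.
CREATE TABLE segment_sketch (
  greptime_ts TIMESTAMP TIME INDEX,
  segment_id  STRING,
  trace_id    STRING
) WITH ('append_mode' = 'true', 'ttl' = '7d');
```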
Disclosure: I'm from the GreptimeDB community. We'd like to see GreptimeDB become a supported storage option for SkyWalking, and are willing to maintain this plugin long-term.
I have a working implementation (PR) with all DAO interfaces implemented, unit tests, and E2E tests passing against the shared `storage-cases.yaml`. Happy to run performance benchmarks if the community is interested; SkyWalking's `benchmarks/` framework and `data-generator` tool could be extended to cover storage backend comparisons.

Architecture Graph
No architecture-level change to OAP core. This is a new `storage-greptimedb-plugin` module under `server-storage-plugin/`, following the same patterns as existing storage plugins.

The plugin uses a dual-protocol approach:

- Reads go over GreptimeDB's MySQL-compatible wire protocol via JDBC.
- Writes go through the official Java ingester SDK (`io.greptime:ingester-all`) for batch writes. The SDK supports async streaming and compression over a persistent gRPC connection, which is a better fit for SkyWalking's write-heavy workload than JDBC INSERT statements.

Proposed Changes
New module: `oap-server/server-storage-plugin/storage-greptimedb-plugin`.

Why not reuse the existing JDBC plugin?
This was the first thing I tried. The JDBC plugin's `TableHelper` uses static methods (`getTable()`, `getLatestTableForWrite()`, `generateId()`) that are called directly from all DAO implementations, so they can't be overridden by subclassing. More fundamentally, the day-suffixed table rotation, the `INSERT ON DUPLICATE KEY UPDATE` pattern, and the separate tag tables are all baked into the JDBC plugin's architecture, and all three are things we specifically want to avoid for GreptimeDB.

The BanyanDB plugin took the same approach: a standalone plugin implementing all DAO interfaces from scratch. Our plugin follows that pattern (49 source files + 7 test files, comparable to BanyanDB's 60 + 2).
Table mapping
SkyWalking data models are mapped to GreptimeDB tables with the following modes:

| Model type | Table mode | Example tables |
| --- | --- | --- |
| Metrics | `merge_mode=last_row` | `service_resp_time`, `service_cpm` |
| Records | `append_mode=true` | `segment`, `log`, `alarm_record` |
| Management | `merge_mode=last_row` | `ui_template`, `service_traffic` |

Each table has a `greptime_ts TIMESTAMP TIME INDEX` column derived from the model's time bucket, enabling partition pruning on time-range queries.

Index strategy
GreptimeDB supports several index types. Here's how they map to SkyWalking's query patterns:
- **INVERTED INDEX**: for low-to-medium cardinality columns that appear in WHERE equality/range filters. Example: `AlarmQueryDAO.getAlarm()` filters by `scope`, `layer`, and `start_time`; these columns get an INVERTED INDEX so the query doesn't scan the full table after partition pruning.
- **SKIPPING INDEX (bloom filter)**: for high-cardinality columns used in exact-match lookups. Applies to `trace_id`, `segment_id`, and `unique_id`. Example: `TraceQueryDAO.queryByTraceId()` does `WHERE trace_id = ?`; a bloom filter can quickly rule out SST files that don't contain the target trace, without maintaining a full inverted index over millions of unique trace IDs.
- **FULLTEXT INDEX**: for long text columns like log content. Created with `FULLTEXT INDEX WITH(analyzer = 'English', case_sensitive = 'false')` on string columns longer than 16383 characters. Note: the current `LogQueryDAO` implementation doesn't leverage FULLTEXT search yet (the `keywordsOfContent` parameter is not wired up, same as in the JDBC plugin). The index is there so that keyword search can be added in a future iteration without schema changes.
- **Tags as a JSON column**: the JDBC plugin creates separate tag tables (`segment_tag`, `log_tag`, `alarm_record_tag`) and uses INNER JOIN for tag filtering. We store tags as a JSON column on the main table and use GreptimeDB's `json_path_match()` function. This eliminates three extra tables and the JOIN overhead. The tradeoff is that JSON columns don't have index support yet in GreptimeDB; tag filtering relies on scanning within the already-pruned partition and primary-key range. GreptimeDB has JSON indexing on the roadmap.
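A hedged sketch of how these index types could appear in DDL, plus a tag filter via `json_path_match()`. The table and column names are illustrative, and the predicate path assumes tags are stored as a flat JSON object:

```sql
CREATE TABLE alarm_record_sketch (
  greptime_ts   TIMESTAMP TIME INDEX,
  scope         INT INVERTED INDEX,      -- low-cardinality equality/range filter
  trace_id      STRING SKIPPING INDEX,   -- bloom filter for exact-match lookups
  alarm_message STRING FULLTEXT INDEX WITH (analyzer = 'English', case_sensitive = 'false'),
  tags          JSON                     -- replaces the separate *_tag join tables
) WITH ('append_mode' = 'true');

-- Tag filtering without a JOIN: prune by time range first, then match inside the JSON.
SELECT * FROM alarm_record_sketch
WHERE greptime_ts BETWEEN '2024-03-01 00:00:00' AND '2024-03-02 00:00:00'
  AND json_path_match(tags, '$.level == "ERROR"');
```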
DAO implementations
All 40+ DAO interfaces required by `StorageModule` are implemented.

Testing
A full E2E test case lives at `test/e2e-v2/cases/storage/greptimedb/`, using the shared `storage-cases.yaml` verification suite. Tested against GreptimeDB v1.0.0-rc.1. Unit tests cover type conversion, DDL generation, schema registry, and query helper logic (94 tests total).

For step-by-step instructions on running unit tests, E2E tests, and a quick-start demo, see `TESTING.md`.

Imported Dependencies libs and their licenses
- `io.greptime:ingester-all` (Apache-2.0)
- `com.mysql:mysql-connector-j` (GPL-2.0 with Universal FOSS Exception)

Notes:

- The ingester SDK pulls in gRPC transitively (`grpc-netty-shaded`, `grpc-protobuf`, `grpc-stub`).
- `mysql-connector-j` is used to connect to GreptimeDB's MySQL-compatible protocol port. The Universal FOSS Exception explicitly permits use in Apache-2.0 licensed projects. That said, the existing JDBC storage plugin takes a different approach: it doesn't bundle any JDBC driver but relies on users providing one at runtime. If bundling `mysql-connector-j` is a concern, we can adopt the same approach and document it as a deployment requirement. Happy to discuss what the community prefers here.

Compatibility
No breaking changes. This adds a new `storage.selector=greptimedb` option; no impact on existing storage configurations.

General usage docs
GreptimeDB setup
GreptimeDB can run in standalone or cluster mode. For a quick start:
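For example, a single-node instance can be started with Docker. This is a hedged sketch: the image tag and address flags follow GreptimeDB's published quick-start and should be verified against the version you deploy (port 4001 is the gRPC port used by the ingester SDK; 4002 is the MySQL protocol port):

```shell
docker run -p 4000-4003:4000-4003 --name greptime --rm \
  greptime/greptimedb:latest standalone start \
  --http-addr 0.0.0.0:4000 \
  --rpc-bind-addr 0.0.0.0:4001 \
  --mysql-addr 0.0.0.0:4002 \
  --postgres-addr 0.0.0.0:4003
```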
OAP configuration
In `application.yml`, enable the plugin via the storage selector. Tables are created automatically on OAP startup.
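A hedged sketch of what the selector block could look like; the `greptimedb:` option keys shown here are hypothetical illustrations, since the real keys are whatever the plugin's module config defines:

```yaml
storage:
  selector: ${SW_STORAGE:greptimedb}
  greptimedb:
    # hypothetical keys: MySQL-protocol endpoint for queries,
    # gRPC endpoint for ingester writes
    jdbcUrl: ${SW_STORAGE_GREPTIMEDB_JDBC_URL:"jdbc:mysql://127.0.0.1:4002/public"}
    grpcEndpoints: ${SW_STORAGE_GREPTIMEDB_GRPC_ENDPOINTS:"127.0.0.1:4001"}
```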
References
GreptimeDB documentation for concepts mentioned in this proposal:
- `json_path_match()` used for tag filtering (added in v0.10.2)