Concepts

GreptimeDB is an open-source time-series database with a special focus on scalability, analytical capabilities and efficiency. It's designed to work on infrastructure of the cloud era, and users benefit from its elasticity and commodity storage.

Our core developers have been building time-series data platforms for years. Based on their best practices, GreptimeDB was built to bring you:

  • A standalone binary that scales to a highly available distributed cluster, providing a transparent experience for cluster users
  • An optimized columnar layout for handling time-series data: compacted, compressed, and stored on various storage backends
  • Flexible index options to tackle high-cardinality issues
  • Distributed, parallel query execution that leverages elastic computing resources
  • Native SQL, and Python scripting for advanced analytical scenarios
  • Widely adopted database protocols and APIs
  • Extensible table engine architecture for extensive workloads

Components

To form a robust database cluster while keeping complexity at an acceptable level, the GreptimeDB architecture has three main components: Datanode, Frontend and Meta.

  • Datanodes hold the regions of tables and their data in a GreptimeDB cluster. A Datanode accepts read and write requests sent from Frontend and executes them against its data. A single-instance Datanode deployment can also be used as GreptimeDB standalone mode, for local development.
  • Frontend is a stateless component that can scale to as many instances as needed. It accepts incoming requests, authenticates them, translates them from various protocols into the GreptimeDB cluster's internal one, and forwards them to the appropriate Datanodes under guidance from Meta.
  • Meta is the central command of a GreptimeDB cluster. In a typical deployment, at least three nodes are required to set up a reliable Meta mini-cluster. Meta manages database and table information, including how data is spread across the cluster and where requests should be routed. It also continuously monitors the availability and performance of Datanodes, to ensure its routing table is valid and up to date.
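
The division of labor among the three components can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GreptimeDB's implementation: the class names mirror the components, but every method (`register`, `mark_down`, `route`, `plan`) and the node and table names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Meta:
    """Toy stand-in for Meta: tracks which Datanode serves each region."""
    routes: dict = field(default_factory=dict)   # (table, region_id) -> datanode name
    healthy: set = field(default_factory=set)    # datanodes currently considered available

    def register(self, table, region_id, datanode):
        self.routes[(table, region_id)] = datanode
        self.healthy.add(datanode)

    def mark_down(self, datanode):
        # In this sketch, an unavailable Datanode is simply excluded from routing.
        self.healthy.discard(datanode)

    def route(self, table):
        """Return region_id -> datanode for the healthy regions of `table`."""
        return {
            region_id: node
            for (t, region_id), node in sorted(self.routes.items())
            if t == table and node in self.healthy
        }


@dataclass
class Frontend:
    """Stateless: holds no data of its own, only a handle to Meta."""
    meta: Meta

    def plan(self, table):
        # Ask Meta where the table's regions live, then list the forwarding targets.
        return [
            f"forward to {node} (region {region_id})"
            for region_id, node in self.meta.route(table).items()
        ]


meta = Meta()
meta.register("cpu_metrics", 0, "datanode-1")
meta.register("cpu_metrics", 1, "datanode-2")
frontend = Frontend(meta)

plan_before = frontend.plan("cpu_metrics")
meta.mark_down("datanode-2")   # simulate Meta noticing an unavailable Datanode
plan_after = frontend.plan("cpu_metrics")
```

Because the Frontend in this sketch keeps no state beyond its reference to Meta, any number of Frontend instances could be created against the same Meta without coordination, which is the property that makes a stateless component easy to scale.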

Objects

To understand how GreptimeDB manages and serves its data, you need to know about these building blocks of GreptimeDB.

Database

Similar to a database in a relational database system, a database in GreptimeDB is the minimal unit of data container, within which data can be managed and computed.

Table

A table in GreptimeDB is similar to a table in a traditional relational database, except that it requires a timestamp column. A table holds a set of data that shares a common schema. It can either be created with SQL CREATE TABLE or inferred from the structure of the input data (the auto-schema feature). In a distributed deployment, a table can be split into multiple partitions that sit on different Datanodes.

Table Region

Each partition of a distributed table is called a region. A region may contain a sequence of contiguous data, depending on the partition algorithm. Region information is managed by Meta and is completely transparent to users who send queries.
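
As one possible illustration of a partition algorithm that yields contiguous regions, the sketch below assigns rows to regions by range on a partition column. The split points, the `region_for` helper, and the sample rows are all hypothetical; the actual partitioning rules used by GreptimeDB are not described here.

```python
from bisect import bisect_right

# Hypothetical split points on a string partition column. They define three
# regions covering (-inf, "h"), ["h", "p"), and ["p", +inf).
SPLIT_POINTS = ["h", "p"]


def region_for(key: str) -> int:
    """Return the id of the region whose key range contains `key`."""
    return bisect_right(SPLIT_POINTS, key)


rows = [("alpha", 0.31), ("host-7", 0.72), ("zeta", 0.15), ("hub", 0.50)]

# Group rows by the region they fall into.
regions = {}
for host, value in rows:
    regions.setdefault(region_for(host), []).append((host, value))
```

With range partitioning like this, each region holds a contiguous slice of the key space, so a query that filters on the partition column only needs to touch the regions whose ranges overlap the filter.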

Data Types

Data in GreptimeDB is strongly typed. The auto-schema feature provides some flexibility when creating a table. Once the table is created, all data in the same column must share a common data type.
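
The interplay between auto-schema and strong typing can be sketched as follows. This is a simplified model, not GreptimeDB's actual behavior or API: Python types stand in for database column types, and `infer_schema` and `check_row` are hypothetical helpers invented for this example.

```python
def infer_schema(row: dict) -> dict:
    """Infer a column -> type mapping from the first row written."""
    return {column: type(value) for column, value in row.items()}


def check_row(schema: dict, row: dict) -> None:
    """Raise TypeError if a value does not match its column's established type."""
    for column, value in row.items():
        expected = schema.get(column)
        if expected is not None and type(value) is not expected:
            raise TypeError(
                f"column {column!r} expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )


# The first write fixes the schema (the flexible, auto-schema step).
schema = infer_schema({"host": "web-1", "ts": 1700000000000, "cpu": 0.42})

# A later row with matching types passes the check.
check_row(schema, {"host": "web-2", "ts": 1700000001000, "cpu": 0.37})

# A row whose "cpu" value is a string no longer fits the column type.
try:
    check_row(schema, {"host": "web-3", "ts": 1700000002000, "cpu": "high"})
    rejected = False
except TypeError:
    rejected = True
```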

Currently, we have these data types built-in:

  • Boolean
  • Integers (8-bit, 16-bit, 32-bit and 64-bit)
  • Unsigned integers (8-bit, 16-bit, 32-bit and 64-bit)
  • Float numbers (32-bit and 64-bit)
  • Bytes
  • String
  • Date, datetime and timestamp

New types are planned for upcoming releases:

  • Compound types such as List
  • Geometry

APIs

GreptimeDB provides multiple types of APIs to fit into your existing data stack. Currently, the database can be accessed in these ways:

What's Next