The primary key in GreptimeDB isn't the primary key in other databases #4920

Open
nicecui opened this issue Oct 31, 2024 · 1 comment
Labels
C-enhancement Category Enhancements

Comments

@nicecui
Collaborator

nicecui commented Oct 31, 2024

What type of enhancement is this?

API improvement, User experience

What does the enhancement do?

According to Wikipedia, a primary key uniquely identifies a row in a relational table, and that is how the term is understood across the database industry. In GreptimeDB, however, the primary key specifies the tag columns of a time-series table, and the combination of these tags does not uniquely identify a row.

Because GreptimeDB uses "primary key" differently from the industry standard, the term adds communication overhead. People who see "primary key" often assume it is the unique row identifier, which leads to mistakes, and GreptimeDB engineers then have to explain that the primary key does not work the way users expect.

To address this, the CREATE TABLE statements should be adjusted. For example, adding the time index column to the primary key would align the primary key's behavior with users' expectations.

Change the following SQL:

CREATE TABLE grpc_latencies (
  ts TIMESTAMP TIME INDEX,
  host STRING,
  method_name STRING,
  latency DOUBLE,
  PRIMARY KEY (host, method_name)
);

to

CREATE TABLE grpc_latencies (
  ts TIMESTAMP TIME INDEX,
  host STRING,
  method_name STRING,
  latency DOUBLE,
  PRIMARY KEY (host, method_name, ts)
);
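To make the expectation concrete, here is a minimal sketch of how a conventional relational database treats the two key definitions above. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in for "other databases"; it says nothing about how GreptimeDB itself behaves. The helper name `stored_rows`, the sample values, and the simplified column types are assumptions made for illustration, and the `TIME INDEX` clause is omitted because SQLite does not support it.

```python
import sqlite3


def stored_rows(primary_key_cols):
    """Create grpc_latencies in SQLite with the given primary key, try to
    insert two points that share the same tags but differ in timestamp,
    and return how many rows the table ends up holding."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE grpc_latencies ("
        " ts TEXT, host TEXT, method_name TEXT, latency REAL,"
        f" PRIMARY KEY ({', '.join(primary_key_cols)}))"
    )
    # Hypothetical sample data: same host and method, different timestamps.
    points = [
        ("2024-10-31 00:00:00", "host1", "GetUser", 103.0),
        ("2024-10-31 00:00:05", "host1", "GetUser", 110.0),
    ]
    for point in points:
        try:
            conn.execute("INSERT INTO grpc_latencies VALUES (?, ?, ?, ?)", point)
        except sqlite3.IntegrityError:
            # A conventional primary key rejects a second row with the same key.
            pass
    (count,) = conn.execute("SELECT COUNT(*) FROM grpc_latencies").fetchone()
    conn.close()
    return count


tags_only = stored_rows(["host", "method_name"])
tags_and_ts = stored_rows(["host", "method_name", "ts"])
print(tags_only, tags_and_ts)
```

With the tags-only key, SQLite rejects the second point as a duplicate, which is the behavior a user coming from a relational database would expect "primary key" to imply. With `ts` in the key, both points are accepted.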

Implementation challenges

No response

@nicecui nicecui added the C-enhancement Category Enhancements label Oct 31, 2024
@MichaelScofield
Collaborator

Well, I suggest deprecating the term "primary key" in GreptimeDB and using "tag" instead. Like this:

CREATE TABLE grpc_latencies (
  ts TIMESTAMP TIME INDEX,
  host STRING,
  method_name STRING,
  latency DOUBLE,

  -- "PRIMARY KEY" is replaced with "TAG":
  TAG (host, method_name)
);
