-
Notifications
You must be signed in to change notification settings - Fork 440
[WIP] Add preliminary Clickhouse formatter #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
a659fd2 to
80f6e1e
Compare
nene
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly in the right direction, but I spotted some issues.
Most glaringly the fact that formatting of SELECT clauses has been completely neglected.
| 'DROP {DICTIONARY | DATABASE | USER | ROLE | QUOTA | PROFILE | SETTINGS PROFILE | VIEW | FUNCTION | NAMED COLLECTION | ROW POLICY | POLICY} [IF EXISTS]', | ||
| 'DROP [TEMPORARY] TABLE [IF EXISTS] [IF EMPTY]', | ||
| // https://clickhouse.com/docs/sql-reference/statements/exists | ||
| 'EXISTS [TEMPORARY] {TABLE | DICTIONARY | DATABASE}', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked docs say that the TABLE | DICTIONARY | DATABASE part is optional.
But making all that optional would result in EXISTS (...) expressions being detected as EXISTS statements.
So I guess you made a deliberate choice here.
| const reservedClauses = expandPhrases([ | ||
| // https://clickhouse.com/docs/sql-reference/statements/explain | ||
| 'EXPLAIN [AST | SYNTAX | QUERY TREE | PLAN | PIPELINE | ESTIMATE | TABLE OVERRIDE]', | ||
| ]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect to see here a list of all the SELECT-clauses like FROM, WHERE, GROUP BY, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't expect to see the EXPLAIN statement in here - for all other dialects the EXPLAIN statement has been listed under tabularOnelineClauses.
| export const keywordPhrases: string[] = [ | ||
| // See documentation of `keywords` above | ||
| 'ADD COLUMN', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should really be just part of keywords.
The -phrases variables should be used for the cases where you're using the expandPhrases() function to expand expressions like TIMESTAMP {WITH | WITHOUT} TIME ZONE.
When you just need multi-word keywords, they work just fine out of the box. Just like you're using multi-word data type names further down.
| 'Variant', | ||
| 'YEAR', | ||
| 'bool', | ||
| 'boolean', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use uppercase for all the syntax. It doesn't really make a difference for the parser, but it'll be more consistent across the codebase. Like it'll simplify searching through the code.
Same applies to the list of functions.
| const standardOnelineClauses = expandPhrases([ | ||
| // https://clickhouse.com/docs/sql-reference/statements/create | ||
| 'CREATE [OR REPLACE] [TEMPORARY] TABLE [IF NOT EXISTS]', | ||
| // https://clickhouse.com/docs/sql-reference/statements/update | ||
| 'UPDATE', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really only the CREATE TABLE syntax should be part of standardOnelineClauses. The rest should be in tabularOnelineClauses.
|
I really would encourage you to add tests as early on as possible. You can start with just: import { format as originalFormat, FormatFn } from '../src/sqlFormatter.js';
import behavesLikeSqlFormatter from './behavesLikeSqlFormatter.js';
describe('ClickhouseFormatter', () => {
const language = 'clickhouse';
const format: FormatFn = (query, cfg = {}) => originalFormat(query, { ...cfg, language });
behavesLikeSqlFormatter(format);
// or maybe (I'm unsure how similar exactly Clickhouse is to PostgreSQL)
// behavesLikePostgresqlFormatter(format);
});That will make sure your configuration will work for the basic stuff that should be the same in all SQL dialects. If any of these bahavesLikeSqlFormatter() tests fail, then I would ask you to fix the problem anyway. It's fairly unlikely that there's something in Clickhouse dialect that would necessitate a change to these core tests. |
|
Also, it would be of great help if you went through the wiki and added information there about the Clickhouse dialect. Especially these first few pages about Identifiers, Parameters, ... Comments. Doing that will also help you to get these details right about Clickhouse dialect. |
|
@nene I appreciate you taking the time to look! I'm very aware that it's not nearly ready to land and that it needs quite a bit of love, I just didn't want to invest a ton more time if it's completely off the mark. I have a commit in progress with the tests; I'm trying to aim for nearly all of the example queries from the docs to get handled correctly. I'll ping you when I have something that's ready for review. |
Addresses #614
I've started putting together a Clickhouse formatter. Before I start getting into the weeds with tests and updating all of the docs/tests/playground, I wanted to check that things are directionally correct. Please let me know if there's anything you'd like to see done differently before I proceed.