Skip to content

Conversation

@mattbasta
Copy link

Addresses #614

I've started putting together a Clickhouse formatter. Before I start getting into the weeds with tests and updating all of the docs/tests/playground, I wanted to check that things are directionally correct. Please let me know if there's anything you'd like to see done differently before I proceed.

Copy link
Collaborator

@nene nene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly in the right direction, but I spotted some issues.

Most glaringly the fact that formatting of SELECT clauses has been completely neglected.

'DROP {DICTIONARY | DATABASE | USER | ROLE | QUOTA | PROFILE | SETTINGS PROFILE | VIEW | FUNCTION | NAMED COLLECTION | ROW POLICY | POLICY} [IF EXISTS]',
'DROP [TEMPORARY] TABLE [IF EXISTS] [IF EMPTY]',
// https://clickhouse.com/docs/sql-reference/statements/exists
'EXISTS [TEMPORARY] {TABLE | DICTIONARY | DATABASE}',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linked docs say that the TABLE | DICTIONARY | DATABASE part is optional.

But making all that optional would result in EXISTS (...) expressions being detected as EXISTS statements.

So I guess you made a deliberate choice here.

Comment on lines +8 to +11
const reservedClauses = expandPhrases([
// https://clickhouse.com/docs/sql-reference/statements/explain
'EXPLAIN [AST | SYNTAX | QUERY TREE | PLAN | PIPELINE | ESTIMATE | TABLE OVERRIDE]',
]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect to see here a list of all the SELECT-clauses like FROM, WHERE, GROUP BY, etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't expect to see the EXPLAIN statement in here - for all other dialects the EXPLAIN statement has been listed under tabularOnelineClauses.

Comment on lines +436 to +438
export const keywordPhrases: string[] = [
// See documentation of `keywords` above
'ADD COLUMN',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should really be just part of keywords.

The -phrases variables should be used for the cases where you're using the expandPhrases() function to expand expressions like TIMESTAMP {WITH | WITHOUT} TIME ZONE.

When you just need multi-word keywords, they work just fine out of the box. Just like you're using multi-word data type names further down.

'Variant',
'YEAR',
'bool',
'boolean',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use uppercase for all the syntax. It doesn't really make a difference for the parser, but it'll be more consistent across the codebase. Like it'll simplify searching through the code.

Same applies to the list of functions.

Comment on lines +13 to +17
const standardOnelineClauses = expandPhrases([
// https://clickhouse.com/docs/sql-reference/statements/create
'CREATE [OR REPLACE] [TEMPORARY] TABLE [IF NOT EXISTS]',
// https://clickhouse.com/docs/sql-reference/statements/update
'UPDATE',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really only the CREATE TABLE syntax should be part of standardOnelineClauses. The rest should be in tabularOnelineClauses.

@nene
Copy link
Collaborator

nene commented Nov 15, 2025

I really would encourage you to add tests as early on as possible. You can start with just:

import { format as originalFormat, FormatFn } from '../src/sqlFormatter.js';
import behavesLikeSqlFormatter from './behavesLikeSqlFormatter.js';

describe('ClickhouseFormatter', () => {
  const language = 'clickhouse';
  const format: FormatFn = (query, cfg = {}) => originalFormat(query, { ...cfg, language });

  behavesLikeSqlFormatter(format);
  // or maybe (I'm unsure how similar exactly Clickhouse is to PostgreSQL)
  // behavesLikePostgresqlFormatter(format);
});

That will make sure your configuration will work for the basic stuff that should be the same in all SQL dialects.

If any of these bahavesLikeSqlFormatter() tests fail, then I would ask you to fix the problem anyway. It's fairly unlikely that there's something in Clickhouse dialect that would necessitate a change to these core tests.

@nene
Copy link
Collaborator

nene commented Nov 15, 2025

Also, it would be of great help if you went through the wiki and added information there about the Clickhouse dialect.

Especially these first few pages about Identifiers, Parameters, ... Comments.

Doing that will also help you to get these details right about Clickhouse dialect.

@mattbasta
Copy link
Author

@nene I appreciate you taking the time to look! I'm very aware that it's not nearly ready to land and that it needs quite a bit of love, I just didn't want to invest a ton more time if it's completely off the mark. I have a commit in progress with the tests; I'm trying to aim for nearly all of the example queries from the docs to get handled correctly. I'll ping you when I have something that's ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants