dq

Query CSV, JSON, Avro, and Parquet files from the command line. Pipe operations together, like jq but for tables.

dq 'users.csv | filter { age > 25 } | select name city | sorta name'
dq -o csv 'users.json | filter { age > 25 }' > filtered.csv

Install

Homebrew:

brew tap razeghi71/tap
brew install dq

Go:

go install github.com/razeghi71/dq/cmd/dq@latest

Or build from source:

git clone https://github.com/razeghi71/dq.git
cd dq
go build -ldflags="-s -w" -o dq ./cmd/dq

How It Works

Every query starts with a file and pipes it through operations. Each operation takes a table in and returns a table out.

dq 'file.csv | operation1 | operation2 | ...'

Wrap queries in single quotes so your shell doesn't interpret |, {, }, or >.
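You can see why the quotes matter without running dq at all, by inspecting how the shell splits the arguments (a plain-shell sketch; `set --` just builds an argument list):

```shell
# Right: single quotes deliver the whole query as one argument:
#   dq 'users.csv | filter { age > 25 }'
# Wrong: unquoted, the shell splits at | and tries to run a command named "filter":
#   dq users.csv | filter { age > 25 }
# The difference is visible without dq: quoted, the query is a single argv entry.
set -- dq 'users.csv | filter { age > 25 }'
echo "argc=$#"    # prints argc=2: the program name plus one query argument
```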

Operations

head / tail - Get rows from the start or end

dq 'users.csv | head'          # first 10 rows (default)
dq 'users.csv | head 20'       # first 20 rows
dq 'users.csv | tail'          # last 10 rows (default)
dq 'users.csv | tail 5'        # last 5 rows

select - Keep only the columns you want

dq 'users.csv | select name age city'
dq 'data.json | select name address.city'        # nested field -> column "address_city"

remove - Drop columns you don't need

dq 'users.csv | remove password ssn'

filter - Keep rows that match a condition

Expressions go inside { }. Use == for equality and the keywords and / or to combine conditions.

dq 'users.csv | filter { age > 25 }'
dq 'users.csv | filter { age > 25 and city == "NY" }'
dq 'users.csv | filter { email is not null }'
dq 'data.json | filter { address.city == "NY" }'    # nested field access

sorta / sortd - Sort rows ascending or descending

dq 'users.csv | sorta age'          # youngest first
dq 'users.csv | sortd age'          # oldest first
dq 'users.csv | sorta city age'     # sort by city, then age

count - Count how many rows

dq 'users.csv | count'
dq 'users.csv | filter { age > 30 } | count'

distinct - Remove duplicate rows

dq 'users.csv | distinct'             # unique rows
dq 'users.csv | distinct city'        # unique cities (keeps first occurrence)
dq 'users.csv | distinct city age'    # unique city+age combinations
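Like every other operation, distinct composes in a pipeline; for instance, counting unique values (a sketch reusing the sample file from above):

```shell
# number of different cities in the file
dq 'users.csv | distinct city | count'
```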

rename - Rename columns

Names are paired: old then new. Use backticks for column names with spaces.

dq 'users.csv | rename name username'
dq 'users.csv | rename `first name` first_name `last name` last_name'

transform - Create or overwrite columns with computed values

Assignments are comma-separated and evaluated row by row.

dq 'users.csv | transform age2 = age * 2'
dq 'users.csv | transform name = upper(name), age_months = age * 12'
dq 'sales.csv | transform total = coalesce(quantity, 0) * coalesce(price, 0)'

group - Group rows by column values

Collects rows that share the same value(s) into groups. The non-grouped columns are nested into a column called grouped (or a custom name with as).

dq 'users.csv | group city'
city | grouped
---- | ----------------------------------------------------------
NY   | [ {name:alice,age:30,city:NY}, {name:bob,age:25,city:NY} ]
LA   | [ {name:carol,age:28,city:LA} ]

All original columns (including group keys) are preserved in the nested records.

dq 'users.csv | group city as people'       # custom nested column name
dq 'users.csv | group city department'       # group by multiple columns
dq 'data.json | group address.city'          # group by nested field -> key column "address_city"

reduce - Aggregate over grouped rows

Runs aggregation functions (sum, avg, count, etc.) over the nested rows created by group. The nested column is kept after reduction -- use remove to drop it.

dq 'users.csv | group city | reduce avg_age = avg(age), n = count()'
city | grouped                                                    | avg_age | n
---- | ---------------------------------------------------------- | ------- | -
NY   | [ {name:alice,age:30,city:NY}, {name:bob,age:25,city:NY} ] | 27.5    | 2
LA   | [ {name:carol,age:28,city:LA} ]                            | 28      | 1

By default reduce operates on the column named grouped. If you used a custom name with group ... as, or if you have a pre-existing list column (e.g. from a Parquet/JSON file), pass the column name as the first argument:

dq 'users.csv | group city as people | reduce people avg_age = avg(age)'
dq 'orders.parquet | reduce orders total = sum(amount), n = count()'

group + reduce - Putting them together

# average age per city, clean output
dq 'users.csv | group city | reduce avg_age = avg(age) | remove grouped'

# total revenue per category, top 3
dq 'sales.csv | group category | reduce total = sum(price), n = count() | remove grouped | sortd total | head 3'

Functions

For reduce (aggregate across rows): count(), sum(col), avg(col), min(col), max(col), first(col), last(col)

For transform (compute per row): upper(s), lower(s), len(s), trim(s), substr(s, start, len), coalesce(a, b, ...), if(cond, then, else), year(d), month(d), day(d)

Operators (work everywhere): +, -, *, /, ==, !=, <, >, <=, >=, and, or, not
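The per-row and aggregate functions slot into transform and reduce respectively. A sketch combining a few of them (the stage and youngest/oldest column names are made up for illustration):

```shell
# per-row: normalize a name and bucket by age
dq 'users.csv | transform name = trim(upper(name)), stage = if(age >= 18, "adult", "minor")'

# aggregate: age range per city
dq 'users.csv | group city | reduce oldest = max(age), youngest = min(age) | remove grouped'
```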

Nested Fields

JSON, Avro, and Parquet files can contain nested records. Use dot notation to access sub-fields in filter, transform, select, and group:

dq 'data.json | filter { address.city == "Chicago" }'
dq 'data.json | transform city = address.city | select name city'
dq 'data.json | filter { profile.stats.logins > 10 }'
dq 'data.json | select name address.city'                          # -> columns: name, address_city
dq 'data.json | group address.city | reduce n = count() | remove grouped'

Dot paths in select and group flatten to underscore-separated column names (e.g., address.city becomes address_city). If a column with that name already exists, a numeric suffix is added (address_city_2).

Missing sub-fields return null.

Output Formats

By default dq prints a pretty ASCII table. Use -o to change the output format:

dq 'users.csv | select name age'                        # table (default)
dq -o csv  'users.csv | select name age' > out.csv      # CSV
dq -o json 'users.csv | select name age' > out.json     # JSON array of objects
dq -o jsonl 'users.csv | select name age' > out.jsonl   # one JSON object per line
Format | Flag     | Notes
------ | -------- | ----------------------------------------------------------
table  | default  | Pretty-printed ASCII table
csv    | -o csv   | Standard CSV. Nulls render as empty strings.
json   | -o json  | JSON array. Preserves types (ints, bools, nulls, nested).
jsonl  | -o jsonl | One JSON object per line. Same type preservation as JSON.
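Because jsonl emits one self-contained object per line, it streams cleanly into line-oriented tools such as jq (a usage sketch assuming jq is installed; jq is not part of dq):

```shell
# extract a single field from each row
dq -o jsonl 'users.csv | select name age' | jq -r '.name'
```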

Supported Input Formats

CSV (.csv), JSON (.json), JSONL (.jsonl), Avro (.avro), Parquet (.parquet)

License

MIT
