A simple command-line tool to split large CSV files into multiple smaller parts with customizable options.
- Split CSV files into any number of parts
- Automatically generates output filenames with _part1, _part2, etc. suffixes
- Preserves original file extension and directory
- Optional header inclusion in all output files
- Support for custom column separators
- Even distribution of rows across parts
- Handles files with any number of rows efficiently
- Smart newline handling: Automatically replaces line breaks with spaces in quoted CSV fields while preserving them in unquoted fields

Build:
go build -o splitcsv main.go

Usage:
./splitcsv -in <input-file> [options]

Options:
- -in: Input CSV file path (required)
- -parts: Number of parts to split into (default: 2)
- -header: Include header row in all output files (default: true)
- -comma: Column separator character (default: ",")
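The options above map naturally onto Go's standard `flag` package. The following is a minimal sketch, not the tool's actual source: the `options` struct and `parseArgs` function are hypothetical names, and the mapping of the two-character sequence `\t` to a tab is an assumption suggested by the tab-separator example rather than something the README confirms.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"unicode/utf8"
)

// options holds the parsed command-line settings (hypothetical type).
type options struct {
	in     string
	parts  int
	header bool
	comma  rune
}

// parseArgs parses and validates the documented flags.
func parseArgs(args []string) (options, error) {
	fs := flag.NewFlagSet("splitcsv", flag.ContinueOnError)
	in := fs.String("in", "", "input CSV file path (required)")
	parts := fs.Int("parts", 2, "number of parts to split into")
	header := fs.Bool("header", true, "include header row in all output files")
	comma := fs.String("comma", ",", "column separator character")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if *in == "" {
		return options{}, errors.New("-in is required")
	}
	if *parts < 1 {
		return options{}, errors.New("-parts must be at least 1")
	}
	sep := *comma
	if sep == `\t` { // assumption: a literal backslash-t means a tab
		sep = "\t"
	}
	if utf8.RuneCountInString(sep) != 1 {
		return options{}, errors.New("-comma must be a single character")
	}
	r, _ := utf8.DecodeRuneInString(sep)
	return options{in: *in, parts: *parts, header: *header, comma: r}, nil
}

func main() {
	// A fixed argument list for illustration; a real tool would pass os.Args[1:].
	opts, err := parseArgs([]string{"-in", "data.csv", "-parts", "5"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```

Using a `flag.FlagSet` instead of the package-level flags keeps the parsing logic testable without touching `os.Args`.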
Split a CSV file into 2 parts (default):
./splitcsv -in data.csv
Output: data_part1.csv, data_part2.csv
Split into 5 parts:
./splitcsv -in sales_data.csv -parts 5
Output: sales_data_part1.csv, sales_data_part2.csv, ..., sales_data_part5.csv
Split without including headers in output files:
./splitcsv -in data.csv -parts 3 -header=false
Split a semicolon-separated file:
./splitcsv -in european_data.csv -parts 4 -comma ";"
Split a large file with tab separator into 10 parts without headers:
./splitcsv -in huge_dataset.tsv -parts 10 -comma "\t" -header=false

How it works:
- Row Counting: First pass counts total data rows (excluding header)
- Distribution: Calculates optimal row distribution across parts
- File Generation: Creates output files with _partN suffix
- Data Writing: Distributes rows evenly, with extra rows going to first parts
For a file with 100 data rows split into 3 parts:
- Part 1: 34 rows
- Part 2: 33 rows
- Part 3: 33 rows
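When the row count does not divide evenly, the leftover rows go one each to the first parts, so no two parts differ by more than one row. Here 100 = 3 × 33 + 1, so only Part 1 receives an extra row.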
The tool automatically handles CSV fields that contain line breaks:
- Quoted fields: Line breaks (\n and \r\n) inside quoted fields are automatically replaced with spaces
- Unquoted fields: Line breaks in unquoted fields are preserved as-is
- Escaped quotes: Properly handles escaped quotes ("") within quoted fields
Example:
name,description,price
"Product A","This is a long
description with line breaks",100
Product B,Simple description,200
After processing:
name,description,price
"Product A","This is a long description with line breaks",100
Product B,Simple description,200
This ensures that CSV files with multi-line content in quoted fields remain properly formatted and compatible with standard CSV parsers.
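One way to get this behavior is a small state machine that tracks whether the scanner is currently inside a quoted field. The sketch below is an assumption about the approach, not the tool's actual code, and `flattenQuotedNewlines` is a hypothetical name. An escaped quote (`""`) toggles the state twice, so it leaves the scanner inside the field without special handling.

```go
package main

import (
	"fmt"
	"strings"
)

// flattenQuotedNewlines replaces \n and \r\n with a single space when they
// occur inside a quoted field, and copies every other byte unchanged.
func flattenQuotedNewlines(s string) string {
	var b strings.Builder
	inQuotes := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '"':
			// Opening, closing, or one half of an escaped "" pair.
			inQuotes = !inQuotes
			b.WriteByte(c)
		case inQuotes && c == '\r' && i+1 < len(s) && s[i+1] == '\n':
			b.WriteByte(' ')
			i++ // consume the \n of the \r\n pair
		case inQuotes && c == '\n':
			b.WriteByte(' ')
		default:
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	input := "name,description,price\n" +
		"\"Product A\",\"This is a long\ndescription with line breaks\",100\n" +
		"Product B,Simple description,200\n"
	fmt.Print(flattenQuotedNewlines(input))
}
```

Working on raw bytes rather than parsed records keeps the original quoting intact, which matches the "After processing" output shown above.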
Output files follow this pattern:
{original_name}_part{N}{original_extension}
Examples:
- data.csv → data_part1.csv, data_part2.csv
- sales_2024.csv → sales_2024_part1.csv, sales_2024_part2.csv
- export.tsv → export_part1.tsv, export_part2.tsv
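The naming pattern can be reproduced with `filepath.Ext` from the standard library. A minimal sketch, where `outputName` is a hypothetical helper name; because only the extension is split off, any directory prefix in the input path is kept as-is.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// outputName builds {original_name}_part{N}{original_extension} for the given input path.
func outputName(path string, n int) string {
	ext := filepath.Ext(path)            // e.g. ".csv", or "" if there is none
	base := strings.TrimSuffix(path, ext) // path without the extension
	return fmt.Sprintf("%s_part%d%s", base, n, ext)
}

func main() {
	fmt.Println(outputName("data.csv", 1))
	fmt.Println(outputName("export.tsv", 2))
}
```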
The tool will exit with an error message if:
- Input file doesn't exist or can't be read
- Input file has no data rows
- Number of parts is less than 1
- Separator is not a single character
- Output files can't be created

Performance:
- Memory efficient: processes files row by row
- Two-pass reading: first for counting, second for splitting
- Supports files of any size (limited only by available disk space)

Requirements:
- Go 1.16 or later
- Read permission for input file
- Write permission for output directory
This project is open source and available under the MIT License.