- Introduction
- Setup Guide
- Tools and Technologies
- Command Reference
- Transaction Management
- Error Handling
- Environment Configuration
- Deployment
- Internal Architecture
- Todo
- License
TCP-DuckDB is a TCP server implementation that provides networked access to DuckDB databases. The server enables remote database management with features like user authentication, access control, and PostgreSQL integration. Written in Go, it leverages the power of DuckDB, a lightweight analytical database engine.
- TCP Interface: Network-accessible database service
- User Authentication: Multi-user support with authentication
- Database Management: Create and manage DuckDB databases
- Permission Control: Fine-grained access permissions at database and table levels
- PostgreSQL Integration: Link with PostgreSQL databases
- Transaction Support: Full transaction support for data operations
- Connection Pooling: Efficient database connection management
- Docker and Docker Compose installed on your system
- Git to clone this repository
-
Clone the repository:
git clone https://github.com/rag-nar1/tcp-duckdb.git cd TCP-Duckdb
-
Build and start the server using Docker Compose:
docker compose up --build
-
Check logs to verify the server is running:
docker logs tcp-duckdb-tcp-duckdb-1
You should see output like:
INFO YYYY/MM/DD HH:MM:SS Super user created INFO YYYY/MM/DD HH:MM:SS listening to 0.0.0.0:4000
-
Stop the server when finished:
docker compose down
If you prefer to build and run manually:
-
Build the Docker image:
docker build -t tcp-duckdb .
-
Run the container:
docker run -d -p 4000:4000 \ -v $(pwd)/storge:/app/storge \ -v $(pwd)/server:/app/server \ -e ServerPort=4000 \ -e ServerAddr=0.0.0.0 \ -e DBdir=/app/storge/server/ \ -e ServerDbFile=db.sqlite3 \ -e ENCRYPTION_KEY=A15pG0m3hwf0tfpVW6m92eZ6vRmAQA3C \ --name tcp-duckdb-container \ tcp-duckdb
The server can be configured using environment variables:
Variable | Description | Default |
---|---|---|
ServerPort | Port the server listens on | 4000 |
ServerAddr | Address the server binds to | 0.0.0.0 |
DBdir | Directory for the SQLite database | /app/storge/server/ |
ServerDbFile | SQLite database filename | db.sqlite3 |
ENCRYPTION_KEY | Key used for encryption | ENCRYPTION_KEY |
You can modify these values in the docker-compose.yml
file or pass them directly when running the container.
To build and run the application locally:
- Install Go 1.24 or later
- Install SQLite development libraries
- Clone the repository
- Run:
go mod download go build -o ./build/server main/* ./build/server
Database initialization errors
If you see errors related to database tables, ensure the schema is correctly applied:
# Connect to the running container
docker exec -it tcp-duckdb-tcp-duckdb-1 bash
# Verify the database exists
ls -la /app/storge/server/
# Manually apply the schema if needed
sqlite3 /app/storge/server/db.sqlite3 < /app/storge/server/scheme.sql
Connection issues
The server listens on TCP port 4000. Verify the port is correctly mapped and not blocked by a firewall.
- Clone the repository:
git clone https://github.com/rag-nar1/tcp-duckdb.git
cd TCP-Duckdb
- Install dependencies:
go mod download
- Configure environment variables:
Create or modify the
.env
file in the project root:
ServerPort=4000
ServerAddr=localhost
DBdir=/path/to/storage/
ServerDbFile=server/db.sqlite3
ENCRYPTION_KEY="YourEncryptionKey"
- Build the project:
make build
- Run the server:
make run
- Go (Golang): The primary programming language used for the entire codebase. Go was chosen for its efficiency in building networked services, excellent concurrency support through goroutines, and strong standard library.
-
DuckDB: A lightweight, in-process analytical database management system. It serves as the primary storage engine for the application, providing fast analytical query capabilities.
-
SQLite: Used for storing user authentication and permission data. SQLite was chosen for its simplicity, reliability, and zero-configuration nature.
-
PostgreSQL: Supported as an optional integration, allowing linking and synchronization with PostgreSQL databases. The system can replicate schema and data from PostgreSQL into DuckDB.
-
go-duckdb: The Go driver for DuckDB that enables interaction with DuckDB databases from Go code.
-
go-sqlite3: The Go interface to the SQLite3 database, used for user management.
-
lib/pq: PostgreSQL driver for Go, used for connecting to PostgreSQL databases when using the link functionality.
-
godotenv: Used for loading environment variables from .env files.
-
Standard Library Packages:
net
: Core networking functionality for TCP server implementationdatabase/sql
: Database interactionsync
: Synchronization primitives for concurrent operationsbufio
: Buffered I/O operationscrypto
: Cryptographic functions for secure password hashing
-
Makefile: Used for build automation, with predefined tasks for building, running, and code formatting.
-
Git: Version control system with custom pre-commit hooks for code formatting.
-
Environment Configuration: Uses .env files for configuration management.
-
Connection Pool Implementation: Custom LRU (Least Recently Used) cache implementation for efficient database connection management.
[All Users]
Authenticates a user to access the server. This is the first command that must be executed before any other operation can be performed.
login [username] [password]
Authentication Process:
- The client sends the login command with username and password
- The server validates the credentials against the SQLite user database
- If successful, a user session is established with appropriate privileges
- All subsequent commands will operate under this authenticated user context
Super User Information:
- The default super user is
duck
with initial passwordduck
- The super user has full administrative privileges including:
- Creating and managing databases
- Creating and managing users
- Granting permissions
- Linking with PostgreSQL databases
- Performing update operations
- For security reasons, it is strongly recommended to change the super user password after initial setup using the update command
Example:
login duck duck
Response on success:
success
[Super User Only]
Creates a new DuckDB database (requires super user privileges).
create database [database_name]
Example:
create database analytics
Response on success:
success
Creates a new user (requires super user privileges).
create user [username] [password]
Example:
create user analyst securepassword
Response on success:
success
[All Users]
Connects to an existing database to execute queries.
connect [database_name]
After connecting, the system:
- Verifies database existence
- Checks user permissions
- Acquires database connection from pool
- Allows executing queries or starting transactions
Example:
connect analytics
Response on success:
success
Once connected, you can execute SQL queries directly:
SELECT * FROM users;
[Super User Only]
Grants database access to a user (requires super user privileges).
grant database [database_name] [username] [access_type]
Access types:
read
: Read-only accesswrite
: Read and write access
Example:
grant database analytics analyst read
Response on success:
success
Grants table-level permissions to a user (requires super user privileges).
grant table [database_name] [table_name] [username] [access_type...]
Access types:
select
: Permission to query the tableinsert
: Permission to insert dataupdate
: Permission to update datadelete
: Permission to delete data
Example:
grant table analytics users analyst select insert
Response on success:
success
[Super User Only]
Links a DuckDB database with a PostgreSQL database (requires super user privileges).
link [database_name] [postgresql_connection_string]
Implementation:
- Connects to the PostgreSQL database
- Retrieves schema information
- Creates corresponding tables in DuckDB
- Copies data from PostgreSQL to DuckDB
- Sets up audit triggers for change tracking
Example:
link analytics "postgresql://user:password@localhost:5432/analytics_pg"
Response on success:
success
[Super User Only]
Synchronizes changes from a linked PostgreSQL database to DuckDB (requires super user privileges).
migrate [database_name]
Implementation:
- Reads audit logs from PostgreSQL
- Applies changes to the DuckDB database
- Updates tracking information
Example:
migrate analytics
Response on success:
success
[Super User Only]
Updates database names or user information (requires super user privileges).
The update command has three variations:
update database [old_database_name] [new_database_name]
Implementation:
- Verifies database existence
- Renames the database file
- Updates database name in server records
Example:
update database analytics analytics_prod
Response on success:
success
update user username [old_username] [new_username]
Implementation:
- Verifies user existence
- Updates username in user database
Example:
update user username analyst data_scientist
Response on success:
success
update user password [username] [new_password]
Implementation:
- Verifies user existence
- Hashes the new password
- Updates password in user database
Example:
update user password analyst new_secure_password
Response on success:
success
[All Users with Database Access]
After connecting to a database, you can manage transactions:
start transaction
INSERT INTO users VALUES (1, 'John');
UPDATE users SET name = 'Johnny' WHERE id = 1;
commit
rollback
Example:
connect analytics
start transaction
INSERT INTO users VALUES (1, 'John');
UPDATE users SET name = 'Johnny' WHERE id = 1;
commit
The server implements structured error responses:
response.BadRequest(writer)
response.InternalError(writer)
response.UnauthorizedError(writer)
response.DoesNotExistDatabse(writer, dbname)
response.AccesDeniedOverDatabase(writer, UserName, dbname)
The server uses the following environment variables:
ServerPort
: TCP port for the server (default: 4000)ServerAddr
: Server address (default: localhost)DBdir
: Directory for storing databasesServerDbFile
: Path to the server's SQLite databaseENCRYPTION_KEY
: Key for encrypting/decrypting PostgreSQL connection strings
The server can be deployed on any system with Go and DuckDB installed:
- Build the server:
make build
-
Configure environment in
.env
-
Run the server:
make run
The main server component is responsible for the core server functionality:
- Configuration Management: Loads environment variables and configures the server
- Database Connection: Maintains connection to the SQLite user database
- Statement Preparation: Prepares SQL statements for efficient execution
- Super User Management: Creates and manages the super user account
- Logging: Implements structured logging for errors and information
Key Files:
config.go
: Server configuration and initializationserver.go
: Core server functionality implementation
Key Functions:
NewServer()
: Initializes server with configurationsCreateSuper()
: Creates the initial super user if it doesn't existPrepareStmt()
: Prepares SQL statements for later use
The Server struct centralizes server state:
type Server struct {
Sqlitedb *sql.DB // SQLite database connection
Dbstmt map[string]*sql.Stmt // Prepared statements
Pool *request_handler.RequestHandler // Connection pool
Port string // Server port
Address string // Full server address
InfoLog *log.Logger // Information logger
ErrorLog *log.Logger // Error logger
}
The entry point for the TCP server:
- Server Initialization: Initializes the server components
- TCP Listener: Sets up TCP socket and listens for connections
- Connection Handling: Accepts connections and spawns goroutines
- Command Routing: Routes incoming commands to appropriate handlers
Key Files:
main.go
: Entry point with TCP listenerrouter.go
: Command routing implementation
Key Functions:
main()
: Starts the TCP server listening on configured portHandleConnection()
: Processes each client connectionRouter()
: Routes requests to appropriate command handlers
Manages the lifecycle of database requests:
- Request Queue: Maintains queue of database connection requests
- Connection Pooling Integration: Works with pool module
- Concurrency Management: Handles simultaneous requests safely
Key Files:
request_handler.go
: Core request handling logic
Key Functions:
HandleRequest()
: Processes each database requestSpin()
: Starts the request handling background processPush()
: Adds new requests to the queue
Implements efficient connection management for DuckDB databases:
- LRU Cache: Implements Least Recently Used replacement policy
- Connection Limits: Manages maximum number of open connections
- Pin Counting: Tracks active database usage
- Resource Management: Efficiently manages database handles
Key Files:
pool.go
: Connection pool implementationlru.go
: LRU cache implementation for connection eviction
Key Functions:
Get()
: Retrieves a database connection from the poolNewPool()
: Creates a new connection poolRecordAccess()
: Updates access time for LRU tracking
Handles user authentication:
- Credential Verification: Validates username and password
- Password Hashing: Securely stores and validates passwords
- Session Establishment: Sets up user session after authentication
Key Files:
handler.go
: Authentication request handlingservice.go
: Authentication logic implementation
Key Functions:
Handler()
: Processes login requestsLogin()
: Validates credentials against database
Manages creation of databases and users:
- Database Creation: Creates new DuckDB databases
- User Creation: Creates new user accounts
- Permission Initialization: Sets up initial permissions
Key Files:
handler.go
: Request handling for creation operationsservice.go
: Implementation of creation operations
Key Functions:
CreateDatabase()
: Creates a new databaseCreateUser()
: Creates a new user
Manages database connections and query execution:
- Connection Establishment: Connects to specified database
- Permission Checking: Verifies user has access to database
- Query Execution: Executes SQL queries on connected database
- Transaction Management: Handles SQL transactions
Key Files:
handler.go
: Connection request handlingservice.go
: Query execution implementationtransaction.go
: Transaction management
Key Functions:
Handler()
: Processes connection requestsQueryService()
: Executes individual queriesTransaction()
: Manages database transactions
Manages access permissions:
- Database Permissions: Controls database access rights
- Table Permissions: Controls table-level access rights
- Permission Checking: Verifies permissions before granting
Key Files:
handler.go
: Grant request handlingservice.go
: Permission management implementation
Key Functions:
GrantDatabaseAccess()
: Grants database-level accessGrantTableAccess()
: Grants table-level access
Facilitates PostgreSQL database integration:
- Connection Management: Establishes connections to PostgreSQL
- Schema Transfer: Replicates PostgreSQL schema to DuckDB
- Data Migration: Copies data from PostgreSQL to DuckDB
- Connection String Encryption: Securely stores PostgreSQL credentials
Key Files:
handler.go
: Link request handlingservice.go
: Link implementation
Key Functions:
Link()
: Establishes connection and migrates schema/data
Handles data synchronization between PostgreSQL and DuckDB:
- Change Detection: Identifies changes in PostgreSQL
- Synchronization: Applies changes to DuckDB
- Audit Log: Processes audit logs for changes
Key Files:
handler.go
: Migrate request handlingservice.go
: Migration implementation
Key Functions:
Migrate()
: Synchronizes changes from PostgreSQL to DuckDB
Manages updates to database and user information:
- Database Name Updates: Renames databases
- User Information Updates: Updates usernames and passwords
- Validation: Verifies existence before updates
Key Files:
handler.go
: Update request handlingservice.go
: Update implementation
Key Functions:
UpdateDatabase()
: Renames a databaseUpdateUserUsername()
: Updates a user's usernameUpdateUserPassword()
: Updates a user's password
Provides utility functions used throughout the application:
- Password Hashing: Secures user passwords
- Encryption: Handles AES encryption for sensitive data
- Path Management: Manages file paths for databases
- String Handling: Provides string manipulation utilities
Key Files:
utils.go
: General utility functionscrypto.go
: Cryptographic functions
Key Functions:
Hash()
: Hashes passwordsEncrypt()/Decrypt()
: Encrypts/decrypts data with AESUserDbPath()
: Resolves database file paths
-
Connection Establishment
- Client connects to TCP server via
main.go
- Server spawns a goroutine for the connection via
HandleConnection()
- Client must authenticate via
login.Handler()
- Client connects to TCP server via
-
Command Processing
- After authentication,
Router()
inmain/router.go
processes requests - Commands are parsed and validated
- Requests are routed to appropriate module handlers
- Responses are sent back to client
- After authentication,
-
Database Operations
- Database connections are managed by the pool module
- Operations are checked against user permissions
- Transactions are handled with ACID guarantees
- Results are returned to client
-
PostgreSQL Integration Process
- Link operations copy schema and data from PostgreSQL
- Migrate operations synchronize changes from PostgreSQL
- Audit tables track changes for synchronization
This section outlines planned enhancements and improvements for the TCP-DuckDB project:
Task | Description | Priority |
---|---|---|
Client Library | Develop client libraries in multiple languages (Python, JavaScript, Java), go is in progress | High |
Connection Encryption | Implement TLS/SSL for secure client-server communication | High |
Change Data Capture | Swap the Audit table with CDC using Debezium and kafka | Medium |
Backup & Restore | Add automated backup and point-in-time recovery functionality | Medium |
Query Caching | Add intelligent query result caching | Low |
Web Admin Interface | Create a web-based administration interface | Low |
This project is licensed under the MIT License - see the LICENSE file for details.