PuffinDB is turning DuckDB into a next-generation vector database with the following features:
- Integration with any client application that embeds DuckDB, through a clientless architecture
- Storage of very large vectors on lakehouses such as Iceberg, Delta Lake, and Hudi
- Support for the Lance file format
- GPU acceleration
- Custom SQL functions for vector processing
- Data pipeline automation with support for PRQL

PuffinDB will support most NVIDIA GPU accelerators and will add specific optimizations for the Grace Hopper Superchip.

- Full SQL support — complex queries and joins with non-vectorized datasets
- Clientless architecture — direct support from any client embedding the DuckDB engine
- Serverless architecture — greater scalability, lower costs
- Data pipeline engine — automation of AI/ML operations
- Connector framework — integration with hundreds of databases and applications
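
To make "custom SQL functions for vector processing" and "joins with non-vectorized datasets" concrete, here is a minimal sketch of the general idea. It is not PuffinDB's or DuckDB's API: it uses Python's built-in `sqlite3` module as a stand-in engine, and the function name `cosine_similarity`, the tables `embeddings` and `documents`, and the helper `top_matches` are all hypothetical names chosen for illustration. Vectors are stored as JSON text only to keep the example dependency-free.

```python
import json
import math
import sqlite3


def cosine_similarity(a_json: str, b_json: str) -> float:
    """Cosine similarity of two vectors passed as JSON-encoded lists."""
    a = json.loads(a_json)
    b = json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_matches(query, limit=2):
    """Rank documents by similarity to `query` using plain SQL.

    Stand-in engine: sqlite3 (an assumption for this sketch, not PuffinDB).
    """
    con = sqlite3.connect(":memory:")
    # Register the Python function so SQL queries can call it by name.
    con.create_function("cosine_similarity", 2, cosine_similarity)

    # One table holds vectors, the other holds ordinary (non-vector) data.
    con.execute("CREATE TABLE embeddings (doc_id INTEGER, vec TEXT)")
    con.execute("CREATE TABLE documents (doc_id INTEGER, title TEXT)")
    con.executemany(
        "INSERT INTO embeddings VALUES (?, ?)",
        [
            (1, json.dumps([1.0, 0.0])),
            (2, json.dumps([0.0, 1.0])),
            (3, json.dumps([1.0, 1.0])),
        ],
    )
    con.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [(1, "alpha"), (2, "beta"), (3, "gamma")],
    )

    # The vector function is used like any other SQL expression,
    # so it composes with joins, ordering, and limits.
    rows = con.execute(
        """
        SELECT d.title, cosine_similarity(e.vec, ?) AS score
        FROM embeddings AS e
        JOIN documents AS d USING (doc_id)
        ORDER BY score DESC
        LIMIT ?
        """,
        (json.dumps(query), limit),
    ).fetchall()
    con.close()
    return rows
```

Calling `top_matches([1.0, 0.0])` returns the two closest documents with their scores. The point of the sketch is only that a vector operation exposed as a SQL function can be combined with regular relational queries; a real engine would use native vector types and accelerated kernels instead of JSON text and a Python callback.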