ConfigFile

StevenRojasC edited this page May 14, 2025 · 15 revisions

VDMS Configuration File

VDMS uses a configuration file (written in JSON) that can be specified when starting the server by using the -cfg flag:

./vdms -cfg config-vdms.json

If no configuration file is specified, VDMS tries to open the default file (config-vdms.json) and fails to start if that file is not found.
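A launcher script can check for the file before invoking the server, so a missing config is reported up front. The following is a minimal sketch, assuming the `vdms` binary sits in the current directory; the helper name `build_vdms_command` is hypothetical and not part of VDMS. Only the `-cfg` flag and the default file name come from this page.

```python
from pathlib import Path

DEFAULT_CONFIG = "config-vdms.json"  # default file name stated on this page


def build_vdms_command(config_path=None, binary="./vdms"):
    """Return the argv list for starting VDMS, or raise if the config file is missing.

    Hypothetical helper for illustration; the binary location is an assumption.
    """
    path = Path(config_path or DEFAULT_CONFIG)
    if not path.is_file():
        # VDMS would fail to start in this case, so fail early with a clear message.
        raise FileNotFoundError(f"VDMS config file not found: {path}")
    if config_path is None:
        # No explicit path: let VDMS pick up its default file.
        return [binary]
    return [binary, "-cfg", str(path)]
```

The returned list can be passed to `subprocess.run(..., check=True)` to start the server.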

Parameters

All the parameters in the configuration file are optional, as VDMS has default values for all of them.

| Param | Explanation | Default |
| --- | --- | --- |
| autodelete_interval_s | Time interval (in seconds) to delete specific entries (should be greater than 0) | -1 |
| autoreplicate_interval | Time interval to back up the DB folder, in the unit given by "unit" (should be greater than 0) | -1 |
| aws_log_level | Verbosity of the AWS logging system. Used only when "storage_type" is set to "aws". Accepted values: "off", "fatal", "error", "warn", "info", "debug", "trace" | off |
| backup_flag | Boolean: whether to use the auto-replication thread | false |
| backup_path | Path to store the backup DB | db_root_path |
| blobs_path | Path to the folder where blobs will be stored | blobs (db/blobs) |
| bucket_name | Bucket name for AWS storage | vdms_bucket |
| db_root_path | Path to the root folder where all files/objects will be stored | db |
| descriptors_path | Path to the folder where descriptors will be stored | descriptors (db/descriptors) |
| endpoint_override | Server address (including scheme and port number) that overrides the S3 storage server address. Valid only when "storage_type" is set to "aws" and "use_endpoint" is set to true | http://127.0.0.1:9000 (when "use_endpoint" is true and "storage_type" is "aws") |
| expiration_time | Time interval (in seconds) to automatically delete entries | |
| flinng_cells_per_row | For FLINNG descriptor indexing: number of bits in the distance-sensitive LSH vector for each row | 1000 |
| flinng_hashes_per_table | For FLINNG descriptor indexing: number of hash functions used per table for group testing | 12 |
| flinng_num_hash_tables | For FLINNG descriptor indexing: number of hash tables (permutations) used for the dataset for group testing | 10 |
| flinng_num_rows | For FLINNG descriptor indexing: number of distance-sensitive LSH vectors | 3 |
| hnsw_efConstruction | For HNSW descriptor indexing: breadth of the search during the index construction phase | 96 |
| hnsw_efsearch | For HNSW descriptor indexing: breadth of the search during a search query | 64 |
| hnsw_M | For HNSW descriptor indexing: maximum number of neighbors each descriptor can have at each layer | 48 |
| images_path | Path to the folder where images (all formats) will be stored | images (db/images) |
| ivf_nlist | For IVF FLAT descriptor indexing: number of partitions to create using the k-means algorithm | 16 |
| k8s_container | Boolean: whether to use Kubernetes orchestration | false |
| max_simultaneous_clients | Maximum number of simultaneous open connections | 500 |
| pmgd_num_allocators | Number of allocators when creating a new PMGD graph (used only when creating a new graph; ignored if the graph already exists) | 1 |
| pmgd_path | Path to the folder where the PMGD graph will be stored | db |
| port | TCP port for incoming connections | 55555 |
| proxy_host | Address of the proxy (optional). Example: "a.proxy.from.intel.com" | |
| proxy_port | Port number of the proxy. Needed when "proxy_host" is set | |
| proxy_scheme | Scheme used by the proxy. Accepted values: "http" and "https". Needed when "proxy_host" and "proxy_port" are set | |
| query_handler | Query handler to use. Accepted values: pmgd (PMGD) or neo4j (Neo4j) | pmgd |
| neo4j_conn_pool_sz | Pool size of Neo4j client connections. Used only when "query_handler" is set to neo4j | 32 |
| replication_time | | -1 |
| storage_type | Database storage type. Accepted values: local (local storage) or aws (AWS S3) | local |
| tmp_path | Path to the temporary directory (optional) | /tmp/tmp |
| unit | Unit of the autoreplicate_interval value. Accepted values: h (hours), m (minutes), or s (seconds) | s |
| use_endpoint | Boolean: whether to use an AWS storage mocking server (MinIO). Valid only when "storage_type" is set to "aws" | false |
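The defaults and accepted values in the table above can be checked in external tooling before a config file is handed to the server. The sketch below is illustrative only: the function names are hypothetical, only a subset of the parameters is covered, and treating a non-positive `autoreplicate_interval` as "disabled" is an assumption based on the "should be greater than 0" note and the -1 default.

```python
# Defaults copied from the parameter table (subset only).
DEFAULTS = {
    "port": 55555,
    "max_simultaneous_clients": 500,
    "db_root_path": "db",
    "storage_type": "local",
    "query_handler": "pmgd",
    "unit": "s",
    "autoreplicate_interval": -1,
    "use_endpoint": False,
    "aws_log_level": "off",
}

# Accepted values as listed in the table.
ACCEPTED = {
    "storage_type": {"local", "aws"},
    "query_handler": {"pmgd", "neo4j"},
    "unit": {"h", "m", "s"},
    "proxy_scheme": {"http", "https"},
    "aws_log_level": {"off", "fatal", "error", "warn", "info", "debug", "trace"},
}

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def effective_config(user_config):
    """Overlay user settings on the documented defaults and check enumerated values."""
    merged = {**DEFAULTS, **user_config}
    for key, allowed in ACCEPTED.items():
        if key in merged and merged[key] not in allowed:
            raise ValueError(f"{key}={merged[key]!r} is not one of {sorted(allowed)}")
    return merged


def replication_interval_seconds(config):
    """Return the auto-replication interval in seconds, or None if it is not positive.

    Assumption: a non-positive interval means replication is not scheduled.
    """
    interval = config["autoreplicate_interval"]
    if interval <= 0:
        return None
    return interval * UNIT_SECONDS[config["unit"]]
```

For example, `effective_config({"autoreplicate_interval": 2, "unit": "h"})` keeps the default port and yields a replication interval of 7200 seconds.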

Config File Example

// VDMS Config File
// This is the run-time config file
// Sets database paths and other parameters
{
    "port": 55555,
    "cert_file": "cert.pem",
    "key_file": "key.pem",
    "ca_file": "ca.pem",
    "autoreplicate_interval": -1, // should be > 0
    "unit": "s",
    "max_simultaneous_clients": 100,
    // "backup_path": "backups_test", // set this to store the backup in a different path
    "db_root_path": "db",
    "backup_flag" : "false",
    "storage_type": "local", // local or aws
    "bucket_name": "vdms_bucket",
    "more-info": "github.com/IntelLabs/vdms",
    "use_endpoint": false, // storage_type is set to local
    "endpoint_override": "http://127.0.0.1:9000", // format: "scheme://ip:port"
    "k8s_container": false,
    "proxy_host": "a.proxy.from.intel.com",
    "proxy_port": 912,
    "proxy_scheme": "http", // valid values: [http|https]
    "aws_log_level": "debug", // [off|fatal|error|warn|info|debug|trace]
    "tmp_path": "/tmp/tmp"
}
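The example above contains `//` comments, which Python's standard `json` module rejects. A script that needs to read such a file can strip the comments first. The sketch below is one way to do that for external tooling; it says nothing about how VDMS parses the file, and the function names are hypothetical. Comments are removed only outside string literals, so values such as "http://127.0.0.1:9000" stay intact.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # character after a backslash is taken literally
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False        # closing quote
            i += 1
        elif ch == '"':
            in_string = True             # opening quote
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_commented_json(text):
    """Parse JSON text that may contain // line comments."""
    return json.loads(strip_line_comments(text))
```

Typical use is `load_commented_json(open("config-vdms.json").read())`, which returns a regular dictionary.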

Default Directories Structure

By default, VDMS will create a directory structure as follows:

db
├── blobs
├── descriptors
├── graph
│   ├── allocator.jdb
│   ├── edges.jdb
│   ├── graph.jdb
│   ├── indexmanager.jdb
│   ├── journal.jdb
│   ├── nodes.jdb
│   ├── stringtable.jdb
│   └── transaction.jdb
└── images
    ├── jpg
    ├── png
    └── tdb
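The top-level folders in this layout correspond to the `blobs_path`, `descriptors_path`, and `images_path` parameters. The sketch below shows how a tool might compute those locations from a config dictionary. It is a hypothetical helper, and it assumes the default subfolders follow a custom `db_root_path`, which this page only shows for the default root `db`.

```python
from pathlib import Path


def default_data_dirs(db_root_path="db"):
    """Default data folders under the root, as in the tree above (graph folder omitted)."""
    root = Path(db_root_path)
    return {
        "blobs_path": root / "blobs",
        "descriptors_path": root / "descriptors",
        "images_path": root / "images",
    }


def resolve_data_dirs(config):
    """Return the data folders for a config dict; explicit *_path settings win."""
    dirs = default_data_dirs(config.get("db_root_path", "db"))
    for key in dirs:
        if key in config:
            dirs[key] = Path(config[key])
    return dirs
```

With an empty config this resolves `images_path` to `db/images`; setting `blobs_path` explicitly overrides only that entry.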