This is a Kotlin implementation of datrics-ai/text2sql.
The project is also part of an industrial-educational cooperation between DTOL and ybigta.
datrics-ai/text2sql uses a RAG and LLM approach to text-to-SQL.
This means we first insert data such as table schemas, question-SQL pairs, and domain knowledge into a vector DB (ingest), then query the LLM, passing it the retrieved data (infer).
For more detail, see HOW-IT-WORKS.
- requirements (based on my desktop environment)
  - maven (3.9.9)
  - kotlin (2.1.10)
  - jvm (1.8 ~ 23)
If you haven't installed them yet, I recommend installing them with sdkman.
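As a quick way to see which of the tools above are still missing, a small check like the following may help. It is only a sketch: `check_tool` is a hypothetical helper, not part of this repository, and the `sdk install` hints assume sdkman is already set up and that the listed versions are still available there.

```shell
# Print whether a tool is on PATH; if it is not, suggest an sdkman command.
# $1 = executable name, $2 = sdkman candidate (optionally followed by a version)
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1 (try: sdk install $2)"
  fi
}

check_tool java "java"
check_tool mvn "maven 3.9.9"
check_tool kotlinc "kotlin 2.1.10"
```

The versions in the hints are the ones from the requirements list; other versions inside the stated JVM range should work as well.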
The commands below create the executable jar ingest-cli-<version>.jar.
git clone https://github.com/jsybf/text2sql.git
cd text2sql
mvn clean package -pl ingest-cli -am
- requirements (based on my desktop environment)
  - maven (3.9.9)
  - kotlin (2.1.10)
  - jvm (1.8 ~ 23)
The commands below build the infer-server module.
git clone https://github.com/jsybf/text2sql.git
cd text2sql
mvn clean package -pl infer-server -am
Building the infer server with Docker is another option.
git clone https://github.com/jsybf/text2sql.git
cd text2sql
# the image name passed to -t can be anything you like
docker build \
--build-arg config_path=<path of infer_config.yaml> \
-f infer-server/Dockerfile \
-t ybigta-dtol/infer-server \
.
docker run -it --rm -p 8080:8080 ybigta-dtol/infer-server
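Request example (the Korean question asks: "Among the people in charge of improvement measures, find the name of the person responsible for the most measures"):
curl -X POST \
-H "Content-Type: text/plain" \
-d "개선대책담당자중 가장 많은 개선대책을 담당한 사람의 이름을 찾아줘" \
http://localhost:8080/infer
- requirements
  - jvm (1.8 ~ 23)
  - config yaml file
  - postgres with the pgvector extension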
First, prepare a config file; a config file template is provided.
Leaving the systemPrompt fields unmodified is recommended.
Read the tables from the database and generate a JSON description:
java -jar ingest-cli-<version>.jar gene-desc \
--config <CONFIG_FILE_PATH> \
--jdbc <JDBC_URL> \
--user <DB_USER> \
--password <DB_PW>
Ingest the table schema (markdown) into the vector DB:
java -jar ingest-cli-<version>.jar ingest-schema --config <CONFIG_FILE_PATH>
Ingest the QA JSON file (question: natural language, answer: SQL) into the vector DB:
java -jar ingest-cli-<version>.jar ingest-qa --config <CONFIG_FILE_PATH>
Ingest the domain mapping derived from the QA data in the vector DB.
This requires ingest-qa and ingest-schema to have been run first:
java -jar ingest-cli-<version>.jar ingest-domain-mapping --config <CONFIG_FILE_PATH>
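Because ingest-domain-mapping depends on the two earlier steps, it can be convenient to run the three config-only ingest commands from one small wrapper. The following is a minimal sketch under assumptions: `ingest_order` and `run_ingest` are hypothetical helper names, `JAR` and `CONFIG` are environment variables introduced here for illustration, and `gene-desc` is left out because it needs the JDBC arguments as well.

```shell
# Subcommands in the order they should run;
# ingest-domain-mapping is last because it requires the other two.
ingest_order() {
  printf '%s\n' ingest-schema ingest-qa ingest-domain-mapping
}

# Run each ingest step in order and stop at the first failure.
# JAR and CONFIG default to the placeholders used in this README.
# With DRY_RUN=1 the commands are only printed, not executed.
run_ingest() {
  jar="${JAR:-ingest-cli-<version>.jar}"
  config="${CONFIG:-<CONFIG_FILE_PATH>}"
  for step in $(ingest_order); do
    echo "java -jar $jar $step --config $config"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      java -jar "$jar" "$step" --config "$config" || return 1
    fi
  done
}
```

For example, `DRY_RUN=1 run_ingest` prints the three commands without running them, which is a cheap way to confirm the jar and config paths before touching the vector DB.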