
complete addon_params with all prompt templates #1401


Open
drahnreb wants to merge 9 commits into main from drahnreb/complete-addon-params

Conversation

drahnreb
Contributor

@drahnreb drahnreb commented Apr 17, 2025

Description

Today you can only control very broad settings: the language and the entity types.

This PR exposes all prompt template keys by adding them to addon_params, which allows easy customization of the extraction prompts.
This gives more fine-grained control over prompts: for example, you could instantiate different objects with specific addon_params for certain types of text, each with more suitable, domain-relevant few-shot examples (see the sketch below).

It also opens the way to impose more structure, e.g. via an ontology or (causal) relations.
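
For illustration, here is a minimal sketch of the kind of customization this enables. Treat it as an assumption-laden sketch, not the final interface: the import paths depend on the LightRAG version, and the addon_params key names are assumed to mirror the PROMPTS keys listed under "Changes Made" (minus the DEFAULT_ prefix).

from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete  # import path may differ by version

# Two instances tuned for different text types; only the prompt-related
# addon_params differ. Key names assume the "PROMPTS key minus DEFAULT_"
# convention described under "Changes Made".
patent_rag = LightRAG(
    working_dir="./rag_patents",
    llm_model_func=gpt_4o_mini_complete,
    addon_params={
        "language": "English",
        "entity_types": ["invention", "claim", "assignee", "filing_date"],
        "entity_extraction_examples": ["<few-shot example written for patent text>"],
    },
)

news_rag = LightRAG(
    working_dir="./rag_news",
    llm_model_func=gpt_4o_mini_complete,
    addon_params={
        "language": "English",
        "entity_types": ["organization", "person", "location", "event"],
        "entity_extraction_examples": ["<few-shot example written for news articles>"],
    },
)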

Related Issues

None

Changes Made

Added the following keys (exposed without the DEFAULT_ prefix):

PROMPTS["DEFAULT_TUPLE_DELIMITER"]
PROMPTS["DEFAULT_RECORD_DELIMITER"]
PROMPTS["DEFAULT_COMPLETION_DELIMITER"]

PROMPTS["summarize_entity_descriptions"]
PROMPTS["entity_extraction_examples"]
PROMPTS["entity_extraction"]
PROMPTS["entity_continue_extraction"]
PROMPTS["entity_if_loop_extraction"]
PROMPTS["keywords_extraction_examples"]
PROMPTS["keywords_extraction"]

PROMPTS["mix_rag_response"]
PROMPTS["naive_rag_response"]

PROMPTS["similarity_check"]

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

@danielaskdd Please review. I kept it simple and just added the keys to addon_params, but this could also be grouped into prompts.

@drahnreb drahnreb changed the title complete addon params complete addon_params with all prompt templates Apr 17, 2025
@drahnreb drahnreb force-pushed the drahnreb/complete-addon-params branch from f96f050 to 61b6b19 Compare April 19, 2025 10:26
@drahnreb
Contributor Author

@danielaskdd ready to merge if you want.

Collaborator

@danielaskdd danielaskdd left a comment


Allowing users to configure prompts is a great idea. In terms of implementation, we hope to make it more thorough and convenient. Issue #1353 proposed a potentially better approach: writing multiple sets of different prompts in the prompt directory, enabling users to freely choose which prompt to use for document indexing or queries. Could you propose an interface design under this concept that would make it even more user-friendly?

@drahnreb
Contributor Author

drahnreb commented Apr 20, 2025

This is the intention.
Once all prompt templates are exposed with this PR, you could do this:

Directory structure:

my_docs/
 ├── books/
 │   ├── book1.txt
 │   └── book2.txt
 └── articles/
     ├── article1.txt
     ├── article2.txt
     └── insert_template_prompts.json
my_queries/
 └── articles/
     └── query_template_prompts.json

import asyncio
import json
from typing import Optional

from lightrag import LightRAG, QueryParam
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.llm.openai import gpt_4o_mini_complete  # import path may differ by version

WORKING_DIR = "./rag_storage"

async def initialize_rag(addon_params: Optional[dict] = None):
    rag_kwargs = {
        "working_dir": WORKING_DIR,
        "llm_model_func": gpt_4o_mini_complete,
    }
    # Only add addon_params to kwargs if it's provided by the caller,
    # otherwise it would override default_factory (should be fine still, default language is pulled from PROMPTS)
    if addon_params is not None:
        rag_kwargs["addon_params"] = addon_params

    rag = LightRAG(**rag_kwargs)

    await rag.initialize_storages()
    await initialize_pipeline_status()

    return rag

# Create the file-based example prompt templates
with open('./my_docs/articles/insert_template_prompts.json', 'w') as f:
    json.dump({
        "entity_extraction_examples": ["device", "make", "model", "publication", "date"]
    }, f)
with open('./my_queries/articles/query_template_prompts.json', 'w') as f:
    json.dump({
        "rag_response": "System prompt specific to articles..."
    }, f)

docs = {
    "books": {
        "file_paths": ["./books/book1.txt", "./books/book2.txt"],
        "addon_params": {
            "entity_extraction_examples": ["organization", "person", "location"],
        },
        "system_prompts": {
            "rag_response": "KG mode system prompt specific to books...",
            "naive_rag_response": "Naive mode system prompt specific to books...",
            "mix_rag_response": "Mix mode system prompt specific to books...",
        },
    },
    "articles": {
        "file_paths": ["./articles/article1.txt", "./articles/article2.txt"],
        "addon_params": json.load(open('./my_docs/articles/insert_template_prompts.json', 'r')),
        "system_prompts": json.load(open('./my_queries/articles/query_template_prompts.json', 'r')),
    },
}

def get_content(file_paths):
    contents = []
    for fp in file_paths:
        with open(fp, "r", encoding="utf-8") as f:
            contents.append(f.read())
    return contents

# Insert differently per doc type
for doc_type, doc_info in docs.items():
    file_paths = doc_info["file_paths"]
    addon_params = doc_info["addon_params"]

    # Initialize the RAG instance for each document type
    print(f"Initializing RAG for {doc_type}")
    rag = asyncio.run(initialize_rag(addon_params))

    contents = get_content(file_paths)
    rag.insert(contents, file_paths=file_paths)

# Perform a hybrid search for queries specific to the `books` type
print(
    rag.query(
        "What are the top themes in this story?",
        param=QueryParam(mode="hybrid"),
        system_prompt=docs["books"][
            "system_prompts"
        ]["rag_response"],  # Use the hybrid mode specific system prompt for books type data
    )
)

Of course, you could write convenience functions for template handling, for template checks (e.g. whether the required placeholders are present), or for the correct query-template associations (e.g. for local, global, and hybrid you could specify rag_response, while mix would use mix_rag_response and naive would use naive_rag_response, keeping it aligned with the current prompts.py, and pass any of them to system_prompt as illustrated in the last example)...

We could open a new PR for the handling and for checks that warn users when placeholders are missing. Since no prompt would hard-fail for now, this could serve as an illustrative example in the meantime. We could also add the information to the README together with an example.
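
To make the checks idea concrete, here is a hedged sketch of such a helper; the function name and the required-placeholder map are hypothetical and would need to be aligned with the actual templates in prompts.py.

import re
import warnings

# Hypothetical map from prompt key to the format placeholders its template
# is expected to contain; align this with the real templates in prompts.py.
REQUIRED_PLACEHOLDERS = {
    "entity_extraction": {"entity_types", "input_text", "language"},
    "keywords_extraction": {"query"},
    "similarity_check": {"original_prompt", "cached_prompt"},
}

def check_prompt_placeholders(key: str, template: str) -> bool:
    """Warn (rather than fail) when a custom template is missing placeholders."""
    expected = REQUIRED_PLACEHOLDERS.get(key, set())
    found = set(re.findall(r"\{(\w+)\}", template))
    missing = expected - found
    if missing:
        warnings.warn(
            f"Custom prompt '{key}' is missing placeholders: {sorted(missing)}"
        )
    return not missing

A helper like this could run whenever addon_params overrides one of the template keys, keeping the warn-rather-than-fail behavior described above.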

@drahnreb drahnreb force-pushed the drahnreb/complete-addon-params branch from 61b6b19 to 885b480 Compare April 20, 2025 21:07
Move *_responses from addon_params to query_param if not given as system_prompt; add an optional system_prompt arg to query_with_keywords to customize context building and the final response.
@drahnreb drahnreb requested a review from danielaskdd April 21, 2025 01:11
@drahnreb
Contributor Author

drahnreb commented Apr 21, 2025

  • cleaned up and separated query from insert prompts d71ceb9
  • added checks to prevent possible problems when customizing critical prompt templates 3d7b1df
  • added exhaustive examples to illustrate usage 01aee34

This should address the core items. @danielaskdd PTAL when convenient.

@drahnreb
Contributor Author

Just in case I missed it: @danielaskdd, do you still need anything to approve this PR?
