An LLM-powered tool that transforms everyday language into robust information extraction pipelines.
| Features | Support |
|---|---|
| LLM Agent for prompt writing | ✅ Web App, Interactive chat |
| Named Entity Recognition (NER) | ✅ Customizable granularity (e.g., sentence-level, document-level) |
| Entity Attributes Extraction | ✅ Flexible formats |
| Relation Extraction (RE) | ✅ Binary & Multiclass relations |
| Visualization | ✅ Web App, Built-in entity & relation visualization |
- v1.0.0 (May 15, 2025):
  - 📐 The User Guide has moved to the Documentation Page.
  - The Web Application provides drag-and-drop access to LLM-IE.
  - Refactored `FrameExtractor` by separating chunking methods (e.g., sentence) from prompting methods (e.g., review). Chunking is now defined in `UnitChunker` and `ContextChunker`, while `FrameExtractor` defines the prompting method.
  - Documentation website. The User Guide and API reference are now available on the Documentation Page.
  - Optimized concurrent/batch processing. We adopt a semaphore to better utilize computational resources.
- Overview
- Prerequisite
- Installation
- Quick Start
- Web Application
- Examples
- User Guide
- Benchmarks
- Citation
LLM-IE is a toolkit that provides robust information extraction utilities for named entity, entity attributes, and entity relation extraction. The flowchart below demonstrates the workflow starting from a casual language request to output visualization.
At least one LLM inference engine is required. There is built-in support for 🚅 LiteLLM, 🦙 Llama-cpp-python, Ollama, 🤗 Huggingface_hub, OpenAI API, and vLLM. For installation guides, please refer to those projects. Other inference engines can be configured through the `InferenceEngine` abstract class. See the LLM Inference Engine section below.
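To illustrate the pattern of plugging in a custom engine, here is a minimal sketch. The class and method names below are stand-ins, not llm-ie's actual interface; consult the API reference for the real `InferenceEngine` signature.

```python
from abc import ABC, abstractmethod

# Stand-in for llm_ie's InferenceEngine ABC. The real interface
# (method names, signatures) is defined in the llm-ie API reference.
class InferenceEngine(ABC):
    @abstractmethod
    def chat(self, messages: list) -> str:
        """Send chat messages and return the model's reply text."""

class EchoEngine(InferenceEngine):
    """Toy engine that echoes the last user message (illustration only)."""
    def chat(self, messages: list) -> str:
        return messages[-1]["content"]

engine = EchoEngine()
print(engine.chat([{"role": "user", "content": "hello"}]))  # → hello
```

A real subclass would wrap your inference client's API call inside `chat` and return the generated text.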
The Python package is available on PyPI.
```shell
pip install llm-ie
```
Note that this package does not check for LLM inference engine installation, nor does it install them. See the Prerequisite section for details.
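Since the package does not verify inference engine installation, a quick way to see which optional backends are present in your environment is `importlib`. The package names below are assumptions; check each project's install docs for the exact name.

```python
import importlib.util

# Candidate backend package names (assumed; verify against each project's docs).
for pkg in ["litellm", "openai", "ollama", "huggingface_hub", "llama_cpp"]:
    status = "installed" if importlib.util.find_spec(pkg) else "missing"
    print(f"{pkg}: {status}")
```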
We use a medical note synthesized by ChatGPT to demo the information extraction process. Our task is to extract diagnosis names, spans, and corresponding attributes (i.e., diagnosis datetime and status).
Choose one of the built-in engines below.
🚅 LiteLLM
```python
from llm_ie.engines import LiteLLMInferenceEngine

inference_engine = LiteLLMInferenceEngine(model="openai/Llama-3.3-70B-Instruct",
                                          base_url="http://localhost:8000/v1",
                                          api_key="EMPTY")
```
OpenAI API & Compatible Services
Follow the Best Practices for API Key Safety to set up your API key.
```python
from llm_ie.engines import OpenAIInferenceEngine

inference_engine = OpenAIInferenceEngine(model="gpt-4o-mini")
```
For OpenAI-compatible services (OpenRouter, for example):

```python
from llm_ie.engines import OpenAIInferenceEngine

inference_engine = OpenAIInferenceEngine(base_url="https://openrouter.ai/api/v1",
                                         model="meta-llama/llama-4-scout")
```
Azure OpenAI API
Follow the Azure AI Services Quickstart to set up Endpoint and API key.
```python
from llm_ie.engines import AzureOpenAIInferenceEngine

inference_engine = AzureOpenAIInferenceEngine(model="gpt-4o-mini",
                                              api_version="<your api version>")
```
🤗 Huggingface_hub
```python
from llm_ie.engines import HuggingFaceHubInferenceEngine

inference_engine = HuggingFaceHubInferenceEngine(model="meta-llama/Meta-Llama-3-8B-Instruct")
```
Ollama
```python
from llm_ie.engines import OllamaInferenceEngine

inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
```
vLLM
The vLLM support follows the OpenAI-Compatible Server. For more parameters, please refer to the vLLM documentation.
Start the server:

```shell
vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct
```
Define the inference engine:

```python
from llm_ie.engines import OpenAIInferenceEngine

inference_engine = OpenAIInferenceEngine(base_url="http://localhost:8000/v1",
                                         api_key="EMPTY",
                                         model="meta-llama/Meta-Llama-3.1-8B-Instruct")
```
🦙 Llama-cpp-python
```python
from llm_ie.engines import LlamaCppInferenceEngine

inference_engine = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
                                           gguf_filename="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf")
```
In this quick start demo, we use OpenRouter to run Llama-4-Scout. The outputs may differ slightly with other inference engines, LLMs, or quantizations.
We start by defining the prompt editor LLM agent. We store the OpenRouter API key in the environment variable `OPENROUTER_API_KEY`.

```shell
export OPENROUTER_API_KEY=<OpenRouter API key>
```
```python
import os

from llm_ie import OpenAIInferenceEngine, DirectFrameExtractor, PromptEditor, SentenceUnitChunker, SlideWindowContextChunker

# Define an LLM inference engine
llm = OpenAIInferenceEngine(base_url="https://openrouter.ai/api/v1",
                            model="meta-llama/llama-4-scout",
                            api_key=os.getenv("OPENROUTER_API_KEY"))

# Define the LLM prompt editor
editor = PromptEditor(llm, DirectFrameExtractor)

# Start chat
editor.chat()
```
This opens an interactive session:
The agent drafts a prompt template following the schema required by the `DirectFrameExtractor`.
After a few rounds of chatting, we have a prompt template to start with:
```text
### Task description
The paragraph below contains a clinical note with diagnoses listed. Please carefully review it and extract the diagnoses, including the diagnosis date and status.

### Schema definition
Your output should contain:
    "entity_text" which is the diagnosis spelled as it appears in the text,
    "Date" which is the date when the diagnosis was made,
    "Status" which is the current status of the diagnosis (e.g. active, resolved, etc.)

### Output format definition
Your output should follow JSON format, for example:
[
    {"entity_text": "<Diagnosis>", "attr": {"Date": "<date in YYYY-MM-DD format>", "Status": "<status>"}},
    {"entity_text": "<Diagnosis>", "attr": {"Date": "<date in YYYY-MM-DD format>", "Status": "<status>"}}
]

### Additional hints
- Your output should be 100% based on the provided content. DO NOT output fake information.
- If there is no specific date or status, just omit those keys.

### Context
The text below is from the clinical note:
"{{input}}"
```
Instead of prompting LLMs with the entire document (which, by our experiments, has worse performance), we divide the input document into units (e.g., sentences, text lines, paragraphs). The LLM focuses on one unit at a time before moving on to the next. This is achieved by the `UnitChunker` classes. In this demo, we use `SentenceUnitChunker` for sentence-by-sentence prompting. Though the LLM only focuses on one sentence at a time, we supply a context, in this case, a sliding window of 2 sentences. This provides the LLM with additional information. This is achieved by the `SlideWindowContextChunker` class.
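The unit-plus-context idea can be sketched as follows. This is a rough illustration with a naive sentence splitter, not the library's implementation:

```python
import re

def sentence_units(text: str) -> list:
    # Naive sentence splitter for illustration; llm-ie uses its own chunkers.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def with_context(sentences: list, window_size: int = 2) -> list:
    # Pair each unit with a sliding window of neighboring sentences as context.
    pairs = []
    for i, unit in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        pairs.append((unit, " ".join(sentences[lo:hi])))
    return pairs

sents = sentence_units("First sentence. Second sentence. Third sentence.")
for unit, ctx in with_context(sents, window_size=1):
    print(unit, "||", ctx)
```

Each prompt then carries one unit to extract from, plus its window as supporting context.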
```python
# Load synthesized medical note
with open("./demo/document/synthesized_note.txt", 'r') as f:
    note_text = f.read()

# Define unit chunker. Prompts sentence-by-sentence.
unit_chunker = SentenceUnitChunker()

# Define context chunker. Provides context for units.
context_chunker = SlideWindowContextChunker(window_size=2)

# Define extractor (prompt_template is the template drafted above)
extractor = DirectFrameExtractor(llm,
                                 unit_chunker=unit_chunker,
                                 context_chunker=context_chunker,
                                 prompt_template=prompt_template)
```
To run the frame extraction, use the `extract_frames` method. A list of entities with attributes ("frames") will be returned. Concurrent processing is supported by setting `concurrent=True`.
```python
# To stream the extraction process, use concurrent=False, stream=True:
frames = extractor.extract_frames(note_text, concurrent=False, verbose=True)

# For faster extraction, use concurrent=True to enable asynchronous prompting
# frames = extractor.extract_frames(note_text, concurrent=True)

# Check extractions
for frame in frames:
    print(frame.to_dict())
```
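Per the v1.0.0 notes, concurrent mode uses a semaphore to bound how many units are in flight at once. The pattern can be sketched with asyncio; the function names here are illustrative, not llm-ie's internals:

```python
import asyncio

async def prompt_unit(unit: str, sem: asyncio.Semaphore) -> str:
    # The semaphore caps the number of concurrent LLM calls.
    async with sem:
        await asyncio.sleep(0.01)  # stand-in for the actual LLM request
        return f"frames for: {unit}"

async def extract_all(units: list, max_concurrent: int = 4) -> list:
    sem = asyncio.Semaphore(max_concurrent)
    # gather preserves input order even though calls overlap.
    return await asyncio.gather(*(prompt_unit(u, sem) for u in units))

results = asyncio.run(extract_all(["sent 1", "sent 2", "sent 3"]))
print(results)
```

Bounding concurrency this way keeps throughput high without flooding the inference server.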
The output is a list of frames. Each frame has an `entity_text`, `start`, `end`, and a dictionary of `attr`.
```python
{'frame_id': '0', 'start': 537, 'end': 549, 'entity_text': 'hypertension', 'attr': {'Status': ''}}
{'frame_id': '1', 'start': 551, 'end': 565, 'entity_text': 'hyperlipidemia', 'attr': {'Status': ''}}
{'frame_id': '2', 'start': 571, 'end': 595, 'entity_text': 'Type 2 diabetes mellitus', 'attr': {'Status': ''}}
{'frame_id': '3', 'start': 991, 'end': 1003, 'entity_text': 'Hypertension', 'attr': {'Date': '2010', 'Status': None}}
{'frame_id': '4', 'start': 1026, 'end': 1040, 'entity_text': 'Hyperlipidemia', 'attr': {'Date': '2015', 'Status': None}}
{'frame_id': '5', 'start': 1063, 'end': 1087, 'entity_text': 'Type 2 Diabetes Mellitus', 'attr': {'Date': '2018', 'Status': None}}
{'frame_id': '6', 'start': 1646, 'end': 1682, 'entity_text': 'Jugular venous pressure not elevated', 'attr': {}}
{'frame_id': '7', 'start': 1703, 'end': 1767, 'entity_text': 'Clear to auscultation bilaterally, no wheezes, rales, or rhonchi', 'attr': {}}
{'frame_id': '8', 'start': 1802, 'end': 1823, 'entity_text': 'no hepatosplenomegaly', 'attr': {}}
{'frame_id': '9', 'start': 1926, 'end': 1962, 'entity_text': 'ST-segment depression in leads V4-V6', 'attr': {}}
{'frame_id': '10', 'start': 1982, 'end': 2004, 'entity_text': 'Elevated at 0.15 ng/mL', 'attr': {'Date': '', 'Status': ''}}
{'frame_id': '11', 'start': 2046, 'end': 2066, 'entity_text': 'No acute infiltrates', 'attr': {}}
{'frame_id': '12', 'start': 2068, 'end': 2093, 'entity_text': 'normal cardiac silhouette', 'attr': {}}
{'frame_id': '13', 'start': 2117, 'end': 2150, 'entity_text': 'Mild left ventricular hypertrophy', 'attr': {'Date': '', 'Status': ''}}
{'frame_id': '14', 'start': 2321, 'end': 2338, 'entity_text': 'Glucose 180 mg/dL', 'attr': {}}
{'frame_id': '15', 'start': 2340, 'end': 2350, 'entity_text': 'HbA1c 7.8%', 'attr': {}}
{'frame_id': '16', 'start': 2402, 'end': 2431, 'entity_text': 'acute coronary syndrome (ACS)', 'attr': {'Date': None, 'Status': None}}
{'frame_id': '17', 'start': 3025, 'end': 3033, 'entity_text': 'Diabetes', 'attr': {}}
{'frame_id': '18', 'start': 3925, 'end': 3935, 'entity_text': 'chest pain', 'attr': {'Date': '', 'Status': ''}}
```
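Since each frame serializes to a plain dictionary, downstream filtering is ordinary Python. For example, keeping only frames whose attributes carry a non-empty `Date` (using two of the rows above):

```python
# Two frames copied from the output above.
frames = [
    {'frame_id': '3', 'start': 991, 'end': 1003, 'entity_text': 'Hypertension',
     'attr': {'Date': '2010', 'Status': None}},
    {'frame_id': '6', 'start': 1646, 'end': 1682,
     'entity_text': 'Jugular venous pressure not elevated', 'attr': {}},
]

# Keep frames whose attributes include a non-empty Date.
dated = [f for f in frames if f['attr'].get('Date')]
print([f['entity_text'] for f in dated])  # → ['Hypertension']
```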
We can save the frames to a document object for better management. The document holds `text` and `frames`. The `add_frame()` method performs validation and (if it passes) adds a frame to the document. The `valid_mode` parameter controls how frame validation is performed. For example, `valid_mode="span"` prevents a new frame from being added if its span (`start`, `end`) already exists. Setting `create_id=True` allows the document to assign unique frame IDs.
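The span check can be pictured as follows. This is a simplified stand-in for illustration, not llm-ie's implementation:

```python
def add_frame_span_checked(doc_frames: list, frame: dict) -> bool:
    # Mimics valid_mode="span": reject a frame whose (start, end) already exists.
    taken = {(f["start"], f["end"]) for f in doc_frames}
    if (frame["start"], frame["end"]) in taken:
        return False
    doc_frames.append(frame)
    return True

frames = []
print(add_frame_span_checked(frames, {"start": 0, "end": 5, "entity_text": "x"}))  # → True
print(add_frame_span_checked(frames, {"start": 0, "end": 5, "entity_text": "y"}))  # → False
```

Duplicate spans are thus silently skipped rather than stored twice.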
```python
from llm_ie.data_types import LLMInformationExtractionDocument

# Define document
doc = LLMInformationExtractionDocument(doc_id="Synthesized medical note",
                                       text=note_text)

# Add frames to the document
doc.add_frames(frames, create_id=True)

# Save document to file (.llmie)
doc.save("<your filename>.llmie")
```
To visualize the extracted frames, we use the `viz_serve()` method.

```python
doc.viz_serve()
```
A Flask app starts on port 5000 by default.
A drag-and-drop web application provides no-code access to LLM-IE.
The image is available on 🐳 Docker Hub. Use the commands below to pull and run it locally:

```shell
docker pull daviden1013/llm-ie-web-app:latest
docker run -p 5000:5000 daviden1013/llm-ie-web-app:latest
```
Interface for chatting with Prompt Editor LLM agent.
Stream frame extraction and download outputs.
- Interactive chat with LLM prompt editors
- Write prompt templates with LLM prompt editors
- NER + RE for Drug, Strength, Frequency
The detailed User Guide is available on our Documentation Page.
We benchmarked the frame and relation extractors on biomedical information extraction tasks. The results and experiment code are available on this page.
For more information and benchmarks, please check our paper:
```bibtex
@article{hsu2025llm,
  title={LLM-IE: a python package for biomedical generative information extraction with large language models},
  author={Hsu, Enshuo and Roberts, Kirk},
  journal={JAMIA Open},
  volume={8},
  number={2},
  pages={ooaf012},
  year={2025},
  publisher={Oxford University Press}
}
```