This project presents the implementation of an AI assistant built in Python with the LangChain framework. The main objective is to demonstrate the ability of an intelligent agent to make autonomous decisions about when to use its internal knowledge versus when to trigger an external tool. To achieve this, the assistant is equipped with a calculator tool and utilizes a ReAct (Reasoning and Acting) agent to analyze the user's query and determine the best approach: either answering general knowledge questions directly or invoking the calculator to accurately solve arithmetic operations.
- Language: Python 3.x
- Core Framework: LangChain
- Language Model (LLM): OpenAI GPT-3.5-Turbo
- Dependency Management: pip & `requirements.txt`
- API Key Management: python-dotenv
1. Clone the Repository:

   ```bash
   git clone https://github.com/PryskaS/artefact-ai-assistant-challenge.git
   cd artefact-ai-assistant-challenge
   ```

2. Set up a Virtual Environment (Recommended):

   ```bash
   # On Windows
   python -m venv venv
   .\venv\Scripts\activate

   # On macOS/Linux
   python3 -m venv venv
   source venv/bin/activate
   ```
3. Install Dependencies:

   ```bash
   pip install -r requirements.txt
   ```
4. Configure the API Key:

   - Create a file named `.env` in the root directory of the project.
   - Inside this file, add your OpenAI API key:

     ```
     OPENAI_API_KEY="sk-..."
     ```
5. Run the Assistant:

   ```bash
   python main.py
   ```
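For reference, the API key is typically picked up at startup via python-dotenv. A minimal sketch of that pattern is shown below; the repository's actual `main.py` may load it differently.

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into the process environment
openai_api_key = os.getenv("OPENAI_API_KEY")
```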
The agent's logic is built upon three core components: the agent architecture, the tool definition, and the decision-making mechanism.
The system leverages a ReAct (Reasoning and Acting) agent. This architecture was chosen for its transparent, step-by-step reasoning process, which is ideal for tasks involving tool use.
- Core Loop: The agent operates on a "Thought → Action → Observation" cycle.
- Traceability: The `verbose=True` flag exposes this entire reasoning chain in the terminal, making the agent's behavior easy to debug and understand. The agent effectively "shows its work" before producing a final answer.
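The repository's exact wiring lives in `main.py`, but a minimal sketch of this setup using LangChain's classic `initialize_agent` API might look like the following (`calculator_tool` refers to the tool defined in the next section):

```python
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

agent = initialize_agent(
    tools=[calculator_tool],                      # see the tool definition below
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # ReAct-style prompting
    verbose=True,                                 # print Thought → Action → Observation steps
)
```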
A single Calculator tool was implemented to handle arithmetic operations. Key implementation details include:
- Secure Evaluation: The tool uses Python's `eval()` function to compute mathematical results. To mitigate security risks, `eval()` is executed in a sandboxed environment with `__builtins__` disabled, preventing the execution of arbitrary code.
- Error Handling: The function is wrapped in a `try...except` block. This ensures that invalid mathematical expressions or operations (like division by zero) do not crash the agent. Instead, a descriptive error string is returned, which the agent receives as an "Observation" to inform its next step.
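A minimal sketch of this pattern is shown below; the function name and error message format are illustrative rather than the repository's exact code:

```python
def calculator(expression: str) -> str:
    """Evaluate a simple arithmetic expression and return the result as a string."""
    try:
        # An empty __builtins__ dict blocks access to dangerous names
        # such as open() or __import__ during evaluation.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as exc:
        # Returned to the agent as an "Observation" rather than crashing the run.
        return f"Error: {exc}"
```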
The agent's decision to use the Calculator is not hard-coded. Instead, the decision is delegated to the LLM, governed by the quality of the tool's description.
- Prompt-Driven Decision: The core of the routing logic lies in the `description` string provided to the `Tool` object:

  ```python
  description="Use this tool to evaluate simple arithmetic expressions. The input must be a valid mathematical expression string."
  ```
- Mechanism: When presented with a user query, the LLM determines if the query matches the tool's described capability. For a query like "What is 128 * 46?", the LLM identifies a direct match and formats a call to the `Calculator`. For knowledge-based questions, it infers a mismatch and proceeds to formulate a `Final Answer` directly.
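Putting the pieces together, the calculator function sketched earlier can be wrapped in a `Tool` along these lines (the variable name is illustrative):

```python
from langchain.agents import Tool

calculator_tool = Tool(
    name="Calculator",
    func=calculator,  # the sandboxed eval wrapper sketched above
    description=(
        "Use this tool to evaluate simple arithmetic expressions. "
        "The input must be a valid mathematical expression string."
    ),
)

# Illustrative routing behavior:
# agent.run("What is 128 * 46?")        -> Thought, then Action: Calculator
# agent.run("Who wrote Dom Casmurro?")  -> Final Answer from internal knowledge
```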
- Open-Source Model Integration: Initial integration attempts with open-source models (`Mistral-7B`, `Zephyr-7B`) highlighted significant API compatibility issues within the LangChain framework, specifically concerning the `task` parameter (`text-generation` vs. `conversational`). This underscores the practical challenges of working with a rapidly evolving open-source LLM ecosystem.
- Strategic Pivot as a Pragmatic Solution: Faced with integration blockers, a pivot to the OpenAI API (`gpt-3.5-turbo`) was a pragmatic decision to ensure project stability and focus on the core agent logic. The OpenAI integration with the ReAct agent proved to be robust and functional out of the box. This journey is a key takeaway on the trade-offs between using cutting-edge open-source models and relying on more mature, stable APIs.
- Tool `description` is Paramount: The agent's routing accuracy is almost entirely dependent on the tool's `description` string. It functions as a micro-prompt that is critical for the LLM's ability to select the correct tool. Small changes in the description's wording can significantly alter the agent's behavior.
The current implementation is a solid proof-of-concept. The most logical path for evolution would be:
- Stateful Agent: Refactor the agent to include memory (e.g., `ConversationBufferMemory`) to support multi-turn, contextual conversations. The current implementation is stateless, treating each query independently. A sketch of this refactor appears after this list.
- Tool Expansion & Routing:
- With multiple tools, the basic ReAct agent can become inefficient. A potential improvement would be to implement a Router Agent, a preliminary LLM call that efficiently selects the most appropriate tool(s) before invoking the main agent chain.
- A concrete example of a new tool that could be added is one for providing the current date and time.
  - Python Function:

    ```python
    from datetime import datetime

    def get_current_datetime() -> str:
        """Returns the current date and time in a readable format."""
        # Note: This will use the server's time. For specific timezones, `pytz` would be needed.
        return datetime.now().strftime("%A, %B %d, %Y %I:%M %p")
    ```
  - Tool Definition:

    ```python
    date_tool = Tool(
        name="Current Date and Time",
        func=get_current_datetime,
        description="Use this to get the current date and time. Use it for any questions about today's date or the current time."
    )
    ```
- UI Implementation: Develop a simple web interface using Streamlit or Gradio to replace the current command-line interface, making the assistant more interactive and accessible.
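For the stateful-agent idea above, a minimal sketch using LangChain's `ConversationBufferMemory` with the conversational ReAct agent type might look like this (it reuses the illustrative `calculator_tool` from earlier):

```python
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

# The conversational ReAct agent expects chat history under this key.
memory = ConversationBufferMemory(memory_key="chat_history")

agent = initialize_agent(
    tools=[calculator_tool],
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True,
)

# Follow-up questions can now refer back to earlier turns:
# agent.run("What is 128 * 46?")
# agent.run("Now divide that result by 2.")
```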
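Likewise, for the UI idea, a minimal Streamlit front-end could be as small as the sketch below. It assumes a hypothetical refactor in which `main.py` exposes the configured `agent` for import:

```python
# app.py — run with: streamlit run app.py
import streamlit as st

from main import agent  # hypothetical: assumes main.py exposes the configured agent

st.title("AI Assistant")
query = st.text_input("Ask a question or enter a calculation:")

if query:
    with st.spinner("Thinking..."):
        answer = agent.run(query)
    st.write(answer)
```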