
Initial Proposal Draft for AI Agent for API Testing & Tool Generation #629

Merged 2 commits on Mar 3, 2025

Conversation

akshayw1
Contributor

@akshayw1 akshayw1 commented Mar 2, 2025

PR Description

This PR adds an initial proposal draft for the AI Agent for API Testing & Tool Generation (#620). The project aims to automate API testing using Large Language Models (LLMs), enabling intelligent test case generation, response validation, and seamless integration with AI agent frameworks like crewAI, smolagents, and pydantic-ai.

Related Issues

Feedback

Any insights or suggestions on the architecture, integration strategy, or additional features would be greatly appreciated to help refine the proposal further. Looking forward to your feedback.

@akshayw1
Contributor Author

akshayw1 commented Mar 2, 2025

@ashitaprasad, looking for feedback and answers to a few clarification questions.

@ashitaprasad
Member

Sure @akshayw1

6. **Tool Definition Generator**: This component converts API specifications into properly structured tool definitions for various AI frameworks, handling the specific requirements and patterns of each target framework.

7. **Benchmark Framework**: The evaluation system that assesses LLM performance on standardized tasks with detailed metrics for accuracy, coverage, relevance, and efficiency.

Member

The design is for a standalone service, and currently it is not aligned with API Dash.
You will also have to think about the UI/UX for this feature, not just the backend.

Contributor Author

I have the UI/UX for these in mind; I will provide a Figma design soon.



All components will be implemented in Python with comprehensive test coverage and documentation. The architecture will be modular, allowing for component reuse and independent scaling as needs evolve.
Member

API Dash is a Flutter project.
Everything has to be implemented in Flutter. LLM interactions will happen via Ollama (local)/ChatGPT/Claude APIs.


I have some questions to improve my understanding:

1. Which AI frameworks are highest priority for tool definition generation? Is there a specific order of importance for crewAI, langchain, pydantic-ai, and langgraph?
Member

Tool definition generation is not a complex task. Once it is done for a framework, it will be easy to replicate for others.



2. Do you have preferred LLM providers that should be prioritized for integration, or should the system be designed to work with any provider through a common interface?
Member

Ollama API (local)
ChatGPT, Anthropic, Gemini APIs (if the user wants to connect to cloud API providers)



3. Are there specific types of APIs that should be given special focus in the benchmark dataset (e.g., e-commerce, financial, IoT)?
Member

No. The benchmark must be a good mix.



4. How will the frontend be planned? Will it be a standalone interface, an extension of an existing dashboard, or fully integrated into the API Dash client?
Member

It will be fully integrated into API Dash.

@ashitaprasad ashitaprasad merged commit 1618b6c into foss42:main Mar 3, 2025
@akshayw1 akshayw1 deleted the idea-draft-gsoc branch March 3, 2025 07:57
@akshayw1
Contributor Author

akshayw1 commented Mar 3, 2025

@ashitaprasad Based on your feedback on the proposal:

AI Agent for API Testing and Automated Tool Integration Idea (Refined Proposal)

Changes and Refinements

Architecture and Implementation

  • Primary Implementation: Completely in Flutter for full integration with API Dash
  • LLM Integration:
    • Primary LLM Providers:
      1. Ollama (Local)
      2. ChatGPT
      3. Anthropic Claude
      4. Google Gemini
  • Modular Design: Maintain a flexible, extensible architecture that allows easy integration of additional LLM providers

System Architecture Updates

Architectural Considerations

  1. Flutter Approach

    • Entire solution will be implemented as a Flutter module
    • Seamless integration with existing API Dash infrastructure
    • Leverage Flutter's cross-platform capabilities for consistent UI/UX
  2. LLM Provider Abstraction

    • Create a provider-agnostic interface for LLM interactions
    • Support multiple providers with a unified integration approach
    • Allow easy switching between local (Ollama) and cloud-based LLM services
  3. Tool Definition Generation

    • Design a flexible template system that can be easily adapted across different AI frameworks
    • Initial focus on creating a generic template that can be quickly customized for:
      • crewAI
      • langchain
      • pydantic-ai
      • langgraph
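
As a rough illustration of what the generic template could emit, the widely used OpenAI function-calling JSON schema is a natural common denominator, since crewAI, langchain, and pydantic-ai tools all carry a name, a description, and a typed parameter schema. The operation name and fields below are hypothetical, not taken from any real API spec:

```json
{
  "name": "get_user_by_id",
  "description": "Fetch a single user record by its numeric ID.",
  "parameters": {
    "type": "object",
    "properties": {
      "user_id": {
        "type": "integer",
        "description": "Unique identifier of the user"
      }
    },
    "required": ["user_id"]
  }
}
```

Framework-specific adapters would then wrap this common representation in each framework's native tool class.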

Benchmark Framework Refinements

  • Create a diverse benchmark dataset covering multiple API types
  • Include a mix of API domains:
    • REST APIs
    • GraphQL APIs
    • gRPC APIs
    • Microservice APIs
  • Evaluation metrics:
    • Test case generation accuracy
    • API coverage
    • Edge case detection
    • Performance efficiency
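
As a sketch of how one of these metrics could be computed, API coverage might be defined as the fraction of spec operations exercised by the generated tests. This is a hypothetical helper (not part of API Dash, and shown in Python purely for brevity, since the final implementation would be in Dart):

```python
def endpoint_coverage(spec_ops, tested_ops):
    """Fraction of (method, path) operations from the API spec
    that the generated test suite actually exercises."""
    spec = set(spec_ops)
    if not spec:
        return 0.0
    # Only count tested operations that actually exist in the spec.
    return len(set(tested_ops) & spec) / len(spec)

# Example: the spec defines 3 operations, the generated tests hit 2 of them.
spec_ops = [("GET", "/users"), ("POST", "/users"), ("GET", "/users/{id}")]
tested_ops = [("GET", "/users"), ("POST", "/users")]
print(round(endpoint_coverage(spec_ops, tested_ops), 2))  # prints 0.67
```

The other metrics (accuracy, edge-case detection, efficiency) would need task-specific scoring, but could share the same simple ratio-style reporting.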

UI/UX Considerations

  1. Integration with API Dash

    • Fully integrated Flutter module
    • Consistent design language with existing API Dash UI
    • Seamless user experience for API testing and tool generation
  2. Key UI Features

    • API specification upload interface
    • LLM provider configuration
    • Test generation and execution dashboard
    • Tool definition export functionality
    • Detailed reporting and visualization

Please review these changes so I can update my idea document accordingly. Let me know if I'm missing anything.

@ashitaprasad
Member

@akshayw1 any updates you make have to be sent as a PR.

@akshayw1
Contributor Author

akshayw1 commented Mar 3, 2025

Okay sure @ashitaprasad

@akshayw1 akshayw1 restored the idea-draft-gsoc branch March 3, 2025 11:08