A modern, responsive UI for interacting with Large Language Models (LLMs). Built with React, TypeScript, Vite, and AWS Cloudscape Design System, this application provides a beautiful interface for chatting with AI models through Ollama, LM Studio, and Amazon Bedrock.
- Modern UI: Built with AWS Cloudscape Design System for a professional, accessible interface
- Multiple AI Providers: Support for Ollama, LM Studio, and Amazon Bedrock
- Real-time Streaming: Stream responses from AI models in real-time
- Model Configuration: Adjust temperature, top-p, and max tokens for fine-tuned responses
- User Preferences: Persistent settings for preferred AI provider and custom avatar initials
- Visual Provider Indicators: Clear icons showing which AI provider is active
- Document Upload: Upload documents (PDF, TXT, HTML, MD, CSV, DOC, DOCX, XLS, XLSX) with Bedrock models
- Usage Metrics: View token usage and latency for Bedrock requests
- Chat History: Manage multiple chat sessions with automatic history tracking (Coming soon)
- Responsive Design: Works seamlessly across desktop and mobile devices
- Dark/Light Mode: Automatic theme support through Cloudscape
Before running this application, ensure you have the following installed:
- Node.js: Version 18.x or higher
- npm: Version 9.x or higher (comes with Node.js)
- AI Provider (at least one):
  - Ollama - Local AI models (recommended)
  - LM Studio - Alternative local AI platform
  - Amazon Bedrock - AWS cloud AI service (requires AWS credentials)
- Clone the repository:

  ```bash
  git clone https://github.com/praveenc/local-llm-ui.git
  cd local-llm-ui
  ```

- Install dependencies:

  ```bash
  npm install
  ```
- Download and install Ollama from ollama.com
- Pull a model (e.g., qwen3-8b):

  ```bash
  ollama pull qwen3-8b-8k:latest
  # or an ollama cloud model
  ollama pull minimax-m2:cloud
  ```

- Verify Ollama is running:

  ```bash
  ollama list
  ```

  Ollama runs on http://localhost:11434 by default.
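If you prefer to confirm reachability from code rather than the CLI, a minimal sketch (not part of the app) that lists installed models via Ollama's /api/tags endpoint:

```typescript
// Sanity check: list installed models via Ollama's /api/tags endpoint,
// assuming the default port 11434. Run with e.g. `npx tsx check-ollama.ts`.
const res = await fetch("http://localhost:11434/api/tags");
if (!res.ok) throw new Error(`Ollama not reachable: HTTP ${res.status}`);
const { models } = (await res.json()) as { models: { name: string }[] };
console.log(models.map((m) => m.name)); // e.g. ["qwen3-8b-8k:latest"]
```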
- Download and install LM Studio from lmstudio.ai
- Download a model through the LM Studio interface
- Start the local server:
  - Open LM Studio
  - Go to the "Developer" or "Server" tab
  - Click "Start Server"
  - Ensure it's running on port 1234
- (Optional) Enable JIT Loading:
  - Go to Developer → Server Settings
  - Enable "JIT Loading" to load models on-demand
- Set up AWS credentials using one of these methods:

  Option A: Environment Variables

  ```bash
  export AWS_ACCESS_KEY_ID=your_access_key_id
  export AWS_SECRET_ACCESS_KEY=your_secret_access_key
  export AWS_REGION=us-west-2  # or your preferred region
  ```

  Option B: AWS CLI

  ```bash
  aws configure
  ```

  Option C: AWS Credentials File

  Create `~/.aws/credentials`:

  ```ini
  [default]
  aws_access_key_id = your_access_key_id
  aws_secret_access_key = your_secret_access_key
  ```

- Ensure IAM Permissions: Your AWS user/role needs these permissions:
  - `bedrock:ListInferenceProfiles`
  - `bedrock:InvokeModel` or `bedrock:InvokeModelWithResponseStream`
- Request Model Access (if needed):
  - Go to the AWS Bedrock console
  - Navigate to "Model access"
  - Request access to desired models (e.g., Claude, Llama)
- Verify Setup:

  ```bash
  aws bedrock list-foundation-models --region us-west-2
  ```
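The same check can be run from TypeScript with the SDK package the app already uses; a minimal sketch, assuming the default credential provider chain (environment variables, AWS CLI profile, or the credentials file):

```typescript
import { BedrockClient, ListFoundationModelsCommand } from "@aws-sdk/client-bedrock";

// Verify credentials, region, and Bedrock access from code (a sketch, not part
// of the app). Credentials are resolved via the default provider chain.
const client = new BedrockClient({ region: process.env.AWS_REGION ?? "us-west-2" });
const { modelSummaries } = await client.send(new ListFoundationModelsCommand({}));
console.log(modelSummaries?.map((m) => m.modelId));
```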
Start the development server with hot module replacement:

```bash
npm run dev
```

The application will be available at http://localhost:5173.

Build the application for production:

```bash
npm run build
```

Preview the production build:

```bash
npm run preview
```
- Start the Application: Run `npm run dev`
- Select AI Provider:
  - Open the sidebar (Model Settings)
  - Choose between Ollama, LM Studio, or Amazon Bedrock
  - The app will automatically detect available models
- Select a Model:
  - Choose from the dropdown list of available models
  - Models are filtered to show only chat-capable models
- Start Chatting:
  - Type your message in the input field
  - Press Enter or click the send button
  - Watch the AI response stream in real-time
Adjust model parameters in the expandable settings panel:
- Temperature (0.0 - 2.0): Controls randomness
  - Lower values (0.1-0.5): More focused and deterministic
  - Higher values (0.8-1.5): More creative and varied
- Top P (0.0 - 1.0): Controls diversity via nucleus sampling
  - Lower values: More focused responses
  - Higher values: More diverse responses
- Max Tokens: Maximum length of the response
  - Default: 4096 tokens
  - Adjust based on your needs and model capabilities
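To make the mapping concrete, here is a sketch of how these three settings typically translate into a provider request, using Ollama's /api/chat endpoint as the example (num_predict is Ollama's name for max tokens; the app's own request code in src/services/ollama.ts may look different):

```typescript
// Illustration only: sending temperature, top-p, and max tokens with an
// Ollama chat request. Other providers use their own parameter names.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen3-8b-8k:latest",
    messages: [{ role: "user", content: "Hello!" }],
    stream: false,
    options: {
      temperature: 0.7, // randomness
      top_p: 0.9,       // nucleus sampling
      num_predict: 4096 // max tokens in the response
    },
  }),
});
const { message } = (await res.json()) as { message: { content: string } };
console.log(message.content);
```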
- New Chat: Click the "New Chat" button in the sidebar to start fresh
- Clear History: Clears the current conversation while keeping the session
Customize your experience with persistent preferences:
- Access Preferences: Click the settings icon at the bottom of the sidebar
- Preferred Provider: Set your default AI provider (Ollama, LM Studio, or Bedrock)
  - The app will automatically select this provider on startup
- Avatar Initials: Customize your chat avatar with 2 alphanumeric characters
  - Automatically converted to uppercase
  - Appears next to your messages in the chat
- Save: Click "Save" to persist your preferences
  - Settings are stored in browser localStorage
  - Preferences persist across sessions
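A minimal sketch of the localStorage idea behind src/utils/preferences.ts (the storage key and exact shape here are hypothetical, not taken from the project):

```typescript
// Sketch only; the real preferences module may use a different key and shape.
// "llm-ui-preferences" is a hypothetical storage key.
interface UserPreferences {
  preferredProvider: "ollama" | "lmstudio" | "bedrock";
  avatarInitials: string; // 2 alphanumeric characters, stored uppercase
}

const STORAGE_KEY = "llm-ui-preferences";

export function savePreferences(prefs: UserPreferences): void {
  const normalized = { ...prefs, avatarInitials: prefs.avatarInitials.toUpperCase() };
  localStorage.setItem(STORAGE_KEY, JSON.stringify(normalized));
}

export function loadPreferences(): UserPreferences | null {
  const raw = localStorage.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as UserPreferences) : null;
}
```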
When using Amazon Bedrock models, you can upload documents:
- Click the attachment icon in the chat input
- Select up to 5 files (max 4.5 MB each)
- Supported formats: PDF, TXT, HTML, MD, CSV, DOC, DOCX, XLS, XLSX
- Send your message with the attached documents
- The AI will analyze and respond based on the document content
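Under the hood, Bedrock's Converse API accepts documents as content blocks alongside the message text. A server-side sketch of the request shape (the model ID and file name are examples, not values from the app):

```typescript
import { readFile } from "node:fs/promises";
import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";

// Sketch: attach a PDF as a document content block in a Converse request.
// The app's Bedrock proxy performs this step server-side.
const client = new BedrockRuntimeClient({ region: "us-west-2" });
const bytes = await readFile("report.pdf");

const response = await client.send(new ConverseCommand({
  modelId: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  messages: [{
    role: "user",
    content: [
      { document: { format: "pdf", name: "report", source: { bytes } } },
      { text: "Summarize the attached document." },
    ],
  }],
}));
console.log(response.output?.message?.content?.[0]?.text);
```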
For Bedrock models, view detailed usage metrics:
- Input Tokens: Tokens in your prompt
- Output Tokens: Tokens in the AI response
- Total Tokens: Combined token count
- Latency: Response time in milliseconds
Metrics appear in an expandable section below the chat input.
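These values correspond to the usage and metrics fields that Bedrock's Converse API returns with each response (with streaming, the same data arrives in a final metadata event). A sketch of reading them:

```typescript
import type { ConverseCommandOutput } from "@aws-sdk/client-bedrock-runtime";

// Sketch: the numbers shown in the UI come from the usage and metrics fields
// returned alongside the model output.
export function extractMetrics(response: ConverseCommandOutput) {
  return {
    inputTokens: response.usage?.inputTokens,
    outputTokens: response.usage?.outputTokens,
    totalTokens: response.usage?.totalTokens,
    latencyMs: response.metrics?.latencyMs,
  };
}
```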
```text
local-llm-ui/
├── src/
│   ├── components/
│   │   ├── chat/                 # Chat-related components
│   │   │   ├── ChatContainer.tsx
│   │   │   ├── FloatingChatInput.tsx
│   │   │   ├── MessageList.tsx
│   │   │   ├── CodeBlock.tsx
│   │   │   └── ...
│   │   └── layout/               # Layout components
│   ├── layout/
│   │   ├── BaseAppLayout.tsx     # Main app layout
│   │   └── SideBar.tsx           # Model settings sidebar
│   ├── services/
│   │   ├── api.ts                # API service orchestrator
│   │   ├── ollama.ts             # Ollama integration
│   │   ├── lmstudio.ts           # LM Studio integration
│   │   ├── bedrock.ts            # Amazon Bedrock integration
│   │   └── types.ts              # TypeScript types
│   ├── utils/
│   │   ├── preferences.ts        # User preferences management
│   │   └── ...                   # Other utility functions
│   └── main.tsx                  # Application entry point
├── server/
│   └── bedrock-proxy.ts          # Bedrock proxy server
├── public/                       # Static assets
├── vite.config.ts                # Vite configuration
└── package.json                  # Dependencies and scripts
```
The application uses Vite's environment variable system. Create a .env file in the root directory if you need custom configuration:
```bash
# Optional: Custom Ollama URL
VITE_OLLAMA_URL=http://localhost:11434

# Optional: Custom LM Studio URL
VITE_LMSTUDIO_URL=http://localhost:1234

# Optional: AWS Configuration (if not using AWS CLI or credentials file)
AWS_ACCESS_KEY_ID=your_access_key_id
AWS_SECRET_ACCESS_KEY=your_secret_access_key
AWS_REGION=us-west-2
```
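In client code, these optional overrides would typically be read through Vite's import.meta.env (a sketch; only VITE_-prefixed variables are exposed to the browser, while the AWS_* values stay server-side with the Bedrock proxy):

```typescript
// Sketch: reading the optional overrides with their documented defaults.
// Requires Vite's client types ("vite/client") for import.meta.env.
const OLLAMA_URL = import.meta.env.VITE_OLLAMA_URL ?? "http://localhost:11434";
const LMSTUDIO_URL = import.meta.env.VITE_LMSTUDIO_URL ?? "http://localhost:1234";

console.log({ OLLAMA_URL, LMSTUDIO_URL });
```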
The Vite development server proxies requests to the AI services:

- `/api/ollama` → `http://localhost:11434`
- `/api/lmstudio` → `http://localhost:1234`
- `/api/bedrock` → handled by the server-side proxy (AWS SDK)
This configuration is in vite.config.ts and handles CORS automatically. The Bedrock proxy runs server-side to securely handle AWS credentials.
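For reference, a proxy of this shape in vite.config.ts looks roughly like the following (an illustrative sketch; the project's actual configuration may differ):

```typescript
// vite.config.ts — illustrative sketch only, not the project's actual file.
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      // Browser requests to /api/ollama/* are forwarded to the local Ollama server.
      "/api/ollama": {
        target: "http://localhost:11434",
        changeOrigin: true,
        rewrite: (path) => path.replace(/^\/api\/ollama/, ""),
      },
      // Same idea for LM Studio's OpenAI-compatible server.
      "/api/lmstudio": {
        target: "http://localhost:1234",
        changeOrigin: true,
        rewrite: (path) => path.replace(/^\/api\/lmstudio/, ""),
      },
      // /api/bedrock is not a simple passthrough: it is handled by the
      // server-side proxy (server/bedrock-proxy.ts) so AWS credentials
      // never reach the browser.
    },
  },
});
```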
Problem: The model dropdown is empty
Solutions:
- Ollama: Ensure Ollama is running and you've pulled at least one model:

  ```bash
  ollama list
  ollama pull llama2
  ```

- LM Studio: Ensure the server is running and a model is loaded or JIT Loading is enabled
- Amazon Bedrock: Verify AWS credentials are configured and you have model access
Problem: "Cannot connect" error messages
Solutions:
- Ollama/LM Studio: Verify the AI service is running on the correct port
- Ollama/LM Studio: Check firewall settings
- Ollama/LM Studio: Ensure no other application is using the port
- Ollama/LM Studio: Restart the AI service
- Bedrock: Verify AWS credentials are configured correctly
- Bedrock: Check IAM permissions for Bedrock access
- Bedrock: Ensure you have requested model access in AWS console
Problem: AI responses are very slow
Solutions:
- Use a smaller model (e.g., `llama2:7b` instead of `llama2:70b`)
- Reduce the `max_tokens` setting
- Ensure your system meets the model's hardware requirements
- Close other resource-intensive applications
- `npm run dev` - Start development server
- `npm run build` - Build for production
- `npm run preview` - Preview production build
- `npm run lint` - Run ESLint
This project uses:
- ESLint: For code linting
- TypeScript: For type safety
- Cloudscape Design System: For UI components
- Follow the existing component structure
- Use Cloudscape components for consistency
- Maintain TypeScript types
- Test with Ollama, LM Studio, and Bedrock (if applicable)
Chat interface with model selection, settings, and real-time streaming responses
- React 19: UI framework
- TypeScript: Type safety
- Vite: Build tool and dev server
- Cloudscape Design System: AWS UI component library
- AWS SDK: Bedrock integration (@aws-sdk/client-bedrock, @aws-sdk/client-bedrock-runtime)
- React Markdown: Markdown rendering in chat
This project is licensed under the MIT License - see the LICENSE file for details.
For issues or questions:
- Check the troubleshooting section
- Verify your AI provider is properly configured
- Check the browser console for error messages
- Ensure all dependencies are installed correctly
- UI built with AWS Cloudscape Design System
- Supports Ollama, LM Studio, and Amazon Bedrock
- Powered by Vite and React

