Self-hosted web API that exposes free, unlimited access to modern LLM providers through a single, simple HTTP interface. It includes an optional web GUI for configuration and supports running via Python or Docker.
- Free to use - No API keys or subscriptions required
- Unlimited requests - No rate limiting
- Simple HTTP interface - Returns plain text responses
- Optional Web GUI - Easy configuration through browser
- Docker support - Ready-to-use container available
- Smart timeout handling - Automatic retry with optimized timeouts
Note: The demo server, when available, may be overloaded and might not always respond.
- Features
- Screenshots
- Quick start
- Run with Docker
- Run from source
- Development
- Development workflow
- Local development
- Testing
- Usage
- Quick examples (browser, curl, Python)
- File input
- Web GUI
- Command-line options
- Configuration
- Cookies
- Proxies
- Models and providers
- Private mode and password
- Siri integration
- Cluster Architecture
- Load balancing
- Network isolation
- High availability
- Scaling
- Requirements
- Star history
- Contributing
- License
Pull and run with an optional cookies.json and port mapping. In Docker, setting a GUI password is recommended (and required by some setups).
- Minimal (no cookies):

```bash
docker run -p 5500:5500 d0ckmg/free-gpt4-web-api:latest
```

- With cookies (read-only mount):

```bash
docker run \
  -v /path/to/your/cookies.json:/cookies.json:ro \
  -p 5500:5500 \
  d0ckmg/free-gpt4-web-api:latest
```

- Override the host port mapping:

```bash
docker run -p YOUR_PORT:5500 d0ckmg/free-gpt4-web-api:latest
```
- docker-compose.yml (cluster with load balancing and scalable replicas):

```yaml
version: "3.9"

services:
  nginx:
    build:
      context: ./nginx
      dockerfile: Dockerfile
    ports:
      - "15432:15432"
    volumes:
      - "./logs/nginx:/var/log/nginx:rw"
      - "./nginx/nginx.conf:/etc/nginx/nginx.conf:ro"
    depends_on:
      - api
    networks:
      - external
      - internal
    restart: unless-stopped

  api:
    build:
      context: ./llm-api-service
      dockerfile: Dockerfile
    volumes:
      - "./llm-api-service/data:/app/data:rw"
      - "./logs:/app/logs:rw"
    networks:
      - internal
      - external
    environment:
      - LOG_LEVEL=${LOG_LEVEL:-INFO}
      - PROVIDER=${PROVIDER:-You}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5500/models"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      replicas: ${API_REPLICAS:-2}
      resources:
        limits:
          memory: ${API_MEMORY_LIMIT:-512M}
        reservations:
          memory: ${API_MEMORY_RESERVATION:-256M}
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3

networks:
  external:
    driver: bridge
  internal:
    driver: bridge
    internal: true
```
Note:
- If you plan to use the Web GUI in Docker, set a password (see “Command-line options”).
- The API listens on port 5500 in the container.
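Once the container is up, a quick way to confirm the API responds is a one-off request from Python. This is a minimal sketch assuming the default port (5500) and the default `text` keyword:

```python
import requests

# Smoke test against a locally running container.
# Assumes the default port (5500) and default query keyword ("text").
resp = requests.get("http://127.0.0.1:5500/", params={"text": "ping"}, timeout=60)
resp.raise_for_status()
print(resp.text)
```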
- Clone the repo

```bash
git clone https://github.com/aledipa/Free-GPT4-WEB-API.git
cd Free-GPT4-WEB-API
```

- Install dependencies

```bash
pip install -r requirements.txt
```
- Start the server (basic)

```bash
python3 src/FreeGPT4_Server.py
```

When using the Web GUI, always set a secure password:

```bash
python3 src/FreeGPT4_Server.py --enable-gui --password your_secure_password
```

This project uses a feature branch workflow with automatic testing to ensure production stability:

- `main` - Production-ready code (auto-deployed)
- `dev` - Development branch (auto-tested)
Start the development environment:

```bash
# Start dev environment
chmod +x scripts/start-dev.sh
./scripts/start-dev.sh

# Test your changes
curl "http://localhost:15433/?text=Hello"

# Stop dev environment
./scripts/stop-dev.sh
```

- Create a feature branch from `dev`
- Make changes and test locally
- Merge to `dev` and push (triggers auto-testing)
- Auto-merge to `main` if tests pass
- Auto-deploy to production from `main`
See DEVELOPMENT.md for detailed development workflow.
The API returns plain text by default.
- Quick browser test:
  - Start the server
  - Open: http://127.0.0.1:5500/?text=Hello

Examples:

- GET http://127.0.0.1:5500/?text=Write%20a%20haiku
- If you changed the keyword parameter (see `--keyword`), replace `text` with your chosen keyword, as in the sketch below.
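For example, assuming the server was started with `--keyword q` (the name `q` is only an illustration), the query parameter changes accordingly:

```python
import requests

# If the server was started with --keyword q, the query parameter is "q"
# instead of the default "text". The name "q" here is only an example.
resp = requests.get("http://127.0.0.1:5500/", params={"q": "Hello"})
print(resp.text)
```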
- Simple text prompt:

```bash
curl "http://127.0.0.1:5500/?text=Explain%20quicksort%20in%20simple%20terms"
```

- File input (see `--file-input` in options):

```bash
fileTMP="$1"
curl -s -F file=@"${fileTMP}" http://127.0.0.1:5500/
```
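The same upload can be done from Python. This sketch assumes the server was started with `--file-input`; the `file` form field mirrors the `-F file=@...` flag in the curl call above:

```python
import requests

# Send a local file as the prompt, mirroring the curl -F example.
# Assumes the server was started with --file-input enabled.
with open("prompt.txt", "rb") as f:
    resp = requests.post("http://127.0.0.1:5500/", files={"file": f})
print(resp.text)
```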
- Python, simple text prompt:

```python
import requests

resp = requests.get("http://127.0.0.1:5500/", params={"text": "Give me a limerick"})
print(resp.text)
```

- Start with the GUI enabled:

```bash
python3 src/FreeGPT4_Server.py --enable-gui
```
- Open the settings page in your browser and log in.
From the GUI you can configure common options (e.g., model, provider, keyword, history, cookies).
Show help:
```text
python3 src/FreeGPT4_Server.py [-h] [--remove-sources] [--enable-gui]
    [--private-mode] [--enable-history] [--password PASSWORD]
    [--cookie-file COOKIE_FILE] [--file-input] [--port PORT]
    [--model MODEL] [--provider PROVIDER] [--keyword KEYWORD]
    [--system-prompt SYSTEM_PROMPT] [--enable-proxies] [--enable-virtual-users]
    [--log-level LEVEL] [--log-file FILE] [--log-format FORMAT] [--enable-request-logging]
```

Options:
- -h, --help Show help and exit
- --remove-sources Remove sources from responses
- --enable-gui Enable graphical settings interface
- --private-mode Require a private token to access the API
- --enable-history Enable message history
- --password PASSWORD Set/change the password for the settings page
- Note: Mandatory in some Docker environments
- --cookie-file COOKIE_FILE Use a cookie file (e.g., /cookies.json)
- --file-input Enable file-as-input support (see curl example)
- --port PORT HTTP port (default: 5500)
- --model MODEL Model to use (default: gpt-4)
- --provider PROVIDER Provider to use (default: Bing)
- --keyword KEYWORD Change input query keyword (default: text)
- --system-prompt SYSTEM_PROMPT System prompt to steer answers
- --enable-proxies Use one or more proxies to reduce blocking
- --enable-virtual-users Enable virtual users to divide requests among multiple users
- --log-level LEVEL Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- --log-file FILE Enable logging to file (specify file path)
- --log-format FORMAT Custom log format string
- --enable-request-logging Enable detailed request/response logging
Some providers require cookies to work properly. For the Bing model, only the “_U” cookie is needed.
- Passing cookies via file:
  - Use `--cookie-file /cookies.json` when running from source
  - In Docker, mount your cookies file read-only: `-v /path/to/cookies.json:/cookies.json:ro`
- The GUI also exposes cookie-related settings.
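If a provider rejects your cookies, a quick sanity check is to confirm the file is valid JSON and mentions the cookie the provider needs. This helper is not part of the project, and the exact layout of cookies.json depends on your browser export tool:

```python
import json

# Hedged sanity check: confirm cookies.json parses and mentions the "_U"
# cookie that the Bing provider needs. Adjust the path to your file.
path = "/path/to/cookies.json"
with open(path) as f:
    raw = f.read()

json.loads(raw)  # raises json.JSONDecodeError if malformed
print('contains "_U" cookie:', '"_U"' in raw)
```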
Enable proxies to mitigate blocks:
- Start with `--enable-proxies`
- Ensure your environment is configured for aiohttp/aiohttp_socks if you need SOCKS/HTTP proxies.
- Models: gpt-4, gpt-4o, deepseek-r1, and other modern LLMs
- Default model: `gpt-4`
- Default provider: `DuckDuckGo` (reliable fallback)
- Health Monitoring: Real-time provider status tracking
Change via flags or in the GUI:
```bash
--model gpt-4o --provider Bing
```

- Smart Timeout Handling: Optimized 30-second timeouts with automatic retry
- Provider Fallback: Automatic switching when primary provider fails
- Health Monitoring: Continuous provider status tracking
- Blacklist System: Automatic exclusion of problematic providers
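The retry and fallback logic lives server-side, but the same pattern can be applied from the caller. A minimal client-side sketch (not the server's internal implementation) that retries a request using the 30-second timeout described above:

```python
import requests

# Client-side retry sketch: not the server's internal fallback logic,
# just the same pattern applied from the caller's side.
def ask(prompt: str, retries: int = 3) -> str:
    last_error = None
    for _ in range(retries):
        try:
            resp = requests.get(
                "http://127.0.0.1:5500/",
                params={"text": prompt},
                timeout=30,  # matches the optimized 30-second timeout above
            )
            resp.raise_for_status()
            return resp.text
        except requests.RequestException as exc:
            last_error = exc
    raise last_error

print(ask("Hello"))
```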
- `--private-mode` requires a private token to access the API
- `--password` protects the settings page (mandatory in Docker setups)
- Security Enhancement: Authentication system hardened against bypass attacks
- Logging: All authentication attempts are logged for security monitoring
- Use a strong password if you expose the API beyond localhost
Important: Always set a password when using the Web GUI to prevent unauthorized access.
- Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
- File logging: Use `--log-file` to enable logging to a file
- Request logging: Use `--enable-request-logging` for detailed API request logs
- Custom format: Use `--log-format` for a custom log message format
- Docker logging: See DOCKER_LOGGING.md for Docker-specific logging configuration
Examples:
```bash
# Basic logging
python3 src/FreeGPT4_Server.py --log-level INFO

# File logging with request details
python3 src/FreeGPT4_Server.py --log-file ./logs/api.log --enable-request-logging

# Docker with logging
docker compose -f docker-compose.dev.yml up -d
```

The project now supports a cluster architecture with load balancing and network isolation:
```bash
# Start cluster with load balancing
chmod +x scripts/start-cluster.sh
./scripts/start-cluster.sh

# Monitor cluster
chmod +x scripts/monitor-cluster.sh
./scripts/monitor-cluster.sh
```

- Scalable LLM API replicas (configurable via the `API_REPLICAS` environment variable, default: 2)
- Nginx reverse proxy with load balancing
- Network isolation - APIs not accessible from outside
- Health checks and automatic failover
- Resource management with memory limits and reservations
- Rate limiting and security headers
- Modular structure - Python app in the `llm-api-service/` directory
Internet → Nginx (external/internal) → LLM API Replicas (internal)
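An external watchdog can reuse the same `/models` probe that the compose healthcheck runs inside each replica. A sketch, assuming the nginx front end listens on port 15432 as configured above:

```python
import requests

# External health probe mirroring the compose healthcheck (curl -f .../models).
# Assumes the nginx entry point from the compose file above (port 15432).
def healthy(base: str = "http://127.0.0.1:15432") -> bool:
    try:
        return requests.get(f"{base}/models", timeout=10).ok
    except requests.RequestException:
        return False

print("cluster healthy:", healthy())
```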
The cluster supports dynamic scaling through environment variables:
```bash
# Set number of API replicas (default: 2)
export API_REPLICAS=4

# Set memory limits
export API_MEMORY_LIMIT=1G
export API_MEMORY_RESERVATION=512M

# Start with custom scaling
docker compose up -d
```

Scaling examples:

- Development: `API_REPLICAS=1` (single instance)
- Production: `API_REPLICAS=4` (high availability)
- High load: `API_REPLICAS=8` (maximum performance)
See CLUSTER_ARCHITECTURE.md for detailed documentation.
Use the GPTMode Apple Shortcut to ask your self-hosted API via Siri.
Shortcut:



Say “GPT Mode” to Siri and ask your question when prompted.
- Python 3.8+
- Flask[async]
- g4f (from https://github.com/xtekky/gpt4free)
- aiohttp
- aiohttp_socks
- Werkzeug
- requests (for enhanced HTTP handling)
For development and testing:
- pytest
- pytest-asyncio
- Timeout Errors: The system now automatically retries with fallback providers
- Provider Blocks: Health monitoring automatically switches to working providers
- Authentication Issues: Ensure you set a strong password and check logs for failed attempts
- Docker Permission Issues: Use read-only mounts for sensitive files like cookies.json
If you encounter issues:
- Check the application logs for detailed error information
- Verify your provider configuration in the Web GUI
- Ensure cookies are properly formatted (if using)
- Try different providers through the fallback system
Contributions are welcome! Feel free to open issues and pull requests to improve features, docs, or reliability.
GNU General Public License v3.0
See LICENSE for details.



