PyThaiNLP · wannaphong · Jan 28, 2026 · Jan 24, 2024 · Jan 15, 2026 · Jan 15, 2026
diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1,51 @@
+# Git
+.git
+.gitignore
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+*.egg
+*.egg-info/
+dist/
+build/
+.eggs/
+.pytest_cache/
+.mypy_cache/
+.coverage
+htmlcov/
+
+# Virtual environments
+venv/
+env/
+ENV/
+.venv
+
+# IDEs
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Docs
+docs/
+/site
+
+# Notebooks
+.ipynb_checkpoints
+notebook/
+
+# GitHub
+.github/
+
+# Output files
+*.wav
+*.mp3
+output.*
+
+# Test files
+tests/
+test/
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -0,0 +1,41 @@
+name: CI Tests
+
+on:
+  push:
+    branches:
+      - main
+      - dev
+  pull_request:
+    branches:
+      - main
+      - dev
+
+jobs:
+  test:
+    runs-on: ${{ matrix.os }}
+    permissions:
+      contents: read
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest]
+        python-version: ['3.8', '3.9', '3.10', '3.11']
+
+    steps:
+    - uses: actions/checkout@v4
+
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v5
+      with:
+        python-version: ${{ matrix.python-version }}
+        cache: 'pip'
+
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install -r requirements.txt
+        pip install -e .
+
+    - name: Run tests
+      run: |
+        python -m unittest discover -s tests -p "test_*.py" -v
diff --git a/DOCKER.md b/DOCKER.md
@@ -0,0 +1,114 @@
+# Docker Usage Guide for PyThaiTTS
+
+This guide explains how to build and run PyThaiTTS using Docker.
+
+## Building the Docker Image
+
+To build the Docker image, run the following command from the root directory of the repository:
+
+```bash
+docker build -t pythaitts:latest .
+```
+
+This will create a Docker image named `pythaitts:latest` with all dependencies installed.
+
+## Running the Demo
+
+To run the demo script that demonstrates Thai text-to-speech synthesis:
+
+```bash
+docker run --rm pythaitts:latest
+```
+
+The demo will:
+1. Initialize the PyThaiTTS model (default: lunarlist_onnx)
+2. Generate speech from Thai text
+3. Save the output to a WAV file
+4. Display the waveform information
+
+## Custom Usage
+
+### Interactive Shell
+
+To start an interactive shell inside the container:
+
+```bash
+docker run --rm -it pythaitts:latest /bin/bash
+```
+
+### Run Custom Python Script
+
+To run your own Python script:
+
+```bash
+docker run --rm -v "$(pwd)/your_script.py:/app/custom.py" pythaitts:latest python custom.py
+```
+
+### Save Output Files
+
+To save generated audio files to your host machine:
+
+```bash
+docker run --rm -v "$(pwd)/output:/app/output" pythaitts:latest python -c "
+from pythaitts import TTS
+tts = TTS()
+tts.tts('สวัสดีครับ', filename='output/hello.wav')
+"
+```
+
+This will save the generated `hello.wav` file to the `output` directory on your host machine.
+
+## Example Usage in Python
+
+Inside the container, you can use PyThaiTTS as follows:
+
+```python
+from pythaitts import TTS
+
+# Initialize TTS with default model
+tts = TTS()
+
+# Generate speech and save to file
+file_path = tts.tts("ภาษาไทย ง่าย มาก มาก", filename="output.wav")
+print(f"Audio saved to: {file_path}")
+
+# Generate speech and get waveform
+waveform = tts.tts("ภาษาไทย ง่าย มาก มาก", return_type="waveform")
+print(f"Waveform shape: {waveform.shape}")
+```
+
+## Available TTS Models
+
+PyThaiTTS supports multiple models:
+
+- **lunarlist_onnx** (default): ONNX-optimized model, CPU-only
+- **khanomtan**: KhanomTan TTS model
+- **lunarlist**: Original Lunarlist model
+
+To use a different model:
+
+```python
+from pythaitts import TTS
+
+# Using KhanomTan model
+tts = TTS(pretrained="khanomtan", version="1.0")
+```
+
+## Requirements
+
+- Docker installed on your system
+- At least 2GB of available disk space
+- Internet connection for downloading models on first run
+
+## Troubleshooting
+
+If model downloads fail, check that:
+1. You have a stable internet connection
+2. The Hugging Face Hub is accessible from your network
+3. You have sufficient disk space for model files
+
+## Notes
+
+- The first run will download model files from Hugging Face Hub, which may take some time depending on your internet connection
+- Generated audio files are in WAV format
+- The default model (lunarlist_onnx) runs on CPU and doesn't require GPU support
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,29 @@
+# Use Python 3.11 as base image (compatible with the project requirements)
+FROM python:3.11-slim
+
+# Set working directory
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    git \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy requirements and setup files
+COPY requirements.txt setup.py README.md ./
+COPY pythaitts ./pythaitts
+
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Install the package
+RUN pip install --no-cache-dir -e .
+
+# Copy demo script
+COPY demo.py ./
+
+# Set environment variable to avoid Python buffering
+ENV PYTHONUNBUFFERED=1
+
+# Run the demo script by default
+CMD ["python", "demo.py"]
diff --git a/README.md b/README.md
@@ -14,6 +14,8 @@ Install by pip:
 
 ## Usage
 
+### Basic Usage
+
 ```python
 from pythaitts import TTS
 
@@ -22,4 +24,61 @@ file = tts.tts("ภาษาไทย ง่าย มาก มาก", filenam
 wave = tts.tts("ภาษาไทย ง่าย มาก มาก",return_type="waveform") # It will get waveform.
 ```
 
+### Using Different TTS Models
+
+PyThaiTTS supports multiple TTS models. You can specify which model to use:
+
+```python
+from pythaitts import TTS
+
+# Use VachanaTTS (default voices: th_f_1, th_m_1, th_f_2, th_m_2)
+tts = TTS(pretrained="vachana")
+file = tts.tts("สวัสดีครับ", speaker_idx="th_f_1", filename="output.wav")
+
+# Use Lunarlist ONNX (default)
+tts = TTS(pretrained="lunarlist_onnx")
+file = tts.tts("ภาษาไทย ง่าย มาก", filename="output.wav")
+
+# Use KhanomTan
+tts = TTS(pretrained="khanomtan")
+file = tts.tts("ภาษาไทย", speaker_idx="Linda", filename="output.wav")
+```
+
+### Text Preprocessing
+
+PyThaiTTS includes automatic text preprocessing to improve TTS quality:
+- **Number to Thai text conversion**: Converts digits (e.g., "123") to Thai text (e.g., "หนึ่งร้อยยี่สิบสาม")
+- **Mai yamok (ๆ) expansion**: Expands the Thai repetition character (e.g., "ดีๆ" becomes "ดีดี")
+
+Preprocessing is enabled by default:
+
+```python
+from pythaitts import TTS
+
+tts = TTS()
+# Automatic preprocessing: "มี 5 คนๆ" becomes "มี ห้า คนคน"
+file = tts.tts("มี 5 คนๆ", filename="output.wav")
+```
+
+You can disable preprocessing if needed:
+
+```python
+file = tts.tts("มี 5 คนๆ", preprocess=False, filename="output.wav")
+```
+
+You can also use preprocessing functions directly:
+
+```python
+from pythaitts import num_to_thai, expand_maiyamok, preprocess_text
+
+# Convert numbers to Thai text
+print(num_to_thai("123"))  # Output: หนึ่งร้อยยี่สิบสาม
+
+# Expand mai yamok
+print(expand_maiyamok("ดีๆ"))  # Output: ดีดี
+
+# Full preprocessing
+print(preprocess_text("มี 5 คนๆ"))  # Output: มี ห้า คนคน
+```
+
 You can see more at [https://pythainlp.github.io/PyThaiTTS/](https://pythainlp.github.io/PyThaiTTS/).
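Reviewer sketch, not part of the patch: the README section above describes two preprocessing steps, digit-to-Thai-word conversion and mai yamok (ๆ) expansion. The block below is a minimal, self-contained illustration of what those steps amount to. It is not the implementation behind `num_to_thai`, `expand_maiyamok`, or `preprocess_text`; the names `sketch_num_to_thai`, `sketch_expand_maiyamok`, and `sketch_preprocess` are hypothetical. It assumes integers from 0 to 999,999 and space-separated words, as in the README examples.

```python
import re

# Thai words for the digits 0-9 and for the decimal places up to 100,000.
DIGITS = ["ศูนย์", "หนึ่ง", "สอง", "สาม", "สี่", "ห้า", "หก", "เจ็ด", "แปด", "เก้า"]
PLACES = ["", "สิบ", "ร้อย", "พัน", "หมื่น", "แสน"]

# A run of Thai characters (excluding mai yamok, U+0E46) followed by mai yamok.
_MAIYAMOK_RE = re.compile(r"([\u0E01-\u0E3A\u0E40-\u0E45\u0E47-\u0E4E]+)\u0E46")
_DIGITS_RE = re.compile(r"[0-9]+")


def sketch_num_to_thai(number: int) -> str:
    """Spell out an integer in Thai words (assumption: 0 <= number <= 999999)."""
    if not 0 <= number <= 999_999:
        raise ValueError("this sketch only handles 0..999999")
    if number == 0:
        return DIGITS[0]
    digits = str(number)
    words = []
    for index, char in enumerate(digits):
        digit = int(char)
        place = len(digits) - 1 - index
        if digit == 0:
            continue
        if place == 1 and digit == 1:
            words.append("สิบ")        # 10 is "sip", not "nueng sip"
        elif place == 1 and digit == 2:
            words.append("ยี่สิบ")      # 20 uses the special form "yi sip"
        elif place == 0 and digit == 1 and len(digits) > 1:
            words.append("เอ็ด")       # a trailing 1 is read "et" in multi-digit numbers
        else:
            words.append(DIGITS[digit] + PLACES[place])
    return "".join(words)


def sketch_expand_maiyamok(text: str) -> str:
    """Repeat the unbroken Thai run that directly precedes each mai yamok."""
    return _MAIYAMOK_RE.sub(lambda match: match.group(1) * 2, text)


def sketch_preprocess(text: str) -> str:
    """Apply digit conversion first, then mai yamok expansion."""
    text = _DIGITS_RE.sub(lambda match: sketch_num_to_thai(int(match.group())), text)
    return sketch_expand_maiyamok(text)
```

Because the mai yamok rule repeats the whole Thai run before `ๆ`, it gives the wrong result on unsegmented text where several words are written together; a real implementation would need word segmentation to find the repeated word.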
diff --git a/demo.py b/demo.py
@@ -0,0 +1,58 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+"""
+Simple demo script for PyThaiTTS
+This script demonstrates Thai text-to-speech synthesis using PyThaiTTS.
+"""
+
+from pythaitts import TTS
+
+def main():
+    print("=" * 60)
+    print("PyThaiTTS Demo - Thai Text-to-Speech")
+    print("=" * 60)
+    print()
+
+    # Initialize TTS with default model (lunarlist_onnx)
+    print("Initializing TTS model (lunarlist_onnx)...")
+    try:
+        tts = TTS()
+        print("✓ TTS model loaded successfully!")
+        print()
+
+        # Sample Thai text
+        text = "สวัสดีครับ ยินดีต้อนรับสู่ PyThaiTTS"
+        print(f"Input text: {text}")
+        print()
+
+        # Generate speech and save to file
+        import datetime
+        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+        output_file = f"output_{timestamp}.wav"
+        print(f"Generating speech and saving to {output_file}...")
+        result = tts.tts(text, filename=output_file)
+        print("✓ Speech generated successfully!")
+        print(f"Output saved to: {result}")
+        print()
+
+        # Also demonstrate getting waveform
+        print("Generating waveform...")
+        waveform = tts.tts(text, return_type="waveform")
+        print("✓ Waveform generated successfully!")
+        print(f"Waveform shape: {waveform.shape}")
+        print()
+
+        print("=" * 60)
+        print("Demo completed successfully!")
+        print("=" * 60)
+
+    except Exception as e:
+        print(f"✗ Error occurred: {e}")
+        import traceback
+        traceback.print_exc()
+        return 1
+
+    return 0
+
+if __name__ == "__main__":
+    raise SystemExit(main())
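Reviewer sketch, not part of the patch: `demo.py` prints the output path and the waveform shape, but nothing about the WAV file it wrote. A small helper built only on the standard-library `wave` module could report that. `describe_wav` and `write_test_tone` are hypothetical names, and the 22,050 Hz, 16-bit mono format of the test tone is an assumption chosen for illustration, not a value taken from PyThaiTTS.

```python
import math
import struct
import wave


def write_test_tone(path, n_frames=2205, sample_rate=22050, freq=440.0):
    """Write a short 16-bit mono sine tone so the reader below has input."""
    frames = b"".join(
        struct.pack(
            "<h",  # little-endian signed 16-bit sample
            int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / sample_rate)),
        )
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


def describe_wav(path):
    """Return channel count, sample width, sample rate, and duration of a WAV file."""
    with wave.open(path, "rb") as wav_file:
        n_frames = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate": sample_rate,
            "frames": n_frames,
            "seconds": n_frames / sample_rate,
        }
```

In the demo, a call such as `describe_wav(result)` after the file is saved would print these fields, assuming the model writes uncompressed PCM WAV, which is the only encoding the `wave` module reads.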
diff --git a/docs/index.rst b/docs/index.rst
@@ -5,4 +5,15 @@ PyThaiTTS
 Open Source Thai Text-to-speech library in Python
 
 .. autoclass:: TTS
-   :members:
+   :members:
+
+Text Preprocessing
+------------------
+
+PyThaiTTS provides text preprocessing functions to improve TTS quality.
+
+.. autofunction:: preprocess_text
+
+.. autofunction:: num_to_thai
+
+.. autofunction:: expand_maiyamok