A text prediction API using Prediction by Partial Matching (PPM). Train on any text to create personalized predictions, or use the default English training text.
- Train on any text of your choice
- Session-based models for individual customization
- Predict at letter, word, or sentence level
- Falls back to the default English training text if no custom training is provided
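
For a rough sense of what PPM does under the hood, the sketch below illustrates the core idea: predict the next symbol from the longest context seen during training, backing off to shorter contexts when needed. This is only an illustration, not the jslm implementation the API actually uses (credited at the end of this README), and it skips PPM's escape-probability blending.

```js
// Illustration only: `counts` is assumed to be a Map from a context string
// (up to maxOrder symbols) to an object of { nextSymbol: count } gathered
// during training. Real PPM also blends orders via escape probabilities.
function predictNext(counts, context, maxOrder = 5) {
  for (let order = Math.min(maxOrder, context.length); order >= 0; order--) {
    const key = context.slice(context.length - order); // last `order` characters
    const nextCounts = counts.get(key);
    if (!nextCounts) continue; // back off to a shorter context
    const total = Object.values(nextCounts).reduce((sum, c) => sum + c, 0);
    return Object.entries(nextCounts)
      .map(([symbol, count]) => ({ symbol, probability: count / total }))
      .sort((a, b) => b.probability - a.probability);
  }
  return []; // model has seen nothing yet
}
```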
You will need:

- Node.js (>=20.0.0)
- npm (>=9.0.0)
- Clone the repository:

```bash
git clone https://github.com/willwade/PPM-API.git
cd PPM-API
```

- Install dependencies:

```bash
npm install
```

- (Optional) Install Python dependencies for generating training text:

```bash
pip install datasets
```
Run the following command to start the API:

```bash
npm start
```

The API will be available at http://localhost:8080.
Once running, view the full API documentation at:
http://localhost:8080/api-docs
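
If you want to confirm the server is up from code, here is a minimal check for Node.js 20+ (which ships a global `fetch`); it simply requests the documentation route shown above:

```js
// check.mjs: confirm the running API answers on the default port
const res = await fetch('http://localhost:8080/api-docs');
console.log(res.ok ? 'API is up' : `Unexpected status: ${res.status}`);
```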
- Train a Model (Optional)

```bash
curl -X POST http://localhost:8080/train \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.gutenberg.org/cache/epub/19778/pg19778.txt",
    "maxOrder": 5
  }'
```

Response:

```json
{
  "success": true,
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Training complete",
  "trainingTimeMs": 1234,
  "vocabularySizes": {
    "letter": 52,
    "word": 2000,
    "sentence": 500
  }
}
```
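
The same training call from Node.js 20+ using the built-in `fetch`; the URL and `maxOrder` mirror the curl example above, and `text` can be sent instead of `url` (see Training Options below):

```js
// train.mjs: train a session-specific model from a URL
const trainRes = await fetch('http://localhost:8080/train', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://www.gutenberg.org/cache/epub/19778/pg19778.txt',
    maxOrder: 5,
  }),
});
const { sessionId } = await trainRes.json();
console.log('sessionId:', sessionId); // keep this for the x-session-id header
```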
- Make Predictions

```bash
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -H "x-session-id: 550e8400-e29b-41d4-a716-446655440000" \
  -d '{
    "input": "The quick brown",
    "level": "word",
    "numPredictions": 5
  }'
```

Response:
```json
{
  "input": "The quick brown",
  "level": "word",
  "predictions": [
    {
      "symbol": "fox",
      "probability": 0.4,
      "logProbability": -0.916
    }
    // ... more predictions
  ],
  "contextOrder": 3,
  "perplexity": 2.5
}
```
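
The equivalent call from Node.js 20+, reusing the `sessionId` returned by `/train` (the values mirror the curl example above):

```js
// predict.mjs: request word-level predictions for a context string
const sessionId = '550e8400-e29b-41d4-a716-446655440000'; // from your /train response
const res = await fetch('http://localhost:8080/predict', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'x-session-id': sessionId },
  body: JSON.stringify({ input: 'The quick brown', level: 'word', numPredictions: 5 }),
});
const { predictions } = await res.json();
console.log(predictions.map((p) => `${p.symbol} (${p.probability})`));
```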
You can train the model in two ways:

- Using a URL:

```json
{
  "url": "https://www.gutenberg.org/cache/epub/19778/pg19778.txt",
  "maxOrder": 5
}
```

- Using Direct Text:

```json
{
  "text": "Your training text here",
  "maxOrder": 5
}
```

Note: Provide either `url` OR `text`, but not both.
The API supports three prediction levels:

- `letter`: Character-by-character prediction
- `word`: Word-by-word prediction
- `sentence`: Full sentence prediction
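
For example, the same input can be sent at each level by varying the `level` field of the `/predict` request shown earlier (a small Node.js 20+ sketch; the session id is assumed to come from a previous `/train` call):

```js
// levels.mjs: compare predictions for one input across all three levels
const sessionId = '550e8400-e29b-41d4-a716-446655440000'; // from a previous /train call
for (const level of ['letter', 'word', 'sentence']) {
  const res = await fetch('http://localhost:8080/predict', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-session-id': sessionId },
    body: JSON.stringify({ input: 'The quick brown', level, numPredictions: 3 }),
  });
  const { predictions } = await res.json();
  console.log(level, predictions.map((p) => p.symbol));
}
```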
- When you train a model, you receive a `sessionId`
- Use this `sessionId` in the `x-session-id` header for subsequent predictions
- If no `sessionId` is provided, the API uses the default English training text
To deploy on DigitalOcean:

- Fork this repository
- Connect your DigitalOcean account
- Create a new App from your forked repository
- Deploy using Node.js settings:
  - Environment: Node.js
  - Build Command: `npm install`
  - Run Command: `npm start`
To generate training text from datasets (Alice in Wonderland, AAC-like phrases, filtered dialogue):

- Run the Python script:

```bash
python generate_training_text.py
```

- The generated text will be saved to `training_text.txt`.
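
If you want to feed the generated file straight into the API, one option (using the `text` form of `/train` described above) looks like this Node.js 20+ sketch:

```js
// train-from-file.mjs: send the generated training text to /train as direct text
import { readFile } from 'node:fs/promises';

const text = await readFile('training_text.txt', 'utf8');
const res = await fetch('http://localhost:8080/train', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text, maxOrder: 5 }),
});
console.log(await res.json()); // includes the sessionId for later predictions
```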
If you encounter any problems, please open an issue.
This project is licensed under the GPL v3.0 License - see the LICENSE file for details.
The PPM JavaScript library was developed by Google Research: https://github.com/google-research/google-research/tree/master/jslm