Just Another AI Assistant - a HuggingFace Transformer.js Demo

Try it here.

The model may not run properly on your devices with insufficient RAM!

A simple demonstration modified from HuggingFace's React-translator example with TypeScript support.

The demo utilizes Transformers.js to load and run a smaller large language model (LLM) - or small language model (SLM) in the web browser. The app uses Vite's Worker to run the model in the background, hence this would have to be a React or Svelte app.

"Small" Large Language Models and Configuration

You can define the models, tasks, device and other model parameters in /src/model/Config.json:

Notes:

The model loading time and memory usage mostly depends on model size. It is recommended to have at least 4-8 GB free memory on your device.
System roles may not work well on certain models.
Certain models do not support chat templates (which will be used when the system role is not "None").
Certain devices and dtype options may not work for certain models. "Audo" in dtype does not work for all models either.
- For example, Gemma-3-1B-It does not work on WebGPU, and SmolLM2-1.7B-Instruct only works when dtype is set to int8, uint8, bnb4 or q4f16.
After loading a model, you must refresh the page to load a different one. There is no way to release the old model from the memory, and trying to load more than two models proved to be problematic.

{
    "models": {
        "SmolLM2-135M-Instruct": "HuggingFaceTB/SmolLM2-135M-Instruct",
        "SmolLM2-360M-Instruct": "HuggingFaceTB/SmolLM2-360M-Instruct",
        "codegen-350M-mono": "Xenova/codegen-350M-mono",
        "Qwen2.5-0.5B-Instruct": "Mozilla/Qwen2.5-0.5B-Instruct",
        "Qwen3-0.6B": "onnx-community/Qwen3-0.6B-ONNX",
        "Gemma-3-1B-It": "onnx-community/gemma-3-1b-it-ONNX",
        "Falcon3-1B-Instruct": "onnx-community/Falcon3-1B-Instruct",
        "AMD-OLMo-1B-SFT-DPO": "onnx-community/AMD-OLMo-1B-SFT-DPO",
        "ZR1-1.5B": "onnx-community/ZR1-1.5B-ONNX",
        "SmolLM2-1.7B-Instruct": "HuggingFaceTB/SmolLM2-1.7B-Instruct",
        "Phi-3-mini-4k-Instruct (3.8B)": "Xenova/Phi-3-mini-4k-instruct"
    },
    "system_roles": {
        "None (no chat template)": "",
        "Helpful assistant": "You are a helpful, concise, and accurate assistant.",
        "Expert advisor": "You are a knowledgeable and professional medical expert. Provide clear, evidence-based answers.",
        "Socratic guide": "You are a Socratic tutor. Ask thoughtful questions to guide the user to their own conclusions.",
        "Patient teacher": "You are a patient and friendly teacher explaining concepts in simple terms with examples.",
        "Quiz generator": "You are a quiz master. Generate multiple-choice questions to test knowledge of a topic.",
        "Code assistant": "You are a skilled software engineer. Help the user write clean, efficient code.",
        "Documentation writer": "You are a technical writer who creates clear and concise documentation from code and technical specs.",
        "Data analyst": "You are a data analyst. Interpret data clearly, with charts or summaries if needed.",
        "Storyteller": "You are a creative and engaging storyteller. Write vivid and original fiction.",
        "Poet": "You are a poetic wordsmith. Craft expressive and emotionally resonant poetry.",
        "Motivational coach": "You are a motivational coach. Offer practical advice and encouragement.",
        "Friendly companion": "You are a kind and empathetic companion. Listen and respond warmly."
    },
    "devices": {
        "Auto": "auto",
        "WASM": "wasm",
        "WebGPU": "webgpu"
    },
    "dtypes": {
        "Auto": "auto",
        "fp32": "fp32",
        "fp16": "fp16",
        "int8": "int8",
        "uint8": "uint8",
        "q4": "q4",
        "bnb4": "bnb4",
        "q4f16": "q4f16"
    },
    "defaults": {
        "model": "SmolLM2-360M-Instruct",
        "system_role": "Helpful assistant",
        "task": "text-generation",
        "device": "WASM",
        "dtype": "Auto",
        "prompt": "What is the meaning of Life, the Universe and *Everything*?",
        "config": {
            "max_new_tokens": 1024,
            "temperature": 0.2,
            "top_p": 0.95,
            "top_k": 30,
            "repetition_penalty": 1.05,
            "do_sample": true
        }
    }
}

Development

`yarn`

Install dependencies.

`yarn start`

Start the dev server.

`yarn build`

Build a production at ./dist.

`yarn serve`

Serve and view the built production.

`yarn commit`

Commit changes.

Name		Name	Last commit message	Last commit date
Latest commit History 130 Commits
.devcontainer		.devcontainer
.github		.github
public		public
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
index.html		index.html
package.json		package.json
tsconfig.app.json		tsconfig.app.json
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
vite.config.ts		vite.config.ts
yarn.lock		yarn.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Just Another AI Assistant - a HuggingFace Transformer.js Demo

"Small" Large Language Models and Configuration

Development

`yarn`

`yarn start`

`yarn build`

`yarn serve`

`yarn commit`

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

alankrantas/just-another-ai-assistant-huggingface-transformers-js

Folders and files

Latest commit

History

Repository files navigation

Just Another AI Assistant - a HuggingFace Transformer.js Demo

"Small" Large Language Models and Configuration

Development

yarn

yarn start

yarn build

yarn serve

yarn commit

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

`yarn`

`yarn start`

`yarn build`

`yarn serve`

`yarn commit`

Packages