AI Utils

AI Utils is a compact library for building edge-rendered AI-powered streaming text and chat UIs. It takes care of the boilerplate streaming code without adding any abstraction or indirection between you and your AI model provider's SDK, letting you focus on building your next big thing instead of wasting another day messing around with text encoders.

Features

  • Edge Runtime compatibility
  • First-class support for native OpenAI, Anthropic, and HuggingFace Inference JavaScript SDKs
  • SWR-powered React hooks for fetching and rendering streaming text responses
  • Callbacks for saving completed streaming responses to a database (in the same request)

Installation

pnpm install @vercel/ai-utils

Table of Contents

  • Features
  • Installation
  • Background
  • Usage
  • Tutorial
  • API Reference

Background

Creating UIs with contemporary AI providers is a daunting task. Ideally, language models/providers would be fast enough that developers could simply fetch complete JSON responses in a few hundred milliseconds, but the reality is starkly different: it's quite common for these LLMs to take 5-40s to whip up a response.

Instead of tormenting users with a seemingly endless loading spinner while these models conjure up responses or completions, the progressive approach involves streaming the text output to the frontend on the fly, a tactic championed by OpenAI's ChatGPT. However, implementing this technique is easier said than done. Each AI provider has its own unique SDK, each wraps its tokens in its own envelope, and each returns different metadata (whose usefulness varies drastically).

Many of the AI utility helpers in the JS ecosystem so far tend to overcomplicate things with unnecessary magic tricks, excess levels of indirection, and lossy abstractions. This is where Vercel AI Utils comes in: a compact library designed to alleviate the headaches of constructing streaming text UIs by taking care of the most annoying parts and then getting out of your way:

  • Reduce the boilerplate necessary for handling streaming text responses
  • Guarantee the ability to run functions at the Edge
  • Streamline fetching and rendering of streaming responses (in React)

This library is committed to working directly with each AI/model hosting provider's SDK, an equivalent edge-compatible version, or a vanilla fetch function. Its job is simply to cut through the confusion and handle the intricacies of streaming text, leaving you to concentrate on building your next big thing instead of wasting another afternoon tweaking TextEncoder with trial and error.

Usage

// app/api/generate/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'

const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)

export const runtime = 'edge'

export async function POST() {
  const response = await openai.createChatCompletion({
    model: 'gpt-4',
    stream: true,
    messages: [{ role: 'user', content: 'What is love?' }]
  })
  const stream = OpenAIStream(response)
  return new StreamingTextResponse(stream)
}
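
The streamed response is plain text, so no special client is required to consume it. Here's a minimal sketch (assuming the /api/generate route above) that reads the stream with fetch and a ReadableStream reader:

// Read the streaming text response in the browser
const res = await fetch('/api/generate', { method: 'POST' })
const reader = res.body!.getReader()
const decoder = new TextDecoder()
let text = ''
while (true) {
  const { done, value } = await reader.read()
  if (done) break
  // `stream: true` handles multi-byte characters split across chunks
  text += decoder.decode(value, { stream: true })
  console.log(text) // render progressively as tokens arrive
}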

Tutorial

For this example, we'll stream a chat completion from OpenAI's gpt-3.5-turbo and render it in Next.js.

Create a Next.js app

Create a Next.js application and install @vercel/ai-utils and openai-edge. We currently prefer openai-edge over the official OpenAI SDK because the official SDK uses axios, which is not compatible with Vercel Edge Functions.

pnpx create-next-app my-ai-app
cd my-ai-app
pnpm install @vercel/ai-utils openai-edge

Add your OpenAI API Key to .env

Create a .env file in the root of your project and add your OpenAI API Key under the name OPENAI_API_KEY.

touch .env

Then, inside .env:

OPENAI_API_KEY=xxxxxxxxx

Create a Route Handler

Create a Next.js Route Handler that uses the Edge Runtime. We'll use it to generate a chat completion via OpenAI and stream the result back to our Next.js app.

// ./app/api/chat/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'

// Create an OpenAI API client (that's edge friendly!)
const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)

// IMPORTANT! Set the runtime to edge
export const runtime = 'edge'

export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json()

  // Ask OpenAI for a streaming chat completion given the prompt
  const response = await openai.createChatCompletion({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages
  })
  // Convert the response into a friendly text-stream
  const stream = OpenAIStream(response)
  // Respond with the stream
  return new StreamingTextResponse(stream)
}

Vercel AI Utils provides two utility helpers to make the above seamless: first, we pass the streaming response we receive from OpenAI to OpenAIStream. This method decodes/extracts the text tokens in the response and then re-encodes them properly for simple consumption. We can then pass that new stream directly to StreamingTextResponse. This is another utility class that extends the normal Node/Edge Runtime Response class with the default headers you probably want (hint: 'Content-Type': 'text/plain; charset=utf-8' is already set for you).
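
In other words, StreamingTextResponse is roughly shorthand for constructing the Response yourself. An illustrative sketch of what the call above is equivalent to:

// Roughly equivalent to `return new StreamingTextResponse(stream)`
return new Response(stream, {
  status: 200,
  headers: { 'Content-Type': 'text/plain; charset=utf-8' }
})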

Wire up the UI

Create a Client component with a form for collecting the user's message and rendering the streamed chat messages.

// ./app/form.tsx
'use client'

import { useChat } from '@vercel/ai-utils'

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat()

  return (
    <div className="mx-auto w-full max-w-md py-24 flex flex-col stretch">
      {messages.length > 0
        ? messages.map(m => (
            <div key={m.id}>
              {m.role === 'user' ? 'User: ' : 'AI: '}
              {m.content}
            </div>
          ))
        : null}

      <form onSubmit={handleSubmit}>
        <input
          className="fixed w-full max-w-md bottom-0 border border-gray-300 rounded mb-8 shadow-xl p-2"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  )
}
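
Finally, render the component from a page. A minimal sketch, assuming the default App Router setup and the form.tsx file above:

// ./app/page.tsx
import Chat from './form'

export default function Page() {
  return <Chat />
}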

API Reference

OpenAIStream(res: Response, cb?: AIStreamCallbacks): ReadableStream

A transform that extracts the text from all chat and completion OpenAI models and returns it as a ReadableStream.

// app/api/generate/route.ts
import { Configuration, OpenAIApi } from 'openai-edge';
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils';

const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(config);

export const runtime = 'edge';

export async function POST() {
  const response = await openai.createChatCompletion({
    model: 'gpt-4',
    stream: true,
    messages: [{ role: 'user', content: 'What is love?' }],
  });
  const stream = OpenAIStream(response, {
    async onStart() {
      console.log('streamin yo')
    },
    async onToken(token) {
      console.log('token: ' + token)
    },
    async onCompletion(content) {
      console.log('full text: ' + content)
      // await prisma.messages.create({ content }) or something
    }
  });
  return new StreamingTextResponse(stream);
}

HuggingFaceStream(iter: AsyncGenerator<any>, cb?: AIStreamCallbacks): ReadableStream

A transform that extracts the text from most chat and completion HuggingFace models and returns it as a ReadableStream.

// app/api/generate/route.ts
import { HfInference } from '@huggingface/inference'
import { HuggingFaceStream, StreamingTextResponse } from '@vercel/ai-utils'

export const runtime = 'edge'

const Hf = new HfInference(process.env.HUGGINGFACE_API_KEY)

export async function POST() {
  const response = await Hf.textGenerationStream({
    model: 'OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5',
    inputs: `<|prompter|>What's the Earth total population?<|endoftext|><|assistant|>`,
    parameters: {
      max_new_tokens: 200,
      // @ts-ignore
      typical_p: 0.2, // you'll need this for OpenAssistant
      repetition_penalty: 1,
      truncate: 1000,
      return_full_text: false
    }
  })
  const stream = HuggingFaceStream(response)
  return new StreamingTextResponse(stream)
}

StreamingTextResponse(res: ReadableStream, init?: ResponseInit)

This is a tiny wrapper around the Response class that makes returning ReadableStreams of text a one-liner. The status is automatically set to 200, and 'Content-Type': 'text/plain; charset=utf-8' is set as a header.

// app/api/generate/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'

const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)

export const runtime = 'edge'

export async function POST() {
  const response = await openai.createChatCompletion({
    model: 'gpt-4',
    stream: true,
    messages: [{ role: 'user', content: 'What is love?' }]
  })
  const stream = OpenAIStream(response)
  return new StreamingTextResponse(stream, {
    headers: { 'X-RATE-LIMIT': 'lol' }
  }) // => new Response(stream, { status: 200, headers: { 'Content-Type': 'text/plain; charset=utf-8', 'X-RATE-LIMIT': 'lol' } })
}

useChat(options: UseChatOptions): ChatHelpers

An SWR-powered React hook for streaming text completion or chat messages and handling chat and prompt input state.

// app/chat.tsx
'use client'

import { useChat } from '@vercel/ai-utils'

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat()

  return (
    <div className="mx-auto w-full max-w-md py-24 flex flex-col stretch">
      {messages.length > 0
        ? messages.map(m => (
            <div key={m.id}>
              {m.role === 'user' ? 'User: ' : 'AI: '}
              {m.content}
            </div>
          ))
        : null}

      <form onSubmit={handleSubmit}>
        <input
          className="fixed w-full max-w-md bottom-0 border border-gray-300 rounded mb-8 shadow-xl p-2"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  )
}

UseChatOptions

  • api?: string = '/api/chat' - The API endpoint that accepts a { messages: Message[] } object and returns a stream of tokens for the AI chat response. Defaults to /api/chat.
  • id?: string - A unique identifier for the chat. If not provided, a random one will be generated. When provided, useChat hooks with the same id will share state across components thanks to SWR.
  • initialInput?: string = '' - An optional string of initial prompt input.
  • initialMessages?: Message[] = [] - An optional array of initial chat messages.
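
For example, to share one chat's state between components and seed it with a greeting, you can pass options like these. A minimal sketch; the exact Message shape ({ id, role, content }) is assumed from the rendering code above:

// app/support-chat.tsx
'use client'

import { useChat } from '@vercel/ai-utils'

export default function SupportChat() {
  // Any component that mounts useChat with id 'support' shares this state
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat', // the Route Handler from the tutorial above
    id: 'support',
    initialMessages: [
      { id: '0', role: 'assistant', content: 'Hi! How can I help you today?' }
    ]
  })

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>{m.content}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  )
}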