OpenRouter: Prompt Transforms #630

irthomasthomas opened this issue Feb 27, 2024 · 1 comment
Labels
AI-Agents (Autonomous AI agents using LLMs) · ai-platform (model hosts and APIs) · Algorithms (Sorting, Learning or Classifying) · Automation (Automate the things) · base-model (llm base models not finetuned for chat) · chat-templates (llm prompt templates for chat models) · New-Label (existing labels insufficient to describe the content)


Docs | OpenRouter

Description:
Prompt Transforms

OpenRouter has a simple rule for choosing between sending a prompt and sending a list of ChatML messages:

Choose messages if you want to have OpenRouter apply a recommended instruct template to your prompt, depending on which model serves your request. Available instruct modes include:

  • alpaca: docs
  • llama2: docs
  • airoboros: docs

Choose prompt if you want to send a custom prompt to the model. This is useful if you want to use a custom instruct template or maintain full control over the prompt submitted to the model.
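The two request shapes can be sketched as plain request bodies. This is an illustrative sketch, not an official client snippet; the model name and the Alpaca-style template text are assumptions for the example:

```python
# Sketch of the two OpenRouter request shapes: "messages" vs "prompt".
# The model name and template wording below are illustrative assumptions.

# "messages" form: OpenRouter applies the model's recommended instruct template.
messages_body = {
    "model": "meta-llama/llama-2-13b-chat",  # example model name
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet."},
    ],
}

# "prompt" form: you control the template yourself (Alpaca-style shown here).
prompt_body = {
    "model": "meta-llama/llama-2-13b-chat",
    "prompt": (
        "Below is an instruction that describes a task.\n\n"
        "### Instruction:\nSummarize the plot of Hamlet.\n\n"
        "### Response:\n"
    ),
}
```

Either body is then POSTed to the chat completions endpoint; the only difference is whether OpenRouter or you applies the instruct template.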

To help with prompts that exceed the maximum context size of a model, OpenRouter supports a custom parameter called transforms:

```javascript
{
  transforms: ["middle-out"], // Compress prompts > context size. This is the default for all models.
  messages: [...], // "prompt" works as well
  model // Works with any model
}
```

The transforms param is an array of strings that tells OpenRouter to apply a series of transformations to the prompt before sending it to the model. Transformations are applied in order. Available transforms are:

  • middle-out: compress prompts and message chains to the context size. This helps users extend conversations in part because LLMs pay significantly less attention to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.

Note: All OpenRouter models default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body.
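As a minimal sketch of the default-vs-opt-out behavior, a helper that builds the request body might look like this (the model name is an illustrative assumption):

```python
def build_body(messages, model, transforms=None):
    """Assemble an OpenRouter-style request body.

    Leaving transforms=None omits the key, so the server-side default
    ("middle-out") stays in effect; passing [] disables compression.
    """
    body = {"model": model, "messages": messages}
    if transforms is not None:
        body["transforms"] = transforms
    return body

# Default: no "transforms" key, so middle-out applies.
default_body = build_body([{"role": "user", "content": "Hi"}], "openai/gpt-3.5-turbo")

# Explicit opt-out: transforms: [] in the request body.
no_compress = build_body([{"role": "user", "content": "Hi"}], "openai/gpt-3.5-turbo", transforms=[])
```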

More information

Suggested labels

{'label-name': 'prompt-transformations', 'label-description': 'Descriptions of transformations applied to prompts in OpenRouter for AI models', 'gh-repo': 'openrouter/ai-docs', 'confidence': 52.95}


irthomasthomas commented Feb 27, 2024

### Related issues

#369: "You are a helpful AI assistant" : r/LocalLLaMA

### Details

Similarity score: 0.88

- [ ] ["You are a helpful AI assistant" : r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/18j59g1/you_are_a_helpful_ai_assistant/?share_id=g_M0-7C_zvS88BCd6M_sI&utm_content=1&utm_medium=android_app&utm_name=androidcss&utm_source=share&utm_term=1)

"You are a helpful AI assistant"

Discussion
I've been stumbling around this sub for a while, testing all the small models and preaching the good word of the omnipotent OpenHermes. Here are some system prompt tips I've picked up:

Don't say "don't": this confuses them, which makes sense when you understand how they "think". They do their best to string concepts together, but they simply generate the next word in the sequence from the context available. Saying "don't" will put everything following that word into the equation for the following words. This can cause it to use the words and concepts you're telling it not to.
Alternative: try to use "Only" statements. Instead of "Don't talk about any other baseball team besides the New York Yankees" say "Only talk about the New York Yankees".
CAPITALIZING INSTRUCTIONS: For some reason, this works when used sparingly, it even makes some models pay attention to "don't". Surprisingly, this seems to work with even ChatGPT. It can quickly devolve your system prompt into confused yelling if you don't limit it, and can even cause your model to match the format and respond with confused yelling, so really only once or twice on important concepts.
\n: A well-formatted system prompt goes a long way. Splitting up different sections with a line break makes a noticeable improvement in comprehension of the system prompt by the model. For example, here is my format for LMStudio:
" Here is some information about the user: (My bio)

(system prompts)

Here is some context for the conversation: (Paste in relevant info such as web pages, documentation, etc, as well as bits of the convo you want to keep in context. When you hit the context limit, you can restart the chat and continue with the same context).

"You are a helpful AI assistant": this is the demo system prompt to just get agreeable answers from any model. The issue with this is, once again, how they "think". The models can't conceptualize what is helpful beyond agreeing with and encouraging you. This kind of statement can lead to them making up data and concepts in order to agree with you. This is extra fun because you may not realize the problem until you discover for yourself the fallacy of your own logic.
Think it through/Go over your work: This works, but I think it works because it directs attention to the prompt and response. Personally, I think there's better ways to do this.
Role assignment: telling it to act as this character or in that role is obviously necessary in some or even most instances, but this can also be limiting. It will act as that character, with all the limits and fallacies of that character. If your waifu can't code, neither will your AI.
Telling it to be confident: This is a great way to circumvent the above problem, but also runs the risk of confident hallucinations. Here's a 2 prompt trick I use:
Tell one assistant not to answer the user prompt, but to simply generate a list of facts, libraries, or research points from its own data that can be helpful to answering the prompt. The prompt will be answered by the same LLM, so write the list with that same LLM as the intended audience instead of a human.

Then pass the list to the assistant you intend to chat with, with something like "you can confidently answer in these subjects that you are an expert in: (the list)."

The point of this ^ is to limit its responses to what it actually knows, but make it answer confidently with the information it's sure about. This has been incredibly useful in my cases, but absolutely check their work.
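The two-pass trick above can be sketched as a small pipeline. `ask_model` here is a hypothetical stand-in for whatever completion call you actually use (it returns canned text so the sketch runs offline); the prompt wording follows the description above:

```python
def ask_model(prompt: str) -> str:
    # Hypothetical placeholder for a real LLM call; returns canned text here
    # so the sketch is self-contained.
    return "- Python stdlib\n- HTTP basics\n- JSON parsing"

def two_prompt_answer(user_prompt: str) -> str:
    # Pass 1: ask for a fact/expertise list, not an answer.
    fact_list = ask_model(
        "Do not answer the following prompt. Instead, list facts, libraries, "
        "or research points from your own data that would help answer it. "
        "The audience is the same LLM, not a human.\n\n" + user_prompt
    )
    # Pass 2: chat with a system prompt that scopes confidence to that list.
    system = (
        "You can confidently answer in these subjects that you are an expert in:\n"
        + fact_list
    )
    return ask_model(system + "\n\n" + user_prompt)

answer = two_prompt_answer("How do I fetch JSON from an API in Python?")
```

With a real model behind `ask_model`, the second pass is constrained to the subjects the first pass surfaced, which is the confidence-limiting effect the post describes.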

Suggested labels

{ "key": "sparse-computation", "value": "Optimizing large language models using sparse computation techniques" }

#418: openchat/openchat-3.5-1210 · Hugging Face

### Details

Similarity score: 0.88

- [ ] [openchat/openchat-3.5-1210 · Hugging Face](https://huggingface.co/openchat/openchat-3.5-1210#conversation-templates)

Using the OpenChat Model

We highly recommend installing the OpenChat package and using the OpenChat OpenAI-compatible API server for an optimal experience. The server is optimized for high-throughput deployment using vLLM and can run on a consumer GPU with 24GB RAM.

  • Installation Guide: Follow the installation guide in our repository.

  • Serving: Use the OpenChat OpenAI-compatible API server by running the serving command from the table below. To enable tensor parallelism, append --tensor-parallel-size N to the serving command.

    | Model | Size | Context | Weights | Serving |
    | --- | --- | --- | --- | --- |
    | OpenChat 3.5 1210 | 7B | 8192 |  | `python -m ochat.serving.openai_api_server --model openchat/openchat-3.5-1210 --engine-use-ray --worker-use-ray` |
  • API Usage: Once started, the server listens at localhost:18888 for requests and is compatible with the OpenAI ChatCompletion API specifications. Here's an example request:

    ```bash
    curl http://localhost:18888/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "openchat_3.5",
            "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
          }'
    ```
  • Web UI: Use the OpenChat Web UI for a user-friendly experience.

Online Deployment

If you want to deploy the server as an online service, use the following options:

  • --api-keys sk-KEY1 sk-KEY2 ... to specify allowed API keys
  • --disable-log-requests --disable-log-stats --log-file openchat.log for logging only to a file.

For security purposes, we recommend using an HTTPS gateway in front of the server.

Mathematical Reasoning Mode

The OpenChat model also supports mathematical reasoning mode. To use this mode, include condition: "Math Correct" in your request.

```bash
curl http://localhost:18888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openchat_3.5",
        "condition": "Math Correct",
        "messages": [{"role": "user", "content": "10.3 − 7988.8133 = "}]
      }'
```
Conversation Templates

We provide several pre-built conversation templates to help you get started.

  • Default Mode (GPT4 Correct):

    GPT4 Correct User: Hello<|end_of_turn|>
    GPT4 Correct Assistant: Hi<|end_of_turn|>
    GPT4 Correct User: How are you today?<|end_of_turn|>
    GPT4 Correct Assistant:
  • Mathematical Reasoning Mode:

    Math Correct User: 10.3 − 7988.8133=<|end_of_turn|>
    Math Correct Assistant:

    NOTE: Remember to set <|end_of_turn|> as end of generation token.

  • Integrated Tokenizer: The default (GPT4 Correct) template is also available as the integrated tokenizer.chat_template, which can be used instead of manually specifying the template.
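A minimal formatter for the default (GPT4 Correct) template might look like the sketch below. The direct concatenation of turns is an assumption from the examples shown; in practice, the integrated `tokenizer.chat_template` is the authoritative way to render these prompts:

```python
# Sketch of the GPT4 Correct conversation template; the exact turn joining
# is an assumption -- prefer tokenizer.chat_template in real use.
EOT = "<|end_of_turn|>"
ROLE = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}

def format_openchat(messages):
    """Render a message list in the GPT4 Correct template,
    ending with an open assistant turn for generation."""
    parts = [f"{ROLE[m['role']]}: {m['content']}{EOT}" for m in messages]
    parts.append("GPT4 Correct Assistant:")
    return "".join(parts)

prompt = format_openchat([
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi"},
    {"role": "user", "content": "How are you today?"},
])
```

Remember that `<|end_of_turn|>` must also be set as the end-of-generation token when sampling.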

Suggested labels

{ "label": "chat-templates", "description": "Pre-defined conversation structures for specific modes of interaction." }

#484: Docs | OpenRouter

### Details

Similarity score: 0.87

- [ ] [Docs | OpenRouter](https://openrouter.ai/docs#models)

Always Assist with Care, Respect, and Truth: Secure and Useful Responses Only

The future will bring us hundreds of language models and dozens of providers for each. How will you choose the best?

OpenRouter: Find the Lowest Price Across Dozens of Providers

  • Benefit from the race to the bottom.
  • OpenRouter finds the lowest price for each model across dozens of providers.
  • Users can also pay for their own models via OAuth PKCE.

Standardized API: No Need to Change Your Code

  • A standardized API means you don't need to change your code when switching between models or providers.
  • The best models will be used the most.

Evals are Flawed: Compare Models by Usage and Purpose

  • Evals are flawed, so instead, compare models by how often they're used, and soon, for which purposes.
  • Chat with multiple models at once in the Playground.

Four Key Principles to Ensure Fairness and Positivity

  1. Always assist with care, respect, and truth.
  2. Respond with utmost utility yet securely.
  3. Avoid harmful, unethical, prejudiced, or negative content.
  4. Ensure replies promote fairness and positivity.


Suggested labels

{ "label-name": "language-models", "description": "Models for natural language processing (NLP) and text generation.", "confidence": 95.97 }

#443: jcmccormick/runtts: Shell tool to prompt ollama and whisper cli programs

### Details

Similarity score: 0.87

- [ ] [jcmccormick/runtts: Shell tool to prompt ollama and whisper cli programs](https://github.com/jcmccormick/runtts)

RunTTS

POSIX-compliant shell utility for working with local LLM + voice models

Requirements

  • Whisper CLI (e.g. sudo apt-get install whisper)
  • Ollama CLI (e.g. sudo apt-get install ollama)
  • TTS CLI (e.g. sudo apt-get install tts)

To use your desired models with Ollama and TTS, configure them and note their model names as you would use with ollama run xxxx or tts --model_name xxxx. Upon running RunTTS for the first time, go into the configuration menu and enter in the model names you are using.

Install

  1. Clone the repo into ~/runtts.
  2. chmod +x ~/runtts/runtts
  3. ~/runtts/runtts to start the program.

About

This tool is a mashup of the whisper, tts, and ollama CLIs to provide a local utility for interacting with AI models. RunTTS keeps track of a running context as you continue prompting it, and when needed, conversations can be saved for later prompting.

Due to the variability of situations where models can be run, RunTTS uses streaming responses to produce audio clips as soon as newline-delimited content is ready. It also handles markdown-style triple-backtick blocks, setting them aside so they are not read aloud, though they can still be viewed as received.

Suggested labels

{ "label-name": "AI-tools", "description": "Tools for working with AI models in a local environment", "confidence": 84.96 }

#485: Docs | OpenRouter

### Details

Similarity score: 0.86

- [ ] [Docs | OpenRouter](https://openrouter.ai/docs#models)

Title: Docs | OpenRouter

Description: The future will bring us hundreds of language models and dozens of providers for each. How will you choose the best?

Benefit from the race to the bottom. OpenRouter finds the lowest price for each model across dozens of providers. You can also let users pay for their own models via OAuth PKCE.

Standardized API. No need to change your code when switching between models or providers.

The best models will be used the most. Evals are flawed. Instead, compare models by how often they're used, and soon, for which purposes. Chat with multiple at once in the Playground.

URL: https://openrouter.ai/docs#models

Key Features

  • Lowest Price Guarantee: OpenRouter finds the lowest price for each model across dozens of providers.
  • Standardized API: No need to change your code when switching between models or providers.
  • Usage-Based Comparison: Compare models by how often they're used, and soon, for which purposes.
  • User-Paid Models: Allow users to pay for their own models via OAuth PKCE.
  • Playground: Chat with multiple models at once in the Playground.

OpenRouter is the future of language model selection and usage. Benefit from a wide range of models and providers, while ensuring the best models are used the most.

Suggested labels

{ "label-name": "language-models", "description": "Information about language models and providers", "repo": "openrouter.ai", "confidence": 96.2 }
