-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prompt engineering - OpenAI API #663
Comments
Related issues#659: Prompt engineering: Split complex tasks into simpler subtasks - openai### DetailsSimilarity score: 0.89 - [ ] [Prompt engineering](https://platform.openai.com/docs/guides/prompt-engineering/strategy-split-complex-tasks-into-simpler-subtasks)Prompt engineeringDescription: Strategy: Split complex tasks into simpler subtasks Tactic: Use intent classification to identify the most relevant instructions for a user query For tasks in which lots of independent sets of instructions are needed to handle different cases, it can be beneficial to first classify the type of query and to use that classification to determine which instructions are needed. This can be achieved by defining fixed categories and hardcoding instructions that are relevant for handling tasks in a given category. This process can also be applied recursively to decompose a task into a sequence of stages. The advantage of this approach is that each query will contain only those instructions that are required to perform the next stage of a task which can result in lower error rates compared to using a single query to perform the whole task. This can also result in lower costs since larger prompts cost more to run (see pricing information). Suppose for example that for a customer service application, queries could be usefully classified as follows: SYSTEM Primary categories: Billing, Technical Support, Account Management, or General Inquiry. Billing secondary categories:
Technical Support secondary categories:
Account Management secondary categories:
General Inquiry secondary categories:
USER Based on the classification of the customer query, a set of more specific instructions can be provided to a model for it to handle next steps. For example, suppose the customer requires help with "troubleshooting". SYSTEM
USER Suggested labels{'label-name': 'task-decomposition', 'label-description': 'Strategy of breaking down complex tasks into simpler subtasks for efficient handling', 'confidence': 63.13}#314: Prompt Engineering Guide | Prompt Engineering Guide### DetailsSimilarity score: 0.87 - [ ] [Prompt Engineering Guide | Prompt Engineering Guide](https://www.promptingguide.ai/)Prompt Engineering Guide #369: "You are a helpful AI assistant" : r/LocalLLaMA### DetailsSimilarity score: 0.86 - [ ] ["You are a helpful AI assistant" : r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/18j59g1/you_are_a_helpful_ai_assistant/?share_id=g_M0-7C_zvS88BCd6M_sI&utm_content=1&utm_medium=android_app&utm_name=androidcss&utm_source=share&utm_term=1)"You are a helpful AI assistant" Discussion Don't say "don't": this confuses them, which makes sense when you understand how they "think". They do their best to string concepts together, but they simply generate the next word in the sequence from the context available. Saying "don't" will put everything following that word into the equation for the following words. This can cause it to use the words and concepts you're telling it not to. (system prompts) Here is some context for the conversation: (Paste in relevant info such as web pages, documentation, etc, as well as bits of the convo you want to keep in context. When you hit the context limit, you can restart the chat and continue with the same context). "You are a helpful AI assistant" : this is the demo system prompt to just get agreeable answers from any model. The issue with this is, once again, how they "think". The models can't conceptualize what is helpful beyond agreeing with and encouraging you. This kind of statement can lead to them making up data and concepts in order to agree with you. This is extra fun because you may not realize the problem until you discover for yourself the falacy of your own logic. Then pass the list to your assistant you intend to chat with with something like "you can confidently answer in these subjects that you are an expert in: (the list). The point of this ^ is to limit its responses to what it actually knows, but make it confidentially answer with the information it's sure about. This has been incredibly useful in my cases, but absolutely check their work. Suggested labels{ "key": "sparse-computation", "value": "Optimizing large language models using sparse computation techniques" }#399: openai-python api doc### DetailsSimilarity score: 0.86 - [ ] [openai-python/api.md at main · openai/openai-python](https://github.com/openai/openai-python/blob/main/api.md)Add error handling for failed API requestsIs this a bug or feature request? What is the current behavior? What is the expected behavior? What is the impact of this issue? Possible Solutions:
Steps to reproduce:
Additional context: Suggested labels{ "key": "ai-platform", "value": "Platforms and tools for implementing AI solutions" }#178: Changing the order of sentences in a prompt changes the quality of the output.### DetailsSimilarity score: 0.86 # Prompt Ordering Experiment: Impact on Linux Terminal Command OutputsOverviewThis experiment investigates how the ordering of sentences in a prompt affects the output quality when interacting with a language model designed to generate Linux terminal commands. The model is instructed to respond with valid commands for a Manjaro (Arch) Linux system, considering the latest information up to the knowledge cutoff in 2023. MethodologyThe same text was provided to the language model in different orders to observe the variation in the generated outputs. The primary task was to write a bash terminal command to check the local IP address. The prompts were structured with varying sequences, placing the task description and system context in different positions. ResultsThe following prompt-response pairs were generated during the experiment: Prompt 1
Response 1ip addr show | grep inet | awk '{print $2}' | grep -v '127.0.0.1' Prompt 2
Response 2ipconfig | grep "IPv4" | awk '{print $2}' Prompt 3
Response 3ip addr show | grep "inet " | grep -v 127.0.0.1 | awk '{print $2}' | cut -d '/' -f1 AnalysisThe experiment demonstrates that the ordering of sentences within the prompt can lead to different outcomes. Notably, Response 2 contains an incorrect command ( In contrast, when the system context was provided before the task description (Prompts 1 and 3), the model consistently generated appropriate commands for a Linux environment. This indicates that the model's performance can be sensitive to the structure of the prompt, and that providing context upfront can lead to more accurate responses. ConclusionThe ordering of information in a prompt can significantly affect the quality of the output from a language model. For tasks requiring specific contextual knowledge, such as generating Linux terminal commands, it is beneficial to provide the relevant context before the task description to guide the model towards the correct domain and improve the accuracy of its responses. Recommendations
#630: OpenRouter: Prompt Transforms### DetailsSimilarity score: 0.86 - [ ] [Docs | OpenRouter](https://openrouter.ai/docs#transforms)Docs | OpenRouterDescription: OpenRouter has a simple rule for choosing between sending a prompt and sending a list of ChatML messages: Choose messages if you want to have OpenRouter apply a recommended instruct template to your prompt, depending on which model serves your request. Available instruct modes include:
Choose prompt if you want to send a custom prompt to the model. This is useful if you want to use a custom instruct template or maintain full control over the prompt submitted to the model. To help with prompts that exceed the maximum context size of a model, OpenRouter supports a custom parameter called transforms: {
transforms: ["middle-out"], // Compress prompts > context size. This is the default for all models.
messages: [...], // "prompt" works as well
model // Works with any model
} The transforms param is an array of strings that tell OpenRouter to apply a series of transformations to the prompt before sending it to the model. Transformations are applied in-order. Available transforms are:
Note: All OpenRouter models default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body. Suggested labels{'label-name': 'prompt-transformations', 'label-description': 'Descriptions of transformations applied to prompts in OpenRouter for AI models', 'gh-repo': 'openrouter/ai-docs', 'confidence': 52.95} |
Prompt engineering - OpenAI API
Description:
Strategy: Test changes systematically
Sometimes it can be hard to tell whether a change — e.g., a new instruction or a new design — makes your system better or worse. Looking at a few examples may hint at which is better, but with small sample sizes it can be hard to distinguish between a true improvement or random luck. Maybe the change helps performance on some inputs, but hurts performance on others.
Evaluation procedures (or "evals") are useful for optimizing system designs. Good evals are:
Evaluation of outputs can be done by computers, humans, or a mix. Computers can automate evals with objective criteria (e.g., questions with single correct answers) as well as some subjective or fuzzy criteria, in which model outputs are evaluated by other model queries. OpenAI Evals is an open-source software framework that provides tools for creating automated evals.
Model-based evals can be useful when there exists a range of possible outputs that would be considered equally high in quality (e.g. for questions with long answers). The boundary between what can be realistically evaluated with a model-based eval and what requires a human to evaluate is fuzzy and is constantly shifting as models become more capable. We encourage experimentation to figure out how well model-based evals can work for your use case.
URL: OpenAI Prompt Engineering Guide
Suggested labels
{'label-name': 'Systematic Testing', 'label-description': 'Strategies for testing changes systematically to optimize system designs.', 'gh-repo': 'OpenAI-API', 'confidence': 63.29}
The text was updated successfully, but these errors were encountered: