Feature/welcome capo #45
Conversation
* chore: add codeowners file
* chore: add python poetry action and docs workflow
* chore: update pre-commit file
* chore: update docs
* chore: update logo
* chore: add cicd pipeline for automated deployment
* chore: update poetry version
* chore: fix action versioning
* chore: add gitattributes to ignore line count in jupyter notebooks
* chore: add and update docstrings
* chore: fix end of files
* chore: update action versions
* Update README.md

Co-authored-by: mo374z <schlager.mo@t-online.de>

* chore: fix workflow execution
* chore: fix version check in CICD pipeline
* update gitignore
* initial implementation of opro
* formatting of prompt template
* added opro test run
* opro refinements
* fixed sampling error
* add docs to opro
* fix pre-commit issues
* fix pre-commit issues
* fixed end of line
* fixed pre-commit config and removed end-of-file line breaks in templates
* added /
* added prompt_creation.py
* change version
* Remove deepinfra file
* change langchain-community version
* renamed get_tasks to get_task and changed functionality accordingly; moved templates and data_sets
* init
* move templates to templates.py
* Add nested asyncio to make it usable in notebooks
* Update README.md
* changed getting_started.ipynb and created helper functions
* added sampling of initial population
* fixed config
* fixed callbacks
* adjust runs
* fix run evaluation api token
* fix naming convention in opro, remove on-epoch-end for logger callback, fixed to allow for numeric values in class names
* Update promptolution/llms/api_llm.py (Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>)
* fixed comments
* Update pyproject.toml
* resolve comments

Co-authored-by: mo374z <schlager.mo@t-online.de>
Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>
Co-authored-by: Moritz Schlager <87517800+mo374z@users.noreply.github.com>

* implemented random selector
* added random search selector
* increased version count
* fix typos
* Update promptolution/predictors/base_predictor.py (Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>)
* Update promptolution/tasks/classification_tasks.py (Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>)
* resolve comments
* resolve comments

Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>

* Update release-notes.md
* Fix release note links
This reverts commit e23dd74.
* Delete Experiment files
* Removed config necessities
* improved opro meta-prompts
* added read-from-data-frame feature
* changed required python version to 3.9
* delete poetry.lock and upgrade transformers dependency
* Update release-notes.md
Pull Request Overview
This pull request primarily refactors various components in the Promptolution codebase, adjusts testing configurations, and introduces the new CAPO optimizer. Key changes include the removal of debug verbosity parameters in several test scripts, updates to the ClassificationTask and BaseTask implementations (including improvements to subsampling and attribute handling), and the addition of the new CAPO optimizer with its associated templates.
Reviewed Changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| scripts/evoprompt_ga_test_gsm8k.py, scripts/evoprompt_ga_test.py, scripts/api_test.py | Removed debugging verbosity parameters from optimizer calls. |
| promptolution/utils/token_counter.py, promptolution/utils/test_statistics.py | Added utility functions for token counting and statistical tests. |
| promptolution/templates.py | Introduced new CAPO-related prompt templates. |
| promptolution/tasks/classification_tasks.py | Updated task initialization, subsampling parameters, and evaluation logic; removed redundant `super().__init__()` call. |
| promptolution/tasks/base_task.py, promptolution/tasks/__init__.py | Fixed typos in parameter names and updated task instantiation. |
| promptolution/predictors/base_predictor.py | Modified prompt-input concatenation to use element-wise pairing (zip) instead of cross product. |
| promptolution/optimizers/* | Adjusted evaluation and debug logging in several optimizers and added new CAPO optimizer implementation. |
| promptolution/llms/vllm.py, promptolution/helpers.py | Updated error messages, logging, and configuration usage. |
| promptolution/config.py | Modified attribute setting to track used attributes. |
| notebooks/getting_started.ipynb | Enhanced notebook cells with updated API usage and sample outputs. |
Comments suppressed due to low confidence (2)
promptolution/predictors/base_predictor.py:58
- Changing the prompt concatenation from a cross-product to a zip pairing alters the behavior of the predictor. Please confirm that this element-wise pairing is intended and consistent with downstream usage.
inputs = [prompt + "\n" + x for prompt, x in zip(prompts, xs)]
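To illustrate the behavioral change the reviewer flags, here is a minimal sketch contrasting the two pairing strategies; `prompts` and `xs` are placeholder data, not the project's actual test inputs:

```python
prompts = ["P1", "P2"]
xs = ["a", "b"]

# New behavior: element-wise pairing, one input per prompt.
zipped = [p + "\n" + x for p, x in zip(prompts, xs)]
# Old behavior: cross product, every input paired with every prompt.
crossed = [p + "\n" + x for p in prompts for x in xs]

print(zipped)   # 2 inputs
print(crossed)  # 4 inputs
```

With `zip`, the number of model calls scales with `len(prompts)` rather than `len(prompts) * len(xs)`, so downstream evaluation code that expected one score per (prompt, input) pair must be adapted.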
promptolution/config.py:26
- The `__setattr__` method uses `self._used_attributes` without any visible initialization. Ensure that `_used_attributes` is properly initialized (e.g., in `__init__`) to avoid a potential AttributeError.
self._used_attributes.add(name)
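One way to address this, sketched here on an assumed minimal `Config` class (not the actual promptolution implementation), is to initialize the tracking set with `object.__setattr__` so the overridden `__setattr__` is bypassed during construction:

```python
class Config:
    def __init__(self):
        # Bypass the overridden __setattr__ so the tracking set exists
        # before any tracked attribute is assigned.
        object.__setattr__(self, "_used_attributes", set())

    def __setattr__(self, name, value):
        super().__setattr__(name, value)
        # Track only public attributes; internals stay out of the set.
        if not name.startswith("_"):
            self._used_attributes.add(name)


cfg = Config()
cfg.model = "gpt-4"
print(cfg._used_attributes)  # {'model'}
```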
mo374z left a comment
In general, we should maybe think about how we handle prompts (it's a little weird to have them sometimes as objects and sometimes as strings), but I also don't know a better solution for this right now.
    raise ValueError("Block increment is only valid for block subsampling.")
self.block_idx += 1
if self.block_idx >= self.n_blocks:
    self.block_idx = 0
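The wraparound under discussion can be sketched in isolation; the class below is illustrative, with assumed names mirroring the snippet rather than the real task implementation:

```python
class BlockSampler:
    """Cycles through a fixed number of data blocks, wrapping to 0."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.block_idx = 0

    def increment_block(self):
        self.block_idx += 1
        if self.block_idx >= self.n_blocks:
            # Wrap around so the next optimization round restarts at block 0.
            self.block_idx = 0


sampler = BlockSampler(n_blocks=3)
for _ in range(4):
    sampler.increment_block()
print(sampler.block_idx)  # 1 (0 -> 1 -> 2 -> 0 -> 1)
```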
Starting again at the beginning? When does this make sense?
Example: when CAPO finishes one step of racing and enters the evolutionary loop again, we want to start over from block 0.
logger = getLogger(__name__)


class CAPOPrompt:
Maybe for the future, make this a general class: not necessarily specific to CAPO, but for any optimizer that also optimizes few-shot examples.
Yes, that will be sensible as soon as we have at least a second optimizer that does this. Otherwise it's a somewhat wasteful generalization.
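The generalization floated in this thread could look roughly like the sketch below: a prompt object carrying an instruction plus optional few-shot examples, usable by any optimizer. The class and method names are assumptions for illustration, not promptolution's actual API:

```python
class FewShotPrompt:
    """A prompt with an instruction and optional few-shot examples."""

    def __init__(self, instruction, examples=None):
        self.instruction = instruction
        self.examples = list(examples or [])

    def construct_prompt(self):
        # Render the instruction followed by each few-shot example,
        # separated by blank lines.
        return "\n\n".join([self.instruction] + self.examples)


p = FewShotPrompt(
    "Classify the sentiment of the input.",
    ["Input: great movie! Output: positive"],
)
print(p.construct_prompt())
```

A CAPO-specific subclass could then add only what CAPO needs, keeping the few-shot handling shared across future optimizers.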
The base branch was changed.
No description provided.