@finitearth
Owner

No description provided.

timo282 and others added 30 commits October 3, 2024 23:02
* chore: add codeowners file

* chore: add python poetry action and docs workflow

* chore: update pre-commit file

* chore: update docs

* chore: update logo

* chore: add cicd pipeline for automated deployment

* chore: update poetry version

* chore: fix action versioning

* chore: add gitattributes to ignore line count in jupyter notebooks

* chore: add and update docstrings

* chore: fix end of files

* chore: update action versions

* Update README.md

---------

Co-authored-by: mo374z <schlager.mo@t-online.de>
* chore: fix workflow execution

* chore: fix version check in CICD pipeline
* update gitignore

* initial implementation of opro

* formatting of prompt template

* added opro test run

* opro refinements

* fixed sampling error

* add docs to opro

* fix pre commit issues#

* fix pre commit issues#

* fixed end of line
* fixed pre commit config and removed end of file line breaks in templates

* added /
* added prompt_creation.py

* change version
* Remove deepinfra file

* change langchain-community version
* renamed get_tasks to get_task and change functionality accordingly. moved templates and data_sets

* init

* move templates to templates.py

* Add nested asyncio to make it useable in notebooks

* Update README.md

* changed getting_started.ipynb and created helper functions

* added sampling of initial population

* fixed config

* fixed callbacks

* adjust runs

* fix run evaluation api token

* fix naming convention in opro, remove on epoch end for logger callback, fixed to allow for numeric values in class names

* Update promptolution/llms/api_llm.py

Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>

* fixed comments

* Update pyproject.toml

* resolve comments

---------

Co-authored-by: mo374z <schlager.mo@t-online.de>
Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>
Co-authored-by: Moritz Schlager <87517800+mo374z@users.noreply.github.com>
* implemented random selector

* added random search selector

* increased version count

* fix typos

* Update promptolution/predictors/base_predictor.py

Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>

* Update promptolution/tasks/classification_tasks.py

Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>

* resolve comments

* resolve comments

---------

Co-authored-by: Timo Heiß <87521684+timo282@users.noreply.github.com>
* Update release-notes.md

* Fix release note links
* Delete Experiment files

* Removed config necessities

* improved opro meta-prompts

* added read from data frame feature

* changed required python version to 3.9
* delete poetry.lock and upgrade transformers dependency

* Update release-notes.md
@finitearth finitearth requested a review from Copilot May 15, 2025 14:28
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request primarily refactors various components in the Promptolution codebase, along with adjusting testing configurations and introducing the new CAPO optimizer. Key changes include the removal of debug verbosity parameters in several test scripts, updates to the ClassificationTask and BaseTask implementations (including improvements to subsampling and attribute handling), and the addition of new CAPO optimizer functionality with associated templates.

Reviewed Changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| scripts/evoprompt_ga_test_gsm8k.py, scripts/evoprompt_ga_test.py, scripts/api_test.py | Removed debugging verbosity parameters from optimizer calls. |
| promptolution/utils/token_counter.py, promptolution/utils/test_statistics.py | Added utility functions for token counting and statistical tests. |
| promptolution/templates.py | Introduced new CAPO-related prompt templates. |
| promptolution/tasks/classification_tasks.py | Updated task initialization, subsampling parameters, and evaluation logic; removed redundant super().__init__ call. |
| promptolution/tasks/base_task.py, promptolution/tasks/__init__.py | Fixed typos in parameter names and updated task instantiation. |
| promptolution/predictors/base_predictor.py | Modified prompt-input concatenation to use element-wise pairing (zip) instead of cross product. |
| promptolution/optimizers/* | Adjusted evaluation and debug logging in several optimizers and added new CAPO optimizer implementation. |
| promptolution/llms/vllm.py, promptolution/helpers.py | Updated error messages, logging, and configuration usage. |
| promptolution/config.py | Modified attribute setting to track used attributes. |
| notebooks/getting_started.ipynb | Enhanced notebook cells with updated API usage and sample outputs. |
Comments suppressed due to low confidence (2)

promptolution/predictors/base_predictor.py:58

  • Changing the prompt concatenation from a cross-product to a zip pairing alters the behavior of the predictor. Please confirm that this element-wise pairing is intended and consistent with downstream usage.
inputs = [prompt + "\n" + x for prompt, x in zip(prompts, xs)]
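
To make the behavioral difference concrete, here is a minimal sketch comparing the two pairing strategies. The sample prompts and inputs are made up for illustration, and the cross-product variant is only an assumption about what the earlier behavior looked like; it is not taken from the repository.

```python
from itertools import product

# Illustrative data only; not from the repository.
prompts = ["Classify the sentiment:", "Is this positive or negative?"]
xs = ["I loved it.", "It was dull."]

# Cross product: every prompt is combined with every input,
# giving len(prompts) * len(xs) strings.
cross_inputs = [prompt + "\n" + x for prompt, x in product(prompts, xs)]

# Element-wise pairing: the i-th prompt is combined with the i-th input only,
# giving min(len(prompts), len(xs)) strings.
zip_inputs = [prompt + "\n" + x for prompt, x in zip(prompts, xs)]

# Under zip pairing, evaluating one prompt on all inputs requires
# repeating that prompt explicitly.
repeated = [prompts[0]] * len(xs)
single_prompt_inputs = [prompt + "\n" + x for prompt, x in zip(repeated, xs)]
```

Note that `zip` silently stops at the shorter sequence, so callers that pass lists of different lengths lose items without an error; an explicit length check before pairing may be worth considering.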

promptolution/config.py:26

  • The __setattr__ method uses 'self._used_attributes' without any visible initialization. Ensure that '_used_attributes' is properly initialized (e.g., in __init__) to avoid a potential AttributeError.
self._used_attributes.add(name)
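
One way to address this concern is sketched below. `TrackedConfig` is a hypothetical class written for illustration and is not the actual class in promptolution/config.py; the idea is simply to create the tracking set through `object.__setattr__` so it exists before the overridden `__setattr__` first runs.

```python
class TrackedConfig:
    """Hypothetical config that records which attribute names were set."""

    def __init__(self, **kwargs):
        # Bypass our own __setattr__ so the tracking set exists before first use.
        object.__setattr__(self, "_used_attributes", set())
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Record the attribute name, then store the value as usual.
        self._used_attributes.add(name)
        object.__setattr__(self, name, value)


config = TrackedConfig(n_steps=10)
config.seed = 42
```

Without the `object.__setattr__` call in `__init__`, the first assignment would reach `self._used_attributes.add(name)` before the set exists and raise an AttributeError, which is the failure mode the comment warns about.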

@finitearth finitearth marked this pull request as ready for review May 15, 2025 14:52
mo374z previously approved these changes May 16, 2025
Collaborator

@mo374z mo374z left a comment

In general, we should maybe think about how we handle prompts (it's a little weird to have them sometimes as objects and sometimes as strings), but I also don't know a better solution for this right now.

    raise ValueError("Block increment is only valid for block subsampling.")
self.block_idx += 1
if self.block_idx >= self.n_blocks:
    self.block_idx = 0
Collaborator

Starting again at the beginning? When does this make sense?

Owner Author

Example: when CAPO has finished one step of racing and enters the evolutionary loop once more, we want to start again from block 0.
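
The wrap-around discussed here can be sketched as follows. `BlockSampler` and its methods are hypothetical names used for illustration, not the task class from the repository; only the increment-and-reset logic mirrors the snippet under review.

```python
class BlockSampler:
    """Hypothetical stand-in for a task that evaluates on fixed-size blocks."""

    def __init__(self, data, block_size):
        self.data = list(data)
        self.block_size = block_size
        # Ceiling division so a final partial block is still counted.
        self.n_blocks = -(-len(self.data) // block_size)
        self.block_idx = 0

    def current_block(self):
        start = self.block_idx * self.block_size
        return self.data[start : start + self.block_size]

    def increment_block_idx(self):
        self.block_idx += 1
        # After the last block, wrap around so the next pass starts at block 0.
        if self.block_idx >= self.n_blocks:
            self.block_idx = 0


sampler = BlockSampler(range(5), block_size=2)
seen = []
for _ in range(4):
    seen.append(sampler.current_block())
    sampler.increment_block_idx()
```

With five items and a block size of two there are three blocks, so the fourth request returns the first block again. An explicit reset method would be an alternative to implicit wrap-around if callers should decide when a new pass begins.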

timo282 previously approved these changes May 17, 2025
logger = getLogger(__name__)


class CAPOPrompt:
Collaborator

Maybe in the future make this a general class: not necessarily specific to CAPO, but usable by any optimizer that also optimizes few-shot examples.

Owner Author

Yes, that will be sensible as soon as we have at least a second optimizer that does this. Otherwise it's a kinda wasteful generalization.

@finitearth finitearth changed the base branch from dev to main May 18, 2025 17:18
@finitearth finitearth dismissed stale reviews from timo282 and mo374z May 18, 2025 17:18

The base branch was changed.

@mo374z mo374z requested a review from timo282 May 18, 2025 17:19
@finitearth finitearth merged commit ad91732 into main May 18, 2025
2 checks passed
@finitearth finitearth deleted the feature/welcome_capo branch May 18, 2025 17:32

4 participants