Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Rejection sampling data generation pipeline with SelfImprovingCoT pipeline #1646

Merged
merged 25 commits into from
Feb 27, 2025

Conversation

JoyceXu02
Copy link
Collaborator

Description

Describe your changes in detail (optional if the linked issue already contains a detailed description of the changes).

Checklist

Go over all the following points, and put an x in all the boxes that apply.

  • I have read the CONTRIBUTION guide (required)
  • I have linked this PR to an issue using the Development section on the right sidebar or by adding Fixes #issue-number in the PR description (required)
  • I have checked if any dependencies need to be added or updated in pyproject.toml and poetry.lock
  • I have updated the tests accordingly (required for a bug fix or a new feature)
  • I have updated the documentation if needed:
  • I have added examples if this is a new feature
    Fixes [Feature Request] Rejection sampling data generation pipeline with SelfImprovingCoT pipeline #1504
    If you are unsure about any of these, don't hesitate to ask. We are here to help!

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@Wendong-Fan Wendong-Fan changed the title Added Rejection sampling data generation pipeline with SelfImprovingCoT pipeline feat: Rejection sampling data generation pipeline with SelfImprovingCoT pipeline Feb 24, 2025
Copy link
Member

@Wendong-Fan Wendong-Fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @JoyceXu02 , seems this PR also includes the change to cookbook, could you clean this PR only including the change to self_improving_cot.py file?

Copy link
Member

@Wendong-Fan Wendong-Fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @JoyceXu02 , left some comments below

Comment on lines 88 to 89
rejection_sampling: Optional[bool] = False,
rejection_sampling_n: int = 5,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can simpify the interface, if rejection_sampling_n is not None, then we enables the pipeline to generate multiple candidate traces

Comment on lines 519 to 521
for _i in range(self.rejection_sampling_n):
trace = self.generate_reasoning_trace(problem)
candidate_traces.append(trace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some models support n parameter that can generate multi output with one request, could we leverage this for better efficiency? refer: https://platform.openai.com/docs/api-reference/chat/create#chat-create-n


Args:
problem (str): The problem text for generating a reasoning trace.
max_attempts (int): The number of candidate traces to generate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no max_attempts passed to arg

first candidate if none qualify.
"""
candidate_traces = []
for _i in range(self.rejection_sampling_n):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.rejection_sampling_n should not be hardcoded to 5

candidate_traces.append(trace)

best_trace = None
best_avg_score = -1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to define initial value of best_avg_score >0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Wendong-Fan , will fix it.

MuggleJinx and others added 12 commits February 24, 2025 15:14
Co-authored-by: Wendong <w3ndong.fan@gmail.com>
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Xiaotian Jin <jinxiaotian_sal@outlook.com>
Co-authored-by: Wendong <w3ndong.fan@gmail.com>
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Wendong <w3ndong.fan@gmail.com>
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
…1627)

Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Wendong <w3ndong.fan@gmail.com>
Copy link
Member

@Wendong-Fan Wendong-Fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JoyceXu02 ! Left some comments below

Comment on lines 498 to 499
r"""
Generate multiple candidate reasoning traces for a problem and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docstring format

Suggested change
r"""
Generate multiple candidate reasoning traces for a problem and
r"""Generate multiple candidate reasoning traces for a problem and

Comment on lines 506 to 507
str: The best candidate trace that meets quality criteria, or the
first candidate if none qualify.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
str: The best candidate trace that meets quality criteria, or the
first candidate if none qualify.
str: The best candidate trace that meets quality criteria, or the
first candidate if none qualify.

Comment on lines 509 to 511
self.reason_agent.model_backend.model_config_dict['n'] = (
self.rejection_sampling_n
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not all models support n parameter, for those doesn't support n we still need to use loop to generate multiple content

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense. I will update on this soon.

best_trace = trace
best_avg_score = avg_score
if best_trace is None:
best_trace = candidate_traces[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we return the one with highest score even it didn't meet the threshold instead of hardcode to the first candidate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I will make a change on this one too.

Copy link
Member

@Wendong-Fan Wendong-Fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @JoyceXu02 !

@Wendong-Fan Wendong-Fan merged commit 4bb539b into master Feb 27, 2025
6 checks passed
@Wendong-Fan Wendong-Fan deleted the features/cot-rejection-sampling branch February 27, 2025 15:27
ZackZikaiXiao pushed a commit to ZackZikaiXiao/camel that referenced this pull request Mar 23, 2025
…oT pipeline (camel-ai#1646)

Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Xiaotian Jin <jinxiaotian_sal@outlook.com>
Co-authored-by: Wendong <w3ndong.fan@gmail.com>
Co-authored-by: Asher-hss <101127070+Asher-hss@users.noreply.github.com>
Co-authored-by: Zoe Yan <73959962+zoezyn@users.noreply.github.com>
Co-authored-by: Sarthak Bhardwaj <7sarthakbhardwaj@gmail.com>
Co-authored-by: Isaac Jin <whale3ye@gmail.com>
Co-authored-by: TTS <50868301+TOGOTOO@users.noreply.github.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Yifeng Wang(正经人王同学) <86822589+zjrwtx@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Feature Request] Rejection sampling data generation pipeline with SelfImprovingCoT pipeline
10 participants