-
Coders:
- read_data(data, prompt) -> (cleaned_data, used_code)
- improvement(data, improve_prompt, used_prompt, used_code) -> (cleaned_data, result_code)
-
Reviwer:
- evaluation(cleaned_data, used_code, score) -> new_prompt
- make_report(report_prompt, prompt_1, score_1, prompt_2, score_2) -> report
-
Judge:
- evaluation(report) -> (score)
-
Enviorment:
-
gen_codes(data, prompt_1, prompt_2) -> (cleaned_data_1, code_1, cleaned_data_2, code_2)
-
score(cleaned_data) -> score
-
make_evaluation(cleaned_data, used_code, score) -> new_prompt
-
train_code(initial_prompt_c1, initial_prompt_c2, initial_prompt_r, n_iterations) -> (report, score_loss)
-
train( initial_prompt_c1, initial_prompt_c2, initial_prompt_r, report_prompt, n_iterations_code, n_iterations_report) -> (report, score_report, score_loss_coders)
-
-
Notifications
You must be signed in to change notification settings - Fork 0
License
scrocha/trabalho-reinforcement-learning
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
About
No description or website provided.