Conversation


@safooray safooray commented Jul 8, 2025

1- Adds docstrings where missing, and improves consistency of existing ones (Google style)
2- Introduces a Eureka pipeline config for adding docstrings to a set of python files.

Safoora Yousefi added 30 commits December 11, 2024 06:46
@safooray safooray requested a review from nushib July 9, 2025 04:48
try:
# Check if string is a tuple representation and has more than one element
if s[0] == '(' and s[-1] == ')' and len(sl) > 1:
# Evaluate each element using latex2sympy and round the result to 2 decimal places
Contributor

why is this removed?
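For context, the tuple-handling snippet above could look roughly like this as a self-contained sketch. Here plain `float()` stands in for latex2sympy so the example runs on its own, and the helper name `round_tuple_string` is made up:

```python
def round_tuple_string(s: str) -> str:
    """If s is a tuple representation such as "(1.2345, 6.789)", round
    each element to 2 decimal places; otherwise return s unchanged.
    float() stands in for latex2sympy to keep the sketch self-contained."""
    try:
        # Check if string is a tuple representation with more than one element
        if s[0] == "(" and s[-1] == ")":
            parts = s[1:-1].split(",")
            if len(parts) > 1:
                return "(" + ", ".join(str(round(float(p), 2)) for p in parts) + ")"
    except (ValueError, IndexError):
        # Not a parseable tuple of numbers; fall through unchanged
        pass
    return s
```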

if len(string.split("\\approx")) == 2:
string = string.split("\\approx")[-1]

# Fix sqrt values not wrapped in curly braces. Note: The function _fix_sqrt is not provided.
Contributor

why are these removed?
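The "\\approx" branch in the diff above keeps only the numeric approximation on the right-hand side of the LaTeX symbol. As a runnable sketch (the function name and the trailing strip are additions for illustration):

```python
def strip_approx(string: str) -> str:
    """Keep only the right-hand side of a "\\approx", so an answer like
    "\\pi \\approx 3.14" reduces to its numeric approximation."""
    if len(string.split("\\approx")) == 2:
        string = string.split("\\approx")[-1]
    return string.strip()
```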

except:
ans = s

# If there's a closing bracket without an opening bracket before it, consider everything before it.
Contributor

why is the comment here removed?
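The removed comment described a small repair step. A hypothetical helper reconstructing that behavior, assuming ')' is the closing bracket in question:

```python
def trim_unmatched_close(s: str) -> str:
    """If a ')' appears with no '(' anywhere before it, keep only the
    text before that ')'. (Hypothetical name; sketches the behavior the
    removed comment described.)"""
    close = s.find(")")
    if close != -1 and "(" not in s[:close]:
        return s[:close]
    return s
```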

def transform(self, df: pd.DataFrame) -> pd.DataFrame:
"""
        Some models (e.g., GPT-3.5) sometimes do not follow instructions and omit the "Output:\n" marker from their format.
        add_output_marker prepends an "Output:\n" string to the model output prior to parsing if it detects a list start, i.e. "1."
Contributor

removed comment
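The core of a transform like the one documented above might look as follows on a single output string; the helper name and the exact list-start check are assumptions, and the real method applies this per row of a DataFrame:

```python
import re

def ensure_output_marker(text: str) -> str:
    """Prepend "Output:\n" when the model skipped the marker and began
    directly with a numbered list such as "1. ..."."""
    if "Output:" not in text and re.match(r"\s*1\.", text):
        return "Output:\n" + text
    return text
```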

if last_open == -1:
return None
sliced = model_output[last_open:]
# Grab everything between <final_answer> and </final_answer>
Contributor

removed comment
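A regex alternative to the index-slicing in the diff above, shown as a hedged sketch (the helper name is made up):

```python
import re

def extract_final_answer(model_output: str):
    """Grab everything between <final_answer> and </final_answer>,
    returning None when the tags are missing."""
    match = re.search(r"<final_answer>(.*?)</final_answer>", model_output, re.DOTALL)
    return match.group(1).strip() if match else None
```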

if title == "Justification":

if title == "Justification".lower():
Contributor

seems like an ok change but flagging anyways
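Worth noting on the flagged change: `"Justification".lower()` is just the constant `"justification"`, so the new comparison only matches lowercase titles. A fully case-insensitive check lowercases the variable instead:

```python
def is_justification(title: str) -> bool:
    """Case-insensitive title check: lower the variable, not the
    literal, so "Justification" and "JUSTIFICATION" both match."""
    return title.lower() == "justification"
```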

Returns:
int: The token usage for the row.
int or float: The extracted token count if present, otherwise NaN.
Contributor

it cannot be float, right?
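One likely answer to the question: `NaN` is itself a float, so any function that can return a missing count as NaN has return type "int or float". A hypothetical helper illustrating the docstring under review (the usage-dict shape and key are assumptions):

```python
def extract_token_count(usage: dict):
    """Return the token count if present, otherwise NaN. Because NaN is
    a float, the annotated return type is "int or float"."""
    if "total_tokens" in usage:
        return usage["total_tokens"]
    return float("nan")
```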

@@ -1,359 +1,509 @@
# This file was authored by BenchAgents authors and is being reused under the MIT license.
Contributor

this is a bit hard to review because the whole code was removed and added again

results_strict = test_instruction_following_strict(
row["prompt"], row["response"], row["instruction_id_list"], row["kwargs"]
)
results_loose = {}
Contributor

minor edit, previously not here
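For readers unfamiliar with the strict/loose split above: loose evaluation typically retries the strict check on relaxed variants of the response. A sketch of the usual IFEval-style approach; the real variant list and signatures may differ:

```python
def check_loose(response: str, check) -> bool:
    """Retry a strict check on relaxed variants of the response:
    markdown emphasis stripped, first or last line dropped."""
    lines = response.split("\n")
    variants = [
        response,
        response.replace("*", ""),
        "\n".join(lines[1:]).strip(),
        "\n".join(lines[:-1]).strip(),
    ]
    return any(check(v) for v in variants)
```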

"""
# Pattern for numbers with commas
pattern_commas = r"-?\b\d{1,3}(?:,\d{3})+\b"
# Pattern for scientific notation
Contributor

removed comment, several of these in this function
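The comma pattern appears in the diff; the scientific-notation pattern is elided. A self-contained sketch with a plausible stand-in for the missing pattern (the actual regex may differ):

```python
import re

# Pattern for numbers with thousands separators, e.g. "1,234,567"
pattern_commas = r"-?\b\d{1,3}(?:,\d{3})+\b"
# A plausible pattern for scientific notation, e.g. "1.5e-3"
# (the real pattern is elided from the diff)
pattern_scientific = r"-?\d+(?:\.\d+)?[eE][+-]?\d+"

text = "counted 1,234,567 items decaying at 1.5e-3 per step"
print(re.findall(pattern_commas, text))      # ['1,234,567']
print(re.findall(pattern_scientific, text))  # ['1.5e-3']
```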

if answer == pred_i:
correct = True
break
else: # gold_i is a string
Contributor

removed comment
