Releases · microsoft/debug-gym

11 Aug 18:37

matheper

1.1.0

09a272a

1.1.0 Latest

Latest

What's Changed

arxiv url by @xingdi-eric-yuan in #106
Readme install from pypi by @matheper in #108
Change credential order in AzureOpenAILLM for proper token retrieval by @matheper in #110
AzureOpenAI Chained Credential by @matheper in #111
Function Calling Syntax by @xingdi-eric-yuan in #109
Update human class to use tool calls by @matheper in #114
make sure the working dir is always in sys path by @xingdi-eric-yuan in #115
Validate human input by @matheper in #116
Bugfix human test by @matheper in #118
Disable strict mode by @xingdi-eric-yuan in #117
Fix env.rewrite_counter by @matheper in #119
Fixed EvalTool being called with extra kwargs when reacting to events by @matheper in #120
Add logs viewer for Froggy by @MarcCote in #95
Fix: Use json.dumps for tool arguments in OpenAILLM by @ShiZhengyan in #123
Tools observations by @matheper in #121
Fix 'utf-8' codec error with surrogate pairs in Unicode strings by @Copilot in #129
Add max_retries parameter to Human class to limit terminal read attempts by @Copilot in #126
Improve test coverage report and default pytest configs by @matheper in #98
Clean up pytest.ini, partially reverting #98 by @matheper in #132
Remove current file by @matheper in #127
Add start and end args to ViewTool by @matheper in #133
Minor Fixes by @xingdi-eric-yuan in #124
Fix kwargs by @xingdi-eric-yuan in #135
Fix kwargs for pdb tool by @xingdi-eric-yuan in #136
Fix type annotation: tool_call_list should be list not dict by @ShiZhengyan in #134
Enhance Agent logging to include step number and reason for termination by @matheper in #137
Improve visualization by @xingdi-eric-yuan in #138
Pdb current frame file by @matheper in #139
Pdb breakpoint handling by @matheper in #140
Refactor llm_api into debug_gym.llms subpackage by @MarcCote in #142
Resolve absolute path from RepoEnv by @matheper in #144
Use better command completion for Human Mode by @MarcCote in #143
Fix: resolve_path and is_editable to account for ignored and read-only files by @matheper in #145
Ignore files from .gitignore by @matheper in #146
Fix Aider ignore patterns and add tests for path resolution and ignored/read-only files by @matheper in #147
Fix issue resolving env.working_dir by @matheper in #148
Set default RepoEnv.dir_tree_depth to 1 by @matheper in #150
replace unescape by filtering non-utf8 chars in system prompts by @xingdi-eric-yuan in #151
Adding SWE-Smith support by @MarcCote in #122
Only load image for instance_id we want to tests by @MarcCote in #154
No eval shortcut by @matheper in #152
Parallel execution by @matheper in #153
Trajectory Filtering by @xingdi-eric-yuan in #141
For SWE-Smith, add a new test split distinct from train-789 by @MarcCote in #156
Fix ViewTool handling empty files by @matheper in #157
Improve Retry by @xingdi-eric-yuan in #158
Support pickling PDBTool instances by @threewisemonkeys-as in #166
Resolve path mismatch issue raised on macOS by @dkokkotas in #159
Context change for SFT by @xingdi-eric-yuan in #162
Fix PDB indentation mismatch in list output context by @Copilot in #161
Fix init obs by @xingdi-eric-yuan in #171
Integrating thinking by @xingdi-eric-yuan in #172
Show pytest traceback for test failures. by @MarcCote in #173
Rich logger by @matheper in #170
Agents rich progress by @matheper in #174
A set of fixes by @xingdi-eric-yuan in #175
Add get_problem_ids (formerly get_dataset_split) to all benchmark env by @MarcCote in #176
Pin swe-smith version by @matheper in #181
Change to when tool call is on auto parsing by @icwhite in #182
Add memory limit to Docker containers by @matheper in #183
Disable rich live in human mode by @matheper in #180
Pin dataset revision for SWE-Smith dataset by @MarcCote in #188
Fix progress skip by @matheper in #186
Agent timeout by @matheper in #189
Always use debug level when logging to file by @matheper in #191
Add logfile to progress tracker. Stop pending tasks' spinner. by @MarcCote in #190
PyPI release action by @matheper in #192
Use hasattr instead of check dict to support subclassed methods. by @MarcCote in #194
Fix release pipeline by @matheper in #195
Dump experiment info by @matheper in #193
Avoid copying pycache in the temp workding dir. by @MarcCote in #196
Use the main logger when checking if LLM config is correct by @MarcCote in #199
Fix get_problem_ids to respect instance_ids parameter in SWE environments by @Copilot in #179
Improve gather_result, remove filter, fix inf loop in truncation by @xingdi-eric-yuan in #167
Add back the message and LLM response to log by @xingdi-eric-yuan in #200
Dump problem status by @matheper in #197
Fix log none plus tool call by @sordonia in #202
Fix Skip by @xingdi-eric-yuan in #211
Refac reasoning content by @xingdi-eric-yuan in #207
Set reset_prompt_history_after_rewrite to be False by default by @xingdi-eric-yuan in #215
Bump version 1.1.0 by @matheper in #214

New Contributors

@ShiZhengyan made their first contribution in #123
@Copilot made their first contribution in #129
@threewisemonkeys-as made their first contribution in #166
@dkokkotas made their first contribution in #159
@icwhite made their first contribution in #182

Full Changelog: 1.0.0...1.1.0

Contributors

MarcCote, matheper, and 6 other contributors

Assets 2

18 Jul 18:58

MarcCote

1.1.0rc1

db98ba6

1.1.0rc1 Pre-release

Pre-release

What's Changed

arxiv url by @xingdi-eric-yuan in #106
Readme install from pypi by @matheper in #108
Change credential order in AzureOpenAILLM for proper token retrieval by @matheper in #110
AzureOpenAI Chained Credential by @matheper in #111
Function Calling Syntax by @xingdi-eric-yuan in #109
Update human class to use tool calls by @matheper in #114
make sure the working dir is always in sys path by @xingdi-eric-yuan in #115
Validate human input by @matheper in #116
Bugfix human test by @matheper in #118
Disable strict mode by @xingdi-eric-yuan in #117
Fix env.rewrite_counter by @matheper in #119
Fixed EvalTool being called with extra kwargs when reacting to events by @matheper in #120
Add logs viewer for Froggy by @MarcCote in #95
Fix: Use json.dumps for tool arguments in OpenAILLM by @ShiZhengyan in #123
Tools observations by @matheper in #121
Fix 'utf-8' codec error with surrogate pairs in Unicode strings by @Copilot in #129
Add max_retries parameter to Human class to limit terminal read attempts by @Copilot in #126
Improve test coverage report and default pytest configs by @matheper in #98
Clean up pytest.ini, partially reverting #98 by @matheper in #132
Remove current file by @matheper in #127
Add start and end args to ViewTool by @matheper in #133
Minor Fixes by @xingdi-eric-yuan in #124
Fix kwargs by @xingdi-eric-yuan in #135
Fix kwargs for pdb tool by @xingdi-eric-yuan in #136
Fix type annotation: tool_call_list should be list not dict by @ShiZhengyan in #134
Enhance Agent logging to include step number and reason for termination by @matheper in #137
Improve visualization by @xingdi-eric-yuan in #138
Pdb current frame file by @matheper in #139
Pdb breakpoint handling by @matheper in #140
Refactor llm_api into debug_gym.llms subpackage by @MarcCote in #142
Resolve absolute path from RepoEnv by @matheper in #144
Use better command completion for Human Mode by @MarcCote in #143
Fix: resolve_path and is_editable to account for ignored and read-only files by @matheper in #145
Ignore files from .gitignore by @matheper in #146
Fix Aider ignore patterns and add tests for path resolution and ignored/read-only files by @matheper in #147
Fix issue resolving env.working_dir by @matheper in #148
Set default RepoEnv.dir_tree_depth to 1 by @matheper in #150
replace unescape by filtering non-utf8 chars in system prompts by @xingdi-eric-yuan in #151
Adding SWE-Smith support by @MarcCote in #122
Only load image for instance_id we want to tests by @MarcCote in #154
No eval shortcut by @matheper in #152
Parallel execution by @matheper in #153
Trajectory Filtering by @xingdi-eric-yuan in #141
For SWE-Smith, add a new test split distinct from train-789 by @MarcCote in #156
Fix ViewTool handling empty files by @matheper in #157
Improve Retry by @xingdi-eric-yuan in #158
Support pickling PDBTool instances by @threewisemonkeys-as in #166
Resolve path mismatch issue raised on macOS by @dkokkotas in #159
Context change for SFT by @xingdi-eric-yuan in #162
Fix PDB indentation mismatch in list output context by @Copilot in #161
Fix init obs by @xingdi-eric-yuan in #171
Integrating thinking by @xingdi-eric-yuan in #172
Show pytest traceback for test failures. by @MarcCote in #173
Rich logger by @matheper in #170
Agents rich progress by @matheper in #174
A set of fixes by @xingdi-eric-yuan in #175
Add get_problem_ids (formerly get_dataset_split) to all benchmark env by @MarcCote in #176
Pin swe-smith version by @matheper in #181
Change to when tool call is on auto parsing by @icwhite in #182
Add memory limit to Docker containers by @matheper in #183
Disable rich live in human mode by @matheper in #180
Pin dataset revision for SWE-Smith dataset by @MarcCote in #188
Fix progress skip by @matheper in #186
Agent timeout by @matheper in #189
Always use debug level when logging to file by @matheper in #191
Add logfile to progress tracker. Stop pending tasks' spinner. by @MarcCote in #190
PyPI release action by @matheper in #192
Use hasattr instead of check dict to support subclassed methods. by @MarcCote in #194
Fix release pipeline by @matheper in #195
Dump experiment info by @matheper in #193
Avoid copying pycache in the temp workding dir. by @MarcCote in #196

New Contributors

@ShiZhengyan made their first contribution in #123
@Copilot made their first contribution in #129
@threewisemonkeys-as made their first contribution in #166
@dkokkotas made their first contribution in #159
@icwhite made their first contribution in #182

Full Changelog: 1.0.0...1.1.0

Contributors

MarcCote, matheper, and 5 other contributors

Assets 2

27 Mar 20:49

MarcCote

1.0.0

ca41efe

1.0.0

What's Changed

New reasoning tool by @mormio in #1
SubstitutionPatcher with file path by @xingdi-eric-yuan in #2
update readme by @xingdi-eric-yuan in #3
Tokenizer fix by @xingdi-eric-yuan in #4
catch exceptions in max score compute by @xingdi-eric-yuan in #5
default max score 1, in case the test crashes by @xingdi-eric-yuan in #6
tests github actions by @matheper in #8
Mm/froggyignore by @mormio in #9
toolbox by @mormio in #10
Fix Context Truncation by @xingdi-eric-yuan in #7
Test Utils by @xingdi-eric-yuan in #12
Unit Tests by @chisingh in #11
Reset after Rewrite by @xingdi-eric-yuan in #14
Comparisons between baseline(simple rewrite) and agent(zero-shot pdb) by @Kim-Minseon in #16
zero shot agent that can access pdb after certain number of rewrite a… by @mormio in #15
Terminal refac by @matheper in #13
Terminal run bugfix by @matheper in #18
pass entrypoint properly for aider by @sordonia in #19
clean llm configs loading by @sordonia in #20
support for threads in run by @sordonia in #21
Test tools by @chisingh in #17
add terminal to pdb when we re-add it by @mormio in #27
Az login oai by @matheper in #32
bug fix: 0 as line number by @xingdi-eric-yuan in #34
Use logging instead of print. Add nice progress visualization. by @MarcCote in #35
Clean Tools by @xingdi-eric-yuan in #36
SWE-Bench integration (with caching) by @MarcCote in #28
Refac tests by @matheper in #39
Add .froggyignore and .froggyreadonly by @MarcCote in #40
Add an optional depth argument to listdir. Fix tests by @MarcCote in #41
Auth with ManagedIdentityCredential by @matheper in #43
Set az and oai logging level to WARNING by @matheper in #45
Mini nightmares by @xingdi-eric-yuan in #42
Gist url by @chisingh in #46
Moving Agents Out of Froggy by @xingdi-eric-yuan in #44
Remove rich, put back tqdm with proper handling for logging. llm mess… by @MarcCote in #47
Delete random by @matheper in #48
Use the proper NOT_GIVEN value when calling openai API by @MarcCote in #49
Add Dataclasses and RepoEnv Info refac by @matheper in #50
Mini nightmares env. by @xingdi-eric-yuan in #54
fix view by @xingdi-eric-yuan in #55
Improve mini nightmare by @xingdi-eric-yuan in #56
Improve mini nightmare by @xingdi-eric-yuan in #57
Update terminal.py by @sordonia in #60
Lite update tool syntax parser by @xingdi-eric-yuan in #62
Fix PDB hanging by @MarcCote in #63
Ranaming code. by @xingdi-eric-yuan in #64
Remove ANSI codes when logging to file. by @MarcCote in #65
EventHooks - Tools refac by @matheper in #58
Skip swe-bench testing PRs by @matheper in #68
Clean Reset Args by @xingdi-eric-yuan in #67
froggy/agents, froggy/pond, tests/ by @sordonia in #70
Fixed test and removed unused conftest by @matheper in #71
Agents refac. by @sordonia in #72
Better handling of shell session by @MarcCote in #73
Minor fix utils by @xingdi-eric-yuan in #75
Use session.is_running property instead of process.poll by @MarcCote in #77
Improve README by @xingdi-eric-yuan in #76
LLM Returns None by @xingdi-eric-yuan in #78
strip a list by @xingdi-eric-yuan in #79
Use LLMs with fixed temperature by @xingdi-eric-yuan in #80
Max score by @matheper in #81
Setup and session commands by @matheper in #82
aider entrypoint by @sordonia in #83
Set aider and mini_nightmare default entrypoints by @matheper in #84
Parse think token by @xingdi-eric-yuan in #86
Static analysis by @chisingh in #88
remove syntax error in pytorch data because it's too easy. by @xingdi-eric-yuan in #90
rename to debug_gym by @sordonia in #91
master/slave -> server/client by @xingdi-eric-yuan in #93
Minor obs change by @xingdi-eric-yuan in #94
Fixes for SWEBench by @MarcCote in #85
Analysis Scripts by @xingdi-eric-yuan in #87
LLM API refac by @matheper in #97
Rename Agents by @xingdi-eric-yuan in #100
[WIP] README update by @xingdi-eric-yuan in #102
RAI Statement by @xingdi-eric-yuan in #103
Add debug-gym-llm-config-template entrypoint by @matheper in #101
Add version by @MarcCote in #105
skip tools when collecting task names by @xingdi-eric-yuan in #104

New Contributors

@mormio made their first contribution in #1
@xingdi-eric-yuan made their first contribution in #2
@chisingh made their first contribution in #11
@Kim-Minseon made their first contribution in #16
@sordonia made their first contribution in #19
@MarcCote made their first contribution in #35

Full Changelog: https://github.com/microsoft/debug-gym/commits/1.0.0

Contributors

MarcCote, matheper, and 5 other contributors

Assets 3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

Releases: microsoft/debug-gym

1.1.0

What's Changed

New Contributors

Contributors

Uh oh!

1.1.0rc1

What's Changed

New Contributors

Contributors

Uh oh!

1.0.0

What's Changed

New Contributors

Contributors

Uh oh!