Releases: microsoft/debug-gym
1.1.0
What's Changed
- arxiv url by @xingdi-eric-yuan in #106
- Readme install from pypi by @matheper in #108
- Change credential order in AzureOpenAILLM for proper token retrieval by @matheper in #110
- AzureOpenAI Chained Credential by @matheper in #111
- Function Calling Syntax by @xingdi-eric-yuan in #109
- Update human class to use tool calls by @matheper in #114
- make sure the working dir is always in sys path by @xingdi-eric-yuan in #115
- Validate human input by @matheper in #116
- Bugfix human test by @matheper in #118
- Disable strict mode by @xingdi-eric-yuan in #117
- Fix env.rewrite_counter by @matheper in #119
- Fixed EvalTool being called with extra kwargs when reacting to events by @matheper in #120
- Add logs viewer for Froggy by @MarcCote in #95
- Fix: Use json.dumps for tool arguments in OpenAILLM by @ShiZhengyan in #123
- Tools observations by @matheper in #121
- Fix 'utf-8' codec error with surrogate pairs in Unicode strings by @Copilot in #129
- Add max_retries parameter to Human class to limit terminal read attempts by @Copilot in #126
- Improve test coverage report and default pytest configs by @matheper in #98
- Clean up pytest.ini, partially reverting #98 by @matheper in #132
- Remove current file by @matheper in #127
- Add `start` and `end` args to ViewTool by @matheper in #133
- Minor Fixes by @xingdi-eric-yuan in #124
- Fix kwargs by @xingdi-eric-yuan in #135
- Fix kwargs for pdb tool by @xingdi-eric-yuan in #136
- Fix type annotation: tool_call_list should be list not dict by @ShiZhengyan in #134
- Enhance Agent logging to include step number and reason for termination by @matheper in #137
- Improve visualization by @xingdi-eric-yuan in #138
- Pdb current frame file by @matheper in #139
- Pdb breakpoint handling by @matheper in #140
- Refactor llm_api into debug_gym.llms subpackage by @MarcCote in #142
- Resolve absolute path from RepoEnv by @matheper in #144
- Use better command completion for Human Mode by @MarcCote in #143
- Fix: `resolve_path` and `is_editable` to account for ignored and read-only files by @matheper in #145
- Ignore files from .gitignore by @matheper in #146
- Fix Aider ignore patterns and add tests for path resolution and ignored/read-only files by @matheper in #147
- Fix issue resolving `env.working_dir` by @matheper in #148
- Set default `RepoEnv.dir_tree_depth` to 1 by @matheper in #150
- replace unescape by filtering non-utf8 chars in system prompts by @xingdi-eric-yuan in #151
- Adding SWE-Smith support by @MarcCote in #122
- Only load image for instance_id we want to test by @MarcCote in #154
- No eval shortcut by @matheper in #152
- Parallel execution by @matheper in #153
- Trajectory Filtering by @xingdi-eric-yuan in #141
- For SWE-Smith, add a new test split distinct from train-789 by @MarcCote in #156
- Fix ViewTool handling empty files by @matheper in #157
- Improve Retry by @xingdi-eric-yuan in #158
- Support pickling `PDBTool` instances by @threewisemonkeys-as in #166
- Resolve path mismatch issue raised on macOS by @dkokkotas in #159
- Context change for SFT by @xingdi-eric-yuan in #162
- Fix PDB indentation mismatch in list output context by @Copilot in #161
- Fix init obs by @xingdi-eric-yuan in #171
- Integrating thinking by @xingdi-eric-yuan in #172
- Show pytest traceback for test failures. by @MarcCote in #173
- Rich logger by @matheper in #170
- Agents rich progress by @matheper in #174
- A set of fixes by @xingdi-eric-yuan in #175
- Add get_problem_ids (formerly get_dataset_split) to all benchmark env by @MarcCote in #176
- Pin swe-smith version by @matheper in #181
- Change to when tool call is on auto parsing by @icwhite in #182
- Add memory limit to Docker containers by @matheper in #183
- Disable rich live in human mode by @matheper in #180
- Pin dataset revision for SWE-Smith dataset by @MarcCote in #188
- Fix progress skip by @matheper in #186
- Agent timeout by @matheper in #189
- Always use debug level when logging to file by @matheper in #191
- Add logfile to progress tracker. Stop pending tasks' spinner. by @MarcCote in #190
- PyPI release action by @matheper in #192
- Use hasattr instead of check dict to support subclassed methods. by @MarcCote in #194
- Fix release pipeline by @matheper in #195
- Dump experiment info by @matheper in #193
- Avoid copying pycache in the temp working dir. by @MarcCote in #196
- Use the main logger when checking if LLM config is correct by @MarcCote in #199
- Fix get_problem_ids to respect instance_ids parameter in SWE environments by @Copilot in #179
- Improve gather_result, remove filter, fix inf loop in truncation by @xingdi-eric-yuan in #167
- Add back the message and LLM response to log by @xingdi-eric-yuan in #200
- Dump problem status by @matheper in #197
- Fix log none plus tool call by @sordonia in #202
- Fix Skip by @xingdi-eric-yuan in #211
- Refac reasoning content by @xingdi-eric-yuan in #207
- Set reset_prompt_history_after_rewrite to be False by default by @xingdi-eric-yuan in #215
- Bump version 1.1.0 by @matheper in #214
New Contributors
- @ShiZhengyan made their first contribution in #123
- @Copilot made their first contribution in #129
- @threewisemonkeys-as made their first contribution in #166
- @dkokkotas made their first contribution in #159
- @icwhite made their first contribution in #182
Full Changelog: 1.0.0...1.1.0
1.1.0rc1
What's Changed
- arxiv url by @xingdi-eric-yuan in #106
- Readme install from pypi by @matheper in #108
- Change credential order in AzureOpenAILLM for proper token retrieval by @matheper in #110
- AzureOpenAI Chained Credential by @matheper in #111
- Function Calling Syntax by @xingdi-eric-yuan in #109
- Update human class to use tool calls by @matheper in #114
- make sure the working dir is always in sys path by @xingdi-eric-yuan in #115
- Validate human input by @matheper in #116
- Bugfix human test by @matheper in #118
- Disable strict mode by @xingdi-eric-yuan in #117
- Fix env.rewrite_counter by @matheper in #119
- Fixed EvalTool being called with extra kwargs when reacting to events by @matheper in #120
- Add logs viewer for Froggy by @MarcCote in #95
- Fix: Use json.dumps for tool arguments in OpenAILLM by @ShiZhengyan in #123
- Tools observations by @matheper in #121
- Fix 'utf-8' codec error with surrogate pairs in Unicode strings by @Copilot in #129
- Add max_retries parameter to Human class to limit terminal read attempts by @Copilot in #126
- Improve test coverage report and default pytest configs by @matheper in #98
- Clean up pytest.ini, partially reverting #98 by @matheper in #132
- Remove current file by @matheper in #127
- Add `start` and `end` args to ViewTool by @matheper in #133
- Minor Fixes by @xingdi-eric-yuan in #124
- Fix kwargs by @xingdi-eric-yuan in #135
- Fix kwargs for pdb tool by @xingdi-eric-yuan in #136
- Fix type annotation: tool_call_list should be list not dict by @ShiZhengyan in #134
- Enhance Agent logging to include step number and reason for termination by @matheper in #137
- Improve visualization by @xingdi-eric-yuan in #138
- Pdb current frame file by @matheper in #139
- Pdb breakpoint handling by @matheper in #140
- Refactor llm_api into debug_gym.llms subpackage by @MarcCote in #142
- Resolve absolute path from RepoEnv by @matheper in #144
- Use better command completion for Human Mode by @MarcCote in #143
- Fix: `resolve_path` and `is_editable` to account for ignored and read-only files by @matheper in #145
- Ignore files from .gitignore by @matheper in #146
- Fix Aider ignore patterns and add tests for path resolution and ignored/read-only files by @matheper in #147
- Fix issue resolving `env.working_dir` by @matheper in #148
- Set default `RepoEnv.dir_tree_depth` to 1 by @matheper in #150
- replace unescape by filtering non-utf8 chars in system prompts by @xingdi-eric-yuan in #151
- Adding SWE-Smith support by @MarcCote in #122
- Only load image for instance_id we want to test by @MarcCote in #154
- No eval shortcut by @matheper in #152
- Parallel execution by @matheper in #153
- Trajectory Filtering by @xingdi-eric-yuan in #141
- For SWE-Smith, add a new test split distinct from train-789 by @MarcCote in #156
- Fix ViewTool handling empty files by @matheper in #157
- Improve Retry by @xingdi-eric-yuan in #158
- Support pickling `PDBTool` instances by @threewisemonkeys-as in #166
- Resolve path mismatch issue raised on macOS by @dkokkotas in #159
- Context change for SFT by @xingdi-eric-yuan in #162
- Fix PDB indentation mismatch in list output context by @Copilot in #161
- Fix init obs by @xingdi-eric-yuan in #171
- Integrating thinking by @xingdi-eric-yuan in #172
- Show pytest traceback for test failures. by @MarcCote in #173
- Rich logger by @matheper in #170
- Agents rich progress by @matheper in #174
- A set of fixes by @xingdi-eric-yuan in #175
- Add get_problem_ids (formerly get_dataset_split) to all benchmark env by @MarcCote in #176
- Pin swe-smith version by @matheper in #181
- Change to when tool call is on auto parsing by @icwhite in #182
- Add memory limit to Docker containers by @matheper in #183
- Disable rich live in human mode by @matheper in #180
- Pin dataset revision for SWE-Smith dataset by @MarcCote in #188
- Fix progress skip by @matheper in #186
- Agent timeout by @matheper in #189
- Always use debug level when logging to file by @matheper in #191
- Add logfile to progress tracker. Stop pending tasks' spinner. by @MarcCote in #190
- PyPI release action by @matheper in #192
- Use hasattr instead of check dict to support subclassed methods. by @MarcCote in #194
- Fix release pipeline by @matheper in #195
- Dump experiment info by @matheper in #193
- Avoid copying pycache in the temp working dir. by @MarcCote in #196
New Contributors
- @ShiZhengyan made their first contribution in #123
- @Copilot made their first contribution in #129
- @threewisemonkeys-as made their first contribution in #166
- @dkokkotas made their first contribution in #159
- @icwhite made their first contribution in #182
Full Changelog: 1.0.0...1.1.0
1.0.0
What's Changed
- New reasoning tool by @mormio in #1
- SubstitutionPatcher with file path by @xingdi-eric-yuan in #2
- update readme by @xingdi-eric-yuan in #3
- Tokenizer fix by @xingdi-eric-yuan in #4
- catch exceptions in max score compute by @xingdi-eric-yuan in #5
- default max score 1, in case the test crashes by @xingdi-eric-yuan in #6
- tests github actions by @matheper in #8
- Mm/froggyignore by @mormio in #9
- toolbox by @mormio in #10
- Fix Context Truncation by @xingdi-eric-yuan in #7
- Test Utils by @xingdi-eric-yuan in #12
- Unit Tests by @chisingh in #11
- Reset after Rewrite by @xingdi-eric-yuan in #14
- Comparisons between baseline(simple rewrite) and agent(zero-shot pdb) by @Kim-Minseon in #16
- zero shot agent that can access pdb after certain number of rewrite a… by @mormio in #15
- Terminal refac by @matheper in #13
- Terminal run bugfix by @matheper in #18
- pass entrypoint properly for aider by @sordonia in #19
- clean llm configs loading by @sordonia in #20
- support for threads in run by @sordonia in #21
- Test tools by @chisingh in #17
- add terminal to pdb when we re-add it by @mormio in #27
- Az login oai by @matheper in #32
- bug fix: 0 as line number by @xingdi-eric-yuan in #34
- Use logging instead of print. Add nice progress visualization. by @MarcCote in #35
- Clean Tools by @xingdi-eric-yuan in #36
- SWE-Bench integration (with caching) by @MarcCote in #28
- Refac tests by @matheper in #39
- Add .froggyignore and .froggyreadonly by @MarcCote in #40
- Add an optional depth argument to listdir. Fix tests by @MarcCote in #41
- Auth with ManagedIdentityCredential by @matheper in #43
- Set az and oai logging level to WARNING by @matheper in #45
- Mini nightmares by @xingdi-eric-yuan in #42
- Gist url by @chisingh in #46
- Moving Agents Out of Froggy by @xingdi-eric-yuan in #44
- Remove rich, put back tqdm with proper handling for logging. llm mess… by @MarcCote in #47
- Delete random by @matheper in #48
- Use the proper NOT_GIVEN value when calling openai API by @MarcCote in #49
- Add Dataclasses and RepoEnv Info refac by @matheper in #50
- Mini nightmares env. by @xingdi-eric-yuan in #54
- fix view by @xingdi-eric-yuan in #55
- Improve mini nightmare by @xingdi-eric-yuan in #56
- Improve mini nightmare by @xingdi-eric-yuan in #57
- Update terminal.py by @sordonia in #60
- Lite update tool syntax parser by @xingdi-eric-yuan in #62
- Fix PDB hanging by @MarcCote in #63
- Renaming code. by @xingdi-eric-yuan in #64
- Remove ANSI codes when logging to file. by @MarcCote in #65
- EventHooks - Tools refac by @matheper in #58
- Skip swe-bench testing PRs by @matheper in #68
- Clean Reset Args by @xingdi-eric-yuan in #67
- froggy/agents, froggy/pond, tests/ by @sordonia in #70
- Fixed test and removed unused conftest by @matheper in #71
- Agents refac. by @sordonia in #72
- Better handling of shell session by @MarcCote in #73
- Minor fix utils by @xingdi-eric-yuan in #75
- Use session.is_running property instead of process.poll by @MarcCote in #77
- Improve README by @xingdi-eric-yuan in #76
- LLM Returns None by @xingdi-eric-yuan in #78
- strip a list by @xingdi-eric-yuan in #79
- Use LLMs with fixed temperature by @xingdi-eric-yuan in #80
- Max score by @matheper in #81
- Setup and session commands by @matheper in #82
- aider entrypoint by @sordonia in #83
- Set aider and mini_nightmare default entrypoints by @matheper in #84
- Parse think token by @xingdi-eric-yuan in #86
- Static analysis by @chisingh in #88
- remove syntax error in pytorch data because it's too easy. by @xingdi-eric-yuan in #90
- rename to debug_gym by @sordonia in #91
- master/slave -> server/client by @xingdi-eric-yuan in #93
- Minor obs change by @xingdi-eric-yuan in #94
- Fixes for SWEBench by @MarcCote in #85
- Analysis Scripts by @xingdi-eric-yuan in #87
- LLM API refac by @matheper in #97
- Rename Agents by @xingdi-eric-yuan in #100
- [WIP] README update by @xingdi-eric-yuan in #102
- RAI Statement by @xingdi-eric-yuan in #103
- Add `debug-gym-llm-config-template` entrypoint by @matheper in #101
- Add version by @MarcCote in #105
- skip tools when collecting task names by @xingdi-eric-yuan in #104
New Contributors
- @mormio made their first contribution in #1
- @xingdi-eric-yuan made their first contribution in #2
- @chisingh made their first contribution in #11
- @Kim-Minseon made their first contribution in #16
- @sordonia made their first contribution in #19
- @MarcCote made their first contribution in #35
Full Changelog: https://github.com/microsoft/debug-gym/commits/1.0.0